347. Top K Frequent Elements

Medium

Problem:

Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.

Input: nums = [1,1,1,2,2,3], k = 2
Output: [1,2]
Input: nums = [1,2], k = 2
Output: [1,2]

What to learn:

zip()

The zip() function creates a new tuple by pairing items with the same index from two or more sequences, based on the shortest length.

In Python 3+, it returns a generator, so if you want to extract the actual values, you can wrap it with list().

>>> a = [1,2,3,4,5]
>>> b = [2,3,4,5]
>>> c = [3,4,5]
>>> zip(a,b)
<zip object at 0x105b6d9b0>

>>> list(zip(a,b))
[(1, 2), (2, 3), (3, 4), (4, 5)]
>>> list(zip(a,b,c))
[(1, 2, 3), (2, 3, 4), (3, 4, 5)]

*(Asterisk)

"In Python, * is for unpacking. It's a sequence unpacking operator, which mainly used for unpacking tuples or lists."

>>> fruits = ["lemon", "pear", "watermelon", "tomato"]
>>> fruits
["lemon", "pear", "watermelon", "tomato"]

>>> print(fruits[0], fruits[1], fruits[2], fruits[3])
lemon pear watermelon tomato

>>> print(*fruits)
lemon pear watermelon tomato

Not only for unpacking, but when used as a parameter in a function, it can also be used for packing.

>>> def f(*params):
...    print(params)
>>> f('a', 'b', 'c')
('a', 'b', 'c')

Variable assignment can also be done using *, which takes all the remaining values.

>>> a, b* = [1,2,3,4]
>>> a
1
>>> b
[2, 3, 4]
>>> *a, b = [1,2,3,4]
>>> a
[1, 2, 3]
>>> b
4

** is used to unpack key/value pairs.

>>> date_info = {'year': '2023', 'month': '08', 'day': '16'}
>>> new_info = {**date_info, 'day': '14'}
>>> new_info
{'year': '2023', 'month': '08', 'day': '14'}

Solution:

Counter

  1. Create a hash table using the element value as the key.

  2. Store the frequency count in it and then extract k times using a priority queue.

Here, the frequency is used as the key, and the key from freqs is used as the value. Since heaps are sorted by their keys, the frequency needs to be used as the key. In other words, the key and value are swapped when adding to the heap. Also, since Python's heapq module only supports min-heaps, the value is stored as a negative number.

import collections
import heapq

class Solution:
    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
        freqs = collections.Counter(nums)
        freqs_heap = []
        
        for f in freqs:
            heapq.heappush(freqs_heap, (-freqs[f], f))

        topk = list()
        for _ in range(k):
            topk.append(heapq.heappop()[1])
            
        return topk

Bucket Sort

Instead of considering the given number as the key, we can think of the counted value as the key. We can then append the list of values.

class Solution:
    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
        # Bucket Sort: {count: value}
        count = {}
        # 1부터 시작해야 하니까
        freq = [[] for i in range(len(nums) + 1)]
        
        # 버킷 만들기
        for n in nums:
            # get(key, value): a value to return if the specified key does not exist
            count[n] = 1 + count.get(n, 0)
        for n, c in count.items():
            freq[c].append(n)
        
        # Ensure elements in each bucket are sorted
        for f in freq:
            f.sort()
        
        # Top k Frequent Elements 뽑아내기
        res = []
        for f in reversed(freq):
            for n in f:
                res.append(n)
                if len(res) == k:
                    return res
       

Pythonic way

class Solution:
    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
        return list(zip(*))[0] # tuple
When not unpacking:
>>> list(zip(collections.Counter(nums).most_common(k)))
[(1, 3),), ((2, 2),)]

Last updated