Small knowledge, big challenge! This paper is participating in theEssentials for programmers”Creative activities

The TopK problem of the number of occurrences of a string

Problem description

Given an array of strings and an integer k, return the string with the name k preceding the occurrence and the corresponding number. The answers returned should be sorted in order of frequency of occurrence of the string. If different strings have the same frequency, sort them lexicographically.

Example:

Enter: [” A “,” B “,”c”,”b”], 2

Output: [[” b “, “2”], [” a “, “1”]]

To analyze problems

The most intuitive solution to this problem is to iterate over the string data, count the number of occurrences of the elements in a dictionary, sort the elements in the dictionary by value, and then by key if the values are equal, and return the first K values.

Class Solution: def topKstrings(self, strings, k): # write code here dic={} Return sorted(dic. Items (),key=lambda kv:(-kv[1],kv[0]))[0:k]Copy the code

After counting the number of occurrences of each string, we can also use the priority queue to figure out the topk elements.

  1. We start by adding the first K elements of the dictionary to the priority queue (the default is the small top heap).
  2. If num of the remaining nodes in the map is greater than num of the top node, the heap is added to the map.
  3. Output K elements from back to front, the largest K elements.
import heapq class Word: def __init__(self, word, count): self.word = word self.count = count def __lt__(self, other): If count is the same, the smaller the lexicographical order, the higher the order. = other.count: return self.count < other.count else: return self.word > other.word class Solution: Def topKstrings(self, strings, k): # write code here dic = {} Dic [s] = dic[s] + 1 else: dic[s] = 1 pq = [] # heapq.heappush(pq, Word(word, dic[word])) if len(pq) > k: Pq.sort (reverse=True) return [[x.woord, x.ount] for x in pq]Copy the code