This article is participating in Python Theme Month. See the link to the event for more details

Topic describes

This is 451 on LeetCode. Sorted by character occurrence, difficulty is medium.

Tag: mock, bucket sort, hash table, array, priority queue (heap)

Given a string, arrange the characters in descending order of occurrence.

Example 1:

Input: "tree" Output: "eert" Explanation: 'e' occurs twice, 'r' and 't' both occur once. So 'e' must come before 'r' and 't'. In addition, "eetr" is a valid answer.Copy the code

Example 2:

Input: "CCCAaa" Output: "CCCAAA" Explanation: 'c' and 'a' both occur three times. In addition, "AAACCC" is also a valid answer. Note that "cacaca" is incorrect because the same letters must be put together.Copy the code

Example 3:

Input: "Aabb" Output: "bbAa" Explanation: In addition, "bbAa" is also a valid answer, but "Aabb" is incorrect. Note that 'A' and 'A' are considered two different characters.Copy the code

Data structure + simulation

This is a simulation that examines the use of data structures.

Specific practices are as follows:

  1. First use the “hash table” for word frequency statistics;
  2. Walk through the frequency-counted hash table and pair each key with each value{character, word frequency}Is stored in the priority queue (heap). And the “priority queue (heap)” ordering logic is:
    • ifWord frequencyDifferent, according toWord frequencyReverse order.
    • ifWord frequencySame, according toCharacter lexicographical orderAscending (since this topic uses the Special Judge mechanism, this sorting strategy can be adjusted at will. However, in order to ensure that the ordering logic satisfies the “full order relationship”, this place can be written in the front and back, but theoretically cannot be written, otherwise cannot ensure the same sorting result every time);
  3. Construct the answer from the priority queue (heap).

Code:

class Solution {
    class Node {
        char c; 
        int v;
        Node(char _c, int_v) { c = _c; v = _v; }}public String frequencySort(String s) {
        char[] cs = s.toCharArray();
        Map<Character, Integer> map = new HashMap<>();
        for (char c : cs) {
            map.put(c, map.getOrDefault(c, 0) + 1);
        }
        PriorityQueue<Node> q = new PriorityQueue<>((a,b)->{
            if(b.v ! = a.v)return b.v - a.v;
            return a.c - b.c;
        });
        for (char c : map.keySet()) {
            q.add(new Node(c, map.get(c)));
        }
        StringBuilder sb = new StringBuilder();
        while(! q.isEmpty()) { Node poll = q.poll();int k = poll.v;
            while (k-- > 0) sb.append(poll.c);
        }
        returnsb.toString(); }}Copy the code
class Solution:
    def frequencySort(self, s: str) - >str:
        return "".join(char * repeats for char,repeats in sorted(Counter(s).items(), key=lambda x:-x[1]))
Copy the code
  • Time complexity: Set the character set size to CCC. The complexity of counting word frequency using “hash table” is O(n)O(n)O(n); In the worst case, all characters in the character set are present, and up to CCC nodes are added to the priority queue (heap), the complexity is O(Clog⁡C)O(C\log{C})O(ClogC); Constructing the answer requires taking elements from the priority queue (heap) and concatenating them with complexity O(n)O(n)O(n). The overall complexity of O (Max ⁡ (n, Clog ⁡ C)) O (\ Max (n, C \ log {C})) O (Max (n, ClogC))
  • Space complexity: O(n)O(n)O(n)

Array implementation + simulation

The basic idea remains the same. The data structure used in the above procedure is replaced by an array.

Specifically, the use of the ASCII character set 128128128 bits, pre-established a size of 128128128 array, the use of “bucket sort” ideas instead of “hash table” and “priority queue (heap)” role.

Code:

class Solution {   
    public String frequencySort(String s) {
        int[][] cnts = new int[128] [2];
        char[] cs = s.toCharArray();
        for (int i = 0; i < 128; i++) cnts[i][0] = i;
        for (char c : cs) cnts[c][1] + +; Arrays.sort(cnts, (a, b)->{if (a[1] != b[1]) return b[1] - a[1];
            return a[0] - b[0];
        });
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 128; i++) {
            char c = (char)cnts[i][0];
            int k = cnts[i][1];
            while (k-- > 0) sb.append(c);
        }
        returnsb.toString(); }}Copy the code
class Solution:
    def frequencySort(self, s: str) - >str:
        map = defaultdict(int)
        for c in s:
            map[c] += 1
        pq = []
        for k,v in map.items():
            heapq.heappush(pq, (-v, ord(k), k))
        ans = []
        while pq:
            repeats, _, char = heapq.heappop(pq)
            ans.append(char*-repeats)
        return "".join(ans)
Copy the code
  • Time complexity: Set the character set size to CCC. Complexity of O (Max ⁡ (n, Clog ⁡ C)) O (\ Max (n, C \ log {C})) O (Max (n, ClogC))
  • O(n+C+log⁡C)O(n +C+ \log{C})O(n+C+logC)

The last

This is the 451 article in our “Brush through LeetCode” series, which began on 2021/01/01. As of the start date, there are 1916 questions on LeetCode, some with locks, and we will first brush through all the questions without locks.

In this series of articles, in addition to explaining how to solve the problem, I’ll give you the most concise code possible. If the general solution is involved, the corresponding code template will be provided.

In order to facilitate the students to debug and submit the code on the computer, I set up a related warehouse: github.com/SharingSour… .

In the repository address, you can see the solution to the series, the corresponding code to the series, the LeetCode link, and other preferred solutions to the series.