This article is participating in Python Theme Month. See the link to the event for more details

Topic describes

This is the number of 726 atoms in LeetCode, and the difficulty is difficult.

Tag: simulation, data structure application, Stack, hash table, priority queue

Given a chemical formula (as a string), return the number of each atom.

Atoms always begin with a capital letter, followed by zero or any lowercase letters that indicate the name of the atom.

If the number is greater than one, the number of atoms will be followed by a number. If the number is equal to 1, it doesn’t follow the number. For example, H2O and H2O2 are feasible, but the expression H1O2 is not.

The two formulas joined together are a new formula. For example, H2O2He3Mg4 is also a chemical formula.

A chemical expression and optionally a number in parentheses are also chemical expressions. For example, (H2O2) and (H2O2)3 are chemical formulas.

Given a chemical formula, output the number of all atoms. The format is: the name of the first atom (in lexicographical order), followed by its number (if the number is greater than 1), then the name of the second atom (in lexicographical order), followed by its number (if the number is greater than 1), and so on.

Example 1:

Input: formula = "H2O" Output: "H2O" Explanation: The number of atoms is {'H': 2, 'O': 1}.Copy the code

Example 2:

Input: formula = "Mg (OH) 2" output: "H2MgO2" explanation: the number of atoms is {' H ', 2 'Mg: 1,' O ': 2}.Copy the code

Example 3:

Input: formula = "K4 (ON (SO3) 2) 2" output: "K4N2O14S4" explanation: the number of atoms is {' K ', 4, 'N' : 2, 'O' : 14, 'S' : 4}.Copy the code

Note:

  • The first letter of all atoms is uppercase and the rest is lowercase.
  • formulaThe length of is between [1, 1000].
  • formulaContains only letters, numbers, and parentheses, and is given a legal chemical formula.

Data structure + simulation

A comprehensive simulation problem.

227. Compared with the expression calculation problem of the basic calculator II, the difficulty of designing the simulation process in this problem is much lower, because more data structures are used in the so-called locating difficulty estimation.

For convenience, we agree on the following names:

  • Call a complete sequence of letters “atoms.”
  • Call a complete sequence of numbers a “number”
  • According to()Is “symbol”

The basic implementation idea is as follows:

  • In the process of processing the input parameter S, a hash table map is always maintained. In the map, the actual “value” corresponding to an “atom” is maintained in real time (that is, the storage format is {H:2, s :1}).

    Since the same atom can be placed in different positions in S, and the multiplicity of atoms is repeated for a certain number, we use a trick here: add a number suffix to each atom. That is, the actual storage is {H_1:2, S_2:1, H_3:1}.

  • Discuss according to the current processing of characters:

    • Symbol: direct push;
    • Atom: Continue to fetch until you get the full atom name. Push the full atom name on the stack whilemapAdd in the count
      1 1
      ;
    • Value: Continue to take until the full value is obtained and parsed, and then based on whether the top element of the stack is)Symbol to determine which atoms the value is applied to:
      • If the top element of the stack is not), indicating that this value applies only to the top atom of the stack
      • If the top stack element is), the current value can be applied to “a contiguous segment” of atoms
  • The map of atoms do “merge” operation: {H_1:2, S_2:1, H_3:1} = > {H: 3, S: 1};

  • Use a “priority queue (heap)” for lexicographical sorting (you can also use a List directly and then sort through collections.sort) and construct the answers.

Java code:

class Solution {
    class Node {
        String s; int v;
        Node (String _s, int_v) { s = _s; v = _v; }}public String countOfAtoms(String s) {
        int n = s.length();
        char[] cs = s.toCharArray();
        Map<String, Integer> map = new HashMap<>();
        Deque<String> d = new ArrayDeque<>();
        int idx = 0;
        for (int i = 0; i < n; ) {
            char c = cs[i];
            if (c == '(' || c == ') ') {
                d.addLast(String.valueOf(c));
                i++;
            } else {
                if (Character.isDigit(c)) {
                    // Get the complete number and parse the corresponding value
                    int j = i;
                    while (j < n && Character.isDigit(cs[j])) j++;
                    String numStr = s.substring(i, j);
                    i = j;
                    int cnt = Integer.parseInt(String.valueOf(numStr));  

                    // If the top element of the stack is), the current value can be applied to a "contiguous segment" of atoms
                    if(! d.isEmpty() && d.peekLast().equals(")")) {
                        List<String> tmp = new ArrayList<>();

                        d.pollLast(); // pop )
                        while(! d.isEmpty() && ! d.peekLast().equals("(")) {
                            String cur = d.pollLast();
                            map.put(cur, map.getOrDefault(cur, 1) * cnt);
                            tmp.add(cur);
                        }
                        d.pollLast(); // pop (

                        for (int k = tmp.size() - 1; k >= 0; k--) {
                            d.addLast(tmp.get(k));
                        }

                    // If the top element is not), the current value can only be applied to the top atom of the stack
                    } else {
                        String cur = d.pollLast();
                        map.put(cur, map.getOrDefault(cur, 1) * cnt); d.addLast(cur); }}else {
                    // Get the full atom name
                    int j = i + 1;
                    while (j < n && Character.isLowerCase(cs[j])) j++;
                    String cur = s.substring(i, j) + "_" + String.valueOf(idx++);
                    map.put(cur, map.getOrDefault(cur, 0) + 1); i = j; d.addLast(cur); }}}// Merge the same atoms with different numbers
        Map<String, Node> mm = new HashMap<>();
        for (String key : map.keySet()) {
            String atom = key.split("_") [0];
            int cnt = map.get(key);
            Node node = null;
            if (mm.containsKey(atom)) {
                node = mm.get(atom);
            } else {
                node = new Node(atom, 0);
            }
            node.v += cnt;
            mm.put(atom, node);
        }

        // Use the priority queue (heap) to lexicographically sort the nodes and construct the answers
        PriorityQueue<Node> q = new PriorityQueue<Node>((a,b)->a.s.compareTo(b.s));
        for (String atom : mm.keySet()) q.add(mm.get(atom));

        StringBuilder sb = new StringBuilder();
        while(! q.isEmpty()) { Node poll = q.poll(); sb.append(poll.s);if (poll.v > 1) sb.append(poll.v);
        }

        returnsb.toString(); }}Copy the code

Python 3 code:

class Solution:
    def countOfAtoms(self, formula: str) - >str:
        n = len(formula)
        map = defaultdict(lambda: 1)
        d = deque([])
        i = idx = 0
        while i < n:
            c = formula[i]
            if c == '(' or c == ') ':
                d.append(c)
                i += 1
            else:
                if str.isdigit(c):
                    # Get the complete number and parse out the corresponding value
                    j = i
                    while j < n and str.isdigit(formula[j]):
                        j += 1
                    cnt = int(formula[i:j])
                    i = j
                    If the top element of the stack is), the current value can be applied to a contiguous segment of atoms
                    if d and d[-1] = =') ':
                        tmp = []
                        d.pop()
                        while d and d[-1] != '(':
                            cur = d.pop()
                            map[cur] *= cnt
                            tmp.append(cur)
                        d.pop()

                        for k in range(len(tmp) - 1, -1, -1):
                            d.append(tmp[k])
                    # If the top element is not), the current value can only be applied to the top atom
                    else:
                        cur = d.pop()
                        map[cur] *= cnt
                        d.append(cur)
                else:
                    # Get the full atomic name
                    j = i + 1
                    while j < n and str.islower(formula[j]):
                        j += 1
                    cur = formula[i:j] + "_" + str(idx)
                    idx += 1
                    map[cur] = 1
                    i = j
                    d.append(cur)

        # Merge the same atoms with different numbers
        mm = defaultdict(int)
        for key, cnt in map.items():
            atom = key.split("_") [0]
            mm[atom] += cnt

        # Sort the key in mm as the answer
        ans = []
        for key in sorted(mm.keys()):
            if mm[key] > 1:
                ans.append(key+str(mm[key]))
            else:
                ans.append(key)
        return "".join(ans)
Copy the code
  • Time complexity: in the worst case, every time a value is processed, an element needs to be pulled from the stack for application and processingsThe complexity of is
    O ( n 2 ) O(n^2)
    ; In the worst case, each atom is distributed independently, and the complexity of the merge is
    O ( n ) O(n)
    ; The complexity of storing the merged content into the priority queue and fetching the constructed answer is
    O ( n log n ) O(n\log{n})
    ; The overall complexity is
    O ( n 2 ) O(n^2)
  • Space complexity: O(n)O(n)O(n)

The last

This is article No.726 in our “Brush through LeetCode” series, which began on 2021/01/01. As of the start date, there are 1916 questions on LeetCode, some with locks, and we will first brush through all the questions without locks.

In this series of articles, in addition to explaining how to solve the problem, I’ll give you the most concise code possible. If the general solution is involved, the corresponding code template will be provided.

In order to facilitate the students to debug and submit the code on the computer, I set up a related warehouse: github.com/SharingSour… .

In the repository address, you can see the solution to the series, the corresponding code to the series, the LeetCode link, and other preferred solutions to the series.