preface

When we type a string into a search engine, this word is associated with this. The dictionary tree is a data structure designed for this scenario.

1. Data structure

The dictionary tree, i.e.,Trie, also known as word lookup tree or key tree, is a tree structure. The typical application is to count and sort large numbers of strings (but not limited to strings), so it is often used for text word frequency statistics by search engine systems.

  • It minimizes unnecessary string comparisons and makes queries more efficient than Hash tables.

  • Essentially trading space for time

  • Each node can store some additional information. For example, you can store how many times this node has been queried. According to the number of times, you can recommend some relevant information to the user.

Two, basic nature

  • The node itself does not store the full word.
  • From the root node to a node, the characters passing through the path are connected to the string corresponding to the node.
  • The paths of all children of each node represent different characters

Three, node internal implementation

  1. Each node is an array of characters. For example, if you only have English, this is an array of letters of length 26. Of course, if it were real, it would be more than 26 letters long.

  2. Each element in the current node array stores the character position of the next node, pointing to NULL if the element in the array has no next character.

  3. When we search for a word, we start from the first character of the word and search each node in turn. If the search character corresponding to the array in the node points to NULL, it means that the character does not exist. For example, when we look up the word ABC using the Trie above, the a subscript of the root node points to the B subscript of the next node, and the B subscript points to the C subscript of the wash node. So at this point we can find the node of a, B, and C, so the Trie contains the character ABC.

Achieve topK word retrieval

As in the screenshot of the introduction, when we type certain characters into each search engine. We all have popular search terms related to these words.

The idea is to store the number of searches for the word in each node of the Trie. When we type a character, the search engine takes the last node of the character we typed as the root node, traversing a range of children to find topK words.

Five, simple implementation

Here we simply implement a dictionary tree with only 26 letters of words in Java

* Implement a Trie (prefix tree) with insert, search, and startsWith operations. * *@author walker
 * @date2020/9/8 * /
public class Trie {

    private final TrieNode root;

    /** * constructor */
    public Trie(a) {
        root = new TrieNode(' ');
    }

    / * * *@paramSTR is the string to insert */
    public void insert(String str) {
        TrieNode node = root;
        char[] chars = str.toCharArray();
        for (char c : chars) {
            if (node.children[c - 'a'] = =null) {
                node.children[c - 'a'] = new TrieNode(c);
            }
            node = node.children[c - 'a'];
        }
        node.isWord = true;
    }

    /** * finds the value ** in the dictionary tree@paramWord is looking for the value *@returnWhether */ exists
    public boolean search(String word) {
        TrieNode node = root;
        char[] chars = word.toCharArray();
        for (char c : chars) {
            if (node.children[c - 'a'] = =null) {
                return false;
            }
            node = node.children[c - 'a'];
        }
        return node.isWord;
    }

    /** * Whether there is a string ** prefixed with the target string@paramThe prefix prefix *@returnWhether */ exists
    public boolean startWith(String prefix) {
        TrieNode node = root;
        char[] chars = prefix.toCharArray();
        for (char c : chars) {
            if (node.children[c - 'a'] = =null) {
                return false;
            }
            node = node.children[c - 'a'];
        }
        return true; }}class TrieNode {
    char val;
    boolean isWord;
    TrieNode[] children = new TrieNode[26];

    TrieNode() {
    }

    TrieNode(char c) {
        TrieNode node = newTrieNode(); node.val = c; }}Copy the code