A simple implementation of the Trie (dictionary tree, prefix tree) [insert, search and find prefix]

Posted by Courbois on Fri, 04 Mar 2022 21:07:43 +0100

1. Trie tree

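A Trie, also known as a prefix tree, word lookup tree or dictionary tree, is a tree structure in which every path from the root spells out a string; it is sometimes described as a variant of the hash tree. Typical applications are counting and sorting large numbers of strings (though it is not limited to strings), which is why search engine systems often use it for word frequency statistics. Its advantage is that strings with a common prefix share nodes, which minimizes unnecessary string comparisons: a lookup takes time proportional to the length of the key, regardless of how many keys are stored. Unlike a hash table, it can also answer prefix queries directly.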

2. Data structure

A Trie is a multiway tree, that is, each node may have more than two branches.

Each node contains the following fields:

  • An array of pointers to child nodes, next. For lowercase English words the array length is 26, the number of lowercase English letters. In that case next[0] corresponds to the letter a, next[1] to the letter b, ..., and next[25] to the letter z.
  • A boolean field isEnd, indicating whether a string ends at this node.

A small example: a node of the Trie can be declared as follows (assuming only the characters 'a' to 'z'):

struct TrieNode {
    bool isEnd;          // true if a string ends at this node
    TrieNode* next[26];  // child links, indexed by letter: next[c - 'a']
};

A TrieNode has no data member that stores a character value. Instead, the index into the next array encodes the character. TrieNode* next[26] holds the links for every character that may follow the current node.
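
The index mapping described above can be sketched in a few lines of Java. The class and method names (TrieIndexSketch, indexOf, letterAt) are hypothetical and not part of the original code. The range check is an added assumption: the article's code only supports 'a' to 'z', and any other character would index outside the array.

```java
// Hypothetical helper, not part of the original post: shows how a character
// is turned into a slot of the next[] array and back again.
class TrieIndexSketch {
    static final int ALPHABET_SIZE = 26;

    /** Maps 'a'..'z' to 0..25 and rejects any other character. */
    static int indexOf(char c) {
        if (c < 'a' || c > 'z') {
            throw new IllegalArgumentException("unsupported character: " + c);
        }
        return c - 'a';
    }

    /** Inverse mapping: recovers the letter that belongs to a slot. */
    static char letterAt(int index) {
        return (char) ('a' + index);
    }

    public static void main(String[] args) {
        System.out.println(indexOf('a'));   // 0
        System.out.println(indexOf('z'));   // 25
        System.out.println(letterAt(18));   // s
    }
}
```

Without such a check, passing an uppercase letter or a digit to the methods below would raise an ArrayIndexOutOfBoundsException in Java, and would be undefined behavior in C++.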

A Trie containing the three words "sea", "sellers" and "she" looks as follows (nodes marked with * have isEnd set):

root
  s
    e
      a *
      l - l - e - r - s *
    h
      e *

With this picture in mind, implementing a Trie is straightforward.
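
To make the sharing of prefixes concrete, here is a minimal sketch that builds the same three-word Trie and counts its nodes. The names ExampleTrieSketch, Node, build and countNodes are made up for this illustration, and the insert routine is a condensed form of the one shown in the next section.

```java
// Illustrative sketch, assuming lowercase 'a'..'z' input only.
class ExampleTrieSketch {
    // Minimal node with the two fields described above.
    static class Node {
        boolean isEnd;
        Node[] next = new Node[26];
    }

    static void insert(Node root, String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            int i = c - 'a';
            if (node.next[i] == null) {
                node.next[i] = new Node();
            }
            node = node.next[i];
        }
        node.isEnd = true;
    }

    /** Counts every node reachable from the given node, including itself. */
    static int countNodes(Node node) {
        int total = 1;
        for (Node child : node.next) {
            if (child != null) {
                total += countNodes(child);
            }
        }
        return total;
    }

    static Node build() {
        Node root = new Node();
        for (String w : new String[] {"sea", "sellers", "she"}) {
            insert(root, w);
        }
        return root;
    }

    public static void main(String[] args) {
        // The words have 13 letters in total, but the shared prefixes
        // "s" and "se" are stored once: root + 10 letter nodes.
        System.out.println(countNodes(build())); // 11
    }
}
```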

3. Implementation

  • Insert:
    Start from the root node and follow the word character by character, creating a child node wherever the path does not exist yet. Finally, mark the last node as the end of a word.
    public void insert(String word) {
        Trie node = this;  // node points to the current node
        for (int i = 0; i < word.length(); i++) {  // traverse every character in word
            char c = word.charAt(i);
            if (node.next[c - 'a'] == null) {
                // no node for this character on the path yet, so create one
                node.next[c - 'a'] = new Trie();
            }
            node = node.next[c - 'a'];  // move on to the child node
        }
        node.isEnd = true;  // mark the last node as the end of a word
    }
  • Search:
    Start the traversal from the root node. If a child node is missing before the whole word has been traversed, return false; otherwise return the isEnd flag of the node reached at the end.
    public boolean search(String word) {
        Trie node = this;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            node = node.next[c - 'a'];
            if (node == null) {
                return false;
            }
        }
        return node.isEnd;
    }
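  • Find prefix:
    The traversal is the same as for search, but isEnd is not checked: reaching the end of the prefix without hitting a missing node is enough.
    public boolean startsWith(String prefix) {
        Trie node = this;
        for (int i = 0; i < prefix.length(); i++) {
            char c = prefix.charAt(i);
            node = node.next[c - 'a'];
            if (node == null) {
                return false;
            }
        }
        return true;
    }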
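
A short usage sketch may help to show how the three operations differ. The wrapper class TrieDemo is hypothetical; the nested Trie is a compact copy of the methods above, repeated only so that the example compiles on its own.

```java
class TrieDemo {
    // Compact copy of the Trie described in this article (lowercase 'a'..'z' only).
    static class Trie {
        private final Trie[] next = new Trie[26];
        private boolean isEnd;

        void insert(String word) {
            Trie node = this;
            for (int i = 0; i < word.length(); i++) {
                int idx = word.charAt(i) - 'a';
                if (node.next[idx] == null) {
                    node.next[idx] = new Trie();
                }
                node = node.next[idx];
            }
            node.isEnd = true;
        }

        boolean search(String word) {
            Trie node = walk(word);
            return node != null && node.isEnd;
        }

        boolean startsWith(String prefix) {
            return walk(prefix) != null;
        }

        // Shared traversal: returns the node reached by s, or null if the path breaks off.
        private Trie walk(String s) {
            Trie node = this;
            for (int i = 0; i < s.length() && node != null; i++) {
                node = node.next[s.charAt(i) - 'a'];
            }
            return node;
        }
    }

    public static void main(String[] args) {
        Trie trie = new Trie();
        trie.insert("apple");
        System.out.println(trie.search("apple"));    // true
        System.out.println(trie.search("app"));      // false: "app" is only a prefix so far
        System.out.println(trie.startsWith("app"));  // true
        trie.insert("app");
        System.out.println(trie.search("app"));      // true
    }
}
```

Factoring the common loop into a helper (walk here) is only one possible design choice. The article keeps search and startsWith as two separate loops, which behaves the same way.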

4. Complete code

#JAVA
class Trie {

    private Trie[] next;
    private boolean isEnd;

    /** Initialize your data structure here. */
    public Trie() {
        next = new Trie[26];
        isEnd = false;
    }

    /** Inserts a word into the trie. */
    public void insert(String word) {
        Trie node = this;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (node.next[c - 'a'] == null) {
                node.next[c - 'a'] = new Trie();
            }
            node = node.next[c - 'a'];
        }
        node.isEnd = true;
    }

    /** Returns if the word is in the trie. */
    public boolean search(String word) {
        Trie node = this;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            node = node.next[c - 'a'];
            if (node == null) {
                return false;
            }
        }
        return node.isEnd;
    }

    /** Returns if there is any word in the trie that starts with the given prefix. */
    public boolean startsWith(String prefix) {
        Trie node = this;
        for (int i = 0; i < prefix.length(); i++) {
            char c = prefix.charAt(i);
            node = node.next[c - 'a'];
            if (node == null) {
                return false;
            }
        }
        return true;
    }
}
#C++
#include <cstring>  // memset
#include <string>
using std::string;

class Trie {
private:
    bool isEnd;
    Trie* next[26];
public:
    /** Initialize your data structure here. */
    Trie() {
        isEnd = false;
        memset(next, 0, sizeof(next));
    }

    /** Frees all child nodes recursively. */
    ~Trie() {
        for (Trie* child : next) {
            delete child;  // deleting a null pointer is a no-op
        }
    }

    // Copying would make two tries delete the same children, so it is disabled.
    Trie(const Trie&) = delete;
    Trie& operator=(const Trie&) = delete;

    /** Inserts a word into the trie. */
    void insert(const string& word) {
        Trie* node = this;  // node points to the current node
        for (char c : word) {  // traverse every character in word
            if (node->next[c - 'a'] == nullptr) {
                node->next[c - 'a'] = new Trie();
            }
            node = node->next[c - 'a'];  // move on to the child node
        }
        node->isEnd = true;  // mark the last node as the end of a word
    }

    /** Returns if the word is in the trie. */
    bool search(const string& word) {
        Trie* node = this;
        for (char c : word) {
            node = node->next[c - 'a'];
            if (node == nullptr) {
                return false;
            }
        }
        return node->isEnd;
    }

    /** Returns if there is any word in the trie that starts with the given prefix. */
    bool startsWith(const string& prefix) {
        Trie* node = this;
        for (char c : prefix) {
            node = node->next[c - 'a'];
            if (node == nullptr) {
                return false;
            }
        }
        return true;
    }
};
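
Section 1 mentions counting and sorting strings as typical applications. The following is a minimal sketch of how the structure above could be extended for that, under two assumptions that are not in the original code: isEnd is replaced by an integer counter, and words are listed by a depth-first traversal that visits children in index order. All names (CountingTrieSketch, add, count, sortedWords) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch, assuming lowercase 'a'..'z' input only.
class CountingTrieSketch {
    private static class Node {
        int count;                   // how many times a word ending here was added
        Node[] next = new Node[26];
    }

    private final Node root = new Node();

    /** Adds one occurrence of word. */
    void add(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            int i = c - 'a';
            if (node.next[i] == null) {
                node.next[i] = new Node();
            }
            node = node.next[i];
        }
        node.count++;
    }

    /** Returns how often word was added, or 0 if it was never added. */
    int count(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.next[c - 'a'];
            if (node == null) {
                return 0;
            }
        }
        return node.count;
    }

    /** Lists the distinct words in lexicographic order. */
    List<String> sortedWords() {
        List<String> out = new ArrayList<>();
        collect(root, new StringBuilder(), out);
        return out;
    }

    // Depth-first traversal: visiting slots 0..25 in order yields sorted output.
    private void collect(Node node, StringBuilder path, List<String> out) {
        if (node.count > 0) {
            out.add(path.toString());
        }
        for (int i = 0; i < 26; i++) {
            if (node.next[i] != null) {
                path.append((char) ('a' + i));
                collect(node.next[i], path, out);
                path.deleteCharAt(path.length() - 1);
            }
        }
    }

    public static void main(String[] args) {
        CountingTrieSketch t = new CountingTrieSketch();
        for (String w : new String[] {"she", "sea", "she", "sellers"}) {
            t.add(w);
        }
        System.out.println(t.count("she"));   // 2
        System.out.println(t.sortedWords());  // [sea, sellers, she]
    }
}
```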

Topics: Java Algorithm data structure leetcode string