Dictionary tree (prefix tree)

Posted by raker7 on Sun, 02 Jan 2022 23:51:42 +0100

Summary:

A dictionary tree (trie) is used to determine whether a string is present in a set of strings, or whether any string in the set starts with a given prefix.

Why use a dictionary tree for this kind of problem? Suppose we have a dictionary that stores nearly 10000 words. A hash table answers exact-match lookups quickly, but it cannot easily tell us whether any stored word starts with a given prefix. A dictionary tree handles both queries by visiting one node per character, so a lookup takes O(n) time for a string of length n, regardless of how many words are stored. Since the length n of an English word is usually less than 10, this is effectively O(1) in practice. The trade-off is memory, since each node stores an array of child pointers.
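To make the contrast concrete, here is a minimal sketch of both queries on a hash set. The class and method names (HashPrefixScan, containsWord, anyStartsWith) are made up for this illustration. Exact lookup is a single call, while the prefix query in this sketch has to check every stored word.

```java
import java.util.Set;

class HashPrefixScan {
    // Exact lookup: a single hash-based membership test.
    static boolean containsWord(Set<String> words, String word) {
        return words.contains(word);
    }

    // Prefix lookup: a hash set is not organized by prefix, so this
    // sketch falls back to checking every stored word in turn.
    static boolean anyStartsWith(Set<String> words, String prefix) {
        for (String w : words) {
            if (w.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

With 10000 stored words, anyStartsWith may examine all of them before answering, whereas a dictionary tree only follows as many nodes as the prefix has characters.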



A trie (dictionary tree), also known as a prefix tree, is a rooted tree. Each node of the trie contains the following fields:

  • An array children of pointers to child nodes. Its length is 26, the number of lowercase English letters: children[0] corresponds to the letter a, children[1] to the letter b, ..., and children[25] to the letter z. (The code below calls this field childNode.)
  • A Boolean field isEnd, indicating whether some inserted string ends at this node. (The code below calls this field isVal.)
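The two fields can be sketched as follows, using the names children and isEnd from the description above. The helper indexOf is a hypothetical addition that shows the letter-to-slot mapping and rejects characters outside a to z, which the description assumes never occur.

```java
class Node {
    // children[0] is the child for 'a', ..., children[25] the child for 'z'.
    // Java initializes every slot to null, meaning "no such child yet".
    final Node[] children = new Node[26];

    // True when some inserted string ends exactly at this node.
    boolean isEnd;

    // Hypothetical helper: maps 'a'..'z' to 0..25 and rejects anything else.
    static int indexOf(char c) {
        if (c < 'a' || c > 'z') {
            throw new IllegalArgumentException("only lowercase a-z is supported: " + c);
        }
        return c - 'a';
    }
}
```

Without a check like this, a character such as 'A' gives a negative index and an ArrayIndexOutOfBoundsException in Java.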


Insert string

Starting at the root of the dictionary tree, we process the string one character at a time. For the child node corresponding to the current character, there are two cases:

  • The child node exists. Move along the pointer to the child node and continue with the next character.
  • The child node does not exist. Create a new child node, record it in the corresponding position of the children array, then move along the pointer to it and continue with the next character.

Repeat the above steps until the last character of the string is processed, and then mark the current node as the end of the string.
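The two cases can be sketched as below. CountingTrie and its counters are illustrative assumptions, not part of the implementation later in this post. Here insert also reports how many nodes it created, which makes the sharing of common prefixes visible.

```java
class CountingTrie {
    static class Node {
        final Node[] children = new Node[26];
        boolean isEnd;
    }

    final Node root = new Node();
    int nodeCount = 1; // counts the root

    // Inserts word (assumed to contain only lowercase a-z) and returns
    // the number of new nodes this call had to create.
    int insert(String word) {
        int created = 0;
        Node cur = root;
        for (int i = 0; i < word.length(); i++) {
            int idx = word.charAt(i) - 'a';
            if (cur.children[idx] == null) {
                // Case 2: the child does not exist, so create it.
                cur.children[idx] = new Node();
                created++;
            }
            // Both cases: follow the pointer to the child.
            cur = cur.children[idx];
        }
        cur.isEnd = true; // mark the end of the string
        nodeCount += created;
        return created;
    }
}
```

Inserting "app" into an empty tree creates 3 nodes. Inserting "apple" afterwards creates only 2 more, because the path for "app" is reused.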


Find prefix

Starting at the root of the dictionary tree, we walk down the prefix one character at a time. For the child node corresponding to the current character, there are two cases:

  • The child node exists. Move along the pointer to the child node and continue with the next character.
  • The child node does not exist. The dictionary tree does not contain the prefix, so return a null pointer.

Repeat the above steps until a null pointer is returned or the last character of the prefix has been processed.

If the search reaches the end of the prefix, the prefix exists in the dictionary tree. If, in addition, isEnd is true at the node reached by the last character, the whole string was inserted as a word.
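One way to express this directly is a helper that returns the node reached by the prefix, or null if the walk fails. The sketch below assumes lowercase a-z input; PrefixTrie and searchPrefix are names chosen for this example. Both queries then reduce to a check on the returned node.

```java
class PrefixTrie {
    private static class Node {
        final Node[] children = new Node[26];
        boolean isEnd;
    }

    private final Node root = new Node();

    void insert(String word) {
        Node cur = root;
        for (int i = 0; i < word.length(); i++) {
            int idx = word.charAt(i) - 'a';
            if (cur.children[idx] == null) {
                cur.children[idx] = new Node();
            }
            cur = cur.children[idx];
        }
        cur.isEnd = true;
    }

    // Returns the node reached after consuming s, or null if some
    // character has no matching child.
    private Node searchPrefix(String s) {
        Node cur = root;
        for (int i = 0; i < s.length(); i++) {
            cur = cur.children[s.charAt(i) - 'a'];
            if (cur == null) {
                return null;
            }
        }
        return cur;
    }

    // True if any inserted word starts with prefix.
    boolean startsWith(String prefix) {
        return searchPrefix(prefix) != null;
    }

    // True only if this exact word was inserted.
    boolean search(String word) {
        Node node = searchPrefix(word);
        return node != null && node.isEnd;
    }
}
```

After insert("apple"), startsWith("app") is true but search("app") is false, because no inserted word ends at the node for the second p.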


Java code implementation:

class Trie {
    TrieNode root;

    public Trie() {
        this.root = new TrieNode();
    }

    // Insert a word into the dictionary tree (assumes lowercase a-z only)
    void insert(String word) {
        TrieNode temp = root;
        for (int i = 0; i < word.length(); ++i) {
            if (temp.childNode[word.charAt(i) - 'a'] == null) {
                temp.childNode[word.charAt(i) - 'a'] = new TrieNode();
            }
            temp = temp.childNode[word.charAt(i) - 'a'];
        }
        temp.isVal = true;
    }

    // Return true if this exact word was inserted into the dictionary tree
    boolean search(String word) {
        TrieNode temp = root;
        for (int i = 0; i < word.length(); ++i) {
            if (temp == null) {
                break;
            }
            temp = temp.childNode[word.charAt(i) - 'a'];
        }
        return temp != null ? temp.isVal : false;
    }

    // Return true if any inserted word starts with the given prefix
    boolean startsWith(String prefix) {
        TrieNode temp = root;
        for (int i = 0; i < prefix.length(); ++i) {
            if (temp == null) {
                break;
            }
            temp = temp.childNode[prefix.charAt(i) - 'a'];
        }
        return temp != null;
    }
}

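class TrieNode {
    // childNode[i] is the child for the letter 'a' + i, or null if absent
    public TrieNode[] childNode;
    // true if some inserted word ends at this node
    boolean isVal;

    public TrieNode() {
        // Java initializes array elements to null, so no explicit loop is needed
        childNode = new TrieNode[26];
        isVal = false;
    }
}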


C++ code implementation:

#include <string>
using std::string;

class TrieNode {
public:
	TrieNode* childNode[26];
	bool isVal;
	TrieNode(): isVal(false) {
		for (int i = 0; i < 26; ++i) {
			childNode[i] = nullptr;
		}
	}
};
class Trie {
	TrieNode* root;
public:
	Trie(): root(new TrieNode()) {}

	// Insert a word into the dictionary tree (assumes lowercase a-z only)
	void insert(string word) {
		TrieNode* temp = root;
		for (int i = 0; i < word.size(); ++i) {
			if (!temp->childNode[word[i]-'a']) {
				temp->childNode[word[i]-'a'] = new TrieNode();
			}
			temp = temp->childNode[word[i]-'a'];
		}
		temp->isVal = true;
	}
	
	// Return true if this exact word was inserted into the dictionary tree
	bool search(string word) {
		TrieNode* temp = root;
		for (int i = 0; i < word.size(); ++i) {
			if (!temp) {
				break;
			}
			temp = temp->childNode[word[i]-'a'];
		}
		return temp? temp->isVal: false;
	}
	
	// Return true if any inserted word starts with the given prefix
	bool startsWith(string prefix) {
		TrieNode* temp = root;
		for (int i = 0; i < prefix.size(); ++i) {
			if (!temp) {
				break;
			}
			temp = temp->childNode[prefix[i]-'a'];
		}
		return temp != nullptr;
	}
};


Practice:
LeetCode 208. Implement Trie (Prefix Tree)





Reference: LeetCode 101
