Aim: Searching for a string in a tree in O(L) time (where L is the length of the string), Performing Partial String Operations, Prefix Queries, etc.
Option 1: Just use a Tree!
If we store a string at every node of a tree, in which all the nodes are sorted lexicographically, then how much time do we take to find a string in the tree?
It takes O(L) to compare two strings of length L. So, in the worst case it would take O(hL)time to find a string in a tree, where h is the height of the tree. We can do better than this!
Using Tries!
Store a letter (instead of a string) at every node of the tree. Then, you just have to compare each letter at every level of the trie. You need to have an end-of-string character to indicate that a string ends at that particular node. This can be done simply by using a variable at every node. Each node stores an array of its children. If you wish to have only lower-case strings in your trie, each node will have an array of length 26 for its children.
Note that it takes O(1) to find the child since you maintain an array of children at each node. For example, if you are searching for “c” as the next character, you know that if the node exists, it would be at index 2 of the children array. No need to loop through the entire array. Thus, O(1) lookup time.
Time
Tries are much faster than trees for string-comparisons (and other cool stuff too!).
Does not depend on the size of the total text.
Does not depend on the number of strings in the trie.
Space
Trie tends to use more space
BST and Trie use O(text size) space.
Trie has more nodes and more overhead.
Applications of Tries
Searching, sorting and enumerating strings in a “dictionary”
Performing partial string operations inlcluding but not limited to:
Prefix queries: find all the strings that start with a specific substring
Long prefix: what is the longest prefix of a given string in the trie
Wildcards: find a string of the form pi??e where ? could be any letter
importjava.util.ArrayList;publicclassTrie {// Wildcardsfinalchar WILDCARD ='.';TrieNode root; //Each trie needs to have a root of the trieprivateclassTrieNode {// 26 (Uppercase) + 26 (Lowecase) + 10 (Numbers) = 62int[] presentChars =newint[62];/* 0 - 9 correspond to the numbers 0 - 9 10 - 35 correspond to the uppercase alphabets A - Z 36 - 61 correspond to the lowercase alphabets a - z */TrieNode[] children =newTrieNode[62];boolean endOfString =false;String c;TrieNode(String c) {this.c= c; }TrieNode() {} }publicTrie() {// TODO: Initialise a trie class here.this.root=newTrieNode(""); } /** * Inserts string s into the Trie. * * @param s string to insert into the Trie */voidinsert(String s) {// TODOinsert_helper(s,this.root,0); } /** * Inserts the ith character of the string into the trie at node * @param s * @param node * @param i */publicvoidinsert_helper(String s,TrieNode node,int i) {if (i >=s.length()) {node.endOfString=true;return; }char character =s.charAt(i);int ascii = (int) character;if (ascii >=48&& ascii <=57) { ascii -=48; //Since ascii = 48 corresponds to index 0 in the children array } elseif (ascii >=65&& ascii <=90) { ascii -=55; //Since ascii = 65 corresponds to index 10 in the children array } elseif (ascii >=97&& ascii <=122) { ascii -=61; }if (node.children[ascii] ==null) node.children[ascii] =newTrieNode(Character.toString(character));insert_helper(s,node.children[ascii], i +1); } /** * Checks whether string s exists inside the Trie or not. * * @return whether string s is inside the Trie */booleancontains(String s) {// TODOreturncontains_helper(s,this.root,0); }publicbooleancontains_helper(String s,TrieNode node,int i) {if (i >=s.length()) returnnode.endOfString;char character =s.charAt(i);int ascii = (int) character;if (ascii >=48&& ascii <=57) { ascii -=48; //Since ascii = 48 corresponds to index 0 in the children array } elseif (ascii >=65&& ascii <=90) { ascii -=55; //Since ascii = 65 corresponds to index 10 in the children array } elseif (ascii >=97&& ascii <=122) { ascii -=61; }if (node.children[ascii] ==null) returnfalse;elsereturncontains_helper(s,node.children[ascii], i +1); } /** * Searches for strings with prefix matching the specified pattern sorted by lexicographical order. This inserts the * results into the specified ArrayList. Only returns at most the first limit results. * * @param s pattern to match prefixes with * @param results array to add the results into * @param limit max number of strings to add into results */voidprefixSearch(String s,ArrayList<String> results,int limit) {// TODOif (results.size() >= limit) return;prefixSearchHelper(s, results, limit,0,this.root,new StringBuilder()); }voidprefixSearchHelper(String s,ArrayList<String> results,int limit,int index,TrieNode node,StringBuilder curr) {if (node ==null||results.size() > limit) return;// If you are at a node, it means that you are meant to be there. So, add that letter to curr without checking// Then continue checking for the restcurr.append(node.c);if (node.endOfString&& index >=s.length()) results.add(String.valueOf(curr));// we have already finished the prefix, just find all possible strings in the trie rooted at node// sort of DFS// We exploit the lazy evaluation of logical operators in java belowif (index >=s.length() ||s.charAt(index) == WILDCARD) {for (TrieNode child :node.children) {if (results.size() > limit) return;prefixSearchHelper(s, results, limit, index +1, child,new StringBuilder(curr)); } }else {// we still need to stick to finding the prefixchar character =s.charAt(index);int ascii = (int) character;if (ascii >=48&& ascii <=57) { ascii -=48; //Since ascii = 48 corresponds to index 0 in the children array } elseif (ascii >=65&& ascii <=90) { ascii -=55; //Since ascii = 65 corresponds to index 10 in the children array } elseif (ascii >=97&& ascii <=122) { ascii -=61; }if (node.children[ascii] ==null) return;prefixSearchHelper(s, results, limit, index +1,node.children[ascii], curr); } }// Simplifies function call by initializing an empty array to store the results.String[] prefixSearch(String s,int limit) {ArrayList<String> results =newArrayList<String>();prefixSearch(s, results, limit);returnresults.toArray(newString[0]); }publicstaticvoidmain(String[] args) {Trie t =newTrie();t.insert("peter");t.insert("piper");t.insert("picked");t.insert("a");t.insert("peck");t.insert("of");t.insert("pickled");t.insert("peppers");t.insert("pepppito");t.insert("pepi");t.insert("pik");System.out.println(t.contains("peter"));// String[] result1 = t.prefixSearch("pe", 10);// for (String s : result1) {// System.out.println(s);// }// String[] result2 = t.prefixSearch("pe.", 10);// for (String s : result2) {// System.out.println(s);// }// String[] result3 = t.prefixSearch(".e.p", 10);// for (String s : result3) {// System.out.println(s);// }// result1 should be:// ["peck", "pepi", "peppers", "pepppito", "peter"]// result2 should contain the same elements with result1 but may be ordered arbitrarily }}