流水沉微

208. Implement Trie (Prefix Tree)

Explain

Implement a trie with insert, search, and startsWith methods.

Example
Trie trie = new Trie();

trie.insert("apple");
trie.search("apple");   // returns true
trie.search("app");     // returns false
trie.startsWith("app"); // returns true
trie.insert("app");
trie.search("app");     // returns true
Note

You may assume that all inputs are consist of lowercase letters a-z. All inputs are guaranteed to be non-empty strings.

Solve

class Node:
    def __init__(self):
        self.children = {}
        self.is_word = False

class Trie:

    def __init__(self):
        """
        Initialize your data structure here.
        """
        self.root = Node()
  
    def insert(self, word: str) -> None:
        """
        Inserts a word into the trie.
        """
        node=self.root
        for i in word:
            if i not in node.children:
                node.children[i]=Node()
            node=node.children[i]
        node.is_word=True

    def search(self, word: str) -> bool:
        """
        Returns if the word is in the trie.
        """
        node = self.root
        for i in word:
            if i not in node.children:
                return False
            node = node.children[i]
        return node.is_word
  
    def startsWith(self, prefix: str) -> bool:
        """
        Returns if there is any word in the trie that starts with the given prefix.
        """
        node = self.root
        for i in prefix:
            if i not in node.children:
                return False
            node = node.children[i]
        return True
  
# Your Trie object will be instantiated and called as such:
# obj = Trie()
# obj.insert(word)
# param_2 = obj.search(word)
# param_3 = obj.startsWith(prefix)

主要思路是每个节点都用一个dict来包含所有的子节点,同时节点上的is_word字段表示到节点为止是否是一个完整的单词

Performance

Runtime: 244 ms, faster than 48.31% of Python3 online submissions for LRU Cache.

Memory Usage: 23.2 MB, less than 6.06% of Python3 online submissions for LRU Cache.

Improvement I

运行速度还行,但是内存明显还有可以优化的地方,主要是每个节点都生成了一个新的Node来存数据,实际上我们并不需要一个新的Node,需要的是一个dict,那如何判断当前节点是否是一个完整的word呢?多加一个标志节点即可,这里使用了#

Improvemed solution

class Trie:

    def __init__(self):
        """
        Initialize your data structure here.
        """
        self.root = {}
  
    def insert(self, word: str) -> None:
        """
        Inserts a word into the trie.
        """
        node = self.root
        for i in word:
            if i not in node:
                node[i] = {}
            node = node[i]
        node['#'] = True
        
    def search(self, word: str) -> bool:
        """
        Returns if the word is in the trie.
        """
        node = self.root
        for i in word:
            if i not in node:
                return False
            node = node[i]
        return '#' in node
        
    def startsWith(self, prefix: str) -> bool:
        """
        Returns if there is any word in the trie that starts with the given prefix.
        """
        node = self.root
        for i in prefix:
            if i not in node:
                return False
            node = node[i]
        return True   

Improved Performance

Runtime: 124 ms, faster than 99.62% of Python3 online submissions for Implement Trie (Prefix Tree).

Memory Usage: 27.4 MB, less than 66.67% of Python3 online submissions for Implement Trie (Prefix Tree).

Improvement II

去看了下别人的解答,还有一种更加极端的优化运行时间的做法,做两个set:element_set/prefix_set,插入时把word直接插入到element_set,然后遍历word,把word的所有prefix都放到prefix_set,这样不管是search还是startsWith都是O(1),但是空间非常大

Improvemed solution

class Trie:
    def __init__(self):
        """
        Initialize your data structure here.
        """
        self.prefix_set = set()
        self.element_set = set()
        
    def insert(self, word: str) -> None:
        """
        Inserts a word into the trie.
        """
        if word:
            if word not in self.element_set:
                self.element_set.add(word)
                for i in range(len(word)+1):
                    self.prefix_set.add(word[0:i])

    def search(self, word: str) -> bool:
        """
        Returns if the word is in the trie.
        """
        return word in self.element_set
        
    def startsWith(self, prefix: str) -> bool:
        """
        Returns if there is any word in the trie that starts with the given prefix.
        """
        return prefix in self.prefix_set

Improved Performance

Runtime: 172 ms, faster than 79.67% of Python3 online submissions for Implement Trie (Prefix Tree).

Memory Usage: 24.5 MB, less than 85.19% of Python3 online submissions for Implement Trie (Prefix Tree).

速度比想象得慢,内存比想象得少,大概是测试数据集的问题吧,在leetcode的测试结果里这个是内存占用最少的

Improvement III

试了下把 set 改为 dict,居然速度和内存都有一点点提升,有点意思

Improvemed solution

class Trie:
    def __init__(self):
        """
        Initialize your data structure here.
        """
        self.prefix_set = {}
        self.element_set = {}
        
    def insert(self, word: str) -> None:
        """
        Inserts a word into the trie.
        """
        if word:
            if word not in self.element_set:
                self.element_set[word] = 1
                for i in range(len(word)+1):
                    self.prefix_set[word[0:i]] = 1

    def search(self, word: str) -> bool:
        """
        Returns if the word is in the trie.
        """
        return word in self.element_set
        

    def startsWith(self, prefix: str) -> bool:
        """
        Returns if there is any word in the trie that starts with the given prefix.
        """
        return prefix in self.prefix_set

Improved Performance

Runtime: 168 ms, faster than 80.12% of Python3 online submissions for Implement Trie (Prefix Tree).

Memory Usage: 23.6 MB, less than 96.30% of Python3 online submissions for Implement Trie (Prefix Tree).