Prefix Tree Optimization

To all lua gurus including Ansca staff:

I’m trying to implement a prefix tree (trie) to be used as my dictionary for a word game. It loads around 260k+ words from a flat file. Although I was able to build a program that works, it is slow. Running via simulator takes around 2.6 seconds to load and build the trie. In my iPod Touch 2G, it takes 40 seconds. I want this number to go down to about <10 seconds of loading time in the device.

So why am I doing this and not just use a simple SELECT statement in SQLite to check if words exist? Yes I could do that but I also want a find-all-possible-words solver at the end of each level. Got the idea from a discussion at StackOverflow.

Questions:

  • How do I optimize the trie to reduce loading time?
  • Does the bottleneck occur in the trie algorithm? Or is the io.lines() too slow? Or is the list just too big?

Many thanks for the help! Get the SOWPODS list here.

trie.lua
[lua]module(…, package.seeall)

function createTrie()

– PRIVATE
local trie = {}

local function addNode(word, position, node, wordLength)
if position <= wordLength then
local letter = string.sub(word, position, position)
local child = node[letter]

if child == nil then
child = {}
node[letter] = child
end

if position < wordLength then
addNode(word, (position + 1), child, wordLength)
else
child.isWord = true
end
else
return
end
end

– PUBLIC
function trie:addWord(word)
local letter = string.sub(word, 1, 1)
local node = trie[letter]
local wordLength = string.len(word)

if node == nil then
node = {}
trie[letter] = node
end

addNode(word, 2, node, wordLength)
end

return trie
end[/lua]

main.lua
[lua]local trie = require(‘trie’)
local prefixTree = trie.createTrie()
local path = system.pathForFile(“sowpods.txt”, system.ResourceDirectory)

for line in io.lines(path) do
prefixTree:addWord(line)
end[/lua]

Jon Danao [import]uid: 3544 topic_id: 2500 reply_id: 302500[/import]

jon…don’t parse this words file at all…let Lua do it natively by storing it in table format as so:

return (word1 = true, word2 = true, … word260000 = true}

when you require the above file, you instantly have a lua table in memory with zero code to run.

You can’t get any faster than that [import]uid: 6175 topic_id: 2500 reply_id: 7233[/import]

dgaedcke, though I was not convinced by your example, your explanation was spot on and gave me a very good idea. Dictionary now loading 5.3 seconds on the device! Woot!

Thanks!

Jon [import]uid: 3544 topic_id: 2500 reply_id: 7245[/import]

Please share your solution…if you’ve found a faster way than my suggestion, I’ll bow at your feet

btw, I just noticed that my first parenthesis was ( when it should have been { (a table constructor) [import]uid: 6175 topic_id: 2500 reply_id: 7246[/import]

You were absolutely right that the dictionary should be a lua table to begin with. But I still need it as a trie structure. Your example was not what I was looking for. So what I did was still parse the text file, create the trie and physically write that trie into another lua file. This new file needs no extra processing and can be imported as a native lua table as you suggested.

Here is a snippet of what my trie looks like:
[lua]local trie = {A={A={L={I={I={S={isWord=true},isWord=true}},S={isWord=true},isWord=true},H={E={D={isWord=true}}…[/lua]
[import]uid: 3544 topic_id: 2500 reply_id: 7247[/import]

Mine sharing the code that you used to create that? I’m also making a word game and need the same exact thing. [import]uid: 13780 topic_id: 2500 reply_id: 41153[/import]