
Complexity Analysis




Exploding Search has two stages: a construction stage and a searching stage. For the analysis, assume that a symbol is a letter of the English alphabet, and furthermore assume lowercase letters only. The last assumption is not that unreasonable: when searching we often don't want to distinguish between, for example, "eaRth", "earth" and "Earth".

Searching Stage

The searching stage runs in constant time per input symbol. For each input symbol we perform a single lookup in the $children$ dictionary/array to find the next node, hence constant time. After each input symbol we just return the whole $results$ list as-is; we never have to traverse it.
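The per-symbol lookup can be sketched as follows. The node layout (a `children` dict plus a `results` list) follows the description above, but the concrete Python names are assumptions, not the article's actual code.

```python
# Minimal sketch of the searching stage. Each node is a dict with a
# `children` map and a `results` list (names are assumptions).
def search(root, query):
    node = root
    for symbol in query:
        # One O(1) dictionary lookup per input symbol.
        node = node["children"].get(symbol)
        if node is None:
            return []
    # Return the whole results list as-is; no traversal needed.
    return node["results"]

# Tiny hand-built trie for the single key "at":
leaf = {"children": {}, "results": ["record-1"]}
mid = {"children": {"t": leaf}, "results": ["record-1"]}
root = {"children": {"a": mid}, "results": []}
print(search(root, "at"))  # ['record-1']
```

Note that the work per symbol is independent of both the number of stored keys and the size of the results returned.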

Construction Stage - Time Complexity

We traverse every symbol of every word in every key of every record exactly once. That is, with $C$ the collection of records, we perform $$ \sum_{r \in C} \sum_{word \in r.key} word.length $$ iterations.
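The construction loop can be sketched directly from the sum above. The record layout (a `key` string of space-separated words) and the node shape are assumptions made for illustration.

```python
# Sketch of the construction stage. One inner iteration per symbol of
# each word in each key, matching the double sum in the text.
def make_node():
    return {"children": {}, "results": []}

def build(records):
    root = make_node()
    for record in records:                   # r in C
        for word in record["key"].split():   # word in r.key
            node = root
            for symbol in word.lower():      # word.length iterations
                node = node["children"].setdefault(symbol, make_node())
                node["results"].append(record)
    return root

records = [{"key": "blue earth"}, {"key": "earth"}]
root = build(records)
# Total inner iterations: 4 + 5 + 5 = 14.
```

`setdefault` reuses an existing child node when the paths of two words overlap, so repeated prefixes cost no extra tree nodes.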

Construction Stage - Space Complexity

This one is the main reason I wrote this article. Given that the alphabet size is 26, the tree has a worst-case space usage of $26^{|longest\_word|}$. But given that we search in a space where keys are taken from the English dictionary, the length of the longest English word gives us a constant bound. Furthermore, say we disregard the chemical names and take the 45-letter disease name as the longest word. Even if this word appears in several keys, it is recorded as only one path, since every letter overlaps. And distinct words only add nodes where their paths diverge: $k$ words of length 45 with no shared prefixes contribute just $45 \cdot k$ nodes, so reaching anywhere near $26^{45}$ nodes would require on the order of $26^{45}$ distinct words. With a fixed dictionary, the space usage of the tree itself is therefore constant.
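The overlap argument can be checked directly. The insert/count helpers below are illustrative assumptions, not part of the article's code; the 45-letter word is pneumonoultramicroscopicsilicovolcanoconiosis.

```python
# Sketch: repeated insertions of the same word share one path, so the
# tree itself stays small however often the word occurs.
def insert(root, word):
    node = root
    for symbol in word:
        node = node.setdefault(symbol, {})

def count_nodes(node):
    return sum(1 + count_nodes(child) for child in node.values())

word = "pneumonoultramicroscopicsilicovolcanoconiosis"  # 45 letters
root = {}
for _ in range(100):
    insert(root, word)
print(count_nodes(root))  # 45: one node per letter, not per occurrence
```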

However, we also need to account for the $results$ list on each node. For each symbol of each word in each key we store one pointer, so the space usage of these lists is $$ \left ( \sum_{r \in C} \sum_{word \in r.key} word.length \right ) \cdot ptr\_size $$
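As a quick check of the formula, assuming records whose `key` is a space-separated string of words:

```python
# One pointer stored per symbol of each word in each key.
records = [{"key": "blue earth"}, {"key": "earth"}]
pointers = sum(len(word) for r in records for word in r["key"].split())
print(pointers)  # 4 + 5 + 5 = 14 pointers, each of size ptr_size
```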

And that is that.
