1000x Faster Spelling Correction algorithm | FAROO Blog
3. Symmetric Delete Spelling Correction (FAROO)
Generate terms with an edit distance <=2 (deletes only) from each dictionary term and add them together with the original term to the dictionary. This has to be done only once during a pre-calculation step.
Generate terms with an edit distance <=2 (deletes only) from the input term and search them in the dictionary.
For a word of length n, an alphabet size of a, an edit distance of 1, there will be just n deletions, for a total of n terms at search time.
Comparison to other approaches
BK-Trees have a search time of O(log dictionary_size), whereas our algorithm is constant time ( O(1) time ), i.e. independent of the dictionary size.
Tries have a comparable search performance to our approach. But a Trie is a prefix tree, which requires a common prefix. This makes it suitable for autocomplete or search suggestions, but not applicable for spell checking. If your typing error is e.g. in the first letter, than you have no common prefix, hence the Trie will not work for spelling correction.
http://blog.faroo.com/2012/06/24/1000x-faster-spelling-correction-source-code-released/
https://github.com/gpranav88/symspell/blob/master/SymSpell.java
Read full article from 1000x Faster Spelling Correction algorithm | FAROO Blog
3. Symmetric Delete Spelling Correction (FAROO)
Generate terms with an edit distance <=2 (deletes only) from each dictionary term and add them together with the original term to the dictionary. This has to be done only once during a pre-calculation step.
Generate terms with an edit distance <=2 (deletes only) from the input term and search them in the dictionary.
For a word of length n, an alphabet size of a, an edit distance of 1, there will be just n deletions, for a total of n terms at search time.
Comparison to other approaches
BK-Trees have a search time of O(log dictionary_size), whereas our algorithm is constant time ( O(1) time ), i.e. independent of the dictionary size.
Tries have a comparable search performance to our approach. But a Trie is a prefix tree, which requires a common prefix. This makes it suitable for autocomplete or search suggestions, but not applicable for spell checking. If your typing error is e.g. in the first letter, than you have no common prefix, hence the Trie will not work for spelling correction.
http://blog.faroo.com/2012/06/24/1000x-faster-spelling-correction-source-code-released/
https://github.com/gpranav88/symspell/blob/master/SymSpell.java
Read full article from 1000x Faster Spelling Correction algorithm | FAROO Blog