Re: Corpora: Fast edit distance algorithms

Bruce L. Lambert (lambertb@uic.edu)
Fri, 29 Oct 1999 10:59:36 -0500

Mark,

Edit distance is inherently (algorithmically) complex, so it will never be
really FAST. I've optimized my edit distance code (in Lisp) as much as I
can, and it's still not fast. The best thing to do if you really need speed
is NOT to use edit distance, but use ngram methods instead. They can be
made really fast with special indexing.

Still, have a look at

Stephen's "String Searching Algorithms" or
Aoe's "Computer Algorithms: String pattern matching strategies"

for references.

-bruce

At 11:26 AM 10/29/99 -0400, you wrote:
>Hello, corpora list-
> Can anyone point me to relatively recent work (roughly
>the last five years) on fast methods of computing edit distance?
>I'm interested in algorithms that compare pairs of individual words
>(not algorithms that search long text strings for approximate matches
>of words). Opinions on fastest methods welcome also. I'll post a
>summary of responses to the list. Thanks-
>
>Mark
>
>-------------------------
>Dr. Mark Lewellen
>L&H-AppTek
>mlewellen@apptek.com
>