Yeah, it was a lot of time since I was writing here; I'm reading now about
Pattern Classification (last week) and now I'm reading about Information Retrieval, in a good introductory book (the first two chapter are a bit messy and without a goal, but explores the theory bases for IR):
Modern Information Retrieval, Ten minutes ago I was coding the Levinshtein Distance Algorithm, but it wasn't working because I was comparing the strings in differents places, so I looked some implementations and I get this:
http://www.merriampark.com/ld.htm, coded in C++, VB and Java.
My PHP implementation according to the IR Book is here:
http://www.danguer.com/code/ir/levenshtein.php?string1=surgery&string2=survey, of course, you must substitute the
string1
and
string2
with the values you want (note, my implementation doesn't allow empty string which can be sanned in the code and my implementation returns all the matrix, if you want only the distance, you have to do change:
return $C;
for
return $C[$n][$m];
).
The source code is here:
http://www.danguer.com/code/ir/levenshtein.phpsThere's a lot of others implementations and changes only a bit (for example they use the
Kronecker delta function).
Only to occupy more space here is my code:
function LD($string1, $string2)
{
$m = strlen($string1);
$n = strlen($string2);
$C = array();
for ($i=0;$i<=$n;$i++)
for ($j=0;$j<=$m;$j++)
{
if ($i == 0)
$C[0][$j] = 0;
else if ($j == 0)
$C[$i][0] = $i;
else
{
if ($string1[$j-1] == $string2[$i-1])
$value = $C[$i-1][$j-1];
else
$value = 1 + min($C[$i-1][$j], $C[$i][$j-1], $C[$i-1][$j-1]);
$C[$i][$j] = $value;
}
}
return $C;
}
I'm also submiting photos into
deviantart.com, you can find my artistic side in there:
http://danguer.deviantart.com