Re: String comparisons

new topic     » goto parent     » topic index » view thread      » older message » newer message

This is by Jeremy Cowgar

You count the number of steps to transform s1 to s2.

 
-- Levenshtein string distance calculation 
-- 
-- Author: 
--     Jeremy Cowgar < jeremy at cowgar dot com > 
-- 
-- Created: 
--     August 16, 2008 
-- 
-- Updated: 
--     August 16, 2008 
-- 
-- Copyright: 
--     This file is in the public domain 
-- 
 
include std/math.e 
 
--** 
-- Computes the minimum number of operations needed to transform ##s1## into ##s2## 
-- 
-- Example: 
--    
--   ? levenshtein("day", "may") -- 1 
--   ? levenshtein("cat", "dog") -- 3 
--    
-- 
--   Now, those are easy examples, but take this one for example: 
-- 
--    
--   ? levenshtein("Saturday", "Sunday")  -- 3 
--    
-- 
--   Yes, 3. Delete "a" and "t" in Saturday, (Surday) then change the "r" to a 
--   "n" and you have Sunday! 
-- 
-- See Also: 
--   http://en.wikipedia.org/wiki/Levenshtein_distance 
 
export function levenshtein(sequence s1, sequence s2) 
    integer n = length(s1) + 1, m = length(s2) + 1 
 
    if n = 1  then 
        return m-1 
    elsif m = 1 then 
        return n-1 
    end if 
 
    sequence d = repeat(repeat(0, m), n) 
    for i = 1 to n do 
        d[i][1] = i-1 
    end for 
 
    for j = 1 to m do 
        d[1][j] = j-1 
    end for 
 
    for i = 2 to n do 
        for j = 2 to m do 
            d[i][j] = min({ 
                d[i-1][j] + 1, 
                d[i][j-1] + 1, 
                d[i-1][j-1] + (s1[i-1] != s2[j-1]) 
            }) 
        end for 
    end for 
 
    return d[n][m] 
end function 

_tom

new topic     » goto parent     » topic index » view thread      » older message » newer message

Search



Quick Links

User menu

Not signed in.

Misc Menu