Re: Text File Comparison


Please find below a corrected and improved version of my previous algorithm
on the subject.

--Program to show the differences between two files
--Author: R. M. Forno - Version 0.1 - 2002/02/10

--Returns a sequence of {flag, index} pairs:
--  { 1, k}  s[k] has no counterpart in t
--  {-1, k}  t[k] has no counterpart in s
--  { 0, k}  s[k] matches a line of t
function diff(sequence s, sequence t)
    integer i, lens, lent, max, min, a, b, z, n, m, topi
    sequence r, x
    lens = length(s)
    lent = length(t)
    max = lens + lent
    r = {}
    i = 1 --Current starting point in s
    z = 1 --Current starting point in t
    while i <= lens do
        min = max
        topi = lens + 1
        m = i
        while m < topi do
            x = s[m] --Speedwise
            n = min - m + i + z - 1
            if n > lent then
                n = lent
            end if
            for j = z to n do --Search for minimum slack
                if equal(x, t[j]) then
                    min = m - i + j - z --Update minimum slack
                    topi = min + i --Update upper limit for m
                    a = m
                    b = j
                    exit
                end if
            end for
            m += 1
        end while
        if min < max then
            for k = i to a - 1 do
                r = append(r, {1, k})
            end for
            for k = z to b - 1 do
                r = append(r, {-1, k})
            end for
            r = append(r, {0, a})
            i = a + 1 --Update starting points
            z = b + 1
        else
            exit --No further match
        end if
    end while
    for k = i to lens do --Last part
        r = append(r, {1, k})
    end for
    for k = z to lent do
        r = append(r, {-1, k})
    end for
    return r
end function

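--Reads a text file into a sequence of lines (each keeps its trailing newline)
function read_in(sequence fn)
    sequence s
    object x
    integer f
    s = {}
    f = open(fn, "r")
    if f < 0 then
        puts(2, "Error - cannot open " & fn & "\n")
        abort(1)
    end if
    while 1 do
        x = gets(f)
        if atom(x) then --gets() returns -1 at end of file
            exit
        end if
        s = append(s, x)
    end while
    close(f)
    return s
end function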

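--Prints the result of diff(): "> " marks lines found only in a,
--"< " marks lines found only in b, two spaces mark common lines
procedure out_diff(sequence r, sequence a, sequence b)
    sequence x
    integer n, z
    for i = 1 to length(r) do
        x = r[i]
        n = x[1]
        z = x[2]
        if n < 0 then
            puts(1, "< " & b[z])
        elsif n > 0 then
            puts(1, "> " & a[z])
        else
            puts(1, "  " & a[z])
        end if
    end for
end procedure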

--Example of usage

sequence a1, a2, r
a1 = read_in("file1")
a2 = read_in("file2")
r = diff(a1, a2)
out_diff(r, a1, a2)
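
A small check of the result format may help when adapting this. The snippet below is only a sketch: it assumes diff() and out_diff() from the listing above are already defined, and the names first, second and script are chosen purely for illustration. It can be run in place of the file-reading lines. Each entry of the result is a pair {flag, index}: 1 marks a line found only in the first sequence, -1 a line found only in the second, and 0 a line common to both (indexed in the first).

```euphoria
-- Assumes diff() and out_diff() from the listing above are already defined.
-- first, second and script are illustrative names, not part of the original.
sequence first, second, script
first  = {"a\n", "b\n", "c\n"}
second = {"a\n", "x\n", "c\n"}
script = diff(first, second)
-- script is {{0,1}, {1,2}, {-1,2}, {0,3}}
out_diff(script, first, second)
-- prints:
--   a
-- > b
-- < x
--   c
```

Note that out_diff() prints "> " for lines found only in its first sequence and "< " for lines found only in its second. If ">" should mean "added", as in the quoted request below, the newer file would presumably be passed first.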

----- Original Message -----
From: <petelomax at blueyonder.co.uk>
To: "EUforum" <EUforum at topica.com>
Sent: Friday, February 08, 2002 9:40 PM
Subject: Text File Comparison



Looking for a source file comparison utility - must be written in
Euphoria or a source I can translate.

Thinking out loud, it seems non-trivial to report the smallest
possible number of changed lines, which is what I want.

At the moment I'm struggling with the DOS fc utility, but I'd like
output similar to:

function fred()
sequence result
> integer i
result={}
< for i = 1 to 10
< if skip[i]=0 then
> i=1
> while i <= 10
> if skip[i]>0 then
> i+=skip[i]
> else
result&=i
> i+=1
end if
< end for
> end while
return result
end function

whereby ">" lines have been added & "<" removed.

Hopefully someone out there in the Linux world has the source of
"diff" (I think that is what it is called), which I suspect handles
this a lot better than I could starting from scratch.

Using fc I get a lot of false realignments on "end if", which makes the
output much larger than it ought to be. Raw performance is unlikely to
be an issue.

Pete
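
On the "smallest possible number of changed lines" point in the message above: one standard way to get a minimal set of changed lines is to compute a longest common subsequence (LCS) of the two files and report everything outside it. The sketch below is not the algorithm posted at the top of this message, and no claim is made that it matches what the Unix diff program does internally. lcs_diff is a hypothetical name, and the result uses the same {flag, index} pairs as diff(), so out_diff() should accept it unchanged. It builds a full table of (length(s)+1) by (length(t)+1) integers, which is affordable only if, as stated, raw performance is not an issue.

```euphoria
-- Sketch only: minimal line diff via a longest-common-subsequence table.
-- lcs_diff is a hypothetical name. Result format mirrors diff():
-- {1, k} = s[k] only in s, {-1, k} = t[k] only in t, {0, k} = s[k] is common.
function lcs_diff(sequence s, sequence t)
    integer n, m, i, j
    sequence len, r
    n = length(s)
    m = length(t)
    -- len[p][q] = LCS length of s[p..n] and t[q..m];
    -- the extra last row and column stay at zero
    len = repeat(repeat(0, m + 1), n + 1)
    for p = n to 1 by -1 do
        for q = m to 1 by -1 do
            if equal(s[p], t[q]) then
                len[p][q] = len[p+1][q+1] + 1
            elsif len[p+1][q] >= len[p][q+1] then
                len[p][q] = len[p+1][q]
            else
                len[p][q] = len[p][q+1]
            end if
        end for
    end for
    -- Walk the table from the top-left corner, emitting the edit script
    r = {}
    i = 1
    j = 1
    while i <= n and j <= m do
        if equal(s[i], t[j]) then
            r = append(r, {0, i})
            i += 1
            j += 1
        elsif len[i+1][j] >= len[i][j+1] then
            r = append(r, {1, i}) -- skipping s[i] loses nothing
            i += 1
        else
            r = append(r, {-1, j}) -- skipping t[j] loses nothing
            j += 1
        end if
    end while
    for k = i to n do -- leftovers of s
        r = append(r, {1, k})
    end for
    for k = j to m do -- leftovers of t
        r = append(r, {-1, k})
    end for
    return r
end function
```

Because every common line kept belongs to a longest common subsequence, the number of {1, k} and {-1, k} entries is as small as it can be for a line-by-line comparison. Which of several equally short scripts comes out depends on the tie-breaking in the two elsif tests.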
