Re: Check if files equal

Hi, Kat wrote:

> On 7 Jul 2002, at 1:51, 10963508 at europeonline.com wrote:
                          ^^^^^^^^
It would be nice to see a name here. (Just my opinion.)

>> What is the fastest way of checking if two very large files (~500 MB) are
>> equal?

I assume you mean equal content, not equal name, equal date, etc.

>> I was thinking about this:
>> -name

The name says nothing about the content.

>> -size
>> -date last modified

The date says nothing about the content either.

>> -pick about 10 random positions and check if bytes at those positions in
>> both files match.

> I would not trust those tests at all.

First, I would compare the sizes of the files; this is very fast.
Whether this comparison can be trusted or not depends on its result!
If the files don't have the same size, it's 100% certain that they are
not equal. If they have the same size, further testing is needed.
The same logic goes for CRC tests and the comparison of random bytes:
a mismatch proves the files differ, but a match proves nothing.
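
To make the "trust only a negative result" idea concrete, here is a
minimal sketch in Python (my choice of language, not something from
the original question). The function name quick_reject and the
parameters samples and seed are made up for illustration. It returns
True only when the files are certainly different; False just means
"undecided, keep testing".

```python
import os
import random

def quick_reject(path_a, path_b, samples=10, seed=None):
    """Return True if the files certainly differ, False if still undecided."""
    size = os.path.getsize(path_a)
    # Different sizes: the contents cannot be equal.
    if size != os.path.getsize(path_b):
        return True
    # Two empty files have nothing to sample.
    if size == 0:
        return False
    rng = random.Random(seed)
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        for _ in range(samples):
            # Compare one byte at the same random offset in both files.
            pos = rng.randrange(size)
            fa.seek(pos)
            fb.seek(pos)
            if fa.read(1) != fb.read(1):
                return True
    # No difference found, which proves nothing about equality.
    return False
```

With only 10 sampled bytes out of ~500 MB, a False result here is very
weak evidence, so it should only be used to skip the expensive test
when it returns True.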

>> Is there any better and faster way that I'm not aware of?

I think it would be best to run some _fast_ tests first, which will
detect unequal files with some probability. (I don't know how fast CRC
testing is.)
Then, if these tests didn't prove that the files are unequal, more
precise tests must follow. Of course, the most precise test is this:

> Open file
> while not eof do
> Read them in, one buffer size at a time, 
> compare, 
> if not equal { tell me it's not equal, abort}
> end while
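
Kat's loop above could be written roughly like this in Python. This is
only a sketch: the name files_equal and the 1 MB default buffer size
are my assumptions, not part of Kat's post, and it assumes both paths
are ordinary files on disk.

```python
def files_equal(path_a, path_b, bufsize=1024 * 1024):
    """Compare two files buffer by buffer; stop at the first difference."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(bufsize)
            chunk_b = fb.read(bufsize)
            # Any differing buffer (including different lengths at the
            # end of the shorter file) means the files are not equal.
            if chunk_a != chunk_b:
                return False
            # Both reads returned nothing: end of both files reached.
            if not chunk_a:
                return True
```

Python's standard library offers the same kind of byte-for-byte check
as filecmp.cmp(path_a, path_b, shallow=False), so in practice one may
not need to write the loop by hand.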

> Kat

Best regards,
   Juergen
