Euphoria Ticket #259: filesys:checksum() fails

When comparing files that are identical (or even the same file across multiple calls) with a size > 1, the checksums don't match. When size = 1, checksum() correctly identifies the files as the same.

Failing tests were added to t_filesys.e in svn:3664.

Details

Type: Bug Report
Severity: Normal
Category: Library Routine
Assigned To: DerekParnell
Status: Fixed
Reported Release: 3665
Fixed in SVN #: 3716, 3691
Milestone: 4.0.0RC1

1. Comment by jimcbrown Oct 30, 2010

This appears to be a bug in hash(), demonstrated by this example program:

atom jx = 2514801793
sequence data = "itial_direc"
jx = hash(jx, data)
? jx
jx = hash(jx, data)
? jx
jx = hash(jx, data)
? jx
jx = hash(jx, data)
? jx

2. Comment by jimcbrown Oct 30, 2010

I changed checksum() not to use hash() in svn:3691. This works around the bug, but hash() itself still needs to be fixed. Leaving this bug open until hash() is fixed (and then we can revert filesys.e ...)

3. Comment by DerekParnell Oct 30, 2010

That example is not a bug because each time you call hash(), you are supplying it with a different hashing key, so the result is different for each call.

What were you expecting?

4. Comment by jimcbrown Oct 30, 2010

I was NOT expecting to see this:

first run $ eui example.e 1763789629 1404177190 1379719974 1398610726

second run $ eui example.e 1763789629 1182927653 1192870693 1189561125

third run $ eui example.e 1763789629 3759278887 3760266023 3774716711

fourth run $ eui example.e 1763789629 1074924326 1101183782 1104395046

Why does the result of hash() change each time on the second call? The input is identical.

5. Comment by DerekParnell Oct 30, 2010

When I run this I get ...

c:\temp>eui hasher 
1763789629 
2620525367 
2619396919 
2626900791 
 
c:\temp>eui hasher 
1763789629 
2620525367 
2619396919 
2626900791 
 
c:\temp>eui hasher 
1763789629 
2620525367 
2619396919 
2626900791 
 
c:\temp>eui hasher 
1763789629 
2620525367 
2619396919 
2626900791 

But there is a problem with hash(). Still looking into it.

6. Comment by jimcbrown Oct 30, 2010

Ok, let me fix up the formatting...

I was NOT expecting to see this:

-- first run 
$ eui example.e 
1763789629 
1404177190 
1379719974 
1398610726 
 
--second run 
$ eui example.e 
1763789629 
1182927653 
1192870693 
1189561125 
 
--third run 
$ eui example.e 
1763789629 
3759278887 
3760266023 
3774716711 
 
--fourth run 
$ eui example.e 
1763789629 
1074924326 
1101183782 
1104395046 

Why does the result of hash() change each time on the second call? The input is identical.
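
The expectation behind this question can be sketched as follows. Within one run the chained values differ, because each call is fed the previous result as its key, but repeating the whole chain from the same starting values should reproduce the same sequence. This is only an illustration: toy_hash() and run_chain() are hypothetical names, and toy_hash() is a simple stand-in, not Euphoria's built-in hash().

```euphoria
-- toy_hash() is a hypothetical stand-in, NOT the built-in hash().
-- It folds each character of data into the key deterministically.
function toy_hash(atom key, sequence data)
    atom h = key
    for i = 1 to length(data) do
        h = remainder(h * 31 + data[i], 65536)
    end for
    return h
end function

-- Chain four calls, feeding each result back in as the next key,
-- in the same spirit as the example program in this ticket.
function run_chain(atom start, sequence data)
    sequence results = {}
    atom k = start
    for i = 1 to 4 do
        results = append(results, toy_hash(k, data))
        k = results[$]
    end for
    return results
end function

sequence first_run  = run_chain(12345, "itial_direc")
sequence second_run = run_chain(12345, "itial_direc")

-- Same starting key and same data: the two chains should match.
if equal(first_run, second_run) then
    puts(1, "same sequence on both runs\n")
else
    puts(1, "sequences differ\n")
end if
```

With a deterministic function, the two chains compare equal. The outputs reported above differ from the second value onward, which is what points to a defect rather than to a changing key.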

7. Comment by DerekParnell Oct 30, 2010

Because there's a bug in hash() :)

But you see that when I ran your example, the output was consistent. However, other tests I've done just now do show different outputs for the same input and as you say, that is not right.

8. Comment by DerekParnell Oct 30, 2010

The bug is actually in the parser or backend. The problem happens when assigning the output from a function to one of the parameters to that function call.

e.g. jx = func(jx)

In some circumstances, the returned value is not assigned correctly, or something like that.

The fix for this ticket will be a workaround in filesys.e:checksum() to avoid this type of construct, but another ticket has to be created for the underlying issue.
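
A minimal sketch of one way to avoid that type of construct is to route the result through a separate variable. This is not the actual change made to filesys.e, and whether this particular form sidesteps the underlying defect is an assumption. next_value() and tmp are hypothetical names used only for illustration.

```euphoria
-- next_value() is a hypothetical stand-in for any function whose
-- result is fed back into the variable passed as its argument.
function next_value(atom x)
    return x * 2 + 1
end function

atom jx = 5
atom tmp

-- Instead of: jx = next_value(jx)
tmp = next_value(jx)  -- store the result in a separate variable first
jx = tmp              -- then copy it back
? jx                  -- prints 11
```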

9. Comment by jimcbrown Oct 30, 2010

Confirmed. New filesys.e passes all tests.
