
Euphoria Ticket #609: Interpreter does not handle scientific notation correctly

Reported by jaygade Feb 08, 2011
23 comments

When assigning a variable or constant as follows, the interpreter freezes up:

constant MIN_VAL1 = 2.0e0

This should be valid code for the number 2.0.

Further testing finds that 2.0e1 works, but 2.00e1 fails. 2.00e2 passes, but 2.000e2 fails. 12.000e2 passes but 12.0000e2 fails. Note also that 2e0 works.

The bug occurs whether the coefficient is positive or negative, but negative exponents seem to work. Seems to be a digit counting issue.

---- 
Version 
---------------------------- 
4.0.0  (115714454200, 2010-12-22) 
 
Operating System 
---------------------------- 
Platform: Linux, Build: Mars, 2.6.32-28-generic:0 
 
Include Directories 
---------------------------- 
1: /usr/share/euphoria/include/ 
2: /home/jason/euphoria/projects 
3: /usr/share/euphoria/include 
 
PATH 
---------------------------- 
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games

Details

Type: Bug Report
Severity: Normal
Category: Interpreter
Assigned To: unknown
Status: Fixed
Reported Release: 4.0.0
Fixed in SVN #:
View VCS: none
Milestone: 4.0.1

1. Comment by jaygade Feb 16, 2011

I've made a couple of changes in my (4.0.0 release) source that fix this bug and ensure that no number can be longer than 23 characters, including 'e' and signs.

The code still doesn't handle scientific notation numbers which are out of range, though. For example, running a test program setting a variable to 2.000000000000000e507 results in an output of 6.1886920947651566115458e-110.
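The missing range check can be sketched as follows. This is a minimal Python illustration, not the Euphoria front end: `parse_literal_checked` is a hypothetical name, and it relies on Python's `float()` overflowing to infinity for values beyond the 64-bit floating point range.

```python
import math
import sys

def parse_literal_checked(text):
    """Parse a decimal or scientific literal, rejecting values beyond float64 range.

    Hypothetical helper for illustration; not part of scanner.e or scinot.e.
    """
    value = float(text)  # Python's parser overflows to inf instead of raising
    if math.isinf(value):
        raise ValueError("number out of range: " + text)
    return value

print(sys.float_info.max)              # largest finite 64-bit float
print(parse_literal_checked("2.0e0"))  # in range, parses normally
try:
    parse_literal_checked("2.000000000000000e507")
except ValueError as err:
    print(err)                         # exponent 507 is far beyond the float64 limit
```

A front end with a check like this would report an error for 2.000000000000000e507 instead of silently producing an unrelated value.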

I haven't committed anything yet because I still can't compile the tip with what is currently in eubins. The compile process is freezing up on me.

2. Comment by DerekParnell Feb 21, 2011

The problem looks like it's in Matt's decimals_to_bits() function. I can't work out what it's supposed to be doing, so I'm not sure how to fix it.

3. Comment by jaygade Feb 21, 2011

I have a fix, but it's not complete. It fixes just this error, but there is no bounds checking to error out if the user enters an out-of-range scientific number.

I can patch in my fix I guess, and leave the bounds checking for later. That way I'm not just sitting on it.

I've been trying to think about workflow and how to write a good test & etc. In other words, stalling.

4. Comment by mattlewis Feb 21, 2011

Yeah, go ahead and commit the fix. I've been thinking that code needs some attention as far as extending it to parse 80-bit floating point numbers, and possibly moved to the standard library (though that stuff will be for 4.1, not 4.0.x).

5. Comment by jaygade Feb 21, 2011

I'm actually working on it now. Be up soon.

I'm still working on a good test but I guess I can upload the fix first and the test later. (I've already verified that the infinite loop is fixed when interpreted.)

6. Comment by jaygade Feb 21, 2011

Okay, finally got my sources uploaded. The diff output on the euphoria site looks like it has more changes than I personally put in and I don't understand that. Actually, thinking about it, it may be because I copied my changes over instead of patching them, but I had thought at the time that they were identical. If that's a problem then I or someone will have to fix that. Sorry.

Please confirm that you can have a variable, constant, or literal set to "1.0e0" or the like. See t_scientific.e for the specific values for which I tested. The original bug is that the decimals_to_bits() function in scinot.e, called by scientific_to_float64() in scinot.e, which is in turn called by my_sscanf() in scanner.e, was going into an infinite loop when the decimal part of a scientific notation number contained only zeros and the exponent was less than a certain number.
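To show how an all-zero decimal part can hang a conversion loop, here is a hypothetical Python sketch. It is not the code in scinot.e: `fraction_digits_to_bits` and its stopping rule (keep going until a fixed number of bits follow the first 1 bit) are assumptions made for illustration. With that stopping rule, an all-zero input never yields a 1 bit, so the loop would never end without an explicit zero check.

```python
def fraction_digits_to_bits(digits, precision=53):
    """Turn decimal fraction digits (e.g. [2, 5] for .25) into binary fraction bits.

    Hypothetical illustration only; the real decimals_to_bits() may differ.
    """
    # Zero check: an all-zero fraction never produces a 1 bit, so without
    # this guard the loop condition below would stay true forever.
    if not any(digits):
        return []
    digits = list(digits)
    bits = []
    # Continue until `precision` bits exist counting from the first 1 bit.
    while 1 not in bits or len(bits) - bits.index(1) < precision:
        carry = 0
        # Double the decimal fraction; the carry out of the top digit is the next bit.
        for i in range(len(digits) - 1, -1, -1):
            doubled = digits[i] * 2 + carry
            digits[i] = doubled % 10
            carry = doubled // 10
        bits.append(carry)
    return bits

print(fraction_digits_to_bits([2, 5])[:4])  # leading bits of .25
print(fraction_digits_to_bits([0, 0, 0]))   # all zeros: returns at once
```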

I don't know if there is a way to test for failure since a failure actually hangs the front end.

I don't know if the zero-checking code could have been better integrated into the existing code or if I did the "right thing".

Also, some limit-checking needs to be done to ensure that the front end chokes if the user enters an invalid number.

7. Comment by mattlewis Feb 22, 2011

I think some code changes were omitted. In particular, the variable repl does not appear to ever be defined. Definitely not fixed yet...

8. Comment by jimcbrown Feb 22, 2011

I recognize that code. I wrote that originally. I think I know exactly what happened here - in short some code from the 4.1 branch seems to have gotten mixed into this last commit.

I'll write up an email about this soon.

9. Comment by jaygade Feb 22, 2011

Okay, sorry about that.

10. Comment by jaygade Feb 22, 2011

If no one else fixes my error then I will fix it tonight. I think I know how to do it, plus it might be good practice with Mercurial. (Or more frustration.)

Plus I need to change my own change a little bit - the max length of a number string should be 24, not 23 in my_sscanf. Leading sign, 17 significant figures, decimal point, e, sign, 3 digit exponent. And it might even be longer than that with leading zeroes although that's kind of a stretch.
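The character count can be checked with a few lines of Python. This is only arithmetic on the parts listed above; the sample literal is made up for the purpose.

```python
# Widest literal under the accounting given above.
parts = {
    "leading sign": 1,
    "significant digits": 17,
    "decimal point": 1,
    "e": 1,
    "exponent sign": 1,
    "exponent digits": 3,
}
max_len = sum(parts.values())

# A made-up literal that uses every part at full width.
example = "-1.2345678901234567e-308"

print(max_len)       # 24
print(len(example))  # 24
```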

I guess that I can't count...

11. Comment by jaygade Feb 23, 2011

See: hg:euphoria/rev/18d942fc40f4

changeset: 4666:18d942fc40f4
branch: 4.0
tag: tip
parent: 4664:b9d08630810a
user: jaygade <jaygade@yahoo.com>
date: Tue Feb 22 21:48:21 2011 -0800
files: docs/release/4.0.1.txt source/scanner.e source/scinot.e tests/t_scientific.e
description: Fix ticket:609 handling zeroes in scientific notation

The front end was going into an infinite loop if the decimal part of a scientific number contained only zeroes, such as "2.0e0"

12. Comment by jaygade Feb 23, 2011

Okay, I think I did it right now. It looks like the changes went into 4.0 branch only, not yet merged into 4.1 (I'll leave that to others). No REPL stuff. I did miss a comment correction on my_sscanf() but that's minor.

My change does seem to break some tests which try and create numbers with greater than 17 significant figures; I didn't log them though.

I don't know if my erroneous branch needs to be closed or if it should just be hanging there as the mistake that it was.

13. Comment by mattlewis Feb 23, 2011

See: hg:euphoria/rev/e4f3efb7a34f

changeset: 4669:e4f3efb7a34f
tag: tip
parent: 4668:2a8964fed315
parent: 4666:18d942fc40f4
user: Matt Lewis
date: Wed Feb 23 06:08:37 2011 -0500
files: docs/release/4.0.1.txt source/scanner.e source/scinot.e
description: merge fix for ticket 609 into trunk

14. Comment by mattlewis Feb 25, 2011

I haven't investigated the changes, but the status should at least be "Fixed, Please Confirm."

15. Comment by DerekParnell Feb 25, 2011

It seems that scientific notation is working, but the other form of floating point notation is now failing.

 constant x = 2.8844991406148167646432766215602

<0121>:: number not formed correctly 
 constant x = 2.8844991406148167646432766215602 
                                              ^

This has to be shortened to ...

 constant x = 2.8844991406148167646432

before it works.

16. Comment by jaygade Feb 25, 2011

That was on purpose.

You can't make a number with more than 17 significant figures because the excess digits are lost. I assume that it is an error if you try to cram more significant figures into an atom than it can hold, at least until we get arbitrary precision numbers.

Right now scanner.e/my_sscanf rejects any string longer than 24 characters, which includes sign, 17 digits, decimal point, e, sign, and 3-digit exponent.

If that is wrong then it can be changed back to no limit or set to a different value. I'm not wedded to the idea.

17. Comment by jaygade Feb 25, 2011

Or maybe just silently truncate the extra digits so the code doesn't waste time converting insignificant figures?

18. Comment by jimcbrown Feb 25, 2011

The 64bit version already allows larger values. I think we're better off just silently losing precision / truncating values if we have to.
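As a small illustration of why extra digits can be dropped without harm, the following Python sketch parses the long literal from comment 15 and formats the result back with 17 significant digits. It uses Python's 64-bit floats as a stand-in for Euphoria atoms, which is an assumption about the 32-bit build's atom representation.

```python
long_literal = "2.8844991406148167646432766215602"

# Parsing keeps only what a 64-bit float can hold; the rest is lost silently.
x = float(long_literal)

# 17 significant digits are always enough to recover the same 64-bit float.
short_literal = "%.17g" % x
print(short_literal)
print(float(short_literal) == x)  # True
```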

19. Comment by mattlewis Feb 25, 2011

There are also numbers like: 0.000000000000000000000000000000001. Only one digit of precision, but very small.
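The point that string length and precision are different things can be sketched in Python. `significant_digits` is a hypothetical helper that counts mantissa digits after stripping the sign, the decimal point, and leading zeros (trailing zeros are counted as significant here).

```python
def significant_digits(literal):
    """Count mantissa digits, ignoring sign, decimal point, and leading zeros."""
    mantissa = literal.lower().split("e")[0]
    digits = mantissa.lstrip("+-").replace(".", "")
    return len(digits.lstrip("0"))

tiny = "0.000000000000000000000000000000001"
print(len(tiny) > 24)            # True: longer than the 24-character limit
print(significant_digits(tiny))  # 1
```

A limit on the number of characters therefore rejects this literal even though it carries a single significant digit.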

20. Comment by jaygade Feb 25, 2011

That makes sense, I guess, even though I would really hope that the author would use scientific notation instead in that case.

I'll remove the upper limit boundary.

21. Comment by jaygade Feb 26, 2011

See: hg:euphoria/rev/c2fd775fddce

changeset: 4675:c2fd775fddce
branch: 4.0
tag: tip
parent: 4673:e38ce6c0f0cc
user: jaygade <jaygade@yahoo.com>
date: Fri Feb 25 20:41:17 2011 -0800
files: source/scanner.e
description: Part of ticket:609. Removes upper limit of 24 characters from scanner.e/my_sscanf.

22. Comment by jaygade Feb 26, 2011

Uploaded for 4.0 branch only, not the default.

23. Comment by DerekParnell Feb 26, 2011

I added some more unit tests and allowed coders to express the value zero in scientific notation.

General S/N has the form a x 10^b which means that when a is zero, the result is zero regardless of the value of b.

Note that Normalized S/N requires a to have a minimum value of 1 (one).

So now, Euphoria allows literals in the form 0eN, where N can be any value, which evaluates to zero.
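The rule that a zero coefficient gives zero for any exponent can be checked with exact arithmetic. This is a Python sketch with a hypothetical `sci_value` helper; it only evaluates a x 10^b and says nothing about how the Euphoria scanner implements the rule.

```python
from fractions import Fraction

def sci_value(a, b):
    """Exact value of a x 10^b for an integer coefficient a and integer exponent b."""
    return Fraction(a) * Fraction(10) ** b

for exponent in (-400, -1, 0, 5, 400):
    print(exponent, sci_value(0, exponent) == 0)  # True for every exponent

print(float("0e5"), float("0.000e-12"))  # Python also reads these literals as zero
```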
