1. 2.4 Official Release - machine level exception
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> Jul 09, 2003
- 536 views
Rob,

You are correct, I was running sanity.exw, probably some copy I made 18 months ago so I could double-click it to run it with exw. Quite the detective! I've deleted that file now. exw sanity.ex runs fine (I was kinda hoping it wouldn't).

I get a machine-level exception during the statement
pos[P_yto]-=rely
where
pos = {24646,7901,32476,9990}
rely = -3543
and P_yto is the constant 4

If I just insert eg 'if integer(rely) then end if' before that statement it works fine.

Worrying.

If making such a tiny alteration makes the problem go away, it will probably work fine on most other machines (or any with a slightly different kernel32.dll etc)...

Any ideas?

Pete
2. Re: 2.4 Official Release - machine level exception
- Posted by Robert Craig <rds at RapidEuphoria.com> Jul 09, 2003
- 539 views
Pete Lomax wrote:
> I get a machine-level exception during the statement
> pos[P_yto]-=rely
> where
> pos = {24646,7901,32476,9990}
> rely = -3543
> and P_yto is the constant 4
>
> If I just insert eg 'if integer(rely) then end if' before that
> statement it works fine.
>
> Worrying.
>
> If making such a tiny alteration makes the problem go away, it will
> probably work fine on most other machines (or any with a slightly
> different kernel32.dll etc)...
>
> Any ideas?

Your little code fragment above works fine for me. I assume it's part of a much bigger program that has pokes, calls to C routines etc.

Chances are, you have gone out of bounds with a poke, passed a bad value to a C routine, etc. The program could work fine on one run, but fail mysteriously on another, after any kind of trivial and seemingly irrelevant change. In this particular case you may be clobbering a Euphoria variable, stored in memory near one of your allocated blocks.

Perhaps this is showing up now with 2.4, since the memory is organized differently. When using Win32Lib, I think you are now more likely to trash a Euphoria variable when you go out of bounds, rather than trashing another memory block that you allocated.

I would at least try safe.e (see instructions at top of safe.e). Since Win32Lib has gone back to using Euphoria's allocate(), safe.e is now more effective. By default, in 2.4 Official, safe.e will only check the edges of blocks allocated by allocate(), so you shouldn't get any false alarms, but a lot of memory accesses will not be checked.

Regards,
Rob Craig
Rapid Deployment Software
http://www.RapidEuphoria.com
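For anyone wanting to repeat Rob's check, the fragment from Pete's post can be run on its own roughly as below. This is only a sketch: the declarations are assumed (the thread gives the values but not the types), and it uses 2.4-style separate declaration and assignment. Run in isolation like this it would not be expected to reproduce the crash, which is consistent with the fragment working fine for Rob.

```euphoria
-- Standalone version of the fragment from Pete's post.
-- Assumption: pos is a sequence and rely an integer; the post shows only values.
constant P_yto = 4

sequence pos
integer rely

pos = {24646, 7901, 32476, 9990}
rely = -3543

-- The statement reported to raise the machine-level exception
-- inside Pete's full program.
pos[P_yto] -= rely

? pos   -- element 4 becomes 9990 - (-3543) = 13533
```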
3. Re: 2.4 Official Release - machine level exception
- Posted by Robert Craig <rds at RapidEuphoria.com> Jul 09, 2003
- 499 views
Andy Serpa wrote:
> I still get a mystery crash once in a while. I can never isolate or
> duplicate it precisely (usually random numbers are involved and I can
> never find a seed that reproduces the problem), but it seems to happen
> in programs that sum up a large number of floating point values, often
> incredibly small values, and then do something like invert it. Can you
> think of a reason adding up a bunch of floats (there are probably
> precision issues here) would ever cause a problem?

No, I haven't come across anything like that.

> Sometimes a number
> that isn't quite zero is taken to be zero in some operations but not
> others (precision again, I assume). What happens if I do
> 1/not_quite_zero ?

There was a problem in 2.3, fixed in 2.4...

"bug fixed: Translated code compiled with Borland C was not producing INF's and NAN's, like Watcom and Lcc. Rather, it was crashing when a floating-point overflow (over 1e308), or an undefined f.p. result was calculated. The Interpreter Source Code was also corrected for those who wish to compile exw.exe using Borland. Thanks to Andy Serpa."

I know you are still using 2.3 for some things. I also noticed that Windows ME provides less detail when a crash occurs than Win98 did, so you might not see that it was a floating-point exception.

Regards,
Rob Craig
Rapid Deployment Software
http://www.RapidEuphoria.com
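Andy's 1/not_quite_zero question can be tried directly with a sketch like the one below. The variable names and magnitudes are made up for illustration. Going by the fix note Rob quotes, a build that follows the Watcom/Lcc behaviour should yield an INF here rather than a crash, while 2.3 translated code built with Borland C is the case that was reported to crash.

```euphoria
-- Hypothetical values, chosen so that the reciprocal exceeds 1e308.
atom not_quite_zero, inverted

not_quite_zero = 1e-200
not_quite_zero = not_quite_zero * 1e-120   -- roughly 1e-320: tiny but non-zero

inverted = 1 / not_quite_zero              -- overflows the floating-point range
? inverted
? not_quite_zero = 0                       -- prints whether the tiny value compares equal to zero
```

Printing the comparison alongside the reciprocal is one simple way to see whether a given operation is treating the small value as zero.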
4. Re: 2.4 Official Release - machine level exception
- Posted by Robert Craig <rds at RapidEuphoria.com> Jul 12, 2003
- 498 views
Pete Lomax wrote:
> I get a machine-level exception during the statement
> pos[P_yto]-=rely
> ...
> If I just insert eg 'if integer(rely) then end if' before that
> statement it works fine.
>
> Worrying.
>
> If making such a tiny alteration makes the problem go away, it will
> probably work fine on most other machines (or any with a slightly
> different kernel32.dll etc)...
>
> Any ideas?

I've found the bug. If you use a statement with subscripting on the left, and an assignment op, e.g. +=, -= etc., and the internal representation of the statement occurs at just the wrong point in my internal data structure, then you may get a machine exception on that statement.

The chances of this happening are zero for programs under a couple of thousand statements, and very remote, maybe 1 program in 1000, for larger programs that use subscripted assignment ops. (simple stuff like x += 1 is fine).

Any change that shifts the statement slightly in memory will make the bug go away. e.g. adding or removing some earlier code, turning trace or type_check on or off etc. If the bug is there, it will always crash on that statement. If it's not there, it will never crash. It's not random in that sense. The Translator is not affected.

The bug has been there ever since assignment operators were introduced in version 2.1, January 1999, and was never detected before now. The new ability in 2.4 to catch machine exceptions obviously helped a lot here. The fix will be in the next release.

Regards,
Rob Craig
Rapid Deployment Software
http://www.RapidEuphoria.com
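To make the affected category concrete, here is a small sketch contrasting the statement forms Rob mentions. The variable names are illustrative only, and the long-form rewrite is an assumption about what would count as unaffected: it avoids the assignment operator entirely, but per Rob's description any edit that shifts the statement in memory would also make the crash go away.

```euphoria
-- Illustrative names only; none of these come from Pete's program.
sequence s
integer x, i

s = {10, 20, 30}
x = 0
i = 2

x += 1            -- simple assignment op: described as fine
s[i] += 5         -- subscripted assignment op: the form the bug can hit
s[i] = s[i] + 5   -- long form without an assignment op (assumed unaffected)

? x
? s
```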
5. Re: 2.4 Official Release - machine level exception
- Posted by gertie at visionsix.com Jul 12, 2003
- 502 views
On 12 Jul 2003, at 1:27, Robert Craig wrote:

> > Pete Lomax wrote:
> > I get a machine-level exception during the statement
> > pos[P_yto]-=rely
> > ...
> > If I just insert eg 'if integer(rely) then end if' before that
> > statement it works fine.
> >
> > Worrying.
> >
> > If making such a tiny alteration makes the problem go away, it will
> > probably work fine on most other machines (or any with a slightly
> > different kernel32.dll etc)...
> >
> > Any ideas?
>
> I've found the bug. If you use a statement with
> subscripting on the left, and an assignment op,
> e.g. +=, -= etc., and the internal representation
> of the statement occurs at just the wrong point
> in my internal data structure, then you may get a
> machine exception on that statement. The chances of
> this happening are zero for programs under a couple of
> thousand statements, and very remote, maybe
> 1 program in 1000, for larger programs that use
> subscripted assignment ops. (simple stuff like x += 1 is fine).
> Any change that shifts the statement slightly in memory will
> make the bug go away. e.g. adding or removing some earlier code,
> turning trace or type_check on or off etc. If the bug is
> there, it will always crash on that statement.

Hmm, that might explain why some programs only run with trace, and crash without trace. I think i'll delete all those assignment ops.

Kat

> If it's not there, it will never crash.
> It's not random in that sense. The Translator is not affected.
> The bug has been there ever since assignment operators
> were introduced in version 2.1, January 1999, and was never
> detected before now. The new ability in 2.4 to catch
> machine exceptions obviously helped a lot here.
> The fix will be in the next release.
>
> Regards,
> Rob Craig
> Rapid Deployment Software
> http://www.RapidEuphoria.com
6. Re: 2.4 Official Release - machine level exception
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> Jul 12, 2003
- 491 views
On Sat, 12 Jul 2003 00:43:52 -0500, gertie at visionsix.com wrote:

>Hmm, that might explain why some programs only run with trace, and crash
>without trace. I think i'll delete all those assignment ops.

BTW, It's only subscripted assignment ops, Kat.

I spent pretty much all of Thursday on this, anyone running on 2.4 should not be worried at all (IMHO), since it identifies the line it is having a problem with interpreting correctly (2.4 tells you about the machine level exception), and virtually any change you make to the code makes it go away.

In contrast, a global edit (which I accept may be a valid option for pre-2.4 users) just might introduce a much more annoying bug, or indeed move a previously working subscripted assignment op in 3rd party code onto a critical boundary. (rare, but possible)

Same deal, I think with library authors: even if we assume zero introduced typos, if they chose to remove subscripted assignment ops they are just as likely (being highly unlikely in either case) to stop users' code working as to help them any.

Just my tuppenceworth,
Pete