Re: Rob's going to hate me... (Remainder bug)
- Posted by kbochert at copper.net Nov 07, 2003
- 606 views
On 7 Nov 2003 at 15:11, Robert Craig wrote: > > > kbochert at copper.net wrote: > >>From a number of sources: > > "The fmod() function returns the value x - i * y, for some > > integer i, such that, if y is non-zero, the result has the > > same sign as x and magnitude less than the > > magnitude of y. > > > > > > in Dremainder(): > > > > double x, y; > > if (b->dbl < 1) { // fmod() seems to sometimes goof this! > > x = a->dbl < 0? -a->dbl: a->dbl; > > y = b->dbl < 0? -b->dbl: b->dbl; > > x = x - floor(x / y) * y; > > if (a->dbl < 0) // sign matches a > > return (object)NewDouble(-x); > > return (object)NewDouble(x); > > } > > > > Works for the (100, .01) case (and others I have tried). > > This leads me to believe that it is a bug in the implementation > > of fmod() (In Intel's hardware?). > > I doubt it. > > > Maybe remainders of fractional divisors are seldom done?? > > > > Can also be done in Eu code, of course. > > > > Ktb > > > > Note that the above code is invoked for both fractional and negative > > divisors -- both work fine. > > I'm sure, by changing the way something is calculated, > you can often eliminate anomalies for some cases. > But you can't eliminate the fundamental mathematical > problem with finite-precision floating-point. > > Because it only uses a finite number of digits, > Intel (and most other) floating-point hardware can't > represent all numbers with perfect accuracy. > As humans who use the decimal system we are > surprised to find that numbers like .1 .01 etc. > can't be represented perfectly in Intel's > binary floating-point system. > Instead, the hardware uses something like .01 +/- epsilon to > represent .01, where epsilon is a very tiny positive number. > > In remainder(100, .01), > suppose the hardware uses .01 + epsilon. > To calculate the remainder after dividing 100 by > (.01 + epsilon), we see that > 99 * (.01 + epsilon) = .99 + 99 * epsilon > which is as high as we can get before we exceed 100. > So we will say: > 100 = 99 * (.01 + epsilon) > with a remainder of .01 - 99 * epsilon > When we print this remainder to (say) 10 digits, > we will see: 0.01, a far cry from the 0 we expected. > > If on the other hand, the machine uses .01 - epsilon, > We will have: > 100 = 100 * (.01 - epsilon) > with a remainder of 100 * epsilon > When we print this remainder, we will see 0 as we expected. > > I still like the simpler example: > > if 1.0 != .1 + .1 + .1 + .1 + .1 + > .1 + .1 + .1 + .1 + .1 then > puts(1, "Not accurate!!!\n") > end if > > Regards, > Rob Craig > Rapid Deployment Software > http://www.RapidEuphoria.com > It seems to me that given that fmod(a,b) is defined as returning a value which is less than b, failure to do so is not just inaccurate but a real bug. >From a Google search: http://www.boic.com/b1mnum.pdf. (dated Nov 3 2003 !!) QUOTE: Appendix Microsoft Modulo Bugs Over the years of testing of the Number Class, we've discovered nasty bugs in Microsoft's fmod() function, which you can easily confirm for yourself. For example, using the Windows 95 Calculator accessory (evidently built on the same fmod function) try this simple calculation: 5.5 mod 1.1 Instead of giving the correct result, which should be 0, it shows the result as 1.1. Similar errors are not hard to find, e.g. try 5.1 mod 1.7, 49 mod 9.8, 21.9 mod 7.3, and 36 mod 7.2. Interestingly, this problem occurs only with certain choices of values and the pattern of failures is not trivially discerned, but there are obviously many more cases. In fact we found it increasingly rare not to encounter such errors with more decimal places. We were unable to find any reference to this disturbing bug in the Microsoft Knowledge Base. Considering how much flack Intel took over an even more obscure arithmetic bug in its Pentium processor, one can only wonder how Microsoft ever got away with this one. Fortunately, the problem seems to have been quietly fixed in Windows 98, but that's not much consolation to those of us who expect our programs to work properly under Windows 95 and early versions of Windows NT 4.0. In any case, rest assured that the Base/1 Number Class computes the modulo function correctly under any version of Windows. The modulo bug described above seems to have been fixed in the latest operating systems and service packs from Microsoft. However, here's a new one we discovered in recent testing of the Number Class: d1 = 8853.41959899; d2 = 2951.13986633; fmod( d1, d2 ) returns the same value as that of d2 instead of returning zero. In this example, d1 is a perfect multiple of d2 (d1 = d2 * 3). We got this error on Windows NT Service Pack 5 and 6a and Windows 2000 server. For some versions, the bug did NOT appear in the calculator program, but using fmod() ALWAYS produces this bug. The Number Class properly returns the correct remainder of zero UNQUOTE Ktb