1. routine_ids and benchmarking
- Posted by "Boehme, Gabriel" <gboehme at MUSICLAND.COM> Apr 21, 1999
I have a question. I've been running a benchmark program which relies upon the use of routine_ids. I have recently discovered that close results can be swayed significantly, simply by moving the test routines to different physical locations in the program. I have even managed to obtain results *consistently* showing one routine to be faster than another, when both routines are identical!

I do not believe I have seen anything about this mentioned either here on the list, or in the official Euphoria documentation. Is this truly a "feature" of Euphoria, or is my computer just playing with my mind?

Thanks in advance,
Gabriel Boehme

------
A dead thing can go with the stream, but only a living thing can go against it.
G.K. Chesterton
------
2. Re: routine_ids and benchmarking
- Posted by Robert Craig <rds at ATTCANADA.NET> Apr 21, 1999
- Last edited Apr 22, 1999
Gabriel Boehme writes:
> I have recently discovered that close results can be
> swayed significantly, simply by moving the test routines
> to different physical locations in the program...
> ...Is this truly a "feature" of Euphoria, or is my computer just
> playing with my mind?

The short answer is: Your computer is playing with your mind.

Modern CPU's depend heavily on caching. The gap in speed between the CPU chip with its on-chip cache, and the main DRAM memory, has been widening steadily for many years. (Most machines also have a secondary cache between the on-chip cache and DRAM.) A cache "miss" causes a severe penalty.

Generally speaking, the most recently accessed data is kept in cache, while older data is kicked out. In some cases however, recent data may be kicked out if the address of that data conflicts with the addresses of other data in the cache. For example, on a Pentium there is a constraint that allows a maximum of 2 addresses with the same 2nd and 3rd-last hex digits in the address (+++++xx+). When a third such address is fetched, one of the existing two must be kicked out, even if it was accessed recently. The (original) Pentium data cache is said to be "2-way set associative". (I believe the Pentium II and III data cache is 4-way set associative.)

If I've lost you, don't worry, the point is that by varying the address of Euphoria code and data in a trivial way, you can sometimes cause a significant change in the speed of the code. This is true for any language running on a Pentium. A 486 is less susceptible to this, but has other *sensitivities* to code alignment etc.

In one extreme case that I checked out very carefully, a small for-loop in a subroutine started running *6* times slower when I declared a new variable in an *unrelated* routine. I couldn't believe it at first, but after a day or so of machine-level debugging I managed to clearly show that there were 3 variables in the for-loop competing for two available slots in the cache. As each variable was fetched, it bumped one of the others out of cache, so that none of the 3 variables was *ever* in cache when it was needed. Almost *any* trivial change to the program, prior to the for-loop, would make it run fast again.

Regards,
Rob Craig
Rapid Deployment Software
http://members.aol.com/FilesEu/
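
The thrashing described above (3 variables competing for 2 slots) can be reproduced with a toy model. The sketch below is Python, not Euphoria, and it is not a model of the real chip: it simply takes the (+++++xx+) picture at face value, so the 2nd and 3rd-last hex digits of an address pick the set and each set holds 2 entries. The least-recently-used eviction rule, the addresses, and the names `set_index` and `count_misses` are all assumptions made for illustration; the real hardware's line size and replacement policy may differ.

```python
from collections import OrderedDict

LINE_BITS = 4   # assumption: ignore the last hex digit, as in the (+++++xx+) picture
SET_BITS = 8    # assumption: the next two hex digits select the set
WAYS = 2        # 2-way set associative, as described for the original Pentium


def set_index(addr):
    """Return the 2nd and 3rd-last hex digits of addr, which select the set."""
    return (addr >> LINE_BITS) & ((1 << SET_BITS) - 1)


def count_misses(addresses, rounds=1000):
    """Replay a loop that touches every address once per round; return total misses."""
    sets = {}  # set index -> OrderedDict of resident tags, least recently used first
    misses = 0
    for _ in range(rounds):
        for addr in addresses:
            lines = sets.setdefault(set_index(addr), OrderedDict())
            tag = addr >> LINE_BITS
            if tag in lines:
                lines.move_to_end(tag)         # hit: mark as most recently used
            else:
                misses += 1
                if len(lines) >= WAYS:
                    lines.popitem(last=False)  # evict the least recently used entry
                lines[tag] = True
    return misses


# Hypothetical addresses: all three share the hex digits "12" in the xx position.
conflicting = [0x00010120, 0x00020120, 0x00030120]
# The same three variables, but the third has moved by 16 bytes into another set.
shifted = [0x00010120, 0x00020120, 0x00030130]

print(count_misses(conflicting))  # every access misses
print(count_misses(shifted))      # only the first touch of each address misses
```

With the conflicting addresses each fetch evicts the entry that will be needed next, so nothing is ever found in cache. Moving one variable by a few bytes puts it in a different set and the misses all but disappear, which is the same effect as a trivial change elsewhere in the program shifting where the loop's variables land.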
3. Re: routine_ids and benchmarking
- Posted by Arthur Adamson <euclid at ISOC.NET> Apr 22, 1999
Rob, thanks for the enlightenment.

Art

At 09:27 PM 4/21/99 -0400, you wrote:
>Modern CPU's depend heavily on caching. The gap in speed
>between the CPU chip with its on-chip cache, and the main
>DRAM memory, has been widening steadily
>for many years. (Most machines also have a secondary cache
>between the on-chip cache and DRAM) . A cache
>"miss" causes a severe penalty.

Art Adamson, The Cincinnati Engine Man, permanent address euclid2 at email.com