Re: Euphoria bug (Robert)
- Posted by Robert Craig <rds at RapidEuphoria.com> Sep 16, 2005
- 503 views
Vincent wrote: > It's pretty strange how adding "s[1][1] = {} or 0" makes the bug disappear. > > Robert, have you tracked down the location of the bug in the backend? > Any chance of finding other bugs in the same league as this one? The bug isn't in the backend. I wouldn't call it a bug. It's more of a missed opportunity for optimization in the front end. Here's an example of code that runs slow:
sequence s atom t object data constant repeats = 1000000, len = 1000 s = repeat(repeat({}, len), len) t = time() for n = 1 to repeats do s[999][999] = {1, 1} data = s[999][999] end for ?data ?time()-t
Almost any change you make to this code will cause it to run fast. You can insert some code before the loop. You can reorder the two statements in the loop. You can assign {1,n} instead of {1,1} etc. (The code itself is rather silly if you think about it.) One sure-fire way to speed it up is to set data to s[998][999], thereby removing a reference from s[999] and avoiding the need to make a copy of s[999] when s[999][999] is overwritten. However when this code runs fast, you can keep data=s[999][999] in there, no problem. Why is that? Take a look at the intermediate code (IL) that the front end generates in the slow case, for the two statements inside the loop: s[999][999] = {1, 1} 1. LHS_SUBS1 90 101 *98* 102 2. PASSIGN_SUBS *98* 101 103 data = s[999][999] 3. RHS_SUBS 90 101 -102- 4. RHS_SUBS_CHECK -102- 101 92 98 and 102 are two temp variables used by the interpreter. On line 3, 102 is assigned s[999]. On line 1, at the start of the loop, 98 is dereferenced and assigned a new value. When the code runs fast, the interpreter uses the same temp in both places. This causes the reference to s[999] to go away before s[999][999] is assigned a new value. For example, when s[1][1] = 0 is inserted before the loop, it causes slightly different temps to be used inside the loop: s[999][999] = {1, 1} 1. LHS_SUBS1 90 103 *100* 98 2. PASSIGN_SUBS *100* 103 104 data = s[999][999] 3. RHS_SUBS 90 103 *100* 4. RHS_SUBS_CHECK *100* 103 92 Now 100 is used in both places, where 98 and 102 were used before. This causes the reference to s[999] created on line 3, to be deleted on line 1 just before s[999][999] is assigned to. I've tested a few other changes that cause it to run fast, and they all share this property of using the same temp rather than two different temps. Translated code also runs fast sometimes and slow other times, for much the same reason. The front end tries to recycle temps when they are no longer in use, in order to minimize the number of them. I guess what I've learned here is that this is probably even more important than I thought. I'll consider if there are any good ways of tightening this up. Regards, Rob Craig Rapid Deployment Software http://www.RapidEuphoria.com