Re: Euphoria bug (Robert)

new topic     » topic index » view thread      » older message » newer message

Vincent wrote:
> It's pretty strange how adding "s[1][1] = {} or 0" makes the bug disappear.
> 
> Robert, have you tracked down the location of the bug in the backend?
> Any chance of finding other bugs in the same league as this one?

The bug isn't in the backend.
I wouldn't call it a bug. It's more of a missed opportunity
for optimization in the front end. Here's an example of
code that runs slow:

sequence s
atom t
object data
constant repeats = 1000000, len = 1000

s = repeat(repeat({}, len), len)

t = time()
for n = 1 to repeats do
    s[999][999] =  {1, 1}
    data = s[999][999]
end for
?data
?time()-t


Almost any change you make to this code will 
cause it to run fast. You can insert some code before the loop.
You can reorder the two statements in the loop.
You can assign {1,n} instead of {1,1} etc.
(The code itself is rather silly if you think about it.)

One sure-fire way to speed it up is to set data to s[998][999],
thereby removing a reference from s[999] and avoiding the
need to make a copy of s[999] when s[999][999] is overwritten.
However when this code runs fast, you can keep data=s[999][999]
in there, no problem. Why is that? 
Take a look at the intermediate code (IL) that the front end
generates in the slow case, for the two statements inside the loop:
   
   s[999][999] =  {1, 1}

1. LHS_SUBS1 90 101 *98* 102
2. PASSIGN_SUBS *98* 101 103
   
   data = s[999][999]

3. RHS_SUBS 90 101 -102-
4. RHS_SUBS_CHECK -102- 101 92

98 and 102 are two temp variables used by the interpreter.
On line 3, 102 is assigned s[999]. 
On line 1, at the start of the loop, 98 is dereferenced
and assigned a new value. When the code runs fast,
the interpreter uses the same temp in both places.
This causes the reference to s[999] to go away before
s[999][999] is assigned a new value. For example,
when s[1][1] = 0 is inserted before the loop, it causes
slightly different temps to be used inside the loop:

   s[999][999] =  {1, 1}

1. LHS_SUBS1 90 103 *100* 98
2. PASSIGN_SUBS *100* 103 104

   data = s[999][999]

3. RHS_SUBS 90 103 *100*
4. RHS_SUBS_CHECK *100* 103 92

Now 100 is used in both places, where 98 and 102 were used before.
This causes the reference to s[999] created on line 3,
to be deleted on line 1 just before s[999][999] is assigned to.

I've tested a few other changes that cause it to run fast,
and they all share this property of using the same temp
rather than two different temps. Translated code also runs
fast sometimes and slow other times, for much the same reason.

The front end tries to recycle temps when they are no longer
in use, in order to minimize the number of them. I guess 
what I've learned here is that this is probably even more
important than I thought. I'll consider if there are any good
ways of tightening this up.

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com

new topic     » topic index » view thread      » older message » newer message

Search



Quick Links

User menu

Not signed in.

Misc Menu