1. Sequence test

Forked from Re: Edita on 4.0a3

petelomax said...

Actually, my tests show it 13,800 times slower (0.03 vs 414 seconds), probably an exponential slowdown due to messed up refcounts. It has nothing to do with the new and bizarre include system, try this:

atom t 
    t = time() 
 
sequence text 
integer lineno, count 
 
function Clone(sequence intxt) 
integer lt 
    lt = length(intxt) 
    return intxt[1..lt] 
end function 
 
    text = {repeat("1234",60000)} 
    count = 0 
    lineno = 0 
    while 1 do 
	if lineno>=length(text[1]) then exit end if 
	lineno += 1 
	text[1][lineno] = Clone(text[1][lineno]) 
	count += 1 
	if count>2000 then 
	    printf(1,"unpacking %d\n",{lineno}) 
	    count = 0 
	end if 
    end while 
 
    printf(1,"done, t=%3.2f\n",time()-t) 

I tried this code in Windows (Vista) with exwc of rev 1351 and 1352.
My result for 1351 is t=0.13 and for 1352 t=155.99. There must be a change in rev 1352 (introduces inline) which creates this strange behaviour. This must not necessarily be because of inlining.
If I add: without inline in the code and run it with rev 3025 I get the same result with or without this line: t=194.5.
Btw if I translate with euc -con the result is: t=0.09

Roland

new topic     » topic index » view message » categorize

2. Re: Sequence test

RStowasser said...

Forked from Re: Edita on 4.0a3

petelomax said...

Actually, my tests show it 13,800 times slower (0.03 vs 414 seconds), probably an exponential slowdown due to messed up refcounts. It has nothing to do with the new and bizarre include system, try this:

atom t 
    t = time() 
 
sequence text 
integer lineno, count 
 
function Clone(sequence intxt) 
 
 
George 
integer lt 
    lt = length(intxt) 
    return intxt[1..lt] 
end function 
 
    text = {repeat("1234",60000)} 
    count = 0 
    lineno = 0 
    while 1 do 
	if lineno>=length(text[1]) then exit end if 
	lineno += 1 
	text[1][lineno] = Clone(text[1][lineno]) 
	count += 1 
	if count>2000 then 
	    printf(1,"unpacking %d\n",{lineno}) 
	    count = 0 
	end if 
    end while 
 
    printf(1,"done, t=%3.2f\n",time()-t) 

I tried this code in Windows (Vista) with exwc of rev 1351 and 1352.
My result for 1351 is t=0.13 and for 1352 t=155.99. There must be a change in rev 1352 (introduces inline) which creates this strange behaviour. This must not necessarily be because of inlining.
If I add: without inline in the code and run it with rev 3025 I get the same result with or without this line: t=194.5.
Btw if I translate with euc -con the result is: t=0.09

Roland

I'm still running 2.4 and t is .04 running interpretive.

George

new topic     » goto parent     » topic index » view message » categorize

3. Re: Sequence test

gwalters said...

I'm still running 2.4 and t is .04 running interpretive.

George

Yes, but this is relative in Windows. If I do several runs with exw of Euphoria 2.5 I will get results in the range from t=0.4 to t=0.11. Running the code with exw of Eu 3.1.1 gives results with t=0.06 to t=0.15. The time differences between Eu 2.4, Eu 3.1.1 and Eu4 rev 1351 are not so significant as the difference between Eu rev 1351 and eu rev 1352.

Roland

new topic     » goto parent     » topic index » view message » categorize

4. Re: Sequence test

RStowasser said...

I tried this code in Windows (Vista) with exwc of rev 1351 and 1352.
My result for 1351 is t=0.13 and for 1352 t=155.99. There must be a change in rev 1352 (introduces inline) which creates this strange behaviour. This must not necessarily be because of inlining.
If I add: without inline in the code and run it with rev 3025 I get the same result with or without this line: t=194.5.
Btw if I translate with euc -con the result is: t=0.09

The issue has to do with the way that temps are handled. Here is the IL code for the key line (the call to Clone):

        [seq.ex:28]  
    40: 163 144 137 163 164  # LHS_SUBS1: [text:144], [LIT 1:137] => [_temp_:163] (UNUSED -  
                             #     [_temp_:164]) 
    45: 025 144 137 165      # RHS_SUBS: [text:144] sub [LIT 1:137] => [_temp_:165] 
    49: 092 165 145 166      # RHS_SUBS_CHECK: [_temp_:165] sub [lineno:145] => [_temp_:166] 
    53: 027 147 166 167      # FUNC: [Clone:147][_temp_:166] => [_temp_:167] 
    57: 164 163 145 167      # PASSIGN_SUBS: [_temp_:163], [lineno:145] <= [_temp_:167] 
    61: 210 163              # DEREF_TEMP: [_temp_:163]  
    63: 210 164              # DEREF_TEMP: [_temp_:164]  
    65: 210 165              # DEREF_TEMP: [_temp_:165]  
    67: 210 166              # DEREF_TEMP: [_temp_:166]  
    69: 210 167              # DEREF_TEMP: [_temp_:167]  


The key issue here is that _temp_:163 and _temp_:165 both point to the same thing: text[1]. They each increase the ref count on that sequence. Now, the way that temps are currently dereferenced is that after the complete statement, they are dereferenced. We need to make sure that they aren't dereference too soon, and that's an easy way to accomplish it.

However, in the case of RHS_SUBS ops, we are done with the first operand (in this case, _temp_:165) immediately after the call. This causes an extra, needless copy to be done of this large sequence. There's also the minor issue of _temp_:164, which is not used except for LHS_SUBS1_COPY ops, so we don't need to later deref it. That's not really noticeable, though, since it's nearly a NOP.

With that in mind, here is what the adjusted IL looks like:

        [seq.ex:28]  
    40: 163 144 137 163 164  # LHS_SUBS1: [text:144], [LIT 1:137] => [_temp_:163] (UNUSED -  
                             #     [_temp_:164]) 
    45: 025 144 137 165      # RHS_SUBS: [text:144] sub [LIT 1:137] => [_temp_:165] 
    49: 092 165 145 166      # RHS_SUBS_CHECK: [_temp_:165] sub [lineno:145] => [_temp_:166] 
    53: 210 165              # DEREF_TEMP: [_temp_:165]  
    55: 027 147 166 167      # FUNC: [Clone:147][_temp_:166] => [_temp_:167] 
    59: 164 163 145 167      # PASSIGN_SUBS: [_temp_:163], [lineno:145] <= [_temp_:167] 
    63: 210 166              # DEREF_TEMP: [_temp_:166]  
    65: 210 167              # DEREF_TEMP: [_temp_:167]  


I'm having trouble building a fully updated binary (related to the memory changes that Derek made), but when I run using int.ex and version r3020 of eui (linux), my time to run goes from ~17s to ~0.01. A workaround for now is to avoid the extra subscripts at the beginning, caching text[1] to another variable and operating directly upon it. This is probably only noticeable in large sequences inside of a loop like the test code.

Matt

new topic     » goto parent     » topic index » view message » categorize

Search



Quick Links

User menu

Not signed in.

Misc Menu