1. EUPHORIA BUG!!!! & Test
Hi Robert Craig & all Interested,
I still have no clue as to the cause of this bug or whether it is being
dealt with. As far as I know this is a very, very nasty bug that is causing
delays in many people's code.
x = x & x1 & x2  -- is not optimized at all (very, very slow) compared to
x &= x1 & x2     -- or
x = x & x1
x = x & x2       -- or
x &= x1
x &= x2
-- START
-- Start Test Code!
include get.e
object data
data = repeat("Euphoria",100000) -- {"Euphoria","Euphoria",......"Euphoria"}
-- Deparse joins the elements of a sequence using the delimiter c.
-- If c were 32 (space), data would become
-- "Euphoria Euphoria Euphoria Euphoria.."
global function deparse1(sequence list, integer c)
sequence s
object t
t = time()
if length(list) then
s = list[1]
for i = 2 to length(list) by 1 do
s &=c&list[i]
end for
puts(1,sprintf("Time using s&=c&list[i]: Deparse1: %.4f\n",time()-t))
return s
end if
return ""
end function
global function deparse2(sequence list, integer c)
sequence s
object t
t = time()
if length(list) then
s = list[1]
for i = 2 to length(list) by 1 do
s =s&c&list[i]
end for
puts(1,sprintf("Time using s= s&c&list[i]: Deparse2: %.4f\n",time()-t))
return s
end if
return ""
end function
-- Start Test
object result1,result2
result1 = deparse1(data,32)
result2 = deparse2(data,32)
puts(1,"\n**************** Test Finished *****************\nPress any key to quit\n")
result1 = wait_key()
-- END
The above is not the best test to give, but at least it exposes the bug
and the extent of its damage.
Jordah Ferguson
aka Sir LoJik
2. Re: EUPHORIA BUG!!!! & Test
Hi Jordah,
try x = x1 & {1} & x2 instead.
It will always be more troublesome to concatenate a sequence,
an integer and a sequence, or a variable, a literal value and another variable.
I always make the types of the operands match before concatenating.
It is no longer slow then.
EUrs a@t
3. Re: EUPHORIA BUG!!!! & Test
Jordah Ferguson writes:
> I still have no clue as to the cause of this bug or whether it is being
> dealt with. As far as I know this is a very, very nasty bug that is
> causing delays in many people's code.
>
> x = x & x1 & x2 -- is not at all optimized(very very slow) compared to
>
> x &= x1 & x2
In your example it appears that you are looping 100,000 times,
while creating a sequence that grows from 8 to about 900,000 elements in length.
Keep in mind that the time to perform a string concatenation,
c = a & b
will normally be proportional to the sum of the lengths of a and b,
since what will typically happen is that space will be allocated equal
to length(a) + length(b), and then all the elements of a plus all the
elements of b will be copied into the new space, c. Only the top-level
elements are actually copied, e.g. only a 4-byte pointer is copied for
an element that is itself a string or other sequence.
That means if you perform a concatenation inside a loop, n times,
growing a large sequence, the time to perform that loop will grow
in proportion to n-squared. For example, looping 100,000 times
should take 100x (not 10x) longer than looping only 10,000 times.
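
The n-squared growth can be checked with a little arithmetic. The sketch below is written in Python rather than Euphoria and is only a cost model: it counts top-level element copies, assuming every iteration allocates a fresh sequence and copies everything into it, as described above. The function name and the 8-element item length (the length of "Euphoria") are choices made for this illustration; it does not measure the interpreter itself.

```python
def copies_without_reserve(n_items, item_len=8):
    """Count element copies when each s = s & c & item builds a new sequence.

    Simplified model: one allocation per iteration, and every element of
    the result (old and new) is written into the new space.
    """
    length = item_len            # s starts as the first item
    copied = 0
    for _ in range(n_items - 1):
        new_length = length + 1 + item_len   # delimiter plus the next item
        copied += new_length                 # whole result is copied
        length = new_length
    return copied


small = copies_without_reserve(1_000)
large = copies_without_reserve(10_000)
# Ten times as many iterations costs roughly a hundred times as many copies.
print(small, large, large / small)
```

Under this model, ten times as many items means close to one hundred times as many element copies, which matches the 100x figure.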
You should not be asking why the first form is so slow.
You should be asking why the second form is so incredibly fast.
Seriously, there are some additional cases of concatenation that can be
optimized. For example:
a = b & c & d
can be made faster by copying b and c and d into a new space for a,
rather than, as happens now, b and c are copied into a temp, and
then the temp and d are copied into a space for a. This means that
the data for b and c are copied *twice*. This will only give you a
modest speed-up, however. The real speed-ups come when Euphoria
reserves some extra space at the end of a large sequence,
and inserts the second sequence, rather than making a whole
new sequence by copying.
e.g.
s = s & t
where t is much smaller than s, will simply insert t at the end of
s, provided there is enough space, and provided s has a
reference count of one.
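
Both effects can be put into numbers with a rough cost model. As before, this is Python rather than Euphoria, and it only counts element copies. The function names are invented for the illustration, and the doubling growth policy is an assumption: the post does not say how much extra space Euphoria reserves. The reference-count condition is not modelled; the sketch assumes s always has a single reference.

```python
def copies_three_way(len_b, len_c, len_d):
    """Element copies for a = b & c & d: via a temp versus in one pass."""
    via_temp = (len_b + len_c) + (len_b + len_c + len_d)  # b and c copied twice
    one_pass = len_b + len_c + len_d                      # each copied once
    return via_temp, one_pass


def copies_with_reserve(n_appends, chunk_len=9, start_len=8, growth=2):
    """Count element copies when spare space lets most appends happen in place.

    ASSUMPTION: capacity is multiplied by `growth` when it runs out. This
    policy is hypothetical, chosen only to show the shape of the cost.
    """
    length = start_len
    capacity = start_len
    copied = 0
    for _ in range(n_appends):
        needed = length + chunk_len
        if needed > capacity:
            capacity = max(needed, capacity * growth)
            copied += length      # existing elements move to the larger block
        copied += chunk_len       # only the new elements are written
        length = needed
    return copied


print(copies_three_way(1000, 1, 8))
print(copies_with_reserve(1_000), copies_with_reserve(10_000))
```

In this model the one-pass form of a = b & c & d saves at most about half the copies, while appending into reserved space keeps the total roughly proportional to the final length instead of its square.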
Anyway, I'll look into optimizing concatenation some more.
Thanks,
Rob Craig
Rapid Deployment Software
http://www.RapidEuphoria.com