Re: which is faster?
- Posted by Fernando Bauer <fmbauer at h?tm?il.com> Nov 10, 2007
- 641 views
Hi, I think the reason of relative low speed of variant 0 is because, commonly, the computed mean is not an Euphoria integer. Then, the subtraction needs to use floating point which is slower. Also, in squaring the deviations, floating point operations again occur. But if the sequence (data) contains mainly Euphoria integers (as in the showed case), it's possible to round the mean and obtain more rapidly the integer deviations and their squares. In the end, the effect of using the rounded mean is compensated. Below, variants 6, 7, 8 and 9 use this trick and are compared with the other implementations. The magnitude of the mean (offset) can influence the speed of the variants (1,2,3,4,5) that don't subtract the mean. Finally, in the case of non-integers, the simulations show that it's better not use the trick.
function mean(sequence s) -- adds verything up in s, returns average atom result result=s[1] for i=2 to length(s) do result+=s[i] end for return result/length(s) end function global function st_dev0(sequence s, atom X) atom m, sum, sd integer l sum = 0 l = length(s) m = mean(s) s -= m for i = 1 to length(s) do sum += power(s[i], 2) end for sd = sqrt(sum/(l-1)) if X = 0 then return sd else -- return the z-score return (X - m)/sd end if end function global function st_dev1(sequence s, atom X) atom m, sum, sd integer l sum = 0 l = length(s) m = mean(s) for i = 1 to length(s) do sum += power(s[i], 2) end for sum -= m*m*l sd = sqrt(sum/(l-1)) if X = 0 then return sd else -- return the z-score return (X - m)/sd end if end function global function st_dev2(sequence s, atom X) atom m, sum, sd integer l sum = 0 l = length(s) m = mean(s) for i = 1 to length(s) do sd = s[i] sum += sd*sd end for sum -= m*m*l sd = sqrt(sum/(l-1)) if X = 0 then return sd else -- return the z-score return (X - m)/sd end if end function global function st_dev3(sequence s, atom X) atom m, sum, sd , m2 integer l sum = 0 l = length(s) m = mean(s) m2 = m*m for i = 1 to length(s) do sd = s[i] sum += sd*sd end for sum -= m2 sd = sqrt((sum/(l-1)) - m2) if X = 0 then return sd else -- return the z-score return (X - m)/sd end if end function global function st_dev4(sequence s, atom X) atom m, sum, sum_sq, sd , m2 integer l sum_sq = 0.0 sum = 0.0 l = length(s) for i = 1 to length(s) do sd = s[i] sum += sd sum_sq += sd*sd end for m = sum/l m2 = m*m sum_sq -= m2 sd = sqrt((sum_sq/(l-1)) - m2) if X = 0 then return sd else -- return the z-score return (X - m)/sd end if end function global function st_dev5(sequence s, atom X) atom m, sum, sum_sq, sd , m2 integer l object x sum_sq = 0.0 sum = 0.0 l = length(s) for i = 1 to length(s) do x = s[i] sum += x sum_sq += x*x end for m = sum/l m2 = m*m sum_sq -= m2 sd = sqrt((sum_sq/(l-1)) - m2) if X = 0 then return sd else -- return the z-score return (X - m)/sd end if end function global function st_dev6(sequence s, atom X) atom m, sum, sd integer l sum = 0 l = length(s) m = mean(s) s -= floor(m) for i = 1 to length(s) do sum += power(s[i], 2) end for sum -= l*power(m-floor(m),2) sd = sqrt(sum/(l-1)) if X = 0 then return sd else -- return the z-score return (X - m)/sd end if end function global function st_dev7(sequence s, atom X) atom m, sum, sd, m1 integer l sum = 0 l = length(s) m = mean(s) m1 = floor(m) for i = 1 to length(s) do sum += power(s[i]-m1, 2) end for sum -= l*power(m-m1,2) sd = sqrt(sum/(l-1)) if X = 0 then return sd else -- return the z-score return (X - m)/sd end if end function global function st_dev8(sequence s, atom X) atom m, sum, sd integer l sum = 0 l = length(s) m = mean(s) s -= floor(m) s *= s for i = 1 to length(s) do sum += s[i] end for sum -= l*power(m-floor(m),2) sd = sqrt(sum/(l-1)) if X = 0 then return sd else -- return the z-score return (X - m)/sd end if end function global function st_dev9(sequence s, atom X) atom m, sum, sd integer l sum = 0 l = length(s) m = mean(s) s = power(s-floor(m),2) for i = 1 to length(s) do sum += s[i] end for sum -= l*power(m-floor(m),2) sd = sqrt(sum/(l-1)) if X = 0 then return sd else -- return the z-score return (X - m)/sd end if end function atom t, t0 procedure show(integer i) printf(1,"Variant form #%d took %5.2fs. (%5.1f%%)\n", {i, t, 100*t/t0}) end procedure procedure test(sequence title, sequence s) atom junk printf(1,"%s\n", {title}) t=time() for i=1 to length(s) do junk=st_dev0(s[i],0.0) end for t = time()-t t0 = t show(0) t=time() for i=1 to length(s) do junk=st_dev1(s[i],0.0) end for t = time()-t show(1) t=time() for i=1 to length(s) do junk=st_dev2(s[i],0.0) end for t = time()-t show(2) t=time() for i=1 to length(s) do junk=st_dev3(s[i],0.0) end for t = time()-t show(3) t=time() for i=1 to length(s) do junk=st_dev4(s[i],0.0) end for t = time()-t show(4) t=time() for i=1 to length(s) do junk=st_dev5(s[i],0.0) end for t = time()-t show(5) t=time() for i=1 to length(s) do junk=st_dev6(s[i],0.0) end for t = time()-t show(6) t=time() for i=1 to length(s) do junk=st_dev7(s[i],0.0) end for t = time()-t show(7) t=time() for i=1 to length(s) do junk=st_dev8(s[i],0.0) end for t = time()-t show(8) t=time() for i=1 to length(s) do junk=st_dev9(s[i],0.0) end for t = time()-t show(9) puts(1,'\n') end procedure sequence s,s1 -- data constant num_runs=5000, len_data=6000 s=repeat(0,num_runs) s1=repeat(0,len_data) -- generate random data for k=1 to num_runs do for i=1 to len_data do s1[i]=rand(12345+k) end for s[k]=s1 end for puts(1,"Data initialised, starting benchmark...\n") test("Original random data",s) test("Data 50%=0 50%=1",repeat((repeat(0,len_data/2)&repeat(1,len_data/2)), num_runs)) test("Data 50%=1000 50%=1001",repeat((repeat(1000,len_data/2)&repeat(1001,len_data/2)), num_runs)) test("Data 50%=10000 50%=10001",repeat((repeat(10000,len_data/2)&repeat(10001,len_data/2)), num_runs)) test("Data 50%=0.1 50%=1.1",repeat((repeat(0.1,len_data/2)&repeat(1.1,len_data/2)), num_runs)) ?machine_func(26,0)
My results: Data initialised, starting benchmark... Original random data Variant form #0 took 4.47s. (100.0%) Variant form #1 took 2.24s. ( 50.1%) Variant form #2 took 2.84s. ( 63.5%) Variant form #3 took 2.86s. ( 64.0%) Variant form #4 took 2.84s. ( 63.5%) Variant form #5 took 2.41s. ( 53.9%) Variant form #6 took 2.51s. ( 56.2%) Variant form #7 took 2.35s. ( 52.6%) Variant form #8 took 2.83s. ( 63.3%) Variant form #9 took 2.84s. ( 63.5%) Data 50%=0 50%=1 Variant form #0 took 4.47s. (100.0%) Variant form #1 took 1.48s. ( 33.1%) Variant form #2 took 2.10s. ( 47.0%) Variant form #3 took 2.09s. ( 46.8%) Variant form #4 took 2.00s. ( 44.7%) Variant form #5 took 1.64s. ( 36.7%) Variant form #6 took 1.59s. ( 35.6%) Variant form #7 took 1.49s. ( 33.3%) Variant form #8 took 1.98s. ( 44.3%) Variant form #9 took 2.00s. ( 44.7%) Data 50%=1000 50%=1001 Variant form #0 took 4.46s. (100.0%) Variant form #1 took 2.06s. ( 46.2%) Variant form #2 took 2.69s. ( 60.3%) Variant form #3 took 2.70s. ( 60.5%) Variant form #4 took 2.64s. ( 59.2%) Variant form #5 took 2.27s. ( 50.9%) Variant form #6 took 1.59s. ( 35.7%) Variant form #7 took 1.50s. ( 33.6%) Variant form #8 took 1.97s. ( 44.2%) Variant form #9 took 1.98s. ( 44.4%) Data 50%=10000 50%=10001 Variant form #0 took 4.49s. (100.0%) Variant form #1 took 2.18s. ( 48.6%) Variant form #2 took 2.83s. ( 63.0%) Variant form #3 took 2.82s. ( 62.8%) Variant form #4 took 2.79s. ( 62.1%) Variant form #5 took 2.46s. ( 54.8%) Variant form #6 took 1.62s. ( 36.1%) Variant form #7 took 1.52s. ( 33.9%) Variant form #8 took 1.98s. ( 44.1%) Variant form #9 took 2.00s. ( 44.5%) Data 50%=0.1 50%=1.1 Variant form #0 took 5.34s. (100.0%) Variant form #1 took 4.30s. ( 80.5%) Variant form #2 took 4.66s. ( 87.3%) Variant form #3 took 4.68s. ( 87.6%) Variant form #4 took 4.52s. ( 84.6%) Variant form #5 took 4.16s. ( 77.9%) Variant form #6 took 5.31s. ( 99.4%) Variant form #7 took 5.45s. (102.1%) Variant form #8 took 5.14s. ( 96.3%) Variant form #9 took 5.32s. ( 99.6%) Regards, Fernando