1. Faster lower() command
- Posted by Jesse Adkins <Tassadar29 at lycos.com> Jan 17, 2007
- 597 views
- Last edited Jan 18, 2007
It's been a while since i've posted here. I recently put this together, from code inside of the guru.ex file, and the current lower() command. The one inside of guru.ex is used to handle sequence lowering, and atom lowering is handled by the current function. I chose to use the current command for atom lowering using the guru.ex method is quite slow with atoms.
function lower1( sequence x ) integer c for i = 1 to length(x) do c = x[i] if c <= 'Z' then if c >= 'A' then x[i] = c + 32 end if end if end for return x end function function lower(object x) if sequence(x) then return lower1(x) else return x + (x >= 'A' and x <= 'Z') * TO_LOWER end if end function
I put the sequence first, since I expect that lower() will mostly be used on sequences. Any comments/questions/etc.?
2. Re: Faster lower() command
- Posted by jacques deschênes <desja at globetrotter.net> Jan 19, 2007
- 553 views
here is the power of euphoria
function lower(object o) return o + (o>='A' and o<='Z')*32 -- work as well with atom or sequence end function
regards, Jacques D. Jesse Adkins wrote: > > It's been a while since i've posted here. I recently put this together, from > code inside of the guru.ex file, and the current lower() command. > > The one inside of guru.ex is used to handle sequence lowering, and atom > lowering is handled by the current function. I chose to use the current > command for atom lowering using the guru.ex method is quite slow with atoms. > }}} <eucode> > function lower1( sequence x ) > integer c > for i = 1 to length(x) do > c = x[i] > if c <= 'Z' then > if c >= 'A' then > x[i] = c + 32 > end if > end if > end for > return x > end function > > function lower(object x) > if sequence(x) then > return lower1(x) > else > return x + (x >= 'A' and x <= 'Z') * TO_LOWER > end if > end function > </eucode> {{{ > > I put the sequence first, since I expect that lower() will mostly be used on > sequences. > > Any comments/questions/etc.?
3. Re: Faster lower() command
- Posted by Jesse Adkins <Tassadar29 at lycos.com> Jan 20, 2007
- 557 views
jacques deschênes wrote: > > here is the power of euphoria > > }}} <eucode> > function lower(object o) > return o + (o>='A' and o<='Z')*32 -- work as well with atom or sequence > end function > </eucode> {{{ > > > regards, > Jacques D. I'm well aware of that command, since it's what's in place already. The command that I put up for display processes a sequence much faster than the current code for lower(). A test I ran, with all letters in caps, ran about twice as fast as the current code. I kept the original code for doing atoms, since the guru.ex code doesn't work so well with atoms for some reason I'm hoping the code I posted earlier could become part of Euphoria some day, despite the fact that it's a good deal larger than the current lower() command. -- Jesse Adkins.
4. Re: Faster lower() command
- Posted by jacques deschênes <desja at globetrotter.net> Jan 20, 2007
- 539 views
Hi didn't run any test but I'm surprised that a for loop written in euphoria is faster than a loop internal to the interpreter! Jesse Adkins wrote: > > jacques deschênes wrote: > > > > here is the power of euphoria > > > > }}} <eucode> > > function lower(object o) > > return o + (o>='A' and o<='Z')*32 -- work as well with atom or sequence > > end function > > </eucode> {{{ > > > > > > regards, > > Jacques D. > > I'm well aware of that command, since it's what's in place already. The > command that I put up for display processes a sequence much faster than the > current code for lower(). A test I ran, with all letters in caps, ran about > > twice as fast as the current code. > > I kept the original code for doing atoms, since the guru.ex code doesn't work > so well with atoms for some reason > > I'm hoping the code I posted earlier could become part of Euphoria some day, > > despite the fact that it's a good deal larger than the current lower() > command. > > -- Jesse Adkins.
5. Re: Faster lower() command
- Posted by jacques deschênes <desja at globetrotter.net> Jan 20, 2007
- 550 views
Hi Jesse, As I was quite skeptic I made my own test here
function lower1( sequence x ) integer c for i = 1 to length(x) do c = x[i] if c <= 'Z' then if c >= 'A' then x[i] = c + 32 end if end if end for return x end function function lower(object o) return o + (o>='A' and o<='Z')*32 -- work as well with atom or sequence end function atom t constant txt=repeat('A',1000000) sequence l t = time() l = lower1(txt) ? time()-t t = time() l = lower(txt) ? time()-t
on my laptop lower1() give .12sec lower() give .09sec As I expected the for loop is slower than the interpreter internal loop. I can't figure how lower1() can be faster on your machine. regards, Jacques Deschênes Jesse Adkins wrote: > > jacques deschênes wrote: > > > > here is the power of euphoria > > > > }}} <eucode> > > function lower(object o) > > return o + (o>='A' and o<='Z')*32 -- work as well with atom or sequence > > end function > > </eucode> {{{ > > > > > > regards, > > Jacques D. > > I'm well aware of that command, since it's what's in place already. The > command that I put up for display processes a sequence much faster than the > current code for lower(). A test I ran, with all letters in caps, ran about > > twice as fast as the current code. > > I kept the original code for doing atoms, since the guru.ex code doesn't work > so well with atoms for some reason > > I'm hoping the code I posted earlier could become part of Euphoria some day, > > despite the fact that it's a good deal larger than the current lower() > command. > > -- Jesse Adkins.
6. Re: Faster lower() command
- Posted by Larry Miller <larrymiller at sasktel.net> Jan 20, 2007
- 551 views
jacques deschênes wrote: > > Hi Jesse, > As I was quite skeptic I made my own test here > }}} <eucode> > function lower1( sequence x ) > integer c > for i = 1 to length(x) do > c = x[i] > if c <= 'Z' then > if c >= 'A' then > x[i] = c + 32 > end if > end if > end for > return x > end function > > function lower(object o) > return o + (o>='A' and o<='Z')*32 -- work as well with atom or sequence > end function > > > atom t > constant txt=repeat('A',1000000) > sequence l > > t = time() > l = lower1(txt) > ? time()-t > t = time() > l = lower(txt) > ? time()-t > > </eucode> {{{ > on my laptop > lower1() give .12sec > lower() give .09sec > As I expected the for loop is slower than the interpreter internal loop. > I can't figure how lower1() can be faster on your machine. > > regards, > Jacques Deschênes > > I can confirm Jesse Adkins findings. On my machine I get the following results: Time for lower1() .06 seconds Time for lower() .24 seconds Reversing lower() and lower1() in the program made little difference. I tried changing the statement "constant tct=repeat('A',1000000)" to "constant tct=repeat('A',10000000)" to get more accurate timings. The time for lower1() was reported as .61 seconds as expected. The time for lower() was taking much longer than expected, then windows reported I was low on virtual memory. It eventually finished after 82 seconds. There was almost continuous disk activity during this time. Possibly not related but it would seem that lower() requires considerably more memory. An incident like this is very rare for me. This is with a 1.3GHZ Pentium4, Windows2000 Professional, 256MB RAM. The lower1() function is very similar to lower_case() found in wild.e by Colin Taylor back in Nov 25 2000. Search for "Faster Wildcard". I have done similar tests in the past, all with similar results. Larry Miller
7. Re: Faster lower() command
- Posted by jacques deschênes <desja at globetrotter.net> Jan 20, 2007
- 536 views
reran the test with a 100 times longer string (10000000) and got 0.66 sec for lower1() 0.94 sec for lower() effectively the for loop is faster on very long string!
8. Re: Faster lower() command
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> Jan 20, 2007
- 538 views
jacques deschênes wrote: > on my laptop > lower1() give .12sec > lower() give .09sec I not only agree with other posters but am downright jaw-dropping astonished that you got lower() to appear faster that lower1() in ANY circumstances! My analysis was that lower() must create 5 or 6 * 1,000,000 - long sequences before finally calculating the result. > I can't figure how lower1() can be faster on your machine. Are you by any chance on a Linux box while we're all on Windows? Puzzled, Pete
9. Re: Faster lower() command
- Posted by Jesse Adkins <Tassadar29 at lycos.com> Jan 20, 2007
- 535 views
To clarify for everyone, this is what I used for benchmarking.
constant ITERATIONS = 1000000 integer p atom t0, loop_overhead t0 = time() for i = 1 to ITERATIONS do -- time an empty loop end for loop_overhead = time() - t0 t0 = time() for i = 1 to ITERATIONS do p = power(2, 20) end for ? (time() - t0 - loop_overhead)/ITERATIONS -- calculates time (in seconds) for one call to power
It's the code directly from the documentation under the time() command. I swapped out 'p = power(2, 20)' for my own command. Quoting the mail I sent to Rob Craig... Converting the sequence ("A REALLY LONG SEQUENCE WITH JUNK INSIDE.") Old -> (2.26 to 2.28 e-06) New -> (1.34 to 1.36 e-06) The benchmarks were done on Windows XP Home. -- Jesse Adkins
10. Re: Faster lower() command
- Posted by jacques deschênes <desja at globetrotter.net> Jan 20, 2007
- 553 views
I'm using windows xp pro running on DELL inpiron 1501 with 1Go Ram CPU AMD turion 64x2 1.60Ghz Why euphoria interpreter should create 5 or 6 * 1000000 long sequence? each character is represented by an integer so its a 4000000 bytes array new test
function lower1( sequence x ) integer c for i = 1 to length(x) do c = x[i] if c <= 'Z' then if c >= 'A' then x[i] = c + 32 end if end if end for return x end function function lower(object o) return o + (o>='A' and o<='Z')*32 -- work as well with atom or sequence end function atom t constant txt=repeat('A',1000000) sequence l t = time() l = lower1(txt) ? time()-t t = time() l = lower(txt) ? time()-t t = time() for i = 1 to 1000 do l = lower1(txt) end for ? time() - t t = time() for i = 1 to 1000 do l = lower(txt) end for ? time() - t
result 0.13 0.11 64.2 95.41 This time its clear that lower1() is faster than lower() I don't know how the loop is implemented inside euuphoria interpreter but there is place for optimization! regards, Jacques Deschênes Pete Lomax wrote: > > jacques deschênes wrote: > > on my laptop > > lower1() give .12sec > > lower() give .09sec > I not only agree with other posters but am downright jaw-dropping astonished > that you got lower() to appear faster that lower1() in ANY circumstances! > > My analysis was that lower() must create 5 or 6 * 1,000,000 - long sequences > before finally calculating the result. > > > I can't figure how lower1() can be faster on your machine. > > Are you by any chance on a Linux box while we're all on Windows? > > Puzzled, > Pete
11. Re: Faster lower() command
- Posted by Jason Gade <jaygade at yahoo.com> Jan 20, 2007
- 522 views
- Last edited Jan 21, 2007
Just a note--timing stuff on my laptop is unreliable because the hardware and software adjusts the CPU speed depending on load. Try it on a desktop and see if you get a difference. -- "Any programming problem can be solved by adding a level of indirection." --anonymous "Any performance problem can be solved by removing a level of indirection." --M. Haertel "Premature optimization is the root of all evil in programming." --C.A.R. Hoare j.
12. Re: Faster lower() command
- Posted by Chris Bensler <bensler at nt.net> Jan 21, 2007
- 536 views
jacques deschênes wrote: > > I'm using windows xp pro running on DELL inpiron 1501 with 1Go Ram > CPU AMD turion 64x2 1.60Ghz > > Why euphoria interpreter should create 5 or 6 * 1000000 long sequence? > each character is represented by an integer so its a 4000000 bytes array let x be "ABC" and our expression is: x = x + ((x >= 'A') and (x <= 'Z')) * 32 Break down the expression into simplest terms.. tmp1 = (x >= 'A') -- {1,1,1} = ({65,66,67} >= 65) tmp2 = (x <= 'Z') -- {1,1,1} = ({65,66,67} <= 90) tmp1 = (tmp1 and tmp2) -- {1,1,1} = ({1,1,1} and {1,1,1}) tmp1 = tmp1 * 32 -- {32,32,32} = {1,1,1} * 32 x = x + tmp1 -- {97,98,99} = {65,66,67} + {32,32,32} 2 tmp sequences would be required. Chris Bensler ~ The difference between ordinary and extraordinary is that little extra ~ http://empire.iwireweb.com - Empire for Euphoria
13. Re: Faster lower() command
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> Jan 21, 2007
- 553 views
jacques deschênes wrote: > Why euphoria interpreter should create 5 or 6 * 1000000 long sequence? > return o + (o>='A' and o<='Z')*32 -- work as well with atom or sequence
tmp1 = (o>='A') -- a 1000000 sequence of 0 and 1 tmp2 = (o<='Z') -- a 1000000 sequence of 0 and 1 tmp1 = tmp1 and tmp2 -- "" "" tmp1 *= 32 -- "" of 0 and 32 tmp1 = o+tmp1 -- "" (the result) return tmp1
I get far more consistent results when called in a for loop, and I have noticed tests like this tend to stongly favour whatever comes second, best to run them in separate programs in the name of equality. There might be a margin of say 5% for the 1-liner, but it is nothing like 4 or 5 times faster as you might at first expect. The more you play with the above two code snippets, the more you realise they are essentially the same. > I don't know how the loop is implemented inside euuphoria interpreter but > there is place for optimization! I could enter a long rant about why sequence ops are such a bad idea, but I think I'll spare you today. Chris Bensler wrote: >Pete: peek() will accept peek({lp,0}) which will result in a null string Thanks, I'll edit my copy of that routine. Regards, Pete
14. Re: Faster lower() command
- Posted by Robert Craig <rds at rapideuphoria.com> Jan 21, 2007
- 537 views
Pete Lomax wrote: > jacques deschênes wrote: > > Why euphoria interpreter should create 5 or 6 * 1000000 long sequence? > > return o + (o>='A' and o<='Z')*32 -- work as well with atom or sequence > }}} <eucode> > tmp1 = (o>='A') -- a 1000000 sequence of 0 and 1 > tmp2 = (o<='Z') -- a 1000000 sequence of 0 and 1 > tmp1 = tmp1 and tmp2 -- "" "" > tmp1 *= 32 -- "" of 0 and 32 > tmp1 = o+tmp1 -- "" (the result) > return tmp1 > </eucode> {{{ I took a quick look at the IL. (First bind the program. Then run ex \euphoria\source\showil lower.exe and see icode.lst) It does only use two temps, so it isn't wasteful in that regard. > I get far more consistent results when called in a for loop, and I have > noticed > tests like this tend to stongly favour whatever comes second, best to run them > in separate programs in the name of equality. > There might be a margin of say 5% for the 1-liner, but it is nothing like 4 > or 5 times faster as you might at first expect. The more you play with the > above > two code snippets, the more you realise they are essentially the same. It shouldn't make much difference if you have one complicated statement, or 5 simple ones. The IL will be about the same. > > I don't know how the loop is implemented inside euuphoria interpreter but > > there is place for optimization! > > I could enter a long rant about why sequence ops are such a bad idea, but I > think I'll spare you today. I think a major hidden factor here, is the on-chip CPU cache. The sequence-ops version can't make good use of the cache. It reads and writes 1000000-long sequences many times, and the data is far too big to be stored in the on-chip data cache. The sequence data must always be stored/retrieved from RAM, since even a secondary cache would not be big enough in this case. The for-loop version does not make good use of the cache either, but it only goes through the sequence data once. I suspect the sequence-ops version would look much better if the text sequences were of the length of typical text files, such as 80 characters or less. However, if you have very small sequences, the overhead of allocating and deallocating space in the sequence-ops version will become significant. And when you have really huge sequences, the operating system might have to struggle a bit to find that much contiguous RAM, at least on the first iteration. So there is probably a happy medium somewhere. If you studied this carefully, with a range of sizes, you might find that the sequence ops version has sharp degradation at the point where the data can't fit in the on-chip (32K-bytes?) cache anymore, and another degradation at the point where the secondary cache (1Mb?) becomes too small for all the data. I personally use the for-loop version for scanning typical text files, which average maybe 25-40 characters per line. The Translator can speed up the for-loop version more than it can the sequence-ops version. Regards, Rob Craig Rapid Deployment Software http://www.RapidEuphoria.com