1. RE: poke - peek Speed?
- Posted by Don Phillips <EuNexus at yahoo.com> Nov 13, 2002
- 447 views
::CODE SNIP::

> I will use it to write to a file values less than 32768 in byte format.
> That is why I use integer and not atom arguments.
> Why does the poke - peek version take more time than the other, that
> uses division? Using allocate_low instead of allocate, it even takes
> more time!
> TIA.

Is your algorithm for to2bytes correct? Looking into machine.e, I see the function int_to_bytes is declared as:

global function int_to_bytes(atom x)
-- returns value of x as a sequence of 4 bytes
-- that you can poke into memory
--      {bits 0-7,  (least significant)
--       bits 8-15,
--       bits 16-23,
--       bits 24-31} (most significant)
-- This is the order of bytes in memory on 386+ machines.
    integer a,b,c,d

    a = remainder(x, #100)
    x = floor(x / #100)
    b = remainder(x, #100)
    x = floor(x / #100)
    c = remainder(x, #100)
    x = floor(x / #100)
    d = remainder(x, #100)
    return {a,b,c,d}
end function

You will notice it returns {a,b,c,d}, so the first two bytes would be

    a = remainder(x, #100)
    x = floor(x / #100)
    b = remainder(x, #100)

However, your version, written as:

function to2bytes(integer u)
    return and_bits(u, #FF) & floor(u / 256)
end function

only does the a and x calculations... what happened to b? If I am correct and your algorithm is slightly off, that will explain the speed difference.

Don
2. RE: poke - peek Speed?
- Posted by rforno at tutopia.com Nov 13, 2002
- 436 views
- Last edited Nov 14, 2002
Yes, it is correct, because I will only pass it numbers in the range 0 - 32767. But you are right: I am performing fewer calculations than the poke4 - peek version. However, the poke4 - peek version surely works by shifting instead of dividing, or merely by moving bytes around, so it should be faster. Don't you think so?
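The claim that to2bytes is correct for 0 - 32767 can be checked directly. The following is only a sketch: it assumes the to2bytes definition quoted in this thread and the int_to_bytes routine from machine.e, and the names `full` and `mismatches` are introduced here for illustration. It compares to2bytes against the first two bytes returned by int_to_bytes over the whole range.

```euphoria
include machine.e  -- provides int_to_bytes

-- to2bytes as quoted in this thread
function to2bytes(integer u)
    return and_bits(u, #FF) & floor(u / 256)
end function

sequence full        -- hypothetical name: the 4-byte result of int_to_bytes
integer mismatches   -- hypothetical name: count of disagreeing inputs
mismatches = 0

for u = 0 to 32767 do
    full = int_to_bytes(u)
    -- compare() returns 0 when both sequences are identical
    if compare(to2bytes(u), full[1..2]) != 0 then
        mismatches = mismatches + 1
    end if
end for

printf(1, "mismatches: %d\n", mismatches)
```

For inputs below 32768 the high byte floor(u / 256) never exceeds 127, so the extra remainder step that int_to_bytes performs for b does not change the result.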
3. RE: poke - peek Speed?
- Posted by Don Phillips <EuNexus at yahoo.com> Nov 15, 2002
- 425 views
> Yes, it is correct, because I will provide to it numbers in the range
> 0 - 32767. But you are right: I am performing less calculations than the
> poke4 - peek version. However, the poke4 - peek version surely works by
> shifting instead of dividing, or merely moving bytes around, so it should
> be faster.
> Don't you think so?

Ahh yes, with only that number range the algo is indeed correct. The poke4 - peek version doesn't work by either shifting or dividing. poke4 writes a single 32-bit dword to a memory address, whereas peek reads 8-bit byte values (in this case two of them). It is simply direct memory access with no math involved.

The slowdown (as pointed out by Mr Craig) is this line:

    return( peek({Addr1,2}) )

I don't really understand the difference between peek({Addr1,2}) and peek(Addr1) & peek(Addr1+1); they both build a sequence. But the second version just about matches the results you would expect, so internally it must be more efficient.

function to2bytesa(integer u)
    poke4(Addr1, u)
    return( peek(Addr1) & peek(Addr1+1) )
end function
4. RE: poke - peek Speed?
- Posted by Matthew Lewis <matthewwalkerlewis at YAHOO.COM> Nov 15, 2002
- 424 views
> From: Don Phillips [mailto:EuNexus at yahoo.com]

> The slow up (as pointed out by Mr Craig) is this line:
> return( peek({Addr1,2}) )
>
> I dont really understand the difference between peek({Addr1,2}) and
> peek(Addr1) & peek(Addr1+1), they both build a sequence. But the second
> version just about matches the results you would expect so internally it
> must be more efficient.

It's not the return sequence but the argument sequence {Addr1,2} that's the problem: it gets built on every call. Rob's suggestion was to do this:

sequence peek2addr
peek2addr = {Addr1,2}

function to2bytesa(integer u)
    poke4(Addr1, u)
    return peek( peek2addr )
end function

Matt Lewis
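A rough way to see the effect is to time both variants side by side. This is only a sketch under stated assumptions: Addr1 is assumed to be a 4-byte buffer obtained from allocate, and the function names to2bytes_inline and to2bytes_hoisted, the constant REPS, and the test value 12345 are all made up for illustration. Timings will vary by machine and interpreter version, so no figures are claimed here.

```euphoria
include machine.e  -- provides allocate and free

atom Addr1
Addr1 = allocate(4)       -- assumed: scratch buffer for one 32-bit value

sequence peek2addr
peek2addr = {Addr1, 2}    -- built once, reused on every call

-- builds a fresh {address, length} sequence on every call
function to2bytes_inline(integer u)
    poke4(Addr1, u)
    return peek({Addr1, 2})
end function

-- reuses the prebuilt {address, length} sequence
function to2bytes_hoisted(integer u)
    poke4(Addr1, u)
    return peek(peek2addr)
end function

constant REPS = 1000000   -- hypothetical iteration count
atom t
sequence r

t = time()
for i = 1 to REPS do
    r = to2bytes_inline(12345)
end for
printf(1, "inline sequence:  %.2f s\n", time() - t)

t = time()
for i = 1 to REPS do
    r = to2bytes_hoisted(12345)
end for
printf(1, "hoisted sequence: %.2f s\n", time() - t)

free(Addr1)
```

Both functions return the same two bytes; the only difference is where the {address, length} argument is constructed.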
5. RE: poke - peek Speed?
- Posted by Don Phillips <EuNexus at yahoo.com> Nov 15, 2002
- 424 views
> It's not the return sequence, but the sequence: {Addr1,2} that's the
> problem. Rob's suggestion was to do this:
>
> sequence peek2addr
> peek2addr = {Addr1,2}
>
> function to2bytesa(integer u)
>     poke4(Addr1, u)
>     return peek( peek2addr )
> end function
>
> Matt Lewis

Ahh, thanks Matt =) I didn't quite get what he was referring to. That's quite a good improvement.

I don't know if removing the sequences completely will break the rest of your program, but without them there is another difference in speed...

integer lobyte
integer hibyte

procedure to2bytesD(integer u)
    poke4(Addr1, u)
    lobyte = peek(Addr1)
    hibyte = peek(Addr1+1)
end procedure
6. RE: poke - peek Speed?
- Posted by rforno at tutopia.com Nov 16, 2002
- 442 views
It is *much* slower this way, as the docs point out. The only way I've found to improve the performance is:

sequence Addr2
Addr2 = {Addr1, 2}
.....
return peek(Addr2)

as Rob suggested in his reply to my post. However, this only makes the 'division' and the 'peek' methods exactly equal in performance.

Regards.

----- Original Message -----
From: Don Phillips <EuNexus at yahoo.com>
Sent: Friday, November 15, 2002 12:40 PM
Subject: RE: poke - peek Speed?

> The slow up (as pointed out by Mr Craig) is this line:
> return( peek({Addr1,2}) )
>
> I dont really understand the difference between peek({Addr1,2}) and
> peek(Addr1) & peek(Addr1+1), they both build a sequence. But the second
> version just about matches the results you would expect so internally it
> must be more efficient.
>
> function to2bytesa(integer u)
>     poke4(Addr1, u)
>     return( peek(Addr1) & peek(Addr1+1) )
> end function