1. RE: poke - peek Speed?
::CODE SNIP::
> I will use it to write to a file values less than 32768 in byte format.
> That is why I use integer and not atom arguments.
> Why does the poke - peek version take more time than the other, that
> uses division? Using allocate_low instead of allocate, it even takes
> more time!
> TIA.
Is your algorithm for to2bytes correct? Looking into machine.e, I see
the function int_to_bytes is declared as:
global function int_to_bytes(atom x)
-- returns value of x as a sequence of 4 bytes
-- that you can poke into memory
--      {bits 0-7,  (least significant)
--       bits 8-15,
--       bits 16-23,
--       bits 24-31} (most significant)
-- This is the order of bytes in memory on 386+ machines.
    integer a,b,c,d
    a = remainder(x, #100)
    x = floor(x / #100)
    b = remainder(x, #100)
    x = floor(x / #100)
    c = remainder(x, #100)
    x = floor(x / #100)
    d = remainder(x, #100)
    return {a,b,c,d}
end function
You will notice it returns {a,b,c,d}, so the first two bytes would be:
a = remainder(x, #100)
x = floor(x / #100)
b = remainder(x, #100)
However, your version is written as:
function to2bytes(integer u)
    return and_bits(u, #FF) & floor(u / 256)
end function
It only does the a and x calculations... what happened to b?
If I am correct and your algorithm is slightly off, that would explain
the speed difference.
Don
2. RE: poke - peek Speed?
- Posted by rforno at tutopia.com
Nov 13, 2002 (last edited Nov 14, 2002)
Yes, it is correct, because I will only pass it numbers in the range 0 -
32767. But you are right: I am performing fewer calculations than the
poke4 - peek version. However, the poke4 - peek version surely works by
shifting instead of dividing, or merely by moving bytes around, so it
should be faster. Don't you think so?
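The range claim can be checked directly. The following is only a sketch,
assuming the stock machine.e is on the include path; the names full and
mismatches are hypothetical and not from the original program. It compares
to2bytes against the first two bytes of int_to_bytes for every value in
0 - 32767:

```euphoria
include machine.e  -- int_to_bytes()

function to2bytes(integer u)
    -- low byte, then high byte; two bytes only cover 0..65535
    return and_bits(u, #FF) & floor(u / 256)
end function

sequence full          -- hypothetical name: the 4-byte reference result
integer mismatches     -- hypothetical name: count of disagreeing values
mismatches = 0

for u = 0 to 32767 do
    full = int_to_bytes(u)
    -- compare() returns 0 when the two sequences are equal
    if compare(to2bytes(u), full[1..2]) != 0 then
        mismatches += 1
    end if
end for

printf(1, "mismatches: %d\n", {mismatches})
```

If the two really agree over that range, the count stays at zero, which
would mean the division version simply skips the work needed for bytes 3
and 4.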
----- Original Message -----
From: Don Phillips <EuNexus at yahoo.com>
Sent: Wednesday, November 13, 2002 3:25 PM
Subject: RE: poke - peek Speed?
3. RE: poke - peek Speed?
> Yes, it is correct, because I will provide to it numbers in the range
> 0 - 32767. But you are right: I am performing less calculations than
> the poke4 - peek version. However, the poke4 - peek version surely
> works by shifting instead of dividing, or merely moving bytes around,
> so it should be faster.
> Don't you think so?
Ahh yes, with only that number range the algo is indeed correct. The
poke4 - peek version doesn't work by either shifting or dividing. poke4
writes a single 32-bit dword to a memory address, whereas peek reads
8-bit byte values back from it (in this case two of them). It is simply
direct memory access with no math involved.
The slowdown (as pointed out by Mr. Craig) is this line:
return( peek({Addr1,2}) )
I don't really understand the difference between peek({Addr1,2}) and
peek(Addr1) & peek(Addr1+1); they both build a sequence. But the second
version just about matches the results you would expect, so internally
it must be more efficient.
function to2bytesa(integer u)
    poke4(Addr1, u)
    return( peek(Addr1) & peek(Addr1+1) )
end function
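To make the "no math involved" point concrete, here is a small sketch,
assuming a little-endian x86 machine and the stock machine.e; the names
addr and bytes are hypothetical. It stores one dword with poke4 and reads
it back one byte at a time:

```euphoria
include machine.e  -- allocate(), free(), int_to_bytes()

atom addr                    -- hypothetical name for a 4-byte scratch buffer
addr = allocate(4)

poke4(addr, #12345678)       -- a single 32-bit store

sequence bytes
bytes = peek({addr, 4})      -- four 8-bit reads from the same memory
free(addr)

for i = 1 to 4 do
    printf(1, "byte %d = #%x\n", {i, bytes[i]})
end for
```

On such a machine the least significant byte (#78) should come back
first, the same order that int_to_bytes produces by dividing.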
4. RE: poke - peek Speed?
> From: Don Phillips [mailto:EuNexus at yahoo.com]
> The slow up (as pointed out by Mr Craig) is this line:
> return( peek({Addr1,2}) )
>
> I don't really understand the difference between peek({Addr1,2}) and
> peek(Addr1) & peek(Addr1+1), they both build a sequence. But the second
> version just about matches the results you would expect so internally
> it must be more efficient.
It's not the returned sequence but the argument sequence {Addr1,2}
that's the problem: it gets built again on every call. Rob's suggestion
was to build it once, outside the function:
sequence peek2addr
peek2addr = {Addr1,2}
function to2bytesa(integer u)
    poke4(Addr1, u)
    return peek( peek2addr )
end function
Matt Lewis
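For anyone who wants to measure this themselves, a rough timing sketch
follows. It is not the original poster's benchmark: ITER, the function
names and the test value 12345 are all hypothetical, and the numbers will
vary by machine and interpreter version.

```euphoria
include machine.e  -- allocate(), free()

constant ITER = 1000000      -- hypothetical iteration count

atom Addr1
Addr1 = allocate(4)

sequence peek2addr
peek2addr = {Addr1, 2}       -- built once, per Rob's suggestion

function viaDivision(integer u)
    return and_bits(u, #FF) & floor(u / 256)
end function

function viaPeekLiteral(integer u)
    poke4(Addr1, u)
    return peek({Addr1, 2})  -- builds {Addr1,2} on every call
end function

function viaPeekHoisted(integer u)
    poke4(Addr1, u)
    return peek(peek2addr)   -- reuses the prebuilt sequence
end function

atom t
sequence r

t = time()
for i = 1 to ITER do
    r = viaDivision(12345)
end for
printf(1, "division:      %.2f s\n", {time() - t})

t = time()
for i = 1 to ITER do
    r = viaPeekLiteral(12345)
end for
printf(1, "peek, literal: %.2f s\n", {time() - t})

t = time()
for i = 1 to ITER do
    r = viaPeekHoisted(12345)
end for
printf(1, "peek, hoisted: %.2f s\n", {time() - t})
```

All three functions should return the same two bytes for values in
0 - 32767; only the cost of getting there differs.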
5. RE: poke - peek Speed?
> > The slow up (as pointed out by Mr Craig) is this line:
> > return( peek({Addr1,2}) )
> >
> > I don't really understand the difference between peek({Addr1,2}) and
> > peek(Addr1) & peek(Addr1+1), they both build a sequence. But the
> > second version just about matches the results you would expect so
> > internally it must be more efficient.
>
> It's not the return sequence, but the sequence: {Addr1,2} that's the
> problem. Rob's suggestion was to do this:
>
> sequence peek2addr
> peek2addr = {Addr1,2}
>
> function to2bytesa(integer u)
> poke4(Addr1, u)
> return peek( peek2addr )
> end function
>
> Matt Lewis
Ahh, thanks Matt =)
I didn't quite get what he was referring to. That's quite a good
improvement. I don't know if removing the sequences completely will
break the rest of your program, but without them there is another
difference in speed...
integer lobyte
integer hibyte
procedure to2bytesD(integer u)
    poke4(Addr1, u)
    lobyte = peek(Addr1)
    hibyte = peek(Addr1+1)
end procedure
6. RE: poke - peek Speed?
- Posted by rforno at tutopia.com
Nov 16, 2002
It is *much* slower this way, as is pointed out in the docs.
The only way I've found to improve the performance is:
sequence Addr2
Addr2 = {Addr1, 2}
.....
return peek(Addr2)
as Rob suggested in a post answering mine.
However, this only makes the 'division' and the 'peek' methods exactly equal
in performance.
Regards.
----- Original Message -----
From: Don Phillips <EuNexus at yahoo.com>
Sent: Friday, November 15, 2002 12:40 PM
Subject: RE: poke - peek Speed?
> function to2bytesa(integer u)
> poke4(Addr1, u)
> return( peek(Addr1) & peek(Addr1+1) )
> end function