Re: fast bit rotation

new topic     » goto parent     » topic index » view thread      » older message » newer message

Tommy wrote:

> ------- <code>
> include machine.e
>
> constant rotate_left = allocate(15), rotate_left_param = rotate_left + 2
> poke(rotate_left, {
>      #60,                    --    0: pusha
>      #B8,#00,#00,#00,#00,    --    1: mov eax, dword param (2)
>      #D1,#C0,                --    6: rol eax, 1
>      #A3,#6A,#0A,#82,#82,    --    8: mov [@param], eax
>      #61,                    --    D: popa
>      #C3})                   --    E: ret
> poke4(rotate_left + 9, rotate_left_param) -- @param
>
> constant rotate_right = allocate(15), rotate_right_param = rotate_right + 2
> poke(rotate_right, {
>      #60,                    --    0: pusha
>      #B8,#00,#00,#00,#00,    --    1: mov eax, dword param (2)
>      #D1,#C8,                --    6: ror eax, 1
>      #A3,#6A,#3A,#C2,#83,    --    8: mov [@param], eax
>      #61,                    --    D: popa
>      #C3})                   --    E: ret
> poke4(rotate_right + 9, rotate_right_param) -- @param
>
> -- To use:
> poke4(rotate_left_param, i) -- i = an integer
> call(rotate_left)
> i = peek4s(rotate_left_param)
>
> poke4(rotate_right_param, j) -- j = an integer
> call(rotate_right)
> j = peek4s(rotate_right_param)
>
> -- Before closing your program, you should call:
> free(rotate_left)
> free(rotate_right)
> ------- </code>

I do not think that it is necessary or important to free the allocated
space for the code under normal circumstances:
- When the program terminates, all allocated space is given back to the
  O/S anyhow.
- The space that is occupied by normal Euphoria routines also is not
  freed during runtime of the program.
  Well, a machine-code routine could gain an advantage here, but that's
  not important unless the routine is big.
- If the machine-code routine is defined inside an include file, the
  statement for freeing the code space is inside the main program file.
  IMHO this is not elegant coding style.

These are just some remarks, so that nobody thinks that I've _forgotten_
to free the space in my code below. smile

> Perhaps there are faster methods than this, but on my laptop (Pentium II
> 433MHz, 64MB RAM), I can do 10_000_000 (ten million) rotations in 19.11
> seconds, which is more than 500_000 rotations per second.


In Euphoria 2.4, it's now possible to define the parameters and return
value for a machine-code routine. I like it, because in my opinion it's
more elegant than using the "old" method, but I don't know which method
is faster.

Inspired by your code, I modified Example #2 of the demo program
'callmach.ex' by RDS:

-----------------------------[ begin code ]-----------------------------
-- needs Euphoria 2.4 or newer

include machine.e
include dll.e

-- Define the function ROTATE_LEFT();
-- use define_c_func() to allow parameters to be passed to machine code.
constant
   ROL_CODE = {
       -- first int argument is at stack offset +4, 2nd int is at +8
      #8B, #44, #24, #04,      -- mov  eax, [esp+4]
      #8A, #4C, #24, #08,      -- mov  cl,  [esp+8]
      #D3, #C0,                -- rol  eax, cl
      #C2, #08, #00            -- ret 8  -- pop 8 bytes off the stack
   },
   ROL_SPACE = allocate(length(ROL_CODE))

poke(ROL_SPACE, ROL_CODE)

global constant
   ROTATE_LEFT = define_c_func("", ROL_SPACE, {C_INT, C_INT}, C_INT)


-- Demo
integer n, count, res

n = 9
count = 2
res = c_func(ROTATE_LEFT, {n, count})
printf(1, "The result of 'ROTATE_LEFT(%d,%d)' is: %d\n", {n, count, res})
------------------------------[ end code ]------------------------------

It seems to work, but there is no warranty on the above code. smile
I have some experience with writing 16 bit ASM code, but unfortunately
not too much regarding 32 bit ASM code.
Especially I'm not sure, whether the statement  "mov cl, [esp+8]" is
correct. Maybe it should be  "mov ecx, [esp+8]" or something?

Concerning preserving the registers see my other post, please.


BTW:
Who is the maintainer of the *very great* Euphoria assembler, Pete
Eberlein or Mic? It seems that the assembler does not understand a
command such as 'ret 8'.
It would be very nice, if you could improve the assembler in this regard.

Regards,
   Juergen

-- 
 /"\  ASCII ribbon campain  |  This message has been ROT-13 encrypted
 \ /  against HTML in       |  twice for higher security.
  X   e-mail and news,      |
 / \  and unneeded MIME     |  http://home.arcor.de/luethje/prog/

new topic     » goto parent     » topic index » view thread      » older message » newer message

Search



Quick Links

User menu

Not signed in.

Misc Menu