1. Phix: #ilasm
- Posted by andreasWagner 6 days ago
- 163 views
Hello,
This isn't meant to be a criticism or a complaint. It's just something I've noticed.
Phix is leisurely and slow when it comes to c_func/c_proc. That's not a problem for most applications.
But sometimes it becomes a bottleneck. That's why I followed your recommendations and came up with this solution. It's certainly still far from ideal, but it's already significantly faster, though still slower than Euphoria's c_func/c_proc.
constant DrawPixel_=GetProcAddress(ray,"DrawPixel")
global procedure _DrawPixel(integer x,integer y,sequence color)
integer col=bytes_to_int(color)
--/**/#ilASM{
--/**/  [64]
--/**/    mov rcx,[x]
--/**/    mov rdx,[y]
--/**/    mov r8,[col]
--/**/    sub rsp, 40 -- Shadow Space (32) + Alignment (8)
--/**/    mov rax,[DrawPixel_]
--/**/    call rax
--/**/    -- call "libraylib","DrawPixel" -- direct jump
--/**/    add rsp, 40
--/**/  }
--/*
c_proc(xDrawPixel,{x,y,col})
--*/
end procedure
constant DrawPixelV_=GetProcAddress(ray,"DrawPixelV")
global procedure _DrawPixelV(sequence pos,sequence color)
atom reg=V2toReg(pos)
integer col=bytes_to_int(color)
--/**/#ilASM{
--/**/  [64]
--/**/    mov rax,[reg]
--/**/    call :%pLoadMint
--/**/    mov rcx,rax
--/**/    mov rdx,[col]
--/**/    sub rsp, 40 -- Shadow Space (32) + Alignment (8)
--/**/    mov rax,[DrawPixelV_]
--/**/    call rax
--/**/    -- call "libraylib","DrawPixelV" -- direct jump
--/**/    add rsp, 40
--/**/  }
--/*
c_proc(xDrawPixelV,{reg,col})
--*/
end procedure
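Both wrappers pass the color as a single integer. As a rough illustration of why that works, here is a minimal C sketch of how a four-byte color struct maps onto a 32-bit value. `Color4`, `color_to_u32` and `color_to_u32_le` are hypothetical names for this example only, and it assumes `bytes_to_int` combines the bytes least-significant first, as the Euphoria routine of that name does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 4-byte RGBA color struct, r stored first. */
typedef struct { unsigned char r, g, b, a; } Color4;

/* Copy the struct's four bytes into a 32-bit integer. This is the bit
   pattern a by-value 4-byte struct occupies when passed in a register. */
static uint32_t color_to_u32(Color4 c)
{
    uint32_t bits;
    assert(sizeof c == sizeof bits);
    memcpy(&bits, &c, sizeof bits);
    return bits;
}

/* The same value built arithmetically, assuming a little-endian target
   such as x86-64: r is the lowest byte, a the highest. */
static uint32_t color_to_u32_le(Color4 c)
{
    return (uint32_t)c.r
         | ((uint32_t)c.g << 8)
         | ((uint32_t)c.b << 16)
         | ((uint32_t)c.a << 24);
}
```

On a little-endian machine both functions should agree, which is what lets the Phix side build the integer once and load it straight into r8 or rdx.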
That's why I wanted to know how a C compiler does it. I was lucky: DrawPixel in raylib simply calls DrawPixelV, and it looks like this:
00000001d64790ea <DrawPixel>:
   1d64790ea:  48 83 ec 28      sub      rsp,0x28
   1d64790ee:  89 d0            mov      eax,edx
   1d64790f0:  44 89 c2         mov      edx,r8d
   1d64790f3:  66 0f ef c0      pxor     xmm0,xmm0
   1d64790f7:  f3 0f 2a c1      cvtsi2ss xmm0,ecx
   1d64790fb:  66 0f ef c9      pxor     xmm1,xmm1
   1d64790ff:  f3 0f 2a c8      cvtsi2ss xmm1,eax
   1d6479103:  66 0f 7e c8      movd     eax,xmm1
   1d6479107:  48 c1 e0 20      shl      rax,0x20
   1d647910b:  66 0f 7e c1      movd     ecx,xmm0
   1d647910f:  48 09 c1         or       rcx,rax
   1d6479112:  e8 c2 fd ff ff   call     1d6478ed9 <DrawPixelV>
   1d6479117:  90               nop
   1d6479118:  48 83 c4 28      add      rsp,0x28
   1d647911c:  c3               ret
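For reference, the register shuffling in that listing can be mirrored step by step in C. This is only a sketch of what the instructions appear to do, not raylib's source; `float_bits` and `pack_xy` are made-up names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float's bit pattern as a 32-bit integer (the job movd
   does when moving from an xmm register to a general-purpose one). */
static uint32_t float_bits(float f)
{
    uint32_t u;
    assert(sizeof f == sizeof u);
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Mirror the listing: convert x and y to float32 (cvtsi2ss), take their
   raw bits (movd), shift y into the upper half (shl rax,0x20) and merge
   the two halves (or rcx,rax). */
static uint64_t pack_xy(int x, int y)
{
    uint64_t lo = float_bits((float)x);   /* x ends up in bits 0..31  */
    uint64_t hi = float_bits((float)y);   /* y ends up in bits 32..63 */
    return (hi << 32) | lo;
}
```

For example, `pack_xy(1, 2)` combines the float32 patterns 0x3F800000 (1.0) and 0x40000000 (2.0) into one 64-bit value, which is what ends up in rcx before the call.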
I think there are a few assembly instructions missing in ilasm to reproduce that. Or maybe I just couldn't find them.
2. Re: Phix: #ilasm
- Posted by petelomax 3 days ago
- 104 views
- Last edited 2 days ago
No disagreement from me with any of that.
Phix c_func/proc are decidedly hacky and I've long wanted to replace them but no real idea what with,
in fact I've toyed with the idea of forcing the use of asm code snippets just like those you posted.
Ideally perhaps the compiler should be smart enough to generate them itself, but that's not trivial.
There is indeed no support as yet for pxor or cvtsi2ss, pilasm.e has been written in an ad hoc fashion,
but is reasonably straightforward to extend, especially if you can find something similar to crib from.
I typically fire up OllyDbg/fdbg/edb, see what binary that assembles summat to and ensure I match that.
There are somewhere between 981 and 3,600 x86 instructions, in fact LLVM says there are 14,600 variants,
and just about the last thing I would ever want to do is add several hundred completely untested ones.
Mind you, it looks for all the world to me those pxor/cvtsi2ss are completely pointless instructions that
leave eax and ecx exactly as they found them...
If you were having a problem with the direct call, I'm pretty sure they need the ".dll" and it must be
located somewhere in %PATH% for the Windows program loader to find it.
If you can guarantee the result of V2toReg() is an integer, maybe and_bits(#FFFFFFFF), ditch LoadMint.
There may be an AGI stall on rax, try loading it first see if that helps (or eg call rsi while rax used for "")
If you only need to call DrawPixelV from a few places and can bear to copy that asm there, it should
also make a measurable difference.
3. Re: Phix: #ilasm
- Posted by andreasWagner 2 days ago
- 70 views
> No disagreement from me with any of that.
> Phix c_func/proc are decidedly hacky and I've long wanted to replace them but no real idea what with,
> in fact I've toyed with the idea of forcing the use of asm code snippets just like those you posted.
> Ideally perhaps the compiler should be smart enough to generate them itself, but that's not trivial.
I have no intention whatsoever of writing all the calls in assembly language. If anything, I'll only do so in a few places where I think it's necessary.
> There is indeed no support as yet for pxor or cvtsi2ss, pilasm.e has been written in an ad hoc fashion,
> but is reasonably straightforward to extend, especially if you can find something similar to crib from.
> I typically fire up OllyDbg/fdbg/edb, see what binary that assembles summat to and ensure I match that.
> There are somewhere between 981 and 3,600 x86 instructions, in fact LLVM says there are 14,600 variants,
> and just about the last thing I would ever want to do is add several hundred completely untested ones.
That's more than understandable. That's one of the reasons why I'm spending so much time porting examples from Raylib to Phix. I want to test the functions as thoroughly as possible, not just wrap them.
> Mind you, it looks for all the world to me those pxor/cvtsi2ss are completely pointless instructions that
> leave eax and ecx exactly as they found them...
My example put you on a false trail: DrawPixel is fed two integers (x, y), but DrawPixelV expects a struct Vector2, i.e. two float32s that, according to the Microsoft x64 calling convention, are packed together into a single integer register. Then it all makes sense again. The color also moves from the 3rd parameter (R8) to the 2nd parameter (RDX). That's why cvtsi2ss is used (loosely translated: integer to float32). In the case of Phix and a direct call to DrawPixelV, cvtsd2ss would probably be more appropriate. Or just my V2toReg.
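To make the direct-call case concrete, here is a small C sketch of a V2toReg-style helper. It assumes the coordinates arrive as doubles and a little-endian target. `Vec2`, `v2_to_reg` and `reg_to_v2` are hypothetical names, not raylib or Phix routines.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an 8-byte struct of two float32s. */
typedef struct { float x, y; } Vec2;

/* Narrow two doubles to float32 (the conversion cvtsd2ss performs) and
   return the struct's 8 bytes as one 64-bit integer: x in the low half,
   y in the high half on a little-endian machine. */
static uint64_t v2_to_reg(double x, double y)
{
    Vec2 v = { (float)x, (float)y };
    uint64_t bits;
    assert(sizeof v == sizeof bits);
    memcpy(&bits, &v, sizeof bits);
    return bits;
}

/* Inverse operation, handy for checking the packing. */
static Vec2 reg_to_v2(uint64_t bits)
{
    Vec2 v;
    memcpy(&v, &bits, sizeof v);
    return v;
}
```

The packed value is a raw bit pattern rather than a meaningful number, so it has to be moved into the register untouched. Any numeric conversion along the way would corrupt the two floats.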
> If you were having a problem with the direct call, I'm pretty sure they need the ".dll" and it must be
> located somewhere in %PATH% for the Windows program loader to find it.
The direct call works without any issues; I just wanted to ensure by brute force that the DLL is referenced only once in the entire program. I currently have several versions of the DLL on my computer.
> If you can guarantee the result of V2toReg() is an integer, maybe and_bits(#FFFFFFFF), ditch LoadMint.
> There may be an AGI stall on rax, try loading it first see if that helps (or eg call rsi while rax used for "")
> If you only need to call DrawPixelV from a few places and can bear to copy that asm there, it should
> also make a measurable difference.
V2toReg is definitely anything but a clean integer; it's two float32s in a 64-bit value.
Btw I achieved the biggest speed gain by simply coding some functions from raymath directly as Phix/Euphoria and not calling the DLL at all. Of course, that helps Phix much more than it helps Euphoria.
Thanks for the reply; I'm always happy to learn something new. Unfortunately, I'm making very slow progress at the moment.

