RE: SV: Inner loop compiler
- Posted by Pete E <xseal at harborside.com> Sep 03, 2002
Hi Barbarella,

Thanks for posting your text document in the body of an email for web readers to see.

OthelloVivaldi at hotmail.com wrote:
> Sadly, one would not be able to call
> general euphoria functions from within the compiled
> language. Many useful functions such as time(), rand() or
> allocate() would probably not be available, unless someone
> has an idea how this could be done easily. I'm pretty much
> clueless.

Euphoria for both Windows and Linux has a call_back() function for writing your own callback routines. I've been able to call these from machine code, so you can still do the hard stuff in Euphoria. But it would limit you to exu or exw.

    -- call.exw
    -- demonstrates use of call near instruction

    function handler (atom a1, atom a2, atom a3, atom a4)
        printf(1, "handler: %d %d %d %d\n", {a1, a2, a3, a4})
        return 0
    end function

    include asm.e
    asm_output(1,1)

    constant code = get_asm(
        "pusha "&
        "push dword 1 "&
        "push dword 2 "&
        "push dword 3 "&
        "push dword 4 "&
        "call near dword ptr [handler_addr] "&
    --  "add esp, 16 "&  --uncomment this for Linux
        "popa "&
        "ret")

    constant handler_addr = allocate(4)
    resolve_param("handler_addr", handler_addr)

    include dll.e
    poke4(handler_addr, call_back(routine_id("handler")))

    call(code)

    while get_key() = -1 do end while

> * The official compiler should be at the same safety level
> as Euphoria. Some run-time error checking will be
> necessary. While type checking on data contents might be
> turned off for speed, it would be too risky not to use
> subscript checking whenever a pointer has been changed
> since the last time it was used. Anything that can go wrong
> will go wrong. The compiler will show zero tolerance if it
> finds anything fishy.

The BOUND instruction would work nicely for checking index values in sequences. Just set up a nice interrupt handler for out-of-range errors and you're good to go.

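To make the BOUND idea concrete, here is a minimal sketch (not code from the original post) of the bytes a compiler could emit ahead of a subscript. It assumes 32-bit x86, where BOUND EAX, [address] encodes as opcode #62, ModRM byte #05 and a 32-bit absolute address. The names emit_bound_eax and limits are hypothetical, and the block only builds the bytes; it does not execute them.

```euphoria
include machine.e   -- allocate(), int_to_bytes()

-- Hypothetical helper: machine code for BOUND EAX, [limits_addr] (32-bit x86).
-- limits_addr must point at two signed dwords: lower bound, then upper bound.
function emit_bound_eax(atom limits_addr)
    -- #62 = BOUND opcode, #05 = ModRM for EAX with a 32-bit absolute address
    return {#62, #05} & int_to_bytes(limits_addr)
end function

-- Limits for a 10-element sequence: valid subscripts are 1..10.
constant limits = allocate(8)
poke4(limits, {1, 10})

-- Six bytes to splice in ahead of the indexed access.
? emit_bound_eax(limits)
```

The emitted bytes would go right before the instruction that uses EAX as an index. An out-of-range value raises interrupt 5, which is where the out-of-range handler comes in.
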
> Since Euphoria allows peek(), poke()
> and call(), this could also be allowed in the compiled
> language, maybe together with asm() for fancy, hard-to-port,
> inline assembly.

Peek, poke and call can each be done with a single asm instruction... for integers, though.

> Basically, this leaves us with statements for program flow,

I have some eucode that generates machine code branches automatically from a sequence of conditions, and picks the right-sized (signed byte or signed doubleword) jump instruction. I found the code - ridiculously concise and ridiculously uncommented. AAAAH! I will post it here to frighten newbies:

    include machine.e  -- allocate(), int_to_bytes()

    constant kind_jumps={"JO","JNO","JB","JNAE","JNB","JAE","JZ","JE","JNZ","JNE","JBE",
        "JNA","JNBE","JA","JS","JNS","JP","JPE","JNP","JPO","JL","JNGE","JNL","JGE",
        "JLE","JNG","JNLE","JG","JMP"},
    short_jumps={#70,#71,#72,#72,#73,#73,#74,#74,#75,#75,#76,#76,#77,#77,#78,#79,
        #7A,#7A,#7B,#7B,#7C,#7C,#7D,#7D,#7E,#7E,#7F,#7F,#EB},
    near_jumps={{#0F,#80},{#0F,#81},{#0F,#82},{#0F,#82},{#0F,#83},{#0F,#83},
        {#0F,#84},{#0F,#84},{#0F,#85},{#0F,#85},{#0F,#86},{#0F,#86},{#0F,#87},
        {#0F,#87},{#0F,#88},{#0F,#89},{#0F,#8A},{#0F,#8A},{#0F,#8B},{#0F,#8B},
        {#0F,#8C},{#0F,#8C},{#0F,#8D},{#0F,#8D},{#0F,#8E},{#0F,#8E},{#0F,#8F},
        {#0F,#8F},{#E9}}

    function jump_blocks(sequence blocks)
    -- blocks: {{{code},condition,target1[,target2]},...}
        sequence result
        sequence offsets, deltas
        integer ok
        deltas = repeat(0, length(blocks))
        offsets = repeat(0, length(blocks)+1)
        for i = 1 to length(blocks) do
            offsets[i+1] = offsets[i] + length(blocks[i][1]) + 2
        end for
        ok = 0
        while not ok do
            ok = 1
            ? {deltas,offsets}
            for i = 1 to length(blocks) do
                if deltas[i] < -128 or deltas[i] > 127 then
                    deltas[i] = offsets[blocks[i][3]] - offsets[i+1]
                else
                    deltas[i] = offsets[blocks[i][3]] - offsets[i+1]
                    if deltas[i] < -128 or deltas[i] > 127 then
                        offsets[i+1..length(offsets)] +=
                            length(near_jumps[find(blocks[i][2], kind_jumps)])+2
                        ok = 0
                        --exit
                    end if
                end if
            end for
        end while
        result = repeat(0,offsets[length(offsets)])
        for i = 1 to length(blocks) do
            if deltas[i] < -128 or deltas[i] > 127 then
                result[offsets[i]+1..offsets[i+1]] = blocks[i][1] &
                    near_jumps[find(blocks[i][2], kind_jumps)] &
                    int_to_bytes(deltas[i])
            else
                result[offsets[i]+1..offsets[i+1]] = blocks[i][1] &
                    short_jumps[find(blocks[i][2], kind_jumps)] & deltas[i]
            end if
        end for
        return result
    end function

How it works: each element of the argument is a triple: {sequence of machine code, type of branch instruction, index(es) of the desired block(s) to jump to}. Every block starts out with a 2-byte short jump. Any jump whose distance doesn't fit in a signed byte is widened to the near form, the later offsets are shifted to match, and the pass repeats until nothing changes.

    constant tmp=jump_blocks({
        {{#90,#90,#90},"JMP",3},        -- (block 1) 3 NOP instructions followed by a jump to block 3
        {repeat(#90,1220000),"JMP",4},  -- (block 2) a lot of NOPs then jump to block 4
        {{#90,#90,#90},"JMP",2}})       -- (block 3) 3 NOPs followed by a jump to block 2
        & #C3                           -- (block 4) there is no block 4 (C3 is ret)
    constant tmp2=allocate(length(tmp))
    poke(tmp2,tmp)
    call(tmp2)

> declarations, integer and floating point arithmetics, integer
> bit logic (shifting, rolling etc), and arrays that must be
> compatible with Euphoria sequences.

Has anyone reverse-engineered Rob's sequence format from the Euphoria-to-C translator yet? I guess that information wouldn't be that useful anyway, since we can't get the address of a sequence by any means I know of. Someone might be able to poke around in memory and find some global sequence index and then locate other sequences from there... Code that manipulates sequences would also have to follow the translator's conventions on reference counts and whatnot. OR... you could just write your own sequence engine in asm and use that.

> * Practicality is important. The output from the compiler
> would not be an executable or a dump of 'runes' (hex
> machine code). Instead, it would output a Euphoria .e file,
> that would contain all the Euphoria code and all the data
> necessary to use the compiled library. This is inspired by
> Pete Eberlein's asm.e. Look (in awe and reverence) at how
> the assembler outputs a file with the machine code poked
> into memory after an allocate() statement. Then some peeks
> and pokes follow that are hardwired to point to the correct
> parameters in the code. Isn't that neat? Now wrap a global
> function definition around those peek and poke statements,
> then add a call() statement inside it... The library writer
> would never have to bother about using peek() or poke() or
> call() at all. Just edit a program, compile it to an .e file
> and use the global symbols in that file like any other Euphoria
> library. "Automagical", as David Cuny might say... :)

Asm.e also assembles directly to memory, which should be an option for the compiler too. Generating functions at run-time is always nice. Although I guess you could include the generated code in the same file if you didn't need to regenerate it over and over. But I agree that a compiled-to-.e file should be used for released code.

> I don't know about all the final details of the language
> definition. Any suggestions are welcome. What I would
> like to know is how realistic you think this sounds, what
> advice you might have to give and whether you think you
> could contribute in any way. You need not be an assembly
> programmer, since most of the code in this project will be
> in pure Euphoria.

This sounds very realistic to me, and I've already got some code lying around that was the start of something like this. I'll poke around my hard drives and see what I can come up with.

Pete