1. Euphoria To C Translator Optimisations (2)
- Posted by Mike The Spike <mtsreborn at yahoo.com> Feb 01, 2001
Here's a new list of ways to optimise E2C:

- Output 'register' instead of 'int' for integers used inside a heavy loop. (Easy to implement; requires an optimisation pass on the C code.)

- Change sequence initialisation code from this:

    *((int *)(_2+0)) = 9;
    *((int *)(_2+4)) = 34;
    *((int *)(_2+8)) = 14;
    *((int *)(_2+12)) = 18;
    *((int *)(_2+16)) = 33;
    *((int *)(_2+20)) = 46;

  to this:

    *((int *)_2 = *((int *){9,34,14,18,33,46};

  Not only is it faster in some cases (the C compiler might output better array assignment code), but it's smaller and more elegant, plus easy to read and understand. (This is easy to implement, but I haven't tested it yet.)

- DEFINITELY make de_reference() and NewS1() 'inline' functions, i.e. declare them as, for example, 'inline int myfunc()'. This saves calling overhead and speeds up all applications globally by a large percentage. *Everything* will run faster because of this. (This is extremely easy to implement! Some compilers support '_inline' instead of 'inline', but that's easy to look up.)

- It's RECOMMENDED that binary_op_a() and NewDouble() are inlined as well.

- The following code from EC.exe for DJGPP:

    if ((unsigned)poke4_addr > (unsigned)0x000FFFFF)
        *poke4_addr = (unsigned long)DBL_PTR(_dest)->dbl;
    else
        (unsigned long)poke4_addr, (unsigned long)DBL_PTR(_dest)->dbl);

  should be eliminated. Why? Because:
  A. The check performed slows things down.
  B. It's not portable. _farpoke[l]() is a Go32-specific function. As a result, you couldn't compile your Euphoria programs with EC.EXE against, for example, a user-contributed ec_psx.lib runtime library for the PlayStation.

I expect that if Rob does release the source to the runtime library, it will be the one for Win32, and will have code like the following to let you compile your code with non-Windows C compilers:

    #ifdef WIN32
    void _stdcall WinMain( ... ) { ... }
    #else
    void main( ... ) { ... }
    #endif

Remember that only ONE of the above two main entry points is compiled. If 'WIN32' is defined by the compiler, i.e. you're using LCCWin or whatever, WinMain() will be compiled and main() will be totally ignored by the compiler. Therefore there are no executable-size reasons not to implement this. But I guess you already knew that :P

So to conclude, the easiest and cheapest way to gain some overall, evaluation, and arithmetic speed increases is by defining much-called runtime routines as inline.

Mike The Spike

PS. Yes, these are straightforward and simple optimisations, but that's about the best I can do right now. The best way to get more speed is to optimise the way Euphoria is mapped to C. But this is hard for me to do right now.
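The one-line sequence initialisation proposed in the post above is not valid C as written (the parentheses do not balance, and a brace list cannot be cast and dereferenced that way). A minimal sketch of the same idea in valid C is given here, assuming the element values are known when the C code is generated. The function names are hypothetical and are not part of E2C's output, and the sketch treats elements as plain ints, which may not match the real runtime's sequence layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Element-by-element stores, in the style of the generated code quoted above. */
static void init_by_stores(int *seq)
{
    seq[0] = 9;
    seq[1] = 34;
    seq[2] = 14;
    seq[3] = 18;
    seq[4] = 33;
    seq[5] = 46;
}

/* Bulk copy from a constant table: one memcpy instead of six stores.
   This only applies when every value is a compile-time constant. */
static void init_by_copy(int *seq, size_t capacity)
{
    static const int initial[] = {9, 34, 14, 18, 33, 46};

    assert(capacity >= sizeof initial / sizeof initial[0]);
    memcpy(seq, initial, sizeof initial);
}
```

Both functions leave the same six values in the destination. Whether the bulk copy is actually faster depends on the compiler and target, so it would need measuring.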
2. Re: Euphoria To C Translator Optimisations (2)
- Posted by Robert Craig <rds at RapidEuphoria.com> Feb 01, 2001
Rather than reply individually to the many posts made by Mike The Spike today, I'll just supply a few bits of information:

---

register variables:

Why don't you alter the C source files generated and tell me if "register" helps. WATCOM ignores register declarations. Most decent compilers do their own assignment of variables to registers. On the Pentium there aren't many registers available anyway.

---

*((int *)_2 = *((int *){9,34,14,18,33,46};

Seems to me something like that only works for static initializations. Not dynamically executing code.

---

in-lining:

The routines that you want to inline are a bit too large. Inlining increases the size of your code, but does not necessarily increase the speed. There is a severe penalty for executing code that is not in the Pentium code cache. If your code is larger you will likely have more cache misses.

---

peeks/pokes with DJGPP:

DJGPP puts low memory in a completely separate memory segment, so you have to call that go32 macro to peek or poke into it. WATCOM/Causeway made low memory contiguous with high memory in the same data segment. For compatibility I had to handle DJGPP this way. You really don't want me to put a single poke of one byte into memory into a subroutine, rather than *inlining* it, would you?

---

#ifdef:

The Euphoria source consists of *one* set of source files. There are #ifdef's throughout the code to select:

    Windows vs. DOS vs. Linux
    Public Domain vs. Complete
    Translator vs. Interpreter vs. Runtime Library
    Debug vs. Release build

It would be potentially exciting to release the source to other people, but it has to be done in a way that maintains my income. I can't make any decisions about this for a couple of months (so stop nagging me).

Regards,
Rob Craig
Rapid Deployment Software
http://www.RapidEuphoria.com
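The DJGPP poke handling described in the reply above can be illustrated with a small portable sketch: an inlined range check that sends addresses in the first megabyte through a separate low-memory routine and stores everything else directly. This is only a simulation under stated assumptions. fake_low_mem and fake_farpoke4 are hypothetical stand-ins for the separate segment and the go32 far-poke call, and poke4 here is not the runtime's actual routine.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LOW_MEM_LIMIT 0x000FFFFFu   /* last address of the first megabyte */

/* Hypothetical stand-in for DOS low memory, which DJGPP keeps in a
   separate segment from the program's own data. */
static uint8_t fake_low_mem[LOW_MEM_LIMIT + 1u];

/* Hypothetical stand-in for the go32 far-poke call: stores into the
   simulated low-memory array instead of a real segment. */
static void fake_farpoke4(uint32_t offset, uint32_t value)
{
    assert(offset <= LOW_MEM_LIMIT - 3u);   /* all four bytes must fit */
    memcpy(&fake_low_mem[offset], &value, sizeof value);
}

/* The range check is expanded at each call site, so an ordinary poke
   costs one compare plus a store rather than a full function call. */
static inline void poke4(uintptr_t addr, uint32_t value)
{
    if (addr > LOW_MEM_LIMIT)
        memcpy((void *)addr, &value, sizeof value);   /* flat-memory store */
    else
        fake_farpoke4((uint32_t)addr, value);         /* low-memory path */
}
```

On a compiler where low memory is contiguous with the rest of the data segment, the else branch would be unnecessary, which is the portability difference the reply points out.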
3. Re: Euphoria To C Translator Optimisations (2)
- Posted by Euman <euman at bellsouth.net> Feb 01, 2001
You know as well as I do that if someone like MTS gets their hands on your knowledge base it'll be gone forever.

euman
4. Re: Euphoria To C Translator Optimisations (2)
- Posted by Mike The Spike <mtsreborn at yahoo.com> Feb 01, 2001
--- Robert Craig <rds at RapidEuphoria.com> wrote:

> Rather than reply individually to the many posts made by Mike The Spike today, I'll just supply a few bits of information:
>
> ---
>
> register variables:
>
> Why don't you alter the C source files generated and tell me if "register" helps. WATCOM ignores register declarations. Most decent compilers do their own assignment of variables to registers. On the Pentium there aren't many registers available anyway.

Hmm... Guess you're right... I'm still living with my head in the 286 days anyway :P

> ---
>
> *((int *)_2 = *((int *){9,34,14,18,33,46};
>
> Seems to me something like that only works for static initializations. Not dynamically executing code.

Yes, that'd be cool for static sequence initializations. Then later on, emit code like _2[idx]; it's more readable than _2+offset.

> ---
>
> in-lining:
>
> The routines that you want to inline are a bit too large.

Didn't know that, sorry :P

> inlining increases the size of your code, but does not necessarily increase the speed. There is a severe penalty for executing code that is not in the Pentium code cache. If your code is larger you will likely have more cache misses.

That's right... Damn! That's what happens when you don't have the source ;p

> ---
>
> peeks/pokes with DJGPP
>
> DJGPP puts low memory in a completely separate memory segment, so you have to call that go32 macro to peek or poke into it. WATCOM /Causeway made low memory contiguous with high memory in the same data segment. For compatibility I had to handle DJGPP this way. You really don't want me to put a single poke of one byte into memory into a subroutine, rather than *inlining* it would you?

That's right I guess :p
(Tip: how about building up byte pokes until you have 4 of them, then poking a single integer? This is a standard way to quadruple speed in mode 13h graphics coding.)

> ---
>
> #ifdef:
>
> The Euphoria source consists of *one* set of source files. There are #ifdef's throughout the code to select:
> Windows vs. DOS vs. Linux
> Public Domain vs Complete
> Translator vs Interpreter vs. Runtime Library
> Debug vs Release build

Wha? So you're the same style of coder as me!! :p
Anyway, that means the code is extremely portable. Add a new #ifdef, then the code for your platform.

> It would be potentially exciting to release the source to other people, but it has to be done in a way that maintains my income. I can't make any decisions about this for a couple of months (so stop nagging me).

LOL! We discussed this ;)
A small subset of the interpreter, translator, whatever, basically being your Euphoria code with the platform-specific routines stripped out. I don't see someone implementing poke(), call(), open_dll(), pixel(), polygon(), sound(), draw_line(), set_vector(), get_vector(), poke4(), peek(), peek4(), etc. for DOS32, Win32 and Linux, and then selling it all as a Euphoria clone. It took you nearly a decade to code all that stuff.

As long as that Portable Euphoria has append(), prepend(), find(), match(), cos(), sin(), puts(), gets(), getc(), open(), close(), print(), get() and printf(), we're all more than happy.

In other words, these libs are supported:

    file.e
    get.e

and these aren't:

    image.e
    dll.e
    graphics.e
    msgbox.e
    machine.e
    safe.e

Plus you can leave out tracing and/or profiling. And, of course, you ship Portable Euphoria with the full version of the translator and/or interpreter :p So people have to buy that first ;)

> Regards,
> Rob Craig
> Rapid Deployment Software
> http://www.RapidEuphoria.com

Later man!

Mike The Spike

PS. I won't bug you anymore :p
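The tip above about collecting four byte pokes and writing them as one integer can be sketched roughly as follows. This is an illustration only, assuming the pokes target consecutive ascending addresses in an ordinary byte buffer. PokeBatch, batch_init, batch_poke and batch_flush are hypothetical names, not Euphoria runtime routines, and any real speed-up depends on the hardware and compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Collects byte pokes aimed at consecutive addresses. */
typedef struct {
    uint8_t *dst;         /* next address to be written */
    uint8_t  pending[4];  /* bytes collected but not yet written */
    size_t   count;       /* number of valid bytes in pending */
} PokeBatch;

static void batch_init(PokeBatch *b, uint8_t *dst)
{
    b->dst = dst;
    b->count = 0;
}

/* Queue one byte; once four are queued, write them out as one word. */
static void batch_poke(PokeBatch *b, uint8_t value)
{
    assert(b->count < 4);
    b->pending[b->count++] = value;
    if (b->count == 4) {
        uint32_t word;
        memcpy(&word, b->pending, sizeof word);  /* pack in memory order */
        memcpy(b->dst, &word, sizeof word);      /* compilers typically emit
                                                    a single 32-bit store */
        b->dst += 4;
        b->count = 0;
    }
}

/* Write any leftover bytes (fewer than four) one at a time. */
static void batch_flush(PokeBatch *b)
{
    for (size_t i = 0; i < b->count; i++)
        *b->dst++ = b->pending[i];
    b->count = 0;
}
```

A caller would run batch_init once, call batch_poke for each pixel, and finish with batch_flush so that a trailing group of fewer than four bytes is not lost.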