1. Euphoria To C Translator Optimisations (2)

Here's a new list of ways to optimise E2C:

- Output 'register' instead of 'int' for integers
embedded inside a heavy loop. (Easy to implement,
requires an optimisation pass on the C code)
- Change sequence initialisation code from this;
 *((int *)(_2+0)) = 9;
 *((int *)(_2+4)) = 34;
 *((int *)(_2+8)) = 14;
 *((int *)(_2+12)) = 18;
 *((int *)(_2+16)) = 33;
 *((int *)(_2+20)) = 46;
to this:
*((int *)_2 = *((int *){9,34,14,18,33,46};

Not only is it faster in some cases (the C
compiler might output better array assignment code),
but it's also smaller, more elegant, and easier to
read and understand.
(This is easy to implement, but I haven't tested it
yet.)

- DEFINITELY make de_reference() and NewS1() 'inline'
functions, i.e. declare them as, for example, 'inline
int myfunc()'. This saves call overhead and speeds
up all applications globally by a large percentage.
*Everything* will run faster because of this.
(This is extremely easy to implement! Some compilers
support '_inline' instead of 'inline', but that's easy
to look up.)

- It's RECOMMENDED that binary_op_a() and NewDouble()
are inlined as well.

- The following code from EC.exe for DJGPP:
((unsigned)poke4_addr > (unsigned)0x000FFFFF)
            *poke4_addr = (unsigned long)DBL_PTR(_dest)->dbl;
        else

(unsigned long)poke4_addr, (unsigned long)DBL_PTR(_dest)->dbl);

Should be eliminated.
Why?
Because A. The check performed slows things down.
B. It's not portable. _farpoke[l]() is a Go32-specific
function.
As a result, you can't compile your
Euphoria programs with EC.EXE against, for example,
a user-contributed ec_psx.lib runtime library for the
PlayStation.

I expect that if Rob does release the source to the
runtime library, it will be the one for Win32, and
will have code like the following to let you compile
your code with non-Windows C compilers:

#ifdef WIN32
int _stdcall WinMain( ... )
{
   ...
}
#else
int main( ... )
{
   ...
}
#endif

Remember that only ONE of the above two entry
points is compiled. If 'WIN32' is defined by the
compiler, i.e. you're using LCCWin or whatever,
WinMain() will be compiled and main() will be
ignored completely by the compiler.
Therefore there is no executable-size reason not
to implement this.
But I guess you already knew that :P


So to conclude things,
the easiest and cheapest way to gain some overall,
evaluation, and arithmetic speed is to define
frequently called runtime routines as inline.


Mike The Spike
PS. Yes, these are straightforward and simple
optimisations, but that's about the best I can do
right now. The best way to get more speed is to
optimise the way Euphoria is mapped to C, but that is
hard for me to do right now.


2. Re: Euphoria To C Translator Optimisations (2)

Rather than reply individually to the many posts made
by Mike The Spike today, I'll just supply a
few bits of information:

---

register variables: 

Why don't you alter the C source
files generated and tell me if "register" helps. 
WATCOM ignores register declarations. 
Most decent compilers do their own assignment 
of variables to registers. On the Pentium there
aren't many registers available anyway.

---

*((int *)_2 = *((int *){9,34,14,18,33,46};

Seems to me something like that only works 
for static initializations. Not dynamically executing code.
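
A minimal sketch of how the idea could be written in valid C: keep the constants in a static array (where a brace initializer is allowed) and copy them into the sequence storage at run time. The names `init_values` and `init_sequence` are made up for illustration, and `base` only stands in for `_2`; this is not the translator's actual output.

```c
#include <string.h>

/* Hypothetical constant data: the six element values from the example,
   stored once as static data, where a brace initializer is legal. */
static const int init_values[6] = {9, 34, 14, 18, 33, 46};

/* Copy the constants into already-allocated element storage.
   `base` plays the role of `_2` in the generated code (an assumption). */
static void init_sequence(int *base)
{
    memcpy(base, init_values, sizeof init_values);
}
```

Whether this beats six separate stores depends on the compiler and on how long the sequence is; it would need to be measured.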

---
in-lining:

The routines that you want to inline are a bit too large.
inlining increases the size of your code, but does
not necessarily increase the speed. There is a
severe penalty for executing code that is not in the 
Pentium code cache. If your code is larger you will
likely have more cache misses.
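
For comparison, a sketch of the kind of routine that is small enough for inlining to be plausible. `struct obj` and `release` are hypothetical names, not the real layout or the real de_reference(); `static inline` is C99 and is only a hint that the compiler may ignore.

```c
/* Hypothetical reference-counted object; not the real Euphoria layout. */
struct obj {
    int ref;   /* number of live references */
};

/* One decrement and one test: cheap to duplicate at every call site.
   Returns 1 when the last reference has just been dropped, else 0. */
static inline int release(struct obj *o)
{
    o->ref -= 1;
    return o->ref == 0;
}
```

A routine that also frees storage or walks a nested sequence would be copied into every call site in full, which is where the code-size cost described above comes from.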

---
peeks/pokes with DJGPP

DJGPP puts low memory in a completely separate
memory segment, so you have to call that go32 macro
to peek or poke into it.
WATCOM/CauseWay made low memory contiguous
with high memory in the same data segment. For compatibility
I had to handle DJGPP this way. You really don't
want me to put a single poke of one byte 
into memory into a subroutine, rather than *inlining* it,
would you?
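
A sketch of how the low-memory check could be confined to DJGPP builds, so that other compilers see only the plain store. `poke4_sketch` is a hypothetical name, and the use of `_farpokel()` with `_dos_ds` here is an assumption about what the DJGPP branch looks like, not the actual runtime code.

```c
#ifdef __DJGPP__
#include <go32.h>         /* _dos_ds */
#include <sys/farptr.h>   /* _farpokel() */
#endif

/* Store a 32-bit value at addr. Only DJGPP builds pay for the
   low-memory test; everywhere else this is a single store. */
static void poke4_sketch(unsigned long *addr, unsigned long value)
{
#ifdef __DJGPP__
    if ((unsigned long)addr <= 0x000FFFFFUL) {
        /* Conventional memory lives in a separate segment under go32. */
        _farpokel(_dos_ds, (unsigned long)addr, value);
        return;
    }
#endif
    *addr = value;
}
```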

---

#ifdef:

The Euphoria source consists of *one* set of source
files. There are #ifdef's throughout the code to select:
Windows vs. DOS vs. Linux
Public Domain vs Complete
Translator vs Interpreter vs. Runtime Library
Debug vs Release build

It would be potentially exciting to release the source
to other people, but it has to be done in a way that
maintains my income. I can't make any decisions about
this for a couple of months (so stop nagging me :-) ).

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com


3. Re: Euphoria To C Translator Optimisations (2)

You know as well as I do that if someone like MTS
gets their hands on your knowledge base it'll be gone 
forever.

euman

----- Original Message ----- 
From: "Robert Craig" <rds at RapidEuphoria.com>
To: <EUforum at topica.com>
Sent: Thursday, February 01, 2001 19:08
Subject: Re: Euphoria To C Translator Optimisations (2)




4. Re: Euphoria To C Translator Optimisations (2)

--- Robert Craig <rds at RapidEuphoria.com> wrote:
> Rather than reply individually  to the many posts
> made
> by Mike The Spike today, I'll just supply a
> few bits of information:
> 
> ---
> 
> register variables: 
> 
> Why don't you alter the C source
> files generated and tell me if "register" helps. 
> WATCOM ignores register declarations. 
> Most decent compilers do their own assignment 
> of variables to registers. On the Pentium there
> aren't many registers available anyway.

Hmm...
Guess you're right...
I'm still living with my head in the 286 days anyways
:P

> ---
> 
> *((int *)_2 = *((int *){9,34,14,18,33,46};
> 
> Seems to me something like that only works 
> for static initializations. Not dynamically
> executing code.

Yes, that'd be cool for static sequence
initializations.
Then later on, generate code like this:
_2[idx], which is more readable than _2+offset.
 
> ---
> in-lining:
> 
> The routines that you want to inline are a bit too
> large.

Didn't know that, sorry :P

> inlining increases the size of your code, but does
> not necessarily increase the speed. There is a
> severe penalty for executing code that is not in the
> Pentium code cache. If your code is larger you will
> likely have more cache misses.

That's right...
Damn! That's what happens when you don't have the
source ;p
 
> ---
> peeks/pokes with DJGPP
> 
> DJGPP puts low memory in a completely separate
> memory segment, so you have to call that go32 macro
> to peek or poke into it.
> WATCOM /Causeway made low memory contiguous
> with high memory in the same data segment. For
> compatibility
> I had to handle DJGPP this way. You really don't
> want me to put a single poke of one byte 
> into memory into a subroutine, rather than
> *inlining* it 
> would you?

That's right I guess :p

(Tip: how about buffering byte pokes until you have 4
of them, then poking one integer? This is a standard way
to quadruple speed in mode 13h graphics coding.)
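
The tip can be sketched roughly as follows: pack four pixel bytes into one 32-bit word and store it with a single write. `put4` is a hypothetical helper, and the byte order shown assumes a little-endian machine such as x86; the real speed-up depends on the hardware and is not guaranteed to be fourfold.

```c
#include <stdint.h>
#include <string.h>

/* Pack four pixel bytes and store them with one 32-bit write.
   On a little-endian CPU, p0 lands at dst[0] and p3 at dst[3]. */
static void put4(uint8_t *dst, uint8_t p0, uint8_t p1, uint8_t p2, uint8_t p3)
{
    uint32_t word = (uint32_t)p0
                  | ((uint32_t)p1 << 8)
                  | ((uint32_t)p2 << 16)
                  | ((uint32_t)p3 << 24);
    memcpy(dst, &word, sizeof word);   /* compilers typically emit one store */
}
```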

> ---
> 
> #ifdef:
> 
> The Euphoria source consists of *one* set of source
> files. There are #ifdef's throughout the code to
> select:
> Windows vs. DOS vs. Linux
> Public Domain vs Complete
> Translator vs Interpreter vs. Runtime Library
> Debug vs Release build

Wha?
So you're the same style of coder as me!! :p

Anyways, that means the code is extremely portable.
Add a new #ifdef, then the code for your platform.
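
A small sketch of that pattern, using predefined compiler macros to pick one branch at compile time. `platform_name` is a hypothetical function and the macro choices are assumptions; the real source may use its own symbols.

```c
/* Return a name for the platform this file was compiled for.
   Exactly one branch survives preprocessing. */
static const char *platform_name(void)
{
#if defined(_WIN32)
    return "Win32";
#elif defined(__DJGPP__) || defined(__MSDOS__)
    return "DOS32";
#elif defined(__linux__)
    return "Linux";
#else
    return "unknown";   /* a new port would add its own #elif above */
#endif
}
```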
 
> It would be potentially exciting to release the
> source
> to other people, but it has to be done in a way that
> maintains my income. I can't make any decisions
> about
> this for a couple of months (so stop nagging me
> :-) ).

LOL!
We discussed this ;)
A small subset of the interpreter, translator,
whatever, basically being your Euphoria code with the
platform-specific routines stripped out.
I don't see someone implementing poke(), call(),
open_dll(), pixel(), polygon(), sound(), draw_line(),
set_vector(), get_vector(), poke4(), peek(), peek4(),
etc, etc, ... for DOS32, Win32 and Linux, and then
selling it all as a Euphoria clone.
It took you nearly a decade to code all that stuff.

As long as that Portable Euphoria has append(),
prepend(), find(), match(), cos(), sin(), puts(),
gets(), getc(), open(), close(), print(), get() and
printf(), we're all more than happy.

In other words, these libs are supported;
file.e
get.e

and these aren't;
image.e
dll.e
graphics.e
msgbox.e
machine.e
safe.e

Plus you can leave out tracing and/or profiling.

And, of course, you ship Portable Euphoria with the
full version of the translator and/or interpreter :p
So people gotta buy that first ;)

 
> Regards,
>    Rob Craig
>    Rapid Deployment Software
>    http://www.RapidEuphoria.com


Later man!



Mike The Spike 
PS. I won't bug you anymore :p

