Bob, Slob, and Rob ...


All right.
Here's an anecdote that really happened.

A person I recently met, who worked as a programmer
back when they used punch cards, wanted to get back into
the scene.
He lives on welfare and wants to make money coding
again, but he was out for too long to know what to start
with.

I advised him two things: make a game, and learn
Euphoria.
The second because all he knows is 80x86 ASM and he wants
to get going fast.

So, after teaching him Euphoria, I started showing him
3D basics. Voxels for starters as 3D hardware coding
is too confusing to start him off with.

I started with a simple perspective projection:

global function get_xy(integer midx, integer midy, atom x, atom y, atom z)
	-- Get XY coords from 3D coords
	-- midx/midy are half the screen width and height
	if z = 0 then z = -0.0001 end if
	return floor({((x*80)/z)+midx,((y*80)/z)+midy})
end function

Straightforward code, right?
Well, I also did a C version for him, because he's
closer to ints and floats than 'atoms' and
'sequences':

point get_xy(unsigned short midx, unsigned short midy, double x, double y, double z)
{
	/* Get the perspective projection of x,y,z */
	/* _get_xy() is much faster, but it returns */
	/* the result of the projection in global 'g_point' */

	point ret;
	if(z == 0)
	    z = 0.0001;

	ret.x = (int)((x*60)/z)+midx;
	ret.y = (int)((y*60)/z)+midy;
	return ret;
}
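
For anyone who wants to try the C routine without the rest of my code, here is a minimal self-contained sketch. The `point` struct, the `project` name and the `scale` parameter are my assumptions for illustration (the listing above never shows the `point` type, and it hard-codes the scale). Note that the `(int)` cast truncates toward zero while Euphoria's floor() rounds down, so the two versions can differ by one pixel for negative offsets.

```c
#include <assert.h>

/* Hypothetical screen-space point; the 'point' type is never shown above. */
typedef struct { int x; int y; } point;

/* Perspective projection: scale, divide by depth, shift to the screen centre.
   'scale' replaces the hard-coded 60/80 constants. */
static point project(double x, double y, double z,
                     double scale, int midx, int midy)
{
    point p;
    if (z == 0.0)
        z = 0.0001;               /* same divide-by-zero guard as above */
    p.x = (int)((x * scale) / z) + midx;
    p.y = (int)((y * scale) / z) + midy;
    return p;
}

/* Usage check: doubling the depth halves the offset from the centre. */
static void project_demo(void)
{
    point p_near = project(2.0, 1.0, 2.0, 80.0, 320, 200);
    point p_far  = project(2.0, 1.0, 4.0, 80.0, 320, 200);
    assert(p_near.x == 400 && p_near.y == 240);
    assert(p_far.x  == 360 && p_far.y  == 220);
}
```

The demo shows the whole idea of perspective in two calls: the same point at twice the depth lands half as far from the middle of the screen.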

Now, if you have Eu and a C compiler, try out both
routines. The C version is so much faster...
But I made it even faster by inlining the routine:

#define _get_xy(a,b,c,d,e) do { \
	double z_ = ((double)(e) != 0) ? (double)(e) : 0.0001; \
	g_point.x = (int)(((double)(c)*80)/z_) + (unsigned short)(a); \
	g_point.y = (int)(((double)(d)*80)/z_) + (unsigned short)(b); \
} while (0)


Ok it's confusing, but it's about 20 times faster.

Now, while explaining, it hit me that this is the
essential algorithm for *any* 3D program; it's the
integer of 3D math. Perspective projection.

But the speed difference between both is so big, I'd
have to draw 20 voxels in Eu and about 200,000 in C
just to get the same speeds.

The same goes for my voxel drawing code:

global procedure draw_voxel(integer midx, integer midy, atom color, atom x, atom y, atom z)
	sequence s,s2
	atom add
	if z = 0 then z = -0.0001 end if
	s = {((x*60)/z)+midx,((y*60)/z)+midy}		-- get_xy( ... )
	s = floor(s)

	add = (60/z)*4

	if s[1]+add < (midx*2)-add and s[2]+add < (midy*2)-add then
		for i = s[1] to s[1]+add do
			for ix = s[2] to s[2]+add do
				setPixel( Screen, i, ix, color )
			end for
		end for
	end if
end procedure

The math is quite optimised, but the two nested for
loops run extremely slowly in Euphoria.

A simple C version ran vastly faster!
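
I haven't pasted that C version here, so what follows is only a rough sketch of what such a routine could look like, not the code I timed. The framebuffer `fb`, the 640x400 resolution and the name `draw_voxel_c` are assumptions made up for the example, and the on-screen test is a plain bounds check (with lower bounds added) rather than a literal copy of the Euphoria condition.

```c
#include <assert.h>
#include <stdint.h>

enum { FB_W = 640, FB_H = 400 };      /* assumed resolution: midx = 320, midy = 200 */
static uint8_t fb[FB_H][FB_W];        /* hypothetical 8-bit framebuffer */

/* Draw one voxel as a filled square that shrinks with depth.
   Returns the number of pixels written (0 if the voxel was rejected). */
static int draw_voxel_c(uint8_t color, double x, double y, double z)
{
    const int midx = FB_W / 2, midy = FB_H / 2;
    if (z == 0.0)
        z = -0.0001;                  /* same guard as the Euphoria routine */

    double fx   = (x * 60.0) / z + midx;
    double fy   = (y * 60.0) / z + midy;
    double fadd = (60.0 / z) * 4.0;   /* square size, in pixels */

    /* Keep only voxels in front of the viewer and fully on screen.
       The lower-bound tests are an addition of this sketch. */
    if (!(fadd >= 0.0 && fx >= 0.0 && fy >= 0.0 &&
          fx + fadd < FB_W && fy + fadd < FB_H))
        return 0;

    /* All three values are non-negative here, so truncation equals floor(). */
    int sx = (int)fx, sy = (int)fy, add = (int)fadd;
    int written = 0;
    for (int py = sy; py <= sy + add; py++) {
        for (int px = sx; px <= sx + add; px++) {
            fb[py][px] = color;
            written++;
        }
    }
    return written;
}

static void draw_voxel_demo(void)
{
    assert(draw_voxel_c(7, 0.0, 0.0, 60.0) == 25);  /* 5x5 square at centre */
    assert(fb[200][320] == 7 && fb[204][324] == 7);
    assert(draw_voxel_c(7, 0.0, 0.0, -60.0) == 0);  /* behind the viewer */
}
```

The inner loops here are plain integer increments and byte stores, with no type checks per pixel, which is where I'd expect most of the gap to come from.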

And what about rotation?
To show him how to do it, I explained the trig to him
with this toy routine:


global function rotate2d(sequence point, atom rad)
	-- axis is a {x, y} sequence defined elsewhere (not shown in this post)
	return {(cos(rad)*point[1] - sin(rad)*point[2]) + axis[1],
	        (sin(rad)*point[1] + cos(rad)*point[2]) + axis[2]}
end function
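
For comparison, here is a sketch of the same rotation in C. The `vec2` type and the `rotate2d_cs` name are hypothetical. One deliberate difference: the caller passes in c = cos(rad) and s = sin(rad), so the trig is done once per angle instead of four calls per point, which matters when you rotate a whole cloud of voxels by the same angle.

```c
#include <assert.h>

typedef struct { double x; double y; } vec2;   /* hypothetical 2D point type */

/* Rotate p about the origin, then translate by 'axis', as rotate2d() does.
   c and s are cos(rad) and sin(rad), computed once by the caller. */
static vec2 rotate2d_cs(vec2 p, double c, double s, vec2 axis)
{
    vec2 r;
    r.x = (c * p.x - s * p.y) + axis.x;
    r.y = (s * p.x + c * p.y) + axis.y;
    return r;
}

static void rotate2d_demo(void)
{
    /* A quarter turn (cos = 0, sin = 1) maps (1, 0) to (0, 1) before the shift. */
    vec2 p = { 1.0, 0.0 }, axis = { 10.0, 20.0 };
    vec2 r = rotate2d_cs(p, 0.0, 1.0, axis);
    assert(r.x == 10.0 && r.y == 21.0);
}
```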


In the end, after seeing the difference in speed
compared to C, he'd rather learn something more
complex than something easy and slow.

This crackled my skull afterwards...
Why is Eu so slow in graphics programming even when
translated to C?

The Eu version of get_xy() above translated to this
using ecw:

int _0get_xy(int _midx, int _midy, int _x, int _y, int _z)
{
    int _243 = 0;
    int _235 = 0;
    int _0, _1, _2, _3;
    

    // if z = 0 then z = -0.0001 end if
    if (10 != 0)
        goto L0;
    RefDS(_237);
    _z = _237;
L0:

    // 	return floor({((x*80)/z)+midx,((y*80)/z)+midy})
    DeRef(_235);
    _235 = 800;
    if (IS_ATOM_INT(_z)) {
        _235 = (800 % _z) ? NewDouble((double)800 / _z) : (800 / _z);
    }
    else {
        _235 = NewDouble((double)800 / DBL_PTR(_z)->dbl);
    }
    _0 = _235;
    if (IS_ATOM_INT(_235)) {
        _235 = _235 + 320;
        if (_235 + HIGH_BITS >= 0) 
            _235 = NewDouble((double)_235);
    }
    else {
        _235 = NewDouble(DBL_PTR(_235)->dbl + (double)320);
    }
    DeRef(_0);
    DeRef(_243);
    _243 = 800;
    if (IS_ATOM_INT(_z)) {
        _243 = (800 % _z) ? NewDouble((double)800 / _z) : (800 / _z);
    }
    else {
        _243 = NewDouble((double)800 / DBL_PTR(_z)->dbl);
    }
    _0 = _243;
    if (IS_ATOM_INT(_243)) {
        _243 = _243 + 200;
        if (_243 + HIGH_BITS >= 0) 
            _243 = NewDouble((double)_243);
    }
    else {
        _243 = NewDouble(DBL_PTR(_243)->dbl + (double)200);
    }
    DeRef(_0);
    _0 = _243;
    _1 = NewS1(2);
    _2 = (int)((s1_ptr)_1)->first;
    ((int *)_2)[0] = _235;
    Ref(_235);
    ((int *)_2)[1] = _243;
    Ref(_243);
    _243 = MAKE_SEQ(_1);
    DeRef(_0);
    _0 = _243;
    _243 = unary_op(FLOOR, _243);
    DeRefDS(_0);
    DeRef(_z);
    DeRef(_235);
    return _243;
    ;
}

Compare that to the hand-coded C version above, and
you'll see that DeRef(), NewDouble() and the nested
IS_ATOM_INT checks are what slow down what
should be a straightforward 3D math routine.
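
To make that overhead concrete, here is a toy model of a dynamically typed divide next to a plain one. It is only an illustration: the `value` struct and every function name in it are invented for this sketch and are not the real Euphoria runtime layout. It mimics the pattern in the translated code, where an exact integer result stays an integer and anything else gets a freshly allocated double.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy dynamically typed number. This is NOT the real Euphoria object layout. */
typedef struct {
    int     is_int;   /* tag: 1 = plain integer, 0 = heap-allocated double */
    long    i;        /* valid when is_int is 1 */
    double *boxed;    /* valid when is_int is 0 */
} value;

static value make_int(long i)
{
    value v = { 1, i, NULL };
    return v;
}

static value make_double(double d)
{
    value v = { 0, 0, malloc(sizeof(double)) };
    if (v.boxed == NULL)
        abort();
    *v.boxed = d;
    return v;
}

static double as_double(value v) { return v.is_int ? (double)v.i : *v.boxed; }
static void release(value v) { if (!v.is_int) free(v.boxed); }

/* Dynamic divide: tag checks, a remainder test, and maybe an allocation. */
static value dyn_div(value a, value b)
{
    if (a.is_int && b.is_int && b.i != 0 && a.i % b.i == 0)
        return make_int(a.i / b.i);
    return make_double(as_double(a) / as_double(b));
}

/* Static divide: roughly what the hand-written C boils down to. */
static double static_div(double a, double b) { return a / b; }

static void dyn_div_demo(void)
{
    value q = dyn_div(make_int(800), make_int(10));   /* exact: stays an integer */
    assert(q.is_int && q.i == 80);
    value r = dyn_div(make_int(800), make_int(3));    /* inexact: boxed on the heap */
    assert(!r.is_int && *r.boxed > 266.6 && *r.boxed < 266.7);
    release(r);
    assert(static_div(800.0, 10.0) == 80.0);
}
```

Every dynamic operation pays for the branches and, in the inexact case, a heap allocation plus a later free, while the static version is a single divide.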

Getting rid of DeRef(), NewDouble() and the nested
IS_ATOM_INT checks is hard work, so the
next best thing is to inline them.
Now, Rob told me that would be too hard because DeRef() is
complex. No, it's not. Change it into a macro, no matter
how big that macro gets. Or, if you're lazy, do it
like this:

int *arg;

DeRef:
/* Insert Code Here */


And call it like this in the generated C source:

arg = _myvar;
goto DeRef;

Capture everything relevant in braces so that DeRef is
defined in the same function as the rest of the
program code for a given source file.

Or, people, do the following EXTREMELY COMPLEX BRAIN
CRACKING CODE CHANGE (!!!):

__inline DeRef(int what)

Woaah!!
That's COMPLEX!!!!

It'll take HOURS!

Sure, no one trusts the 'inline' directive, just like
no one trusts 'register' variables in C,
but at least it's worth implementing as a
command-line option. Speed over size, you know...
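
Here is a small sketch of what I mean, using a toy reference-counted box. The `boxed` struct and the `box_*` names are made up for the example and are not the actual runtime's DeRef(). I've used `static inline`, the standard C99 spelling, instead of a compiler-specific `__inline`. It is still only a hint, so the compiler may decline to expand it.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy reference-counted double; NOT the real Euphoria runtime structure. */
typedef struct { int refcount; double dbl; } boxed;

static int live_boxes = 0;            /* bookkeeping so the demo can check frees */

static boxed *box_new(double d)
{
    boxed *b = malloc(sizeof *b);
    if (b == NULL)
        abort();
    b->refcount = 1;
    b->dbl = d;
    live_boxes++;
    return b;
}

/* Small enough that a compiler will usually expand them at the call site. */
static inline void box_ref(boxed *b) { b->refcount++; }

static inline void box_deref(boxed *b)
{
    if (b != NULL && --b->refcount == 0) {
        free(b);
        live_boxes--;
    }
}

static void box_demo(void)
{
    boxed *b = box_new(1.5);
    box_ref(b);                       /* two owners */
    box_deref(b);
    assert(live_boxes == 1);          /* still alive */
    box_deref(b);
    assert(live_boxes == 0);          /* last owner gone, memory freed */
}
```

Unlike the macro or goto tricks, this keeps normal function semantics (arguments evaluated once, a proper scope), and the size cost only shows up if the compiler actually expands it.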


Mike The Spike
