Bob, Slob, and Rob ...
- Posted by mtsreborn_again at yahoo.com Jul 29, 2001
All right. Here's an anecdote that really happened.

A person I recently met, who worked as a programmer back when they used punch cards, wanted to get back into the scene. He lives on welfare and wants to make money coding again, but was out for too long to know what to start with. I advised him two things:

- Make a game, and
- Learn Euphoria.

The second because all he knows is 80x86 ASM and he wants to get going fast.

So, after teaching him Euphoria, I started showing him 3D basics. Voxels for starters, as 3D hardware coding is too confusing to start him off with. I started with a simple perspective projection:

    global function get_xy(integer midx, integer midy, atom x, atom y, atom z)
        -- Get XY coords from 3D coords
        -- midx/midy are half the width/height of the resolution
        if z = 0 then z = -0.0001 end if
        return floor({((x*80)/z)+midx, ((y*80)/z)+midy})
    end function

Straightforward code, right? Well, I also did a C version for him, because he's closer to ints and floats than to 'atoms' and 'sequences':

    point get_xy(unsigned short midx, unsigned short midy, double x, double y, double z)
    {
        /* Get the perspective projection of x,y,z */
        /* _get_xy() is much faster, but it returns */
        /* the result of the projection in global 'g_point' */
        point ret;
        if (z == 0) z = 0.0001;
        ret.x = (int)((x*60)/z) + midx;
        ret.y = (int)((y*60)/z) + midy;
        return ret;
    }

Now, if you have Eu and a C compiler, try out both routines. The C version is so much faster... But I made it even faster by inlining the routine:

    #define _get_xy(a,b,c,d,e) \
        g_point.x = (int)(((double)(c)*80) / (((double)(e) != 0) ? (double)(e) : 0.0001)) + (unsigned short)(a); \
        g_point.y = (int)(((double)(d)*80) / (((double)(e) != 0) ? (double)(e) : 0.0001)) + (unsigned short)(b);

OK, it's confusing, but it's about 20 times faster. Now, while explaining, it hit me that this is the essential algorithm for *any* 3D program; it's the 'integer' of 3D math. Perspective projection.
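For anyone who wants to poke at the projection without a macro, here is a minimal sketch in C written as a `static inline` function. The names `Point2`, `FOCAL_SCALE`, `Z_EPSILON` and `project_point` are made up for illustration and are not from the routines in this post; the scale factor 80 is simply the one the Euphoria version uses.

```c
/* Hypothetical names for illustration only. */
#define FOCAL_SCALE 80.0    /* scale factor, as in the Euphoria get_xy() */
#define Z_EPSILON   0.0001  /* stands in for z == 0 to avoid dividing by zero */

typedef struct {
    int x;
    int y;
} Point2;

/* Project the 3D point (x, y, z) onto the screen, centred on (midx, midy).
   The (int) cast truncates toward zero, like the C version in this post,
   whereas Euphoria's floor() rounds toward negative infinity. */
static inline Point2 project_point(int midx, int midy,
                                   double x, double y, double z)
{
    Point2 p;
    if (z == 0.0)
        z = Z_EPSILON;
    p.x = (int)((x * FOCAL_SCALE) / z) + midx;
    p.y = (int)((y * FOCAL_SCALE) / z) + midy;
    return p;
}
```

Calling `project_point(320, 200, 10.0, 10.0, 10.0)` gives the screen point (400, 280): 10*80/10 is 80, added to each midpoint. A function like this avoids the operator-precedence traps a macro invites, and a compiler is free to inline it, though whether it does depends on the compiler and its flags.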
But the speed difference between the two is so big that I'd have to draw 20 voxels in Eu and about 200,000 in C just to get the same speeds. The same goes for my voxel drawing code:

    global procedure draw_voxel(integer midx, integer midy, atom color, atom x, atom y, atom z)
        sequence s, s2
        atom add
        if z = 0 then z = -0.0001 end if
        s = {((x*60)/z)+midx, ((y*60)/z)+midy} -- get_xy( ... )
        s = floor(s)
        add = (60/z)*4
        if s[1]+add < (midx*2)-add and s[2]+add < (midy*2)-add then
            for i = s[1] to s[1]+add do
                for ix = s[2] to s[2]+add do
                    setPixel( Screen, i, ix, color )
                end for
            end for
        end if
    end procedure

The math is quite optimised, but the two nested for loops run extremely slowly in Euphoria. A simple C version ran vastly faster!

And what about rotation? To show him how to do it, I explained the trig to him with this toy routine:

    global function rotate2d(sequence point, atom rad)
        return {(cos(rad)*point[1] - sin(rad)*point[2]) + axis[1],
                (sin(rad)*point[1] + cos(rad)*point[2]) + axis[2]}
    end function

In the end, after seeing the difference in speed compared to C, he'd rather learn something more complex than something easy and slow. This cracked my skull afterwards... Why is Eu so slow in graphics programming even when translated to C? The Eu version of get_xy() above translated to this using ecw:

    int _0get_xy(int _midx, int _midy, int _x, int _y, int _z)
    {
        int _243 = 0;
        int _235 = 0;
        int _0, _1, _2, _3;

        // if z = 0 then z = -0.0001 end if
        if (10 != 0)
            goto L0;
        RefDS(_237);
        _z = _237;
    L0:

        // return floor({((x*80)/z)+midx,((y*80)/z)+midy})
        DeRef(_235);
        _235 = 800;
        if (IS_ATOM_INT(_z)) {
            _235 = (800 % _z) ? NewDouble((double)800 / _z) : (800 / _z);
        }
        else {
            _235 = NewDouble((double)800 / DBL_PTR(_z)->dbl);
        }
        _0 = _235;
        if (IS_ATOM_INT(_235)) {
            _235 = _235 + 320;
            if (_235 + HIGH_BITS >= 0)
                _235 = NewDouble((double)_235);
        }
        else {
            _235 = NewDouble(DBL_PTR(_235)->dbl + (double)320);
        }
        DeRef(_0);
        DeRef(_243);
        _243 = 800;
        if (IS_ATOM_INT(_z)) {
            _243 = (800 % _z) ? NewDouble((double)800 / _z) : (800 / _z);
        }
        else {
            _243 = NewDouble((double)800 / DBL_PTR(_z)->dbl);
        }
        _0 = _243;
        if (IS_ATOM_INT(_243)) {
            _243 = _243 + 200;
            if (_243 + HIGH_BITS >= 0)
                _243 = NewDouble((double)_243);
        }
        else {
            _243 = NewDouble(DBL_PTR(_243)->dbl + (double)200);
        }
        DeRef(_0);
        _0 = _243;
        _1 = NewS1(2);
        _2 = (int)((s1_ptr)_1)->first;
        ((int *)_2)[0] = _235;
        Ref(_235);
        ((int *)_2)[1] = _243;
        Ref(_243);
        _243 = MAKE_SEQ(_1);
        DeRef(_0);
        _0 = _243;
        _243 = unary_op(FLOOR, _243);
        DeRefDS(_0);
        DeRef(_z);
        DeRef(_235);
        return _243;
        ;
    }

Compare that to the hand-coded C version above, and you'll see that DeRef(), NewDouble() and the nested IS_ATOM_INT checks are what slow down what should be a straightforward 3D math routine. Getting rid of DeRef(), NewDouble() and the IS_ATOM_INT nesting is hard work, but the second best thing is to inline it. Now, Rob told me that that would be too hard, as DeRef() is complex. No it's not. Change it into a macro, no matter how big that macro will be. Or if you're lazy, do it like this:

    int *arg;
    DeRef:
        /* Insert Code Here */

And call it like this in the generated C source:

    arg = _myvar;
    goto DeRef;

Capture everything relevant in braces so that DeRef is defined in the same function as the rest of the program code for a given source file. Or, people, do the following EXTREME COMPLEX BRAIN CRACKING CODE CHANGE (!!!):

    __inline DeRef(int what)

Woaah!! That's COMPLEX!!!! It'll take HOURS! Sure, no one trusts the 'inline' directive, just like no one trusts the use of 'register' variables in C, but at least it's worth implementing as a command-line option. Speed over size, you know...

Mike The Spike
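P.S. To make the inlining idea concrete, here is a minimal sketch in C of a reference-count decrement whose cheap path is inlined while the rare freeing path stays an ordinary function. `ToyObject`, `toy_new`, `toy_free` and `toy_deref` are hypothetical names invented for this sketch; it is a toy stand-in and does not reproduce the real DeRef() or the object layout of the Euphoria runtime.

```c
#include <stdlib.h>

/* Toy reference-counted value. Hypothetical stand-in only:
   NOT the layout used by the real Euphoria runtime. */
typedef struct {
    long ref;    /* number of live references */
    double dbl;  /* payload */
} ToyObject;

/* Allocate a value with one reference; returns NULL if malloc fails. */
static ToyObject *toy_new(double value)
{
    ToyObject *obj = malloc(sizeof *obj);
    if (obj != NULL) {
        obj->ref = 1;
        obj->dbl = value;
    }
    return obj;
}

/* Slow path: runs only when the last reference goes away. */
static void toy_free(ToyObject *obj)
{
    free(obj);
}

/* Fast path: the decrement and test can be inlined at every call site.
   Returns 1 if the object was freed, 0 if references remain. */
static inline int toy_deref(ToyObject *obj)
{
    if (--obj->ref == 0) {
        toy_free(obj);
        return 1;
    }
    return 0;
}
```

An object holding two references survives the first `toy_deref()` call and is freed by the second. Keeping only the decrement and test in the inlined part limits code growth, which is the usual trade-off behind "speed over size"; how much it actually helps would need to be measured.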