Re: call_func is expensive


I found a chance to look at this, but sad to say it ain't good news.

                rds Eu 2.4  phix        eui 4.1.0
compare:        0.05        0.05        0.09
kustom:         0.20        0.57        0.18
call_func:      0.64        2.0         0.50

So, regarding phix vs eui 4.1.0: the builtin is fine, a standard call is about 3x slower (0.57 vs 0.18), and call_func is 4x slower (2.0 vs 0.50)...
(The 3x-ish ratio between kustom and call_func is about the same within all three, and is not discussed further.)
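
For anyone wanting to reproduce the three rows, they correspond to something like the following. This is only a minimal sketch, not the actual test program from earlier in the thread: kustom is assumed to be a plain user-defined wrapper around compare, and ITERS is a made-up iteration count.

```euphoria
-- Sketch only: kustom, r_kustom and ITERS are assumptions for illustration.
constant ITERS = 1000000

function kustom(object a, object b)
    return compare(a, b)
end function

constant r_kustom = routine_id("kustom")

atom t0
integer c

t0 = time()
for i = 1 to ITERS do
    c = compare(i, 1)                   -- builtin, no hll call
end for
printf(1, "compare:   %.2f\n", time() - t0)

t0 = time()
for i = 1 to ITERS do
    c = kustom(i, 1)                    -- plain hll call
end for
printf(1, "kustom:    %.2f\n", time() - t0)

t0 = time()
for i = 1 to ITERS do
    c = call_func(r_kustom, {i, 1})     -- indirect call via routine_id
end for
printf(1, "call_func: %.2f\n", time() - t0)
```

The absolute times will of course vary by machine; it is the ratios between the rows, and between the three implementations, that matter here.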

Now, taking the headline 4x of call_func first:
In builtins/pcallfunc.e the hll function call_common is used. In theory all that code could be rewritten from hll to low-level assembly, and perhaps that could get us from 4x to 3x, but there is a fair chance that doing so would not make even the slightest dent; I suspect the real problem is elsewhere. (While I was there I made a small improvement to call_proc which, oddly enough, made no difference at all to call_func.)

More shocking, and probably 75% of the 4x issue anyway, is the 3x difference between a plain hll call in Phix and OpenEuphoria, made stranger by the fact that overall performance otherwise seems pretty much on a par.
I found a few bits of old debug code but removing them made little difference.
My instinct says that the Phix code is suffering at least two (extra) branch mispredictions and corresponding pipeline flushes.
I quickly tried a few things that made absolutely no difference.

Perhaps the compiler could sort locals by type, integer last, to minimise the dealloc loop inside opRetf.
Perhaps integer-only routines could use an optimised variant of opRetf with no dealloc loop....
Otherwise I'm at a bit of a loss (regarding any other way to improve opFrame/opRetf in builtins/VM/pStack.e).

In summary: I cannot see a quick fix for this.
The more subtle 3x is probably more significant than the headline-making 4x.
At least from a Phix perspective, use of call_func in any sort function should indeed be minimised, at least for now.
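
One way to minimise it, as a rough sketch: only go through call_func when the caller actually supplies a custom comparison, and use the builtin compare directly otherwise. The names sort_with and DEFAULT_CMP are hypothetical, and a simple insertion sort is used purely to keep the example short; this is not how any shipped sort routine is written.

```euphoria
-- Sketch only: sort_with and DEFAULT_CMP are hypothetical names.
constant DEFAULT_CMP = -1   -- sentinel meaning "no custom routine supplied"

function sort_with(sequence s, integer rid)
    object tmp
    integer j, c
    for i = 2 to length(s) do
        tmp = s[i]
        j = i - 1
        while j >= 1 do
            if rid = DEFAULT_CMP then
                c = compare(s[j], tmp)          -- fast path: builtin only
            else
                c = call_func(rid, {s[j], tmp}) -- slow path: indirect call
            end if
            if c <= 0 then
                exit                            -- s[j] already in place
            end if
            s[j+1] = s[j]                       -- shift larger element up
            j = j - 1
        end while
        s[j+1] = tmp
    end for
    return s
end function

function by_length(object a, object b)
    return compare(length(a), length(b))
end function

? sort_with({3, 1, 2}, DEFAULT_CMP)
? sort_with({"ccc", "a", "bb"}, routine_id("by_length"))
```

The test is done inline rather than in a helper function, since wrapping it in another hll routine would just add back the plain-call overhead discussed above.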

Pete
