1. absolute function
- Posted by Vinoba Mar 17, 2011
- 3890 views
The Euphoria absolute function in math.e looks too old-fashioned. The following seems to work as well and might be faster. The outer brackets can probably be omitted. It takes care of atom and sequence arguments
public function newabs (object s ) return s + ( s * -2 * ( s < 0 ) ) end function
2. Re: absolute function
- Posted by DerekParnell (admin) Mar 18, 2011
- 3885 views
The Euphoria absolute function in math.e looks too old-fashioned. The following seems to work as well and might be faster. The outer brackets can probably be omitted. It takes care of atom and sequence arguments
public function newabs (object s ) return s + ( s * -2 * ( s < 0 ) ) end function
Old-fashioned? Maybe, but so what. The current implementation is easy to understand and is faster than the one you had suggested.
3. Re: absolute function
- Posted by jaygade Mar 18, 2011
- 3845 views
The Euphoria absolute function in math.e looks too old-fashioned. The following seems to work as well and might be faster. The outer brackets can probably be omitted. It takes care of atom and sequence arguments
public function newabs (object s ) return s + ( s * -2 * ( s < 0 ) ) end function
Old-fashioned? Maybe, but so what. The current implementation is easy to understand and is faster than the one you had suggested.
Easier to understand, yes. But I'm measuring about a 38% increase in timing on Linux. Unless I'm doing it wrong.
Using the most recent 4.0.1:
-- Test old abs() include std/math.e function newabs (object s ) return s + ( s * -2 * ( s < 0 ) ) end function integer x = 1, y = 0, z = -1 atom ax = 1.0, ay = 0.0, az = -1.0 sequence s1 = {1, 0, -1} object result with profile for i = 1 to 262143 do result = {abs(x), abs(y), abs(z)} result = {abs(ax), abs(ay), abs(az)} result = {abs(s1)} end for
-- Test newabs() include std/math.e function newabs (object s ) return s + ( s * -2 * ( s < 0 ) ) end function integer x = 1, y = 0, z = -1 atom ax = 1.0, ay = 0.0, az = -1.0 sequence s1 = {1, 0, -1} object result with profile for i = 1 to 262143 do result = {newabs(x), newabs(y), newabs(z)} result = {newabs(ax), newabs(ay), newabs(az)} result = {abs(s1)} end for
Result:
Old ABS results, three trials: real 0m0.510s user 0m0.476s sys 0m0.004s real 0m0.460s user 0m0.456s sys 0m0.004s real 0m0.517s user 0m0.476s sys 0m0.004s New ABS results, three trials: real 0m0.356s user 0m0.352s sys 0m0.004s real 0m0.361s user 0m0.360s sys 0m0.000s real 0m0.357s user 0m0.352s sys 0m0.004s Average Real Old: 0.493 Average Real New: 0.358 Percentage improvement: 38%
So the obvious tradeoff is between code clarity and code speed. How speed intensive does abs() need to be? How often will someone look at the implementation of abs() and need to understand it in the standard library?
4. Re: absolute function
- Posted by mattlewis (admin) Mar 18, 2011
- 3990 views
The Euphoria absolute function in math.e looks too old-fashioned. The following seems to work as well and might be faster. The outer brackets can probably be omitted. It takes care of atom and sequence arguments
Interesting...we've gone around about abs() a few times. Perhaps some of the internals have changed enough that the old timings aren't applicable any more. See, for instance this thread about abs().
Matt
5. Re: absolute function
- Posted by jaygade Mar 18, 2011
- 3892 views
Dang, I'd forgotten how many times this has gone around.
The messages you linked have a link which goes back even further: ABS (Changed link from listfilter to rapideuphoria).
Ah, memories.
6. Re: absolute function
- Posted by mattlewis (admin) Mar 18, 2011
- 3845 views
Dang, I'd forgotten how many times this has gone around.
The messages you linked have a link which goes back even further: ABS (Changed link from listfilter to rapideuphoria).
Ah, memories.
Yeah, it comes up every few years.
In any case, benchmarking Vinoba's function vs std/math.e, I get different results (same benchmark as he posted, different reps...times are seconds):
Interpreter | Vinoba | std/math.e |
---|---|---|
4.0.0 | 2.01 | 1.69 |
4.0.1 | 1.35 | 1.07 |
4.1 (64-bits) | 1.34 | 1.07 |
I believe that the 4.0.0 interpreter was built with an older version of gcc (whatever versions ship with Debian 5 vs Kubuntu 10.10). Newer versions seem much better for euphoria performance, in my experience. Still, in each case, the std/math.e version outperforms.
Matt
7. Re: absolute function
- Posted by Vinoba Mar 18, 2011
- 3845 views
Thank you for the replies and conclusions.
1. It is generally thought that a loop is costly in time, compared to a straight one line solution. On the other hand compiled executable have narrowed down the time for looping.
2. There have been some very interesting (and at least 2 better than mine) in the past. I was not aware of the past threads.
3. I do intend to test the functions in a more (i) business like and (ii) scientific level program with the following variables in mind.
A. The occurrence of the function in a typical program as an atom;
B. The occurrence of the function in a typical program as a sequence and as a long sequence;
C. Interpreted v. compiled executable. The latter are more relevant to me.
4. I could quote a thousand APL v. Basic (straight line solutions v. looping and branching) and by and large the APL solutions were better in terms of speed, space used, etc. However, this is a Euphoria forum and I don't want to bring in those very old arguments, and other languages.
5. No low level function is trivial enough to be ignored for speed; even a 1% speed improvement in a low level functions relevant and was modified in the past by the space used - even space required improve performance is not relevant any more. The very fact that it is incorporated in the core distribution of a language, is sufficient evidence that it is an important relevant function or code.
6. Resistance to change has been the downfall of many civilizations.
8. Re: absolute function
- Posted by jaygade Mar 18, 2011
- 3834 views
Dang, I'd forgotten how many times this has gone around.
The messages you linked have a link which goes back even further: ABS (Changed link from listfilter to rapideuphoria).
Ah, memories.
Yeah, it comes up every few years.
In any case, benchmarking Vinoba's function vs std/math.e, I get different results (same benchmark as he posted, different reps...times are seconds):
Interpreter | Vinoba | std/math.e |
---|---|---|
4.0.0 | 2.01 | 1.69 |
4.0.1 | 1.35 | 1.07 |
4.1 (64-bits) | 1.34 | 1.07 |
I believe that the 4.0.0 interpreter was built with an older version of gcc (whatever versions ship with Debian 5 vs Kubuntu 10.10). Newer versions seem much better for euphoria performance, in my experience. Still, in each case, the std/math.e version outperforms.
Matt
Interesting.
And I remembered/realized after the fact that I should have used Euphoria's built-in time() rather than Bash's time.
My Euphoria was built on Ubuntu 10.04, BTW. I'll have to check the gcc version when I can.
9. Re: absolute function
- Posted by mattlewis (admin) Mar 18, 2011
- 3834 views
Thank you for the replies and conclusions.
1. It is generally thought that a loop is costly in time, compared to a straight one line solution. On the other hand compiled executable have narrowed down the time for looping.
That's true, although there's more to it. Unless you have some way to do vector operations (GPU) or smartly parallelize the work, there are still loops. In this case, the loop is in the back end of the interpreter instead of user code. So there is some efficiency there. However, let's look at the code:
SubProgram [/home/matt/eu/test/abs.ex-newabs:00146] [s:148] STACK SPACE: 5 Required: 5 [/home/matt/eu/test/abs.ex:2] (2) 1: 013 148 150 151 # MULTIPLY: [s:148], [LIT -2:150] => [_temp_:151] 5: 001 148 143 152 # LESS: [s:148], [LIT 0:143] => [_temp_:152] 9: 013 151 152 153 # MULTIPLY: [_temp_:151], [_temp_:152] => [_temp_:153] 13: 011 148 153 154 # PLUS: [s:148], [_temp_:153] => [_temp_:154] 17: 208 151 # DEREF_TEMP: [_temp_:151] 19: 208 152 # DEREF_TEMP: [_temp_:152] 21: 208 153 # DEREF_TEMP: [_temp_:153] 23: 028 146 147 154 # RETURNF: [_temp_:154] block[147] 27: 043 # BADRETURNF: End SubProgram [newabs:00146]
Operations at 1, 5, 9 and 13 operate on each element of a sequence. Therefore, you have 4*n operations to complete. In the std/math.e implementation, we loop through once, at somewhat less efficiency, also including recursive calls, which take some time.
So there is a bit of a tradeoff. Here is the std/math.e version:
SubProgram [/home/matt/eu/test/abs.ex-stdabs:00164] [a:166] STACK SPACE: 13 Required: 13 [/home/matt/eu/test/abs.ex:16] (16) 1: 067 166 169 # IS_AN_ATOM: [a:166] [_temp_:169] 4: 020 169 24 # IF: [_temp_:169] = 0 goto 0024 [/home/matt/eu/test/abs.ex:17] (17) 7: 103 166 143 17 # GREATEREQ_IFW [a:166] > [LIT 0:143] goto 0011 else goto 0017 [/home/matt/eu/test/abs.ex:18] (18) 11: 028 164 165 166 # RETURNF: [a:166] block[165] [/home/matt/eu/test/abs.ex:19] (19) 15: 023 24 # ELSE goto 0024 [/home/matt/eu/test/abs.ex:20] (20) 17: 012 166 173 # UMINUS: [a:166] => [_temp_:173] 20: 028 164 165 173 # RETURNF: [_temp_:173] block[165] [/home/matt/eu/test/abs.ex:23] (23) 24: 042 166 176 # LENGTH: [a:166] => [_temp_:176] 27: 125 142 176 142 164 175 79 # FOR_I: inc [LIT 1:142], lim [_temp_:176], initial [LIT 1:142], # lv [i:175], jmp 0079 [/home/matt/eu/test/abs.ex:24] (24) 34: 092 166 175 167 # RHS_SUBS_CHECK: [a:166] sub [i:175] => [t:167] [/home/matt/eu/test/abs.ex:25] (25) 38: 067 167 179 # IS_AN_ATOM: [t:167] [_temp_:179] 41: 020 179 59 # IF: [_temp_:179] = 0 goto 0059 [/home/matt/eu/test/abs.ex:26] (26) 44: 102 167 143 74 # IFW [t:167] < [LIT 0:143] goto 0048 else goto 0074 [/home/matt/eu/test/abs.ex:27] (27) 48: 012 167 182 # UMINUS: [t:167] => [_temp_:182] 51: 084 166 175 182 # ASSIGN_SUBS_CHECK: [a:166], [i:175] <= [_temp_:182] 55: 208 182 # DEREF_TEMP: [_temp_:182] [/home/matt/eu/test/abs.ex:29] (29) 57: 023 74 # ELSE goto 0074 [/home/matt/eu/test/abs.ex:30] (30) 59: 018 167 184 # [t:167] => [_temp_:184] 62: 027 164 184 185 # FUNC: [stdabs:164][_temp_:184] => [_temp_:185] 66: 084 166 175 185 # ASSIGN_SUBS_CHECK: [a:166], [i:175] <= [_temp_:185] 70: 208 184 # DEREF_TEMP: [_temp_:184] 72: 208 185 # DEREF_TEMP: [_temp_:185] [/home/matt/eu/test/abs.ex:32] (32) 74: 054 34 176 175 142 # ENDFOR_INT_UP1: top 0034, lim: [_temp_:176], lv [i:175], inc # [LIT 1:142] [/home/matt/eu/test/abs.ex:33] (33) 79: 028 164 165 166 # RETURNF: [a:166] block[165] 83: 043 # BADRETURNF: End SubProgram [stdabs:00164]
In the case of just an atom, it looks faster (I haven't tested in detail). And in the case of sequences, larger sequences give larger penalties for sequence arithmetic.
4. I could quote a thousand APL v. Basic (straight line solutions v. looping and branching) and by and large the APL solutions were better in terms of speed, space used, etc. However, this is a Euphoria forum and I don't want to bring in those very old arguments, and other languages.
One of the interesting things about euphoria is how it performs relative to other languages. User code tends to be a lot more efficient in many ways, so penalties that other interpreted languages might incur as far as loops and that sort of thing aren't necessarily the bottleneck to the same extent.
Of course, compiled languages sometimes do a lot better, especially if they are smart about parallelizing things like this.
6. Resistance to change has been the downfall of many civilizations.
I think you'll find that improved performance is rarely resisted around here. However, as you've discovered, abs() has a rich history of performance tests and trials.
Matt
10. Re: absolute function
- Posted by Vinoba Mar 19, 2011
- 3827 views
Thanks for the rational analysis. I want to keep the discussion at a scientific level.
I agree that Vector arithmetic also has a loop at the lowest level, unless use can be made of the parallel processing inherent in the newer CPU, but hardly used.
I too am aware of the extra multiplies in my routine - they costly in time, much more than additions. I acknowledged that in the flash backs into the past of euphoria, I noticed at least two slightly different (but still short and sweet) efforts which appear distinctly better than mine.
Purely on technical grounds, I would be very suspicious of the recursion in the current std/math.e version of abs() At best it is not necessary.
However, when I get some time, I will do extensive testing as outlined by me earlier, and will publish the result here including the methods used.
11. Re: absolute function
- Posted by Vinoba Mar 21, 2011
- 3702 views
Testing abs() function
I did extensive testing of the abs function for a single element integer, single element double value, a sequence of 1000 integers and a sequence of 1000 doubles.
The loop tested was had 1000000 iterations. Loop overhead was deducted from final results.
The following suggestion of mine was tested against the current system's abs function found in include\std\math.e
-- Method A return s + ( s * -2 * ( s < 0 ) )
I also tested the following code which was suggested by somebody in the past.
-- Method B return s * ((s > 0) - (s < 0))
Conclusions:
1. Method B CONSISTENTLY proved better than Method A
2. The system's current method, however clumsy it might look, CONSISTENTLY proved better than my proposed Method A, and the better proposed Method B.
3. The interpreted Euphoria results and compiled exe files gave nearly similar results, with the exe files doing only slightly better.
4. There is definitely NO SPEED GAIN by adopting either of the two changes suggested.
.. .. .. .. .. .. .. .. .. ..
Please note that the ratio numbers indicate time taken - therefore a number of greater than 1 means "slower" for the new changes suggested by me. .. .. .. .. .. .. .. .. .. ..
-- Method A New means Alternative abs - Exist means existing Abs fuction Time for execution 1000000 iterations using interpreted Euporia 4.01 Integer - 1 element: New, Exist, New/Exist ratio{0.047, 0.017, 2.764705882} Double - 1 element: New, Exist, New/Exist ratio{0.126, 0.032, 3.9375} Integer - 1000 elements: New, Exist, New/Exist ratio{25.282, 17.172, 1.472280457} Double - 1000 elements: New, Exist, New/Exist ratio{74.454, 38.36, 1.94092805}
.. .. .. .. .. .. .. .. .. ..
-- Method A compiled New means Alternative abs - Exist means existing Abs fuction Time for execution of 1000000 iterations using exe file created under Euporia 4.01 Bind Integer - 1 element: New, Exist, New/Exist ratio{0.047, 0.047, 1} Double - 1 element: New, Exist, New/Exist ratio{0.125, 0.047, 2.659574468} Integer - 1000 elements: New, Exist, New/Exist ratio{25.36, 17.062, 1.486343922} Double - 1000 elements: New, Exist, New/Exist ratio{83.828, 35.093, 2.388738495}
.. .. .. .. .. .. .. .. .. ..
-- Method B New means Alternative abs - Exist means existing Abs fuction Time for execution 1000000 iterations using interpreted Euporia 4.01 Integer - 1 element: New, Exist, New/Exist ratio{0.047, 0.017, 2.764705882} Double - 1 element: New, Exist, New/Exist ratio{0.094, 0.032, 2.9375} Integer - 1000 elements: New, Exist, New/Exist ratio{23.86, 16.735, 1.425754407} Double - 1000 elements: New, Exist, New/Exist ratio{48.954, 38.548, 1.269949154}
.. .. .. .. .. .. .. .. .. ..
-- Method B compiled New means Alternative abs - Exist means existing Abs fuction Time for execution 1000000 iterations using exe file created under Euporia 4.01 Bind Integer - 1 element: New, Exist, New/Exist ratio{0.047, 0.031, 1.516129032} Double - 1 element: New, Exist, New/Exist ratio{0.11, 0.062, 1.774193548} Integer - 1000 elements: New, Exist, New/Exist ratio{23.906, 16.812, 1.421960504} Double - 1000 elements: New, Exist, New/Exist ratio{49.047, 38.516, 1.273418839}
.. .. .. .. .. .. .. .. .. ..
While there are some discrepancies in the above numbers, I tried several times and the results are typical of what I got.
12. Re: absolute function
- Posted by DerekParnell (admin) Mar 21, 2011
- 3717 views
I did extensive testing of the abs function for a single element integer, single element double value, a sequence of 1000 integers and a sequence of 1000 doubles.
The loop tested was had 1000000 iterations.
Another aspect to consider is that it is expected that in real-world usage, find the absolute of integer values would be far more commonly done than for doubles, and the absolute for sequences would be a fairly rare occurrence. So it might be interesting to adjust the test counts to something more like ...
1500 integers
75 doubles
2 sequences of 1000 integers
1 sequence of 50 doubles
13. Re: absolute function
- Posted by Vinoba Mar 21, 2011
- 3687 views
1500 integers 75 doubles 2 sequences of 1000 integers 1 sequence of 50 doubles
Will try in a day or two. I presume you meant 1500 integers (or 75 doubles) supplied to abs() function sequentially in a loop rather than as a sequence.
14. Re: absolute function
- Posted by bill Mar 24, 2011
- 3534 views
function abs(integer x) return and_bits(-(x<0), -x) + and_bits(-(x>0), x) end function
for what it is worth.
15. Re: absolute function
- Posted by mattlewis (admin) Mar 24, 2011
- 3562 views
function abs(integer x) return and_bits(-(x<0), -x) + and_bits(-(x>0), x) end function
for what it is worth.
Pretty clever. I haven't tested, but for integers, a simple if statement should be faster:
function simple_abs( integer x ) if x >= 0 then return x else return -x end if end function
If you look at the IL code for the two of them:
SubProgram [/home/matt/eu/test/bill.ex-abs:00146] [x:148] STACK SPACE: 9 Required: 9 [/home/matt/eu/test/bill.ex:1] (1) 1: 096 148 # INTEGER_CHECK: [x:148] [/home/matt/eu/test/bill.ex:2] (2) 3: 001 148 143 149 # LESS: [x:148], [LIT 0:143] => [_temp_:149] 7: 012 149 150 # UMINUS: [_temp_:149] => [_temp_:150] 10: 012 148 151 # UMINUS: [x:148] => [_temp_:151] 13: 056 150 151 152 # AND_BITS: [_temp_:150], [_temp_:151] => [_temp_:152] 17: 006 148 143 153 # GREATER: [x:148], [LIT 0:143] => [_temp_:153] 21: 012 153 154 # UMINUS: [_temp_:153] => [_temp_:154] 24: 056 154 148 155 # AND_BITS: [_temp_:154], [x:148] => [_temp_:155] 28: 011 152 155 156 # PLUS: [_temp_:152], [_temp_:155] => [_temp_:156] 32: 208 150 # DEREF_TEMP: [_temp_:150] 34: 208 151 # DEREF_TEMP: [_temp_:151] 36: 208 152 # DEREF_TEMP: [_temp_:152] 38: 208 154 # DEREF_TEMP: [_temp_:154] 40: 208 155 # DEREF_TEMP: [_temp_:155] 42: 028 146 147 156 # RETURNF: [_temp_:156] block[147] 46: 043 # BADRETURNF: End SubProgram [abs:00146] SubProgram [/home/matt/eu/test/bill.ex-simple_abs:00157] [x:159] STACK SPACE: 3 Required: 3 [/home/matt/eu/test/bill.ex:5] (5) 1: 096 159 # INTEGER_CHECK: [x:159] [/home/matt/eu/test/bill.ex:6] (6) 3: 120 159 143 13 # GREATEREQ_IFW [x:159] > [LIT 0:143] goto 0007 else goto 0013 [/home/matt/eu/test/bill.ex:7] (7) 7: 028 157 158 159 # RETURNF: [x:159] block[158] [/home/matt/eu/test/bill.ex:8] (8) 11: 023 20 # ELSE goto 0020 [/home/matt/eu/test/bill.ex:9] (9) 13: 012 159 163 # UMINUS: [x:159] => [_temp_:163] 16: 028 157 158 163 # RETURNF: [_temp_:163] block[158] 20: 043 # BADRETURNF: End SubProgram [simple_abs:00157]
There are lots of operations created, and several temps.
Matt
16. Re: absolute function
- Posted by bill Mar 26, 2011
- 3451 views
OK,
I didn't promise that it would be fast.
But here is a ?serious? contender - from the bit-twiddling website.
include std/math.e function abs(x) integer v = shift_bits(x, 30) return xor_bits(x+v, v) end function
Can't promise it will be faster either.
17. Re: absolute function
- Posted by petelomax Mar 26, 2011
- 3501 views
Can't promise it will be faster either.
I can promise it won't work with floats though.
18. Re: absolute function
- Posted by bill Mar 27, 2011
- 3453 views
I could say floats are not really numbers...
The principle should work with floats if you change the power appropriately.
I used 2^30 because it works with integers. Presumably 2^52 or something like that sohould work with floats.
19. Re: absolute function
- Posted by mattlewis (admin) Mar 27, 2011
- 3422 views
I could say floats are not really numbers...
You could say lots of things like that, but you'd be wrong.
The principle should work with floats if you change the power appropriately.
I used 2^30 because it works with integers. Presumably 2^52 or something like that sohould work with floats.
If you could easily have low level access to the double, you could do this. It wouldn't work in pure euphoria though, with an atom, of course.
Matt
20. Re: absolute function
- Posted by alanjohnoxley Mar 28, 2011
- 3426 views
IMHO
Coding for Bulletproof, easily understandable, and fast will seldom result in the same code :) How about having the bulletproof vesion as the official abs(), and include a comment/link to notable fast and understandable versions (as comments) in the euphoria standard include libs?
That way, the developers won't need to maintain different versions of abs() but the other versions are visible to those who may have a interest in customised versions.
HTH!
21. Re: absolute function
- Posted by mattlewis (admin) Mar 28, 2011
- 3409 views
IMHO
Coding for Bulletproof, easily understandable, and fast will seldom result in the same code :) How about having the bulletproof vesion as the official abs(), and include a comment/link to notable fast and understandable versions (as comments) in the euphoria standard include libs?
That way, the developers won't need to maintain different versions of abs() but the other versions are visible to those who may have a interest in customised versions.
HTH!
Actually, the code we already have is the fastest, it's really pretty easy to understand, and seems pretty bulletproof (barring errors in the euphoria runtime!). What it isn't is terribly clever or compact. Fortunately, it has a pretty simple spec to follow.
Matt
22. Re: absolute function
- Posted by Vinoba Mar 28, 2011
- 3359 views
return s * sign ( s )
This code and similar other single line codes I have tried are easy to understand and bullet proof and valid for atoms or sequences, but as Matt quite rightly says, the current lengthy code with branching and iteration is still the fastest! One can blame the multiply and other functions people have used, but this can only get us into more lengthy debates. I have accepted the current code as the fastest, but am using one line code in place without calling the actual abs() function.
include std\math.e sequence vec = { 23.456, -76.893, -34.5, 0, 23.55 } ? vec ? vec * sign (vec) -- you could use this ? abs (vec) -- instead of this inside of a long series of math operations -- althouth the latter is faster -- if you are worried about the multi-line abs function or the iteration therein.
But then, even then sign function is also not straightforward
23. Re: absolute function
- Posted by alanjohnoxley Mar 30, 2011
- 3328 views
@Matt
Just wanted to suggest the comments to be included in the standard ABS() so that there is less rehashes of this thread...to save developer time :)