1. Reverse: fastest yet, long post
- Posted by Hawke <mdeland at NWINFO.NET> Nov 30, 1998
- 761 views
On the Reverse() thread, I got to thinking about Rob's algorithm as opposed to mine. I then studied the results a little harder and had a few thoughts, which have panned out. In Rob's algorithm, he executes *four* actions within his loop, and I execute *two*. Well, you say, he only does half as many loops! That is what I thought as well, and dismissed it just like you likely did. Something kept nagging at me. Then I dawned upon the following: we are assuming that every assignment, every lookup of an element, within that sequence, and the subsequent movement of that pointer or that byte (depending on contents) only takes as much time as any other action. I think we believed this because binary is supposedly always faster, and results were shown that seemed to indicate this. We dismissed it. If each assignment, within his loop, the: temp=s[i] s[i]=s[n] s[n]=temp section, was, indeed, only to take a single clock tick, then Rob's algorithm and my algorithm would be *identical* in speed. It was the math that was haunting me, that nagging that something was not right. To wit: the following chart shows the clock tick ratio for a length 1000 'string'. The 'swap' heading are the number of element swaps that each of us do in the body of the loop. The 'sub' heading (which is equal) represents the n=n-1 that we both have to do within our loops, after the assignment(s). The first column shows how many iterations we each need to make to solve for a 1000 length seq. so for Rob, his 500 loops, 3 assignments, and 1 sub, with those 4 actions each costing 1 tick apiece, yields 2000 clock ticks for his loop. and for me, my 1000 loops, 1 assign, and 1 sub, with those 2 actions taking a tick apiece, yeilds: yep... same thing. len = 1000 --all swaps/subs are 1 tick swap sub tick 500 * (3*1 + 1) = 2000 rob 500 loops for 1000 len 1000* (1*1 + 1) = 2000 hawke 1000 loops for 1000 len ***tick counts are equal*** What this means: whoever does the least processing before entering and exiting that loop, _wins_. It also means that it's impossible for either algorithm, within the loop, to be better or worse. Now we get to my nagging. The lookup of an offset, and moving of a pointer or byte, I don't believe can occur in *ONE* clock tick, not with even the tiny overhead that EU imposes. I don't think it can be done even with asm tricks, like xchg, unless you did it *all* in asm. The overhead for EU to interpret and offset into the sequence I would have to say costs a tick or more. As soon as you place 2 ticks upon an exchange between sequence elements, even a simple one like: a[i] = b[j], as soon as we agree that it might indeed take 2 or more clock ticks to do that, Rob's algorithm fails, inside the loop, by a large margin, to equate with mine in clock tick cost. To wit: as of now, swaps of elements cost 2 ticks and the sub (n=n-1) still costs 1 tick, which *that* can be done by watcom as there is a Dec() command in asm, that can easily be accessed by Rob and the causeway extender, that costs a single tick. Looking at the math, with those parameters, we see that Rob, with 500 loops, 3 swaps (6 ticks now) and 1 sub (using a single tick), his new tick total is *larger* than mine. --swaps = 2tick sub = 1tick swap sub tick 500 * (3*2 + 1) = 3500 rob 500 loops for 1000 len 1000* (1*2 + 1) = 3000 hawke 1000 loops for 1000 len ***tick counts unequal*** We then start discussing repetitive iterations in the order of hundreds of thousands, and my curiousity got the better of me. I then reran our two functions on my machine here and came up with different results. I believe that the 'bad' answers arised from the call_proc usage. I believe they influnced the benchmarks more than we may have thought. Once I stripped his original benchmark of all routineID/call_proc the results became rather different indeed. I also noticed that his was not 'walking away' from mine like it did with his first results. The difference were much closer, and I also noticed some interesting trends to sequence size versus iterations. I then fine tuned mine, armed with a rather stable and seemingly very accurate, and very consistent benchmark program, to various ideas I had for improvement in the areas that my algorithm lacked behind his. I've also discovered that n=n-1 does *NOT* cost the same clock tick amount as n=n+1. And it *should*. Period. There is no reason, interpreted or not, that those two statements should not take exactly one tick on a 386+ machine. With this in mind, I submit the following benchmark program, and my latest algorithm compared against Rob's latest algorithm, and the result file created automatically appended to the end. If anyone runs this, and does not see the '!' marks in nearly the exact same places when they run this, please, post. Now, you may note, that as the sequences grow, I back down the iterations, as, I am not as patient apparently as Rob :) I can't do the 800+second thing :) The sequences are larger, so it's not an issue with memory, it's simply trying rep counts that bring the timings to manageable wait periods. If anyone sees a grievous error, please, post. I do not wish to spread bad code, especially since this may become 'the' algorithm for inclusion directly into EU. Analysis of results: -------------------- If you note, in the list of results (below), no matter the size or repetitions, my altered/new algorithm completed the task faster than Rob's, for *any* length sequence (that I tried, which is far more than displayed here) that was over length 10. Also, my algorithm exits nearly immediately for 0 and 1 based lengths, which will lend itself to 'if atom()' tests when/if we apply this to objects and recursion. My reasoning for the scattered timings in the range of 2-10 lengths is the repeat() function. I have to call that function and Rob doesnt. He has to call a floor and a divide however. Once my repeat time catches up, the algorithm can then can perform faster than Rob's. Trust me when I tell you I tried every trick to lose the time difference in the 2-10 range. I set up 10 functions that directly flipped the elements (ie: return {d[2],d[1]} if the len was 2, and return {d[3],d[2],d[1]} if the len was three, and so on) and then I created a fast lookup table that was ordered to length being 0-10. I then created a routineID hash into that table so I could use find() based upon the value of len, and call a proc directly to hard code flip those indicies. something like: func rev0 return d func rev1 return d func rev2 return {d[2],d[1]} func rev3 return {d[3],d[2],d[1]} etc etc etc revID = {"rev0","rev1","rev2","rev3",...etc} lookup = repeat(0,10) for i = 1 to 10 do revID[i] = routineID(revID[i]) lookup[i]= i end for i did this as soon as the program started, outside all functions. my reverse then became: func reverse(seq d) int len seq new len = length(d) if len < 10 then if len < 2 then return d end if return call_func( revID[find(len,lookup)], {d} ) end if new = repeat(0,len) for i = 1 to len do.... etc return new end func now, you would think this to be *fast* for those ranges. nope, very very slow and that is what clued me most of all to the other benchmark program being slightly askew. all right i say, I try a binary if..then tree for the values from 0-10, but that didn't help... len = length(d) if len < 2 return d if len < 10 then if len < 6 then etc... with each stage shortcircuiting as much as possible, basically cutting and pasting the code used to flip the indexes from the above rev0, rev1, rev2, rev3, rev4 etc functions... that didn't help. So, I say, well, quick find and short circuits aren't working, and they really should be, since i am really not facing competition here with the floor and division in Rob's algorithm... One of those two methods should have outpaced those 'cumbersome' calculations. I then stumbled upon the thought that repeat() was my problem. so i made another quicklist, of sorts. But not holding values, but instead holding preallocated memory, so i wouldn't have to stub out to the function repeat for such a trivial amount of space required. constant premade = {{},{0},{0,0},{0,0,0},{0,0,0,0},... etc up to length 10. i altered the entry to the reverse function which involved duplicate code (that, if this had worked, i would have simply made into seperate functions if need be) to look as so: func rev(seq d) int len seq new len = length(d) if len<2 then return d if len<10 then new = premade[len] for i = 1 to len do blahblahblah return new end if new = repeat(0,len) for i = 1 to... do blah return new end func see, this way, i dont have to call the repeat() function to allocate, i simply point right into premade... and grab my new seq that way. *sigh* even that didn't shave quite enough. What actually made the difference, and began out pacing Rob's algorithm for every value above that 10 and for almost all values below 10, (the odd/even numbers, if you notice... queer eh?) was the reversing of the for loop in my algorithm... and the simple act alone of exiting for the len < 2 condition only... This is where i discovered all of the claims I have made in this post, and this is the final, fastest version based upon _lengthy_ testing of all the above ideals against both forward and backward counting for loops. thank you for your time, if you have read this far. take care all, and care of indeed --Hawke' code and results follow: --------------------------BEGIN CODE without type_check include machine.e tick_rate(1000) object junk, before, after, fn procedure bothputs(object data) puts(1, data) puts(fn,data) end procedure procedure bothprintf(sequence format,object data) printf(1, format,data) printf(fn,format,data) end procedure function min(atom a, atom b) if b<a then return b end if return a end function function rev1(sequence d) -- Hawke' sequence new --t is to, f is from integer len, f len = length(d) if len < 2 then return d end if --simple case, bye! f = 0 new = repeat(0,len) for t = len to 1 by -1 do --faster to make loop go down f = f + 1 new[t] = d[f] --and INCrement inside loop end for return new end function function rev2(sequence s) -- Rob Craig object temp integer n n = length(s) for i = 1 to floor(n/2) do temp=s[i] s[i]=s[n] s[n]=temp n=n-1 end for return s end function function measure(sequence what, integer reps) -- time a routine, pass the timing result back atom start object junk, res res = {0,0} junk = {} start = time() for i = 1 to reps do junk = rev1(what) end for res[1]= time() - start junk = {} start = time() for i = 1 to reps do junk = rev2(what) end for res[2]= time() - start if get_key() then end if return res end function procedure preproc(integer reps) for i = 1 to length(before) do junk = before[i] bothprintf ("%6d", length(junk)) after= measure(junk, reps) junk = min(after[1],after[2]) bothprintf( "%10.4f" & (' ' + (junk = after[1])), after[1] ) bothprintf( "%10.4f" & (' ' + (junk = after[2])), after[2] ) bothprintf( " reps:%d", reps ) bothputs('\n') end for end procedure after = {0,0,0} fn = open("results.rev","w") if fn = -1 then puts(1,"cant open") abort(1) end if for run = 1 to 3 do bothputs(sprintf("Run number:%d\n",run)) bothputs("length Hawke Rob !=winner \n") bothputs("---------------------------- \n") before = {{},"a","aa","AAA","AAAA"} for i= 5 to 10 do before=append(before,repeat('a',i)) end for preproc(100000) before = {} for i= 25 to 100 by 25 do before=append(before,repeat('b',i)) end for for i= 100 to 500 by 100 do before=append(before,repeat('c',i)) end for preproc(10000) before = {} for i=1000 to 5000 by 500 do before=append(before,repeat('d',i)) end for preproc(1000) before = {} for i=10000 to 50000 by 5000 do before=append(before,repeat('d',i)) end for preproc(100) bothputs("\n\n") end for close(fn) ------------------END CODE ------------------------RESULTS----------------- ---the ! mark indicates the faster of the two results Run number:1 length Hawke Rob !=winner ---------------------------- 0 0.1330! 0.1470 reps:100000 1 0.1350! 0.1510 reps:100000 2 0.2820 0.2770! reps:100000 3 0.3180 0.2810! reps:100000 4 0.3409! 0.3449 reps:100000 5 0.3769 0.3489! reps:100000 6 0.4139! 0.4269 reps:100000 7 0.4509 0.4299! reps:100000 8 0.4809! 0.4889 reps:100000 9 0.5109 0.4919! reps:100000 10 0.5209! 0.5549 reps:100000 25 0.0990! 0.1010 reps:10000 50 0.1730! 0.1870 reps:10000 75 0.2500! 0.2640 reps:10000 100 0.3220! 0.3479 reps:10000 100 0.3230! 0.3509 reps:10000 200 0.6359! 0.6939 reps:10000 300 0.9349! 1.0208 reps:10000 400 1.2388! 1.3548 reps:10000 500 1.5528! 1.7047 reps:10000 1000 0.3100! 0.3399 reps:1000 1500 0.4919! 0.5409 reps:1000 2000 0.6749! 0.7459 reps:1000 2500 0.8749! 0.9858 reps:1000 3000 1.0288! 1.2048 reps:1000 3500 1.2178! 1.4438 reps:1000 4000 1.3968! 1.6597 reps:1000 4500 1.5968! 1.8867 reps:1000 5000 1.7897! 2.0887 reps:1000 10000 0.3899! 0.4209 reps:100 15000 0.6059! 0.6619 reps:100 20000 0.8399! 0.8809 reps:100 25000 1.0518! 1.1268 reps:100 30000 1.2648! 1.3358 reps:100 35000 1.4988! 1.5638 reps:100 40000 1.7067! 1.7937 reps:100 45000 1.9587! 2.0717 reps:100 50000 2.1657! 2.2877 reps:100 Run number:2 length Hawke Rob !=winner ---------------------------- 0 0.1370! 0.1470 reps:100000 1 0.1360! 0.1490 reps:100000 2 0.2840 0.2780! reps:100000 3 0.3180 0.2780! reps:100000 4 0.3429 0.3429! reps:100000 5 0.3779 0.3519! reps:100000 6 0.4119! 0.4249 reps:100000 7 0.4519 0.4299! reps:100000 8 0.4829! 0.4879 reps:100000 9 0.5109 0.4899! reps:100000 10 0.5229! 0.5559 reps:100000 25 0.0970! 0.1020 reps:10000 50 0.1710! 0.1890 reps:10000 75 0.2470! 0.2650 reps:10000 100 0.3230! 0.3529 reps:10000 100 0.3200! 0.3519 reps:10000 200 0.6319! 0.6919 reps:10000 300 0.9369! 1.0268 reps:10000 400 1.2418! 1.3598 reps:10000 500 1.5518! 1.6947 reps:10000 1000 0.3100! 0.3409 reps:1000 1500 0.4919! 0.5419 reps:1000 2000 0.6739! 0.7439 reps:1000 2500 0.8769! 0.9839 reps:1000 3000 1.0288! 1.2078 reps:1000 3500 1.2148! 1.4428 reps:1000 4000 1.3898! 1.6607 reps:1000 4500 1.5978! 1.8857 reps:1000 5000 1.7937! 2.0887 reps:1000 10000 0.3929! 0.4249 reps:100 15000 0.5989! 0.6389 reps:100 20000 0.8189! 0.8729 reps:100 25000 1.0348! 1.0938 reps:100 30000 1.2728! 1.3468 reps:100 35000 1.4838! 1.5668 reps:100 40000 1.7107! 1.8027 reps:100 45000 1.9527! 2.0657 reps:100 50000 2.1707! 2.2817 reps:100 Run number:3 length Hawke Rob !=winner ---------------------------- 0 0.1370! 0.1470 reps:100000 1 0.1350! 0.1490 reps:100000 2 0.2850 0.2770! reps:100000 3 0.3309 0.2820! reps:100000 4 0.3489! 0.3499 reps:100000 5 0.3779 0.3539! reps:100000 6 0.4259! 0.4289 reps:100000 7 0.4579 0.4359! reps:100000 8 0.4989 0.4949! reps:100000 9 0.5219 0.4909! reps:100000 10 0.5219! 0.5559 reps:100000 25 0.0980! 0.1010 reps:10000 50 0.1760! 0.1860 reps:10000 75 0.2500! 0.2640 reps:10000 100 0.3230! 0.3469 reps:10000 100 0.3250! 0.3499 reps:10000 200 0.6339! 0.6909 reps:10000 300 0.9349! 1.0188 reps:10000 400 1.2358! 1.3508 reps:10000 500 1.5448! 1.6927 reps:10000 1000 0.3130! 0.3399 reps:1000 1500 0.4909! 0.5419 reps:1000 2000 0.6749! 0.7429 reps:1000 2500 0.8779! 0.9849 reps:1000 3000 1.0278! 1.2038 reps:1000 3500 1.2168! 1.4438 reps:1000 4000 1.3908! 1.6597 reps:1000 4500 1.5938! 1.8857 reps:1000 5000 1.7877! 2.0917 reps:1000 10000 0.3939! 0.4259 reps:100 15000 0.5989! 0.6349 reps:100 20000 0.8199! 0.8769 reps:100 25000 1.0358! 1.0958 reps:100 30000 1.2698! 1.3468 reps:100 35000 1.4868! 1.5638 reps:100 40000 1.7147! 1.7997 reps:100 45000 1.9547! 2.0667 reps:100 50000 2.1707! 2.2807 reps:100
2. Re: Reverse: fastest yet, long post
- Posted by Robert Craig <rds at EMAIL.MSN.COM> Nov 30, 1998
- 555 views
Hawke writes: <long post about how he now has the fastest reverse()> I ran your program on my Pentium-150 and confirmed that your new reverse() is faster than mine. So... I made an even faster reverse() that beats your new one on all sizes from about 10 up. I copied it into your program below: Run number:1 length Hawke Rob !=winner ---------------------------- 0 0.1220! 0.2940 reps:100000 1 0.1210! 0.3230 reps:100000 2 0.3439 0.3439! reps:100000 3 0.3819! 0.4219 reps:100000 4 0.4249! 0.4289 reps:100000 5 0.4719! 0.5129 reps:100000 6 0.5149! 0.5189 reps:100000 7 0.5589! 0.5909 reps:100000 8 0.6019 0.5979! reps:100000 9 0.6709! 0.6939 reps:100000 10 0.6649 0.6459! reps:100000 25 0.1300 0.1280! reps:10000 50 0.2340 0.2180! reps:10000 75 0.3429 0.3190! reps:10000 100 0.4579 0.4309! reps:10000 100 0.4489 0.4209! reps:10000 200 0.9139 0.8859! reps:10000 300 1.3528 1.2578! reps:10000 400 1.8557 1.7347! reps:10000 500 2.2827 2.1527! reps:10000 1000 0.4829 0.4529! reps:1000 1500 0.7449 0.6919! reps:1000 2000 1.0028 0.9409! reps:1000 2500 1.2688 1.1798! reps:1000 3000 1.6667 1.5708! reps:1000 3500 1.9027 1.7747! reps:1000 4000 2.0727 1.9617! reps:1000 4500 2.5276 2.3826! reps:1000 5000 2.9605 2.6516! reps:1000 10000 0.5899 0.5449! reps:100 15000 0.8899 0.8309! reps:100 20000 1.2128 1.1198! reps:100 25000 1.5138 1.4328! reps:100 30000 1.8707 1.7387! reps:100 35000 2.2907 2.0557! reps:100 40000 2.5706 2.3446! reps:100 45000 2.7946 2.6456! reps:100 50000 3.1335 2.9925! reps:100 Run number:2 length Hawke Rob !=winner ---------------------------- 0 0.1210! 0.2420 reps:100000 1 0.1200! 0.3230 reps:100000 2 0.3289! 0.3299 reps:100000 3 0.3819! 0.4249 reps:100000 4 0.4259! 0.4269 reps:100000 5 0.4699! 0.5109 reps:100000 6 0.5779 0.5749! reps:100000 7 0.5579! 0.5939 reps:100000 8 0.6029 0.5949! reps:100000 9 0.6969! 0.7219 reps:100000 10 0.6619 0.6499! reps:100000 25 0.1300 0.1280! reps:10000 50 0.2330 0.2180! reps:10000 75 0.3399 0.3200! reps:10000 100 0.4629! 0.4789 reps:10000 100 0.4529 0.4139! reps:10000 200 0.9119 0.8729! reps:10000 300 1.3468 1.2438! reps:10000 400 1.8437 1.7247! reps:10000 500 2.2767 2.1427! reps:10000 1000 0.4869 0.4529! reps:1000 1500 0.7309 0.6839! reps:1000 2000 1.0008 0.9139! reps:1000 2500 1.3118 1.2258! reps:1000 3000 1.5098 1.4218! reps:1000 3500 1.8077 1.6797! reps:1000 4000 2.1597 2.0037! reps:1000 4500 2.3326 2.1927! reps:1000 5000 2.6126 2.4596! reps:1000 10000 0.5619 0.5269! reps:100 15000 0.8649 0.8029! reps:100 20000 1.1678 1.0918! reps:100 25000 1.4658 1.3788! reps:100 30000 1.8637 1.7367! reps:100 35000 2.3116 2.0627! reps:100 40000 2.5726 2.3496! reps:100 45000 2.8016 2.6366! reps:100 50000 3.1275 2.9945! reps:100 Run number:3 length Hawke Rob !=winner ---------------------------- 0 0.1200! 0.2420 reps:100000 1 0.1210! 0.3240 reps:100000 2 0.3319 0.3309! reps:100000 3 0.3809! 0.4229 reps:100000 4 0.4239! 0.4299 reps:100000 5 0.4709! 0.5119 reps:100000 6 0.5149! 0.5179 reps:100000 7 0.5629! 0.5929 reps:100000 8 0.6629 0.6519! reps:100000 9 0.6479! 0.6769 reps:100000 10 0.7099 0.6989! reps:100000 25 0.1310 0.1280! reps:10000 50 0.2420 0.2240! reps:10000 75 0.3389 0.3190! reps:10000 100 0.4719 0.4399! reps:10000 100 0.4579 0.4139! reps:10000 200 0.9079 0.8769! reps:10000 300 1.3448 1.2468! reps:10000 400 1.8447 1.7237! reps:10000 500 2.2727 2.1417! reps:10000 1000 0.4839 0.4559! reps:1000 1500 0.7309 0.6799! reps:1000 2000 1.0018 0.9139! reps:1000 2500 1.3118 1.2228! reps:1000 3000 1.5088 1.4258! reps:1000 3500 1.8107 1.6797! reps:1000 4000 2.1587 2.0037! reps:1000 4500 2.3336 2.1937! reps:1000 5000 2.6116 2.4636! reps:1000 10000 0.5629 0.5259! reps:100 15000 0.8659 0.8049! reps:100 20000 1.1678 1.0938! reps:100 25000 1.4658 1.3828! reps:100 30000 1.8517 1.7367! reps:100 35000 2.3186 2.0647! reps:100 40000 2.5696 2.3496! reps:100 45000 2.7936 2.6416! reps:100 50000 3.1265 2.9965! reps:100 Here's your program with my new rev2(): --------------------------BEGIN CODE without type_check include machine.e tick_rate(1000) object junk, before, after, fn procedure bothputs(object data) puts(1, data) puts(fn,data) end procedure procedure bothprintf(sequence format,object data) printf(1, format,data) printf(fn,format,data) end procedure function min(atom a, atom b) if b<a then return b end if return a end function function rev1(sequence d) -- Hawke' sequence new --t is to, f is from integer len, f len = length(d) if len < 2 then return d end if --simple case, bye! f = 0 new = repeat(0,len) for t = len to 1 by -1 do --faster to make loop go down f = f + 1 new[t] = d[f] --and INCrement inside loop end for return new end function -- function rev2(sequence s) -- Rob Craig -- old -- object temp -- integer n -- n = length(s) -- for i = 1 to floor(n/2) do -- temp=s[i] s[i]=s[n] s[n]=temp n=n-1 -- end for -- return s -- end function function rev2(sequence s) -- Rob Craig, new! integer lower, n sequence t n = length(s) t = repeat(0, n) lower = 1 for upper = n to floor(n/2)+1 by -1 do t[upper] = s[lower] t[lower] = s[upper] lower = lower + 1 end for return t end function function measure(sequence what, integer reps) -- time a routine, pass the timing result back atom start object junk, res res = {0,0} junk = {} start = time() for i = 1 to reps do junk = rev1(what) end for res[1]= time() - start junk = {} start = time() for i = 1 to reps do junk = rev2(what) end for res[2]= time() - start if get_key() then end if return res end function procedure preproc(integer reps) for i = 1 to length(before) do junk = before[i] bothprintf ("%6d", length(junk)) after= measure(junk, reps) junk = min(after[1],after[2]) bothprintf( "%10.4f" & (' ' + (junk = after[1])), after[1] ) bothprintf( "%10.4f" & (' ' + (junk = after[2])), after[2] ) bothprintf( " reps:%d", reps ) bothputs('\n') end for end procedure after = {0,0,0} fn = open("results.rev","w") if fn = -1 then puts(1,"cant open") abort(1) end if for run = 1 to 3 do bothputs(sprintf("Run number:%d\n",run)) bothputs("length Hawke Rob !=winner \n") bothputs("---------------------------- \n") before = {{},"a","aa","AAA","AAAA"} for i= 5 to 10 do before=append(before,repeat('a',i)) end for preproc(100000) before = {} for i= 25 to 100 by 25 do before=append(before,repeat('b',i)) end for for i= 100 to 500 by 100 do before=append(before,repeat('c',i)) end for preproc(10000) before = {} for i=1000 to 5000 by 500 do before=append(before,repeat('d',i)) end for preproc(1000) before = {} for i=10000 to 50000 by 5000 do before=append(before,repeat('d',i)) end for preproc(100) bothputs("\n\n") end for close(fn) ------------------END CODE Regards, Rob Craig Rapid Deployment Software http://members.aol.com/FilesEu/
3. Re: Reverse: fastest yet, long post
- Posted by Albert Brauneis <Dajawu36 at AOL.COM> Nov 30, 1998
- 580 views
In a message dated 11/30/98 7:59:39 PM !!!First Boot!!!, rds at EMAIL.MSN.COM writes: << I ran your program on my Pentium-150 and confirmed that your new reverse() is faster than mine. So... I made an even faster reverse() that beats your new one on all sizes from about 10 up. I copied it into your program below: >> You guys are crazy!! :)