Re: which is faster?
- Posted by Robert Craig <rds at RapidEuphor?a.com> Nov 08, 2007
Matt Lewis wrote:
> CChris wrote:
> > I thought that precomputing s[i] would speed things up, but this didn't
> > occur.
> > Not sure why.
>
> I think you didn't use s[i] enough to make the difference. Also, the
> interpreter is smart enough to reuse the temp variable that it creates.
> Here's the IL that's generated when you don't try to cache s[i]:
>
>     40: 058 119           # STARTLINE: string.exw(119)<<sum += power(s[i], 2)>>
>     42: 025 171 178 179   # RHS_SUBS: [s:171] sub [i:178] => [_temp_:179]
>     46: 013 179 179 179   # MULTIPLY: [_temp_:179], [_temp_:179] => [_temp_:179]
>     50: 011 174 179 174   # PLUS: [sum:174], [_temp_:179] => [sum:174]
>
> Here's the IL when you *do* cache s[i]:
>
>     52: 058 138           # STARTLINE: string.exw(138)<<sd = s[i]>>
>     54: 025 181 189 185   # RHS_SUBS: [s:181] sub [i:189] => [sd:185]
>     58: 101 185           # ATOM_CHECK: [sd:185]
>     60: 087 185           # DISPLAY_VAR: [sd:185]
>     62: 058 139           # STARTLINE: string.exw(139)<<sum += sd*sd>>
>     64: 013 185 185 190   # MULTIPLY: [sd:185], [sd:185] => [_temp_:190]
>     68: 011 184 190 184   # PLUS: [sum:184], [_temp_:190] => [sum:184]
>
> The STARTLINE and DISPLAY_VAR codes are due to with trace (which I put in
> so we could see the actual lines for clarity) and don't really affect
> the algorithm (assuming that a production release would make sure that
> without trace was in effect).
>
> But the bottom line is that you've added an extra ATOM_CHECK op that
> doesn't happen when you use it straight. I think the interpreter is able
> to do this because it's on the same line (though I'll admit that the
> logic of when to use temps and when to save them is still opaque to
> me, so Rob is probably the only one who could explain this without poring
> over the code).

When type_check is in force (the default), the interpreter has to do the
ATOM_CHECK whenever you assign to a variable declared as atom. If you say
"without type_check", that check should be eliminated.
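
For illustration, here is a minimal sketch of the cached version with type checking turned off. The function name sum_sq_cached and the test data are made up for this example and are not from CChris's string.exw. Whether the ATOM_CHECK really disappears is something you would want to confirm by dumping the IL yourself.

```euphoria
-- Sketch only: sum_sq_cached is a hypothetical name, not code from the thread.
without type_check  -- ask the interpreter to skip type checks on assignment

function sum_sq_cached(sequence s)
    atom sum, sd
    sum = 0
    for i = 1 to length(s) do
        sd = s[i]        -- with type_check on, this assignment costs an ATOM_CHECK
        sum += sd * sd   -- same arithmetic as power(s[i], 2)
    end for
    return sum
end function

? sum_sq_cached({1, 2, 3})  -- prints 14
```

Timing this against the uncached "sum += power(s[i], 2)" form, with and without the directive, would show how much of the difference comes from the check alone.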

As Michael Sabal noted, power(x, 2) is optimized to x * x (see emit.e),
so you get:

    temp = s[i]
    temp = temp * temp

It's smart enough to re-use the same temp.

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com