1. match() speed test
- Posted by useless_ Dec 11, 2012
- 953 views
I wrote a short test to determine the fastest way to locate substrings in a huge file. The results are unexpected. The code is at http://openeuphoria.org/pastey/175.wc , typical results are at http://openeuphoria.org/pastey/176.wc . Often, time2 is a clean sweep winner. I alternated the test order.
It's unexpected that lookng for the longer term is faster than the shorter term.
Looking for "eleven" was faster even if "x" was before "eleven", and even if i put an "e" in every needle (to attempt tripping up the search for "eleven" (the "e" would hit on every needle)).
How can this be? Is my test defective?
All that said, the speeds are very impressive: average is 260 nanoseconds per each needle!
useless
2. Re: match() speed test
- Posted by mattlewis (admin) Dec 11, 2012
- 953 views
Looking for "eleven" was faster even if "x" was before "eleven", and even if i put an "e" in every needle (to attempt tripping up the search for "eleven" (the "e" would hit on every needle)).
How can this be? Is my test defective?
I think that the answer is that with a longer needle, match() stops looking sooner, since there's no point in looking past the last element where the needle could possibly start. Since most of your tests have match() looking at non-matches, the longer needle does many fewer comparisons.
Try shorter haystacks for test1, and you should see the longer needle taking longer to find.
Matt
3. Re: match() speed test
- Posted by useless_ Dec 11, 2012
- 955 views
Looking for "eleven" was faster even if "x" was before "eleven", and even if i put an "e" in every needle (to attempt tripping up the search for "eleven" (the "e" would hit on every needle)).
How can this be? Is my test defective?
I think that the answer is that with a longer needle, match() stops looking sooner, since there's no point in looking past the last element where the needle could possibly start. Since most of your tests have match() looking at non-matches, the longer needle does many fewer comparisons.
Try shorter haystacks for test1, and you should see the longer needle taking longer to find.
Matt
So match does a length test also?
useless
4. Re: match() speed test
- Posted by useless_ Dec 11, 2012
- 937 views
Ah, it must be. I changed "ppppp" (5 chars long) to "eppppp" (same len as "eleven") on every needle, causing the match for "eleven" to take a speed hit.
useless
5. Re: match() speed test
- Posted by mattlewis (admin) Dec 11, 2012
- 913 views
Looking for "eleven" was faster even if "x" was before "eleven", and even if i put an "e" in every needle (to attempt tripping up the search for "eleven" (the "e" would hit on every needle)).
How can this be? Is my test defective?
I think that the answer is that with a longer needle, match() stops looking sooner, since there's no point in looking past the last element where the needle could possibly start. Since most of your tests have match() looking at non-matches, the longer needle does many fewer comparisons.
Try shorter haystacks for test1, and you should see the longer needle taking longer to find.
So match does a length test also?
Sort of. It uses a while loop, but stops the loop at the length of the sequence plus one minus the length, since there's no point in looking for the needle to start if there aren't enough elements left from where it's looking.
Matt