1. Which is faster ?
- Posted by bernie Feb 17, 2009
- 998 views
On a long sequence of items which is faster using the
find() function or searching with a for loop ???
2. Re: Which is faster ?
- Posted by mattlewis (admin) Feb 17, 2009
- 970 views
On a long sequence of items which is faster using the
find() function or searching with a for loop ???
find() should always be faster than a manual loop. In fact, find() is even faster in 4.0 than it was in 3.1. If the list gets too long, however, other methods may be able to beat it, such as a binary search (of which there is an implementation in the 4.0 std library). But of course, you have to make sure that you sort your data first.
Matt
3. Re: Which is faster ?
- Posted by DerekParnell (admin) Feb 17, 2009
- 984 views
- Last edited Feb 18, 2009
On a long sequence of items which is faster using the
find() function or searching with a for loop ???
find() should always be faster than a manual loop. In fact, find() is even faster in 4.0 than it was in 3.1. If the list gets too long, however, other methods may be able to beat it, such as a binary search (of which there is an implementation in the 4.0 std library). But of course, you have to make sure that you sort your data first.
Matt
Just a bit more clarification, find() is faster for long sequences. If the sequence is, say less than 10-15 items, then a loop can be faster. Also, the type of data in the sequence can influence the speed too. Generally, comparing integers (using find() or '=') is faster than comparing atoms, which is in turn fast than comparing sequences.
However, following Matt's advice, if you know something about the contents of the sequence before seeking, you can often come up with a smarter way of searching it.
- A sorted sequence would benefit from a binary search.
- Or if there are mostly only a few items that are found much more than other items, arrange to have them at the front of the sequence.
- Or if the sequence is fairly static (doesn't change much), you might consider a hash map instead.
- Maybe you can arrange the sequence to contain a set of sub-sequences based of the length of the item. Such that all 1-element items are in the first sub-sequence, 2-element items in the second sub-sequence etc.... Then when doing a search, you can quick work out which sub-sequence your item is contained it.
4. Re: Which is faster ?
- Posted by mattlewis (admin) Feb 17, 2009
- 977 views
- Last edited Feb 18, 2009
Just a bit more clarification, find() is faster for long sequences. If the sequence is, say less than 10-15 items, then a loop can be faster.
Do you have any benchmarks that demonstrate this? I ran the following on r1380, and find() always won. I changed the length of the sequence, changed the values to be all integers (and used = instead of equal()), etc. Actually, the loop won one time, when s = {""}. There's just a lot more overhead in a euphoria loop than the native C loop in the backend find function (not to mention your optimizations for non-integer data).
sequence s = { 1,"2",3,4.4,5,6,7} constant REPS = 1000000 atom t = time() for i = 1 to REPS do for j = 1 to length(s) do object x = s[j] integer ix = find( j, s ) end for end for atom find_time = time() - t t = time() for i = 1 to REPS do for j = 1 to length(s) do for k = 1 to length(s) do object x = s[j] if equal( s[k], j ) then exit end if end for end for end for atom loop_time = time() - t printf(1, "%d repetitions:\nfind: %1.3f\nloop: %1.3f\n", { REPS, find_time, loop_time })
Matt
5. Re: Which is faster ?
- Posted by DerekParnell (admin) Feb 17, 2009
- 981 views
- Last edited Feb 18, 2009
Just a bit more clarification, find() is faster for long sequences. If the sequence is, say less than 10-15 items, then a loop can be faster.
Do you have any benchmarks that demonstrate this?
No. Actually I mis-remembered. What I was thinking about was the difference between find() and map() - there is a point at which map() becomes faster than find(). But you are correct, find() is always faster than looping if you have an unordered sequence to scan through.
6. Re: Which is faster ?
- Posted by mattlewis (admin) Feb 18, 2009
- 958 views
No. Actually I mis-remembered. What I was thinking about was the difference between find() and map() - there is a point at which map() becomes faster than find(). But you are correct, find() is always faster than looping if you have an unordered sequence to scan through.
Ok. That I believe.
Matt
7. Re: Which is faster ?
- Posted by bernie Feb 18, 2009
- 936 views
No. Actually I mis-remembered. What I was thinking about was the difference between find() and map() - there is a point at which map() becomes faster than find(). But you are correct, find() is always faster than looping if you have an unordered sequence to scan through.
Ok. That I believe.
Matt
Thanks Derek and Matt for the information.
I gave up using maps because i kept running
into problems that I could not trace the origin of
and the maps seemed to be slower because of the overhead of calling them.