Re: FastFind() for sorted sequences

new topic     » goto parent     » topic index » view thread      » older message » newer message

Been there. It's officialy called binary search. And yes it is very fast i
have done some benchmarks myself. You can find it also in database.e.

----- Original Message -----
From: "Jonathan Leger" <jleger at amerisafe.com>
To: "EUforum" <EUforum at topica.com>
Subject: FastFind() for sorted sequences


>
> Hey folks,
>
> In working out an indexing routine for a new Eu database I'm working
> on, I wrote a routine that you might find useful and thought I'd
> post it here.
>
> It's called FastFind(), and is the equivalent of Find(), but only
> works on sorted sequences.  However, it works _very_ _very_ fast.
>
> For example, on my PC, finding 50,000 randomly selected values in a
> sequences of 50,000 integers took Find() 10 seconds, but only took
> FastFind() 0.6 seconds.  I've pasted in the code with the routine as
> well as the benchmarking program.  Hope you guys find it useful!
>
> Jonathan Leger
> ---START CODE---
> include sort.e
>
> function fastfind(object x, sequence s)
>
>     -- Perform a search in a sequence for a given value X.
>     -- Returns location of X in S, or 0 otherwise.
>     -- S must be sorted.
>
>     -- As you will see when you run the benchmarks,
>     -- using fastfind() on a sorted sequence is magnitudes
>     -- faster than using find().
>
>     integer i_half
>     integer i_min
>     integer i_max
>
>     if length(s) = 0 then
>         -- The sequence length is 0, so return
>         -- a 0 to have user prepend item.
>         return 0
>     end if
>
>     if x > s[length(s)] then
>         -- The value we're looking for is greater
>         -- than the last value, so return 0.
>         return 0
>     elsif x < s[1] then
>         -- The value we're looking for is less
>         -- than the first value, so return 0.
>         return 0
>     end if
>
>     i_min = 1
>     i_max = length(s)
>
>     while 1 do
>         -- Find center of current sequence portion.
>         i_half = floor((i_min - 1) + ((i_max - (i_min - 1)) / 2))
>
>         if x = s[i_half] then
>             -- Found it.  Return index.
>             return i_half
>         elsif s[i_half] < x then
>             -- Adjust min to current half plus one.
>             i_min = i_half + 1
>         else
>             -- Adjust max to current half minus one.
>             i_max = i_half
>         end if
>
>         -- If min and max are equal, then the
>         -- value is not in the sequence.
>         -- Return 0.
>         if i_min = i_max then
>             return 0
>         end if
>     end while
>
> end function
>
>
> -- Run some comparison benchmarks for fastfind() and find().
>
> integer i_count     -- Number of values to run.
> sequence values     -- Some random values.
> atom n_start        -- Start time (for benchmark).
> atom n_end          -- End time.
> atom n_time         -- Seconds elapsed.
> object dummy        -- For benchmark.
>
>
> -- Generate some random values.
> puts(1, "Generating values...\n")
> values = {}
> i_count = 50000
> for i = 1 to i_count do
>     values = append(values, rand(i_count))
> end for
>
> -- Sort the values.
> puts(1, "Sorting values...\n")
> values = sort(values)
>
<snip>

>
>
>

new topic     » goto parent     » topic index » view thread      » older message » newer message

Search



Quick Links

User menu

Not signed in.

Misc Menu