Re: stdlib: Binary search function

new topic     » goto parent     » topic index » view thread      » older message » newer message

Kenneth Rhodes wrote:
> 
> CChris wrote:
> > 
> > 
> > This has been an oft requested functionality, and was in principle agreed,
> > even
> > by Rob:
> > <a
> > href="http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=6&fromYear=1&toMonth=5&toYear=D&postedBy=&keywords=binary_find">http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=6&fromYear=1&toMonth=5&toYear=D&postedBy=&keywords=binary_find</a>
> > <a
> > href="http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=6&fromYear=1&toMonth=5&toYear=D&postedBy=&keywords=bfind">http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=6&fromYear=1&toMonth=5&toYear=D&postedBy=&keywords=bfind</a>
> > 
> > Some implementations have been mentioned already (G. Boehme, J. Babor,...).
> > 
> > Now, the devil is in the details. What shall the function do?
> > * accept only strictly increasing sequences
> > * accept only strictly monotoni sequences
> > * accept loosely monotonic sequences (with possible duplicates)
> > * only cope with <any selection above>, not checking
> > 
> > I think everyone agrees it should return like db_find_key(), ie n>0 when
> > found,
> > n<0 with ordered insertion resulting in item lading at position -n.
> > 
> > And another point which wasn't agreed upon: would this go to sort.e or
> > sequence.e?
> > 
> > The version I use in sets.e has a from and to extra parameters. They could
> > be
> > easily defaulted. Monoony could be passed as a parameter too, depending on
> > the
> > option chosen.
> > 
> > CChris
> 
> Well, if the Devil is in the details, lets see the details. 
> Babor, Boehme, and Otto's bfind is short sweet and simple.
> 
> 
> }}}
<eucode>
> global function bfind(object x, sequence s)
> -- does a binary search on a sorted sequence
> -- returns index location if found, 0 if not
> -- assumes that sequence s has been sorted prior to this call
> -- Gabriel Boehme's code
> -- Earlier Joe Otto's routine is almost identical.
> integer lo, hi, mid, c
>    lo = 1
>    hi = length(s)
> 
>    while lo <= hi do
>       mid = floor((lo + hi) / 2)
>       c = compare(x, s[mid])
>       if    c < 0 then  -- x < s[mid]
>          hi = mid - 1
>       elsif c > 0 then  -- x > s[mid]
>          lo = mid + 1
>       else              -- x = s[mid]
>          return mid
>       end if
>    end while
> 
>    return 0
> end function
> </eucode>
{{{

> 
> It sounds as though your binary find may be a bit of
> a swiss army knife approach. Please post your bfind
> code from sets.e and let us play arround with it--
> But if there are many details, there might be more Devils!
> 
> 
> Ken Rhodes
> Folding at Home: <a
> href="http://folding.stanford.edu/">http://folding.stanford.edu/</a>
> 100% MicroSoft Free
> SuSE Linux 10.3
> No AdWare, SpyWare, or Viruses!
> Life is Good,  smile

I realised I forgot about your request:
constant SETS_1 = {-1,1,-2} -- endgame for length 1
function bfind_(object x,sequence s,integer startpoint,integer endpoint)
-- the actul bfind() sanitizes args, then calls this
-- assumes s sorted with no duplicates, length(s)>=startpoint<=endpoint>0. No
checking.
  integer mid,c

  if length(s)=0 then
    return -1
  end if
  if startpoint<=0 then
    startpoint=1
  end if
  if endpoint>length(s) then
    endpoint=length(s)
  end if
-- check for outliers, which are worst cases 
  c=compare(x,s[startpoint])
  if c=-1 then
    return -1
  elsif c=0 then
    return 1
  end if
  c=compare(x,s[endpoint])
  if c=1 then
    return -endpoint-1
  elsif c=0 then
    return endpoint
  end if
  -- main loop
  while endpoint-startpoint>1 do
-- handling of duplicates
-- if equal(s[startpoint],s[endpoint]) then
--   return startpoint
-- end if
    mid=floor((startpoint+endpoint)/2)
    c=compare(x,s[mid])
    if c=1 then
      startpoint=mid
    elsif c=-1 then
      endpoint=mid
    else
      return mid
    end if
  end while
  return -startpoint
end function


One can change the order of the tests if it is thought that there will be more
hits than misses or the other way round.

Results are undefined is s is not increasing. When x has more than one instance
in s, the commented out handling code causes the choice of the exac index to be
undefined amongst the several possible answers. If s is decreasing, and the slope
is knon somehow, then some 1 and -1 may be replaced by ±slope.

CChris

new topic     » goto parent     » topic index » view thread      » older message » newer message

Search



Quick Links

User menu

Not signed in.

Misc Menu