1. stdlib: Binary search function
This has been an oft requested functionality, and was in principle agreed, even
by Rob:
http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=6&fromYear=1&toMonth=5&toYear=D&postedBy=&keywords=binary_find
http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=6&fromYear=1&toMonth=5&toYear=D&postedBy=&keywords=bfind
Some implementations have been mentioned already (G. Boehme, J. Babor,...).
Now, the devil is in the details. What shall the function do?
* accept only strictly increasing sequences
* accept only strictly monotoni sequences
* accept loosely monotonic sequences (with possible duplicates)
* only cope with <any selection above>, not checking
I think everyone agrees it should return like db_find_key(), ie n>0 when found,
n<0 with ordered insertion resulting in item lading at position -n.
And another point which wasn't agreed upon: would this go to sort.e or
sequence.e?
The version I use in sets.e has a from and to extra parameters. They could be
easily defaulted. Monoony could be passed as a parameter too, depending on the
option chosen.
CChris
2. Re: stdlib: Binary search function
CChris wrote:
>
>
> This has been an oft requested functionality, and was in principle agreed,
> even
> by Rob:
> <a
> href="http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=6&fromYear=1&toMonth=5&toYear=D&postedBy=&keywords=binary_find">http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=6&fromYear=1&toMonth=5&toYear=D&postedBy=&keywords=binary_find</a>
> <a
> href="http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=6&fromYear=1&toMonth=5&toYear=D&postedBy=&keywords=bfind">http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=6&fromYear=1&toMonth=5&toYear=D&postedBy=&keywords=bfind</a>
>
> Some implementations have been mentioned already (G. Boehme, J. Babor,...).
>
> Now, the devil is in the details. What shall the function do?
> * accept only strictly increasing sequences
> * accept only strictly monotoni sequences
> * accept loosely monotonic sequences (with possible duplicates)
> * only cope with <any selection above>, not checking
>
> I think everyone agrees it should return like db_find_key(), ie n>0 when
> found,
> n<0 with ordered insertion resulting in item lading at position -n.
>
> And another point which wasn't agreed upon: would this go to sort.e or
> sequence.e?
>
> The version I use in sets.e has a from and to extra parameters. They could be
> easily defaulted. Monoony could be passed as a parameter too, depending on the
> option chosen.
>
> CChris
Well, if the Devil is in the details, lets see the details.
Babor, Boehme, and Otto's bfind is short sweet and simple.
global function bfind(object x, sequence s)
-- does a binary search on a sorted sequence
-- returns index location if found, 0 if not
-- assumes that sequence s has been sorted prior to this call
-- Gabriel Boehme's code
-- Earlier Joe Otto's routine is almost identical.
integer lo, hi, mid, c
lo = 1
hi = length(s)
while lo <= hi do
mid = floor((lo + hi) / 2)
c = compare(x, s[mid])
if c < 0 then -- x < s[mid]
hi = mid - 1
elsif c > 0 then -- x > s[mid]
lo = mid + 1
else -- x = s[mid]
return mid
end if
end while
return 0
end function
It sounds as though your binary find may be a bit of
a swiss army knife approach. Please post your bfind
code from sets.e and let us play arround with it--
But if there are many details, there might be more Devils!
Ken Rhodes
Folding at Home: http://folding.stanford.edu/
100% MicroSoft Free
SuSE Linux 10.3
No AdWare, SpyWare, or Viruses!
Life is Good,
3. Re: stdlib: Binary search function
integer found --flag
function findPos(sequence s, sequence ss)
--bynary search in a sorted array of strings
--it returns position of found string or position where it could be iserted
--which one is found we know by the flag
integer lo,hi,mid,c
lo = 1
hi = length(ss)
c = compare(s,ss[1])
if c<0 then
found = 0
return 1
elsif not c then
found = 1
return 1
end if
c = compare(s,ss[hi])
if c>0 then
found = 0
return hi+1
elsif not c then
found = 1
return hi
end if
while 1 do
w = hi - lo
if not w then
found = 0
return hi
elsif w<10 then
lo += 1
while 1 do
c = compare(s,ss[lo])
if not c then
found = 1
return lo
elsif c<0 then
found = 0
return lo
else
lo += 1
end if
end while
else --(hi-lo)>=10
mid = floor((lo + hi) / 2)
c = compare(s,ss[mid])
if c<0 then -- x < ss[mid]
hi = mid
elsif c>0 then -- x > ss[mid]
lo = mid
else -- x = ss[mid]
found = 1
return mid
end if
end if
end while
end function --findPos
4. Re: stdlib: Binary search function
Kenneth Rhodes wrote:
>
> CChris wrote:
> >
> >
> > This has been an oft requested functionality, and was in principle agreed,
> > even
> > by Rob:
> > <a
> > href="http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=6&fromYear=1&toMonth=5&toYear=D&postedBy=&keywords=binary_find">http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=6&fromYear=1&toMonth=5&toYear=D&postedBy=&keywords=binary_find</a>
> > <a
> > href="http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=6&fromYear=1&toMonth=5&toYear=D&postedBy=&keywords=bfind">http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=6&fromYear=1&toMonth=5&toYear=D&postedBy=&keywords=bfind</a>
> >
> > Some implementations have been mentioned already (G. Boehme, J. Babor,...).
> >
> > Now, the devil is in the details. What shall the function do?
> > * accept only strictly increasing sequences
> > * accept only strictly monotoni sequences
> > * accept loosely monotonic sequences (with possible duplicates)
> > * only cope with <any selection above>, not checking
> >
> > I think everyone agrees it should return like db_find_key(), ie n>0 when
> > found,
> > n<0 with ordered insertion resulting in item lading at position -n.
> >
> > And another point which wasn't agreed upon: would this go to sort.e or
> > sequence.e?
> >
> > The version I use in sets.e has a from and to extra parameters. They could
> > be
> > easily defaulted. Monoony could be passed as a parameter too, depending on
> > the
> > option chosen.
> >
> > CChris
>
> Well, if the Devil is in the details, lets see the details.
> Babor, Boehme, and Otto's bfind is short sweet and simple.
>
>
> }}}
<eucode>
> global function bfind(object x, sequence s)
> -- does a binary search on a sorted sequence
> -- returns index location if found, 0 if not
> -- assumes that sequence s has been sorted prior to this call
> -- Gabriel Boehme's code
> -- Earlier Joe Otto's routine is almost identical.
> integer lo, hi, mid, c
> lo = 1
> hi = length(s)
>
> while lo <= hi do
> mid = floor((lo + hi) / 2)
> c = compare(x, s[mid])
> if c < 0 then -- x < s[mid]
> hi = mid - 1
> elsif c > 0 then -- x > s[mid]
> lo = mid + 1
> else -- x = s[mid]
> return mid
> end if
> end while
>
> return 0
> end function
> </eucode>
{{{
>
> It sounds as though your binary find may be a bit of
> a swiss army knife approach. Please post your bfind
> code from sets.e and let us play arround with it--
> But if there are many details, there might be more Devils!
>
>
> Ken Rhodes
> Folding at Home: <a
> href="http://folding.stanford.edu/">http://folding.stanford.edu/</a>
> 100% MicroSoft Free
> SuSE Linux 10.3
> No AdWare, SpyWare, or Viruses!
> Life is Good,
The bfind() I use (can posy it only tonight) only assumes sorted sequences
without duplicates, as that's all it needs, and haas extra parms (lo and hi in
the above code).
Given the discussions referenced above, is this enough?
I forgot to mention: according to whether you expect hits to be more frequent
than misses, or the other way round, you can tweak the algorithm differently for
more overall speed.
In the routine you posted, if the sequence is not sorted, which is not checked,
results are undefined (but there is a result).
CChris
5. Re: stdlib: Binary search function
Kenneth Rhodes wrote:
>
> CChris wrote:
> >
> >
> > This has been an oft requested functionality, and was in principle agreed,
> > even
> > by Rob:
> > <a
> > href="http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=6&fromYear=1&toMonth=5&toYear=D&postedBy=&keywords=binary_find">http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=6&fromYear=1&toMonth=5&toYear=D&postedBy=&keywords=binary_find</a>
> > <a
> > href="http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=6&fromYear=1&toMonth=5&toYear=D&postedBy=&keywords=bfind">http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=6&fromYear=1&toMonth=5&toYear=D&postedBy=&keywords=bfind</a>
> >
> > Some implementations have been mentioned already (G. Boehme, J. Babor,...).
> >
> > Now, the devil is in the details. What shall the function do?
> > * accept only strictly increasing sequences
> > * accept only strictly monotoni sequences
> > * accept loosely monotonic sequences (with possible duplicates)
> > * only cope with <any selection above>, not checking
> >
> > I think everyone agrees it should return like db_find_key(), ie n>0 when
> > found,
> > n<0 with ordered insertion resulting in item lading at position -n.
> >
> > And another point which wasn't agreed upon: would this go to sort.e or
> > sequence.e?
> >
> > The version I use in sets.e has a from and to extra parameters. They could
> > be
> > easily defaulted. Monoony could be passed as a parameter too, depending on
> > the
> > option chosen.
> >
> > CChris
>
> Well, if the Devil is in the details, lets see the details.
> Babor, Boehme, and Otto's bfind is short sweet and simple.
>
>
> }}}
<eucode>
> global function bfind(object x, sequence s)
> -- does a binary search on a sorted sequence
> -- returns index location if found, 0 if not
> -- assumes that sequence s has been sorted prior to this call
> -- Gabriel Boehme's code
> -- Earlier Joe Otto's routine is almost identical.
> integer lo, hi, mid, c
> lo = 1
> hi = length(s)
>
> while lo <= hi do
> mid = floor((lo + hi) / 2)
> c = compare(x, s[mid])
> if c < 0 then -- x < s[mid]
> hi = mid - 1
> elsif c > 0 then -- x > s[mid]
> lo = mid + 1
> else -- x = s[mid]
> return mid
> end if
> end while
>
> return 0
> end function
> </eucode>
{{{
>
> It sounds as though your binary find may be a bit of
> a swiss army knife approach. Please post your bfind
> code from sets.e and let us play arround with it--
> But if there are many details, there might be more Devils!
>
>
> Ken Rhodes
> Folding at Home: <a
> href="http://folding.stanford.edu/">http://folding.stanford.edu/</a>
> 100% MicroSoft Free
> SuSE Linux 10.3
> No AdWare, SpyWare, or Viruses!
> Life is Good,
I realised I forgot about your request:
constant SETS_1 = {-1,1,-2} -- endgame for length 1
function bfind_(object x,sequence s,integer startpoint,integer endpoint)
-- the actul bfind() sanitizes args, then calls this
-- assumes s sorted with no duplicates, length(s)>=startpoint<=endpoint>0. No
checking.
integer mid,c
if length(s)=0 then
return -1
end if
if startpoint<=0 then
startpoint=1
end if
if endpoint>length(s) then
endpoint=length(s)
end if
-- check for outliers, which are worst cases
c=compare(x,s[startpoint])
if c=-1 then
return -1
elsif c=0 then
return 1
end if
c=compare(x,s[endpoint])
if c=1 then
return -endpoint-1
elsif c=0 then
return endpoint
end if
-- main loop
while endpoint-startpoint>1 do
-- handling of duplicates
-- if equal(s[startpoint],s[endpoint]) then
-- return startpoint
-- end if
mid=floor((startpoint+endpoint)/2)
c=compare(x,s[mid])
if c=1 then
startpoint=mid
elsif c=-1 then
endpoint=mid
else
return mid
end if
end while
return -startpoint
end function
One can change the order of the tests if it is thought that there will be more
hits than misses or the other way round.
Results are undefined is s is not increasing. When x has more than one instance
in s, the commented out handling code causes the choice of the exac index to be
undefined amongst the several possible answers. If s is decreasing, and the slope
is knon somehow, then some 1 and -1 may be replaced by ±slope.
CChris