8.15 Searching

8.15.1 Equality

8.15.1.1 compare

<built-in> function compare(object compared, object reference)

Compare two items returning less than, equal or greater than.

Parameters:
  1. compared : the compared object
  2. reference : the reference object
Returns:

An integer,

  • 0 -- if objects are identical
  • 1 -- if compared is greater than reference
  • -1 -- if compared is less than reference
Comments:

Atoms are considered to be less than sequences. Sequences are compared alphabetically starting with the first element until a difference is found or one of the sequences is exhausted. Atoms are compared as ordinary reals.

Example 1:
x = compare({1,2,{3,{4}},5}, {2-1,1+1,{3,{4}},6-1})
-- identical, x is 0
Example 2:
if compare("ABC", "ABCD") < 0 then   -- -1
    -- will be true: ABC is "less" because it is shorter
end if
Example 3:
x = compare('a', "a")
-- x will be -1 because 'a' is an atom
-- while "a" is a sequence
See Also:

equal, relational operators, operations on sequences, sort

8.15.1.2 equal

<built-in> function equal(object left, object right)

Compare two Euphoria objects to see if they are the same.

Parameters:
  1. left : one of the objects to test
  2. right : the other object
Returns:

An integer, 1 if the two objects are identical, else 0.

Comments:

This is equivalent to the expression: compare(left, right) = 0.

This routine, like most other built-in routines, is very fast. It does not have any subroutine call overhead.

Example 1:
if equal(PI, 3.14) then
    puts(1, "give me a better value for PI!\n")
end if
Example 2:
if equal(name, "George") or equal(name, "GEORGE") then
   puts(1, "name is George\n")
end if
See Also:

compare

8.15.2 Finding

8.15.2.1 find

<built-in> function find(object needle, sequence haystack, integer start)

Find the first occurrence of a "needle" as an element of a "haystack", starting from position "start"..

Parameters:
  1. needle : an object whose presence is being queried
  2. haystack : a sequence, which is being looked up for needle
  3. start : an integer, the position at which to start searching. Defaults to 1.
Returns:

An integer, 0 if needle is not on haystack, else the smallest index of an element of haystack that equals needle.

Example 1:
location = find(11, {5, 8, 11, 2, 3})
-- location is set to 3
Example 2:
names = {"fred", "rob", "george", "mary", ""}
location = find("mary", names)
-- location is set to 4
See Also:

find, match, compare

8.15.2.2 find_from

<built-in> function find_from(object needle, object haystack, integer start)
Deprecated:

Deprecated in version 4.0.0

In Euphoria 4.0.0 we have the ability to default parameters to procedures and functions. The built-in find therefore now has a start parameter that is defaulted to the beginning of the sequence. Thus, find can perform the identical functionality provided by find_from. In an undetermined future release of Euphoria, find_from will be removed.

8.15.2.3 find_any

include std/search.e
namespace search
public function find_any(object needles, sequence haystack, integer start = 1)

Find any element from a list inside a sequence. Returns the location of the first hit.

Parameters:
  1. needles : a sequence, the list of items to look for
  2. haystack : a sequence, in which "needles" are looked for
  3. start : an integer, the starting point of the search. Defaults to 1.
Returns:

An integer, the smallest index in haystack of an element of needles, or 0 if no needle is found.

Comments:

This function may be applied to a string sequence or a complex sequence.

Example 1:
location = find_any("aeiou", "John Smith", 3)
-- location is 8
Example 2:
location = find_any("aeiou", "John Doe")
-- location is 2
See Also:

find

8.15.2.4 match_any

include std/search.e
namespace search
public function match_any(sequence needles, sequence haystack, integer start = 1)

Determines if any element from needles is in haystack.

Parameters:
  1. needles : a sequence, the list of items to look for
  2. haystack : a sequence, in which "needles" are looked for
  3. start : an integer, the starting point of the search. Defaults to 1.
Returns:

An integer, 0 if no matches, 1 if any matches.

Comments:

This function may be applied to a string sequence or a complex sequence. An empty needles sequence will always result in 0.

Example 1:
ok = match_any("aeiou", "John Smith")
-- okay is 1
ok = match_any("xyz", "John Smith" )
-- okay is 0
See Also:

find_any

8.15.2.5 find_each

include std/search.e
namespace search
public function find_each(sequence needles, sequence haystack, integer start = 1)

Find all instances of any element from the needle sequence that occur in the haystack sequence. Returns a list of indexes.

Parameters:
  1. needles : a sequence, the list of items to look for
  2. haystack : a sequence, in which "needles" are looked for
  3. start : an integer, the starting point of the search. Defaults to 1.
Returns:

A sequence, the list of indexes into haystack that point to an element that is also in needles.

Comments:

This function may be applied to a string sequence or a complex sequence.

Example 1:
location = find_each("aeiou", "John Smith", 3)
-- location is {8}
Example 2:
location = find_each("aeiou", "John Doe")
-- location is {2,7,8}
See Also:

find, find_any

8.15.2.6 find_all

include std/search.e
namespace search
public function find_all(object needle, sequence haystack, integer start = 1)

Find all occurrences of an object inside a sequence, starting at some specified point.

Parameters:
  1. needle : an object, what to look for
  2. haystack : a sequence to search in
  3. start : an integer, the starting index position (defaults to 1)
Returns:

A sequence, the list of all indexes no less than start of elements of haystack that equal needle. This sequence is empty if no match found.

Example 1:
s = find_all('A', "ABCABAB")
-- s is {1,4,6}
See Also:

find, match, match_all

8.15.2.7 find_all_but

include std/search.e
namespace search
public function find_all_but(object needle, sequence haystack, integer start = 1)

Find all non-occurrences of an object inside a sequence, starting at some specified point.

Parameters:
  1. needle : an object, what to look for
  2. haystack : a sequence to search in
  3. start : an integer, the starting index position (defaults to 1)
Returns:

A sequence, the list of all indexes no less than start of elements of haystack that not equal to needle. This sequence is empty if haystack only consists of needle.

Example 1:
s = find_all_but('A', "ABCABAB")
-- s is {2,3,5,7}
See Also:

find_all, match, match_all

8.15.2.8 NESTED_ANY

include std/search.e
namespace search
public constant NESTED_ANY

8.15.2.9 NESTED_ALL

include std/search.e
namespace search
public constant NESTED_ALL

8.15.2.10 NESTED_INDEX

include std/search.e
namespace search
public constant NESTED_INDEX

8.15.2.11 NESTED_BACKWARD

include std/search.e
namespace search
public constant NESTED_BACKWARD

8.15.2.12 find_nested

include std/search.e
namespace search
public function find_nested(object needle, sequence haystack, integer flags = 0,
        integer rtn_id = types :NO_ROUTINE_ID)

Find any object (among a list) in a sequence of arbitrary shape at arbitrary nesting.

Parameters:
  1. needle : an object, either what to look up, or a list of items to look up
  2. haystack : a sequence, where to look up
  3. flags : options to the function, see Comments section. Defaults to 0.
  4. routine : an integer, the routine_id of an user supplied equal/find function. Defaults to types:NO_ROUTINE_ID.
Returns:

A possibly empty sequence, of results, one for each hit.

Comments:

Each item in the returned sequence is either a sequence of indexes, or a pair {sequence of indexes, index in needle}.

The following flags are available to fine tune the search:
  • NESTED_BACKWARD -- if on flags, search is performed backward. Default is forward.
  • NESTED_ALL -- if on flags, all occurrences are looked for. Default is one hit only.
  • NESTED_ANY -- if present on flags, needle is a list of items to look for. Not the default.
  • NESTED_INDEXES -- if present on flags, an individual result is a pair {position, index in needle}. Default is just return the position.

If s is a single index list, or position, from the returned sequence, then fetch(haystack, s) = needle.

If a routine id is supplied, the routine must behave like equal() if the NESTED_ANY flag is not supplied, and like find() if it is. The routine is being passed the current haystack item and needle. The returned integer is interpreted as if returned by equal() or find().

If the NESTED_ANY flag is specified, and needle is an atom, then the flag is removed.

Example 1:
sequence s = find_nested(3, {5, {4, {3, {2}}}})
-- s is {2 ,2 ,1}
Example 2:
sequence s = find_nested({3, 2}, {1, 3, {2,3}},
                                   NESTED_ANY + NESTED_BACKWARD + NESTED_ALL)
-- s is {{3,2}, {3,1}, {2}}
Example 3:
sequence s = find_nested({3, 2}, {1, 3, {2,3}},
                                    NESTED_ANY + NESTED_INDEXES + NESTED_ALL)
-- s is {{{2}, 1}, {{3, 1}, 2}, {{3, 2}, 1}}
See Also:

find, rfind, find_any, fetch

8.15.2.13 rfind

include std/search.e
namespace search
public function rfind(object needle, sequence haystack, integer start = length(haystack))

Find a needle in a haystack in reverse order.

Parameters:
  1. needle : an object to search for
  2. haystack : a sequence to search in
  3. start : an integer, the starting index position (defaults to length(haystack))
Returns:

An integer, 0 if no instance of needle can be found on haystack before index start, or the highest such index otherwise.

Comments:

If start is less than 1, it will be added once to length(haystack) to designate a position counted backwards. Thus, if start is -1, the first element to be queried in haystack will be haystack[$-1], then haystack[$-2] and so on.

Example 1:
location = rfind(11, {5, 8, 11, 2, 11, 3})
-- location is set to 5
Example 2:
names = {"fred", "rob", "rob", "george", "mary"}
location = rfind("rob", names)
-- location is set to 3
location = rfind("rob", names, -4)
-- location is set to 2
See Also:

find, rmatch

8.15.2.14 find_replace

include std/search.e
namespace search
public function find_replace(object needle, sequence haystack, object replacement,
        integer max = 0)

Finds needle in the haystack, and replaces all or upto max occurrences with replacement.

Parameters:
  1. needle : an object to search and perhaps replace
  2. haystack : a sequence to be inspected
  3. replacement : an object to substitute for any (first) instance of needle
  4. max : an integer, 0 to replace all occurrences
Returns:

A sequence, the modified haystack.

Comments:

Replacements will not be made recursively on the part of haystack that was already changed.

If max is 0 or less, any occurrence of needle in haystack will be replaced by replacement. Otherwise, only the first max occurrences are.

Example 1:
s = find_replace('b', "The batty book was all but in Canada.", 'c', 0)
-- s is "The catty cook was all cut in Canada."
Example 2:
s = find_replace('/', "/euphoria/demo/unix", '\\', 2)
-- s is "\\euphoria\\demo/unix"
Example 3:
s = find_replace("theater", { "the", "theater", "theif" }, "theatre")
-- s is { "the", "theatre", "theif" }
See Also:

find, replace, match_replace

8.15.2.15 match_replace

include std/search.e
namespace search
public function match_replace(object needle, sequence haystack, object replacement,
        integer max = 0)

Finds a "needle" in a "haystack", and replace any, or only the first few, occurrences with a replacement.

Parameters:
  1. needle : an non-empty sequence or atom to search and perhaps replace
  2. haystack : a sequence to be inspected
  3. replacement : an object to substitute for any (first) instance of needle
  4. max : an integer, 0 to replace all occurrences
Returns:

A sequence, the modified haystack.

Comments:

Replacements will not be made recursively on the part of haystack that was already changed.

If max is 0 or less, any occurrence of needle in haystack will be replaced by replacement. Otherwise, only the first max occurrences are.

If either needle or replacement are atoms they will be treated as if you had passed in a length-1 sequence containing the said atom.

If needle is an empty sequence, an error will be raised and your program will exit.

Example 1:
s = match_replace("the", "the cat ate the food under the table", "THE", 0)
-- s is "THE cat ate THE food under THE table"
Example 2:
s = match_replace("the", "the cat ate the food under the table", "THE", 2)
-- s is "THE cat ate THE food under the table"
Example 3:
s = match_replace('/', "/euphoria/demo/unix", '\\', 2)
-- s is "\\euphoria\\demo/unix"
Example 4:
s = match_replace('a', "abracadabra", 'X')
-- s is now "XbrXcXdXbrX"
s = match_replace("ra", "abracadabra", 'X')
-- s is now "abXcadabX"
s = match_replace("a", "abracadabra", "aa")
-- s is now "aabraacaadaabraa"
s = match_replace("a", "abracadabra", "")
-- s is now "brcdbr"
See Also:

find, replace, regex:find_replace, find_replace

8.15.2.16 binary_search

include std/search.e
namespace search
public function binary_search(object needle, sequence haystack, integer start_point = 1,
        integer end_point = 0)

Finds a "needle" in an ordered "haystack". Start and end point can be given for the search.

Parameters:
  1. needle : an object to look for
  2. haystack : a sequence to search in
  3. start_point : an integer, the index at which to start searching. Defaults to 1.
  4. end_point : an integer, the end point of the search. Defaults to 0, ie search to end.
Returns:

An integer, either:

  1. a positive integer i, which means haystack[i] equals needle.
  2. a negative integer, -i, with i between adjusted start and end points. This means that needle is not in the searched slice of haystack, but would be at index i if it were there.
  3. a negative integer -i with i out of the searched range. This means than needlemight be either below the start point if i is below the start point, or above the end point if i is.
Comments:
  • If end_point is not greater than zero, it is added to length(haystack) once only. Then, the end point of the search is adjusted to length(haystack) if out of bounds.
  • The start point is adjusted to 1 if below 1.
  • The way this function returns is very similar to what db_find_key does. They use variants of the same algorithm. The latter is all the more efficient as haystack is long.
  • haystack is assumed to be in ascending order. Results are undefined if it is not.
  • If duplicate copies of needle exist in the range searched on haystack, any of the possible contiguous indexes may be returned.
See Also:

find, db_find_key

8.15.3 Matching

8.15.3.1 match

<built-in> function match(sequence needle, sequence haystack, integer start)

Try to match a "needle" against some slice of a "haystack", starting at position "start".

Parameters:
  1. needle : a sequence whose presence as a "substring" is being queried
  2. haystack : a sequence, which is being looked up for needle as a sub-sequence
  3. start : an integer, the point from which matching is attempted. Defaults to 1.
Returns:

An integer, 0 if no slice of haystack is needle, else the smallest index at which such a slice starts.

Comments:

If needle is an empty sequence, an error is raised and your program will exit.

Example 1:
location = match("pho", "Euphoria")
-- location is set to 3
See Also:

find, compare, wildcard:is_match

8.15.3.2 match_from

<built-in> function match_from(sequence needle, sequence haystack, integer start)
Deprecated:

Deprecated in version 4.0.0

In Euphoria 4.0.0 we have the ability to default parameters to procedures and functions. The built-in match therefore now has a start parameter that is defaulted to the beginning of the sequence. Thus, match can perform the identical functionality provided by match_from. In an undetermined future release of Euphoria, match_from will be removed.

Comments:

If needle is an empty sequence, an error is raised and your program will exit.

8.15.3.3 match_all

include std/search.e
namespace search
public function match_all(sequence needle, sequence haystack, integer start = 1)

Match all items of haystack in needle.

Parameters:
  1. needle : a non-empty sequence, what to look for
  2. haystack : a sequence to search in
  3. start : an integer, the starting index position (defaults to 1)
Returns:

A sequence, of integers, the list of all lower indexes, not less than start, of all slices in haystack that equal needle. The list may be empty.

Comments:

If needle is an empty sequence, an error will be raised and your program will exit.

Example 1:
s = match_all("the", "the dog chased the cat under the table.")
-- s is {1,16,30}
See Also:

match, regex:find_all find, find_all

8.15.3.4 rmatch

include std/search.e
namespace search
public function rmatch(sequence needle, sequence haystack, integer start = length(haystack))

Try to match a needle against some slice of a haystack in reverse order.

Parameters:
  1. needle : a sequence to search for
  2. haystack : a sequence to search in
  3. start : an integer, the starting index position (defaults to length(haystack))
Returns:

An integer, either 0 if no slice of haystack starting before start equals needle, else the highest lower index of such a slice.

Comments:

If start is less than 1, it will be added once to length(haystack) to designate a position counted backwards. Thus, if start is -1, the first element to be queried in haystack will be haystack[$-1], then haystack[$-2] and so on.

If a needle is an empty sequence this will return 0.

Example 1:
location = rmatch("the", "the dog ate the steak from the table.")
-- location is set to 28 (3rd 'the')
location = rmatch("the", "the dog ate the steak from the table.", -11)
-- location is set to 13 (2nd 'the')
See Also:

rfind, match

8.15.3.5 begins

include std/search.e
namespace search
public function begins(object sub_text, sequence full_text)

Test whether a sequence is the head of another one.

Parameters:
  1. sub_text : an object to be looked for
  2. full_text : a sequence, the head of which is being inspected.
Returns:

An integer, 1 if sub_text begins full_text, else 0.

Comments:

If sub_text is an empty sequence, this returns 1 unless full_text is also an empty sequence. When they are both empty sequences this returns 0.

Example 1:
s = begins("abc", "abcdef")
-- s is 1
s = begins("bcd", "abcdef")
-- s is 0
See Also:

ends, head

8.15.3.6 ends

include std/search.e
namespace search
public function ends(object sub_text, sequence full_text)

Test whether a sequence ends another one.

Parameters:
  1. sub_text : an object to be looked for
  2. full_text : a sequence, the tail of which is being inspected.
Returns:

An integer, 1 if sub_text ends full_text, else 0.

- Comments: If sub_text is an empty sequence, this returns 1 unless full_text is also an empty sequence. When they are both empty sequences this returns 0.

Example 1:
s = ends("def", "abcdef")
-- s is 1
s = begins("bcd", "abcdef")
-- s is 0
See Also:

begins, tail

8.15.3.7 is_in_range

include std/search.e
namespace search
public function is_in_range(object item, sequence range_limits, sequence boundries = "[]")

Tests to see if the item is in a range of values supplied by range_limits

Parameters:
  1. item : The object to test for.
  2. range_limits : A sequence of two or more elements. The first is assumed to be the smallest value and the last is assumed to be the highest value.
  3. boundries: a sequence. This determines if the range limits are inclusive or not. Must be one of "[]" (the default), "[)", "(]", or "()".
Returns:

An integer, 0 if item is not in the range_limits otherwise it returns 1.

Comments:
  • In boundries, square brackets mean inclusive and round brackets mean exclusive. Thus "[]" includes both limits in the range, while "()" excludes both limits. And, "[)" includes the lower limit and excludes the upper limits while "(]" does the reverse.
Example 1:
if is_in_range(2, {2, 75}) then
    procA(user_data) -- Gets run (both limits included)
end if
if is_in_range(2, {2, 75}, "(]") then
    procA(user_data) -- Does not get run
end if

8.15.3.8 is_in_list

include std/search.e
namespace search
public function is_in_list(object item, sequence list)

Tests to see if the item is in a list of values supplied by list

Parameters:
  1. item : The object to test for.
  2. list : A sequence of elements that item could be a member of.
Returns:

An integer, 0 if item is not in the list, otherwise it returns 1.

Example 1:
if is_in_list(user_data, {100, 45, 2, 75, 121}) then
    procA(user_data)
end if

8.15.3.9 lookup

include std/search.e
namespace search
public function lookup(object find_item, sequence source_list, sequence target_list,
        object def_value = 0)

If the supplied item is in the source list, this returns the corresponding element from the target list.

Parameters:
  1. find_item: an object that might exist in source_list.
  2. source_list: a sequence that might contain pITem.
  3. target_list: a sequence from which the corresponding item will be returned.
  4. def_value: an object (defaults to zero). This is returned when find_item is not in source_list and target_list is not longer than source_list.
Returns:

an object

  • If find_item is found in source_list then this is the corresponding element from target_list
  • If find_item is not in source_list then if target_list is longer than source_list then the last item in target_list is returned otherwise def_value is returned.
Examples:
lookup('a', "cat", "dog") --> 'o'
lookup('d', "cat", "dogx") --> 'x'
lookup('d', "cat", "dog") --> 0
lookup('d', "cat", "dog", -1) --> -1
lookup("ant", {"ant","bear","cat"}, {"spider","seal","dog","unknown"}) 
            --> "spider"
lookup("dog", {"ant","bear","cat"}, {"spider","seal","dog","unknown"})     
            --> "unknown"

8.15.3.10 vlookup

include std/search.e
namespace search
public function vlookup(object find_item, sequence grid_data, integer source_col,
        integer target_col, object def_value = 0)

If the supplied item is in a source grid column, this returns the corresponding element from the target column.

Parameters:
  1. find_item: an object that might exist in source_col.
  2. grid_data: a 2D grid sequence that might contain pITem.
  3. source_col: an integer. The column number to look for find_item.
  4. target_col: an integer. The column number from which the corresponding item will be returned.
  5. def_value: an object (defaults to zero). This is returned when find_item is not found in the source_col column, or if found but the target column does not exist.
Comments:
  • If a row in the grid is actually a single atom, the row is ignored.
  • If a row's length is less than the source_col, the row is ignored.
Returns:

an object

  • If find_item is found in the source_col column then this is the corresponding element from the target_col column.
Examples:
sequence grid
grid = {
       {"ant", "spider", "mortein"},
       {"bear", "seal", "gun"},
       {"cat", "dog", "ranger"},
       $
 }
vlookup("ant", grid, 1, 2, "?") --> "spider"
vlookup("ant", grid, 1, 3, "?") --> "mortein"
vlookup("seal", grid, 2, 3, "?") --> "gun"
vlookup("seal", grid, 2, 1, "?") --> "bear"
vlookup("mouse", grid, 2, 3, "?") --> "?"