1. RE: match() in depth!

What's faster?

find("is",test)
or match({"is"},test)

Chris

Derek Parnell wrote:
> Hi Kat,
> I'm not quite sure what you are driving at either.
> 
> When you run this code of yours, what do you get displayed?
> 
> When I run it, it only reports that 'is' is found. And this is exactly 
> what I would have expected.
> 
> My assumption is that 'parse()' converts :
> 
>  "this is a test of match()"
> 
> to:
> 
>  {"this","is","a","test","of","match()"}
> 
> In which case the match({"is"},test) call will return 2, because "is" is 
> the second element of the test sequence.
> 
> In which case the match({"s"},test) call will return 0, because "s" is 
> not any of the elements in the test sequence.
> 
> What did you find surprising with your test?
> ------
> Derek.
> 
> 
> 25/02/2002 1:43:42 PM, Kat <gertie at PELL.NET> wrote:
> 
> >
> >On 24 Feb 2002, at 22:52, rforno at tutopia.com wrote:
> >
> >> 
> >> OK. But what is the strange thing about it? I don't understand.
> >
> >The {bracketed} and "quoted" parts in the examples below. 
> >
> >Kat
> >
> > 
> >> ----- Original Message ----- 
> >> From: "Kat" <gertie at PELL.NET> 
> >> To: "EUforum" <EUforum at topica.com>
> >> Sent: Sunday, February 24, 2002 12:49 AM
> >> Subject: match() in depth!
> >> 
> >> 
> >> > I didn't know we could do this!
> >> > 
> >> > include strtok.e
> >> > with trace
> >> > sequence test
> >> > 
> >> > test = "this is a test of match()"
> >> > test = parse(test,32)
> >> > 
> >> > 
> >> > if match({"is"},test) then 
> >> >  puts(1,"found is\n") 
> >> > end if
> >> > 
> >> > if match({"s"},test) then 
> >> >  puts(1,"found s\n") 
> >> > end if
> >> > 
> >> > trace(1)
> >> > abort(0)
> >> > 
> >> > 
> ---------
> Cheers,
> Derek Parnell 
> 
>


2. Re: RE: match() in depth!

Chris,
it probably doesn't matter too much which is faster, as I imagine that what Kat
is trying to do is find out which *word* is the "is" word, rather than what its
character position is in the original string.

However, I find that find() is about 10% faster than match().
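
A rough way to check a claim like this on your own machine is a simple timing loop. The sketch below is only an illustration, not the benchmark Derek actually ran: the test sequence is written out by hand (it is assumed to match what parse() from strtok.e returns), and the iteration count is arbitrary.

```euphoria
-- Timing sketch: find() vs match() on an already-parsed line.
-- The sequence below is typed in by hand; it is assumed to be what
-- parse("this is a test of match()", 32) from strtok.e would produce.
constant ITERATIONS = 1000000   -- arbitrary; raise it if the times are too small
sequence test
integer pos
atom start

test = {"this", "is", "a", "test", "of", "match()"}

-- time find(), which compares "is" against each element
start = time()
for i = 1 to ITERATIONS do
    pos = find("is", test)
end for
printf(1, "find():  %.2f seconds\n", time() - start)

-- time match(), which compares {"is"} against each one-element slice
start = time()
for i = 1 to ITERATIONS do
    pos = match({"is"}, test)
end for
printf(1, "match(): %.2f seconds\n", time() - start)
```

The absolute numbers will depend on the interpreter version and the machine, so only the ratio between the two lines is worth comparing.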

25/02/2002 2:48:26 PM, bensler at mail.com wrote:

>
>What's faster?
>
>find("is",test)
>or match({"is"},test)
>
>Chris
---------
Cheers,
Derek Parnell


3. Re: RE: match() in depth!

On 25 Feb 2002, at 15:16, Derek Parnell wrote:

> 
> Chris,
> it probably doesn't matter too much which is faster as I imagine that what Kat
> is trying to do is find out which *word* is the "is" word, rather than what
> its character position is in the original string.

What I was trying to do is find a faster match than the find() was doing in 
find_all() in strtok.e. I was surprised that match() accepted the {""} in place
of a "" or a {}. I was hoping there was a way to nest the {} or "" to specify a 
nesting level to search in match(). For instance (untested code follows):

test = {{"one","two"},{"three","four"}}
match({"three"},test) = 2
match("three",test) = 0

But it doesn't work like that.

Kat

> However, I find that the find() is about 10% faster than match().
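
Since match() does not descend into nested sequences, the search Kat describes above needs an explicit loop. The following is a minimal sketch under stated assumptions: find_nested() is a hypothetical helper name (it is not part of strtok.e or the standard library), and every element of the outer sequence is assumed to be a sequence of words.

```euphoria
-- Hypothetical helper (not part of strtok.e): search one nesting level down.
-- Assumes every element of groups is itself a sequence of words.
-- Returns {group, position} for the first hit, or {} if the word is absent.
function find_nested(sequence word, sequence groups)
    integer pos
    for i = 1 to length(groups) do
        pos = find(word, groups[i])
        if pos then
            return {i, pos}
        end if
    end for
    return {}
end function

sequence test
test = {{"one", "two"}, {"three", "four"}}

? find_nested("three", test)   -- {2,1}
? find_nested("five", test)    -- {}
? match({"three"}, test)       -- 0: no top-level element equals "three"
```

Returning the pair {group, position} keeps both indexes available, so the word can be read back as test[2][1].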





4. RE: match() in depth!

> > If we use some symbolic representation of the test sequence above it 
> > could be
> > coded as :
> > 
> >  {TIAtOM} 
> > 
> > where I've simply replaced each word with a letter representing that 
> > word.
> > 
> > Thus the find for "is" similarly coded is match({I}, test). This will 
> > return 2.


{TIAtOM} = parse("this is a test of match()",32)
^ this should really be {T,I,A,t,O,M}

I = "is"
I[1] = 'i'

> > The scan for the "s" word is coded match({S},test) and returns 0 because 
> > there
> > is no "s" word in the test sequence.
> 

This is a bit confusing. match({S},test) will not return 0, it will fail 
with "variable not defined", because there is no S variable.

> You lost me there. You said match({I}, test), which I understand, but that is 
> {I}, same as "I", which meets the specs for match() in the help files. I used 
> {"I"}.

{I} != "I"
{I} = {"is"}
{"I"} = {{'I'}} -- (confused again? :P)
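
The three comparisons above can be checked directly with equal(). This is a small sketch, assuming the variable I holds the word "is" as in the symbolic notation being discussed:

```euphoria
-- I is a variable holding a word; "I" is a one-character string literal.
sequence I
I = "is"

? equal("I", {'I'})        -- 1: a one-character string is a one-element sequence
? equal({"I"}, {{'I'}})    -- 1: so {"I"} is a sequence holding that sequence
? equal({I}, {"is"})       -- 1: {I} wraps the value of the variable I
? equal({I}, "I")          -- 0: {"is"} is not the string "I"
```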


> > May I recommend trying to locate the "is" word in a standard Euphoria 
> > sequence string by doing :
> > 
> >    find (" is ", test) -- Note the spaces around the word.
> > 
> > This might save you parsing it too often.
> 
> I know that works too, same thing about the {" "} though. But about parsing 
> it too often, I parse the lines once, then each item is a slice, and I deparse 
> before writing each out again. There is at least one change on each of 
> 200,000+ lines, and some have dozens of changes.

The reason I asked if find() was faster than match() is because that is 
what I thought, but you confused me with match({"is"},test) :)

find(x,s) will compare x against any ELEMENT of s
match(x,s) will compare x against any SLICE of s

EXAMPLE:
  find("the","look for the matching element")
returns 0, because there is no element that matches "the"

  find("the",parse("look for the matching element",32))
returns 3

  match("the","look for the matching slice")
returns 10
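
The element-versus-slice distinction above can be tried without strtok.e. In this sketch the word list is typed in by hand and is assumed to be what parse() with a space delimiter would return:

```euphoria
sequence line, words
line  = "look for the matching slice"
-- assumed to be what parse(line, 32) from strtok.e would return
words = {"look", "for", "the", "matching", "slice"}

? find("the", line)      -- 0: no single character of line equals "the"
? find('t', line)        -- 10: the elements of a string are characters
? match("the", line)     -- 10: the slice line[10..12] is "the"
? find("the", words)     -- 3: the third element is "the"
? match({"the"}, words)  -- 3: the slice words[3..3] is {"the"}
? match("the", words)    -- 0: no three consecutive elements are 't','h','e'
```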


If you're parsing every line, find() should be quite a bit faster.
  find("is",parse("this is a test",32))

If you only need to find one word in the string, match() without parsing 
is faster.
  -- you'll need the concatenated spaces so you can compare the first and last slices
  match(" is "," " & "this is a test" & " ")
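
The padding trick can be wrapped in a small function. This is only a sketch: word_pos() is a hypothetical name, and it assumes words are separated by single spaces.

```euphoria
-- Hypothetical helper: find a whole word in an unparsed line.
-- Padding both strings with spaces lets the first and last words match too.
function word_pos(sequence word, sequence line)
    return match(" " & word & " ", " " & line & " ")
end function

? word_pos("is", "this is a test")    -- 6
? word_pos("this", "this is a test")  -- 1
? word_pos("test", "this is a test")  -- 11
? word_pos("s", "this is a test")     -- 0: "s" only occurs inside other words
```

Because the match lands on the space in front of the word and the padded line is shifted right by one character, the result is also the index of the word's first character in the original, unpadded line.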


Chris
