1. match() in depth!

I didn't know we could do this!

include strtok.e
with trace
sequence test

test = "this is a test of match()"
test = parse(test,32)


if match({"is"},test) then 
 puts(1,"found is\n") 
end if

if match({"s"},test) then 
 puts(1,"found s\n") 
end if

trace(1)
abort(0)


2. Re: match() in depth!

OK. But what is the strange thing about it? I don't understand.

----- Original Message ----- 
From: "Kat" <gertie at PELL.NET> 
To: "EUforum" <EUforum at topica.com>
Subject: match() in depth!


> <snip>


3. Re: match() in depth!

On 24 Feb 2002, at 22:52, rforno at tutopia.com wrote:

> 
> OK. But what is the strange thing about it? I don't understand.

The {bracketed} and "quoted" parts in the examples below. 

Kat

 
> ----- Original Message ----- 
> From: "Kat" <gertie at PELL.NET> 
> To: "EUforum" <EUforum at topica.com>
> Sent: Sunday, February 24, 2002 12:49 AM
> Subject: match() in depth!
> 
> 
> > I didn't know we could do this!
> > 
> > include strtok.e
> > with trace
> > sequence test
> > 
> > test = "this is a test of match()"
> > test = parse(test,32)
> > 
> > 
> > if match({"is"},test) then 
> >  puts(1,"found is\n") 
> > end if
> > 
> > if match({"s"},test) then 
> >  puts(1,"found s\n") 
> > end if
> > 
> > trace(1)
> > abort(0)
> > 


4. Re: match() in depth!

Hi Kat,
I'm not quite sure what you are driving at either.

When you run this code of yours, what do you get displayed?

When I run it, it only reports that 'is' is found. And this is exactly what I
would have expected.

My assumption is that parse() converts:

 "this is a test of match()"

to:

 {"this","is","a","test","of","match()"}

In that case, match({"is"},test) returns 2, because "is" is the second
element of the test sequence, and match({"s"},test) returns 0, because "s"
is not one of the elements of the test sequence.

What did you find surprising with your test?
------
Derek.
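
A minimal sketch of the behaviour described above, assuming parse() from
strtok.e produces exactly the token list written out here. strtok.e is not
included, so the list is typed in by hand. The last two calls are added only
for contrast and are not from the original test.

```euphoria
-- Assumption: this is what parse("this is a test of match()", 32) yields.
sequence test
test = {"this", "is", "a", "test", "of", "match()"}

-- {"is"} is a one-element sequence whose single element is the string "is",
-- so match() looks for a one-element slice of test equal to it.
printf(1, "%d\n", match({"is"}, test))  -- 2
printf(1, "%d\n", match({"s"}, test))   -- 0

-- "is" on its own is the two-element sequence {'i','s'}; no two adjacent
-- elements of test are the atoms 'i' and 's'.
printf(1, "%d\n", match("is", test))    -- 0

-- find() compares against whole elements, so no extra braces are needed.
printf(1, "%d\n", find("is", test))     -- 2
```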


25/02/2002 1:43:42 PM, Kat <gertie at PELL.NET> wrote:

> <snip>
---------
Cheers,
Derek Parnell


5. Re: match() in depth!

On 25 Feb 2002, at 14:01, Derek Parnell wrote:

> 
> Hi Kat,
> I'm not quite sure what you are driving at either.
> 
> When you run this code of yours, what do you get displayed?
> 
> When I run it, it only reports that 'is' is found. And this is exactly what I
> would have expected.
> 
> My assumption is that 'parse()' converts :
> 
>  "this is a test of match()"
> 
> to:
> 
>  {"this","is","a","test","of","match()"}

Correct.
 
> In which case the match({"is"},test) call will return 2. Because "is" is the
> second element of the test sequence.

> In which case the match({"s"},test) call will return 0. Because "s" is not any
> of the elements in the test sequence.
> 
> What did you find surprising with your test?

I had never seen match({"is"},test) in any code I have read in the
archives. I have seen match("is",test). Frankly, the use of {} occasionally
confuses me, and putting the "is" inside the { } told me it was a nested
sequence, so it should have returned nothing, zero, or an error, because the
help files say all the parms for match() are to be sequences, and a nested
seq as the first parm barely makes sense. But I was looking for a way to get
some code running faster, and tried it.

Kat


<snip>


6. Re: match() in depth!

Hi Kat,

The result of parse(), however, is a nested sequence.

IOW this ==> {"this","is","a","test","of","match()"} is a sequence that
contains six other sequences (aka strings). And match() is actually looking
for matching slices. When we are dealing with sequences that look like
strings, we can easily think of match() as looking for substrings. But in
Euphoria parlance, a substring is just a slice of a sequence.

If we use some symbolic representation of the test sequence above, it could
be coded as:

 {TIAtOM}

where I've simply replaced each word with a letter representing that word.

Thus the search for "is", similarly coded, is match({I}, test). This will
return 2. The scan for the "s" word is coded match({S},test) and returns 0
because there is no "s" word in the test sequence.

May I recommend locating the "is" word in a standard, unparsed Euphoria
string by doing:

 match(" is ", test) -- Note the spaces around the word. This might save you
                     -- parsing it too often.
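
A small sketch of why the surrounding spaces matter when searching the
unparsed string. It assumes the string from the original post; the variable
name line is only illustrative. Note that match() is the routine that looks
for a substring, while find() compares against single elements, so find()
returns 0 here.

```euphoria
sequence line
line = "this is a test of match()"

-- Without the spaces, the first hit is the "is" inside "this".
printf(1, "%d\n", match("is", line))    -- 3

-- With the spaces, only the stand-alone word matches (at the leading space).
printf(1, "%d\n", match(" is ", line))  -- 5

-- find() looks for an element equal to " is "; every element of line is a
-- single character, so nothing is found.
printf(1, "%d\n", find(" is ", line))   -- 0
```

One caveat of this approach: a word at the very start or end of the line has
no space on one side, so it would be missed unless the line is padded first.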



25/02/2002 3:23:51 PM, Kat <gertie at PELL.NET> wrote:

> <snip>
---------
Cheers,
Derek Parnell


7. Re: match() in depth!

On 25 Feb 2002, at 15:58, Derek Parnell wrote:

> 
> Hi Kat,
> 
> however the result of parse() is a nested sequence.

Correct.

> IOW this ==> {"this","is","a","test","of","match()"} is a sequence that
> contains six other sequences (aka strings). And match is actually looking
> for matching slices. When we are dealing with sequences that look like
> strings, we can easily think of match as looking for substrings. But in
> Euphoria parlance, a substring is just a slice of a sequence.

Correct, which is the whole point of parsing: it makes each word (token) a
single slice, so word number 2 in the sentence is the same as sentence[2].

> If we use some symbolic representation of the test sequence above it could be
> coded as :
> 
>  {TIAtOM} 
> 
> where I've simply replaced each word with a letter representing that word.
>
> Thus the find for "is" similarly coded is match({I}, test). This will return
> 2. The scan for the "s" word is coded match({S},test) and returns 0 because
> there is no "s" word in the test sequence.

You lost me there. You said match({I}, test), which I understand, but that is
{I}, same as "I", which meets the specs for match() in the help files. I used
{"I"}.

> May I recommend trying to locate "is" word in a standard Euphoria sequence
> string by doing :
>
>    match(" is ", test) -- Note the spaces around the word. This might save
>                        -- you parsing it too often.

I know that works too, same thing about the {" "} though. But about parsing
it too often: I parse the lines once, then each item is a slice, and I
deparse before writing each out again. There is at least one change on each
of 200,000+ lines, and some have dozens of changes.

Kat
 
<snip>


8. Re: match() in depth!

----- Original Message -----
From: <bensler at mail.com>
To: "EUforum" <EUforum at topica.com>
Subject: RE: match() in depth!


>
> > > If we use some symbolic representation of the test sequence above it
> > > could be
> > > coded as :
> > >
> > >  {TIAtOM}
> > >
> > > where I've simply replaced each word with a letter representing that
> > > word.
> > >
> > > Thus the find for "is" similarly coded is match({I}, test). This will
> > > return 2.
>
>
> {TIAtOM} = parse("this is a test of match()",32)
> ^ this should really be {T,I,A,T,O,M}
>

Not really. I was using SYMBOLS, not Euphoria!  I was trying to describe a
list of unique words, represented by single letters.

  T === this
  I === is
  A === a
  t === test
  O === of
  M === match()

The string of letters TIAtOM is NOT a Euphoria variable - it is just a list
of symbols.

I represented this list of words as concatenated symbols enclosed in braces.
Sorry for the confusion.

In the first "match" we were looking for the word "is", which I had
represented by the symbol I. Thus I wrote:

    match({I}, test) would return 2 because I is the second symbol in the
    list TIAtOM

The second match was looking for the word "s", which I decided to represent
with the symbol S.

This is not a Euphoria variable. It is just a symbol representing the word
"s".

Thus match({S}, test) would return 0 because S is not one of the symbols in
the list TIAtOM.

Now if we rewrite this, substituting the words for the symbols...

  match({"is"}, {"this","is","a","test","of","match()"}) ==> 2

  match({"s"}, {"this","is","a","test","of","match()"}) ==> 0

> I = "is"
> I[1] = 'i'
>
> > > The scan for the "s" word is coded match({S},test) and returns 0
> > > because there is no "s" word in the test sequence.
> >
>
> This is a bit confusing. match({S},test) will not return 0, it will fail
> with "variable not defined", because there is no S variable.
>
> > You lost me there. You said match({I}, test), which i understand, but
> > that is
> > {I}, same as "I", which meets the specs for match() in the help files. I
> > used
> > {"I"}.
>
> {I} != "I"
> {I} = {"is"}
> {"I"} = {{'I'}} -- (confused again? :P)
>
>
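
The comparisons in the quoted lines can be checked directly. A small sketch,
assuming I is an ordinary Euphoria variable holding "is", as in the quoted
reply:

```euphoria
sequence I
I = "is"

? length(I)              -- 2: the atoms 'i' and 's'
? length({I})            -- 1: one element, the string "is"
? equal({I}, {"is"})     -- 1
? equal({I}, "I")        -- 0: "I" is {'I'}, a sequence holding one atom
? equal({"I"}, {{'I'}})  -- 1: a string inside braces is a nested sequence
```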

It seems that to get a faster match() or find(), we would need to specify a
starting point rather than the hardcoded starting index of 1.

  findfrom(10, "ai", "the rain in spain falls mainly in the plains") ==> 15

To do this sort of thing in current Euphoria, we have to get it to create
temporary slices, which is really just an unnecessary overhead.
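
A rough sketch of the slice-based workaround described above. findfrom() is
the hypothetical routine named in this post, not a built-in; its argument
order follows the example call, and the bounds check is an added assumption.

```euphoria
-- Hypothetical helper: search for needle in haystack, starting at index start.
function findfrom(integer start, sequence needle, sequence haystack)
    integer pos
    if start < 1 or start > length(haystack) then
        return 0
    end if
    -- This slice is the temporary sequence that a built-in starting index
    -- would make unnecessary.
    pos = match(needle, haystack[start..length(haystack)])
    if pos then
        -- Convert the position within the slice back to a haystack index.
        return pos + start - 1
    end if
    return 0
end function

? findfrom(10, "ai", "the rain in spain falls mainly in the plains")  -- 15
```

The first "ai" (in "rain", at index 6) is skipped because the search starts
at index 10, so the hit in "spain" at index 15 is reported.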

--------
Derek.

