1. Looking for the best "Parse"...

I want to be able to let the user specify a string of text with 
variables in it that my program will replace.  The goal is to put the 
current winamp song into a line of text.  I wrote this function, which 
works, but after reading the "Best Join" discussion, I realized there 
might be a better way of doing it.  So I am looking for any faster or 
simpler methods, if there are any.  I'll gladly accept any comments on 
my coding style, too.  I want to improve it.

[begin code]
include wildcard.e

global function parseUserText(sequence unParsedText, sequence escape, 
sequence replace )
sequence ParsedText
integer varPos
varPos = match(lower(escape), lower(unParsedText))
if not varPos then
   return unParsedText
elsif varPos = (length(unParsedText) - length(escape) + 1 ) then
         ParsedText = unParsedText[1..(length(unParsedText) - 
length(escape))] & replace
         return parseUserText(ParsedText, escape, replace)
else
   if varPos = 1 then
         ParsedText = replace & unParsedText[(length(escape) + 
1)..length(unParsedText)]
         return parseUserText(ParsedText, escape, replace)
   else
       ParsedText = unParsedText[1..(varPos - 1)] & replace & 
unParsedText[(varPos + (length(escape)))..length(unParsedText)]
       return parseUserText(ParsedText, escape, replace)
   end if
end if
end function
--ex. text = parseUserText("The $A jumped over the lazy 
dog.","$A","quick brown fox")

puts (1, parseUserText("The $A jumped over the lazy dog.","$A","quick 
brown fox"))
[end code]

new topic     » topic index » view message » categorize

2. Re: Looking for the best "Parse"...


new topic     » goto parent     » topic index » view message » categorize

3. Re: Looking for the best "Parse"...

Hi, Cassidy,

the function below is probably faster, because it avoids both
recursion and repeated conversions of the same text to lower case. But
you can gain more by simply replacing Rob's elegant, but inefficient
lower() function in wildcard.e by something like this:


constant DIFF = 'a' - 'A'

global function lower(object x)
    integer c
    if atom(x) then
        if x >= 'A' then
            if x <= 'Z' then
                return x + DIFF
            end if
        end if
    else
        for i = 1 to length(x) do
            c = x[i]
            if c >= 'A' then
                if c <= 'Z' then
                    x[i] += DIFF
                end if
            end if
        end for
    end if
    return x
end function

(Note: I did not use 'short-circuit' evaluation, simply because it is
much slower than its equivalent. I hope Rob can fix that soon.)


function parse(sequence text, sequence esc, sequence replace)
    sequence e, r, s
    integer m, n, i, j, p

    e = lower(esc)
    n = length(esc)
    s = lower(text)
    i = 0
    r = {}
    j = match(e, s)
    while j do
        r = (i + j) & r
        i += j + n - 1
        s = s[j+n..length(s)]
        j = match(e, s)
    end while
    m = length(r)
    if m then
        for k = 1 to m  do
            p = r[k]
            text = text[1..p-1] & replace & text[p+n..length(text)]
        end for
    end if
    return text
end function

--jiri

----- Original Message -----
From: "Cassidy Napoli" <gonzotek at yahoo.com>
To: "EUforum" <EUforum at topica.com>
Sent: Friday, 26 October 2001 03:39
Subject: Looking for the best "Parse"...


>
> I want to be able to let the user specify a string of text with
> variables in it that my program will replace.  The goal is to put
the
> current winamp song into a line of text.  I wrote this function,
which
> works, but after reading the "Best Join" discussion, I realized
there
> might be a better way of doing it.  So I am looking for any faster
or
> simpler methods, if there are any.  I'll gladly accept any comments
on
> my coding style, too.  I want to improve it.
>
> [begin code]
> include wildcard.e
>
> global function parseUserText(sequence unParsedText, sequence
escape,
> sequence replace )
> sequence ParsedText
> integer varPos
> varPos = match(lower(escape), lower(unParsedText))
> if not varPos then
>    return unParsedText
> elsif varPos = (length(unParsedText) - length(escape) + 1 ) then
>          ParsedText = unParsedText[1..(length(unParsedText) -
> length(escape))] & replace
>          return parseUserText(ParsedText, escape, replace)
> else
>    if varPos = 1 then
>          ParsedText = replace & unParsedText[(length(escape) +
> 1)..length(unParsedText)]
>          return parseUserText(ParsedText, escape, replace)
>    else
>        ParsedText = unParsedText[1..(varPos - 1)] & replace &
> unParsedText[(varPos + (length(escape)))..length(unParsedText)]
>        return parseUserText(ParsedText, escape, replace)
>    end if
> end if
> end function
> --ex. text = parseUserText("The $A jumped over the lazy
> dog.","$A","quick brown fox")
>
> puts (1, parseUserText("The $A jumped over the lazy
dog.","$A","quick
> brown fox"))
> [end code]
>
>
>

new topic     » goto parent     » topic index » view message » categorize

4. Re: Looking for the best "Parse"...


new topic     » goto parent     » topic index » view message » categorize

5. Re: Looking for the best "Parse"...

Hello Cassidy Napoli,

Just once more iteration with your program 
and your very interesting *recursive* idea,
which you can make with the 
*delete doubl* method to get a simple 
and understandable code.

Try please:

--[begin code]
include wildcard.e

global function parseUserText
(sequence u, --unParsedText -- u
 sequence e, --escape       -- e
 sequence r) --replace      -- r
-- 
 sequence p  --ParsedText   -- p
  integer v  --varPos       -- v
  integer m -- le
  integer n -- lu
  integer k
--  
	  v = match(lower(e), lower(u)) -- add explanations in your
   if not v then return u end if        -- native language and in English
--
	  m = length(e) -- place for explanation
	  n = length(u) -- place for explanation
	  k = n - m -- place for explanation
       if v = 1 -- place for explanation
     then p = r & u[m+1..n] -- place for explanation
    elsif v = 1 + k -- place for explanation
     then p = u[1..k] & r -- place for explanation
     else p = u[1..v-1] & r & u[v+m..n] -- place for explanation
   end if -- place for explanation
  return parseUserText(p,e,r) -- place for explanation
end function

sequence text

text = parseUserText("The $A jumped over the lazy dog.","$A","quick brown
fox")

puts (1, text)

--[end code]

The speed is another question, it requires
the bench-testings on monotasking OS,
in plain Dos-32 mode, for example.

Regards,
Igor Kachan
kinz at peterlink.ru

new topic     » goto parent     » topic index » view message » categorize

6. Re: Looking for the best "Parse"...


new topic     » goto parent     » topic index » view message » categorize

7. Re: Looking for the best "Parse"...

Ok Cassidy,

I think, my part of the work is near
the finish, so please see your
function in the usual compact form:


-- program parse.ex
 include wildcard.e
 
 global function parseUserText(sequence u, sequence e, sequence r)
  sequence p
   integer v,m,n,k
 	  v = match(lower(e), lower(u))
    if not v then return u end if
 	  m = length(e)
 	  n = length(u)
 	  k = n - m
     if v = 1 then 
              p = r & u[m+1..n]
     elsif v = 1 + k then 
              p = u[1..k] & r
     else 

              p = u[1..v-1] & r & u[v+m..n]
     end if
   return parseUserText(p,e,r)
 end function
 
 sequence text
 
 text = parseUserText("The $A jumped over the lazy dog.","$A","quick brown
 fox")
 
 puts (1, text)
--end of program

When I was testing the last variant, there was
misprint in the code, and recursive construction
run out of memory on my machine without
other messages about bug.

So, code with recursions are very nice and
compact, but it is some more dangerous
for the stability.

Good luck, I'll be off line some time,
so there may be delay with my answers,
if you have questions to me personally.

Regards,
Igor Kachan
kinz at peterlink.ru

new topic     » goto parent     » topic index » view message » categorize

8. Re: Looking for the best "Parse"...


new topic     » goto parent     » topic index » view message » categorize

9. Re: Looking for the best "Parse"...

Cassidy wrote:

>Jiri - Could you explain more about what is happening using your
>method? I'm not exactly sure I follow it, what makes it better than
>recursion? I'm willing to take it as a matter of faith that it's
>faster; even though I'm not concerned with marginal speed differences
>for this particular program, I'll store the knowledge away for when I
>AM concerned with speed.


Ok. This kind of recursion is relatively easy to 'flatten' (or
'unroll') into a plain iterative structure. Recursive solutions are
usually very appealing, but tend to be expensive both in terms of
memory and in used run time. Euphoria is very good in this respect,
but even in your case a 'quick and dirty' bench mark will show you
your solution is probably twice or three time slower than mine. As I
mentioned earlier, your recursive function is also handicapped by the
repeated use of the case conversion function - it is applied to both
the text and the 'escape' sequence every time your formula is
re-entered.

Now back to my solution. Sorry, originally I did not use any comments,
because I thought it was quite obvious what I was doing...

The repeat of my function below has line numbers indicated so that I
can refer to them.

In lines 1 and 2 I just declare a few private variables.

In lines 5 and 7 I create lower case copies of input parameters, once
only ;).

The length of the escape sequence is stored in 'n', because it will be
used repeatedly, and we do not want to waste time by calculating it
again and again.

'i = 0' in line 8 indicates we have not yet removed anything from the
front, start of text string - that will come later.

'r' in line 9 is a tricky customer, as you will see later. It is used
to store indices (locations) of all matches within the text string.

In line 10 we are looking for the first match.

If we find one, we enter the 'while' loop, and in line 12 we store the
index of the first character of the match in our above mentioned 'r'
sequence. This is the place, where a bit of elegance is pumped in: the
indices are stored in *reversed* order, effectively *prepended*.

In line 13 we calculate how many characters of the original text we
have surveyed so far, and in line 14 we simply chop them off.

In line 15 we are looking for the next match in the reduced string.

When all the matches have been found, we count them in line 17.

In lines 18 to 23 we replace every found occurrence of the escape
sequence with our replace string. Just note, because we cleverly
stored replacement locations in reversed order, we can now use a
simple 'for' loop to do the replacement from the *back* of the string,
so that the remaining indices in 'r' are *not* affected by the changes
of length of the original string as the replacements progress.


I hope this verbal diarrhoea is of some use to some one.

jiri


 1: function parse(sequence text, sequence esc, sequence replace)
 2:     sequence e, r, s
 3:     integer m, n, i, j, p
 4:
 5:     e = lower(esc)
 6:     n = length(esc)
 7:     s = lower(text)
 8:     i = 0
 9:     r = {}
10:     j = match(e, s)
11:     while j do
12:         r = (i + j) & r
13:         i += j + n - 1
14:         s = s[j+n..length(s)]
15:         j = match(e, s)
16:     end while
17:     m = length(r)
18:     if m then
19:         for k = 1 to m  do
20:             p = r[k]
21:             text = text[1..p-1] & replace &
text[p+n..length(text)]
22:         end for
23:     end if
24:     return text
25: end function

new topic     » goto parent     » topic index » view message » categorize

10. Re: Looking for the best "Parse"...

This is a multi-part message in MIME format.

------=_NextPart_000_0030_01C15EBD.5B54D280
	charset="Windows-1252"

Here is my offering plus some timing tests.

I have opted for a non-recursive approach with minimum sequence/slice
operations.

----
Derek.


------=_NextPart_000_0030_01C15EBD.5B54D280
Content-Type: application/octet-stream;
	name="parse.e"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
	filename="parse.e"

with trace
include wildcard.e
include get.e

global function cparse(sequence unParsedText, sequence escape,
sequence replace )
sequence ParsedText
integer varPos
varPos =3D match(lower(escape), lower(unParsedText))
if not varPos then
   return unParsedText
elsif varPos =3D (length(unParsedText) - length(escape) + 1 ) then
         ParsedText =3D unParsedText[1..(length(unParsedText) -=20
length(escape))] & replace
         return cparse(ParsedText, escape, replace)
else
   if varPos =3D 1 then
         ParsedText =3D replace & unParsedText[(length(escape) +=20
1)..length(unParsedText)]
         return cparse(ParsedText, escape, replace)
   else
       ParsedText =3D unParsedText[1..(varPos - 1)] & replace &=20
unParsedText[(varPos + (length(escape)))..length(unParsedText)]
       return cparse(ParsedText, escape, replace)
   end if
end if
end function

function jparse(sequence text, sequence esc, sequence replace)
    sequence e, r, s
    integer m, n, i, j, p

    e =3D lower(esc)
    n =3D length(esc)
    s =3D lower(text)
    i =3D 0
    r =3D {}
    j =3D match(e, s)
    while j do
        r =3D (i + j) & r
        i +=3D j + n - 1
        s =3D s[j+n..length(s)]
        j =3D match(e, s)
    end while
    m =3D length(r)
    if m then
        for k =3D 1 to m  do
            p =3D r[k]
            text =3D text[1..p-1] & replace & text[p+n..length(text)]
        end for
    end if
    return text
end function

-- Assumption: All sequences contain ASCII text.
-- Each element is an integer from 0 to 255.
global function dparse(sequence text, sequence esc, sequence replace)
    sequence t,ltext
    integer pos,  -- posn of esc in text
            epos, -- end of text in t
            spos, -- pos in t to copy to
            ppos  -- previous 'pos' value

    -- Do a case-insensitive compare
    esc =3D lower(esc)
    ltext =3D lower(text)
    -- Find the first match
    pos =3D match(esc, ltext)
    -- If no matches, return the original text
    if pos =3D 0 then
        return text
    end if

    -- Create a sequence long enough to hold the result
    ppos =3D length(text) + floor(1 + =
(length(replace)*length(text)/length(esc)))
    t =3D repeat(-1, ppos)

    -- Initialise the position holders
    epos =3D 0
    spos =3D 0
    ppos =3D 0

    -- Keep making replacements until no more matches
    while pos do
        -- Copy all the text up to the current match
        -- Note special case ('cos Eu doesn't like slices starting with =
0)
        if pos !=3D 1 then
            epos =3D spos + pos - ppos - 1
            t[spos .. epos] =3D ltext[ppos .. pos-1]
        end if

        -- Hide the current match by changing its value to a non-text =
value,
        -- this way the next iteration won't find it again.
        ltext[pos] =3D 256

        -- Skip over the escape text
        ppos =3D pos + length(esc)

        -- Copy the replacement text into the result
        spos =3D epos + 1
        epos =3D epos + length(replace)
        t[spos .. epos] =3D replace

        -- Skip over the replacement text in the result area
        spos =3D epos + 1

        -- Look for the next match
        pos =3D match(esc, ltext)
    end while

    -- Return only the copied text, as there could be some residual
    -- junk in the result area.
    return t[1..epos]
end function

 global function parseUserText(sequence u, sequence e, sequence r)
  sequence p
   integer v,m,n,k
    v =3D match(lower(e), lower(u))
    if not v then return u end if
    m =3D length(e)
    n =3D length(u)
    k =3D n - m
     if v =3D 1 then=20
              p =3D r & u[m+1..n]
     elsif v =3D 1 + k then=20
              p =3D u[1..k] & r
     else=20

              p =3D u[1..v-1] & r & u[v+m..n]
     end if
   return parseUserText(p,e,r)
 end function
=20

atom e
atom c
integer p
constant rtns =3D {routine_id("cparse"),
                 routine_id("dparse"),
                 routine_id("jparse"),
                 routine_id("parseUserText")
                }
constant names =3D{"cassidy's",
                 "derek's",
                 "jiri's",
                 "igor's"
                }

without warning
procedure null(sequence s)
end procedure

p =3D 5
puts(1, "parse Test starts\n")
for i =3D 1 to length(rtns) do

c =3D 0
e =3D time() + p
while e > time() do
null(call_func(rtns[i],{"$A one two $A three $A", "$A", "q"}))
null(call_func(rtns[i],{"$A one two $A three $A", "$A", "qxz"}))
null(call_func(rtns[i],{"$A one two $A three $A", "$A", ""}))
null(call_func(rtns[i],{"$A one two $A three $A", "$B", "$A"}))
null(call_func(rtns[i],{"$ABC one two $ABC three $ABC", "$ABC", "1"}))
c+=3D1
end while
printf(1,"%s did %d iterations in %d seconds\n", {names[i],c, p})
end for

if platform() =3D 2 then
    puts(1, "END\n")
    c =3D wait_key()
end if

------=_NextPart_000_0030_01C15EBD.5B54D280--

new topic     » goto parent     » topic index » view message » categorize

11. Re: Looking for the best "Parse"...

I dont think these multi-run test are any good......

What Im saying  is that often times the second pass will be 5 times faster than
the first.

Anyone want to comment on this?

Euman
euman at bellsouth.net

> 
> Here is my offering plus some timing tests.
> 
> I have opted for a non-recursive approach with minimum sequence/slice
> operations.
> 
> ----
> Derek.
> 
> 
>

new topic     » goto parent     » topic index » view message » categorize

12. Re: Looking for the best "Parse"...

----- Original Message -----
From: <euman at bellsouth.net>
To: "EUforum" <EUforum at topica.com>
Subject: Re: Looking for the best "Parse"...


> I dont think these multi-run test are any good......
>
> What Im saying  is that often times the second pass will be 5 times faster
than the first.
>
> Anyone want to comment on this?

I have noticed this also. It has to do with garbage collection processes. In
order to compensate for this phenomena, I have styled the tests using fake
calls to routines which seem to trick the garbage collection. To
demonstrate, I swapped the order of the routines around and still get the
same sort of results.

However, I have been reminded never to write code while having breakfast,
and while getting the kids ready for club tennis competitions - there is a
major mistake in the routine I sent out. It doesn't return any text after
the last match! Ooops. Below is the corrected routine.

--------------
-- Assumption: All sequences contain ASCII text.
-- That is, each element is an integer from 0 to 255.
global function dparse(sequence text, sequence esc, sequence replace)
    sequence t,ltext
    integer pos,  -- posn of esc in text
            epos, -- end of text in t
            spos, -- pos in t to copy to
            ppos  -- previous 'pos' value

    -- Do a case-insensitive compare
    esc = lower(esc)
    ltext = lower(text)
    -- Find the first match
    pos = match(esc, ltext)
    -- If no matches, return the original text
    if pos = 0 or length(esc) = 0 then
        return text
    end if

    -- Create a sequence long enough to hold the result
    ppos = length(text) + floor(1 +
(length(replace)*length(text)/length(esc)))
    t = repeat(-1, ppos)

    -- Initialise the position holders
    epos = 0
    spos = 0
    ppos = 0

    -- Keep making replacements until no more matches
    while pos do
        -- Copy all the text up to the current match
        -- Note special case ('cos Eu doesn't like slices starting with 0)
        if pos != 1 then
            epos = spos + pos - ppos - 1
            t[spos .. epos] = ltext[ppos .. pos-1]
        end if

        -- Hide the current match by changing its value to a non-text value,
        -- this way the next iteration won't find it again.
        ltext[pos] = -1

        -- Skip over the escape text
        ppos = pos + length(esc)

        -- Copy the replacement text into the result
        spos = epos + 1
        epos = epos + length(replace)
        t[spos .. epos] = replace

        -- Skip over the replacement text in the result area
        spos = epos + 1

        -- Look for the next match
        pos = match(esc, ltext)
    end while

    -- Copy any left over text
    if ppos <= length(ltext) then
        pos = length(ltext)
        epos = spos + pos - ppos
        t[spos .. epos] = ltext[ppos .. pos]
    end if
    -- Return only the copied text, as there could be some residual
    -- junk in the result area.
    return t[1..epos]
end function

--------------

Some more observations, if I may.

The recursive solution presented so far have a problem when the replacement
text contains the escape text. They go into a never ending loop.

They choke on this sort of thing..
   res = parse("one$Atwo", "$A", "three$Afour")

Secondly, these sort of routines would go a lot faster and be coded a lot
more simply if RDS provided a variation to match() and find() that allowed
one to specify the starting offset.  Such as ...
   pos = match_from(",", "abc,def,ghi,klm", 5)
would start at the fifth element and return 8.

Finally, it would be nice if RDS could treat zero-length slices as a NOP,
regardless of what the actual start and end pos values are.


---
Derek.

new topic     » goto parent     » topic index » view message » categorize

13. Re: Looking for the best "Parse"...

Okay, I'm having a bad head day. Did anyone else notice the next mistake in
my code? I only handles text which starts with the escape text. Anyhow, to
fix this mistake, replace the lines ...

    -- Initialise the position holders
    epos = 0
    spos = 0
    ppos = 0

with ...
    -- Initialise the position holders
    epos = 0
    spos = 1
    ppos = 1

------
cheers,
Derek.

new topic     » goto parent     » topic index » view message » categorize

14. Re: Looking for the best "Parse"...

Derek, I hope your day improved after the correction flurry.  And
since your girls  beat our girls at netball again tonight, I had to
try a little harder to compensate. I think the simplified function
below is just marginally faster than yours. But do not lose any sleep
over it...

jiri


function jparse2(sequence s, sequence s1, sequence s2)
    -- replace every occurrence of s1 in s by s2
    -- case insensitive search
    sequence os,u,v
    integer i,j,m

    os = {}
    u = lower(s)
    v = lower(s1)
    m = length(s1)
    i = match(v, u)
    j = 1
    while i do
        os &= s[j..j+i-2] & s2
        j += i+m-1
        u = u[i+m..length(u)]
        i = match(v, u)
    end while
    return os & s[j..length(s)]
end function


----- Original Message -----
From: "Derek Parnell" <ddparnell at bigpond.com>
To: "EUforum" <EUforum at topica.com>
Sent: Saturday, 27 October 2001 12:34
Subject: Re: Looking for the best "Parse"...


>
> Okay, I'm having a bad head day. Did anyone else notice the next
mistake in
> my code? I only handles text which starts with the escape text.
Anyhow, to
> fix this mistake, replace the lines ...
>
>     -- Initialise the position holders
>     epos = 0
>     spos = 0
>     ppos = 0
>
> with ...
>     -- Initialise the position holders
>     epos = 0
>     spos = 1
>     ppos = 1
>
> ------
> cheers,
> Derek.

new topic     » goto parent     » topic index » view message » categorize

15. Re: Looking for the best "Parse"...

Beautiful. Fast and simple. Jiri, you da man.

No sleep lost. Go Aussies!

BTW, my wife is over there cycling around the Nelson area for another week.
She went to a conference in Christchurch. Took her bike over. Now doing some
touring.
---
cheers,
Derek.
----- Original Message -----
From: "Jiri Babor" <jbabor at PARADISE.NET.NZ>
To: "EUforum" <EUforum at topica.com>
Sent: Saturday, October 27, 2001 10:15 PM
Subject: Re: Looking for the best "Parse"...


>
> Derek, I hope your day improved after the correction flurry.  And
> since your girls  beat our girls at netball again tonight, I had to
> try a little harder to compensate. I think the simplified function
> below is just marginally faster than yours. But do not lose any sleep
> over it...
>
> jiri
>
>
> function jparse2(sequence s, sequence s1, sequence s2)
>     -- replace every occurrence of s1 in s by s2
>     -- case insensitive search
>     sequence os,u,v
>     integer i,j,m
>
>     os = {}
>     u = lower(s)
>     v = lower(s1)
>     m = length(s1)
>     i = match(v, u)
>     j = 1
>     while i do
>         os &= s[j..j+i-2] & s2
>         j += i+m-1
>         u = u[i+m..length(u)]
>         i = match(v, u)
>     end while
>     return os & s[j..length(s)]
> end function
>
>
> ----- Original Message -----
> From: "Derek Parnell" <ddparnell at bigpond.com>
> To: "EUforum" <EUforum at topica.com>
> Sent: Saturday, 27 October 2001 12:34
> Subject: Re: Looking for the best "Parse"...
>
>
> > Okay, I'm having a bad head day. Did anyone else notice the next
> mistake in
> > my code? I only handles text which starts with the escape text.
> Anyhow, to
> > fix this mistake, replace the lines ...
> >
> >     -- Initialise the position holders
> >     epos = 0
> >     spos = 0
> >     ppos = 0
> >
> > with ...
> >     -- Initialise the position holders
> >     epos = 0
> >     spos = 1
> >     ppos = 1
> >
> > ------
> > cheers,
> > Derek.
>
>
>

new topic     » goto parent     » topic index » view message » categorize

16. Re: Looking for the best "Parse"...

Clever girl. Nelson is my favourite provincial town. I like it best
when the cherries are ripening, at around Christmas. But I suppose,
spring is better, probably ideal, for touring. I hope she will venture
into the Tasman National Park, a real jewel.

Good night for now,  it's almost 3 o'clock  in the morning... jiri

----- Original Message -----
From: "Derek Parnell" <ddparnell at bigpond.com>
To: "EUforum" <EUforum at topica.com>
Subject: Re: Looking for the best "Parse"...


>
> Beautiful. Fast and simple. Jiri, you da man.
>
> No sleep lost. Go Aussies!
>
> BTW, my wife is over there cycling around the Nelson area for
another week.
> She went to a conference in Christchurch. Took her bike over. Now
doing some
> touring.
> ---
> cheers,
> Derek.

new topic     » goto parent     » topic index » view message » categorize

17. Re: Looking for the best "Parse"...

Ýòî ñîîáùåíèå â ôîðìàòå MIME ñîñòîèò èç íåñêîëüêèõ ÷àñòåé.

------=_NextPart_000_01C15F3C.6EFF8300

Hi Derek, Hi Jiri, 

I have prepared new variant of this parse play.
I apply the "delete double" method to jparse2()
function.

This new iteration is in the attachment 
with my own bench.

You can see my results below.
These are times of 10000 iterations  on 
my a 386 in plain Dos-32 mode

jparse2()            jparse3() -- new

21.59                21.48
21.59                21.48
21.64                21.47
21.59                21.47
21.59                21.47
21.58                21.48
21.58                21.53
21.59                21.53
21.59                21.47
----------------------------------average
21.59                21.48

Note, I don't know algorithms well enough,
both Cassidy's and any new, I just
read about that fox, which jumps over lazy dog,
to see if there is misprint in my new
iterations to the best function or no there is not.
I am not real author of any task and solution
in this play, I just search for doubled
things and try delete them with the help of 
Euphoria's inner power, so my name and these tasks
and functions are not well compatible things,
I think. And "delete double" method is an old one,
same as ... I don't know as. 

If you'll see Colin's solution near the modifired
Cassidy's function, you can see Colin just
apply "double delete" method to that recursive
solution.

You can also to apply this "delete double" method 
to Cassidy's *test frase* and delete mistakes from
that classic "delete double" test   smile

OK to finish this subject for the robast 
Eu Dos-32 programming ?

I say, there is OK with humor here, no ?

Regards,
Igor Kachan
kinz at peterlink.ru








------=_NextPart_000_01C15F3C.6EFF8300
Content-Type: application/octet-stream; name="Parse1.e"
Content-Transfer-Encoding: 7bit
Content-Description: Parse1.e (E )
Content-Disposition: attachment; filename="Parse1.e"


include wildcard.e

 function jparse2(sequence s, sequence s1, sequence s2)
     -- replace every occurrence of s1 in s by s2
     -- case insensitive search
     sequence os,u,v
     integer i,j,m

     os = {}
     u = lower(s)
     v = lower(s1)
     m = length(s1)
     i = match(v, u)
     j = 1
     while i do
	 os &= s[j..j+i-2] & s2
	 j += i+m-1
	 u = u[i+m..length(u)]
	 i = match(v, u)
     end while
     return os & s[j..length(s)]
 end function

integer k,i,j,m
sequence os,u,v

 function jparse3(sequence s, sequence s1, sequence s2)
     -- replace every occurrence of s1 in s by s2
     -- case insensitive search
--     sequence os,u,v
--     integer k,i,j,m

     os = {}
     u = lower(s)
     v = lower(s1)
     m = length(s1)
     j = 1
     for t=1 to 111111111 do
       i = match(v, u) if not i then exit end if   
	   k = i+m
	 os &= s[j..j+i-2] & s2
	  j += k-1
	   u = u[k..length(u)]
     end for
     return os & s[j..length(s)]
 end function

sequence text

text = jparse3 ("The $A jumped over the lazy dog.","$A","quick brown fox")

puts (1, text & "\n")


atom T

T=time()
for p=0 to 10000 do
text = jparse2 ("The $A jumped over the lazy dog.","$A","quick brown fox")
end for
T=time()-T
? T

T=time()
for p=0 to 10000 do
text = jparse3 ("The $A jumped over the lazy dog.","$A","quick brown fox")
end for
T=time()-T
?T


------=_NextPart_000_01C15F3C.6EFF8300--

new topic     » goto parent     » topic index » view message » categorize

18. Re: Looking for the best "Parse"...

Hi again,

I have noticed, that the useful
thing may be new loop in Euphoria,
it is dangerous thing, but very
handy for speed and delete double.

just do
  -- code
         if  condition then exit end if
  -- code
end do

Regards,
Igor Kachan
kinz at peterlink.ru

new topic     » goto parent     » topic index » view message » categorize

19. Re: Looking for the best "Parse"...

----- Original Message -----
From: "Igor Kachan" <kinz at peterlink.ru>
To: "EUforum" <EUforum at topica.com>
Subject: Re: Looking for the best "Parse"...

> I have noticed, that the useful
> thing may be new loop in Euphoria,
> it is dangerous thing, but very
> handy for speed and delete double.
>
> just do
>   -- code
>          if  condition then exit end if
>   -- code
> end do
>

Yes, a simple loop construct would be handy. The standard method of doing
this...

  while 1 do
  . . .
  end while

introduces an extra comparison for each iteration. An alternative syntax of
...

  repeat do
  . . .
  end repeat

might not be such a bad idea.
-------
Derek

new topic     » goto parent     » topic index » view message » categorize

20. Re: Looking for the best "Parse"...

Yes, Derek, *just* is not good and your
variant is more similar to  Euphoria's
syntax.

> Yes, a simple loop construct would be handy. The standard method of doing
> this...
> 
>   while 1 do
>   . . .
>   end while
> 
> introduces an extra comparison for each iteration. An alternative syntax
of
> ...
> 
>   repeat do
>   . . .
>   end repeat
> 
> might not be such a bad idea.
> -------
> Derek


Regards,
Igor Kachan
kinz at peterlink.ru

new topic     » goto parent     » topic index » view message » categorize

21. Re: Looking for the best "Parse"...


new topic     » goto parent     » topic index » view message » categorize

22. Re: Looking for the best "Parse"...

Derek Parnell writes:
> The standard method of doing this...
>
>  while 1 do
>  . . .
>  end while
>
> introduces an extra comparison for each iteration.

Actually, both the interpreter and the translator optimize
this case, so no comparison happens at run-time.
There's a simple unconditional jump from the end of the loop
back to the first statement inside the loop.

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com

new topic     » goto parent     » topic index » view message » categorize

23. Re: Looking for the best "Parse"...

Thank you, Robert. That is a useful bit of information. I'm will now be more
inclined to do try this. Maybe then, a mention in the documentation for this
special case might be in order.

----- Original Message -----
From: "Robert Craig" <rds at RapidEuphoria.com>
To: "EUforum" <EUforum at topica.com>
Subject: Re: Looking for the best "Parse"...


>
> Derek Parnell writes:
> > The standard method of doing this...
> >
> >  while 1 do
> >  . . .
> >  end while
> >
> > introduces an extra comparison for each iteration.
>
> Actually, both the interpreter and the translator optimize
> this case, so no comparison happens at run-time.
> There's a simple unconditional jump from the end of the loop
> back to the first statement inside the loop.
>
> Regards,
>    Rob Craig
>    Rapid Deployment Software
>    http://www.RapidEuphoria.com
>
>
>

new topic     » goto parent     » topic index » view message » categorize

24. Re: Looking for the best "Parse"...

Ýòî ñîîáùåíèå â ôîðìàòå MIME ñîñòîèò èç íåñêîëüêèõ ÷àñòåé.

------=_NextPart_000_01C15FF0.2DDE07C0

Derek wrote:

> Thank you, Robert. That is a useful bit of information.
> I'm will now be more inclined to do try this. Maybe then,
> a mention in the documentation for this
> special case might be in order.

Hi Derek, I have tested the jparse4() with that extremly 
fast loop just now.

Loop

 while 1 do
 . . .
 end while

really works as I want, so I do not want new loop in Euphoria now.

Thanks, Robert, from me, sorry, I didn't know about this feature.

My results with jparse2(), jparse3(), jparse4() are below:
(times of 10000 iterations in plain Dos-32 mode on a 386)

jparse2()            jparse3()            jparse4()

21.58                21.48                 21.31
21.59                21.47                 21.37
21.59                21.48                 21.31
21.59                21.48                 21.36

But remember please, I don't know any algorithms,
implemented in these functions, and thanks to Cassidy
for such useful his task and for his working solution.

Regards,
Igor Kachan
kinz at peterlink.ru



------=_NextPart_000_01C15FF0.2DDE07C0
Content-Type: application/octet-stream; name="Parse2.ex"
Content-Transfer-Encoding: 7bit
Content-Description: Parse2.ex (EX )
Content-Disposition: attachment; filename="Parse2.ex"


include wildcard.e

 function jparse2(sequence s, sequence s1, sequence s2)
     -- replace every occurrence of s1 in s by s2
     -- case insensitive search
     sequence os,u,v
     integer i,j,m

     os = {}
     u = lower(s)
     v = lower(s1)
     m = length(s1)
     i = match(v, u)
     j = 1
     while i do
	 os &= s[j..j+i-2] & s2
	 j += i+m-1
	 u = u[i+m..length(u)]
	 i = match(v, u)
     end while
     return os & s[j..length(s)]
 end function

integer k,i,j,m
sequence os,u,v

 function jparse3(sequence s, sequence s1, sequence s2)
     -- replace every occurrence of s1 in s by s2
     -- case insensitive search
--     sequence os,u,v
--     integer k,i,j,m

     os = {}
     u = lower(s)
     v = lower(s1)
     m = length(s1)
     j = 1
     for t=1 to 111111111 do
       i = match(v, u) if not i then exit end if   
	   k = i+m
	 os &= s[j..j+i-2] & s2
	  j += k-1
	   u = u[k..length(u)]
     end for
     return os & s[j..length(s)]
 end function

 function jparse4(sequence s, sequence s1, sequence s2)
     -- replace every occurrence of s1 in s by s2
     -- case insensitive search
--     sequence os,u,v
--     integer k,i,j,m

     os = {}
     u = lower(s)
     v = lower(s1)
     m = length(s1)
     j = 1
     while 1 do
       i = match(v, u) 
	  if not i then exit end if   
	   k = i+m
	 os &= s[j..j+i-2] & s2
	  j += k-1
	   u = u[k..length(u)]
     end while
     return os & s[j..length(s)]
 end function

sequence text

text = jparse3 ("The $A jumped over the lazy dog.","$A","quick brown fox")

puts (1, text & "\n")


atom T

T=time()
for p=0 to 10000 do
text = jparse2 ("The $A jumped over the lazy dog.","$A","quick brown fox")
end for
T=time()-T
? T

T=time()
for p=0 to 10000 do
text = jparse3 ("The $A jumped over the lazy dog.","$A","quick brown fox")
end for
T=time()-T
?T

T=time()
for p=0 to 10000 do
text = jparse4 ("The $A jumped over the lazy dog.","$A","quick brown fox")
end for
T=time()-T
?T


------=_NextPart_000_01C15FF0.2DDE07C0--

new topic     » goto parent     » topic index » view message » categorize

Search



Quick Links

User menu

Not signed in.

Misc Menu