OpenEuphoria: Forum: Re: Replacing characters (Matt: bug)

Posted by Dan Moyer <DANIELMOYER at prodigy.net> Sep 17, 2002
Matt,

you wrote:
> It seems to replace the last char when I test it.

You're right, both your original and #2 *do* replace the last char,
normally; BUT, if TWO of the characters to be replaced are together in
the text, like "}}", *then* the second of the two isn't replaced by either
of your routines.

The test file I used (one of the supplied routines) happened to have two
"))" in it, and when I saw that the second ")" wasn't replaced, I only
noticed that & didn't realize that all other instances of ")" *had* been
replaced, including those at the end of a line.  Not sure if your routines
would replace *doubled* chars at the end of a line, but I suspect not.

"Doubled chars" not replaced is still a flaw, just not the one I pointed you
towards.  Sorry 'bout that.
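
A quick way to check any routine for this flaw is to feed it text where the
target character is doubled, both in the middle and at the very end. Here is
a minimal sketch; replace_char is a hypothetical stand-in written only for
this illustration (it is not Matt's, Andy's, or Henri's routine), so swap in
whichever routine you want to test.

```euphoria
-- Hypothetical stand-in for the routine under test: replaces every
-- element equal to old_ch with new_ch.
function replace_char(sequence text, atom old_ch, atom new_ch)
    for i = 1 to length(text) do
        if equal(text[i], old_ch) then
            text[i] = new_ch
        end if
    end for
    return text
end function

sequence result

-- Doubled target in the middle of the text: "))"
result = replace_char("f(g(x)) + {{y}}", ')', 'Z')
puts(1, result & '\n')    -- f(g(xZZ + {{y}}

-- Doubled target at the very end of the text: "}}"
result = replace_char("f(g(x)) + {{y}}", '}', 'Z')
puts(1, result & '\n')    -- f(g(x)) + {{yZZ
```

If a routine has the doubled-char flaw, one of the two outputs will still
show a leftover ")" or "}".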

> Actually, neither one will handle replacements of different lengths.

The test I ran (& zipped previously) *did* show Andy's replacing one with
two. That same instance of doubled parentheses mentioned above ("))") was
replaced by "ZZZZ" in my test.


> <snip> Also, neither routine will handle:
> R/replace_in_string("abc cba", "ab", "12" )

Hmm, that's true, but they're supposed to replace characters, not strings;
seems we may need at least 2 different routines: one like Henri sent, which
does what it intends, taking every instance of each *character* supplied &
replacing it with a different supplied char; and another which should take
every *string* supplied & replace it with a different *string*, so that your
example above would take "ab" & replace every instance with "12".

ReplaceChars & ReplaceStrings?
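
To make the split concrete, here is a rough sketch of what the two routines
might look like. The names come from the suggestion above, but the bodies and
parameter names are my own assumptions, not anyone's posted code.
ReplaceChars assumes the two character lists are the same length;
ReplaceStrings allows a replacement of a different length.

```euphoria
-- ReplaceChars (sketch): each element of text found in from_chars is
-- swapped for the element at the same position in to_chars.
-- Assumes length(from_chars) = length(to_chars).
function ReplaceChars(sequence text, sequence from_chars, sequence to_chars)
    integer pos
    for i = 1 to length(text) do
        pos = find(text[i], from_chars)
        if pos then
            text[i] = to_chars[pos]
        end if
    end for
    return text
end function

-- ReplaceStrings (sketch): every non-overlapping occurrence of old_str
-- is replaced by new_str, scanning left to right.
function ReplaceStrings(sequence text, sequence old_str, sequence new_str)
    sequence result
    integer pos
    if length(old_str) = 0 then
        return text    -- nothing to search for
    end if
    result = ""
    pos = match(old_str, text)
    while pos do
        -- keep what precedes the match, then append the replacement
        result = result & text[1..pos-1] & new_str
        -- continue searching in whatever follows the match
        text = text[pos+length(old_str)..length(text)]
        pos = match(old_str, text)
    end while
    return result & text
end function

puts(1, ReplaceChars("abc cba", "ab", "12") & '\n')     -- 12c c21
puts(1, ReplaceStrings("abc cba", "ab", "12") & '\n')   -- 12c cba
puts(1, ReplaceStrings("f(g(x))", ")", "ZZ") & '\n')    -- f(g(xZZZZ
```

The first call is the cipher-style result, the second is the word-style
result, and the third shows doubled chars being replaced by a longer string.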

Dan Moyer



----- Original Message -----
From: "Matthew Lewis" <matthewwalkerlewis at YAHOO.COM>
To: "EUforum" <EUforum at topica.com>
Sent: Tuesday, September 17, 2002 4:23 AM
Subject: RE: Replacing characters (Matt: bug)


>
>
> > From: Dan Moyer [mailto:DANIELMOYER at prodigy.net]
>
> > Matt's now doesn't fail on replace last character, but doesn't actually
> > replace the last character, & is *very* much slower than either of the
> > other two (one test: Henri: 3.9, Andy: 2.75, Matt: 24.06);
>
> It seems to replace the last char when I test it.  I believe that the
> slowness is due to the use of subscripting within the call to match().
> There was some discussion about this recently.  I typically run fairly
> short strings through this routine (maybe 2 or 3 sentence lengths) at a
> time, so I've never needed any more speed.
>
> > Henri/Mike "vulcan" routine is faster than Matt's but won't replace one
> > char with 2;
> >
> > Andy's is fastest. How much faster seems to vary.  On a "long" sequence
> > to peruse, it seemed twice as fast as Henri/Mike "vulcan", but on many
> > repetitions of replacing in smaller sequence, it seems only 1.4 times
> > faster (see test results above).
>
> Actually, neither one will handle replacements of different lengths.  This
> was a definite requirement for what I needed, where, for instance, you
> might want to replace "." with "..".  Also, neither routine will handle:
>
> R/replace_in_string("abc cba", "ab", "12" )
>
> Both return "12c c21", which is fine if you've got a cipher (did anyone
> use this method in the contest? :), but not if you're trying to replace
> words with other words, in which case you end up with garbage.  Henri
> talks about this in the docs for the routine.  Of course, if the
> searched-for object isn't there, or there aren't many of them, my routine
> will speed up.  So if you're doing it on a large file, you might try
> calling replace_all each line or every few lines.  You'd probably have to
> test to see where the overhead of the call cancelled out the quicker
> return.  And even then, it probably would depend upon how often your match
> came up.
>
> Here is a tweaked version that doesn't use a subscripted match.  It seems
> to be about 15-20% faster:
>
> global function replace_all2( sequence text, object a, object b )
>   integer ix, jx, m, buf, lent, lena, lenb, dlen
>
>   if atom(a) then
>     a = {a}
>   end if
>
>   if atom(b) then
>     b = {b}
>   end if
>
>   ix = match( a, text )
>   if not ix then
>     return text
>   end if
>
>   lena = length(a)
>   lenb = length(b)
>   dlen = lenb - lena
>   lent = length(text)
>   jx = lena + 1
>
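>   while jx > 1 do
>     -- ix is the first element of the current match
>     text = text[1..ix-1] & b & text[ix+lena..lent]
>     ix += lenb
>     jx = 1
>     lent += dlen
>
>     -- scan on for the next occurrence of a
>     while ix <= lent and jx <= lena do
>       if text[ix] = a[jx] then
>         jx += 1
>       else
>         ix -= jx - 1  -- restart just after where this attempt began
>         jx = 1
>       end if
>       ix += 1
>     end while
>
>     if jx > lena then
>       ix -= lena  -- full match: step back to its first element
>     else
>       jx = 1      -- ran off the end without a full match, so stop
>     end if
>   end while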
>
>   return text
> end function
>
> Matt Lewis
>
>
>
>