OpenEuphoria: Forum: RE: Replacing characters (Matt: bug)


Posted by Matthew Lewis <matthewwalkerlewis at YAHOO.COM> Sep 17, 2002

> From: Dan Moyer [mailto:DANIELMOYER at prodigy.net]

> Matt's now doesn't fail on replace last character, but doesn't actually
> replace the last character, & is *very* much slower than either of the
> other two (one test: Henri: 3.9, Andy: 2.75, Matt: 24.06);

It seems to replace the last char when I test it.  I believe that the
slowness is due to the use of subscripting within the call to match().
There was some discussion about this recently.  I typically run fairly short
strings through this routine (maybe 2 or 3 sentence lengths) at a time, so
I've never needed any more speed.
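
For anyone following along, here is a minimal sketch of what "subscripting within the call to match()" looks like. It is not the routine that was timed; count_matches is a hypothetical name used only for illustration. The assumption is that each slice handed to match() is built as a new sequence, so the tail of the text gets copied once per match found:

```euphoria
-- Hypothetical illustration only, not the routine under discussion.
-- Counts non-overlapping occurrences of a in text using a subscripted match().
function count_matches( sequence a, sequence text )
  integer ix, m, n

  ix = 1
  n = 0
  while ix <= length(text) do
    -- the slice below is assumed to be rebuilt (copied) on every pass
    m = match( a, text[ix..length(text)] )
    if not m then
      exit
    end if
    n += 1
    -- m is relative to the slice, so convert and step past this match
    ix += m + length(a) - 1
  end while

  return n
end function
```

On short strings the extra copying should hardly matter, which would fit with the slowdown only showing up in the larger tests.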
 
> Henri/Mike "vulcan" routine is faster than Matt's but won't replace one
> char with 2;
>
> Andy's is fastest. How much faster seems to vary.  On a "long" sequence to
> peruse, it seemed twice as fast as Henri/Mike "vulcan", but on many
> repetitions of replacing in smaller sequence, it seems only 1.4 times
> faster (see test results above).

Actually, neither one will handle replacements of different lengths.  This
was a definite requirement for what I needed, where, for instance, you might
want to replace "." with "..".  Also, neither routine will handle:

R/replace_in_string("abc cba", "ab", "12" ) 

Both return "12c c21", which is fine if you've got a cipher (did anyone use
this method in the contest? :), but not if you're trying to replace words
with other words, in which case you end up with garbage.  Henri talks about
this in the docs for the routine.  Of course, if the searched-for object
isn't there, or there aren't many of them, my routine will speed up.  So if
you're doing it on a large file, you might try calling replace_all on each
line or every few lines.  You'd probably have to test to see where the
overhead of the call cancels out the quicker return.  And even then, it
probably would depend upon how often your match comes up.
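
To make the "cipher" behaviour concrete, here is a rough sketch of a per-character substitution. This is not Henri's or Andy's code; map_chars, src and dst are hypothetical names, and the sketch assumes dst is at least as long as src. Each character is looked up on its own, so the "ba" in "cba" gets translated even though "ab" never occurs there:

```euphoria
-- Hypothetical per-character mapper, for illustration only.
-- Every element of text found in src is replaced by the element of dst
-- at the same position (assumes length(dst) >= length(src)).
function map_chars( sequence text, sequence src, sequence dst )
  integer p

  for i = 1 to length(text) do
    p = find( text[i], src )
    if p then
      text[i] = dst[p]
    end if
  end for

  return text
end function

puts( 1, map_chars( "abc cba", "ab", "12" ) & '\n' )  -- prints 12c c21
```

A substring replacement, by contrast, should leave the trailing "cba" alone and give "12c cba".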

Here is a tweaked version that doesn't use a subscripted match.  It seems to
be about 15-20% faster:

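global function replace_all2( sequence text, object a, object b )
  integer ix, jx, lent, lena, lenb, dlen

  -- allow single elements as well as sequences
  if atom(a) then
    a = {a}
  end if

  if atom(b) then
    b = {b}
  end if

  -- the only call to match(), and it is not subscripted
  ix = match( a, text )
  if not ix then
    return text
  end if

  lena = length(a)
  lenb = length(b)
  dlen = lenb - lena
  lent = length(text)
  jx = lena + 1

  -- jx > lena means a complete match of a starts at text[ix]
  while jx > lena do
    text = text[1..ix-1] & b & text[ix+lena..lent]
    ix += lenb      -- resume after the replacement, so b is never rescanned
    jx = 1
    lent += dlen

    -- scan forward by hand for the next occurrence of a
    while ix <= lent and jx <= lena do
      if text[ix] = a[jx] then
        jx += 1
        ix += 1
      else
        ix += 2 - jx  -- restart one past where the partial match began
        jx = 1
      end if
    end while

    -- on a complete match ix is one past its end; back up to its start
    if jx > lena then
      ix -= lena
    end if
  end while

  return text
end function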
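
One way to exercise the routine against the cases raised in this thread is a small check harness along these lines. It is only a sketch: check is a hypothetical helper, replace_all2 is assumed to be defined earlier in the same file, and the last case is an extra one that is not from the discussion.

```euphoria
-- Hypothetical test helper; assumes replace_all2() is defined above in this file.
procedure check( sequence text, object a, object b, sequence expected )
  sequence got

  got = replace_all2( text, a, b )
  if equal( got, expected ) then
    puts( 1, "ok:   " )
  else
    puts( 1, "FAIL: " )
  end if
  printf( 1, "\"%s\" (expected \"%s\")\n", { got, expected } )
end procedure

check( "abc cba", "ab", "12", "12c cba" )  -- words, not a character cipher
check( "a.b.c",   ".",  "..", "a..b..c" )  -- replacement longer than the match
check( "a.b.",    '.',  "..", "a..b.." )   -- atom argument, match on the last character
check( "aab",     "ab", "x",  "ax" )       -- match starting inside a failed partial match
```

Any line reported as FAIL (or a slice error at run time) would point at the hand-written scanning loop rather than at the initial match() call.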

Matt Lewis
