1. Looking for the best "Parse"...
- Posted by Cassidy Napoli <gonzotek at yahoo.com> Oct 25, 2001
- 410 views
I want to be able to let the user specify a string of text with variables in it that my program will replace. The goal is to put the current winamp song into a line of text. I wrote this function, which works, but after reading the "Best Join" discussion, I realized there might be a better way of doing it. So I am looking for any faster or simpler methods, if there are any. I'll gladly accept any comments on my coding style, too. I want to improve it. [begin code] include wildcard.e global function parseUserText(sequence unParsedText, sequence escape, sequence replace ) sequence ParsedText integer varPos varPos = match(lower(escape), lower(unParsedText)) if not varPos then return unParsedText elsif varPos = (length(unParsedText) - length(escape) + 1 ) then ParsedText = unParsedText[1..(length(unParsedText) - length(escape))] & replace return parseUserText(ParsedText, escape, replace) else if varPos = 1 then ParsedText = replace & unParsedText[(length(escape) + 1)..length(unParsedText)] return parseUserText(ParsedText, escape, replace) else ParsedText = unParsedText[1..(varPos - 1)] & replace & unParsedText[(varPos + (length(escape)))..length(unParsedText)] return parseUserText(ParsedText, escape, replace) end if end if end function --ex. text = parseUserText("The $A jumped over the lazy dog.","$A","quick brown fox") puts (1, parseUserText("The $A jumped over the lazy dog.","$A","quick brown fox")) [end code]
2. Re: Looking for the best "Parse"...
- Posted by Igor Kachan <kinz at peterlink.ru> Oct 25, 2001
- 388 views
3. Re: Looking for the best "Parse"...
- Posted by Jiri Babor <jbabor at PARADISE.NET.NZ> Oct 25, 2001
- 390 views
Hi, Cassidy, the function below is probably faster, because it avoids both recursion and repeated conversions of the same text to lower case. But you can gain more by simply replacing Rob's elegant, but inefficient lower() function in wildcard.e by something like this: constant DIFF = 'a' - 'A' global function lower(object x) integer c if atom(x) then if x >= 'A' then if x <= 'Z' then return x + DIFF end if end if else for i = 1 to length(x) do c = x[i] if c >= 'A' then if c <= 'Z' then x[i] += DIFF end if end if end for end if return x end function (Note: I did not use 'short-circuit' evaluation, simply because it is much slower than its equivalent. I hope Rob can fix that soon.) function parse(sequence text, sequence esc, sequence replace) sequence e, r, s integer m, n, i, j, p e = lower(esc) n = length(esc) s = lower(text) i = 0 r = {} j = match(e, s) while j do r = (i + j) & r i += j + n - 1 s = s[j+n..length(s)] j = match(e, s) end while m = length(r) if m then for k = 1 to m do p = r[k] text = text[1..p-1] & replace & text[p+n..length(text)] end for end if return text end function --jiri ----- Original Message ----- From: "Cassidy Napoli" <gonzotek at yahoo.com> To: "EUforum" <EUforum at topica.com> Sent: Friday, 26 October 2001 03:39 Subject: Looking for the best "Parse"... > > I want to be able to let the user specify a string of text with > variables in it that my program will replace. The goal is to put the > current winamp song into a line of text. I wrote this function, which > works, but after reading the "Best Join" discussion, I realized there > might be a better way of doing it. So I am looking for any faster or > simpler methods, if there are any. I'll gladly accept any comments on > my coding style, too. I want to improve it. > > [begin code] > include wildcard.e > > global function parseUserText(sequence unParsedText, sequence escape, > sequence replace ) > sequence ParsedText > integer varPos > varPos = match(lower(escape), lower(unParsedText)) > if not varPos then > return unParsedText > elsif varPos = (length(unParsedText) - length(escape) + 1 ) then > ParsedText = unParsedText[1..(length(unParsedText) - > length(escape))] & replace > return parseUserText(ParsedText, escape, replace) > else > if varPos = 1 then > ParsedText = replace & unParsedText[(length(escape) + > 1)..length(unParsedText)] > return parseUserText(ParsedText, escape, replace) > else > ParsedText = unParsedText[1..(varPos - 1)] & replace & > unParsedText[(varPos + (length(escape)))..length(unParsedText)] > return parseUserText(ParsedText, escape, replace) > end if > end if > end function > --ex. text = parseUserText("The $A jumped over the lazy > dog.","$A","quick brown fox") > > puts (1, parseUserText("The $A jumped over the lazy dog.","$A","quick > brown fox")) > [end code] > > >
4. Re: Looking for the best "Parse"...
- Posted by Igor Kachan <kinz at peterlink.ru> Oct 25, 2001
- 397 views
5. Re: Looking for the best "Parse"...
- Posted by Igor Kachan <kinz at peterlink.ru> Oct 26, 2001
- 391 views
Hello Cassidy Napoli, Just once more iteration with your program and your very interesting *recursive* idea, which you can make with the *delete doubl* method to get a simple and understandable code. Try please: --[begin code] include wildcard.e global function parseUserText (sequence u, --unParsedText -- u sequence e, --escape -- e sequence r) --replace -- r -- sequence p --ParsedText -- p integer v --varPos -- v integer m -- le integer n -- lu integer k -- v = match(lower(e), lower(u)) -- add explanations in your if not v then return u end if -- native language and in English -- m = length(e) -- place for explanation n = length(u) -- place for explanation k = n - m -- place for explanation if v = 1 -- place for explanation then p = r & u[m+1..n] -- place for explanation elsif v = 1 + k -- place for explanation then p = u[1..k] & r -- place for explanation else p = u[1..v-1] & r & u[v+m..n] -- place for explanation end if -- place for explanation return parseUserText(p,e,r) -- place for explanation end function sequence text text = parseUserText("The $A jumped over the lazy dog.","$A","quick brown fox") puts (1, text) --[end code] The speed is another question, it requires the bench-testings on monotasking OS, in plain Dos-32 mode, for example. Regards, Igor Kachan kinz at peterlink.ru
6. Re: Looking for the best "Parse"...
- Posted by Igor Kachan <kinz at peterlink.ru> Oct 26, 2001
- 406 views
7. Re: Looking for the best "Parse"...
- Posted by Igor Kachan <kinz at peterlink.ru> Oct 26, 2001
- 405 views
Ok Cassidy, I think, my part of the work is near the finish, so please see your function in the usual compact form: -- program parse.ex include wildcard.e global function parseUserText(sequence u, sequence e, sequence r) sequence p integer v,m,n,k v = match(lower(e), lower(u)) if not v then return u end if m = length(e) n = length(u) k = n - m if v = 1 then p = r & u[m+1..n] elsif v = 1 + k then p = u[1..k] & r else p = u[1..v-1] & r & u[v+m..n] end if return parseUserText(p,e,r) end function sequence text text = parseUserText("The $A jumped over the lazy dog.","$A","quick brown fox") puts (1, text) --end of program When I was testing the last variant, there was misprint in the code, and recursive construction run out of memory on my machine without other messages about bug. So, code with recursions are very nice and compact, but it is some more dangerous for the stability. Good luck, I'll be off line some time, so there may be delay with my answers, if you have questions to me personally. Regards, Igor Kachan kinz at peterlink.ru
8. Re: Looking for the best "Parse"...
- Posted by Igor Kachan <kinz at peterlink.ru> Oct 26, 2001
- 390 views
9. Re: Looking for the best "Parse"...
- Posted by Jiri Babor <jbabor at PARADISE.NET.NZ> Oct 26, 2001
- 410 views
Cassidy wrote: >Jiri - Could you explain more about what is happening using your >method? I'm not exactly sure I follow it, what makes it better than >recursion? I'm willing to take it as a matter of faith that it's >faster; even though I'm not concerned with marginal speed differences >for this particular program, I'll store the knowledge away for when I >AM concerned with speed. Ok. This kind of recursion is relatively easy to 'flatten' (or 'unroll') into a plain iterative structure. Recursive solutions are usually very appealing, but tend to be expensive both in terms of memory and in used run time. Euphoria is very good in this respect, but even in your case a 'quick and dirty' bench mark will show you your solution is probably twice or three time slower than mine. As I mentioned earlier, your recursive function is also handicapped by the repeated use of the case conversion function - it is applied to both the text and the 'escape' sequence every time your formula is re-entered. Now back to my solution. Sorry, originally I did not use any comments, because I thought it was quite obvious what I was doing... The repeat of my function below has line numbers indicated so that I can refer to them. In lines 1 and 2 I just declare a few private variables. In lines 5 and 7 I create lower case copies of input parameters, once only ;). The length of the escape sequence is stored in 'n', because it will be used repeatedly, and we do not want to waste time by calculating it again and again. 'i = 0' in line 8 indicates we have not yet removed anything from the front, start of text string - that will come later. 'r' in line 9 is a tricky customer, as you will see later. It is used to store indices (locations) of all matches within the text string. In line 10 we are looking for the first match. If we find one, we enter the 'while' loop, and in line 12 we store the index of the first character of the match in our above mentioned 'r' sequence. This is the place, where a bit of elegance is pumped in: the indices are stored in *reversed* order, effectively *prepended*. In line 13 we calculate how many characters of the original text we have surveyed so far, and in line 14 we simply chop them off. In line 15 we are looking for the next match in the reduced string. When all the matches have been found, we count them in line 17. In lines 18 to 23 we replace every found occurrence of the escape sequence with our replace string. Just note, because we cleverly stored replacement locations in reversed order, we can now use a simple 'for' loop to do the replacement from the *back* of the string, so that the remaining indices in 'r' are *not* affected by the changes of length of the original string as the replacements progress. I hope this verbal diarrhoea is of some use to some one. jiri 1: function parse(sequence text, sequence esc, sequence replace) 2: sequence e, r, s 3: integer m, n, i, j, p 4: 5: e = lower(esc) 6: n = length(esc) 7: s = lower(text) 8: i = 0 9: r = {} 10: j = match(e, s) 11: while j do 12: r = (i + j) & r 13: i += j + n - 1 14: s = s[j+n..length(s)] 15: j = match(e, s) 16: end while 17: m = length(r) 18: if m then 19: for k = 1 to m do 20: p = r[k] 21: text = text[1..p-1] & replace & text[p+n..length(text)] 22: end for 23: end if 24: return text 25: end function
10. Re: Looking for the best "Parse"...
- Posted by Derek Parnell <ddparnell at bigpond.com> Oct 26, 2001
- 390 views
This is a multi-part message in MIME format. ------=_NextPart_000_0030_01C15EBD.5B54D280 charset="Windows-1252" Here is my offering plus some timing tests. I have opted for a non-recursive approach with minimum sequence/slice operations. ---- Derek. ------=_NextPart_000_0030_01C15EBD.5B54D280 Content-Type: application/octet-stream; name="parse.e" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="parse.e" with trace include wildcard.e include get.e global function cparse(sequence unParsedText, sequence escape, sequence replace ) sequence ParsedText integer varPos varPos =3D match(lower(escape), lower(unParsedText)) if not varPos then return unParsedText elsif varPos =3D (length(unParsedText) - length(escape) + 1 ) then ParsedText =3D unParsedText[1..(length(unParsedText) -=20 length(escape))] & replace return cparse(ParsedText, escape, replace) else if varPos =3D 1 then ParsedText =3D replace & unParsedText[(length(escape) +=20 1)..length(unParsedText)] return cparse(ParsedText, escape, replace) else ParsedText =3D unParsedText[1..(varPos - 1)] & replace &=20 unParsedText[(varPos + (length(escape)))..length(unParsedText)] return cparse(ParsedText, escape, replace) end if end if end function function jparse(sequence text, sequence esc, sequence replace) sequence e, r, s integer m, n, i, j, p e =3D lower(esc) n =3D length(esc) s =3D lower(text) i =3D 0 r =3D {} j =3D match(e, s) while j do r =3D (i + j) & r i +=3D j + n - 1 s =3D s[j+n..length(s)] j =3D match(e, s) end while m =3D length(r) if m then for k =3D 1 to m do p =3D r[k] text =3D text[1..p-1] & replace & text[p+n..length(text)] end for end if return text end function -- Assumption: All sequences contain ASCII text. -- Each element is an integer from 0 to 255. global function dparse(sequence text, sequence esc, sequence replace) sequence t,ltext integer pos, -- posn of esc in text epos, -- end of text in t spos, -- pos in t to copy to ppos -- previous 'pos' value -- Do a case-insensitive compare esc =3D lower(esc) ltext =3D lower(text) -- Find the first match pos =3D match(esc, ltext) -- If no matches, return the original text if pos =3D 0 then return text end if -- Create a sequence long enough to hold the result ppos =3D length(text) + floor(1 + = (length(replace)*length(text)/length(esc))) t =3D repeat(-1, ppos) -- Initialise the position holders epos =3D 0 spos =3D 0 ppos =3D 0 -- Keep making replacements until no more matches while pos do -- Copy all the text up to the current match -- Note special case ('cos Eu doesn't like slices starting with = 0) if pos !=3D 1 then epos =3D spos + pos - ppos - 1 t[spos .. epos] =3D ltext[ppos .. pos-1] end if -- Hide the current match by changing its value to a non-text = value, -- this way the next iteration won't find it again. ltext[pos] =3D 256 -- Skip over the escape text ppos =3D pos + length(esc) -- Copy the replacement text into the result spos =3D epos + 1 epos =3D epos + length(replace) t[spos .. epos] =3D replace -- Skip over the replacement text in the result area spos =3D epos + 1 -- Look for the next match pos =3D match(esc, ltext) end while -- Return only the copied text, as there could be some residual -- junk in the result area. return t[1..epos] end function global function parseUserText(sequence u, sequence e, sequence r) sequence p integer v,m,n,k v =3D match(lower(e), lower(u)) if not v then return u end if m =3D length(e) n =3D length(u) k =3D n - m if v =3D 1 then=20 p =3D r & u[m+1..n] elsif v =3D 1 + k then=20 p =3D u[1..k] & r else=20 p =3D u[1..v-1] & r & u[v+m..n] end if return parseUserText(p,e,r) end function =20 atom e atom c integer p constant rtns =3D {routine_id("cparse"), routine_id("dparse"), routine_id("jparse"), routine_id("parseUserText") } constant names =3D{"cassidy's", "derek's", "jiri's", "igor's" } without warning procedure null(sequence s) end procedure p =3D 5 puts(1, "parse Test starts\n") for i =3D 1 to length(rtns) do c =3D 0 e =3D time() + p while e > time() do null(call_func(rtns[i],{"$A one two $A three $A", "$A", "q"})) null(call_func(rtns[i],{"$A one two $A three $A", "$A", "qxz"})) null(call_func(rtns[i],{"$A one two $A three $A", "$A", ""})) null(call_func(rtns[i],{"$A one two $A three $A", "$B", "$A"})) null(call_func(rtns[i],{"$ABC one two $ABC three $ABC", "$ABC", "1"})) c+=3D1 end while printf(1,"%s did %d iterations in %d seconds\n", {names[i],c, p}) end for if platform() =3D 2 then puts(1, "END\n") c =3D wait_key() end if ------=_NextPart_000_0030_01C15EBD.5B54D280--
11. Re: Looking for the best "Parse"...
- Posted by euman at bellsouth.net Oct 26, 2001
- 395 views
I dont think these multi-run test are any good...... What Im saying is that often times the second pass will be 5 times faster than the first. Anyone want to comment on this? Euman euman at bellsouth.net > > Here is my offering plus some timing tests. > > I have opted for a non-recursive approach with minimum sequence/slice > operations. > > ---- > Derek. > > >
12. Re: Looking for the best "Parse"...
- Posted by Derek Parnell <ddparnell at bigpond.com> Oct 26, 2001
- 391 views
----- Original Message ----- From: <euman at bellsouth.net> To: "EUforum" <EUforum at topica.com> Subject: Re: Looking for the best "Parse"... > I dont think these multi-run test are any good...... > > What Im saying is that often times the second pass will be 5 times faster than the first. > > Anyone want to comment on this? I have noticed this also. It has to do with garbage collection processes. In order to compensate for this phenomena, I have styled the tests using fake calls to routines which seem to trick the garbage collection. To demonstrate, I swapped the order of the routines around and still get the same sort of results. However, I have been reminded never to write code while having breakfast, and while getting the kids ready for club tennis competitions - there is a major mistake in the routine I sent out. It doesn't return any text after the last match! Ooops. Below is the corrected routine. -------------- -- Assumption: All sequences contain ASCII text. -- That is, each element is an integer from 0 to 255. global function dparse(sequence text, sequence esc, sequence replace) sequence t,ltext integer pos, -- posn of esc in text epos, -- end of text in t spos, -- pos in t to copy to ppos -- previous 'pos' value -- Do a case-insensitive compare esc = lower(esc) ltext = lower(text) -- Find the first match pos = match(esc, ltext) -- If no matches, return the original text if pos = 0 or length(esc) = 0 then return text end if -- Create a sequence long enough to hold the result ppos = length(text) + floor(1 + (length(replace)*length(text)/length(esc))) t = repeat(-1, ppos) -- Initialise the position holders epos = 0 spos = 0 ppos = 0 -- Keep making replacements until no more matches while pos do -- Copy all the text up to the current match -- Note special case ('cos Eu doesn't like slices starting with 0) if pos != 1 then epos = spos + pos - ppos - 1 t[spos .. epos] = ltext[ppos .. pos-1] end if -- Hide the current match by changing its value to a non-text value, -- this way the next iteration won't find it again. ltext[pos] = -1 -- Skip over the escape text ppos = pos + length(esc) -- Copy the replacement text into the result spos = epos + 1 epos = epos + length(replace) t[spos .. epos] = replace -- Skip over the replacement text in the result area spos = epos + 1 -- Look for the next match pos = match(esc, ltext) end while -- Copy any left over text if ppos <= length(ltext) then pos = length(ltext) epos = spos + pos - ppos t[spos .. epos] = ltext[ppos .. pos] end if -- Return only the copied text, as there could be some residual -- junk in the result area. return t[1..epos] end function -------------- Some more observations, if I may. The recursive solution presented so far have a problem when the replacement text contains the escape text. They go into a never ending loop. They choke on this sort of thing.. res = parse("one$Atwo", "$A", "three$Afour") Secondly, these sort of routines would go a lot faster and be coded a lot more simply if RDS provided a variation to match() and find() that allowed one to specify the starting offset. Such as ... pos = match_from(",", "abc,def,ghi,klm", 5) would start at the fifth element and return 8. Finally, it would be nice if RDS could treat zero-length slices as a NOP, regardless of what the actual start and end pos values are. --- Derek.
13. Re: Looking for the best "Parse"...
- Posted by Derek Parnell <ddparnell at bigpond.com> Oct 26, 2001
- 405 views
Okay, I'm having a bad head day. Did anyone else notice the next mistake in my code? I only handles text which starts with the escape text. Anyhow, to fix this mistake, replace the lines ... -- Initialise the position holders epos = 0 spos = 0 ppos = 0 with ... -- Initialise the position holders epos = 0 spos = 1 ppos = 1 ------ cheers, Derek.
14. Re: Looking for the best "Parse"...
- Posted by Jiri Babor <jbabor at PARADISE.NET.NZ> Oct 27, 2001
- 401 views
Derek, I hope your day improved after the correction flurry. And since your girls beat our girls at netball again tonight, I had to try a little harder to compensate. I think the simplified function below is just marginally faster than yours. But do not lose any sleep over it... jiri function jparse2(sequence s, sequence s1, sequence s2) -- replace every occurrence of s1 in s by s2 -- case insensitive search sequence os,u,v integer i,j,m os = {} u = lower(s) v = lower(s1) m = length(s1) i = match(v, u) j = 1 while i do os &= s[j..j+i-2] & s2 j += i+m-1 u = u[i+m..length(u)] i = match(v, u) end while return os & s[j..length(s)] end function ----- Original Message ----- From: "Derek Parnell" <ddparnell at bigpond.com> To: "EUforum" <EUforum at topica.com> Sent: Saturday, 27 October 2001 12:34 Subject: Re: Looking for the best "Parse"... > > Okay, I'm having a bad head day. Did anyone else notice the next mistake in > my code? I only handles text which starts with the escape text. Anyhow, to > fix this mistake, replace the lines ... > > -- Initialise the position holders > epos = 0 > spos = 0 > ppos = 0 > > with ... > -- Initialise the position holders > epos = 0 > spos = 1 > ppos = 1 > > ------ > cheers, > Derek.
15. Re: Looking for the best "Parse"...
- Posted by Derek Parnell <ddparnell at bigpond.com> Oct 27, 2001
- 398 views
Beautiful. Fast and simple. Jiri, you da man. No sleep lost. Go Aussies! BTW, my wife is over there cycling around the Nelson area for another week. She went to a conference in Christchurch. Took her bike over. Now doing some touring. --- cheers, Derek. ----- Original Message ----- From: "Jiri Babor" <jbabor at PARADISE.NET.NZ> To: "EUforum" <EUforum at topica.com> Sent: Saturday, October 27, 2001 10:15 PM Subject: Re: Looking for the best "Parse"... > > Derek, I hope your day improved after the correction flurry. And > since your girls beat our girls at netball again tonight, I had to > try a little harder to compensate. I think the simplified function > below is just marginally faster than yours. But do not lose any sleep > over it... > > jiri > > > function jparse2(sequence s, sequence s1, sequence s2) > -- replace every occurrence of s1 in s by s2 > -- case insensitive search > sequence os,u,v > integer i,j,m > > os = {} > u = lower(s) > v = lower(s1) > m = length(s1) > i = match(v, u) > j = 1 > while i do > os &= s[j..j+i-2] & s2 > j += i+m-1 > u = u[i+m..length(u)] > i = match(v, u) > end while > return os & s[j..length(s)] > end function > > > ----- Original Message ----- > From: "Derek Parnell" <ddparnell at bigpond.com> > To: "EUforum" <EUforum at topica.com> > Sent: Saturday, 27 October 2001 12:34 > Subject: Re: Looking for the best "Parse"... > > > > Okay, I'm having a bad head day. Did anyone else notice the next > mistake in > > my code? I only handles text which starts with the escape text. > Anyhow, to > > fix this mistake, replace the lines ... > > > > -- Initialise the position holders > > epos = 0 > > spos = 0 > > ppos = 0 > > > > with ... > > -- Initialise the position holders > > epos = 0 > > spos = 1 > > ppos = 1 > > > > ------ > > cheers, > > Derek. > > >
16. Re: Looking for the best "Parse"...
- Posted by Jiri Babor <jbabor at PARADISE.NET.NZ> Oct 27, 2001
- 391 views
Clever girl. Nelson is my favourite provincial town. I like it best when the cherries are ripening, at around Christmas. But I suppose, spring is better, probably ideal, for touring. I hope she will venture into the Tasman National Park, a real jewel. Good night for now, it's almost 3 o'clock in the morning... jiri ----- Original Message ----- From: "Derek Parnell" <ddparnell at bigpond.com> To: "EUforum" <EUforum at topica.com> Subject: Re: Looking for the best "Parse"... > > Beautiful. Fast and simple. Jiri, you da man. > > No sleep lost. Go Aussies! > > BTW, my wife is over there cycling around the Nelson area for another week. > She went to a conference in Christchurch. Took her bike over. Now doing some > touring. > --- > cheers, > Derek.
17. Re: Looking for the best "Parse"...
- Posted by Igor Kachan <kinz at peterlink.ru> Oct 27, 2001
- 388 views
Ýòî ñîîáùåíèå â ôîðìàòå MIME ñîñòîèò èç íåñêîëüêèõ ÷àñòåé. ------=_NextPart_000_01C15F3C.6EFF8300 Hi Derek, Hi Jiri, I have prepared new variant of this parse play. I apply the "delete double" method to jparse2() function. This new iteration is in the attachment with my own bench. You can see my results below. These are times of 10000 iterations on my a 386 in plain Dos-32 mode jparse2() jparse3() -- new 21.59 21.48 21.59 21.48 21.64 21.47 21.59 21.47 21.59 21.47 21.58 21.48 21.58 21.53 21.59 21.53 21.59 21.47 ----------------------------------average 21.59 21.48 Note, I don't know algorithms well enough, both Cassidy's and any new, I just read about that fox, which jumps over lazy dog, to see if there is misprint in my new iterations to the best function or no there is not. I am not real author of any task and solution in this play, I just search for doubled things and try delete them with the help of Euphoria's inner power, so my name and these tasks and functions are not well compatible things, I think. And "delete double" method is an old one, same as ... I don't know as. If you'll see Colin's solution near the modifired Cassidy's function, you can see Colin just apply "double delete" method to that recursive solution. You can also to apply this "delete double" method to Cassidy's *test frase* and delete mistakes from that classic "delete double" test OK to finish this subject for the robast Eu Dos-32 programming ? I say, there is OK with humor here, no ? Regards, Igor Kachan kinz at peterlink.ru ------=_NextPart_000_01C15F3C.6EFF8300 Content-Type: application/octet-stream; name="Parse1.e" Content-Transfer-Encoding: 7bit Content-Description: Parse1.e (E ) Content-Disposition: attachment; filename="Parse1.e" include wildcard.e function jparse2(sequence s, sequence s1, sequence s2) -- replace every occurrence of s1 in s by s2 -- case insensitive search sequence os,u,v integer i,j,m os = {} u = lower(s) v = lower(s1) m = length(s1) i = match(v, u) j = 1 while i do os &= s[j..j+i-2] & s2 j += i+m-1 u = u[i+m..length(u)] i = match(v, u) end while return os & s[j..length(s)] end function integer k,i,j,m sequence os,u,v function jparse3(sequence s, sequence s1, sequence s2) -- replace every occurrence of s1 in s by s2 -- case insensitive search -- sequence os,u,v -- integer k,i,j,m os = {} u = lower(s) v = lower(s1) m = length(s1) j = 1 for t=1 to 111111111 do i = match(v, u) if not i then exit end if k = i+m os &= s[j..j+i-2] & s2 j += k-1 u = u[k..length(u)] end for return os & s[j..length(s)] end function sequence text text = jparse3 ("The $A jumped over the lazy dog.","$A","quick brown fox") puts (1, text & "\n") atom T T=time() for p=0 to 10000 do text = jparse2 ("The $A jumped over the lazy dog.","$A","quick brown fox") end for T=time()-T ? T T=time() for p=0 to 10000 do text = jparse3 ("The $A jumped over the lazy dog.","$A","quick brown fox") end for T=time()-T ?T ------=_NextPart_000_01C15F3C.6EFF8300--
18. Re: Looking for the best "Parse"...
- Posted by Igor Kachan <kinz at peterlink.ru> Oct 27, 2001
- 387 views
Hi again, I have noticed, that the useful thing may be new loop in Euphoria, it is dangerous thing, but very handy for speed and delete double. just do -- code if condition then exit end if -- code end do Regards, Igor Kachan kinz at peterlink.ru
19. Re: Looking for the best "Parse"...
- Posted by Derek Parnell <ddparnell at bigpond.com> Oct 27, 2001
- 401 views
----- Original Message ----- From: "Igor Kachan" <kinz at peterlink.ru> To: "EUforum" <EUforum at topica.com> Subject: Re: Looking for the best "Parse"... > I have noticed, that the useful > thing may be new loop in Euphoria, > it is dangerous thing, but very > handy for speed and delete double. > > just do > -- code > if condition then exit end if > -- code > end do > Yes, a simple loop construct would be handy. The standard method of doing this... while 1 do . . . end while introduces an extra comparison for each iteration. An alternative syntax of ... repeat do . . . end repeat might not be such a bad idea. ------- Derek
20. Re: Looking for the best "Parse"...
- Posted by Igor Kachan <kinz at peterlink.ru> Oct 27, 2001
- 400 views
Yes, Derek, *just* is not good and your variant is more similar to Euphoria's syntax. > Yes, a simple loop construct would be handy. The standard method of doing > this... > > while 1 do > . . . > end while > > introduces an extra comparison for each iteration. An alternative syntax of > ... > > repeat do > . . . > end repeat > > might not be such a bad idea. > ------- > Derek Regards, Igor Kachan kinz at peterlink.ru
21. Re: Looking for the best "Parse"...
- Posted by Igor Kachan <kinz at peterlink.ru> Oct 27, 2001
- 389 views
22. Re: Looking for the best "Parse"...
- Posted by Robert Craig <rds at RapidEuphoria.com> Oct 27, 2001
- 394 views
Derek Parnell writes: > The standard method of doing this... > > while 1 do > . . . > end while > > introduces an extra comparison for each iteration. Actually, both the interpreter and the translator optimize this case, so no comparison happens at run-time. There's a simple unconditional jump from the end of the loop back to the first statement inside the loop. Regards, Rob Craig Rapid Deployment Software http://www.RapidEuphoria.com
23. Re: Looking for the best "Parse"...
- Posted by Derek Parnell <ddparnell at bigpond.com> Oct 27, 2001
- 392 views
Thank you, Robert. That is a useful bit of information. I'm will now be more inclined to do try this. Maybe then, a mention in the documentation for this special case might be in order. ----- Original Message ----- From: "Robert Craig" <rds at RapidEuphoria.com> To: "EUforum" <EUforum at topica.com> Subject: Re: Looking for the best "Parse"... > > Derek Parnell writes: > > The standard method of doing this... > > > > while 1 do > > . . . > > end while > > > > introduces an extra comparison for each iteration. > > Actually, both the interpreter and the translator optimize > this case, so no comparison happens at run-time. > There's a simple unconditional jump from the end of the loop > back to the first statement inside the loop. > > Regards, > Rob Craig > Rapid Deployment Software > http://www.RapidEuphoria.com > > >
24. Re: Looking for the best "Parse"...
- Posted by Igor Kachan <kinz at peterlink.ru> Oct 28, 2001
- 399 views
Ýòî ñîîáùåíèå â ôîðìàòå MIME ñîñòîèò èç íåñêîëüêèõ ÷àñòåé. ------=_NextPart_000_01C15FF0.2DDE07C0 Derek wrote: > Thank you, Robert. That is a useful bit of information. > I'm will now be more inclined to do try this. Maybe then, > a mention in the documentation for this > special case might be in order. Hi Derek, I have tested the jparse4() with that extremly fast loop just now. Loop while 1 do . . . end while really works as I want, so I do not want new loop in Euphoria now. Thanks, Robert, from me, sorry, I didn't know about this feature. My results with jparse2(), jparse3(), jparse4() are below: (times of 10000 iterations in plain Dos-32 mode on a 386) jparse2() jparse3() jparse4() 21.58 21.48 21.31 21.59 21.47 21.37 21.59 21.48 21.31 21.59 21.48 21.36 But remember please, I don't know any algorithms, implemented in these functions, and thanks to Cassidy for such useful his task and for his working solution. Regards, Igor Kachan kinz at peterlink.ru ------=_NextPart_000_01C15FF0.2DDE07C0 Content-Type: application/octet-stream; name="Parse2.ex" Content-Transfer-Encoding: 7bit Content-Description: Parse2.ex (EX ) Content-Disposition: attachment; filename="Parse2.ex" include wildcard.e function jparse2(sequence s, sequence s1, sequence s2) -- replace every occurrence of s1 in s by s2 -- case insensitive search sequence os,u,v integer i,j,m os = {} u = lower(s) v = lower(s1) m = length(s1) i = match(v, u) j = 1 while i do os &= s[j..j+i-2] & s2 j += i+m-1 u = u[i+m..length(u)] i = match(v, u) end while return os & s[j..length(s)] end function integer k,i,j,m sequence os,u,v function jparse3(sequence s, sequence s1, sequence s2) -- replace every occurrence of s1 in s by s2 -- case insensitive search -- sequence os,u,v -- integer k,i,j,m os = {} u = lower(s) v = lower(s1) m = length(s1) j = 1 for t=1 to 111111111 do i = match(v, u) if not i then exit end if k = i+m os &= s[j..j+i-2] & s2 j += k-1 u = u[k..length(u)] end for return os & s[j..length(s)] end function function jparse4(sequence s, sequence s1, sequence s2) -- replace every occurrence of s1 in s by s2 -- case insensitive search -- sequence os,u,v -- integer k,i,j,m os = {} u = lower(s) v = lower(s1) m = length(s1) j = 1 while 1 do i = match(v, u) if not i then exit end if k = i+m os &= s[j..j+i-2] & s2 j += k-1 u = u[k..length(u)] end while return os & s[j..length(s)] end function sequence text text = jparse3 ("The $A jumped over the lazy dog.","$A","quick brown fox") puts (1, text & "\n") atom T T=time() for p=0 to 10000 do text = jparse2 ("The $A jumped over the lazy dog.","$A","quick brown fox") end for T=time()-T ? T T=time() for p=0 to 10000 do text = jparse3 ("The $A jumped over the lazy dog.","$A","quick brown fox") end for T=time()-T ?T T=time() for p=0 to 10000 do text = jparse4 ("The $A jumped over the lazy dog.","$A","quick brown fox") end for T=time()-T ?T ------=_NextPart_000_01C15FF0.2DDE07C0--