1. faster deparse() for string library strtok.e
- Posted by jordah ferguson <jorfergie03 at yahoo.com> Apr 02, 2002
- 472 views
Hi all,

I use strtok.e to parse downloaded mail, but I found the library terribly slow. I added Chris Bensler's find_all(), which was 20% faster, but that didn't do the trick. I also added Andy Serpa's turbo parse(); that didn't do the trick either. So I did some time profiling to find out what was chewing up most of the time and memory. It was deparse(). G. Boehme made a programming error that makes the library spend a lot of time doing nothing. Many won't notice this because they have fast computers, but I'm sure KAT will appreciate my modification.

--****************************************************
--Test for deparse()
include get.e

--original deparse()
global function deparse_old(sequence list, integer c)
    object t
    sequence s
    atom len
    t = time()
    len = length(list)
    if len then
        s = list[1]
        for i = 2 to len by 1 do
            s = s & c & list[i]   ---///area of error
        end for
        printf(1, "%.32f deparse G.Boehme\n", time()-t)
        return s
    end if
    return ""
end function
------------------------
--my version
global function deparse_new(sequence list, integer c)
    object t
    sequence s
    t = time()
    if length(list) then
        s = list[1]
        for i = 2 to length(list) by 1 do
            s = s & c
            s = s & list[i]
        end for
        printf(1, "%.32f my version of deparse\n", time()-t)
        return s
    end if
    return ""
end function
-----------------------
--Andy Serpa's Turbo version 3X faster
global function parse(sequence s, integer c)
    integer slen, spt, flag
    sequence parsed

    parsed = {}
    slen = length(s)
    spt = 1
    flag = 0
    for i = 1 to slen do
        if s[i] = c then
            if flag = 1 then
                parsed = append(parsed, s[spt..i-1])
                flag = 0
                spt = i+1
            else
                spt += 1
            end if
        else
            flag = 1
        end if
    end for
    if flag = 1 then
        parsed = append(parsed, s[spt..slen])
    end if
    return parsed
end function
--------------------------------
global procedure wait_abort(sequence msg)
    integer wait
    printf(1, "%s\n", {msg})
    wait = wait_key()
    abort(1)
end procedure
--------------------------------
object data, buffer, test, test1, data1
buffer = {}
-------------------Create a large test sequence
data1 = {}
puts(1, " Starting test:\n")
puts(1, "\nCreating test sequence\n")
data = repeat("a ", 1000)        --create a large sequence
for i = 1 to length(data) do
    data1 = data1 & data[1]      -- these two lines together
    data1 = data1 & data[i]      -- double the sequence
end for
----------------
puts(1, "\nParsing test sequence\n")
buffer = parse(data1, 32)
puts(1, "\nDeparsing (G.Boehme) test sequence\n")
test = deparse_old(buffer, 32)
puts(1, "\nDeparsing (Jordah Ferguson) test sequence\n")
test1 = deparse_new(buffer, 32)
-------------------
wait_abort(sprintf("Length of parsed sequence used for test: %d", length(buffer)))

jordah Ferguson
jorfergie03 at yahoo.com
2. Re: faster deparse() for string library strtok.e
- Posted by petelomax at blueyonder.co.uk Apr 02, 2002
- 435 views
On Tue, 2 Apr 2002 13:42:43 +0000, jordah ferguson <jorfergie03 at yahoo.com> wrote:

That's a huge, and exponential, difference. At 800, it's about 100 times slower. At 4,000 it is over 1,000 times slower.

Changing
> s =s&c&list[i] ---///area of error
to
> s =s&c
> s=s&list[i]
should not cause such a huge difference, surely.

Given that the latter finishes in 0.007 seconds, the same number of iterations for the former should definitely not take 11.6 seconds. To process a sequence 5 times as long, the latter takes, as expected, about 5 times as long; the former, however, takes about 47 times as long.

Any thoughts, Rob?

Pete
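To check the scaling Pete describes, the two loop bodies can be timed side by side at more than one size. The following is only a rough sketch: the sizes 800 and 4,000 are borrowed from the figures in the post on the assumption that they are element counts, SIZES and the variable names are made up for illustration, and the timings will depend on the machine and interpreter version. It measures the difference; it does not explain it.

```euphoria
-- Hypothetical timing harness (not part of strtok.e).
-- Assumption: 800 and 4000 are element counts, as in the figures above.
constant SIZES = {800, 4000}

atom t0, t_old, t_new
sequence words, s

for k = 1 to length(SIZES) do
    words = repeat("a", SIZES[k])

    -- one-expression form, as in the original deparse()
    t0 = time()
    s = words[1]
    for i = 2 to length(words) do
        s = s & 32 & words[i]
    end for
    t_old = time() - t0

    -- two-statement form, as in the modified version
    t0 = time()
    s = words[1]
    for i = 2 to length(words) do
        s = s & 32
        s = s & words[i]
    end for
    t_new = time() - t0

    printf(1, "%d words: one expression %.3f s, two statements %.3f s\n",
           {SIZES[k], t_old, t_new})
end for
```

If the one-expression time grows much faster than the word count while the two-statement time grows roughly in step with it, that reproduces the behaviour reported here. Note that time() has coarse resolution on some platforms, so the smaller figure may print as 0.000.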
3. Re: faster deparse() for string library strtok.e
- Posted by petelomax at blueyonder.co.uk Apr 02, 2002
- 463 views
On Tue, 2 Apr 2002 12:47:17 -0500, Matthew Lewis <matthewwalkerlewis at YAHOO.COM> wrote:

Oh Matt, the perils & embarrassment of posting untested code...

Your routine, when passed {"I","did","not","test","this"}, returns "Ididnottestthis" rather than "I did not test this".

Anyway, it works fine with:
> ix += len
> end for
> return s[1..ix-1]
replaced by:
ix += len+1
end for
return s[1..ix-2]

I suspect there's a nasty slice error waiting there though ([0..-1]); it still needs an if len/else return "" clause, I think.

Also, rather than pick slen=256, I would suggest guessing the average length of a word and pre-allocating for that, then extending by the length of this word + av.length * remaining words if necessary.

Pete
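Matt's routine is not quoted in this message, so the following is a reconstruction under assumptions rather than his code: a minimal sketch of a pre-allocating deparse that applies the corrections above (ix += len+1, return s[1..ix-2], and an early return "" for an empty list). The name deparse_prealloc, the variable avg, and the choice of the first word's length as the average-length guess are all hypothetical.

```euphoria
-- Hypothetical sketch; Matt's original routine is not shown in this thread.
function deparse_prealloc(sequence list, integer c)
    sequence s, word
    integer ix, len, need, avg

    if length(list) = 0 then
        return ""                    -- guard against the [0..-1] slice
    end if

    -- Assumption: the first word (plus one delimiter) stands in for
    -- the average word length.
    avg = length(list[1]) + 1

    -- Pre-fill with the delimiter so the gaps between words are already set.
    s = repeat(c, avg * length(list))

    ix = 1
    for i = 1 to length(list) do
        word = list[i]
        len = length(word)
        need = ix + len              -- slot for the delimiter after this word
        if need > length(s) then
            -- extend by what this word needs plus avg for each remaining word
            s = s & repeat(c, need - length(s) + avg * (length(list) - i))
        end if
        s[ix..ix+len-1] = word
        ix += len + 1                -- skip past the word and its delimiter
    end for

    return s[1..ix-2]                -- drop the trailing delimiter
end function

puts(1, deparse_prealloc({"I","did","not","test","this"}, ' ') & '\n')
-- prints: I did not test this
```

Because the buffer is filled with the delimiter up front, only the words themselves are copied in; how much this helps over plain concatenation would need to be measured rather than assumed.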