Re: Interesting Experiment With String/Sequence Slicing
- Posted by slie at theage.fairfax.com.au Aug 21, 2001
Hi Robert,

I have tried what you suggested, but I have come across more problems that I hadn't anticipated. The strings I am searching for are tags, e.g. <TD> ** NAME ** </TD>, and I run into problems when </TD> or part of it occurs on the next line. So I will still have to read all the text into one big sequence and then do the string slicing and chopping.

I suppose I can live with slicing, but I am wondering whether there could be a function similar to slicing that only works on strings and is much faster because it moves blocks of memory rather than copying one sequence element to another, one at a time. Maybe that's not possible, though, because a sequence is not represented as one contiguous block of memory.

Today I also came across something very interesting.

example 1

    for i=1 to 1000 do
        sLine2 = sLine
        -- drop the first element: the slice holds length(sLine)-1 items
        sLine2 = sLine2[2..length(sLine2)]
    end for

example 2

    for i=1 to 1000 do
        sLine2 = sLine
        -- take only the last 11 items
        sLine3 = sLine2[length(sLine2)-10..length(sLine2)]
    end for

Example 2 was faster than example 1 by a factor of 100, so my thought was that example 1 was slower because it had to copy more items when doing the slicing. But then I thought, why is it copying items at all? As Derek suggested, in the case of example 1 we could just blank out/remove the first sequence element and return sLine2 as is. If it is in fact copying, then the term 'slicing' cannot be taken literally, because it is more like a copy operation. Flame me if I am wrong.

regards,
sam lie
Down Under, Australia

    t = time()
    for i=1 to 1000 do
        sLine2 = sLine
        sLine3 = sLine2[length(sLine2)-10..length(sLine2)]

----- Original Message -----
From: "Robert Craig" <rds at RapidEuphoria.com>
To: "EUforum" <EUforum at topica.com>
Sent: Wednesday, August 22, 2001 2:00 AM
Subject: Re: Interesting Experiment With String/Sequence Slicing

>
> Sam Lie writes:
> > I have run the profiler and the bottleneck still
> > shows up when doing string slicing.
>
> I gather from your earlier private e-mail that you are reading huge
> strings from a file and running match() on them.
> Or perhaps the string *is* the whole file?
>
> Since match() starts at the beginning of a sequence,
> and stops when it encounters a match, I gather
> you are slicing the 50K strings to look for further matches.
>
> Try reading and searching one line at a time using gets().
> You won't be copying a huge string when you have a match,
> and you'll make better use of the Pentium on-chip cache,
> e.g.
>     object line
>     integer fn
>
>     fn = open("myfile.html", "r")
>     while 1 do
>         line = gets(fn)
>         if atom(line) then
>             exit
>         end if
>         if match("foobar", line) then
>             .....
>         end if
>     end while
>     close(fn)
>
> After doing the above, if you are still convinced that
> slicing is the culprit, you could write your own match()
> that locates multiple occurrences without slicing.
> (There may be an example in the mailing list archives.)
>
> Regards,
>    Rob Craig
>    Rapid Deployment Software
>    http://www.RapidEuphoria.com
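
For anyone following the thread, here is a rough sketch of the "write your own match() that locates multiple occurrences without slicing" idea. It is not Rob's code and not a library routine: match_at() and match_all() are hypothetical names made up for this illustration, and the sample page text is invented. The sketch assumes start is at least 1 and that matches should not overlap. Because it only indexes into the big sequence and never slices it, it should also cope with a file read into one sequence, where a tag can straddle a line break.

```euphoria
-- Hypothetical helper: return the first index >= start at which pat
-- occurs in s, or 0 if there is none. s is only indexed, never sliced.
function match_at(sequence pat, sequence s, integer start)
    integer n, last_start, j
    n = length(pat)
    if n = 0 then
        return 0                        -- treat an empty pattern as "no match"
    end if
    last_start = length(s) - n + 1      -- last index where pat could still fit
    for i = start to last_start do
        j = 0
        while j < n do
            if not equal(s[i+j], pat[j+1]) then
                exit                    -- mismatch: leave the inner loop
            end if
            j = j + 1
        end while
        if j = n then
            return i                    -- all n elements matched
        end if
    end for
    return 0
end function

-- Hypothetical helper: collect the start index of every
-- non-overlapping occurrence of pat in s.
function match_all(sequence pat, sequence s)
    sequence hits
    integer pos
    hits = {}
    pos = match_at(pat, s, 1)
    while pos do
        hits = append(hits, pos)
        pos = match_at(pat, s, pos + length(pat))
    end while
    return hits
end function

-- Made-up sample text; note the second closing tag is split by a newline.
sequence page
page = "<TR><TD> ** NAME ** </TD>\n<TD> ** AGE ** </\nTD></TR>"
? match_all("<TD>", page)    -- prints {5,27}
```

The only growing sequence here is the short list of hit positions, so the cost per search no longer depends on copying the remainder of the 50K string after each match.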