Re: Interesting Experiment With String/Sequence Slicing


Hi Robert,

I have tried what you suggested but have come
across more problems that I hadn't anticipated,
because the strings that I am searching for
are tag pairs, e.g. <TD> ** NAME ** </TD>,
and I run into problems when </TD> or part of
it occurs on the next line. So I will still have to
read all the text into one big sequence and then
do the string slicing and chopping. I suppose I can
very much live with slicing, but I am just wondering:
could there be a function similar to slicing that only
works on strings and is much faster because it moves blocks
of memory rather than copying one sequence element
to another, one at a time? However, maybe that's not possible
because a sequence is not represented as a
contiguous block of memory.
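
One way to avoid the slicing altogether would be to search the big sequence from a given offset instead of chopping off the part already searched. Below is a minimal sketch of that idea. The name match_at and its start parameter are hypothetical (this is not a built-in), and it assumes start is at least 1.

```euphoria
-- Hypothetical helper, not a built-in: works like match(), but begins
-- scanning at index start, so the caller never slices the haystack.
-- Assumes start >= 1. Returns the index of the match, or 0 if none.
function match_at(sequence needle, sequence haystack, integer start)
    integer last, found
    if length(needle) = 0 then
        return 0
    end if
    -- last index at which the needle could still fit
    last = length(haystack) - length(needle) + 1
    for i = start to last do
        found = 1
        for j = 1 to length(needle) do
            -- compare() returns 0 when the two elements are equal
            if compare(haystack[i+j-1], needle[j]) != 0 then
                found = 0
                exit
            end if
        end for
        if found then
            return i
        end if
    end for
    return 0
end function

-- Usage: report every position of </TD> in text read from a whole file.
-- The sample text is made up for illustration.
sequence text
integer pos
text = "<TD>a</TD>\n<TD>b</TD>"
pos = match_at("</TD>", text, 1)
while pos do
    printf(1, "%d\n", pos)
    pos = match_at("</TD>", text, pos + 1)
end while
```

Because only indexes are passed around, a tag that is split across two lines of the file is found just like any other, and no part of the big sequence is ever sliced off.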

Today I also came across something very interesting.

example 1
for i=1 to 1000 do
     sLine2 = sLine
     sLine2 = sLine2[2..length(sLine2)]
end for

example 2
for i=1 to 1000 do
     sLine2 = sLine
     sLine3 = sLine2[length(sLine2)-10..length(sLine2)]
end for

Example 2 was faster than example 1 by a factor of 100,
so my thought was that example 1 was slower because it
had to copy more items when doing the slicing. But then again, I
thought, why is it copying items at all? As Derek had suggested,
in the case of example 1 we could just blank out/remove the first
sequence element and return sLine2 as is. If the copying is really happening,
then the term 'slicing' cannot be taken literally, because it is
more like a copy operation. Flame me if I am wrong.
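
For anyone who wants to repeat the comparison, here is a rough timing sketch of the two examples side by side. The 50,000-character test string is an assumption chosen for illustration, and the explanation that the cost follows the size of the slice result is only my guess at what is happening.

```euphoria
-- Assumed test data: a 50,000-character string (size chosen for illustration)
sequence sLine, sLine2, sLine3
atom t

sLine = repeat('x', 50000)

-- example 1: the slice result has 49,999 elements
t = time()
for i = 1 to 1000 do
    sLine2 = sLine
    sLine2 = sLine2[2..length(sLine2)]
end for
printf(1, "example 1: %.2f seconds\n", time() - t)

-- example 2: the slice result has only 11 elements
t = time()
for i = 1 to 1000 do
    sLine2 = sLine
    sLine3 = sLine2[length(sLine2)-10..length(sLine2)]
end for
printf(1, "example 2: %.2f seconds\n", time() - t)
```

If the time really is spent building the slice result, then example 1 should get slower as sLine grows while example 2 should stay about the same.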

regards,
sam lie
Down Under, Australia



t = time()
for i=1 to 1000 do
 sLine2 = sLine
 sLine3 = sLine2[length(sLine2)-10..length(sLine2)]
end for


----- Original Message ----- 
From: "Robert Craig" <rds at RapidEuphoria.com>
To: "EUforum" <EUforum at topica.com>
Sent: Wednesday, August 22, 2001 2:00 AM
Subject: Re: Interesting Experiment With String/Sequence Slicing


> 
> Sam Lie writes:
> > I have run the profiler and the bottleneck still
> > shows up when doing string slicing. 
> 
> I gather from your earlier private e-mail that you are reading huge
> strings from a file and running match() on them.
> Or perhaps the string *is* the whole file?
> 
> Since match() starts at the beginning of a sequence,
> and stops when it encounters a match, I gather
> you are slicing the 50K strings to look for further matches.
> 
> Try reading and searching one line at a time using gets().
> You won't be copying a huge string when you have a match,
> and you'll make better use of the Pentium on-chip cache,
> e.g.
>       object line
>       integer fn
> 
>       fn = open("myfile.html", "r")
>       while 1 do
>             line = gets(fn)
>             if atom(line) then
>                 exit
>             end if
>             if match("foobar", line) then
>                 .....
>             end if
>       end while
>       close(fn)
> 
> After doing the above, if you are still convinced that 
> slicing is the culprit, you could write your own match() 
> that locates multiple occurrences without slicing. 
> (There may be an example in the mailing list archives.)
> 
> Regards,
>    Rob Craig
>    Rapid Deployment Software
>    http://www.RapidEuphoria.com
> 
> 
> 
> 
> 
> 



