1. RE: Sequence Slicing (Was RE: Tough ciphers?)
- Posted by Derek Parnell <ddparnell at bigpond.com> Mar 14, 2002
- 433 views
Coders often have to slice off the head of a sequence after find()ing some specific item, so that the next find() is able to locate the next occurrence of the item. Life would be SO MUCH easier if there was a findfrom() routine in which we could specify the starting index. This would eliminate a whole class of slicing operations. eg. s = "a,i,ab,am,an,as,at,abe,..." t = ',' x = findfrom(1, t, s ) while x != 0 do ... x = findfrom(x + 1, t, s) end while ------ Derek. -----Original Message----- From: Robert Craig [mailto:rds at RapidEuphoria.com] Sent: Friday, 15 March 2002 4:47 To: EUforum Subject: Re: Sequence Slicing (Was RE: Tough ciphers?) C.K. Lester writes: > Anybody got any tips, tricks, or hints about faster slicing of > sequences? > > They are of the pattern > > x = x[a+1..length(x)] > > so, I'm assigning a slice of x back to x. > Should I assign it to another variable instead? In some cases it will be faster if you assign it to itself. Given the right conditions (single reference count, fairly large slice), it will not copy anything. It will simply adjust the start and end pointers of the sequence. > What can I do to make a significant difference, if anything? Obviously you are doing this slice statement many times inside some kind of loop. You'd have to ask yourself if that's really necessary. I don't think we can help you without knowing how this fits into your overall algorithm. Regards, Rob Craig Rapid Deployment Software http://www.RapidEuphoria.com
2. RE: Sequence Slicing (Was RE: Tough ciphers?)
- Posted by "C. K. Lester" <cklester at yahoo.com> Mar 15, 2002
- 429 views
Derek Parnell wrote: > Life would be SO MUCH easier if there was a findfrom() routine > in which we could specify the starting index. This would > eliminate a whole class of slicing operations. Derek, doesn't the following do what you want? s = "a,i,ab,am,an,as,at,abe,..." t = ',' x = find(t, s) lastx = x while x != 0 do -- ... x = find(t, s[lastx..length(s)]) lastx += x end while Something like that... I actually tried this... finding on a slice of the sequence... but it didn't seem to work any faster. :(
3. RE: Sequence Slicing (Was RE: Tough ciphers?)
- Posted by bensler at mail.com Mar 15, 2002
- 425 views
The problem that Derek is pointing out, is that you still have to slice your sequence to do consecutive finds. Chris C. K. Lester wrote: > > Derek Parnell wrote: > > Life would be SO MUCH easier if there was a findfrom() routine > > in which we could specify the starting index. This would > > eliminate a whole class of slicing operations. > > Derek, doesn't the following do what you want? > > s = "a,i,ab,am,an,as,at,abe,..." > t = ',' > x = find(t, s) > lastx = x > while x != 0 do > -- ... > x = find(t, s[lastx..length(s)]) > lastx += x > end while > > Something like that... > > I actually tried this... finding on a slice of the sequence... but it > didn't seem to work any faster. :( > >
4. RE: Sequence Slicing (Was RE: Tough ciphers?)
- Posted by "C. K. Lester" <cklester at yahoo.com> Mar 15, 2002
- 411 views
bensler at mail.com wrote: > The problem that Derek is pointing out, is that you still > have to slice your sequence to do consecutive finds. I guess what he's suggesting is "give us an optimized find on a slice," right? Rob, why not?! :)
5. RE: Sequence Slicing (Was RE: Tough ciphers?)
- Posted by Kat <gertie at PELL.NET> Mar 15, 2002
- 400 views
On 15 Mar 2002, at 19:21, C. K. Lester wrote: > > bensler at mail.com wrote: > > The problem that Derek is pointing out, is that you still > > have to slice your sequence to do consecutive finds. > > I guess what he's suggesting is "give us an optimized find on a slice," > right? > > Rob, why not?! :) Yeas, mirc has this!: $pos(text,string,N)Kat
6. RE: Sequence Slicing (Was RE: Tough ciphers?)
- Posted by bensler at mail.com Mar 15, 2002
- 417 views
This is fast and works on any slice of a sequence. Sanity checks could be added. function range_find(object x, sequence s, integer pos1, integer pos2) atom found found = 0 while not found and pos1 <= pos2 do found = equal(x,s[pos1]) * pos1 pos1 +=1 end while return found end function This is the test that I used. I wasn't concerned with accuracy, because the routine beats find() by a long shot. <TEST> include file.e integer fn,fl fn = open("words.txt","rb") fl = seek(fn,-1) fl = where(fn) if seek(fn,0) then end if sequence temp temp = repeat(0,fl) for i = 1 to fl do temp[i] = getc(fn) end for close(fn) function range_find(object x, sequence s, integer pos1, integer pos2) atom found found = 0 while not found and pos1 <= pos2 do found = equal(x,s[pos1]) * pos1 pos1 +=1 end while return found end function atom t t=time() integer count count = 0 integer f,found f=0 found=find('\n',temp) while found != 0 do count +=1 f = found+1 found = range_find('\n',temp,f,length(temp)) end while ? f ? count ? time()-t while get_key()=-1 do end while <END TEST> The control test used: found = find('\n',temp[f..length(temp)] instead of: found = range_find('\n',temp,f,length(temp)) I didn't try with shorter sequences. Chris C. K. Lester wrote: > bensler at mail.com wrote: > > The problem that Derek is pointing out, is that you still > > have to slice your sequence to do consecutive finds. > > I guess what he's suggesting is "give us an optimized find on a slice," > right? > > Rob, why not?! :) > >
7. RE: Sequence Slicing (Was RE: Tough ciphers?)
- Posted by Rod Jackson <rodjackson_x at hotmail.com> Mar 15, 2002
- 426 views
??? Am I reading this right? Are you saying that this routine will actually find elements in a sequence faster than taking a slice of the sequence and using find() to search the slice? Rod Jackson bensler at mail.com wrote: > This is fast and works on any slice of a sequence. Sanity checks could > be added. > > function range_find(object x, sequence s, integer pos1, integer pos2) > atom found > found = 0 > while not found and pos1 <= pos2 do > found = equal(x,s[pos1]) * pos1 > pos1 +=1 > end while > return found > end function > > This is the test that I used. I wasn't concerned with accuracy, because > the routine beats find() by a long shot. > > <TEST> <snip> > <END TEST> > > The control test used: > found = find('\n',temp[f..length(temp)] > > instead of: > found = range_find('\n',temp,f,length(temp))
8. RE: Sequence Slicing (Was RE: Tough ciphers?)
- Posted by bensler at mail.com Mar 15, 2002
- 399 views
If I did my test correctly, and you're repetitively searching slices of a sequence, yes. Because of your question, I did a bit of further testing. I've discovered, the turning point between the benefit of find() vs range_find() is somewhere between 200 to 300 elements. At 300 elements, iterated 1000000/fl times, range_find() beats find() with 2.09 vs 2.58 consistently. The higher the number of elements, the larger the difference. I haven't tested with nested sequences, or sequences with varied data. Here are both complete tests: <CONTROL TEST> constant fl=300 integer fn fn = open("words.txt","rb") sequence temp temp = repeat(0,fl) for i = 1 to fl do temp[i] = getc(fn) end for close(fn) atom t t=time() integer count count = 0 integer f,found for i = 1 to 10000000/fl do f=1 found=find('\n',temp) while found != 0 do count +=1 f +=found found = find('\n',temp[f..length(temp)]) end while end for ? f ? count ? time()-t while get_key()=-1 do end while <END CONTROL TEST> <range_find() TEST> constant fl = 300 integer fn fn = open("words.txt","rb") sequence temp temp = repeat(0,fl) for i = 1 to fl do temp[i] = getc(fn) end for close(fn) function range_find(object x, sequence s, integer pos1, integer pos2) atom found found = 0 while not found and pos1 <= pos2 do found = equal(x,s[pos1]) * pos1 pos1 +=1 end while return found end function atom t t=time() integer count count = 0 integer f,found for i =1 to 10000000/fl do f=0 found=find('\n',temp) while found != 0 do count +=1 f =found+1 found = range_find('\n',temp,f,length(temp)) end while end for ? f ? count ? time()-t while get_key()=-1 do end while <END range_find() TEST> Chris Rod Jackson wrote: > ??? > > Am I reading this right? Are you saying that this routine > will actually find elements in a sequence faster than > taking a slice of the sequence and using find() to search > the slice? > > Rod Jackson > > > bensler at mail.com wrote: > > This is fast and works on any slice of a sequence. Sanity checks could > > be added. > > > > function range_find(object x, sequence s, integer pos1, integer pos2) > > atom found > > found = 0 > > while not found and pos1 <= pos2 do > > found = equal(x,s[pos1]) * pos1 > > pos1 +=1 > > end while > > return found > > end function > > > > This is the test that I used. I wasn't concerned with accuracy, because > > the routine beats find() by a long shot. > > > > <TEST> > <snip> > > <END TEST> > > > > The control test used: > > found = find('\n',temp[f..length(temp)] > > > > instead of: > > found = range_find('\n',temp,f,length(temp)) > >
9. RE: Sequence Slicing (Was RE: Tough ciphers?)
- Posted by bensler at mail.com Mar 15, 2002
- 406 views
I should point out, that using sequences of length < 200, range_find() is drastically slower. Using 200 elements or less, range_find() clocks in at 2 seconds. It doesn't seem to drop less than 2 secs. fastest time I got for find() was 0.68 using 10 elements. Chris bensler at mail.com wrote: > If I did my test correctly, and you're repetitively searching slices of > a sequence, yes. > > Because of your question, I did a bit of further testing. I've > discovered, the turning point between the benefit of find() vs > range_find() is somewhere between 200 to 300 elements. > > At 300 elements, iterated 1000000/fl times, range_find() beats find() > with 2.09 vs 2.58 consistently. The higher the number of elements, the > larger the difference. > > I haven't tested with nested sequences, or sequences with varied data. > > Here are both complete tests: > <CONTROL TEST> > constant fl=300 > integer fn > fn = open("words.txt","rb") > > sequence temp temp = repeat(0,fl) > > for i = 1 to fl do > temp[i] = getc(fn) > end for > > close(fn) > > atom t t=time() > integer count count = 0 > integer f,found > for i = 1 to 10000000/fl do > f=1 > found=find('\n',temp) > while found != 0 do > count +=1 > f +=found > found = find('\n',temp[f..length(temp)]) > end while > end for > > ? f > ? count > ? time()-t > while get_key()=-1 do end while > <END CONTROL TEST> > > <range_find() TEST> > constant fl = 300 > integer fn > fn = open("words.txt","rb") > > sequence temp temp = repeat(0,fl) > > for i = 1 to fl do > temp[i] = getc(fn) > end for > > close(fn) > > function range_find(object x, sequence s, integer pos1, integer pos2) > atom found > found = 0 > while not found and pos1 <= pos2 do > found = equal(x,s[pos1]) * pos1 > pos1 +=1 > end while > return found > end function > > atom t t=time() > integer count count = 0 > integer f,found > for i =1 to 10000000/fl do > f=0 > found=find('\n',temp) > while found != 0 do > count +=1 > f =found+1 > found = range_find('\n',temp,f,length(temp)) > end while > end for > > ? f > ? count > ? time()-t > while get_key()=-1 do end while > <END range_find() TEST> > > Chris > > Rod Jackson wrote: > > ??? > > > > Am I reading this right? Are you saying that this routine > > will actually find elements in a sequence faster than > > taking a slice of the sequence and using find() to search > > the slice? > > > > Rod Jackson > > <snip>
10. RE: Sequence Slicing (Was RE: Tough ciphers?)
- Posted by bensler at mail.com Mar 15, 2002
- 408 views
I just made range_find() even faster. They even out at 100 elements now, clocking at 1.27 secs. More than 100 elements, and range_find() has the benefit. Here it is: function range_find(object x, sequence s, integer pos1, integer pos2) integer pos pos = pos1 while pos <= pos2 do if equal(x,s[pos]) then exit end if pos +=1 end while if pos > pos2 then return 0 elsif pos = pos1 then return 0 else return pos end if end function Chris bensler at mail.com wrote: > I should point out, that using sequences of length < 200, range_find() > is drastically slower. Using 200 elements or less, range_find() clocks > in at 2 seconds. It doesn't seem to drop less than 2 secs. > fastest time I got for find() was 0.68 using 10 elements. > > Chris > > bensler at mail.com wrote: > > If I did my test correctly, and you're repetitively searching slices of > > a sequence, yes. > > > > Because of your question, I did a bit of further testing. I've > > discovered, the turning point between the benefit of find() vs > > range_find() is somewhere between 200 to 300 elements. > > > > At 300 elements, iterated 1000000/fl times, range_find() beats find() > > with 2.09 vs 2.58 consistently. The higher the number of elements, the > > larger the difference. > > > > I haven't tested with nested sequences, or sequences with varied data. > > > > Here are both complete tests: > > <CONTROL TEST> > > constant fl=300 > > integer fn > > fn = open("words.txt","rb") > > > > sequence temp temp = repeat(0,fl) > > > > for i = 1 to fl do > > temp[i] = getc(fn) > > end for > > > > close(fn) > > > > atom t t=time() > > integer count count = 0 > > integer f,found > > for i = 1 to 10000000/fl do > > f=1 > > found=find('\n',temp) > > while found != 0 do > > count +=1 > > f +=found > > found = find('\n',temp[f..length(temp)]) > > end while > > end for > > > > ? f > > ? count > > ? time()-t > > while get_key()=-1 do end while > > <END CONTROL TEST> > > > > <range_find() TEST> > > constant fl = 300 > > integer fn > > fn = open("words.txt","rb") > > > > sequence temp temp = repeat(0,fl) > > > > for i = 1 to fl do > > temp[i] = getc(fn) > > end for > > > > close(fn) > > > > function range_find(object x, sequence s, integer pos1, integer pos2) > > atom found > > found = 0 > > while not found and pos1 <= pos2 do > > found = equal(x,s[pos1]) * pos1 > > pos1 +=1 > > end while > > return found > > end function > > > > atom t t=time() > > integer count count = 0 > > integer f,found > > for i =1 to 10000000/fl do > > f=0 > > found=find('\n',temp) > > while found != 0 do > > count +=1 > > f =found+1 > > found = range_find('\n',temp,f,length(temp)) > > end while > > end for > > > > ? f > > ? count > > ? time()-t > > while get_key()=-1 do end while > > <END range_find() TEST> > > > > Chris > > > > Rod Jackson wrote: > > > ??? <snip>
11. RE: Sequence Slicing (Was RE: Tough ciphers?)
- Posted by Kat <gertie at PELL.NET> Mar 15, 2002
- 423 views
On 16 Mar 2002, at 0:19, bensler at mail.com wrote: > > I should point out, that using sequences of length < 200, range_find() > is drastically slower. Using 200 elements or less, range_find() clocks > in at 2 seconds. It doesn't seem to drop less than 2 secs. > fastest time I got for find() was 0.68 using 10 elements. In the spirit of a smart interpreter, how about if Eu switched algorithms in find() at the optimum sequence length? Kat