1. Fast "locate" function
		
		
Any elegant coder out there have a fast "locate" function, similar to "find",
but one that returns the location of a defined byte sequence inside a large
byte sequence?
As in, something without a "compare" or "match" inside a loop...
"find" seems to be the answer, but it does not work for me. Example:
sequence s1, s2
integer i1
s1 = {#00,#00,#00,#30}
s2 = {#27,#20,#00,#00,#00,#30,#20,#40}
i1 = find(s1,s2)
printf(1,"Marker found at location %d\n",i1)
-- i1 should be 3, but it's 0, as in nothing found.
I'm attempting to extract some 27K records from a 6M (mainframe) file by
looking for s1.
The records are variable length, so my current (slow) solution is a nested
loop containing "compare". The only consistency in my data file is the byte
sequence s1.
Any help would be appreciated.
Alan
		
	 
	
		
		2. Re: Fast "locate" function
		
			- Posted by euman at bellsouth.net
			Dec 06, 2001
This is the best I've seen in a while...
 
function find_all_2(object test, sequence data)
        integer ix, jx, len
        sequence result
        result = {}
        ix = 1
        len = length(data)
        jx = find( test, data[ix..len] )
        while jx do
                result &= ix+jx-1
                ix += jx
                jx = find( test, data[ix..len] )
        end while
        return result
end function
Thank Matt Lewis for this one.
If this isn't what you're looking for, or you want to look at some other
interesting twists on the same subject, search the earlier posts on the
mailing list for the keyword find_all.
Euman
euman at bellsouth.net
		
	 
	
		
		3. Re: Fast "locate" function
		
			- Posted by euman at bellsouth.net
			Dec 06, 2001
> sequence s1, s2
> integer i1
> s1 = {#00,#00,#00,#30}
> s2 = {#27,#20,#00,#00,#00,#30,#20,#40}
> i1 = find(s1,s2)
> printf(1,"Marker found at location %d\n",i1)
> -- i1 should be 3, but its 0, as in nothing found.
BTW, I would use what I just sent you to look for s1[1] (#00); wherever it
turns up, step through s1 and s2 together to check the rest of the marker.
Here's how I use find_all in one of my projects. I'm looking for 0 (zero);
when I find one, I test the next element in the sequence to see if it's also 0
before I proceed. Maybe not the fastest or most appropriate method, but this
hasn't failed me.
The good thing about this routine is that each find( ) call only searches from
just past the previous hit, which should be faster than restarting from the
beginning on large sequences of data.
function find_all(object test, sequence data)
integer ix, jx, len
sequence result
  result = {}
  ix = 1
  len = length(data)
  jx = find( test, data[ix..len] )
  while jx do
     result &= {data[ix..ix+jx-2]}
     ix += jx
     jx = find( test, data[ix..len])
     if ix < len and data[ix] = 0 then
        ix += 1
        jx = find( test, data[ix..len])
     end if 
  end while
  return result
end function
		
	 
	
		
		4. Re: Fast "locate" function
		
		
Howzat Alan,
have you tried the match() function?
  sequence s1, s2
  integer i1
  s1 = {#00,#00,#00,#30}
  s2 = {#27,#20,#00,#00,#00,#30,#20,#40}
  i1 = match(s1,s2)
  printf(1,"Marker found at location %d\n",i1)
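For anyone skimming the thread: as far as I understand it, find(x, s) looks for x as one single element of s, while match(s1, s2) looks for s1 as a run of consecutive elements (a slice) of s2. That would explain the 0 in the original example. A minimal sketch of the difference; the names marker, data and nested are made up for illustration:

```euphoria
sequence marker, data, nested
marker = {#00,#00,#00,#30}
data   = {#27,#20,#00,#00,#00,#30,#20,#40}

-- find() compares marker, as a whole, against each element of data.
-- Every element of data is an atom, so none of them equals marker.
printf(1, "find:  %d\n", find(marker, data))            -- prints 0

-- match() looks for marker as a slice of data.
printf(1, "match: %d\n", match(marker, data))           -- prints 3

-- find() only succeeds when the sequence is itself one element.
nested = {#27, marker, #40}
printf(1, "find in nested: %d\n", find(marker, nested)) -- prints 2
```

So for a byte marker inside a flat byte sequence, match() is the routine to reach for.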
		
	 
	
		
		5. Re: Fast "locate" function
		
		
Hi...
ahem... yes, the match does indeed work...
After some RTFM, I had tried match first, but kept getting type check
errors, so I moved on to try "equal", "compare" etc., all of which involved
looping to get a slice of the large sequence.
My earlier assignments for s1 or s2 during the match attempts must have been
wrong, as per the error messages...
I feel real stuuupid about now...
Thanks Derek!
BTW, Derek, noticing your greeting, are you an ex-South African?
Alan
		
	 
	
		
		6. Re: Fast "locate" function
		
			- Posted by euman at bellsouth.net
			Dec 06, 2001
This is a shot in the dark, so be careful: I haven't tested this...
Let me know if it works...
> sequence s1, s2
> integer i1
> s1 = {#00,#00,#00,#30} -- constant data
> s2 = {#27,#20,#00,#00,#00,#30,#20,#40}
> i1 = find(s1,s2)
function find_all(object test, sequence data)
        integer ix, jx, kx, len
        sequence loc
        loc = repeat(0,length(data))  -- room for the worst case
        ix = 1  kx = 0
        len = length(data)
        jx = find( test, data[ix..len] )
        while jx do
                kx += 1
                loc[kx] = ix+jx-1  -- absolute position in data
                ix += jx
                jx = find( test, data[ix..len] )
        end while
        loc = loc[1..kx]  -- drop the unused slots
        return loc
end function
sequence locations
locations = find_all(#00, s2)
		
	 
	
		
		7. Re: Fast "locate" function
		
			- Posted by euman at bellsouth.net
			Dec 06, 2001
or, MATCH( )  is good!
either way.
Euman
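Putting the two suggestions in this thread together, the find_all loop could probably be pointed at match() instead of find(), so that it collects every position of the whole marker. This is only a sketch: match_all is a made-up name, not a library routine, and it assumes the marker is a non-empty sequence.

```euphoria
-- match_all is a hypothetical helper, not part of the Euphoria library.
-- It returns the starting index of every occurrence of marker in data.
function match_all(sequence marker, sequence data)
    integer ix, jx, len
    sequence result
    result = {}
    ix = 1
    len = length(data)
    jx = match(marker, data[ix..len])
    while jx do
        result &= ix+jx-1   -- absolute position of this hit
        ix += jx            -- resume one element past the start of the hit
        jx = match(marker, data[ix..len])
    end while
    return result
end function

sequence s1, s2
s1 = {#00,#00,#00,#30}
s2 = {#27,#20,#00,#00,#00,#30,#20,#40}
? match_all(s1, s2)   -- prints {3}
```

One caveat: each data[ix..len] slice copies the rest of the sequence, so on a 6M file with many hits the copying may dominate. If your Euphoria version offers a way to start a match at a given index, that would likely be the better choice; check the docs for your release.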