1. Fast "locate" function
- Posted by Alan Oxley <fizzpop at icon.co.za> Dec 06, 2001
- 384 views
Any elegant coder out there have a fast "locate" function, similar to "find", but one that returns the location of a defined byte sequence inside a large byte sequence? As in, something without a "compare" or "match" inside a loop...

"find" seems to be the answer, but it does not work for me. Example:

    sequence s1, s2
    integer i1
    s1 = {#00,#00,#00,#30}
    s2 = {#27,#20,#00,#00,#00,#30,#20,#40}
    i1 = find(s1,s2)
    printf(1,"Marker found at location %d\n",i1)
    -- i1 should be 3, but it's 0, as in nothing found.

I'm attempting to extract some 27K records from a 6M (mainframe) file by looking for s1. The records are variable length, so my current (slow) solution is a nested loop containing "compare". The only consistency in my data file is the byte sequence s1.

Any help would be appreciated.

Alan
2. Re: Fast "locate" function
- Posted by euman at bellsouth.net Dec 06, 2001
- 360 views
This is the best I've seen in a while:

    function find_all_2(object test, sequence data)
        -- Return the position of every element of data that equals test.
        integer ix, jx, len
        sequence result
        result = {}
        ix = 1
        len = length(data)
        jx = find( test, data[ix..len] )
        while jx do
            result &= ix+jx-1    -- offset within the slice -> absolute index
            ix += jx             -- resume just past this hit
            jx = find( test, data[ix..len] )
        end while
        return result
    end function

Thank Matt Lewis for this one.

If this isn't what you're looking for, or you want to look at some other interesting twists on the same subject, search prior posts on the mailing list for the keyword find_all.

Euman
euman at bellsouth.net
3. Re: Fast "locate" function
- Posted by euman at bellsouth.net Dec 06, 2001
- 339 views
> sequence s1, s2
> integer i1
> s1 = {#00,#00,#00,#30}
> s2 = {#27,#20,#00,#00,#00,#30,#20,#40}
> i1 = find(s1,s2)
> printf(1,"Marker found at location %d\n",i1)
> -- i1 should be 3, but its 0, as in nothing found.

BTW, I would use what I just sent you and test for s1[1] (#00); wherever that is found, step through s1 and s2 at the same time to check the rest of the marker.

Here's how I use find_all in one of my projects. I'm looking for 0 (zero); if I find a 0, I test the next char in the sequence to see if it's also 0 before I proceed. Maybe not the fastest or most appropriate method, but this hasn't failed me.

The good thing about this routine is that each find( ) only searches from the last (test) encountered, which should be faster than other methods on large sequences of data.

    function find_all(object test, sequence data)
        -- Split data at each occurrence of test and return the pieces
        -- that precede each hit.
        integer ix, jx, len
        sequence result
        result = {}
        ix = 1
        len = length(data)
        jx = find( test, data[ix..len] )
        while jx do
            result &= {data[ix..ix+jx-2]}    -- the piece before this hit
            ix += jx                         -- resume just past the hit
            jx = find( test, data[ix..len])
            -- if the element right after the hit is also 0, skip over it
            -- (0 is hard-coded here because that is what I search for)
            if ix < len and data[ix] = 0 then
                ix += 1
                jx = find( test, data[ix..len])
            end if
        end while
        return result
    end function
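The idea described in this post (look for the first byte of the marker, then check the rest in step) can be sketched roughly as follows. This is only an illustration, not code from the thread: the name `locate_scan` is made up, and it assumes that both `marker` and `data` are flat sequences of integers (bytes) and that `marker` is not empty.

```euphoria
-- Hypothetical helper, not from the thread: returns the position of
-- every occurrence of marker inside data.
-- Assumes marker is non-empty and both sequences hold only integers.
function locate_scan(sequence marker, sequence data)
    integer m, first, last
    sequence result
    result = {}
    m = length(marker)
    first = marker[1]
    last = length(data) - m + 1      -- last index where marker could start
    for i = 1 to last do
        if data[i] = first then
            -- first byte agrees, so compare the whole candidate slice
            if equal(marker, data[i..i+m-1]) then
                result &= i
            end if
        end if
    end for
    return result
end function

sequence s1, s2
s1 = {#00,#00,#00,#30}
s2 = {#27,#20,#00,#00,#00,#30,#20,#40}
? locate_scan(s1, s2)    -- prints {3}
```

Only short slices of length(marker) are taken here, and only at positions where the first byte already agrees, so the large sequence is never copied.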
4. Re: Fast "locate" function
- Posted by Derek Parnell <ddparnell at bigpond.com> Dec 06, 2001
- 352 views
Howzat Alan,

have you tried the match() function?

    sequence s1, s2
    integer i1
    s1 = {#00,#00,#00,#30}
    s2 = {#27,#20,#00,#00,#00,#30,#20,#40}
    i1 = match(s1,s2)
    printf(1,"Marker found at location %d\n",i1)
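For anyone reading this later, a short sketch of why find() returned 0 in the original example while match() works: find(x, s) compares x with each element of s, whereas match(x, s) looks for x as a run of consecutive elements (a slice) of s. The third call below is added purely for illustration and is not from the thread.

```euphoria
sequence s1, s2
s1 = {#00,#00,#00,#30}
s2 = {#27,#20,#00,#00,#00,#30,#20,#40}

-- Every element of s2 is a single byte, so none of them can equal
-- the four-element sequence s1.
printf(1, "find:  %d\n", find(s1, s2))                  -- prints 0

-- match() treats s1 as a slice to look for: s2[3..6] is equal to s1.
printf(1, "match: %d\n", match(s1, s2))                 -- prints 3

-- find() succeeds only when s1 is itself an element of the sequence.
printf(1, "find in nested: %d\n", find(s1, {s2, s1}))   -- prints 2
```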
5. Re: Fast "locate" function
- Posted by Alan Oxley <fizzpop at icon.co.za> Dec 06, 2001
- 350 views
Hi... ahem... yes, the match does indeed work...

After some RTFM, I had tried match first, but kept getting type check errors, so I moved on to try "equal", "compare" etc., all of which involved looping to get a slice of the large sequence. My earlier assignments for s1 or s2 during the match attempts must have been wrong, as per the error messages...

I feel real stuuupid about now... Thanks Derek!

BTW, Derek, noticing your greeting, are you an ex-South African?

Alan
6. Re: Fast "locate" function
- Posted by euman at bellsouth.net Dec 06, 2001
- 342 views
This is a shot in the dark, so be careful: I haven't tested this. Let me know if it works...

> sequence s1, s2
> integer i1
> s1 = {#00,#00,#00,#30} -- constant data
> s2 = {#27,#20,#00,#00,#00,#30,#20,#40}
> i1 = find(s1,s2)

    function find_all(object test, sequence data)
        -- Record the absolute position of each occurrence of test,
        -- then jump 4 elements (the length of s1) ahead from that hit.
        -- Note: this locates test itself, not the whole s1 marker.
        integer ix, jx, kx, len
        sequence loc
        len = length(data)
        loc = repeat(0, len)
        ix = 1
        kx = 0
        jx = find( test, data[ix..len] )
        while jx do
            kx += 1
            loc[kx] = ix + jx - 1    -- offset within the slice -> absolute index
            ix += jx + 3             -- skip the rest of the 4-byte marker
            if ix > len then
                exit                 -- nothing left to search
            end if
            jx = find( test, data[ix..len] )
        end while
        return loc[1..kx]            -- trim the unused slots
    end function

    sequence locations
    locations = find_all(#00, s2)
7. Re: Fast "locate" function
- Posted by euman at bellsouth.net Dec 06, 2001
- 350 views
Or match() is good! Either way.

Euman
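To tie the thread together, here is a rough sketch of how match() could be called repeatedly to collect every marker position rather than just the first one. It is not code posted by anyone above: the name `locate_all` is made up for illustration, and it assumes `marker` is a non-empty sequence and that occurrences of the marker do not overlap.

```euphoria
-- Hypothetical helper, not from the thread: returns the absolute
-- position of every non-overlapping occurrence of marker in data.
-- Assumes marker is a non-empty sequence.
function locate_all(sequence marker, sequence data)
    integer pos, hit, len
    sequence result
    result = {}
    pos = 1
    len = length(data)
    while pos <= len do
        hit = match(marker, data[pos..len])
        if hit = 0 then
            exit                          -- no further occurrences
        end if
        result &= pos + hit - 1           -- offset in slice -> absolute index
        pos += hit - 1 + length(marker)   -- resume just past this marker
    end while
    return result
end function

sequence s1, s2
s1 = {#00,#00,#00,#30}
s2 = {#27,#20,#00,#00,#00,#30,#20,#40}
? locate_all(s1, s2 & s2)    -- prints {3,11}
```

One caveat: as far as I know, each data[pos..len] slice is a fresh copy of the remaining tail, so on a 6M sequence with thousands of markers it may be worth reading and searching the file in smaller chunks instead.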