1. find if any members of a set are in another set (blank line finder)?
- Posted by Dan Moyer <DANIELMOYER at prodigy.net> Jul 18, 2002
- 413 views
I want to be able to discern whether a sequence (a line of text) is "empty" (i.e., has *only* spaces, tabs, CRs, or any combination of those), or whether it has *any* alpha/num/punctuation content at all. In other words, a "blank" line finder. And it needs to function as quickly as possible.

I thought there might be a spiffy way similar to how:

w = {1, 2, 3} = {1, 2, 4} gives: w = {1,1,0}

and then I could make a sequence of the numbers 33-126 (for all the al/num/punc), and test any line against that sequence with an "or" in place of the "="; but as a test,

w = {1, 2, 3} or {1, 2, 4} gives me: w = {1,1,1}, which I don't understand. (I'm thinking it means that neither 3 nor 4 is zero.)

So, is there some way to find if any member of a given set is found in another set, or some different, good (fast) way to find "blank" lines?

Dan Moyer
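A minimal sketch of what the two expressions in the question do, assuming a standard Euphoria interpreter. Both "=" and "or" work element by element on sequences of equal length, and "or" treats any non-zero value as true, so it reports truth values rather than set membership. The third case is an extra illustration, not from the original post.

```euphoria
-- Sketch: "=" and "or" applied to whole sequences.
sequence w

w = {1, 2, 3} = {1, 2, 4}   -- element-by-element comparison
? w                         -- prints {1,1,0}

w = {1, 2, 3} or {1, 2, 4}  -- element-by-element logical OR
? w                         -- prints {1,1,1}: each pair has a non-zero member

w = {0, 2, 0} or {0, 0, 0}  -- illustrative extra case
? w                         -- prints {0,1,0}: only the middle pair is non-zero
```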
2. Re: find if any members of a set are in another set (blank line finder)?
- Posted by Dan Moyer <DANIELMOYER at prodigy.net> Jul 18, 2002
- 412 views
I haven't tested this yet against text read in from a file, but it seems like it *might* work. Does anyone have anything better/faster? When I put a tab in sequence "a" below, it didn't like that, but I think a tab in a file might be different?

Dan Moyer

<code begins>
sequence a, b
a = " "
b = "this is a line of text!"

function IsBlankLine(sequence aLine)
    sequence w, x
    w = aLine > 32   -- 1 for each character above space (32)
    x = aLine < 127  -- 1 for each character up to '~' (126)
    w = w and x      -- 1 only where a character is in 33..126

    if find(1, w) then
        return 0 -- no, is *not* blank line
    else
        return 1 -- yes, is blank line
    end if
end function

if IsBlankLine(a) = 1 then
    puts(1, "yes, \"" & a & "\" is a blank line")
else
    puts(1, "no, \"" & a & "\" is not a blank line")
end if
puts(1, "\n")
if IsBlankLine(b) = 1 then
    puts(1, "yes, \"" & b & "\" is a blank line")
else
    puts(1, "no, \"" & b & "\" is not a blank line")
end if
<code ends>
3. Re: find if any members of a set are in another set (blank line finder)?
- Posted by Kat <gertie at PELL.NET> Jul 18, 2002
- 414 views
On 18 Jul 2002, at 5:59, Dan Moyer wrote:

> I haven't tested this yet against text read in from a file, but it seems
> like it *might* work, does anyone have any thing better/faster? When I put a
> tab in sequence "a" below, it didn't like that, but I think a tab in a file
> might be different?

A tab is ASCII 9; it doesn't matter where you get it from.

This is a lil shorter than what you posted:

include strtok-v2.e

line = " " -- or whatever
junk = parse(line,32) -- 32 is ascii for blank, or ' ' or " "
if length(junk) = 0
then -- it's blank
else -- it's not blank
end if

Here is a way to see if there is any member of a set in a string:

punctset = "-_^`'<>][?*\\/}{+&^@!~|;:,."
line = "a b@Cat.d"

junk = parse(line,punctset)
if length(junk) > 1
then -- length is how many of punctset in line
else -- none of punctset is in line
end if

Kat
4. Re: find if any members of a set are in another set (blank line finder)?
- Posted by cetaylor at compuserve.com Jul 18, 2002
- 403 views
Dan,

Sometimes the direct approach is the simplest and fastest:
---
type blank_line(sequence s)
    for i = 1 to length(s) do
        if not find(s[i], " \t\n") then
            return 0
        end if
    end for
    return 1
end type
---

Colin Taylor
5. Re: find if any members of a set are in another set (blank line finder)?
- Posted by Dan Moyer <DANIELMOYER at prodigy.net> Jul 18, 2002
- 410 views
Colin,

I did think of that, but I haven't made user-defined types before, and when I read the manual, I thought it said that if your program encounters something in a user-typed variable that isn't the right type, the program *halts* and gives an error message. That wouldn't be what I wanted at all! Did I misunderstand the manual?

Dan
6. Re: find if any members of a set are in another set (blank line finder)?
- Posted by Dan Moyer <DANIELMOYER at prodigy.net> Jul 18, 2002
- 404 views
Thanks Matt,

I'm gonna have to try your method one to see what it does; I don't get it on just reading it (ditto the one-liner variation). I'd considered your last "test each character in the line against the white_space set" method, but had just assumed it would be slower than some kind of "test whole set against whole set" method.

Dan

----- Original Message -----
From: "Matthew Lewis" <matthewwalkerlewis at YAHOO.COM>
To: "EUforum" <EUforum at topica.com>
Sent: Thursday, July 18, 2002 5:43 AM
Subject: RE: find if any members of a set are in another set (blank line finder)?

> > -----Original Message-----
> > From: Dan Moyer [mailto:DANIELMOYER at prodigy.net]
> >
> > I want to be able to discern whether a sequence (a line of text) is "empty"
> > (ie, has *only* spaces or tabs or CR or any combination of those), or if it
> > has *any* alpha/num/punctuation content at all. In other words, a "blank"
> > line finder. And it needs to function as quickly as possible.
> >
> > I thought there might be a spiffy way similar to how:
> >
> > w = {1, 2, 3} = {1, 2, 4} gives: w={1,1,0}
>
> You could do it this way:
>
> -- your line is in sequence line
> constant white_space = { ' ', '\t', '\r', '\n' }
> sequence l
>
> l = repeat(0,length(line))
> for i = 1 to length( white_space ) do
>     l += ( line = white_space[i] )
> end for
>
> if find( 0, l ) then
>     -- something else in there
> end if
>
> Rewritten a-la Carl's one liners:
> not_blank = find(0, (line = ' ') + (line = '\t') + (line = '\r') + (line = '\n') )
>
> Functionally, but not necessarily speedwise equivalent:
> not_blank = find(0, (line = ' ') or (line = '\t') or (line = '\r') or (line = '\n') )
>
> > w = {1, 2, 3} or {1, 2, 4} gives me: w= {1,1,1}, which I don't understand.
> > (I'm thinking it means that neither 3 nor 4 are zero.)
>
> Here's why:
> w = { 1 or 1, 2 or 2, 3 or 4 } = {1,1,1}
>
> > So, is there some way to find if any member of a given set is found in
> > another set, or some different, good (fast) way to find "blank" lines?
>
> I often use this method:
>
> for i = 1 to length(line) do
>     if not find(line[i], white_space) then
>         -- found something else!
>     end if
> end for
>
> This would most likely be faster if the first non-whitespace character were
> toward the beginning of the line. You might also get different results
> depending on the length of the lines to be tested. While the first method
> is cool because it uses slick sequence math, I suspect that the last version
> will be the fastest for your purposes.
>
> Matt Lewis
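The two approaches in Matt's quoted reply could be wrapped up roughly as follows. This is only a sketch: the function names is_blank and is_blank_seq are made up for illustration, and no timing claims are implied. Which one is faster would have to be measured on real input.

```euphoria
-- Sketch: the per-character scan and the whole-sequence test side by side.
-- is_blank and is_blank_seq are hypothetical names, not from the thread.
constant white_space = " \t\r\n"

-- Early-exit scan: stops at the first non-whitespace character.
function is_blank(sequence line)
    for i = 1 to length(line) do
        if not find(line[i], white_space) then
            return 0
        end if
    end for
    return 1    -- only whitespace, or an empty line
end function

-- Whole-sequence variant: builds a 0/1 mask for the entire line,
-- then looks for a 0 (a character that matched none of the four).
function is_blank_seq(sequence line)
    return not find(0, (line = ' ') or (line = '\t') or (line = '\r') or (line = '\n'))
end function

? is_blank(" \t \r\n")      -- prints 1
? is_blank("   x  ")        -- prints 0
? is_blank_seq(" \t \r\n")  -- prints 1
? is_blank_seq("   x  ")    -- prints 0
```

The first version does no work past the first non-blank character, while the second always processes the full line four times, which is the trade-off Matt describes.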
7. Re: find if any members of a set are in another set (blank line finder)?
- Posted by Dan Moyer <DANIELMOYER at prodigy.net> Jul 18, 2002
- 419 views
Thanks Kat, I'll see if it's any faster than some of the other ways.

Tab: what I meant was, if you're typing characters into a sequence within your program and hit the "tab" key while doing so, then when your program reaches that sequence it will error and say something like, "use \t to put a tab character into a sequence". But if you have read a *file* into a sequence, and the file was a text file with a tab in it, I think it will have the correct tab value there.

Dan
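A small sketch of the tab point, assuming a standard Euphoria interpreter: writing the escape \t in source produces the same character value (9) that a tab read from a text file would have, so a blank-line test does not need to care where the line came from.

```euphoria
-- Sketch: a tab written as the escape \t is just the value 9.
sequence a
a = " \t "      -- use the escape rather than a literal tab keystroke
? a             -- prints {32,9,32}
? ('\t' = 9)    -- prints 1
```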
8. Re: find if any members of a set are in another set (blank line finder)?
- Posted by cetaylor at compuserve.com Jul 18, 2002
- 418 views
Dan,

A type can be used just like a function that returns TRUE or FALSE. If it makes you feel better, you can use a function call instead of a type call (just substitute the word "function" for "type" in the code). Use it like this:

if blank_line(line) then
    -- do this
else
    -- do that
end if

Testing character by character will usually be faster than testing the whole string, since the routine exits at the first non-whitespace character it finds.

Colin Taylor
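To make the distinction concrete, here is a sketch using Colin's blank_line type as posted. It assumes default type checking is on. Calling the type with a sequence argument just yields 0 or 1. The halt Dan read about in the manual happens only when a variable declared with that type is assigned a value that fails the test.

```euphoria
-- Sketch: a user-defined type called as a test vs. used in a declaration.
type blank_line(sequence s)
    for i = 1 to length(s) do
        if not find(s[i], " \t\n") then
            return 0
        end if
    end for
    return 1
end type

-- Called like a function on a sequence: returns 0 or 1, no error.
? blank_line("  \t")    -- prints 1
? blank_line("text")    -- prints 0

-- Declared as a variable's type: the check runs on every assignment.
blank_line b
b = "   "               -- passes the check
-- b = "text"           -- would stop the program with a type_check failure
```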