1. help with storing user input
- Posted by Jason Dube <dubetyrant at hotmail.com> May 15, 2003
- 458 views
Hello,

What would be an efficient way of separating the words in a sentence typed by the user? I'm specifically trying to develop a GOOD algorithm to separate the words in a user-entered sentence and store them as individual sequences. Like:

user input: "mary had a little lamb"
results: sequence 1st_sentence = {"mary","had","a","little","lamb"}

I'm having difficulties skipping whitespace and converting to strings.

How would Euphoria do this?

while not end of line
    get one letter at a time until you see a whitespace
    store all letters previous to the encountered whitespace in 1st_sentence
    get one letter at a time until you encounter a whitespace
    append 1st_sentence with all letters previous to the whitespace as a new sequence, but not the letters previous to the first whitespace
end while

Can someone please help?
Thanks!!
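The loop described in the question can be sketched in Euphoria roughly as follows. This is only an illustration under a few assumptions: `split_words` is a hypothetical name, "whitespace" is taken to mean space, tab, carriage return and line feed, and the result is stored in `first_sentence` because Euphoria identifiers cannot start with a digit, so `1st_sentence` would not compile.

```euphoria
-- Hypothetical helper: split a line into words on spaces, tabs and line ends.
function split_words(sequence text)
    sequence words, word
    integer ch

    words = {}
    word = ""
    for i = 1 to length(text) do
        ch = text[i]
        if find(ch, " \t\r\n") then
            -- whitespace ends the current word, if there is one
            if length(word) > 0 then
                words = append(words, word)
                word = ""
            end if
        else
            word &= ch
        end if
    end for
    -- keep a final word that was not followed by whitespace
    if length(word) > 0 then
        words = append(words, word)
    end if
    return words
end function

object input_line
sequence first_sentence

puts(1, "Enter a sentence: ")
input_line = gets(0)        -- gets() returns -1 at end of file
if sequence(input_line) then
    first_sentence = split_words(input_line)
    -- print one word per line
    for i = 1 to length(first_sentence) do
        puts(1, first_sentence[i] & '\n')
    end for
end if
```

Because a word is only appended when it is non-empty, runs of several spaces and the trailing newline from `gets()` do not produce empty entries.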
2. Re: help with storing user input
- Posted by Derek Parnell <ddparnell at bigpond.com> May 15, 2003
- 433 views
----- Original Message -----
From: "Jason Dube" <dubetyrant at hotmail.com>
To: "EUforum" <EUforum at topica.com>
Subject: help with storing user input

> What would be an efficient way of separating the words in a sentence typed by the user?
> <snip>
> user input: "mary had a little lamb"
> results: sequence 1st_sentence = {"mary","had","a","little","lamb"}
> I'm having difficulties skipping whitespace and converting to strings.
>
> How would Euphoria do this?

Here are a couple of routines that I use...

-------------------------------------
global function Tokenize(sequence pText, object pWhiteSpace, object pNonword, object pQuotes) --> sequence
-------------------------------------
-- pText is returned as a sequence of 'words'.
-- Each word is delimited by a set of one or more Delimiters

    sequence lTokens
    integer lStartQuote, lEndQuote
    integer lTextLength
    integer lStart
    integer lPos

    -- Validate whitespace parameter
    if atom(pWhiteSpace) then
        if pWhiteSpace = 0 then
            pWhiteSpace = ' ' & 8 & 9 & 10 & 11 & 12 & 13
        else
            pWhiteSpace = {pWhiteSpace}
        end if
    end if

    -- Validate non-word parameter
    if atom(pNonword) then
        if pNonword = 0 then
            pNonword = "`~!@#$%^&*()_-+={[}]|\\:;\"'<,>.?/"
        else
            pNonword = {pNonword}
        end if
    end if

    -- Validate quote marks parameter
    if sequence(pQuotes) then
        if length(pQuotes) = 0 then
            pQuotes = {{},{},{},{},{}}
        elsif (length(pQuotes) != 5
               or atom(pQuotes[1])
               or atom(pQuotes[2])
               or atom(pQuotes[3])
               or length(pQuotes[1]) != length(pQuotes[2])
               or atom(pQuotes[4])
               or atom(pQuotes[5])
               or length(pQuotes[4]) != length(pQuotes[5])
              ) then
            pQuotes = 0
        end if
    end if

    if atom(pQuotes) then
        if pQuotes = 0 then
            pQuotes = {"\"'`", "\"'`", "\\~","",""}
        else
            pQuotes = {{pQuotes}, {pQuotes},{},{},{}}
        end if
    end if

    -- Initialize
    lTokens = {}
    lStart = 0
    lStartQuote = 0
    lEndQuote = 0

    for i = 1 to length(pText) do
        if lStartQuote != 0 then
            if pText[i] = lEndQuote then
                if find(pText[i - 1], pQuotes[3]) then
                    if i > 2 and find(pText[i - 2], pQuotes[3]) then
                        lTokens = append(lTokens, pText[lStart .. i - 1])
                        lStart = 0
                        lStartQuote = 0
                        lEndQuote = 0
                    end if
                else
                    lTokens = append(lTokens, pText[lStart .. i - 1])
                    lStart = 0
                    lStartQuote = 0
                    lEndQuote = 0
                end if
            end if
        else
            lPos = find(pText[i], pQuotes[1])
            if lPos != 0 then
                lStartQuote = lPos
                lStart = i + 1
                lEndQuote = pQuotes[2][lPos]
            elsif find( pText[i], pWhiteSpace ) then
                if lStart != 0 then
                    lTokens = append(lTokens, pText[lStart .. i - 1])
                    lStart = 0
                end if
            else
                if lStart = 0 then
                    lStart = i
                end if
                if find(pText[i], pNonword) > 0 then
                    if lStart != 0 then
                        -- Avoid empty tokens
                        if lStart != i then
                            lTokens = append(lTokens, pText[lStart .. i - 1])
                        end if
                        lStart = 0
                    end if
                    lTokens = append(lTokens, {pText[i]})
                    lStart = 0
                end if
            end if
        end if
    end for

    if lStart != 0 then
        lTokens = append(lTokens, pText[lStart .. length(pText)])
        lStart = 0
    end if

    return lTokens
end function

-------------------------------------
global function SimpleTokenize(sequence s, object c)
-------------------------------------
-- Returns 's', as a number of words delimited by one or more 'c' objects
    integer slen, spt, i
    sequence parsed

    parsed = {}
    slen = length(s)
    spt = 1
    i = 1
    while i <= slen do
        -- Skip any run of delimiters
        while i <= slen and equal(s[i], c) do
            i += 1
        end while
        spt = i
        -- Scan to the end of the word
        while i <= slen and not equal(s[i], c) do
            i += 1
        end while
        -- Guard against an empty word when 's' ends with delimiters
        if i > spt then
            parsed = append(parsed, s[spt..i-1])
        end if
        i += 1
    end while
    return parsed
end function

----------------
cheers,
Derek Parnell
3. Re: help with storing user input
- Posted by Jason Dube <dubetyrant at hotmail.com> May 15, 2003
- 420 views
Wow! Thank you! Definitely gonna look at this closely. I'd like to try to use it in my program. If I have some questions about it, is it OK if I ask?

>From: Derek Parnell <ddparnell at bigpond.com>
>Subject: Re: help with storing user input
>
>Here are a couple of routines that I use...
<snip>
4. Re: help with storing user input
- Posted by gertie at visionsix.com May 15, 2003
- 454 views
On 15 May 2003, at 0:55, Jason Dube wrote:

> user input: "mary had a little lamb"
> results: sequence 1st_sentence = {"mary","had","a","little","lamb"}
> <snip>
> How would Euphoria do this?

parsedline = parse(input," ")
5. Re: help with storing user input
- Posted by gertie at visionsix.com May 15, 2003
- 446 views
On 15 May 2003, at 2:37, gertie at visionsix.com wrote:

> On 15 May 2003, at 0:55, Jason Dube wrote:
> <snip>
> parsedline = parse(input," ")

You can also do:

parsedline = parse(input," ,.;:'")

or other punctuation. The problem with some of these shows up in math, like "1,234.5" with its comma and period, or in quoted speech, where ""blah", he said sadly" becomes {"blah","he","said","sadly"}, which carries much less info.

Kat
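The loss of information Kat mentions is easy to see with a small delimiter-set splitter. The sketch below is not Kat's `parse()` (its source is not shown in this thread); `split_on` is a hypothetical stand-in written only with built-ins, assuming any character in `delims` ends a word.

```euphoria
-- Hypothetical stand-in for a delimiter-set splitter; NOT Kat's parse().
function split_on(sequence text, sequence delims)
    sequence parts
    integer start

    parts = {}
    start = 0                       -- 0 means "not inside a word"
    for i = 1 to length(text) do
        if find(text[i], delims) then
            -- a delimiter closes the word in progress, if any
            if start != 0 then
                parts = append(parts, text[start .. i - 1])
                start = 0
            end if
        elsif start = 0 then
            start = i               -- first character of a new word
        end if
    end for
    if start != 0 then
        parts = append(parts, text[start .. length(text)])
    end if
    return parts
end function

sequence pieces
pieces = split_on("total: 1,234.5", " ,.;:'")
-- The number comes back in three pieces: "total", "1", "234", "5"
for i = 1 to length(pieces) do
    puts(1, pieces[i] & '\n')
end for
```

Treating the comma and period as delimiters is fine for prose but splits the number apart, which is why a tokenizer meant for mixed input usually needs extra rules for digits or quoted text.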
6. Re: help with storing user input
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> May 15, 2003
- 421 views
On Thu, 15 May 2003 16:13:05 +1000, Derek Parnell <ddparnell at bigpond.com> wrote:

<snip>

Interesting, two quick points:

>            if find(pText[i], pNonword) > 0 then
>                if lStart != 0 then
>                    -- Avoid empty tokens
>                    if lStart != i then
>                        lTokens = append(lTokens, pText[lStart .. i -
>1])
>                    end if
>                    lStart = 0
>                end if
>
^^^^^ it looks to me an "else" has gone walkabouts here.
>                lTokens = append(lTokens, {pText[i]})
>                lStart = 0
>            end if
>        end if
>    end if
> end for

2) I can't see they are used, what were pQuotes[4]&[5] supposed to be for? Just curious.

Pete
7. Re: help with storing user input
- Posted by Derek Parnell <ddparnell at bigpond.com> May 15, 2003
- 426 views
----- Original Message -----
From: "Pete Lomax" <petelomax at blueyonder.co.uk>
To: "EUforum" <EUforum at topica.com>
Subject: Re: help with storing user input

> Interesting, two quick points:
> <snip>
> >                    lStart = 0
> >                end if
> >
> ^^^^^ it looks to me an "else" has gone walkabouts here.
> >                lTokens = append(lTokens, {pText[i]})
> >                lStart = 0
> <snip>

No, the code is correct. No 'else' is missing.

> 2) I can't see they are used, what were pQuotes[4]&[5] supposed to be
> for? Just curious.

I never got around to this, but these were going to be used for nested tokens; brackets, for example. Must complete that, I guess. pQuotes[4] is a list of leading, or opening, symbols and pQuotes[5] holds the matching closing symbols.

----------------
cheers,
Derek Parnell
8. Re: help with storing user input
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> May 15, 2003
- 429 views
On Fri, 16 May 2003 00:53:30 +1000, Derek Parnell <ddparnell at bigpond.com> wrote:

>> ^^^^^ it looks to me an "else" has gone walkabouts here.
>No, the code is correct. No 'else' is missing.

Good. The blank made me suspect, I see a -1 now. You happy, me happy.

>> 2) I can't see they are used, what were pQuotes[4]&[5] supposed to be
>> for? Just curious.
>
>I never got around to this, but there were going to be used for nested
>tokens;

Eeek(!) I have some pukka code, just for matching [{( & ]}) tho, if you are interested. (All it does is stack the openings and recurse on finding a matching close (?9/0 on mismatch); nothing special but you have mentioned you is busy, so when/if I can help...)

Pete
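Pete's code is not shown in the thread, so the following is only a guess at the general idea of stacking openers and checking each closer against the most recent one. It differs from his description in two ways: it loops instead of recursing, and it returns 0 on a mismatch rather than crashing with `?9/0`. The name `brackets_match` and the constants are hypothetical.

```euphoria
constant OPENERS = "[{(",
         CLOSERS = "]})"

-- Hypothetical helper: returns 1 if every bracket in text is matched, else 0.
function brackets_match(sequence text)
    sequence stack
    integer pos

    stack = {}
    for i = 1 to length(text) do
        pos = find(text[i], OPENERS)
        if pos != 0 then
            -- remember which closer this opener expects
            stack = append(stack, CLOSERS[pos])
        elsif find(text[i], CLOSERS) then
            if length(stack) = 0 then
                return 0                -- closer with nothing open
            end if
            if stack[length(stack)] != text[i] then
                return 0                -- wrong kind of closer
            end if
            -- pop the matched entry
            stack = stack[1 .. length(stack) - 1]
        end if
    end for
    return length(stack) = 0            -- leftovers mean an unclosed opener
end function

printf(1, "%d\n", brackets_match("a[b{c}(d)]"))   -- prints 1
printf(1, "%d\n", brackets_match("a[b{c](d)}"))   -- prints 0
```

Storing the expected closer, rather than the opener, keeps the comparison at each closing bracket to a single equality test.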
9. Re: help with storing user input
- Posted by Derek Parnell <ddparnell at bigpond.com> May 16, 2003
- 456 views
On Fri, 16 May 2003 00:29:49 +0000, Jason Dube <dubetyrant at hotmail.com> wrote:

Hi Jason,
may I be of assistance (as it is my humble code ...)

> global function Tokenize(sequence pText, object pWhiteSpace, object pNonword, object pQuotes) --> sequence
> <snip>
>     -- Validate whitespace parameter
>     if atom(pWhiteSpace) then
>         if pWhiteSpace = 0 then
>             pWhiteSpace = ' ' & 8 & 9 & 10 & 11 & 12 & 13
>         else
>             pWhiteSpace = {pWhiteSpace}
>         end if
>     end if

Okay, let's start with this then.

The parameter definition of 'pWhiteSpace' is 'object', implying that the caller can use either an atom or a sequence. I allow both for a good reason. But before we look at that, realize that 'pWhiteSpace' is meant to represent a set of characters that can ALL be considered as "white space characters". Now back to the story...

If pWhiteSpace was passed as an atom, then:
    If that value is zero, this indicates that the caller wishes to use the 'default' set of white space characters. That is the set of characters represented by "' ' & 8 & 9 & 10 & 11 & 12 & 13", namely SPACE, BACKSPACE, TAB, LINEFEED, VERTICAL TAB, FORMFEED and CARRIAGE-RETURN.
    If the atom value passed is NOT zero, then I just convert it to a sequence by enclosing it in braces.

You see, what I want in the program is a sequence, but I allow people to call the routine a number of ways...

-- Just use the SPACE character as delimiter.
Tokenize("derek parnell Level11", ' ', ...

-- Use the SPACE and TAB characters as delimiters.
Tokenize("derek parnell Level11", " \t", ...

-- Use the default characters as delimiters.
Tokenize("derek parnell Level11", 0, ...

My validation of the parameter is not perfect because it allows people to pass floating point atoms and nested sequences, which I really do not want.

>     -- Validate non-word parameter
>     if atom(pNonword) then
>         if pNonword = 0 then
>             pNonword = "`~!@#$%^&*()_-+={[}]|\\:;\"'<,>.?/"
>         else
>             pNonword = {pNonword}
>         end if
>     end if

Ditto. The pNonword parameter is a set of characters that are definitely not found inside words. I allow people to supply their own non-word characters or to use the default ones.

--
cheers,
Derek Parnell
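The "0 means default, other atom means one character, sequence means use as given" convention described above can be isolated into a small helper. This is a sketch rather than part of Derek's routine: `normalize_set` and `DEFAULT_WS` are hypothetical names, and like the original validation it does not reject floating point atoms or nested sequences.

```euphoria
constant DEFAULT_WS = ' ' & 8 & 9 & 10 & 11 & 12 & 13

-- Hypothetical helper mirroring the validation step described above.
function normalize_set(object chars, sequence defaults)
    if atom(chars) then
        if chars = 0 then
            return defaults     -- 0 requests the default set
        else
            return {chars}      -- a single character becomes a one-element set
        end if
    end if
    return chars                -- already a sequence; used as given
end function

sequence ws
ws = normalize_set(0, DEFAULT_WS)       -- the seven default characters
printf(1, "%d\n", length(ws))           -- prints 7
ws = normalize_set(' ', DEFAULT_WS)     -- just the space character
printf(1, "%d\n", length(ws))           -- prints 1
ws = normalize_set(" \t", DEFAULT_WS)   -- caller-supplied set, unchanged
printf(1, "%d\n", length(ws))           -- prints 2
```

After this step the rest of a routine can always call `find(ch, ws)` without caring how the caller spelled the argument.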
10. Re: help with storing user input
- Posted by Jason Dube <dubetyrant at hotmail.com> May 16, 2003
- 432 views
-Derek, I was wondering if you could give me an overview of what this particular code is doing.

    -- Validate quote marks parameter
    if sequence(pQuotes) then
        if length(pQuotes) = 0 then
            pQuotes = {{},{},{},{},{}}
        elsif (length(pQuotes) != 5
               or atom(pQuotes[1])
               or atom(pQuotes[2])
               or atom(pQuotes[3])
               or length(pQuotes[1]) != length(pQuotes[2])
               or atom(pQuotes[4])
               or atom(pQuotes[5])
               or length(pQuotes[4]) != length(pQuotes[5])
              ) then
            pQuotes = 0
        end if
    end if

    if atom(pQuotes) then
        if pQuotes = 0 then
            pQuotes = {"\"'`", "\"'`", "\\~","",""}
        else
            pQuotes = {{pQuotes}, {pQuotes},{},{},{}}
        end if
    end if

----- Original Message -----
From: "Derek Parnell" <ddparnell at bigpond.com>
To: "EUforum" <EUforum at topica.com>
Sent: Thursday, May 15, 2003 9:04 PM
Subject: Re: help with storing user input

> Hi Jason,
> may I be of assistance (as it is my humble code ...)
<snip>
11. Re: help with storing user input
- Posted by Derek Parnell <ddparnell at bigpond.com> May 17, 2003
- 425 views
----- Original Message -----
From: "Jason Dube" <dubetyrant at hotmail.com>
To: "EUforum" <EUforum at topica.com>
Subject: Re: help with storing user input

> -Derek, I was wondering if you could give me an overview of what this
> particular code is doing.
>
> -- Validate quote marks parameter
> if sequence(pQuotes) then
> <snip>
> if atom(pQuotes) then
>     if pQuotes = 0 then
>         pQuotes = {"\"'`", "\"'`", "\\~","",""}
>     else
>         pQuotes = {{pQuotes}, {pQuotes},{},{},{}}
>     end if
> end if

This is just validating and initializing the pQuotes group of data.

pQuotes[1] and [2] are a pair of character sets. [1] holds the start quotes and [2] holds the matching end quotes. All the characters inside the quotes are considered to be a word.

pQuotes[3] is a list of characters that are 'escape' characters to be used inside quoted strings. For example...

   'abc~'def'

with pQuotes = { "'", "'", "~", ...}

gives the word value as abc'def

pQuotes[4] and [5] are not being used yet.

----------------
cheers,
Derek Parnell
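The quote-and-escape behaviour described above can be sketched on its own. One caveat: the posted Tokenize loop slices each token straight out of pText, so the escape character itself would appear to stay in the token; the hypothetical `read_quoted` below removes it so that the result matches the abc'def example. It handles a single quote character and a single escape character only.

```euphoria
-- Hypothetical helper: return the text of the first quoted run in s,
-- with escape characters removed. Returns "" if no opening quote is found.
function read_quoted(sequence s, integer quote_ch, integer esc_ch)
    sequence word
    integer i

    i = find(quote_ch, s)
    if i = 0 then
        return ""
    end if
    word = ""
    i += 1
    while i <= length(s) do
        if s[i] = esc_ch and i < length(s) then
            -- take the next character literally, even if it is a quote
            i += 1
            word &= s[i]
        elsif s[i] = quote_ch then
            return word             -- unescaped closing quote
        else
            word &= s[i]
        end if
        i += 1
    end while
    return word                     -- no closing quote: keep what was read
end function

puts(1, read_quoted("say 'abc~'def' now", '\'', '~') & '\n')   -- prints abc'def
```

Building the word character by character, instead of slicing, is what makes it possible to drop the escape character.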
12. Re: help with storing user input
- Posted by Jason Dube <dubetyrant at hotmail.com> May 17, 2003
- 415 views
Hey,
I know you're not responsible for teaching me to code with Euphoria, but this algorithm uses the language in a lot of ways I couldn't imagine. This part of the code is kinda confusing for me. In order to understand the big for statement that follows, don't I have to find out what's going on with this pQuotes variable?

-Basically I'm wondering what kinds of parameters the caller of this function would specify for pQuotes.

> > -- Validate quote marks parameter
> > if sequence(pQuotes) then
> >     if length(pQuotes) = 0 then

--Excuse me if I don't know Euphoria syntax that well, but how could the length of pQuotes possibly be zero? This function has to take four arguments, right? In my mind control would never be passed here, because Euphoria won't run this program without getting four arguments. (Thinking out loud) So this line is simply testing to see if the caller has specified {}, an empty sequence, as a parameter?

> >         pQuotes = {{},{},{},{},{}}

OK, so now it has five empty sequences in it. Why?

> >     elsif (length(pQuotes) != 5
> >            or atom(pQuotes[1])
> >            or atom(pQuotes[2])
> >            or atom(pQuotes[3])
> >            or length(pQuotes[1]) != length(pQuotes[2])
> >            or atom(pQuotes[4])
> >            or atom(pQuotes[5])
> >            or length(pQuotes[4]) != length(pQuotes[5])
> >           ) then
> >         pQuotes = 0
> >     end if
> > end if

--No idea why to test for all these things.

> > if atom(pQuotes) then
> >     if pQuotes = 0 then
> >         pQuotes = {"\"'`", "\"'`", "\\~","",""}
> >     else
> >         pQuotes = {{pQuotes}, {pQuotes},{},{},{}}
> >     end if
> > end if

-No clue :)

>This is just validating and initializing the pQuotes group of data.
>pQuotes[1] and [2] is a pair of character sets. [1] is the start quote and
>[2] is the matching end quote. All the characters inside the quotes is
>considered to be a word.

Okay, I understand what you're saying here... I'm just not getting how you made that happen with your code.

>pQuotes[3] is a list of characters that are 'escape' characters to be used
>inside quoted strings. For example...

--So an escape character like ~ would be listed as a separate sequence?

>   'abc~'def'
>
>with pQuotes = { "'", "'", "~", ...}
>
>gives the word value as abc'def

OK

--Overall I think I'll just use the simpler function. I'm really not able yet to understand this function. I do understand the simpler one though!!

--Thanks for taking the time to explain. I'll probably be able to figure it out sometime :)

--And I'll make sure to credit you wherever I use it (the SimpleTokenize, that is).
13. Re: help with storing user input
- Posted by Derek Parnell <ddparnell at bigpond.com> May 17, 2003
- 432 views
----- Original Message -----
From: "Jason Dube" <dubetyrant at hotmail.com>
To: "EUforum" <EUforum at topica.com>
Subject: Re: help with storing user input

> Hey,
> I know you're not responsible for teaching me to code with Euphoria, but this
> algorithm uses the language in a lot of ways I couldn't imagine.

Hey, I don't mind.

> This part of the code is kinda confusing for me. In order to understand the big for
> statement that follows, don't I have to find out what's going on with this
> pQuotes variable?

Your choice.

> -Basically I'm wondering what kinds of parameters the caller of this
> function would specify for pQuotes.
>
> > > -- Validate quote marks parameter
> > > if sequence(pQuotes) then
> > >     if length(pQuotes) = 0 then
>
> --Excuse me if I don't know Euphoria syntax that well, but how could the
> length of pQuotes possibly be zero? <snip> So this
> line is simply testing to see if the caller has specified {}, an empty
> sequence, as a parameter?

Yes, that's right.

> > >         pQuotes = {{},{},{},{},{}}

And if so, it is just shorthand for 5 empty sequences.

> OK, so now it has five empty sequences in it. Why?

In case the user doesn't need 'quote' processing.

> > >     elsif (length(pQuotes) != 5
> > >            or atom(pQuotes[1])
> > >            or atom(pQuotes[2])
> > >            or atom(pQuotes[3])
> > >            or length(pQuotes[1]) != length(pQuotes[2])
> > >            or atom(pQuotes[4])
> > >            or atom(pQuotes[5])
> > >            or length(pQuotes[4]) != length(pQuotes[5])
> > >           ) then
> > >         pQuotes = 0
> > >     end if
> > > end if

All of this just makes sure that the parameter has 5 subsequences, that [1] and [2] are the same length, and that [4] and [5] are the same length. If the parameter fails this test, I force it to use the default values.

> --No idea why to test for all these things.
>
> > > if atom(pQuotes) then
> > >     if pQuotes = 0 then
> > >         pQuotes = {"\"'`", "\"'`", "\\~","",""}

This is just a way of requesting the default values. The user calls this routine with a zero in this parameter.

> > >     else
> > >         pQuotes = {{pQuotes}, {pQuotes},{},{},{}}

This is just a shorthand way of saying that the user is only interested in simple quote processing. They can call the routine like this...

   Tokenize( string, ws, nw, '|' )

so that all characters in the string between vertical bars form a 'word'.

> > >     end if
> > > end if
>
> -No clue :)
>
> >This is just validating and initializing the pQuotes group of data.
> >pQuotes[1] and [2] is a pair of character sets. <snip>
>
> Okay, I understand what you're saying here... I'm just not getting how you made
> that happen with your code.
>
> >pQuotes[3] is a list of characters that are 'escape' characters to be used
> >inside quoted strings. For example...
>
> --So an escape character like ~ would be listed as a separate sequence?

Yes.

> >   'abc~'def'
> >
> >with pQuotes = { "'", "'", "~", ...}
> >
> >gives the word value as abc'def
> OK
>
> --Overall I think I'll just use the simpler function. I'm really not able yet
> to understand this function. I do understand the simpler one though!!
>
> --Thanks for taking the time to explain. I'll probably be able to figure it
> out sometime :)
>
> --And I'll make sure to credit you wherever I use it (the SimpleTokenize,
> that is).

No problems.

----------------
cheers,
Derek Parnell