1. help with storing user input
Hello,
What would be an efficient way of separating the words in a sentence entered
by the user? I'm specifically trying to develop a GOOD algorithm that
separates the words of a user-entered sentence and stores them as individual
sequences. Like:
user input: "mary had a little lamb"
results: sequence 1st_sentence = {"mary","had","a","little","lamb"}
I'm having difficulty skipping whitespace and converting to strings.
How would Euphoria do this?
while not end of line
    get one letter at a time until you see a whitespace
    store all letters previous to the encountered whitespace in 1st_sentence
    get one letter at a time until you encounter a whitespace
    append 1st_sentence with all letters previous to the whitespace as a
    new sequence, but not the letters previous to the first whitespace
end while
Can someone please help?
Thanks!!
2. Re: help with storing user input
----- Original Message -----
From: "Jason Dube" <dubetyrant at hotmail.com>
To: "EUforum" <EUforum at topica.com>
Subject: help with storing user input
>
> Hello,
> What would be an efficient way of separating words in a user-entered
> sentence? For example, to break apart the words in a sentence. I'm
> specifically trying to develop a GOOD algorithm to separate words from a
> user-entered sentence and store them as individual sequences. Like:
> user input: "mary had a little lamb"
> results: sequence 1st_sentence={"mary","had","a","little","lamb"}
> I'm having difficulties skipping whitespace and converting to string
>
> how would euphoria do this?
Here are a couple of routines that I use...
-------------------------------------
global function Tokenize(sequence pText, object pWhiteSpace, object pNonword,
                         object pQuotes) --> sequence
-------------------------------------
    -- pText is returned as a sequence of 'words'.
    -- Each word is delimited by a set of one or more Delimiters
    sequence lTokens
    integer lStartQuote, lEndQuote
    integer lTextLength
    integer lStart
    integer lPos

    -- Validate whitespace parameter
    if atom(pWhiteSpace) then
        if pWhiteSpace = 0 then
            pWhiteSpace = ' ' & 8 & 9 & 10 & 11 & 12 & 13
        else
            pWhiteSpace = {pWhiteSpace}
        end if
    end if

    -- Validate non-word parameter
    if atom(pNonword) then
        if pNonword = 0 then
            pNonword = "`~!@#$%^&*()_-+={[}]|\\:;\"'<,>.?/"
        else
            pNonword = {pNonword}
        end if
    end if

    -- Validate quote marks parameter
    if sequence(pQuotes) then
        if length(pQuotes) = 0 then
            pQuotes = {{},{},{},{},{}}
        elsif (length(pQuotes) != 5
               or
               atom(pQuotes[1])
               or
               atom(pQuotes[2])
               or
               atom(pQuotes[3])
               or
               length(pQuotes[1]) != length(pQuotes[2])
               or
               atom(pQuotes[4])
               or
               atom(pQuotes[5])
               or
               length(pQuotes[4]) != length(pQuotes[5])
              )
        then
            pQuotes = 0
        end if
    end if

    if atom(pQuotes) then
        if pQuotes = 0 then
            pQuotes = {"\"'`", "\"'`", "\\~","",""}
        else
            pQuotes = {{pQuotes}, {pQuotes},{},{},{}}
        end if
    end if

    -- Initialize
    lTokens = {}
    lStart = 0
    lStartQuote = 0
    lEndQuote = 0

    for i = 1 to length(pText) do
        if lStartQuote != 0 then
            if pText[i] = lEndQuote then
                if find(pText[i - 1], pQuotes[3]) then
                    if i > 2 and find(pText[i - 2], pQuotes[3]) then
                        lTokens = append(lTokens, pText[lStart .. i - 1])
                        lStart = 0
                        lStartQuote = 0
                        lEndQuote = 0
                    end if
                else
                    lTokens = append(lTokens, pText[lStart .. i - 1])
                    lStart = 0
                    lStartQuote = 0
                    lEndQuote = 0
                end if
            end if
        else
            lPos = find(pText[i], pQuotes[1])
            if lPos != 0 then
                lStartQuote = lPos
                lStart = i + 1
                lEndQuote = pQuotes[2][lPos]
            elsif find(pText[i], pWhiteSpace) then
                if lStart != 0 then
                    lTokens = append(lTokens, pText[lStart .. i - 1])
                    lStart = 0
                end if
            else
                if lStart = 0 then
                    lStart = i
                end if
                if find(pText[i], pNonword) > 0 then
                    if lStart != 0 then
                        -- Avoid empty tokens
                        if lStart != i then
                            lTokens = append(lTokens, pText[lStart .. i - 1])
                        end if
                        lStart = 0
                    end if
                    lTokens = append(lTokens, {pText[i]})
                    lStart = 0
                end if
            end if
        end if
    end for

    if lStart != 0 then
        lTokens = append(lTokens, pText[lStart .. length(pText)])
        lStart = 0
    end if
    return lTokens
end function
-------------------------------------
global function SimpleTokenize(sequence s, object c)
-------------------------------------
    -- Returns 's', as a number of words delimited by one or more 'c' objects
    integer slen, spt, i
    sequence parsed
    parsed = {}
    slen = length(s)
    spt = 1
    i = 1
    while i <= slen do
        -- Skip any run of delimiters
        while i <= slen and equal(s[i], c) do
            i += 1
        end while
        spt = i
        -- Scan to the end of the word
        while i <= slen and not equal(s[i], c) do
            i += 1
        end while
        -- Avoid an empty token when the text ends with delimiters
        if i > spt then
            parsed = append(parsed, s[spt..i-1])
        end if
        i += 1
    end while
    return parsed
end function
----------------
cheers,
Derek Parnell
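
For the example in the original question, the character-at-a-time loop from the pseudocode can be sketched roughly as below. This is only an illustration, not one of the routines posted above: `split_words` and `first_sentence` are made-up names (a Euphoria identifier cannot start with a digit, so `1st_sentence` would not be accepted), and the whitespace set is an assumption.

```euphoria
-- Hypothetical helper, not part of the routines posted above.
-- Builds each word one character at a time, as in the question's pseudocode.
function split_words(sequence line)
    sequence words, word
    integer ch
    words = {}
    word = ""
    for i = 1 to length(line) do
        ch = line[i]
        if find(ch, " \t\r\n") then
            -- whitespace ends the current word, if there is one
            if length(word) > 0 then
                words = append(words, word)
                word = ""
            end if
        else
            word &= ch
        end if
    end for
    -- the last word may not be followed by whitespace
    if length(word) > 0 then
        words = append(words, word)
    end if
    return words
end function

sequence first_sentence
-- A line read with gets(0) would work the same way; gets() returns the
-- text including its trailing '\n', or the atom -1 at end of input.
first_sentence = split_words("mary had a little lamb")
for i = 1 to length(first_sentence) do
    puts(1, first_sentence[i] & "\n")   -- one word per line
end for
```

Each element of `first_sentence` is itself a sequence of characters, which is all a "string" is in Euphoria, so no separate conversion step is needed.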
3. Re: help with storing user input
Wow! Thank you! Definitely going to look at this closely. I'd like to try to
use it in my program. If I have some questions about it, is it OK if I ask?
4. Re: help with storing user input
- Posted by gertie at visionsix.com
May 15, 2003
On 15 May 2003, at 0:55, Jason Dube wrote:
>
> Hello,
> What would be an efficient way of separating words in a user-entered
> sentence?
> For example, to break apart the words in a sentence. I'm specifically trying to
> develop a GOOD algorithm to separate words from a user-entered sentence and store
> them as individual sequences. Like: user input: "mary had a little lamb"
> results: sequence 1st_sentence={"mary","had","a","little","lamb"} I'm having
> difficulties skipping whitespace and converting to string
>
> how would euphoria do this?
parsedline = parse(input," ")
5. Re: help with storing user input
- Posted by gertie at visionsix.com
May 15, 2003
On 15 May 2003, at 2:37, gertie at visionsix.com wrote:
> parsedline = parse(input," ")
You can also do:
parsedline = parse(input," ,.;:'")
or other punctuation. The problem with some delimiters shows up in numbers,
like "1,234.5" with its comma and period, or in quoted speech, where
""blah", he said sadly"" becomes {"blah","he","said","sadly"}, which carries
much less info.
Kat
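
Note that `parse` is not one of Euphoria's built-in routines, so it presumably comes from a separate library that has to be included. As a rough stand-in, splitting on a set of delimiter characters could look like the sketch below; `split_any` is a made-up name and makes no claim to match how `parse` itself behaves.

```euphoria
-- Hypothetical stand-in; NOT the parse() routine used above.
-- Splits text wherever any character from delims occurs.
function split_any(sequence text, sequence delims)
    sequence parts
    integer start
    parts = {}
    start = 0                       -- 0 means "not inside a token"
    for i = 1 to length(text) do
        if find(text[i], delims) then
            if start != 0 then
                parts = append(parts, text[start..i-1])
                start = 0
            end if
        elsif start = 0 then
            start = i               -- first character of a new token
        end if
    end for
    if start != 0 then
        parts = append(parts, text[start..length(text)])
    end if
    return parts
end function

sequence a, b
a = split_any("mary had a little lamb", " ")
-- a is {"mary","had","a","little","lamb"}
b = split_any("total: 1,234.5", " ,.;:'")
-- b is {"total","1","234","5"}: the number has been split apart
```

The second call shows the drawback described above: once the comma and period are delimiters, a number such as 1,234.5 no longer survives as one token.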
6. Re: help with storing user input
On Thu, 15 May 2003 16:13:05 +1000, Derek Parnell
<ddparnell at bigpond.com> wrote:
<snip>
Interesting, two quick points:
> if find(pText[i], pNonword) > 0
> then
> if lStart != 0
> then
> -- Avoid empty tokens
> if lStart != i then
> lTokens = append(lTokens, pText[lStart .. i -
>1])
> end if
> lStart = 0
> end if
>
^^^^^ it looks to me an "else" has gone walkabouts here.
> lTokens = append(lTokens, {pText[i]})
> lStart = 0
> end if
> end if
> end if
> end for
2) I can't see they are used, what were pQuotes[4]&[5] supposed to be
for? Just curious.
Pete
7. Re: help with storing user input
----- Original Message -----
From: "Pete Lomax" <petelomax at blueyonder.co.uk>
To: "EUforum" <EUforum at topica.com>
Subject: Re: help with storing user input
>
> On Thu, 15 May 2003 16:13:05 +1000, Derek Parnell
> <ddparnell at bigpond.com> wrote:
>
> <snip>
>
> Interesting, two quick points:
> > if find(pText[i], pNonword) > 0
> > then
> > if lStart != 0
> > then
> > -- Avoid empty tokens
> > if lStart != i then
> > lTokens = append(lTokens, pText[lStart .. i -
> >1])
> > end if
> > lStart = 0
> > end if
> >
> ^^^^^ it looks to me an "else" has gone walkabouts here.
> > lTokens = append(lTokens, {pText[i]})
> > lStart = 0
> > end if
> > end if
> > end if
> > end for
No, the code is correct. No 'else' is missing.
> 2) I can't see they are used, what were pQuotes[4]&[5] supposed to be
> for? Just curious.
I never got around to this, but they were going to be used for nested
tokens; brackets, for example. Must complete that, I guess. pQuotes[4] is a
list of leading, or opening, symbols and pQuotes[5] is the list of matching
closing symbols.
----------------
cheers,
Derek Parnell
8. Re: help with storing user input
On Fri, 16 May 2003 00:53:30 +1000, Derek Parnell
<ddparnell at bigpond.com> wrote:
>> ^^^^^ it looks to me an "else" has gone walkabouts here.
>No, the code is correct. No 'else' is missing.
Good. The blank made me suspect, I see a -1 now. You happy, me happy.
>
>> 2) I can't see they are used, what were pQuotes[4]&[5] supposed to be
>> for? Just curious.
>
>I never got around to this, but there were going to be used for nested
>tokens;
Eeek(!) I have some pukka code, just for matching [{( & ]}) tho, if
you are interested. (All it does is stack the openings and recurse on
finding a matching close (?9/0 on mismatch); nothing special but you
have mentioned you is busy, so when/if I can help...)
Pete
9. Re: help with storing user input
On Fri, 16 May 2003 00:29:49 +0000, Jason Dube <dubetyrant at hotmail.com>
wrote:
Hi Jason,
may I be of assistance (as it is my humble code ...)
>
>
> -------------------------------------
> global function Tokenize(sequence pText, object pWhiteSpace, object
> pNonword,
> object pQuotes) --> sequence
> -------------------------------------
> -- pText is returned as a sequence of 'words'.
> -- Each word is delimited by a set of one or more Delimiters
>
>
> sequence lTokens
> integer lStartQuote, lEndQuote
> integer lTextLength
> integer lStart
> integer lPos
>
> -- Validate whitespace parameter
> if atom(pWhiteSpace) then
> if pWhiteSpace = 0 then
> pWhiteSpace = ' ' & 8 & 9 & 10 & 11 & 12 & 13
> else
> pWhiteSpace = {pWhiteSpace}
> end if
> end if
Okay, let's start with this then.
The parameter definition of 'pWhiteSpace' is 'object', implying that the
caller can use either an atom or a sequence. I allow both for a good
reason. But before we look at that, realize that 'pWhiteSpace' is meant to
represent a set of characters that can ALL be considered as "white space
characters". Now back to the story...
If pWhiteSpace was passed as an atom, then:
if that value is zero, the caller wishes to use the 'default' set of
white space characters. That is the set of characters represented by
"' ' & 8 & 9 & 10 & 11 & 12 & 13" - namely the SPACE, BACKSPACE, TAB,
LINEFEED, VERTICAL TAB, FORMFEED and CARRIAGE-RETURN.
if the atom value passed is NOT zero, then I just convert it to a
one-element sequence by enclosing it in braces.
You see, what I want in the program is a sequence, but I allow people to
call the routine a number of ways...
-- Just use the SPACE character as delimiter.
Tokenize("derek parnell Level11", ' ', ...
-- Use the SPACE and TAB characters as delimiters.
Tokenize("derek parnell Level11", " \t", ...
-- Use the default characters as delimiters.
Tokenize("derek parnell Level11", 0, ...
My validation of the parameter is not perfect because it allows people to
pass floating point atoms and nested sequences - which I really do not
want.
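
The "atom or sequence" handling described above can be isolated into a small helper. The sketch below is only an illustration with made-up names (`as_char_set`, `is_flat_chars`); the second function shows one possible way to add the stricter check that the validation is said to lack, and is not taken from the posted code.

```euphoria
-- Hypothetical helpers illustrating the normalisation described above.
function as_char_set(object param, sequence defaults)
    if atom(param) then
        if param = 0 then
            return defaults         -- 0 requests the default set
        else
            return {param}          -- one character becomes a 1-element set
        end if
    end if
    return param                    -- already a sequence
end function

-- Returns 1 only if every element is an integer, which rejects
-- floating point atoms and nested sequences.
function is_flat_chars(sequence s)
    for i = 1 to length(s) do
        if not integer(s[i]) then
            return 0
        end if
    end for
    return 1
end function

sequence ws
ws = as_char_set(0, " \t\r\n")      -- ws is " \t\r\n"
ws = as_char_set(' ', " \t\r\n")    -- ws is " "
ws = as_char_set(" \t", " \t\r\n")  -- ws is " \t"
```

With a check like `is_flat_chars`, a nested argument such as {"\t"} could be detected and rejected instead of silently matching nothing.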
> -- Validate non-word parameter
> if atom(pNonword) then
> if pNonword = 0 then
> pNonword = "`~!@#$%^&*()_-+={[}]|\\:;\"'<,>.?/"
> else
> pNonword = {pNonword}
> end if
> end if
Ditto. The pNonword parameter is a set of characters that are definitely
not found inside words. I allow people to supply their own non-word
characters or to use the default ones.
--
cheers,
Derek Parnell
10. Re: help with storing user input
-Derek, I was wondering if you could give me an overview of what this
particular code is doing.
-- Validate quote marks parameter
if sequence(pQuotes) then
if length(pQuotes) = 0 then
pQuotes = {{},{},{},{},{}}
elsif (length(pQuotes) != 5
or
atom(pQuotes[1])
or
atom(pQuotes[2])
or
atom(pQuotes[3])
or
length(pQuotes[1]) != length(pQuotes[2])
or
atom(pQuotes[4])
or
atom(pQuotes[5])
or
length(pQuotes[4]) != length(pQuotes[5])
)
then
pQuotes = 0
end if
end if
if atom(pQuotes) then
if pQuotes = 0 then
pQuotes = {"\"'`", "\"'`", "\\~","",""}
else
pQuotes = {{pQuotes}, {pQuotes},{},{},{}}
end if
end if
11. Re: help with storing user input
----- Original Message -----
From: "Jason Dube" <dubetyrant at hotmail.com>
To: "EUforum" <EUforum at topica.com>
Subject: Re: help with storing user input
>
> -Derek, I was wondering if you could give me an overview of what this
> particular code is doing.
>
> -- Validate quote marks parameter
> if sequence(pQuotes) then
> if length(pQuotes) = 0 then
> pQuotes = {{},{},{},{},{}}
> elsif (length(pQuotes) != 5
> or
> atom(pQuotes[1])
> or
> atom(pQuotes[2])
> or
> atom(pQuotes[3])
> or
> length(pQuotes[1]) != length(pQuotes[2])
> or
> atom(pQuotes[4])
> or
> atom(pQuotes[5])
> or
> length(pQuotes[4]) != length(pQuotes[5])
> )
> then
> pQuotes = 0
> end if
> end if
>
> if atom(pQuotes) then
> if pQuotes = 0 then
> pQuotes = {"\"'`", "\"'`", "\\~","",""}
> else
> pQuotes = {{pQuotes}, {pQuotes},{},{},{}}
> end if
> end if
This is just validating and initializing the pQuotes group of data.
pQuotes[1] and [2] are a pair of character sets. [1] holds the start quotes
and [2] holds the matching end quotes. All the characters inside the quotes
are considered to be one word.
pQuotes[3] is a list of characters that are 'escape' characters to be used
inside quoted strings. For example...
'abc~'def'
with pQuotes = { "'", "'", "~", ...}
gives the word value as abc'def
pQuotes[4] and [5] are not being used yet.
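
To make the escape idea concrete, here is a small stand-alone sketch of reading one quoted word while honouring an escape character. It is not the Tokenize code: `first_quoted` is a made-up name, it handles a single quote character and a single escape character only, and it drops the escape character from the result.

```euphoria
-- Hypothetical illustration of quote and escape handling.
-- Returns the first quoted word in text, or "" if there is none.
function first_quoted(sequence text, integer quote, integer esc)
    sequence word
    integer i
    i = find(quote, text)           -- locate the opening quote
    if i = 0 then
        return ""
    end if
    word = ""
    i += 1
    while i <= length(text) do
        if text[i] = esc and i < length(text) then
            i += 1                  -- drop the escape itself ...
            word &= text[i]         -- ... and keep the next character as-is
        elsif text[i] = quote then
            return word             -- unescaped quote closes the word
        else
            word &= text[i]
        end if
        i += 1
    end while
    return word                     -- unterminated quote: return what we have
end function

sequence w
w = first_quoted("'abc~'def'", '\'', '~')
-- w is "abc'def"
```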
----------------
cheers,
Derek Parnell
12. Re: help with storing user input
Hey,
I know you're not responsible for teaching me to code with Euphoria, but this
algorithm uses the language in a lot of ways I couldn't imagine. This part of
the code is kind of confusing for me. In order to understand the big for
statement that follows, don't I have to find out what's going on with this
pQuotes variable?
-Basically I'm wondering what kinds of parameters the caller of this
function would specify for pQuotes.
> > -- Validate quote marks parameter
> > if sequence(pQuotes) then
> > if length(pQuotes) = 0 then
--excuse me if I don't know Euphoria syntax that well, but how could the
length of pQuotes possibly be zero? This function has to take four arguments,
right? In my mind control would never be passed here, because Euphoria won't
run this program without getting four arguments. (Thinking out loud) So this
line is simply testing to see if the caller has specified {}, an empty
sequence, as a parameter?
> > pQuotes = {{},{},{},{},{}}
ok, so now it has five empty sequences in it, why?
> > elsif (length(pQuotes) != 5
> > or
> > atom(pQuotes[1])
> > or
> > atom(pQuotes[2])
> > or
> > atom(pQuotes[3])
> > or
> > length(pQuotes[1]) != length(pQuotes[2])
> > or
> > atom(pQuotes[4])
> > or
> > atom(pQuotes[5])
> > or
> > length(pQuotes[4]) != length(pQuotes[5])
> > )
> > then
> > pQuotes = 0
> > end if
> > end if
--no idea why to test for all these things
> >
> > if atom(pQuotes) then
> > if pQuotes = 0 then
> > pQuotes = {"\"'`", "\"'`", "\\~","",""}
> > else
> > pQuotes = {{pQuotes}, {pQuotes},{},{},{}}
> > end if
> > end if
-no clue:)
>
>This is just validating and initializing the pQuotes group of data.
> pQuotes[1] and [2] is a pair of character sets. [1] is the start quote
>and
>[2] is the matching end quote. All the characters inside the quotes is
>considered to be a word.
okay, I understand what you're saying here... I'm just not getting how you
made that happen with your code.
>
> pQuotes[3] is a list of characters that are 'escape' characters to be
>used
>inside quoted strings. For example...
--so an escape character like ~ would be listed as a separate sequence?
>
> 'abc~'def'
>
>with pQuotes = { "'", "'", '~', ...}
>
>gives the word value as abc'def
ok
>
--Overall I think I'll just use the simpler function. I'm really not able yet
to understand this function. I do understand the simpler one though!!
--Thanks for taking the time to explain, I'll probably be able to figure it
out sometime:)
--And I'll make sure to credit you wherever I use it (the SimpleTokenize,
that is)
13. Re: help with storing user input
----- Original Message -----
From: "Jason Dube" <dubetyrant at hotmail.com>
To: "EUforum" <EUforum at topica.com>
Subject: Re: help with storing user input
>
> Hey,
> I know your not responsible for teaching me to code with euphoria, but
this
> algoritm uses the language in a lot of ways I couldn't imagine.
Hey, I don't mind.
> This part of the code is kind of confusing for me. In order to understand
> the big for statement that follows, don't I have to find out what's going on
> with this pQuotes variable?
Your choice.
> -Basically I'm wondering what kinds of parameters the caller of this
> function would specify for pquotes.
>
> > > -- Validate quote marks parameter
> > > if sequence(pQuotes) then
> > > if length(pQuotes) = 0 then
>
> --excuse me if I don't know Euphoria syntax that well, but how could the
> length of pQuotes possibly be zero? This function has to take four
> arguments, right? In my mind control would never be passed here, because
> Euphoria won't run this program without getting four arguments. (Thinking
> out loud) So this line is simply testing to see if the caller has specified
> {}, an empty sequence, as a parameter?
Yes, that's right.
> > > pQuotes = {{},{},{},{},{}}
And if so, it is just shorthand for 5 empty sequences.
> ok, so now it has five empty sequences in it, why?
in case the user doesn't need 'quote' processing.
> > > elsif (length(pQuotes) != 5
>
>
> > > or
> > > atom(pQuotes[1])
> > > or
> > > atom(pQuotes[2])
> > > or
> > > atom(pQuotes[3])
> > > or
> > > length(pQuotes[1]) != length(pQuotes[2])
> > > or
> > > atom(pQuotes[4])
> > > or
> > > atom(pQuotes[5])
> > > or
> > > length(pQuotes[4]) != length(pQuotes[5])
> > > )
> > > then
> > > pQuotes = 0
> > > end if
> > > end if
All of this just makes sure that the parameter has 5 sub sequences and that
[1] and [2] are the same length and that [4] and [5] are the same length. If
the parameter fails this test, I force it to use the default values.
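
The same shape check can be written as a separate predicate, which may be easier to follow than the long `elsif` condition. This is a sketch with a made-up name (`quotes_ok`), not part of the posted routine; unlike the routine, it makes no special case for an empty sequence, which it simply reports as invalid.

```euphoria
-- Hypothetical predicate mirroring the checks described above:
-- five sub-sequences, [1]/[2] equal in length, [4]/[5] equal in length.
function quotes_ok(object q)
    if atom(q) then
        return 0
    end if
    if length(q) != 5 then
        return 0
    end if
    for i = 1 to 5 do
        if atom(q[i]) then
            return 0                -- every element must be a sequence
        end if
    end for
    return length(q[1]) = length(q[2]) and length(q[4]) = length(q[5])
end function

? quotes_ok({"\"'", "\"'", "\\", "", ""})   -- prints 1
? quotes_ok({"([", ")", "", "", ""})        -- prints 0 (lengths differ)
? quotes_ok('|')                            -- prints 0 (an atom)
```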
> --no idea why to test for all these things
> > >
> > > if atom(pQuotes) then
> > > if pQuotes = 0 then
> > > pQuotes = {"\"'`", "\"'`", "\\~","",""}
this is just a way of requesting the default values. The user calls this
routine with a zero in this parameter.
> > > else
> > > pQuotes = {{pQuotes}, {pQuotes},{},{},{}}
this is just a shorthand way of saying that the user is only interested in
simple quote processing. They can call the routine like this...
Tokenize( string, ws, nw, '|' )
so that all characters in the string between vertical bars forms a 'word'.
> > > end if
> > > end if
>
> -no clue:)
>
> >
> >This is just validating and initializing the pQuotes group of data.
> > pQuotes[1] and [2] is a pair of character sets. [1] is the start quote
> >and
> >[2] is the matching end quote. All the characters inside the quotes is
> >considered to be a word.
>
> okay I understand what your saying here...Im just not getting how you made
> that happen with your code.
> >
> > pQuotes[3] is a list of characters that are 'escape' characters to be
> >used
> >inside quoted strings. For example...
>
>
> --so an escape character like ~ would be listed as a seperate sequence?
Yes.
> >
> > 'abc~'def'
> >
> >with pQuotes = { "'", "'", '~', ...}
> >
> >gives the word value as abc'def
> ok
> >
>
> --Overall I think I'll just use the simpler function. I'm really not able
> yet to understand this function. I do understand the simpler one though!!
>
> --Thanks for taking the time to explain, I'll probably be able to figure
> it out sometime:)
>
> --And I'll make sure to credit you wherever I use it (the SimpleTokenize,
> that is)
No problems.
----------------
cheers,
Derek Parnell