Re: help with storing user input


On Fri, 16 May 2003 00:29:49 +0000, Jason Dube <dubetyrant at hotmail.com> 
wrote:


Hi Jason,
may I be of assistance (as it is my humble code ...)

>
>
> -------------------------------------
> global function Tokenize(sequence pText, object pWhiteSpace,
>                          object pNonword, object pQuotes) --> sequence
> -------------------------------------
> -- pText is returned as a sequence of 'words'.
> -- Each word is delimited by a set of one or more Delimiters
>
>
> sequence lTokens
> integer lStartQuote, lEndQuote
> integer lTextLength
> integer lStart
> integer lPos
>
> -- Validate whitespace parameter
> if atom(pWhiteSpace) then
>     if pWhiteSpace = 0 then
>         pWhiteSpace = ' ' & 8 & 9 & 10 & 11 & 12 & 13
>     else
>         pWhiteSpace = {pWhiteSpace}
>     end if
> end if

Okay, let's start with this then.

The parameter definition of 'pWhiteSpace' is 'object', implying that the 
caller can use either an atom or a sequence. I allow both for a good 
reason. But before we look at that, realize that 'pWhiteSpace' is meant to 
represent a set of characters that can ALL be considered as "white space 
characters". Now back to the story...

  If pWhiteSpace was passed as an atom, then:
    If that value is zero, the caller wishes to use the 'default' set of 
white space characters. That is the set of characters represented by 
"' ' & 8 & 9 & 10 & 11 & 12 & 13", namely SPACE, BACKSPACE, TAB, LINEFEED, 
VERTICAL TAB, FORMFEED and CARRIAGE-RETURN.
    If the atom value passed is NOT zero, I just convert it to a 
one-element sequence by enclosing it in braces.

You see, what I want in the program is a sequence, but I allow people to 
call the routine a number of ways...

   -- Just use the SPACE character as delimiter.
   Tokenize("derek parnell Level11", ' ', ...

   -- Use the SPACE and TAB characters as delimiters.
   Tokenize("derek parnell Level11", " \t", ...

   -- Use the default characters as delimiters.
   Tokenize("derek parnell Level11", 0, ...


My validation of the parameter is not perfect because it allows people to 
pass floating point atoms and nested sequences - which I really do not 
want.
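
One way to close that gap, as a sketch only: a small check that accepts a flat sequence of integer character codes and nothing else. The name IsCharSet is hypothetical and is not part of the Tokenize() code quoted above; it relies only on the built-in atom(), integer() and length() routines.

```euphoria
-- Hypothetical helper, not part of the original Tokenize().
-- Returns 1 if pSet is a flat sequence of integer character codes,
-- otherwise 0.
function IsCharSet(object pSet)
    if atom(pSet) then
        return 0    -- a lone atom has not been wrapped in braces yet
    end if
    for i = 1 to length(pSet) do
        -- integer() is false for nested sequences and for fractional atoms
        if not integer(pSet[i]) then
            return 0
        end if
    end for
    return 1
end function
```

Called after the atom case has been handled, IsCharSet(pWhiteSpace) would let the routine refuse things like {"\t"} or {32, 9.5} before it starts scanning the text.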

> -- Validate non-word parameter
> if atom(pNonword) then
>     if pNonword = 0 then
>         pNonword = "`~!@#$%^&*()_-+={[}]|\\:;\"'<,>.?/"
>     else
>         pNonword = {pNonword}
>     end if
> end if

Ditto. The pNonword parameter is a set of characters that are definitely 
not found inside words. I allow people to supply their own non-word 
characters or to use the default ones.
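
Since both parameters go through the same three cases (zero, some other atom, or a sequence), the pattern could be pulled out into one routine. This is a minimal sketch under that assumption; NormalizeCharSet and its pDefault parameter are made-up names for illustration, not something the quoted code contains.

```euphoria
-- Hypothetical helper; the name and the pDefault parameter are illustrative.
-- Turns the caller's 'object' argument into a sequence of characters.
function NormalizeCharSet(object pSet, sequence pDefault)
    if atom(pSet) then
        if pSet = 0 then
            return pDefault     -- zero means "use the default set"
        else
            return {pSet}       -- a single character becomes a one-element set
        end if
    end if
    return pSet                 -- already a sequence; use it as given
end function
```

With something like this, the white space check would shrink to a single line such as pWhiteSpace = NormalizeCharSet(pWhiteSpace, ' ' & 8 & 9 & 10 & 11 & 12 & 13), and the non-word check would do the same with its own default string.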


-- 

cheers,
Derek Parnell
