OpenEuphoria: Forum: Strings in Euphoria (was: Stupid Newbie-sounding question.)

1. Strings in Euphoria (was: Stupid Newbie-sounding question.)

Posted by "Juergen Luethje" <j.lue at gmx.de> May 30, 2004
406 views

Nicholas Koceja wrote:

> Juergen Luethje wrote:
>>
>> Irv wrote:
>>
>>> Michelle wrote:
>>>
>>>> but the problem with that is that it prints "John" as a string of the ascii
>>>> codes,
>>>> instead of an easily editable "John"..
>>>> if i add if sequence() it counts "John" as a sequence, just like the
>>>> sequence
>>>> of flags..
>>>> so my question is...
>>>> how i can (within a loop) tell it to evaluate whether to puts or print?
>>>>
>>>> bah..told you it was a stupid newbie question
>>>>
>>>> Michelle Rogers

BTW: Michelle (and others), choosing a *meaningful* subject will help
     a lot. Thanks in advance.

>>> Not stupid at all.
>>> What the Eu docs call "flexible", I call a design flaw.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> Euphoria cannot tell the difference between "John" and {74,111,104,110}
>>> where the latter might be a series of integers or flags or whatever.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

>> [snip]
>>
>> Yes, like "integer" is a special type of atom, there definitely should
>> be a "string" datatype, which is a special type of sequence.
>>
>> At first glance, it might look as if an additional data type would
>> make Euphoria more complicated, but at second glance at the latest it
>> is clear that Euphoria would become *much* more flexible (see Michelle's
>> problem here), simpler, safer, and more readable.
>>
>> Regards,
>>    Juergen
>>
> Well, I can see your problem, and while I agree that adding "string" would
> do so, I must say something.  There is a way to tell whether it is a string
> or not.

The decisive sentence is Irv's, which I emphasized with two lines
of ++++++++++++ (see above).

> In your program, ""'s and {}'s mean the same thing,

No. Just try:

puts(1, "74,111,104,110")
puts(1, "\n")
puts(1, {74,111,104,110})

and you'll see the difference.

> so they recommend using ""'s for text (a string of typeable characters),
> and {}'s for other sequences.

I believe you mean that, in order to create a string, it's not necessary
to type all the ASCII numbers (e.g. {65,66,67,68,69,70,71}). This is
true, but:
<quote>
The Euphoria compiler will immediately convert "ABCDEFG" to the above
sequence of numbers. [...] A quoted string is really just a convenient
notation that saves you from having to type in all the ASCII codes.
</quote>
[Euphoria 2.4, refman_2.htm]

In other words, there is no "string" datatype in Euphoria, and that is
exactly the problem we are talking about.
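
To make the manual passage concrete, here is a minimal sketch (the
variable names a and b are just for this illustration). The quoted
string and the brace list end up as the same sequence; only the output
routine decides whether it is shown as text or as numbers:

```euphoria
-- Sketch: a quoted string and a list of ASCII codes are the same object.
sequence a, b
a = "John"
b = {74,111,104,110}

puts(1, a)        -- writes the elements as characters: John
puts(1, '\n')
print(1, a)       -- writes the elements as numbers: {74,111,104,110}
puts(1, '\n')
? equal(a, b)     -- 1, the two sequences are identical
```

Nothing stored in a or b records which notation was used to create it.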

> However, for a way to find out, you can make two types, like this:
>
> type char(object x)
>     if integer(x) then
>         if (x >= 'a' and x <= 'z') or        -- all lowercase characters
>            (x >= 'A' and x <= 'Z') or        -- all uppercase characters
>            (x >= '0' and x <= '9') or        -- all numeric characters
>            find(x, "?<>,./\\|[]{}`" &            --\ all of the other symbols
>                    "~!@#$%^&*()-=_" &            --- that you can type, even
>                    "+:\'\"\t\r\n" & {32}) then   --/ with "Shift" held down.
>             return 1 -- return true, since only one char. is checked.
>         end if
>     elsif sequence(x)    -- if x is not an integer

then           -- is required here

>         return (length(x) = 1 and char(x[1]))
>     end if
> end type
>
> global type string(sequence x)
>     for elem = 1 to length(x) do
>         if not char(x[elem]) then
>             return 0
>         end if
>     end for
>     return 1
> end type


You are adding much confusion to the discussion. I'll try to sort it
out ...

Firstly, your types are coded in bad style.
For instance, try the following piece of code in addition to your two
types, and see what happens when you run the code:

string str
str = {74,111,104,11.5}


Or try this:

string str
str = {74,3,104,110}
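
As far as I can tell, both test cases reach the end of the quoted
char() without hitting a "return" (11.5 is neither an integer nor a
sequence, and 3 is an integer that matches none of the listed
characters), which Euphoria reports as a run-time error instead of a
plain type-check failure. A minimal sketch of types that return a value
on every path could look like this. The names printable_char and
printable_string, and the choice of "ASCII 32 to 126 plus tab, CR and
LF", are my assumptions for the example, not the original poster's
definition:

```euphoria
-- Sketch only: every path through a type returns 0 or 1.
type printable_char(object x)
    if integer(x) then
        -- printable ASCII, plus tab, carriage return and newline
        return (x >= 32 and x <= 126) or find(x, "\t\r\n") != 0
    end if
    return 0    -- atoms that are not integers, and sequences
end type

type printable_string(object x)
    if atom(x) then
        return 0
    end if
    for i = 1 to length(x) do
        if not printable_char(x[i]) then
            return 0
        end if
    end for
    return 1
end type

-- A type can also be called like a function:
? printable_string({74,111,104,110})    -- 1
? printable_string({74,111,104,11.5})   -- 0
? printable_string({74,3,104,110})      -- 0
```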


Secondly, your type "string" does not define a string at all. So what is
a string? Talking only about Euphoria, this question cannot be answered,
because there are no strings in Euphoria.
But there are strings in other languages, for instance BASIC. I think a
string can be defined as a concatenation of characters. Nowadays, we have
different types of characters: characters consisting of only 1 byte,
Unicode characters (and maybe more).
Obviously, your type char() is meant to define characters consisting of
1 byte. But why do you think such a character could also be a sequence?
Also, ASCII 1, for example, is a character, but it is not covered by your
type char(). Maybe you are thinking of *printable* characters.

This is a traditional (not Unicode or something) string as I understand
it:

global type string (object x)
   object t

   if atom(x) then return 0 end if
   for i = 1 to length(x) do
      t = x[i]
      if (not integer(t)) or (t < 0)  or (t > #FF) then
         return 0
      end if
   end for
   return 1
end type


Thirdly, and that's the point here, using such a user-defined string
type doesn't solve our problem: there is no way to find out whether the
sequence {74,111,104,110} means "John", or the weights of the members of
my family, or whatever.

Every string has to be a sequence of special integers, but not every such
sequence is a string!
And currently, when you have a string, there is no built-in way to tell
Euphoria so. The only way is to add a "tag" to each variable to
indicate the type, as Irv pointed out.
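
A minimal sketch of such tagging, applied to the original puts-or-print
question. The {tag, value} layout and the names TAG, VALUE, T_STRING,
T_DATA and show() are hypothetical, chosen only for this example, and
not necessarily what Irv had in mind:

```euphoria
-- Sketch: store each field as {tag, value} so the program, not
-- Euphoria, remembers whether the value is meant as text.
constant TAG = 1, VALUE = 2          -- positions inside a tagged field
constant T_STRING = 1, T_DATA = 2    -- the tags themselves

procedure show(sequence item)
    if item[TAG] = T_STRING then
        puts(1, item[VALUE])         -- text: write it as characters
    else
        print(1, item[VALUE])        -- anything else: write it as numbers
    end if
    puts(1, '\n')
end procedure

sequence fields
fields = {{T_STRING, "John"},
          {T_DATA,   {74,111,104,110}}}

for i = 1 to length(fields) do
    show(fields[i])
end for
-- Output:
-- John
-- {74,111,104,110}
```

The price is that every piece of code touching the data has to know
about the tags, which is exactly the bookkeeping a built-in string type
would make unnecessary.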

HTH,
   Juergen

-- 
 /"\  ASCII ribbon campain  | "Everything should be made as simple
 \ /  against HTML in       |  as possible, but not simpler."
  X   e-mail and news,      |
 / \  and unneeded MIME     | [Albert Einstein]


2. Re: Strings in Euphoria (was: Stupid Newbie-sounding question.)

Posted by Nickofurr at aol.com May 31, 2004
398 views

In a message dated 5/30/2004 5:01:30 PM Eastern Standard Time, j.lue at gmx.de
writes:

>> In your program, ""'s and {}'s mean the same thing,
>
> No. Just try:
>
> puts(1, "74,111,104,110")
> puts(1, "\n")
> puts(1, {74,111,104,110})
>
> and you'll see the difference.

What I meant here was that {} = ""

>> so they recommend using ""'s for text (a string of typeable characters),
>> and {}'s for other sequences.
>
> I believe you mean, that in order to create a string, it's not necessary
> to type all the ASCII numbers (e.g. {65,66,67,68,69,70,71}). This is
> true, but:
> <quote>
> The Euphoria compiler will immediately convert "ABCDEFG" to the above
> sequence of numbers. [...] A quoted string is really just a convenient
> notation that saves you from having to type in all the ASCII codes.
> </quote>
> [Euphoria 2.4, refman_2.htm]
>
> That means in other words, that there is no "string" datatype in
> Euphoria, and that's actually the problem that we are talking about.

You are right about this.  However, what I meant was that ""s denote
sequences as well as text.  For example:

sequence text
text = "abcdefg"
puts(1, text)



Signed: Nickofurr
Anyone who catches someone copying this will be smiled upon.


