1. Strings in Euphoria (was: Stupid Newbie-sounding question.)
- Posted by "Juergen Luethje" <j.lue at gmx.de> May 30, 2004
- 406 views
Nicholas Koceja wrote:
> Juergen Luethje wrote:
>>
>> Irv wrote:
>>
>>> Michelle wrote:
>>>
>>>> but the problem with that is that it prints "John" as a string of the ascii
>>>> codes,
>>>> instead of an easily editable "John"..
>>>> if i add if sequence() it counts "John" as a sequence, just like the
>>>> sequence
>>>> of flags..
>>>> so my question is...
>>>> how i can (within a loop) tell it to evaluate whether to puts or print?
>>>>
>>>> bah..told you it was a stupid newbie question
>>>>
>>>> Michelle Rogers

BTW: Michelle (and others), choosing a *meaningful* subject will help a lot.
Thanks in advance.

>>> Not stupid at all.
>>> What the Eu docs call "flexible", I call a design flaw.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> Euphoria cannot tell the difference between "John" and {74,111,104,110}
>>> where the latter might be a series of integers or flags or whatever.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

>> [snip]
>>
>> Yes, like "integer" is a special type of atom, there definitely should
>> be a "string" datatype, which is a special type of sequence.
>>
>> At first glance, it might look as if an additional data type would
>> make Euphoria more complicated, but at the latest on second glance it
>> is clear that Euphoria would become *much* more flexible (see Michelle's
>> problem here), simpler, safer, and more readable.
>>
>> Regards,
>> Juergen
>>
> Well, I can see your problem, and I agree that adding "string" would
> do so, but I must say something. There is a way to tell if it is a string or
> not.

The decisive sentence is the one by Irv, which I emphasized with two lines of
++++++++++++ (see above).

> In your program, ""'s and {},s mean the same thing,

No. Just try:
<eucode>
puts(1, "74,111,104,110")
puts(1, "\n")
puts(1, {74,111,104,110})
</eucode>
and you'll see the difference.

> so they recommend using ""'s for text (a string of typeable characters),
> and {}'s for other sequences.

I believe you mean that, in order to create a string, it's not necessary
to type all the ASCII numbers (e.g. {65,66,67,68,69,70,71}). This is
true, but:
<quote>
The Euphoria compiler will immediately convert "ABCDEFG" to the above
sequence of numbers. [...] A quoted string is really just a convenient
notation that saves you from having to type in all the ASCII codes.
</quote>
[Euphoria 2.4, refman_2.htm]

In other words, there is no "string" datatype in
Euphoria, and that's actually the problem that we are talking about.

> However, for a way to find out, you can make two types, like this:
> <eucode>
> type char(object x)
>     if integer(x) then
>         if (x >= 'a' and x <= 'z') or  -- all lowercase characters
>            (x >= 'A' and x <= 'Z') or  -- all uppercase characters
>            (x >= '0' and x <= '9') or  -- all numerical characters
>            find(x, "?<>,./\\|[]{}`" &  --\  all of the other symbols
>                    "~!@#$%^&*()-=_" &  --- that you can type, even
>                    "+:\'\"\t\r\n" & {32}) then --/ with "Shift" held down.
>             return 1 -- return true, since only one char. is checked.
>         end if
>     elsif sequence(x) -- if x is not an integer

then -- is required here

>         return (length(x) = 1 and char(x[1]))
>     end if
> end type
>
> global type string(sequence x)
>     for elem = 1 to length(x) do
>         if not char(x[elem]) then
>             return 0
>         end if
>     end for
>     return 1
> end type
> </eucode>

You are adding much confusion to the discussion. I'll try to sort it
out ...

Firstly, your types are coded in bad style.
For instance, try the following piece of code in addition to your two
types, and see what happens when you run the code:
<eucode>
string str
str = {74,111,104,11.5}
</eucode>
Or try this:
<eucode>
string str
str = {74,3,104,110}
</eucode>
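Both assignments appear to trip over the same gap in the quoted char(): for 11.5 (neither an integer nor a sequence) and for 3 (an integer that is not in the list), no `return` statement is reached, and as far as I know Euphoria treats leaving a function or type without returning a value as a run-time error rather than as an ordinary type-check failure. A minimal sketch of a variant that returns a value on every path follows. The name printable_char is hypothetical, and the accepted range (codes 32 to 126 plus tab, CR and LF) is an assumption about what "typeable" was meant to cover, not Nicholas's exact list.

<eucode>
-- Hypothetical name; a sketch only, not code from either poster.
type printable_char(object x)
    if not integer(x) then
        return 0                     -- fractions and sequences are rejected
    end if
    if x >= 32 and x <= 126 then
        return 1                     -- space through '~'
    end if
    return find(x, "\t\r\n") != 0    -- tab, CR and LF are accepted, all else rejected
end type

printable_char c
c = 'J'                              -- accepted
</eucode>

With a type written this way, assigning 3 or 11.5 should produce a normal type-check failure that names the variable, instead of an error raised inside the type itself.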
Secondly, your type "string" does not define a string at all. So what is
a string? Talking only about Euphoria, this question cannot be answered,
because there are no strings in Euphoria.
But there are strings in other languages, for instance BASIC. I think a
string can be defined as a concatenation of characters. Nowadays, we have
different types of characters: characters consisting of only 1 byte,
Unicode characters (and maybe more).
Obviously, your type char() is aimed at defining characters consisting of
1 byte. But why do you think such a character could also be a sequence??
Also, e.g. ASCII 1 is a character, but is not covered by your type char().
Maybe you are thinking of *printable* characters.

This is a traditional (not Unicode or something) string as I understand
it:
<eucode>
global type string (object x)
   object t

   if atom(x) then return 0 end if
   for i = 1 to length(x) do
      t = x[i]
      if (not integer(t)) or (t < 0) or (t > #FF) then
         return 0
      end if
   end for
   return 1
end type
</eucode>
Thirdly, and that's the point here, using such a user-defined string
type doesn't solve our problem: there is no way to find out whether the
sequence {74,111,104,110} means "John", or the weights of the members of
my family, or whatever.

Any string has to be a sequence of special integers, but not every such
sequence is a string!
And currently, when you have a string, there is no built-in way to tell
Euphoria so. The only way is to add a "tag" to each variable to
indicate the type, as Irv pointed out.

HTH,
   Juergen

-- 
 /"\  ASCII ribbon campaign  |  "Everything should be made as simple
 \ /  against HTML in        |   as possible, but not simpler."
  X   e-mail and news,       |
 / \  and unneeded MIME      |  [Albert Einstein]
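The tagging idea mentioned above could look roughly like this. It is only an illustration: the constants TAG_TEXT and TAG_DATA, the variable names, and the {tag, value} layout are invented for this sketch and are not taken from Irv's post. Each field carries its own tag, so a loop can decide between puts() and print() without inspecting the data itself.

<eucode>
-- Hypothetical tags; any two distinct values would do.
constant TAG_TEXT = 1,
         TAG_DATA = 2

-- Every field is stored as a pair: {tag, value}.
sequence person, field
person = {
    {TAG_TEXT, "John"},              -- meant as text
    {TAG_DATA, {74,111,104,110}}     -- same elements, meant as numbers
}

for i = 1 to length(person) do
    field = person[i]
    if field[1] = TAG_TEXT then
        puts(1, field[2])            -- writes: John
    else
        print(1, field[2])           -- writes: {74,111,104,110}
    end if
    puts(1, '\n')
end for
</eucode>

The cost of this design is that every piece of code touching the data has to create and respect the tags; nothing in the language enforces them.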
2. Re: Strings in Euphoria (was: Stupid Newbie-sounding question.)
- Posted by Nickofurr at aol.com May 31, 2004
- 398 views
In a message dated 5/30/2004 5:01:30 PM Eastern Standard Time, j.lue at gmx.de writes:

>> In your program, ""'s and {},s mean the same thing,
>
> No. Just try:
> <eucode>
> puts(1, "74,111,104,110")
> puts(1, "\n")
> puts(1, {74,111,104,110})
> </eucode>
> and you'll see the difference.

What I meant here was that {} = "".

>> so they recommend using ""'s for text (a string of typeable characters),
>> and {}'s for other sequences.
>
> I believe you mean that, in order to create a string, it's not necessary
> to type all the ASCII numbers (e.g. {65,66,67,68,69,70,71}). This is
> true, but:
> <quote>
> The Euphoria compiler will immediately convert "ABCDEFG" to the above
> sequence of numbers. [...] A quoted string is really just a convenient
> notation that saves you from having to type in all the ASCII codes.
> </quote>
> [Euphoria 2.4, refman_2.htm]
>
> In other words, there is no "string" datatype in
> Euphoria, and that's actually the problem that we are talking about.

You are right on this point. However, what I meant was that ""s denote sequences as well as text. For example:
<eucode>
sequence text
text = "abcdefg"
puts(1, text)
</eucode>
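That a quoted literal and the matching list of character codes are one and the same value can be checked directly. A minimal sketch, assuming the built-in equal() is available in the Euphoria version being used; the variable name is only for illustration:

<eucode>
sequence text
text = "John"

? equal(text, {74,111,104,110})   -- prints 1: the two values are identical
puts(1, text)                     -- writes the elements as characters: John
puts(1, '\n')
print(1, text)                    -- writes the elements as numbers: {74,111,104,110}
puts(1, '\n')
</eucode>

Which of the two outputs is wanted is therefore a decision the program has to make; the value itself does not record it.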
Signed: Nickofurr