Strings in Euphoria (was: Stupid Newbie-sounding question.)
- Posted by "Juergen Luethje" <j.lue at gmx.de> May 30, 2004
Nicholas Koceja wrote:
> Juergen Luethje wrote:
>>
>> Irv wrote:
>>
>>> Michelle wrote:
>>>
>>>> but the problem with that is that it prints "John" as a string of the ascii
>>>> codes,
>>>> instead of an easily editable "John"..
>>>> if i add if sequence() it counts "John" as a sequence, just like the
>>>> sequence
>>>> of flags..
>>>> so my question is...
>>>> how i can (within a loop) tell it to evaluate whether to puts or print?
>>>>
>>>> bah..told you it was a stupid newbie question
>>>>
>>>> Michelle Rogers

BTW: Michelle (and others), choosing a *meaningful* subject will help
a lot. Thanks in advance.

>>> Not stupid at all.
>>> What the Eu docs call "flexible", I call a design flaw.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> Euphoria cannot tell the difference between "John" and {74,111,104,110}
>>> where the latter might be a series of integers or flags or whatever.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> [snip]
>>
>> Yes, like "integer" is a special type of atom, there definitely should
>> be a "string" datatype, which is a special type of sequence.
>>
>> At the first glance, it might look as if an additional data type would
>> make Euphoria more complicated, but (latest) at the second glance it
>> is clear that Euphoria would become *much* more flexible (see Michelle's
>> problem here), simpler, safer, and better readable.
>>
>> Regards,
>> Juergen
>>
> Well, I can see your problem, and I agree that that adding "string" would
> do so, I must say something. There is a way to tell if it is a string or
> not.

The decisive sentence is the one by Irv, that I emphasized by two lines
of ++++++++++++ (see above).

> In your program, ""'s and {},s mean the same thing,

No. Just try:
puts(1, "74,111,104,110")
puts(1, "\n")
puts(1, {74,111,104,110})
and you'll see the difference.

> so they reccomend using ""'s for text(a string of typeable characters),
> and {}'s for other sequences.

I believe you mean that, in order to create a string, it's not necessary
to type all the ASCII numbers (e.g. {65,66,67,68,69,70,71}). This is
true, but:

<quote>
The Euphoria compiler will immediately convert "ABCDEFG" to the above
sequence of numbers. [...] A quoted string is really just a convenient
notation that saves you from having to type in all the ASCII codes.
</quote>
[Euphoria 2.4, refman_2.htm]

In other words, there is no "string" datatype in Euphoria, and that is
exactly the problem we are talking about.

> However, for a way to find out, you can make two types, like this:
> type char(object x)
>    if integer(x) then
>       if (x >= 'a' and x <= 'z') or   -- all lowercase characters
>          (x >= 'A' and x <= 'Z') or   -- all uppercase characters
>          (x >= '0' and x <= '9') or   -- all numericial characters
>          find(x, "?<>,./\\|[]{}`" &   --\ all of the other symbols
>                  "~!@#$%^&*()-=_" &   --- that you can type, even
>                  "+:\'\"\t\r\n" & {32}) then --/ with "Shift" held down.
>          return 1 -- return true, since only one char. is checked.
>       end if
>    elsif sequence(x)   -- if x is not an integer
then  -- is required here
>       return (length(x) = 1 and char(x[1]))
>    end if
> end type
>
> global type string(sequence x)
>    for elem = 1 to length(x) do
>       if not char(x[elem]) then
>          return 0
>       end if
>    end for
>    return 1
> end type

You are adding much confusion to the discussion. I'll try to sort it
out ...

Firstly, your types are coded in bad style. For instance, try the
following piece of code in addition to your two types, and see what
happens when you run the code:
string str
str = {74,111,104,11.5}
Or try this:
string str
str = {74,3,104,110}
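As far as I can tell, both assignments fail for the same reason: for the values 11.5 and 3, the quoted char() type reaches "end type" without ever executing a return statement, and Euphoria treats that as a run-time error. A minimal sketch of a version that returns a value on every path might look like the code below. The name printable_string and the choice of "printable ASCII 32..126 plus tab, CR and LF" are assumptions made for illustration; they only roughly match the character list in the quoted type.

```euphoria
-- Sketch only: a char() type that returns a value on every path.
-- Assumption: a "char" is a printable ASCII code (32..126) or one of
-- tab, carriage return, line feed.
type char(object x)
    if integer(x) then
        if (x >= 32 and x <= 126) or find(x, "\t\r\n") then
            return 1
        end if
    end if
    return 0   -- anything else is explicitly "not a char"
end type

-- Hypothetical name, chosen to avoid confusion with a general "string".
type printable_string(object x)
    if atom(x) then
        return 0
    end if
    for i = 1 to length(x) do
        if not char(x[i]) then
            return 0
        end if
    end for
    return 1
end type

-- A type can be called like a function, so the checks can be tried
-- without triggering a type-check failure:
? printable_string({74,111,104,110})   -- prints 1
? printable_string({74,111,104,11.5})  -- prints 0 (11.5 is not an integer)
? printable_string({74,3,104,110})     -- prints 0 (3 is not printable)
```

With these definitions, assigning either of the two test sequences above to a variable declared as printable_string should produce an ordinary type-check failure rather than an error inside the type routine itself.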
Secondly, your type "string" does not define a string at all.
So what is a string? Talking only about Euphoria, this question cannot
be answered, because there are no strings in Euphoria. But there are
strings in other languages, for instance BASIC. I think a string can be
defined as a concatenation of characters. Nowadays, we have different
types of characters: characters consisting of only 1 byte, Unicode
characters (and maybe more).

Obviously, your type char() is meant to define characters consisting of
1 byte. But why do you think such a character could also be a sequence??
Also, e.g. ASCII 1 is a character, but is not covered by your type
char(). Maybe you are thinking of *printable* characters.

This is a traditional (not Unicode or something) string as I understand
it:
global type string (object x)
   object t

   if atom(x) then
      return 0
   end if

   for i = 1 to length(x) do
      t = x[i]
      if (not integer(t)) or (t < 0) or (t > #FF) then
         return 0
      end if
   end for
   return 1
end type
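To see what this type does and does not tell us, it can be called like a function. The following is only a sketch: the type is repeated (without "global") so the snippet runs on its own, and the variable names name and weights are hypothetical example data.

```euphoria
-- The byte-string type from this post, repeated so the snippet is
-- self-contained.
type string(object x)
    object t

    if atom(x) then
        return 0
    end if

    for i = 1 to length(x) do
        t = x[i]
        if (not integer(t)) or (t < 0) or (t > #FF) then
            return 0
        end if
    end for
    return 1
end type

sequence name, weights      -- hypothetical example data
name    = "John"
weights = {74,111,104,110}

? string(name)              -- prints 1
? string(weights)           -- prints 1 as well
? equal(name, weights)      -- prints 1: the two values are identical
? string({74, 300})         -- prints 0: 300 is outside 0..#FF
? string(65)                -- prints 0: an atom is not a sequence
```

The type accepts both variables, and equal() confirms that they hold exactly the same value, so the check says nothing about what the data is meant to represent.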
Thirdly, and that's the point here, using such a user-defined string
type doesn't solve our problem: There is no way to find out whether
the sequence {74,111,104,110} means "John", or the weights of the
members of my family, or whatever. Any string has to be a sequence of
special integers, but not every such sequence is a string! And currently,
when you have a string, there is no built-in way to tell Euphoria so.
The only way is to add a "tag" to each variable to indicate the type,
as Irv pointed out.

HTH,
   Juergen

-- 
 /"\  ASCII ribbon campaign  |  "Everything should be made as simple
 \ /  against HTML in        |   as possible, but not simpler."
  X   e-mail and news,       |
 / \  and unneeded MIME      |  [Albert Einstein]