1. String?
- Posted by Rolf Schröder <rolf at rschr.de> May 31, 2004
- 721 views
Hi 'string fans'!

As far as I know, a character is a byte that represents a human-readable or printable symbol. A character string (synonym: string) is a series of characters, i.e. a series of bytes representing human-readable|printable symbols (words, sentences, ...).

Is it important to differentiate between a general byte series (#00 to #FF) and a 'string'?

1) If there are 256 readable|printable symbols assigned to the numbers #00 to #FF, then it's impossible to decide whether you have a 'string' or not!

2) If you declare at least one byte not to be a readable|printable symbol, then you may declare any byte series of this kind a 'string', in contrast to a general byte series, which may contain any byte between #00 and #FF. In C, for example, #00 is assumed to be such a byte, and therefore a byte series ending with the byte #00 is declared to be this kind of string (null-terminated string). This makes sense only for specially written 'string handling routines' (stringcmp(), printf(), ...), nothing else.

3) Since I know what I would like to read|write|print, Euphoria gives you the opportunity to decide what you would like to handle as a 'string' and what not. In practice I don't see any need for a so-called string type; it makes no real sense. However, if you believe you need one, then use a type function similar to the one Nicholas Koceja gave as an example.

Do you really think a string type makes sense in Euphoria? I don't!

--
Dr. Rolf Schröder
Hamburg, Deutschland
mailto:Rolf at RSchr.de
http://www.rschr.de
2. Re: String?
- Posted by "Juergen Luethje" <j.lue at gmx.de> May 31, 2004
- 671 views
Rolf wrote:

> Hi 'string fans'!
>
> As I know, a character is a byte that represents a human readable or
> printable symbol. A character string (synonymous: string) is a series of
> characters. i.e., a series of bytes representing human readable|printable
> symbols (words, sentences,...).

Again: I have never heard or read that the definition of "character" or "string" depends on whether or not something is printable. In BASIC, for example, this is clearly *not* the case. You might also want to look here:

http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?characters
http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?string

Of course, under certain circumstances it is useful to know whether or not a given string is printable, but that has nothing to do with its *definition*.

> Is it important to differentiate between a general byte series (#00 to #FF)
> and a 'string'?

No, this is the same.

> 1) If there are 256 readable|printable symbols assigned to the
> numbers #00 to #FF, then it's impossible to decide, if you have a
> 'string' or not!
>
> 2) If you declare at least one byte not to be a readable|printable
> symbol, then you may declare any byte series of this type as a 'string' in
> comparison to a generally byte series, which may contain any byte between
> #00 and #FF. In C, i.e., #00 is assumed to be such a byte, and therefore
> a byte series ending with the byte #00 is declared as such a type of
> string (Null terminated string). This makes sense only for specially
> written 'string handling routines' (stringcmp(), printf(),...), nothing
> else.
>
> 3) For I know what I would like to read|write|print, Euphoria gives you the
> opportunity to decide, what you would like to handle as a 'string' or not.
> In practice I don't see any necessity to have a so called string type, it
> makes no real sense. However, if you believe you need it, then use a
> type function similar like that, what Nicholas Koceja has given as an
> example.

Like him, you are missing the point.

Using such a user-defined string type doesn't solve the problem: if a Euphoria program reads, say, {74,111,104,110} from a file, there is no way to find out whether this sequence means "John", or the weights of the members of my family, or whatever. Surely any string has to be a sequence of special integers (which can be checked by such a user-defined type), but not every such sequence is a string!

Regards,
Juergen
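Juergen's objection can be sketched in a few lines of Euphoria. This is only an illustration, not the type Nicholas Koceja actually posted: the name byte_string and the 0..255 range are assumptions made for the example. The check accepts the weights just as readily as the name, because the two values are identical.

```euphoria
-- Hypothetical user-defined type (name and range are assumptions):
-- a flat sequence whose elements are all integers from 0 to 255.
type byte_string(object x)
    if not sequence(x) then
        return 0
    end if
    for i = 1 to length(x) do
        if not integer(x[i]) then
            return 0
        end if
        if x[i] < 0 or x[i] > 255 then
            return 0
        end if
    end for
    return 1
end type

sequence weights, name
weights = {74,111,104,110}
name = "John"

printf(1, "%d\n", byte_string(weights))   -- 1: the weights pass the check
printf(1, "%d\n", byte_string(name))      -- 1: so does the name
printf(1, "%d\n", equal(weights, name))   -- 1: the values are indistinguishable
```

The type can say "this could be text", but it cannot say "this was meant as text", which is the information the thread is arguing about.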
3. Re: String?
- Posted by irv mullins <irvm at ellijay.com> May 31, 2004
- 652 views
Rolf Schröder wrote:

> 3) For I know what I would like to read|write|print, Euphoria gives you the
> opportunity to decide, what you would like to handle as a 'string' or not.
> In practice I don't see any necessity to have a so called string type, it
> makes no real sense. However, if you believe you need it, then use a
> type function similar like that, what Nicholas Koceja has given as an
> example.

This is an 'opportunity' not unlike our recently enjoyed 'opportunity' to pay income taxes. It obligates me to do lots of extra work, costing me time and money, and I seldom if ever see any benefits.

The way I see it, if in my program I declare

"This is a sequence of human-readable characters\n",

then clearly it was intended to be a sequence of human-readable characters, and Euphoria should be smart enough to *remember* that for at least a few minutes, so that later, when I want to display that sequence, Euphoria will do so correctly.

If I had intended it to be {84,104,105,115,32,105,115,32,97,32,115... (perhaps a list of ages or weights or something), then I would have entered it as {84,104,105,115,32,105,115,32,97,32,115... wouldn't I?

In the rare instance where someone might want to display the ASCII equivalents, or do "math" on that sequence, then *that* is where the programmer should have to go to extra lengths to coerce the data into some other form. Not every single time he uses it.

> Do you really think a string type makes sense in Euphoria? I don't!

Absolutely. When I first started programming, computers were primarily for crunching numbers, and text was only a secondary concern. That day is long past.

Irv
4. Re: String?
- Posted by Rolf Schröder <rolf at rschr.de> May 31, 2004
- 679 views
Juergen Luethje wrote:

> Rolf wrote:
> ...
> > As I know, a character is a byte that represents a human readable or
> > printable symbol. A character string (synonymous: string) is a series of
> > characters. i.e., a series of bytes representing human readable|printable
> > symbols (words, sentences,...).
>
> Again: I never heard or read, that the definition of "character" or
> "string" depends on the question, whether or not something is printable.
> E.g. in BASIC, this is clearly *not* the case. You might also want to
> look here:
> http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?characters
> http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?string

Jürgen, what I see stated there is just what I said, maybe in different words. I got this (computer-related) definition from the "Dictionary of Computer Terms" (Webster's) and also from the "Computer & Internet Dictionary" (Random House). What is a character for you, then (computer-related)?

Later you wrote:

> Like him, you are missing the point.
> Using such a user-defined string type doesn't solve the problem: If a
> Euphoria program reads e.g. {74,111,104,110} from a file, there is no way
> to find out, whether this sequence means "John", or the weight of the
> members of my family, or whatever.

That's true, especially if a fifth byte were a zero!

Excuse me, but now I think YOU are missing the point: whether you want to print it as an ASCII string or simply print the numbers is decided by the 'tool' YOU select. The format {%s} in printf() gives you the text, and, for example, the format {%d,%d,%d,%d} in printf() would give you the plain numbers.

Sincerely,
Rolf
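Rolf's printf() argument, spelled out as a minimal sketch. The variable name data is made up for the example; printf(), print() and puts() are standard Euphoria routines. The same four numbers come out as text or as numbers depending only on the format chosen by the programmer.

```euphoria
sequence data
data = {74,111,104,110}

-- %s: the argument list holds one item (the whole sequence), shown as text.
-- The extra braces matter: without them only the first element is used.
printf(1, "%s\n", {data})            -- John

-- %d four times: each element is consumed as a separate number
printf(1, "%d,%d,%d,%d\n", data)     -- 74,111,104,110

-- print() has no format to consult, so it always shows the numbers
print(1, data)                       -- {74,111,104,110}
puts(1, '\n')
```

Note that this only works where the programmer gets to choose the routine; the objection raised later in the thread is that trace() and ex.err offer no such choice.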
5. Re: String?
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> May 31, 2004
- 697 views
On Mon, 31 May 2004 09:33:25 -0700, Rolf Schröder <guest at RapidEuphoria.com> wrote:

>Excuse me, but now I think YOU are missing the point: the decision, if you
>want to print it as an ASCII string or if you want to print simply the
>numbers, the decision comes by selection the 'tool' YOU select: format {%s}
>in printf() gives you the text, and i.e. format {%d,%d,%d,%d} in printf
>would give you the plain numbers.

Hi Rolf,

OK, I agree that in more than 90% of print cases the programmer can easily apply the correct format info. However, slightly restating the previous example:

sequence weights
weights={74,111,104,110}
sequence name
name="John"

There is additional meaning, obvious to anyone reading the source, which is lost in the assignment. Since equal(weights,name) will return true, any attempt at an "IsString" function is doomed to get one of them wrong.

Sure, you can do something like:

constant tInt=1, tFlt=2, tSeq=3, tStr=4

weights={tSeq,{74,111,104,110}}
name={tStr,"John"}

Which I think is about the easiest way to preserve the semantic information. Not exactly nice though, is it?

There may not be a whole lot that a string type will allow that you cannot possibly do without. But to imply it has no merit is silly.

Adding strings might more than double the program size and probably make everything 50% slower, so I could accept an argument against it on technical grounds.

But being able to read values in the trace window, in ex.err, and in the output from ?weights and ?name is an overwhelming argument in favour.

Of course, you may actually be the second person on the planet who likes to see name (and weights) appear in the trace window as {74J,111o,104h,110n}?

That definitely falls into the class of Necessary Evil, not the realm of Good Ideas.

Regards,
Pete
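A small sketch of how Pete's tagged pairs could be consumed. The tag constants are taken from his post; the procedure name show is hypothetical and only illustrates the idea that the tag, not the data, decides how a value is displayed.

```euphoria
constant tSeq = 3, tStr = 4   -- tag values as in Pete's post

-- Hypothetical helper: display a {tag, value} pair according to its tag.
procedure show(sequence tagged)
    if tagged[1] = tStr then
        puts(1, tagged[2])    -- tagged as text: output the characters
    else
        print(1, tagged[2])   -- anything else: output the numbers
    end if
    puts(1, '\n')
end procedure

show({tSeq, {74,111,104,110}})   -- {74,111,104,110}
show({tStr, "John"})             -- John
```

The cost Pete points out is visible here too: every value has to be wrapped and unwrapped by hand, and the interpreter's own diagnostics still know nothing about the tag.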
6. Re: String?
- Posted by Nicholas Koceja <Nickofurr at aol.com> May 31, 2004
- 690 views
Juergen Luethje wrote:

<snip>

> Like him, you are missing the point.
> Using such a user-defined string type doesn't solve the problem: If a
> Euphoria program reads e.g. {74,111,104,110} from a file, there is no way
> to find out, whether this sequence means "John", or the weight of the
> members of my family, or whatever.

First off, why should the program differentiate between these two? Take this example, the "key.ex" program:
-- find out what numeric key code is generated by any key on the keyboard
-- usage:
--     ex key

integer code

puts(1, "Press any key. I'll show you the key code. Press q to quit\n\n")
while 1 do
    code = get_key()
    if code != -1 then
        printf(1, "The key code is: %d\n", code)  -- the key statement (bold in the original post)
        if code = 'q' then
            exit
        end if
    end if
end while
The program treats code as an 'integer'. With a string type, you could NOT have code as an integer; it would have to be an OBJECT, since a character could be an integer or a string. The key statement is the printf() line, shown in bold in the post. If there were a string type, this statement would cause an error. Furthermore, if you defined double quotes as a "string", then the '%d' would cause the error, since a "string" cannot have a decimal value. If you make it so that a "string" can have one, you would be destroying what you tried to prove in the first place. Take a look at this:
sequence string, seq
atom at

string = ""
seq = {}
procedure same(object a)
    if equal(string, a) then
        puts(1, "TRUE\n\n")
    else
        puts(1, "FALSE\n\n")
    end if
end procedure

puts(1, "\"\" = {}\n")
same(seq)
puts(1, "\"abcd\" = {\'a\',\'b\',\'c\',\'d\'}\n")
string = "abcd"
seq = {'a','b','c','d'}
same(seq)
puts(1, "\"a\" = \'a\'\n")
string = "a"
at = 'a'
same(at)
while get_key() = -1 do
end while
This may make you think that there is no difference. However, the program will output the following, then wait for a key to be pressed before exiting:

"" = {}
TRUE

"abcd" = {'a','b','c','d'}
TRUE

"a" = 'a'
FALSE

An atom, 'a', would not be possible if quotes meant strings. To tell you the honest truth, a "string" type would be pointless, and would make Euphoria a tad slower. If you were to include a string type, you would have to change how Euphoria looks at sequences, and at "strings", for that matter.

As you all know, Euphoria is a fast and easy to understand (not always easy to learn) programming language. As it states in "ed.doc", Euphoria uses pure numbers. The smallest unit is an "atom". Atoms and sequences are the only two explicitly defined types. An integer is a type of atom, and an object can be a sequence as well as an atom.

What I think we need to answer is this question: "What exactly is a string?" If a string is defined as in BASIC, "a string of letters or numbers", then yes, we could benefit from it. However, if a string is defined as a sequence of characters, then ANY sequence of positive integers would fit the description. Although the character page only goes to 255, you can "print" ANY positive integer. Try it yourself. Therefore, whether a string type should be included or not depends entirely on the answer to the question: "What exactly defines a string?"

> Surely any string has to be a sequence of special integers (which can be
> checked by such a user-defined type), but not any such sequence is a
> string!
>
> Regards,
> Juergen

Programs Incomplete: 20-30 | Operating System: Windows XP
7. Re: String?
- Posted by Derek Parnell <ddparnell at bigpond.com> Jun 01, 2004
- 637 views
Rolf Schröder wrote:

> Hi 'string fans'!
>
> As I know, a character is a byte that represents a human readable or
> printable symbol. A character string (synonymous: string) is a series of
> characters. i.e., a series of bytes representing human readable|printable
> symbols (words, sentences,...).

Well, that's one interpretation. Another is that a character is any value in an encoding set, such as ASCII, EBCDIC, or Unicode. Each character in the set has a unique value and may have a glyph (a displayable representation). Not all characters are displayable. Some characters have the same glyph.

> Is it important to differentiate between a general byte series (#00 to #FF)
> and a 'string'?

In some sets, not all character values can be contained in a single byte.

> 1) If there are 256 readable|printable symbols assigned to the
> numbers #00 to #FF, then it's impossible to decide, if you have a
> 'string' or not!
>
> 2) If you declare at least one byte not to be a readable|printable
> symbol, then you may declare any byte series of this type as a 'string' in
> comparison to a generally byte series, which may contain any byte between
> #00 and #FF. In C, i.e., #00 is assumed to be such a byte, and therefore
> a byte series ending with the byte #00 is declared as such a type of
> string (Null terminated string). This makes sense only for specially
> written 'string handling routines' (stringcmp(), printf(),...), nothing
> else.
>
> 3) For I know what I would like to read|write|print, Euphoria gives you the
> opportunity to decide, what you would like to handle as a 'string' or not.
> In practice I don't see any necessity to have a so called string type, it
> makes no real sense. However, if you believe you need it, then use a
> type function similar like that, what Nicholas Koceja has given as an
> example.

Well, that's one way of looking at things, but it's not generic enough.

> Do you really think a string type makes sense in Euphoria? I don't!

It all depends...

Everything depends on interpretation. An ATOM is just a set of bytes in RAM that Euphoria has been instructed to interpret in a specific manner. So are the INTEGER and SEQUENCE types. These are also just sets of bytes that are interpreted by Euphoria in a specific and documented manner.

If Euphoria were to have a string type, it would be the same deal. It would just be the coder telling Euphoria to interpret a set of bytes in a specific manner. The difficulty is deciding what the "specific manner" would be.

For example, we might decide that a string is really a restricted form of sequence: one that is only allowed to contain 32-bit unsigned integers that are interpreted as UTF-32 Unicode characters. In reality, they would still be a set of bytes in RAM, but now we would have a specific and documented interpretation of them. Maybe we could choose UTF-8 encoding instead, to save RAM at the cost of extra processing time.

What would be the advantage of this? Well, it would mean that Euphoria would be able to trap assignments of non-Unicode characters to string elements (characters?). Sure, this can be done now with the 'type' system, but a built-in method that is consistent, faster, and automatic is better than the generic 'type' method.

It would also mean that other built-in and library routines could perform processing more relevant to the data, such as displaying the value in string notation, "John", rather than as numbers. If we needed to see numbers, we could always assign a string to a sequence (like we can assign an integer to an atom).

It may also be argued that a string type might lead to fewer bugs in some applications, less time spent debugging ('cos it's easier to read strings than numbers), and easier take-up for new Euphoria coders.

What are the costs? Increased complexity in the Euphoria product, which would mean more testing, potentially more bugs, and slower execution times. The extent of these costs is not measurable at this stage and probably won't be until strings are actually implemented.

So in the end, it really depends on whether RDS can risk the costs for the benefits.

--
Derek Parnell
Melbourne, Australia
8. Re: String?
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> Jun 01, 2004
- 721 views
On Mon, 31 May 2004 14:27:12 -0700, Nicholas Koceja <guest at RapidEuphoria.com> wrote:

>First off, Why shuold the program differenciate from these two?

As per my previous post, not much, apart from the loss of the obvious, "common sense" meaning, which is plain discarded at a low level.

>take this example:
> The "key.ex" program.
>
>-- find out what numeric key code is generated by any key on the keyboard
>-- usage:
>--     ex key
>
>integer code
>
>puts(1, "Press any key. I'll show you the key code. Press q to quit\n\n")
>while 1 do
>    code = get_key()
>    if code != -1 then
>        printf(1, "The key code is: %d\n", code)
>        if code = 'q' then
>            exit
>        end if
>    end if
>end while
>
>In the program, it thinks of code as an 'integer'. With a string typpe, you
>could NOT have code as integer, it would have to be OBJECT, since a Chaacter
>could be aan integer, or a String. the key statement is in bold. If there
>was a String type, this statement would cause an error.

You lost me completely. <aside>I would fight tooth and nail against a "Character" type... can we just agree that is not the issue here?</aside>

Having a string type won't cause an error. Defining some variable as a string when it gets assigned either a string or an integer would, but that is no different from defining a sequence variable and assigning an integer to it. Supplying a string parameter to printf when an integer was expected might also cause an error, but why any more so than a sequence?

> Furthermore, if you
>defined double-quotes as a "String", then the '%d' would cause the error, since
>a "string" can not have a decimal value. If you make it so that a "string"
>can have one, you would be destroying what you tried to prove in the first
>place.

I fail to understand what you are trying to say here either ;-((

>Take a look at this:
>
>sequence string, seq
>atom at
>
>string = ""
>seq = {}
>procedure same(object a)
>    if equal(string, a) then
>        puts(1, "TRUE\n\n")
>    else
>        puts(1, "FALSE\n\n")
>    end if
>end procedure
>
>puts(1, "\"\" = {}\n")
>same(seq)
>puts(1, "\"abcd\" = {\'a\',\'b\',\'c\',\'d\'}\n")
>string = "abcd"
>seq = {'a','b','c','d'}
>same(seq)
>puts(1, "\"a\" = \'a\'\n")
>string = "a"
>at = 'a'
>same(at)
>while get_key() = -1 do
>end while
>
>This may make you think that there is no differenciation.

Not for a moment. A string and a character (which I would prefer to keep as an integer) are indeed different. (I admit that I made that mistake once, implementing the now shelved pleumage; never again.)

>To tell you the honest truth, a "string" type would be pointless

It might not add much extra functionality; in truth, I agree. It would add a welcome degree of readability on the diagnostic side.

>, and would make Euphoria a tad slower.

By a fair margin, I reckon.

> If you were to include a string type, you would
>have to change how Euphoria looks at sequences, and "strings", for that matter.
>As you all know, Euphoria is a fast, and easy to understand (Not always easy to
>learn) Programming language. As it states in "ed.doc", Euphoria uses pure
>numbers. The smallet unit is an "atom". Atoms and Sequences are the ony two
>explicitly defined types. An Integer is a type of atom, and an object can be a
>sequence, as well as an atom. What I think we need to find out, is this
>Question:
>"What exactly is a String?" If a string is just defined as in BASIC: "A string
>of letters or numnbers." then yes, we could benifit from it. However, if a
>string is defined as a sequence of characters, then ANY seqence of positive
>integers would fit the description. Although the character page only goes to
>255, you can "print" ANY positive integer.

Well, yes, but that is another bit of a fudge and a clear indication that Unicode support is a country mile away...

> Try it yourself. Therefdore, the question of wether a string
>type should be included or not all depends on the answer to the question:
>"What exactly defines a String?"

The fundamental thing I would claim is that if it looks like a string in the code you write, it is a string; if it looks like a sequence of numbers, it is a sequence of numbers. The point is to make what the code reads like tally with how the trace window and the ? primitive operate. There should indeed be some (hopefully rarely used) functions to convert between the two, just in case you ever need them.

Pete
9. Re: String?
- Posted by irv mullins <irvm at ellijay.com> Jun 01, 2004
- 656 views
Derek Parnell wrote:

<snip>

> What would be the advantage of this? Well it would mean that Euphoria
> would be able to trap assignments of non-Unicode characters to string
> elements (characters?). Sure this can be done now with the 'type' system
> but a built-in method that is consistant, faster, and automatic is better
> than the generic 'type' method.

Obviously, the user 'type' system does slow things down (significantly) when doing something like checking a long string character by character. So it would be better if it were built into Eu itself, like the integer checking.

Add to that the fact that user-written type checking does nothing to simplify or eliminate errors when doing output, so it's a half-solution at best.

Consider this, however: strings can reasonably be expected to be entered either in the source code, where no one in their right mind would go to the trouble of doing it this way:

name = {74,111,104,110}

or via the keyboard, where they get entered character by character, meaning that if you actually were to type '{74,111,104,110}' and hit enter, you would not have anything resembling "John".

It seems to me that it shouldn't be too difficult to say that a leading '{' could always tag the input as a sequence of objects, while a leading '"' could tag it as a string. Just like:

constant a = 'A'
constant b = "A"

Note that these produce different results, based solely on whether a single or a double quote is used. It makes me wonder: if that can be done, why is it so hard to differentiate between a double quote and a curly bracket?

In neither of these cases would there be any significant slowdown: the first would be handled at parse time, and the second would be limited by typing speed. Only in the (rare) instances where math is performed on strings would there be any chance of slowness actually being a factor.

> It would also mean that other built-in and library routines could perform
> processing more relevant to the data. Such as displaying the value in
> string notation "John" rather than numbers.

And it would do away with the multiple-choice quiz every time we want to output something. This is a major point of confusion for newcomers to Euphoria, and just causes extra work for all of us.

> So in the end, it really depends on whether RDS can risk the costs
> for the benefits.

I think it depends more on whether the design of the language allows for another type. There was a discussion on this list years ago about this, and I believe a bit of detective work indicated that it was not possible.

Regards,
Irv
10. Re: String?
- Posted by David Cuny <dcuny at lanset.com> Jun 01, 2004
- 688 views
Irv Mullins wrote:

> Makes me wonder why, if that can be done, what is it so hard to
> differntiate between a double quote and a curly bracket?

By ensuring that pointers fall on four-byte boundaries, and dropping the precision of integers, Euphoria frees up a couple of bits in the C int datatype, which it uses to flag the type stored inside, which is something like:

- positive integer
- negative integer
- pointer to atom
- pointer to sequence
- undefined

Since there are no remaining bits in the int that can be used to flag the datatype, the only other "simple" option would be to add an extra field to the sequence structure. The addition of the field wouldn't be too expensive, but you'd then have to perform an additional test on sequences to determine:

1. Is the sequence a string?
2. If it's a string, is it still a string after the last operation?

Some operations (concatenation and slicing) would be "free", since you can guarantee that the data in the sequence would still be a string. But for other operations (bitwise, math and comparison) you'd have to scan the string to ensure that it was still a valid string.

Since any sequence could possibly be a string (you don't know until you test it), Euphoria would have to perform at least the first test on all sequences. This is guaranteed to slow things down a bit, and I think that'll be a hard thing to sell to Robert.

-- David Cuny
11. Re: String?
- Posted by irv mullins <irvm at ellijay.com> Jun 01, 2004
- 653 views
David Cuny wrote:

> By ensuring that pointers fall on four byte boundaries, and dropping the
> precision of integers, Euphoria frees up a couple bits in the C int datatype,
> which it uses to flag the type stored inside, which is something like:
>
> - positive integer
> - negative integer
> - pointer to atom
> - pointer to sequence
> - undefined
>
> Since there's no remaining bits in the int that can be used to flag the
> datatype, the only other "simple" option would be to add an extra field to
> the sequence structure.

Thanks, that confirms my suspicion. So I guess we're out of luck with regard to strings, structures, or similar things until everyone is running 64-bit systems? Then we'll either get higher-precision integers, or 4294967296 new data types! Perhaps a compromise would be in order.

Irv
12. Re: String?
- Posted by "Juergen Luethje" <j.lue at gmx.de> Jun 01, 2004
- 696 views
Pete wrote:

> On Mon, 31 May 2004 09:33:25 -0700, Rolf Schroeder
> <guest at RapidEuphoria.com> wrote:
>
>> Excuse me, but now I think YOU are missing the point: the decision, if you
>> want to print it as an ASCII string or if you want to print simply the
>> numbers, the decision comes by selection the 'tool' YOU select: format {%s}
>> in printf() gives you the text, and i.e. format {%d,%d,%d,%d} in printf
>> would give you the plain numbers.

I know, Rolf. That's exactly what I (and other people, too) don't like. Mainly because in some situations we don't have the possibility to select anything: this applies to trace() and to the output to the "ex.err" file.

Also, RDS claims that Euphoria is simpler than BASIC. While this might be true in general, in BASIC we can do this:

dim s as string, i as integer
s = "My age is"
i = 99
print s i

That's what I call simple. In Euphoria, it's currently not possible to have a generic output routine such as 'print' in BASIC, because sometimes only the programmer (and not the program) knows what a given sequence means. Although I like Euphoria's pretty_print(), and Pete's version IMHO does even smarter guessing, any output routine can sometimes do nothing other than *guess* what it should do. This is not satisfactory, IMHO.

> Hi Rolf,
>
> OK, I agree that in more than 90% of print cases, the programmer can
> easily apply the correct format info, however:
>
> Slightly restating the previous example:
>
> sequence weights
> weights={74,111,104,110}
> sequence name
> name="John"
>
> There is additional meaning obvious to anyone reading the source,
> which is lost in the assignment. Since equal(weights,name) will return
> true, any attempt at an "IsString" function is doomed to get one of
> them wrong.

Yes, and using a user-defined string type does *not* solve the problem.

> Sure you can do something like:
>
> constant tInt=1, tFlt=2, tSeq=3, tStr=4
>
> weights={tSeq,{74,111,104,110}}
> name={tStr,"John"}
>
> Which I think is about the easiest way to preserve the semantic
> information. Not exactly nice though, is it?

No, not too nice. And it also doesn't make the output of trace() and the output to "ex.err" any more readable. But something like that is what I would like Euphoria to do *internally* (if the cost is not too big).

> There may not be a whole lot a string type will allow that you cannot
> possibly do without. But to imply it has no merit is silly.
>
> Adding strings might more than double the program size and probably
> make everything 50% slower, so I could accept an argument against it
> on technical grounds.

Me too.

> But being able to read values in the trace window, ex.err, and output
> from ?weights and ?name, is an overwhelming argument in favour.
>
> Of course you may actually be the second person on the planet that
> actually likes to see name (and weights) in the trace window appear as
> {74J,111o,104h,110n} ?
>
> That definitely falls into the class of Necessary Evil, not the realm
> of Good Ideas.

Regards,
Juergen
13. Re: String?
- Posted by StewartML <stewart at isoclass.co.uk> Jun 01, 2004
- 644 views
I think that the string type would be very useful. It would be good if it could be implemented much like this:

    string myStr
    sequence mySeq
    myStr = "John"
    myStr = {74,111,104,110}
    myStr &= get_key()
    if myStr = "John" then --this could be invaluable, I hate having to type equal()
    end if
    mySeq = "John" -- should still display as {74,111,104,110}

Because myStr is declared as a string, it doesn't matter how you assign it, it will always be shown as a string. By the same token, you should still be able to assign a double-quoted literal to a sequence, only it is displayed as integers.

If you ever see the need to output the integer values of a string, or vice versa, then you could just assign it to the relevant data type, or have a VB style CStr() or CSeq() function.

The major problem I have with sequences is the fact that you need to use equal() for simple strings.

StewartML, Scotland
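For context, a minimal sketch of why equal() is needed today: on sequences, = is an element-by-element operation that returns a sequence of 1s and 0s, which an if statement cannot use as a condition. The variable name is only illustrative.

```euphoria
sequence name
name = "John"

? equal(name, "John")       -- 1: one truth value for the whole comparison
? (name = "John")           -- {1,1,1,1}: one result per element
? find(0, name = "Jonn")    -- 3: position of the first differing element
```

Note that the element-wise form also requires both sequences to have the same length, so it is not a drop-in replacement for equal().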
14. Re: String?
- Posted by cklester <cklester at yahoo.com> Jun 01, 2004
- 657 views
StewartML wrote:

> The major problem i have with sequences is the fact that you need to use
> equal() for simple strings.

    function e( sequence x, sequence y )
        return equal(x,y)
    end function

    sequence seq1, seq2
    if e(seq1, seq2) then
    end if

I just saved you 57% typing!!! I don't know how bad a speed hit this takes, though.

Yes, you could save 75% from my method with seq1 = seq2 :)
15. Re: String?
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> Jun 01, 2004
- 657 views
On Tue, 01 Jun 2004 20:00:15 +0000, StewartML <Stewart at isoclass.co.uk> wrote:

<snip>
>if you ever see the need to output the integer values of a string,

That particular need could be easily met, eg:

    sequence s
    string t
    t="hello"
    s=repeat(0,length(t))
    for i=1 to length(t) do
        s[i]=t[i]
    end for

As an application programmer, I would have no qualms with having to do something like that to "rip apart" a string, and of course if you do need such in several places, it is trivial to code as a function.

>or vice versa, then you could just assign it to the relevant data type,

I think you can guess how I think a sequence could easily be converted into a proper string (with integer(), >0, and <256 checks, of course).

It may be tempting to think you can automate such conversions, but I think that will (may) cause problems, and is not really needed.

(btw, thanks - that just cleared up a few things for me)

>The major problem i have with sequences is the fact that you need to use
>equal() for simple strings.

Tell me about it. Apart from the way upper() and lower() are currently implemented (which cannot in anyone's mind be the best), just how often are =, !=, <, <=, >, >= actually used as sequence ops?

Regards,
Pete
16. Re: String?
- Posted by Derek Parnell <ddparnell at bigpond.com> Jun 01, 2004
- 643 views
Pete Lomax wrote:

> On Tue, 01 Jun 2004 20:00:15 +0000, StewartML <Stewart at isoclass.co.uk>
> wrote:
>
> <snip>
> >if you ever see the need to output the integer values of a string,
> That particular need could be easily met, eg:
> sequence s
> string t
> t="hello"
> s=repeat(0,length(t))
> for i=1 to length(t) do
> s[i]=t[i]
> end for
>
> As an application programmer, I would have no qualms with having to do
> something like that to "rip apart" a string, and of course if you do
> need such in several places, it is trivial to code as a function.

I would have thought that all one needed to do was ...

    sequence s
    string t
    t="hello"
    s=t

This is like what one does for integers and atoms.

    atom x
    integer y
    y=1
    x=y

As a string is really a subset of a sequence, converting it to a sequence should be a trivial effort for the interpreter.

> >or vice versa, then you could just assign it to the relevant data type,
> I think you can guess how I think a sequence could easily be converted
> into a proper string (with integer(), >0, and <256 checks, of course)

Your 'proper string' is still only good for certain subsets of strings. It wouldn't work for Unicode strings, nor for ASCII or EBCDIC strings, since you exclude the NUL character. The NUL is a valid character. For example, many printers and modems need it in their control strings.

> It may be tempting to think you can automate such conversions, but I
> think that will (may) cause problems, and is not really needed.
>
> (btw, thanks - that just cleared up a few things for me)
>
> >The major problem i have with sequences is the fact that you need to use
> >equal() for simple strings.
> Tell me about it. Apart from the way upper() and lower() are
> currently implemented (which cannot in anyones mind be the best), just
> how often are =, !=, <, <=, >, >= actually used as sequence ops?

I'm with you on this one though. The upper/lower functions are only good for ASCII encoding. Try this on for size ...

    ? lower({74.5, 104.01})
    ? upper({74.5, 104.01})

RDS is concerned that some requested features for Euphoria might be little used and thus not really worth the trouble of adding them... such as using relational operators as if they were sequence operations.
-- Derek Parnell Melbourne, Australia
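The checks discussed in this exchange (integer(), a lower bound, and <256) might be sketched as below. is_byte_string() is a hypothetical name, and this version assumes 0 should be accepted so that NUL stays a valid character, which differs from the ">0" check originally proposed.

```euphoria
-- Hypothetical helper: returns 1 if x could be held as a string of bytes.
-- Assumption: 0 (NUL) is allowed, following the objection above.
function is_byte_string(object x)
    if not sequence(x) then
        return 0
    end if
    for i = 1 to length(x) do
        if not integer(x[i]) then
            return 0            -- nested sequence or fractional value
        end if
        if x[i] < 0 or x[i] > 255 then
            return 0            -- outside the single-byte range
        end if
    end for
    return 1
end function

? is_byte_string("John")              -- 1
? is_byte_string({74,111,104,110})    -- 1
? is_byte_string({74.5, 104.01})      -- 0
? is_byte_string({1, 2, "Hi"})        -- 0
```

The second call is the point made earlier in the thread: a check like this can say that a sequence could be a string, never that it was meant as one.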
17. Re: String?
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> Jun 02, 2004
- 659 views
On Tue, 01 Jun 2004 16:34:38 -0700, Derek Parnell <guest at RapidEuphoria.com> wrote:

>Try this on for size ...
>
> ? lower({74.5, 104.01})
> ? upper({74.5, 104.01})

LOL

It makes a bit more sense if you write it like this:

    ?lower({'J'+0.5,'h'+0.01})
    ?{'j'+0.5,'h'+0.01}

    ?upper({'J'+0.5,'h'+0.01})
    ?{'J'+0.5,'H'+0.01}
18. Re: String?
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> Jun 02, 2004
- 644 views
On 2 Jun 2004 11:47:25 +0200, Christian Cuvier <Christian.CUVIER at agriculture.gouv.fr> wrote:

>--find first pair of adjacent distinct values:
>p=find(0,(s&s[length(s)] = prepend(s,s[1]))-1 -- -1 if no such pair

There is a syntax error in that, and even if I fix it a simple for loop is about twice as fast when s={1,1,1,1,1,1,1,1,1,2}

>--find first repeated value:
>p=find(1,(s&s[length(s)] = prepend(s,s[1]))-1 -- -1 if no such pair

Another syntax error, and that does not work at all. If I fix it so it does, a for loop is still twice as fast when s={1,2,3,4,5,6,7,7,8,9}

>--find first mismatch:
>p=find(0,(s1 = s2))

That one, OK, I grant you is slightly faster, and somewhat easier to type. You could still code it as p=find(0,eq(s1,s2)) though, provided there was a new builtin eq() function to replace the sequence op.

>--is a sequence strictly increasing?
>p=find(1,(s&(s[length(s)]+1) >= prepend(s,s[1]-1)) --0 means yes

Again, a simple for loop is over twice as fast.

>Could go on for pages. These operators are quite useful in conjunction with
>find(). match() allows even niftier tricks.

Well, now that I have just tested them, I know they are not as fast as people like to think they are, and they can easily be replaced with a builtin function.

Pete
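The "simple for loop" alternative mentioned for the first case could look something like this. It is a sketch only: first_change() is a hypothetical name, it is not the code that was actually timed, and it returns 0 rather than -1 when no such pair exists.

```euphoria
-- Hypothetical helper: index of the first element that differs from the
-- element after it, or 0 if all adjacent elements are equal.
function first_change(sequence s)
    for i = 1 to length(s) - 1 do
        if not equal(s[i], s[i+1]) then
            return i
        end if
    end for
    return 0
end function

? first_change({1,1,1,1,1,1,1,1,1,2})   -- 9
? first_change({7,7,7})                 -- 0
```

The loop stops at the first hit and builds no temporary sequences, which is one plausible reason for the speed difference reported above.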
19. Re: String?
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> Jun 02, 2004
- 646 views
On Tue, 01 Jun 2004 07:50:08 -0700, irv mullins <guest at RapidEuphoria.com> wrote:

>Add to that the fact that user-written type checking does
>nothing to simplify or eliminate errors when doing output, so it's a
>half-solution at best.

I still don't get that.
20. Re: String?
- Posted by irv mullins <irvm at ellijay.com> Jun 02, 2004
- 662 views
Pete Lomax wrote:

> On Tue, 01 Jun 2004 07:50:08 -0700, irv mullins
> <guest at RapidEuphoria.com> wrote:
>
> >Add to that the fact that user-written type checking does
> >nothing to simplify or eliminate errors when doing output, so it's a
> >half-solution at best.
> I still don't get that.

It really isn't that complicated. Consider the following:

    a = 12
    b = 23.6609
    c = "Hello"
    d = [1,2,6,"Hi"]
    >>> print a, b, c, d
    12 23.6609 Hello [1, 2, 6, 'Hi']

That's Python. You don't have to come up with the "right" function to properly print variables; Python manages to keep track for itself of what is a string and what is an integer, a float, or a sequence. Lua does much the same. Both, of course, also have a printf() type of func for when you actually need special formatting, and like Euphoria's printf(), they are a bit slower than print.

Now, without using printf, let's see you get Euphoria to display the contents of variable d:

    constant d = {1,2,"Hi"}

print doesn't work, it displays:

    {1,2,{72,105}} - where's the "Hi"?

? doesn't work either, it displays:

    { 1, 2, {72,105} }

puts() won't even run:

    test.exu:10 sequence found inside character string --> see ex.err

So not only do you have to pick and choose the correct output function for each variable (and each member of the variable) separately, but if the contents of a variable change, or the nesting changes, you have to go back and rewrite every line that outputs that variable.

Try changing {1,2,"Hi"} to {1,2,{"Hello","World"}} and see if it still works. Not even printf() will help you here.

    printf(1,"%d %d %s %s\n",{d[1],d[2],d[3][1],d[3][2]})

Now make it {1,2,3,{"Hello","World"}} and see what happens.

Does anyone think that meets the definition of "simple"? Python, Lua, and several other languages handle this in a straightforward manner, even though they do not have typed variables. Surely if Euphoria is going to make us declare types, it could make use of that information later. And user-written type checking isn't going to help.

By the way, no one need bring up the "but that would make Euphoria slower" argument. I have already benchmarked Euphoria and Lua on output, and Lua wins handily.

Irv
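To make the difficulty concrete, here is a rough sketch of the kind of guessing display routine that can be written in Euphoria today. Both routine names are hypothetical, and the "printable ASCII" test is just one possible heuristic.

```euphoria
-- Hypothetical heuristic: a non-empty sequence of printable ASCII integers.
function looks_like_text(sequence x)
    if length(x) = 0 then
        return 0
    end if
    for i = 1 to length(x) do
        if not integer(x[i]) then
            return 0
        end if
        if x[i] < 32 or x[i] > 126 then
            return 0
        end if
    end for
    return 1
end function

-- Hypothetical generic display routine: recurses into nested sequences
-- and prints anything that looks like text in double quotes.
procedure show(object x)
    if atom(x) then
        print(1, x)
    elsif looks_like_text(x) then
        puts(1, '"' & x & '"')
    else
        puts(1, '{')
        for i = 1 to length(x) do
            if i > 1 then
                puts(1, ',')
            end if
            show(x[i])
        end for
        puts(1, '}')
    end if
end procedure

show({1, 2, {"Hello", "World"}})   -- {1,2,{"Hello","World"}}
puts(1, '\n')
show({74,111,104,110})             -- "John", even if these were weights
puts(1, '\n')
```

It copes with changes in nesting, but the last call shows the limit: without type information kept by the language, the routine can only guess.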
21. Re: String?
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> Jun 02, 2004
- 641 views
On Wed, 02 Jun 2004 15:43:14 -0700, irv mullins <guest at RapidEuphoria.com> wrote:

>It really isn't that complicated:

Now I see.

>By the way, no one need bring up the "but that would make
>Euphoria slower" argument. I have already benchmarked Euphoria and Lua on
>output, and Lua wins handily.

ex.exe is about 50 times faster than exw.exe for console display...

Pete
22. Re: String?
- Posted by Derek Parnell <ddparnell at bigpond.com> Jun 03, 2004
- 676 views
Pete Lomax wrote:

> On Tue, 01 Jun 2004 07:50:08 -0700, irv mullins
> <guest at RapidEuphoria.com> wrote:
>
> >Add to that the fact that user-written type checking does
> >nothing to simplify or eliminate errors when doing output, so it's a
> >half-solution at best.
> I still don't get that.

Error detection is not the same as error prevention. The fact that one finds an error does not stop the cause of the error.

-- Derek Parnell Melbourne, Australia
23. Re: String?
- Posted by Evan Marshall <1evan at sbcglobal.net> Jun 03, 2004
- 636 views
Support for Euphoria from the creator of C++? There are only two kinds of programming languages: those people always bitch about and those nobody uses --Bjarne Stroustrup
24. Re: String?
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> Jun 03, 2004
- 669 views
On Wed, 02 Jun 2004 17:32:46 -0700, Derek Parnell <guest at RapidEuphoria.com> wrote:

>Pete Lomax wrote:
>>
>> On Tue, 01 Jun 2004 07:50:08 -0700, irv mullins
>> <guest at RapidEuphoria.com> wrote:
>>
>> >Add to that the fact that user-written type checking does
>> >nothing to simplify or eliminate errors when doing output, so it's a
>> >half-solution at best.
>> I still don't get that.
>
>Error detection is not the same as error prevention. The fact that one finds an
>error does not stop the cause of the error.

As I now understand it, irv was not talking about errors at all. He was talking about natural expression. Which I got already.

Regards,
Pete
25. Re: String?
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> Jun 03, 2004
- 666 views
On Wed, 02 Jun 2004 18:04:06 -0700, Evan Marshall <guest at RapidEuphoria.com> wrote:

>There are only two kinds of programming languages: those people always bitch
>about and those nobody uses
>--Bjarne Stroustrup

Saw him at the Lakeside, when he beat the Crafty Cockney by four sets.

I'll get me coat

Pete