Re: Suggestion for pretty_print()
- Posted by Jason Gade <jaygade at yahoo.com> Dec 04, 2005
Juergen Luethje wrote:
>
> Robert Craig wrote:
>
> > Juergen Luethje wrote:
> >>> Rob, how about this:
> >>> You could ship Euphoria with the user defined type "string". In contrast
> >>> to suggestions that had been made in the past concerning a "string"
> >>> type, this would not mean any change of Euphoria's internal affairs
> >>> during normal operation. People use a user defined "string" type anyway,
> >>> and when using "without type check", there is no loss of speed.
> >>> The advantage would be that in case of a crash, Euphoria knows what
> >>> constants and variables are strings, and can display them accordingly.
> >>
> >> The following is probably even better:
> >> Provide a new keyword "string", which is just handled as an alias for
> >> "sequence" during normal operation. But the internal routine that writes
> >> the ex.err dump can take advantage of the additional information which
> >> sequence is a string.
> >>
> >> <snip>
> >>
> >> Regards,
> >> Juergen
> >>
> >> PS: As a positive side-effect, declaring strings explicitly as "string"
> >> rather than as "sequence" increases the readability of the code.
> >
> > It's an interesting idea, but most strings in an ex.err dump
> > are probably elements of larger sequences. Most aren't stored
> > as simple strings in individual variables. For instance,
> > a symbol table might contain a thousand entries, each represented by
> > a sequence containing a mixture of string and numeric fields.
> > Simply having a "string" type wouldn't let you define to the ex.err
> > dumper which fields are string and which are numeric in a
> > complicated sequence.
>
> Oh, I see. I didn't think of that.
> However, the interpreter has some additional information about what is a
> string. I think it can safely consider any sequence as string, that is
> defined in the source code by using quotes. When the programmer writes:
> sequence s
> s = "red"
>
> s/he wants to deal with a string.
> When s/he writes:
> sequence s
> s = {114,101,100}
>
> then s/he wants to deal with three numbers.
> But currently I think both expressions are exactly the same for Euphoria
> internally. Immediately after reading the source code, the interpreter
> 'forgets' about the fact, that "red" and {114,101,100} are intended to
> express different things.
> Also e.g. lines that are read from a file by gets() are strings, of
> course.

The thought of strings and data mixed in sequences did occur to me, but I still think that a "string" type is a good idea. I think you found the solution above: the interpreter should remember whether a sequence or subsequence was assigned using quotes instead of braces, or whether a subsequence was assigned the value of a "string" variable or of a function that returns a "string". The interpreter could use these cues to flag whether something is a string or not.

>
> > Also, when I get an ex.err dump, I usually just check the
> > statement where the error occurred, and the error message,
> > and sometimes I look at the values of 2 or 3 variables that
> > are used near the crash point. It's not that big a deal if
> > I sometimes see a mixture of characters and numerics in a
> > variable value. It's a bit cluttered, but does it warrant a
> > new language feature just to reduce the clutter?
>
> Did it warrant a new language feature just to allow us to replace
> "x[length(x)]" with "x[$]"? My personal answer clearly is "Yes".
> You know that in the past many people have asked for a readable ex.err
> dump. As Jason wrote, this also affects the output of trace().

The problem is deciding whether a sequence or subsequence of small integers (0-255) is a string or just numbers. Having a "string" type, and remembering whether something was assigned using quotes or from a "string" variable or function, may resolve that ambiguity.
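
To make the idea concrete, here is a rough model of the bookkeeping I have in mind. It is written in Python only because the flag would live inside the interpreter and cannot be expressed in Euphoria source; it is a sketch under my own assumptions, not how Euphoria actually works. The names `Seq`, `quoted`, `braces` and `dump` are all made up for illustration.

```python
# Hypothetical model of the proposal, not Euphoria's real internals.

class Seq(list):
    """A sequence that remembers whether it was created as a string."""

    def __init__(self, items=(), is_string=False):
        super().__init__(items)
        self.is_string = is_string


def quoted(text):
    """Model a quoted literal such as "red": same elements, plus the flag."""
    return Seq([ord(c) for c in text], is_string=True)


def braces(*items):
    """Model a brace literal such as {114,101,100}: no flag is set."""
    return Seq(items)


def dump(value):
    """Format a value the way a friendlier ex.err dumper might."""
    if isinstance(value, Seq):
        if value.is_string:
            return '"' + "".join(chr(c) for c in value) + '"'
        return "{" + ",".join(dump(v) for v in value) + "}"
    return str(value)


red_text = quoted("red")
red_nums = braces(114, 101, 100)

# The elements are identical, so the values alone cannot settle the question.
assert list(red_text) == list(red_nums)

# A symbol-table style entry mixing a string field with numeric fields.
entry = braces(quoted("count"), 42, braces(114, 101, 100))
print(dump(entry))  # {"count",42,{114,101,100}}
```

Because the flag travels with each subsequence, the dumper can handle Rob's symbol-table case of mixed string and numeric fields without guessing from the values.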
>
> I am pretty sure that such "ugly spots on the beautiful face" of
> Euphoria must be removed, when you want Euphoria to get a considerably
> larger number of users.

The interpreter automatically promotes integers to atoms internally when necessary, and keeps track of that. It may be a lot of work to do the same thing with strings, but I think it makes sense.

One thing I love about Euphoria is the nearly typeless system. I wouldn't want to see it polluted with a lot of unnecessary built-in types. However, I think a built-in string type makes sense, is orthogonal to the way the language handles integers, atoms, and sequences, and could improve the language. I know one forum poster who often deals with multi-megabyte text data files who would probably appreciate it if strings were more optimized internally and took up less than 4 bytes per character...

>
> Regards,
> Juergen
>
> --
> Have you read a good program lately?

--
"Actually, I'm sitting on my butt staring at a computer screen."
- Tom Tomorrow
j.