Re: Suggestion for pretty_print()


Juergen Luethje wrote:
> 
> Robert Craig wrote:
> 
> > Juergen Luethje wrote:
> >>> Rob, how about this:
> >>> You could ship Euphoria with the user defined type "string". In contrast
> >>> to suggestions that had been made in the past concerning a "string"
> >>> type, this would not mean any change of Euphoria's internal affairs
> >>> during normal operation. People use a user defined "string" type anyway,
> >>> and when using "without type check", there is no loss of speed.
> >>> The advantage would be that in case of a crash, Euphoria knows what
> >>> constants and variables are strings, and can display them accordingly.
> >> 
> >> The following is probably even better:
> >> Provide a new keyword "string", which is just handled as an alias for 
> >> "sequence" during normal operation. But the internal routine that writes
> >> the ex.err dump can take advantage of the additional information which
> >> sequence is a string.
> >> 
> >> <snip>
> >> 
> >> Regards,
> >>    Juergen
> >> 
> >> PS: As a positive side-effect, declaring strings explicitly as "string"
> >>     rather than as "sequence" increases the readability of the code.
> > 
> > It's an interesting idea, but most strings in an ex.err dump
> > are probably elements of larger sequences. Most aren't stored
> > as simple strings in individual variables. For instance,
> > a symbol table might contain a thousand entries, each represented by
> > a sequence containing a mixture of string and numeric fields.
> > Simply having a "string" type wouldn't let you define to the ex.err
> > dumper which fields are string and which are numeric in a 
> > complicated sequence.
> 
> Oh, I see. I didn't think of that.
> However, the interpreter has some additional information about what is a
> string. I think it can safely treat any sequence as a string if it is
> defined in the source code using quotes. When the programmer writes:
>    sequence s
>    s = "red"
> 
> s/he wants to deal with a string. When s/he writes:
>    sequence s
>    s = {114,101,100}
> 
> then s/he wants to deal with three numbers.
> But currently I think both expressions are exactly the same for Euphoria
> internally. Immediately after reading the source code, the interpreter
> 'forgets' that "red" and {114,101,100} are intended to express
> different things.
> Also e.g. lines that are read from a file by gets() are strings, of
> course.

The thought of strings and data mixed in sequences did occur to me, but I still
think that a "string" type is a good idea.

I think you found the solution above: the interpreter should remember whether a
sequence or subsequence was assigned using quotes instead of braces, or whether a
subsequence was assigned the value of a "string" variable or of a function that
returns a "string". The interpreter could use these cues to flag whether
something is a string or not.
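
To make the idea concrete, here is a rough user-level sketch of what such a flag would buy us. It is only an emulation: the real proposal keeps the flag inside the interpreter, while here it is stored by hand next to the data. The names IS_STRING, IS_DATA and show() are made up for this example and are not part of Euphoria.

```euphoria
-- Sketch only: emulates the proposed string flag in user code.
-- IS_STRING, IS_DATA and show() are hypothetical names.
constant IS_STRING = 1, IS_DATA = 0

-- A tagged value is {flag, payload}.
procedure show(sequence tagged)
    if tagged[1] = IS_STRING then
        -- flagged as text: print the characters between quotes
        puts(1, "\"" & tagged[2] & "\"\n")
    else
        -- not flagged: print the raw numbers
        print(1, tagged[2])
        puts(1, '\n')
    end if
end procedure

-- Today both spellings yield the same value, so the value alone
-- cannot tell a dump routine which form the programmer meant.
? equal("red", {114,101,100})    -- prints 1

show({IS_STRING, "red"})         -- shown as text
show({IS_DATA, {114,101,100}})   -- shown as numbers
```

With the flag carried along, a dump routine no longer has to guess. Without it, the two calls to show() would receive indistinguishable data.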

> 
> > Also, when I get an ex.err dump, I usually just check the
> > statement where the error occurred, and the error message,
> > and sometimes I look at the values of 2 or 3 variables that
> > are used near the crash point. It's not that big a deal if
> > I sometimes see a mixture of characters and numerics in a 
> > variable value. It's a bit cluttered, but does it warrant a 
> > new language feature just to reduce the clutter?
> 
> Did it warrant a new language feature just to allow us to replace
> "x[length(x)]" with "x[$]"? My personal answer clearly is "Yes".
> You know that in the past many people have asked for a readable ex.err
> dump. As Jason wrote, this also affects the output of trace().

The problem is deciding whether a sequence or subsequence of small integers
(0-255) is a string or just numbers. Having a "string" type, and remembering
whether something was assigned using quotes or from a "string"
variable/function, may solve that ambiguity.
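
A small sketch of why guessing from the values alone is not enough. The function looks_like_string() is a hypothetical helper, not a library routine, and the 0-255 range is simply the one mentioned above.

```euphoria
-- Sketch only: a value-based guess, with a hypothetical helper name.
-- Returns 1 if x is a sequence whose elements are all integers
-- in the range 0..255, otherwise 0. An empty sequence returns 1.
function looks_like_string(object x)
    if atom(x) then
        return 0
    end if
    for i = 1 to length(x) do
        if not integer(x[i]) then
            return 0    -- a fraction or a nested sequence
        end if
        if x[i] < 0 or x[i] > 255 then
            return 0    -- outside the byte range
        end if
    end for
    return 1
end function

? looks_like_string("red")           -- 1
? looks_like_string({114,101,100})   -- 1, although numbers were intended
? looks_like_string({1, 2.5, "x"})   -- 0
```

The first two calls give the same answer because the two values are identical, so a heuristic like this will sometimes show numeric data as text. That is the ambiguity a remembered flag would remove.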
 
> 
> I am pretty sure that such "ugly spots on the beautiful face" of
> Euphoria must be removed if you want Euphoria to get a considerably
> larger number of users.

The interpreter automatically promotes integers to atoms internally when
necessary, and keeps track of that. It may be a lot of work to do the same
thing with strings, but I think it makes sense.

One thing I love about Euphoria is the nearly typeless system. I wouldn't want
to see it polluted with a lot of unnecessary built-in types. However, I think a
built-in string type makes sense, is orthogonal with the way the language handles
integers, atoms, and sequences, and could improve the language.

I know one forum poster who often deals with multi-megabyte text data files who
would probably appreciate it if strings were more optimized internally and took
up less than 4 bytes per character...

> 
> Regards,
>    Juergen
> 
> -- 
> Have you read a good program lately?


--
"Actually, I'm sitting on my butt staring at a computer screen."
                                                  - Tom Tomorrow

j.
