Integrating string types into EUPHORIA handling

Forked from Re: Euphoria vs The Other Guys

DerekParnell said...
SDPringle said...

... Strings are easily implemented as a UDT.

Not quite. UDTs have some restrictions. They are only used during assignments or explicit if UDT(x) tests. We can't tailor other operations to deal with UDTs, nor can we arbitrarily have existing library functions know what to do with a UDT when they encounter one. That, of course, also applies to built-in types.

I really meant that we can use UDTs for assignments and explicit checks. In fact the standard library already has 'string', so I should have said a UDT already exists there. Did you mean we cannot use UDT routines for things other than assignments and tests, or did you mean we don't have operator overloading in EUPHORIA? I don't think we need operator overloading.

DerekParnell said...
SDPringle said...

If added as a builtin type, you would only need to modify the parser part.

Actually, it turns out to be a lot more complicated than that. A new built-in type has semantic implications that would undoubtedly affect nearly every aspect of the existing language operations.

One could implement string the same way sequence is implemented, that is to say, with the same structure. We would need to add a special call to make sure it is a valid string, but other than that just use struct s1. We have three representations for objects: struct s1, struct d, and int. If you add another, binary operations go from having to handle 9 cases (3 × 3) to having to handle 16 (4 × 4).
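
To make the 9-versus-16 count concrete, here is a rough C sketch. The enum and function names are made up for illustration and are not taken from the interpreter source; the only assumption is that a binary operation has to distinguish every (left, right) pair of representations.

```c
/* Hypothetical tags for the object representations discussed above.
   REPR_STRING stands for the extra representation a separate string
   layout would add; none of these names come from the interpreter. */
enum repr { REPR_INT, REPR_DOUBLE, REPR_SEQUENCE, REPR_STRING, REPR_COUNT };

/* Count the (left, right) representation pairs a binary operation
   must distinguish when n_reprs representations exist. */
int binary_cases(int n_reprs)
{
    int count = 0;
    for (int left = 0; left < n_reprs; left++)
        for (int right = 0; right < n_reprs; right++)
            count++;
    return count;
}
```

Calling binary_cases(3) covers today's int, double and sequence representations, and binary_cases(REPR_COUNT) covers the case where string gets a layout of its own. Reusing struct s1 for strings keeps the count at the first figure.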

DerekParnell said...
SDPringle said...

... working with a string is the same as working with a sequence of integers, or a sequence of anythings.

Again, only if you are talking about sequence operations, but even then we would need to add some extra semantics to some operations. For example, adding an atom to a String type should do what, exactly? Issue a runtime error, maybe? Then there's the whole complication of how to output String data. If the string is going to a console device, we would probably need to convert it to UTF-8. But what about output to a file? Or output to a windowed application: UTF-16 or ASCII? Now consider functions like upper() etc.; these need special handling for Unicode strings.

I have thought this through along the same path. I would love for the interpreter to magically help the user and re-encode strings the way they ought to be when calling C routines.
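
For a sense of what such a re-encoding step involves, here is a minimal C sketch that encodes a single Unicode code point as UTF-8. The function name utf8_encode is hypothetical and this is not how the interpreter does it today; it only assumes the string is held as one code point per element, which is one possible convention.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8 into out (room for 4 bytes).
   Returns the number of bytes written, or 0 if cp is not encodable
   (a surrogate, or above U+10FFFF). */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                       /* 1 byte: plain ASCII */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 2 bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)      /* surrogates are not valid */
        return 0;
    if (cp < 0x10000) {                    /* 3 bytes */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 4 bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

The encoding itself is the easy part. The hard part raised above is knowing which target encoding each destination (console, file, windowed application, C routine) expects.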

I pass the buck to library implementations, and for the Standard Library that means us. If we had a different UDT for each encoding we could have ascii_string, codepage800string, utf_string, etc. If they were all stored the way we normally store strings, the EUPHORIA type system could not always tell when you passed the wrong kind of string to the wrong place. You could put magic numbers in the strings, but then they would be magic string packages instead of just strings. We could just leave things simple and let the users figure it out. Most importantly, library implementors should always specify what kind of string they are expecting for a given routine.
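
The "string package" idea can be sketched in C as a struct that carries an encoding tag next to the bytes, with a routine whose contract names the encoding it expects. Everything here (the enum, the struct, count_upper_ascii) is hypothetical and only illustrates the trade-off; it is not an existing Standard Library interface.

```c
#include <stddef.h>

/* Hypothetical encoding tags; a real library would choose its own set. */
enum encoding { ENC_ASCII, ENC_UTF8 };

/* A "string package": the raw bytes plus a tag saying how to read them. */
struct tagged_string {
    enum encoding enc;
    const unsigned char *bytes;
    size_t len;
};

/* Documented contract: expects an ASCII string.
   Returns the number of upper-case letters, or -1 if the tag is not
   ENC_ASCII or a byte falls outside the 7-bit range. */
long count_upper_ascii(const struct tagged_string *s)
{
    if (s == NULL || s->enc != ENC_ASCII)
        return -1;
    long n = 0;
    for (size_t i = 0; i < s->len; i++) {
        unsigned char c = s->bytes[i];
        if (c > 0x7F)
            return -1;
        if (c >= 'A' && c <= 'Z')
            n++;
    }
    return n;
}
```

With the tag, the wrong kind of string is rejected at the call. Without it, the routine can only check the bytes themselves, which is why the expected encoding has to be written into each routine's documentation.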
