unicode and puts


I'm having a problem outputting Unicode strings. It's no drama doing this with
files, but it gets hairy for me when serving a page from my server.

When I ran into a situation where the page was in Russian, with Lithuanian and
English as alternate language options for viewing the same page, I decided to
make the whole page straight Unicode, and ran into the problem below.
I've never had this problem with UTF-8, probably because I only used it for the
odd character such as the Maori 'a' in New Zealand English, which has no all-zero
bytes in it (maybe UTF-8 doesn't encode with leading or trailing zeros; I don't
know, but it's irrelevant to straight Unicode).


Where s is the Unicode string "Hello World", the following statement:
puts(1,s)


will output only "H" to the screen, because puts() aborts when it reaches the
null character (the zero byte that makes up the second half of the two-byte "H").
Putting two bytes (1 character) at a time will work, obviously, for both big- and
little-endian, but there is a performance hit, and it adds another layer between
the output of your page and Euphoria's puts() procedure.

The only way I can see to get around it is to implement the two-byte chunking
technique behind an alias of puts (I already use such an alias for my web CMS
framework).
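
To make the chunking idea concrete, here is a rough sketch of the kind of
wrapper I mean. The name put_wide is made up for illustration, and it assumes
s is a flat sequence of bytes (two per character) and that puts() with an atom
argument writes a single byte, zero included. I haven't measured how much
slower it is than a plain puts().

```euphoria
-- Hypothetical wrapper (the name put_wide is only for illustration).
-- Assumes s is a flat sequence of bytes, two bytes per character.
procedure put_wide(integer fn, sequence s)
    -- Walk the sequence one two-byte character at a time.
    for i = 1 to length(s) - 1 by 2 do
        puts(fn, s[i])      -- first byte of the pair
        puts(fn, s[i + 1])  -- second byte of the pair (may be zero)
    end for
    -- Note: a trailing odd byte, if any, is not written.
end procedure

-- "Hi" as two-byte little-endian characters: every other byte is zero.
sequence s
s = {#48, #00, #69, #00}
put_wide(1, s)
```

Because each byte is passed as an atom and not as part of a sequence, the byte
order doesn't matter to the wrapper; it just writes the bytes in the order
they arrive.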

Is there a *really quick and easy* way to get around this?

Also, as someone who uses Euphoria primarily for CGI, the last release for me
was the most directly significant of all of them because of the "include"
changes. I can now relax and spread my files out a bit :P
