Re: unicode and puts


FD(censored) wrote:
> The only way I can see to get around it is implementing the two bytes chunking
> technique, with an alias of puts (which I already use for my web CMS Framework.)
>
> Is there a *really quick and easy* way to get around this?

Ah, Euphoria and Unicode. Don't you just love it?

I've taken a different approach. I use HTML encoding for all of my Unicode.

To demonstrate, go to http://www.wazu.jp/hosting/pricing.exu

Click on 'price this plan', and if you haven't changed any of the 
default numbers, it will verify it and add the ordering section to the page.

In any of the 'name' fields, enter some non-Latin characters. Russian,
Japanese, whatever. Do *not* tick the box saying you agree to the terms
and conditions. That way, the page will fail with an error message, but
the characters you entered will be re-displayed. That's the key ... how
they were re-displayed.

1) The page is 'charset=utf-8'. That means that no matter how you enter
the characters, the browser will POST them back to the CGI program as UTF-8.
    Let's say that  {229,147,169} is input. That's how it's stored in
the database.

2) The UTF-8 character (1 to 4 bytes) is decoded to its Unicode code
point, written in hex.
     This example would become 54E9.

3) The hex is turned into decimal. The above example would be 21737.

4) The number is turned into its HTML representation, &#21737; (which
the browser displays as 哩).

5) Each character in the page buffer is replaced as above, and the
buffer written out.

6) <SHAMELESS PLUG> Then go to http://www.wazu.jp/ to get some Unicode
fonts to test with </SHAMELESS PLUG>
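
For anyone who wants to see steps 2 to 5 spelled out, here is a rough sketch. It is in Python rather than Euphoria, it is not the code running on the site, and the function names (decode_utf8_char, html_encode) are made up for illustration. It also assumes the input is well-formed UTF-8 and does no real validation.

```python
def decode_utf8_char(buf, i):
    """Decode one UTF-8 character starting at buf[i].

    Returns (code_point, bytes_used). Hypothetical helper: it trusts the
    input and does not check continuation bytes or overlong forms.
    """
    b0 = buf[i]
    if b0 < 0x80:                 # 0xxxxxxx: plain ASCII, 1 byte
        return b0, 1
    elif b0 >> 5 == 0b110:        # 110xxxxx: 2-byte sequence
        n, cp = 2, b0 & 0x1F
    elif b0 >> 4 == 0b1110:       # 1110xxxx: 3-byte sequence
        n, cp = 3, b0 & 0x0F
    elif b0 >> 3 == 0b11110:      # 11110xxx: 4-byte sequence
        n, cp = 4, b0 & 0x07
    else:
        raise ValueError("unexpected lead byte %d at offset %d" % (b0, i))
    # Each continuation byte (10xxxxxx) contributes its low 6 bits.
    for b in buf[i + 1:i + n]:
        cp = (cp << 6) | (b & 0x3F)
    return cp, n


def html_encode(buf):
    """Replace every non-ASCII character in a UTF-8 byte buffer with a
    decimal HTML entity (&#NNNN;). ASCII bytes pass through unchanged."""
    out = []
    i = 0
    while i < len(buf):
        cp, n = decode_utf8_char(buf, i)
        if cp < 0x80:
            out.append(chr(cp))
        else:
            out.append("&#%d;" % cp)   # steps 3 and 4: decimal entity
        i += n
    return "".join(out)


sample = [229, 147, 169]               # the bytes from step 1
cp, used = decode_utf8_char(sample, 0)
print("%X" % cp)                       # step 2: 54E9
print(cp)                              # step 3: 21737
print(html_encode(sample))             # step 4: &#21737;
```

The output is pure ASCII, so it can go through a byte-oriented puts() without any special handling. Note that this sketch leaves characters like < and & alone; escaping those is a separate job.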

HTH.

-- 
Craig

PS You should turn off directory listing.
