Re: UTF-8

ArthurCrump said...

The current Unicode specification fits into 21 bits.
UTF-32 characters are in the range 0-#10FFFF

Thanks for jogging my memory. When I was looking for "personal space" to get better collation for the 6000-8000 characters of one of the Indic languages, I had decided on using the E000 area. I vaguely remember that I was also looking at the pages above #10FFFF for the same purpose, i.e. to find space for 12 times 6000 characters, and I stopped because I fell ill.

Yes, Euphoria's 31-bit integers can definitely cope with "32 bit Unicode" and with my Indic OEM extensions beyond 21 bits. I hope I can keep in good health to do this. I will look more closely at the Unicode branch, as suggested by Matt Lewis, and see if I can cope with it and/or improve upon it.
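
For anyone who wants to check the arithmetic, here is a rough sketch in Python (not Euphoria). It assumes the 32-bit Euphoria integer limit of #3FFFFFFF and takes "E000 area" to mean the Private Use Area E000-F8FF. The names are only for illustration, and placing the extension block directly after #10FFFF is my assumption; values up there are outside Unicode proper, so they would only mean something to my own programs.

```python
# Sketch of the range arithmetic only; all names here are illustrative.
UNICODE_MAX = 0x10FFFF        # highest Unicode code point
EU_INT_MAX = 2**30 - 1        # largest 31-bit Euphoria integer (#3FFFFFFF)

# Bits needed to hold the highest Unicode code point.
unicode_bits = UNICODE_MAX.bit_length()

# Private Use Area in the Basic Multilingual Plane (the "E000 area").
PUA_FIRST, PUA_LAST = 0xE000, 0xF8FF
pua_size = PUA_LAST - PUA_FIRST + 1

# Space wanted for the collation scheme: 12 blocks of 6000 characters.
needed = 12 * 6000

# Assumption: a private extension block starting right after #10FFFF.
ext_first = UNICODE_MAX + 1
ext_last = ext_first + needed - 1

def fits_in_eu_integer(value):
    """True if value is a non-negative number a 31-bit integer can hold."""
    return 0 <= value <= EU_INT_MAX

print("bits for Unicode:", unicode_bits)
print("PUA size:", pua_size, "needed:", needed)
print("extension block: #%X to #%X" % (ext_first, ext_last))
print("fits in a Euphoria integer:", fits_in_eu_integer(ext_last))
```

The E000 area alone is too small for 12 times 6000 characters, which is why I was looking elsewhere, while a block of that size after #10FFFF is still far below the integer limit.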
