OpenEuphoria: Forum: Re: Status of Edix

Re: Status of Edix


Posted by jimcbrown (admin) Aug 27, 2019

Bhupen1277 said...

The original Unicode characters could each be represented by precisely 2 bytes. Therefore, writing a piece of software to reach the nth character in a sequence was very simple, IF you represented them in 2-byte fields.

When UTF-8 is used to represent these, the nth character position cannot be estimated or guesstimated; you have to crawl along to find the nth character.

Yep, but...

Bhupen1277 said...

Therefore, UTF16 was invented, and it was good for the original extent (64K characters) but not enough for the extended characters.

You are thinking of UCS-2. UTF-16 is an extension of UCS-2 that can represent the full Unicode character set, including the extended characters. However, it does this by using four bytes (a surrogate pair of two 16-bit units) for the extended characters. (The original set is still represented by two bytes.) So UTF-16 suffers from the same problem as UTF-8 when it comes to getting the nth character, although fewer characters are affected.
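To make that concrete, here is a rough Python sketch (not Euphoria, and not anything from Edix) of finding the nth character in UTF-16 data. The helper names `utf16_units` and `nth_char_utf16` are made up for illustration. The point is only that once a surrogate pair appears, indexing directly into the 16-bit units no longer lands on the character you expect, so you have to scan from the start.

```python
def utf16_units(text):
    """Return the UTF-16 code units (16-bit integers) for a Python string."""
    data = text.encode("utf-16-le")  # little-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def nth_char_utf16(units, n):
    """Return the n-th character (0-based) by scanning the units from the start."""
    i = 0       # position in 16-bit units
    count = 0   # number of whole characters seen so far
    while i < len(units):
        unit = units[i]
        is_pair = (
            0xD800 <= unit <= 0xDBFF
            and i + 1 < len(units)
            and 0xDC00 <= units[i + 1] <= 0xDFFF
        )
        if is_pair:
            # High + low surrogate: one extended character stored in four bytes.
            code_point = 0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            width = 2
        else:
            # Original 64K range: one character stored in two bytes.
            code_point = unit
            width = 1
        if count == n:
            return chr(code_point)
        count += 1
        i += width
    raise IndexError("character index out of range")


# Sample string: 'a', one extended character (U+1F600), then 'b'.
text = "a\U0001F600b"
units = utf16_units(text)
```

With this sample, `units` holds four 16-bit values for three characters, so `units[2]` is the second half of the surrogate pair rather than `'b'`, while `nth_char_utf16(units, 2)` walks past the pair and returns `'b'`.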
