Re: Status of Edix

Bhupen1277 said...

Each character in the original Unicode set could be represented by exactly 2 bytes. Therefore, writing a piece of software to reach the nth character in a sequence was very simple, if you stored the characters in 2-byte fields.

When UTF-8 is used to represent these, the nth character position cannot be estimated or guesstimated; you have to crawl along to find the nth character.

Yep, but...

Bhupen1277 said...

Therefore, UTF16 was invented, and it was good for the original extent (64K characters) but not enough for the extended characters.

You are thinking of UCS-2. UTF-16 is an extension of UCS-2 that can represent the full Unicode character set, including the extended characters. However, it does this by using four bytes (a surrogate pair of two 16-bit code units) for each extended character. (The original set is still represented by two bytes.) So UTF-16 suffers from the same problem as UTF-8 when it comes to finding the nth character, although fewer characters are affected.
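To make that concrete, here is a rough sketch in Python (chosen only for illustration, since it is not what Edix is written in). The helper names `utf16_units` and `nth_char_utf16` are made up for this example. It shows that once a string contains an extended character, indexing straight into the 16-bit units lands on the wrong thing, so you have to walk the units just as you would with UTF-8.

```python
def utf16_units(text):
    """Encode text as UTF-16 and return its 16-bit code units as integers."""
    data = text.encode("utf-16-le")  # little-endian, no byte-order mark
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def nth_char_utf16(units, n):
    """Return the code point of the n-th character (0-based) by crawling the units."""
    i = 0
    count = 0
    while i < len(units):
        unit = units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        if is_high and i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
            # Surrogate pair: two 16-bit units (four bytes) form one character.
            code_point = 0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            width = 2
        else:
            # Original 64K range: one 16-bit unit (two bytes) per character.
            code_point = unit
            width = 1
        if count == n:
            return code_point
        count += 1
        i += width
    raise IndexError("character index out of range")


sample = "a\U0001F600b"            # 'a', one extended character, 'b'
units = utf16_units(sample)

print(len(sample), len(units))     # 3 characters, but 4 code units
print(hex(units[2]))               # direct indexing hits a low surrogate, not 'b'
print(chr(nth_char_utf16(units, 2)))  # crawling finds 'b'
```

With text that stays inside the original 64K range, `units[n]` and `nth_char_utf16(units, n)` agree, which is why the problem is easy to miss in practice.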
