Re: euphoria text processing

DerekParnell said...
EUWX said...

The simple fact is that Euphoria has a 4 byte character as DEFAULT.

This is nitpicking, I know, but by using the word "default" here, it might imply that there are alternatives. There are none. Euphoria only stores characters as integers, and an integer can hold any of the valid Unicode values (code points) (0 to #10FFFF) and each takes up 4 bytes in RAM.

Or 8 bytes if you're using a 64-bit euphoria!
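To put numbers on the storage cost discussed above, here is a minimal C sketch. It only models "one fixed-width integer slot per character" and says nothing about how Euphoria lays out sequences internally; the helper names `is_code_point` and `storage_bytes` are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if cp lies in the Unicode code point
 * range 0..0x10FFFF, otherwise 0. */
int is_code_point(uint32_t cp) {
    return cp <= 0x10FFFFu;
}

/* Bytes needed for n_chars characters when each one occupies a
 * fixed-width slot of slot_bytes bytes. */
size_t storage_bytes(size_t n_chars, size_t slot_bytes) {
    return n_chars * slot_bytes;
}
```

With 4-byte slots a five-character string costs 20 bytes; with 8-byte slots, as on a 64-bit build, it costs 40.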

DerekParnell said...
EUWX said...

In C or C++, there is a methodology based on a W prefix signifying wide characters, or 16-bit characters. However, I don't think anybody has made an effort to allow this way of compilation.

Unless I'm misunderstanding you, the "W" suffix is actually a Windows convention for naming their API functions and has nothing to do with C or C++. In the C/C++ language, there is a way of specifying 'wide' characters, and that is to use the 'L' prefix, e.g. wchar_t wszStr[] = L"1a1g";

And even then, the size of a wchar_t is platform dependent. On Windows, it's 16 bits, but on Unix-like systems, it's usually 32 bits.
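A quick way to check what your own compiler does is to ask it. This is a small sketch using only standard C headers; the function names `wchar_bits` and `sample_length` are made up for illustration, and the literal is the one quoted above.

```c
#include <limits.h>
#include <stddef.h>
#include <wchar.h>

/* Width of wchar_t, in bits, on the platform this is compiled for. */
size_t wchar_bits(void) {
    return sizeof(wchar_t) * CHAR_BIT;
}

/* Number of wchar_t units in the sample literal, not counting the
 * terminating L'\0'. */
size_t sample_length(void) {
    const wchar_t wszStr[] = L"1a1g";
    return wcslen(wszStr);
}
```

The length is 4 units either way, but the array occupies 10 bytes where wchar_t is 16 bits and 20 bytes where it is 32 bits (terminator included), which is why code that assumes one particular width does not port cleanly.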

DerekParnell said...
EUWX said...

You can quite easily use the peek/poke facilities of Euphoria to create the full 16-bit LE text, and write your own small functions to select a character or a number of characters, rotate, etc. Alternatively, you can modify the actual Euphoria low-level functions in C (or C++). Whatever you do, you will have to create two levels of text functions. If you happen to want the extended Chinese, then you can use the 4-byte-per-character setup similar to what I suggested above.

peek() and poke() would only be needed for converting text created by non-Euphoria applications into Euphoria sequences.

Indeed, Euphoria already has peek2/poke2 routines (and, of course, has had peek4/poke4 forever). The memstruct functionality would allow another way to read/write wide chars to memory in a portable way.
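For anyone who wants to see what that conversion amounts to, here is a rough C sketch of reading 16-bit LE text from memory into one integer per character, which is the job a peek2-style reader does on the Euphoria side. The function names are made up for illustration, and passing unpaired surrogates through unchanged is a choice of this sketch, not something any library mandates.

```c
#include <stddef.h>
#include <stdint.h>

/* Read one unsigned 16-bit little-endian value, independent of the
 * host's own byte order. */
static uint16_t read_u16le(const unsigned char *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Decode n_bytes of UTF-16LE from buf into out, one code point per
 * element. Returns the number of code points written (at most out_cap).
 * A trailing odd byte is ignored; unpaired surrogates are copied
 * through unchanged.
 */
size_t utf16le_to_code_points(const unsigned char *buf, size_t n_bytes,
                              uint32_t *out, size_t out_cap) {
    size_t i = 0, n = 0;
    while (i + 1 < n_bytes && n < out_cap) {
        uint32_t unit = read_u16le(buf + i);
        i += 2;
        /* A high surrogate followed by a low surrogate encodes one
         * code point above 0xFFFF. */
        if (unit >= 0xD800 && unit <= 0xDBFF && i + 1 < n_bytes) {
            uint32_t low = read_u16le(buf + i);
            if (low >= 0xDC00 && low <= 0xDFFF) {
                unit = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00);
                i += 2;
            }
        }
        out[n++] = unit;
    }
    return n;
}
```

Once the text is in this form, every element is a whole character, so selecting or rotating characters is plain array indexing with no second set of "wide" text functions.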

Matt
