Re: internal storage

CoJaBo wrote:
> 
> Derek Parnell wrote:

[snip]

> > All 'characters' are stored as 4-byte integers and not stored as single 
> > bytes.
>
> This DEFINITELY should be improved in a new version of Euphoria.
> There are many times where I use allocated memory to get around
> this problem.
> Euphoria 2.5 (or 2.6 if it would take too long) should use 1-byte
> instead of 4-byte whenever possible.

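On the other hand, Euphoria's choice of storing each character as an
integer (good for non-negative values up to 30 bits) makes Unicode
very, very easy to implement. Encoding in UTF-32 is a one-to-one
mapping: the highest Unicode code point, U+10FFFF, needs only 21 bits,
so every character fits in an integer and none would need to be stored
in atoms.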
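
A quick way to check how Unicode code points relate to that integer
range is sketched below. It is in Python rather than Euphoria, purely
for illustration, and it assumes the upper bound of a Euphoria integer
is 1073741823 (2^30 - 1); the constant and function names are my own.

```python
import sys

# Assumed upper bound of a Euphoria integer: 2^30 - 1 = 1073741823.
EUPHORIA_INT_MAX = 0x3FFFFFFF

def to_code_points(text):
    """Return the text as a list of code points, one integer per character."""
    return [ord(ch) for ch in text]

def fits_in_euphoria_integer(text):
    """True if every code point in the text is within the assumed integer range."""
    return all(cp <= EUPHORIA_INT_MAX for cp in to_code_points(text))

sample = "Aé€漢😀"  # ASCII, Latin-1, BMP symbol, CJK, and a non-BMP character
print(to_code_points(sample))
print(fits_in_euphoria_integer(sample))
# sys.maxunicode is the highest code point Python supports (U+10FFFF).
print(sys.maxunicode <= EUPHORIA_INT_MAX)
```

The one-integer-per-character layout is what makes indexing and slicing
trivial: element N of the sequence is always character N.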

At the risk of complicating Euphoria, there may be a case to argue for
a native UTF-8 character string. This would mean that English text
would use 8 bits per character, and most European languages would
average around 8-10 bits per character, though the East Asian languages
would average closer to 24 bits per character, since most of their
characters take three bytes in UTF-8. Microsoft has decided to store
Unicode strings in UTF-16, which means that most languages in the
world use about 2 bytes per character.
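
These per-character costs are easy to measure. The Python sketch below
(again, illustration only, not Euphoria) encodes a few sample strings
of my own choosing and reports the average bits per character. The
"-le" codec names are used so that no byte-order mark is counted.

```python
def bits_per_char(text, encoding):
    """Average number of bits used per character in the given encoding."""
    return 8 * len(text.encode(encoding)) / len(text)

# Hypothetical sample strings; real text will vary with its mix of characters.
samples = {
    "English": "storage",
    "German": "Größe und Länge",
    "Japanese": "日本語",
}

for name, text in samples.items():
    sizes = {enc: bits_per_char(text, enc)
             for enc in ("utf-8", "utf-16-le", "utf-32-le")}
    print(name, sizes)
```

Short samples like these exaggerate or understate the averages, so
treat the figures as indicative rather than as benchmarks.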

Of course, you could roll your own 'packed' string type on top of
Euphoria sequences, at the cost of slower execution speed.
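
To make the idea concrete, here is one possible packing scheme,
sketched in Python for illustration. It is an assumption of mine, not
an existing Euphoria library: three 8-bit characters go into each
integer (24 bits, comfortably inside a 30-bit range), and the first
element holds the original length. All names are hypothetical.

```python
CHARS_PER_ELEMENT = 3  # three 8-bit characters = 24 bits per integer

def pack(data):
    """Pack a bytes object into a list of integers, three bytes per integer."""
    packed = [len(data)]  # first element records the unpadded length
    for i in range(0, len(data), CHARS_PER_ELEMENT):
        # Pad the final chunk with zero bytes so every element holds 3 bytes.
        chunk = data[i:i + CHARS_PER_ELEMENT].ljust(CHARS_PER_ELEMENT, b"\0")
        packed.append(int.from_bytes(chunk, "big"))
    return packed

def unpack(packed):
    """Reverse pack(): rebuild the original bytes and drop the padding."""
    length = packed[0]
    data = b"".join(v.to_bytes(CHARS_PER_ELEMENT, "big") for v in packed[1:])
    return data[:length]

def char_at(packed, index):
    """Fetch one character; the divide, shift and mask is the extra cost."""
    element = packed[1 + index // CHARS_PER_ELEMENT]
    shift = 8 * (CHARS_PER_ELEMENT - 1 - index % CHARS_PER_ELEMENT)
    return (element >> shift) & 0xFF

packed = pack(b"Euphoria")
print(len(packed), unpack(packed), chr(char_at(packed, 2)))
```

Every character access now needs a division, a shift and a mask
instead of a plain subscript, which is where the slowdown comes from.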

But there can't be many applications where keeping all the text in
RAM simultaneously actually gives a performance boost. Most
applications would only be dealing with a subset of the text at any one
time. I don't think Google keeps all its cached pages in RAM ;-)

<anecdote>
I once wrote a tiny text editor (4KB of assembler) in which the text was
never stored in RAM, just the disk address of each line. It ran so fast
on an Intel 8088 that people didn't notice it was continually going
out to disk to read text in.
</anecdote>
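
The general technique (not the original assembler code, which is not
shown here) can be sketched in a few lines of Python: scan the file
once to record where each line starts, then seek and read a line only
when it is needed. The function names and demo file are hypothetical.

```python
import os
import tempfile

def build_line_index(path):
    """Scan the file once, recording the byte offset where each line starts."""
    offsets = []
    with open(path, "rb") as f:
        pos = 0
        for line in f:
            offsets.append(pos)
            pos += len(line)
    return offsets

def read_line(path, offsets, n):
    """Fetch line n from disk using its stored offset; nothing else is loaded."""
    with open(path, "rb") as f:
        f.seek(offsets[n])
        return f.readline().rstrip(b"\r\n").decode("utf-8")

# Small demonstration with a temporary file.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"first line\nsecond line\nthird line\n")
    demo_path = tmp.name

index = build_line_index(demo_path)
second = read_line(demo_path, index, 1)
print(index, second)
os.remove(demo_path)
```

Only the list of offsets lives in memory, so the footprint grows with
the number of lines rather than with the size of the text.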

-- 
Derek Parnell
Melbourne, Australia
