OpenEuphoria: Forum: Re: Euphoria 2.5 Features..... ??

Re: Euphoria 2.5 Features..... ??

Posted by Mario Steele <eumario at trilake.net> Dec 05, 2003
Hey Al,

Thanks for your response. As for the Unicode issue, I do know of an 
easy way to make that work too, while still allowing a single string 
type definition.  All you would have to do is a quick scan through the 
input stream to see whether each char is in the 0 to 255 range (one 
byte) or the 256 to 65535 range (two bytes).  An example of this would be:

function need_2_bytes(sequence stream)
    -- Returns 1 as soon as an element needs two bytes, -1 as soon as an
    -- element is too large for two bytes, and 0 if all fit in one byte.
    for x = 1 to length(stream) do
        if stream[x] > 255 and stream[x] <= #FFFF then
            return 1
        elsif stream[x] > #FFFF then
            return -1
        end if
    end for
    return 0
end function

Obviously, as soon as it runs into the first character that needs two 
bytes, we don't need to check any further: we assume it's Unicode and 
allocate the string as such.  If it returns -1, then we have a 
type_check error, which means that someone threw in something that's 
bigger than 65535 (#FFFF).  And if we get 0 back, then the stream can be 
put into single-byte character holders.  I'm sure there are faster 
algorithms out there that can scan byte-wise much faster than this.  
The problem is, people don't want to deal with the memory routines 
themselves unless it's explicitly needed by Windows.  They'd rather use 
sequences, and their programs get bloated.  That's why sequences are so 
popular in Euphoria: they get away from the pointers that C/C++ depends 
on.  But again, this is just a simple programmer writing his two cents 
in. LOL
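
For illustration, here is a rough sketch of how a scan like the one above could feed a packed store, so that text takes one or two bytes per character instead of a full sequence element. It assumes Euphoria 2.x-style syntax and that two bytes cover 0 to #FFFF. The names char_width and pack_text are made up for this example, and the low-byte-first layout is just one possible choice, not anything Euphoria prescribes.

```euphoria
include machine.e   -- allocate(), free()

-- Hypothetical helper: bytes needed per character (1 or 2),
-- or -1 if any element is not an integer in 0..#FFFF.
function char_width(sequence text)
    integer width
    object c
    width = 1
    for i = 1 to length(text) do
        c = text[i]
        if not integer(c) then
            return -1           -- nested sequence or fractional value
        elsif c < 0 or c > #FFFF then
            return -1           -- does not fit in two bytes
        elsif c > 255 then
            width = 2           -- keep scanning for out-of-range values
        end if
    end for
    return width
end function

-- Hypothetical helper: copy text into a freshly allocated block.
-- Returns {address, width}, or {0, 0} on failure, so the result
-- always has the same shape.
function pack_text(sequence text)
    integer width
    atom addr
    width = char_width(text)
    if width = -1 then
        return {0, 0}
    end if
    -- +1 so that an empty string still requests a non-zero block
    addr = allocate(length(text) * width + 1)
    if addr = 0 then
        return {0, 0}
    end if
    if width = 1 then
        poke(addr, text)        -- one byte per character
    else
        for i = 1 to length(text) do
            poke(addr + 2 * (i - 1), and_bits(text[i], #FF))      -- low byte
            poke(addr + 2 * (i - 1) + 1, floor(text[i] / 256))    -- high byte
        end for
    end if
    return {addr, width}
end function

sequence packed
packed = pack_text("AB")
-- packed[2] is 1 here: both characters fit in a single byte each
if packed[1] != 0 then
    free(packed[1])
end if
```

Unlike need_2_bytes, this version keeps scanning after the first two-byte character, so an out-of-range value later in the stream is still caught. The cost is that it always walks the whole sequence.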

L8ers,
EuMario

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Mario Steele (EuMario)
Tuscan Chat Client
http://www.tuscanchat.com
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Al Getz wrote:

>
>Hello Mario,
>
>I'd like to comment on your string type observations mainly.
>
>I too noticed that opening a standard text file in an editor
>written in Euphoria, using sequences to store the characters
>line by line, requires four times as much memory as there are
>bytes of characters.  Such a waste, but if there were to be
>a string type that operates like sequences but only stores
>one byte per element (that's such a good idea!) it would
>have to be convertible to 2 bytes per character as well,
>given the trend toward using Unicode, which would require
>two bytes per character.
>Currently, opening a file that contains 1,000,000 characters 
>takes up 4,000,000 bytes of memory just for the characters alone.
>
>The only other way I can think of to do it is to manage your
>own memory block.  You'd have to do a comparison to see if
>it took that much longer to peek/poke than to simply get
>a line of text from a sequence.  In any case it would be
>more code to think about.
>
>I'm all for a string type, but as I was saying it would have
>to be expandable to 2 bytes per element, or possibly have 
>a 'string2' type that uses 2 bytes instead of 1 per char.
>
>I've found that most of the time when I'm using a sequence
>a certain way...such as for a string of chars...I seldom
>change this later in the program to store, say, an integer or
>atom (that is, an integer in the form of a number, not a char).
>It seems easier to follow the code flow if the sequences
>don't change too much.  Even when I'm returning a variable
>from a function that 'sometimes' has to return a sequence
>rather than an integer I try to keep the basic format of 
>the returned structure the same:
>return {0,""}
>for an error condition rather than just
>return 0
>which would have meant it returns an object rather than a sequence
>all the time.
>
>It would introduce some confusion however, because you would
>have to know when a sequence was used as a string rather than
>a full blown sequence.
>
>Most text applications don't use that much text anyway, but
>on the other hand if you think about it there are an AWFUL
>lot of blocks of RAM that look something like:
>
>00
>00
>00
>65
>00
>00
>00
>66
>
>etc
>
>when you open a text file in an editor that uses sequences to
>store text.
>
>
>I would imagine that using 'allocate_string' over and over
>as a line of text changes would take lots more time to
>do, but if the program was done with this in mind it might
>be actually faster to draw on screen if the text is already
>in memory...
>
>Take care,
>Al
>  
>