Re: internal storage

Derek Parnell wrote:
> 
> Jim Hendricks wrote:
> > 
> > The manual states:
> > 
> >  Performance Note:
> >     Does this mean that all atoms are stored in memory as 8-byte 
> >     floating-point numbers? No. The Euphoria interpreter usually 
> >     stores integer-valued atoms as machine integers (4 bytes) to 
> >     save space and improve execution speed. When fractional results 
> >     occur or numbers get too big, conversion to floating-point happens 
> >     automatically.
> > 
> > My question is: are string sequences then stored as 4-byte atoms or as 
> > 1-byte atoms?
> > 
> > This is quite important to the app I am porting, since I am parsing about
> > 4 Meg of character data, and the process performs much better if I can
> > parse the character data to an intermediate state in memory, which means
> > having all the data in memory at once. I could always write the
> > intermediate state to temp files, but this causes the process to run 1000%
> > slower (on windoze). I can live with 4 Meg of memory consumption
> > for this process, but I am not so keen on requiring 16 Meg.
> 
> All 'characters' are stored as 4-byte integers and not stored as single 
> bytes.
> 
> If you have large text strings in RAM, you might need to rethink your 
> design, or let the operating system deal with the virtual memory.
> 
> Another consideration is that each sequence is stored in contiguous RAM,
> so rather than having a single giant sequence containing all your
> text, it is better to break it up into a sequence of sequences.
> 
> E.G. 
> 
> sequence theFile
> theFile = {
>   "line one", 
>   "line two", 
>   "line three"
>    }
> 
> This is four sequences: one for the file and one for each line. So it
> takes up four independent RAM blocks. This can affect RAM paging and
> swapping for very large sequences.
> 
> -- 
> Derek Parnell
> Melbourne, Australia
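
Just to put numbers on that, here is a back-of-the-envelope sketch. It assumes
the flat 4 bytes per element described above and ignores any per-sequence
overhead, so treat the result as a rough lower bound, not a measurement.

```euphoria
-- Rough estimate only: assumes 4 bytes per sequence element,
-- as stated above, and ignores per-sequence overhead.
constant BYTES_PER_ELEMENT = 4

atom chars, estimate
chars = 4 * 1024 * 1024                 -- about 4 Meg of character data
estimate = chars * BYTES_PER_ELEMENT    -- bytes needed as sequence elements

printf(1, "%d characters -> about %d Meg in memory\n",
       {chars, estimate / (1024 * 1024)})
```

That is where my 16 Meg figure comes from.
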
Thanks, I will have to go with temp files then and take the performance
hit.  It's not a key function of the application so performance is not
important; it's just annoying to take a process that runs in a few seconds in
memory to something that runs in a minute or more via temp files. (I may
have been exaggerating a little with my 1000% slowdown, but it sure 
feels that way.)  I know about this slowdown because this app is running fine
in Java: I first wrote it against memory, didn't like how much memory
was consumed, went to a temp file and saw the slowdown.  Java Strings use
2-byte chars, but I think char and byte still take 4 bytes internally in
Java, because I canned using Strings and went with byte arrays and didn't
see any improvement in memory usage. 

I am already using sequences of sequences because it lets me structure
the intermediate data in a very navigable way, so I guess it's good to
hear that the decision had some hidden benefits and that I managed, by
accident, to dodge a performance bullet.
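
For anyone who hits the same wall: one option I have not tried here, but which
might keep the text at one byte per character, is to park it in raw memory
with allocate() and poke() and only peek() it back into a sequence when it is
needed. This is an untested sketch, assuming the standard machine.e include
(std/machine.e on newer releases); the variable names are just for
illustration, and it only works for values that fit in a byte.

```euphoria
include machine.e   -- allocate() and free()

sequence text, restored
atom buf

text = "line one"

-- Reserve one byte per character instead of one sequence element each.
buf = allocate(length(text))
if buf = 0 then
    puts(1, "allocation failed\n")
    abort(1)
end if

-- poke() stores the low 8 bits of each element at consecutive addresses.
poke(buf, text)

-- Later, read the bytes back into an ordinary sequence when needed.
restored = peek({buf, length(text)})
puts(1, restored & '\n')

free(buf)   -- raw memory is not garbage collected, so release it by hand
```

The catch is that the data is no longer a sequence while it sits in that
buffer, so any parsing has to peek() pieces back out first, and the lengths
and addresses have to be tracked by hand.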
