Re: internal storage
- Posted by Jim Hendricks <jim at bizcomputinginc.com> Sep 20, 2004
Derek Parnell wrote:
>
> Jim Hendricks wrote:
> >
> > The manual states:
> >
> > Performance Note:
> > Does this mean that all atoms are stored in memory as 8-byte
> > floating-point numbers? No. The Euphoria interpreter usually
> > stores integer-valued atoms as machine integers (4 bytes) to
> > save space and improve execution speed. When fractional results
> > occur or numbers get too big, conversion to floating-point happens
> > automatically.
> >
> > My question is are string sequences then stored as 4 byte atoms or as
> > 1 byte atoms?
> >
> > This is quite important to the app I am porting since I am parsing about
> > 4 Meg of character data and the process performs much better if I can
> > parse the character data to an intermediate state in memory which causes
> > having all the data in memory at once. I could always go with an
> > intermediate state to temp files, but this causes the process to run 1000%
> > slower( on windoze ). I can live with requiring 4 Meg of memory consumption
> > for this process, but not so keen on requiring 16 Meg of memory consumption.
>
> All 'characters' are stored as 4-byte integers and not stored as single
> bytes.
>
> If you have large text strings in RAM, you might need to rethink your
> design, or let the operating system deal with the virtual memory.
>
> Another consideration is that each sequence is stored in contiguous RAM
> so rather than have a giant single sequence containing all your
> text, it is better to break it up into a sequence of sequences.
>
> E.G.
>
> sequence theFile
> theFile = {
>     "line one",
>     "line two",
>     "line three"
> }
>
> This is four sequences: one for the file and another for each line. So it
> takes up four independant RAM blocks. This can effect RAM paging swapping
> for very large sequences.
>
> --
> Derek Parnell
> Melbourne, Australia

Thanks, I will have to go with temp files then and take the performance hit.
It's not a key function of the application so performance is not important; it's just annoying to take a process that runs in a few seconds in memory and turn it into something that runs in a minute or more via temp files. (I may have been exaggerating a little with my 1000% slowdown, but it sure feels that way.)

I know about this slowdown because this app is running fine in Java. I first wrote it against memory, didn't like how much memory was consumed, went to a temp file and saw the slowdown. Java Strings use 2-byte chars, but I think char and byte still take 4 bytes internally with Java, because I canned using Strings and went with byte arrays and didn't see any improvement in memory usage.

I am already using sequences of sequences because it allows me to structure the intermediate data in a very navigable way, so I guess it's good to hear that the decision had some hidden benefits and I managed by accident to dodge a performance bullet.
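
For reference, here is a rough sketch of the arithmetic behind the 4 Meg versus 16 Meg figures, plus one possible way to hold text at one byte per character by poking it into raw memory. This is only an illustration under assumptions, not something proposed in the replies: it assumes every character fits in a single byte (values above 255 would be truncated by poke), and the names SAMPLE, buf and back are made up for the example.

```euphoria
include machine.e   -- allocate() and free()

-- Hypothetical sample data; a real run would use text read from a file.
constant SAMPLE = "line one"

atom buf            -- address of the raw memory block
sequence back       -- the text copied back into an ordinary sequence

-- Cost estimate from the thread: about 4 Meg of characters at
-- 4 bytes per sequence element, in bytes.
printf(1, "sequence storage: %d bytes\n", 4 * 1024 * 1024 * 4)

-- Pack the characters one byte each into raw memory.
buf = allocate(length(SAMPLE))
poke(buf, SAMPLE)

-- Read the bytes back when they need to be processed.
back = peek({buf, length(SAMPLE)})
puts(1, back & '\n')

-- Raw memory is not garbage collected, so release it explicitly.
free(buf)
```

The trade-off is that packed text has to be peeked back into a sequence before it can be parsed, and the block has to be freed by hand, so this would probably only pay off for data that sits idle in memory most of the time.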