Re: Str

Derek Parnell wrote:
> 
> Kat wrote:
> > 
> > Derek Parnell wrote:
> > > 
> > > Kat wrote:
> > > > 
> > > > Shawn Pringle wrote:
> > > > > 
> > > > > Kat,
> > > > > 
> > > > > A EUPHORIA string is a sequence that contains integer
> > > > > values that each represent a character value.
> > > 
> > > 
> > > > Really? 
> > > 
> > > Ok, it's not quite accurate. It should read more like ...
> > > 
> > > "A EUPHORIA string is a sequence that ONLY contains POSITIVE integer
> > > values that each represent a character value."
> > > 
> > > > Since when does EUPHORIA use only 8 bits for each CHAR
> > > > in the SEQUENCE? 
> > > 
> > > Since when is the definition of "string" :: An array of 8-bit unsigned
> > > integers?
> > 
> > It isn't, it could be UTF16 as well, using only 1/2 as much memory as Eu
> > sequences.
> 
> Except for some weird characters, you are right. However, Euphoria strings are
> encoded as UTF32 only. 
> 
> > > > Since when can you load a 500mbyte STRING into EUPHORIA and
> > > > not have the OS kill the application with "too much memory used"
> > > > error (windoze allows each app to have  only 2 gigabytes)?
> > > 
> > > It doesn't. I bet a Commodore 64 couldn't do that either.
> > 
> > Strangely, there is a IDE64 for the C64 that can handle 4gigabyte files on
> > ATA drives.
> 
> That would be a disk, right, and not RAM? I thought the problem was storing huge
> amounts of bytes in RAM.

Didn't mean to confuse anyone, sorry. I just thought it interesting that a
30-year-old 8-bit computer could seek() in bigger files than Eu can on WinXP or
Vista, etc.

> > > Since when do you absolutely, positively, must have all those 500 mega BYTES
> > > in RAM at the same time? Are you saying that your task can only be achieved
> > > if all those bytes are in RAM simultaneously?
> > 
> > Damn, i pick a number that guarantees windose killing the app, and that's all
> > you can think of.
> 
> Hang on a minute ... did you tell us that you picked '500MB' because that's
> a number that will cause Windows to fail?  No you did not. 

Yes, i did. It was a simple way to show Shawn Pringle how a file of 1/4 the limit
of windows would cause a termination. *Using* a 250megabyte file in memory will
do the same.

> It seemed to me,
> and I think everyone else, that you have a need to actually have that 500MB
> in RAM. I know that you have applications that do an awful lot of text analysis
> etc on stuff from the internet, so I just thought that your concern was that
> you can't use Euphoria for those apps because of its forced UTF32 encoding.
 
There's also the overhead of memory management by Eu and windose, involving
swapping to disk, allocating and freeing, etc.. And again i point out it's not
the 500meg loading that sneakily gets you, it's using the smaller amounts of data
where the data is copied in the function call, doubling the app's memory needs.
Somehow, the apps often get to be 4x whatever the working file's size.
 
> It seems to me that if we had UTF8 encoding that you'd then complain that it
> can't handle "strings" of more than 2GB! 
> 
> > Don't you have anything better to do, Derek, or you just trolling
> > for me?
> 
> Kat, you know me better than that. I'm not shitting you or trying to upset you.
> I'm trying to get to the real issue AND try to help find ways to overcome it.
> 
> If you can tell me about some of your real world tasks that Euphoria's strings
> prevent you from performing, I'd like to find solutions to those for you. So
> far, I'm not convinced you have a case but I can have my opinion changed with
> some evidence to the contrary.

I wasn't trying to prove anything, nor was i asking for help. I am so accustomed
now to Eu not being able to handle large files, or doing so slowly, that my first
impulse is to ask how i might get around the limits, not what i might be able to
do. For instance, i recently broke up that 16gigabyte file into a few million
smaller files, and in that process, i needed a huge_seek(), which i asked about,
and which i don't think qualified as sufficient proof of any need.
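
For readers following along, here is a rough sketch of that kind of split job,
seeking to an explicit byte offset before each read. It is Python, not Euphoria,
and it is not the code used for the 16gigabyte file: split_file, chunk_size and
the part_ file naming are all made up for illustration. It assumes a platform
where the OS and Python build support file offsets past 4 gigabytes.

```python
import os


def split_file(path, out_dir, chunk_size):
    """Cut the file at `path` into pieces of at most `chunk_size` bytes.

    Hypothetical helper for illustration; returns the list of piece paths.
    """
    os.makedirs(out_dir, exist_ok=True)
    total = os.path.getsize(path)
    out_paths = []
    with open(path, "rb") as src:
        offset = 0
        index = 0
        while offset < total:
            # Explicit seek: the offset is a plain Python int, so it is not
            # capped at 32 bits by the language itself.
            src.seek(offset)
            data = src.read(chunk_size)
            if not data:
                # File shrank while we were reading; stop instead of looping.
                break
            out_path = os.path.join(out_dir, f"part_{index:07d}.bin")
            with open(out_path, "wb") as dst:
                dst.write(data)
            out_paths.append(out_path)
            offset += len(data)
            index += 1
    return out_paths
```

Only one piece is ever held in memory at a time, so the size of the source file
does not matter beyond the offset arithmetic.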

I have some 250megabytes of tab-delimited files that should fit into memory, but
won't. I'll have to gets() each file, keep a running list of what i am looking for
from each one, possibly writing that out to a file if it won't fit in memory
(which i'll find out roughly 3 days into the run <sigh>).
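
The line-at-a-time plan above could look roughly like this. Again this is a
Python sketch rather than Euphoria, and every name in it (scan_tab_files,
wanted, spill_path, max_in_memory) is an assumption made up for the example.
The idea is to read one line at a time, keep only the rows of interest, and
flush them to a spill file whenever the in-memory list reaches a set size, so
the limit is hit on purpose rather than 3 days into the run.

```python
def scan_tab_files(paths, wanted, spill_path, max_in_memory=100_000):
    """Stream tab-delimited files; return the rows still held in memory.

    `wanted` is a caller-supplied function that takes the list of fields
    for one line and returns True if the row should be kept.
    Rows flushed earlier are found in the file at `spill_path`.
    """
    kept = []
    with open(spill_path, "w", encoding="utf-8", newline="") as spill:
        for path in paths:
            with open(path, "r", encoding="utf-8", newline="") as src:
                # Iterating the file yields one line at a time, much like
                # calling gets() in a loop; the whole file is never loaded.
                for line in src:
                    fields = line.rstrip("\r\n").split("\t")
                    if wanted(fields):
                        kept.append(fields)
                    if len(kept) >= max_in_memory:
                        # Running list is full: write it out and start over.
                        for row in kept:
                            spill.write("\t".join(row) + "\n")
                        kept.clear()
    return kept
```

Memory use is then bounded by max_in_memory rows plus one line, whatever the
total size of the input files.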

Kat