1. From a newbie to newbies

Watch out for get_bytes().

If you say

    sequence blk
    integer handle
    handle = open( "HUGEFILE.DAT", "rb" )
    blk = get_bytes( handle, 72000000 )

then be prepared for a bit of a wait. Yes, there's the wait while the data
actually gets sucked off the disk, but there's also the wait while get_bytes
executes a repeat() to set aside the memory, and then the loop from 1 to
72000000 while get_bytes() gets each byte (individually?!). Added to all that is
the memory and disk swapping that goes on while the underlying system tries to
provide the memory. Add it all up and you've got a big performance hit.
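
To make that description concrete, here is a rough sketch of a byte-at-a-time
reader. It is only my guess at the general shape, not the actual library
source, and sketch_get_bytes is a made-up name:

```euphoria
-- Hypothetical sketch, NOT the real get_bytes() source.
function sketch_get_bytes(integer fn, integer n)
    sequence s
    integer c
    s = repeat(0, n)          -- set aside room for all n elements up front
    for i = 1 to n do
        c = getc(fn)          -- one call per byte; getc() returns -1 at end of file
        if c = -1 then
            return s[1..i-1]  -- short read: keep only the bytes actually read
        end if
        s[i] = c
    end for
    return s
end function
```

With n = 72000000, both the repeat() and the loop run over 72 million
elements before anything comes back.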

Question to the Euphoric Community: is there anything better out there than
get_bytes()? Any Windows function?

Bruce.

2. Re: From a newbie to newbies

On Mon, 10 May 1999, Bruce M. Axtens wrote:

] Watch out for get_bytes().
]
] If you say
]
]     sequence blk
]     integer handle
]     handle = open( "HUGEFILE.DAT", "rb" )
]     blk = get_bytes( handle, 72000000 )
]
] then be prepared for a bit of a wait. Yes, there's the wait while the data
] actually gets sucked off the disk, but there's also the wait while get_bytes
] executes a repeat() to set aside the memory, and then the loop from 1 to
] 72000000 while get_bytes() gets each byte (individually?!). Added to all
] that is the memory and disk swapping that goes on while the underlying
] system tries to provide the memory. Add it all up and you've got a big
] performance hit.

72000000 bytes is 72 MB. Why on earth are you loading a file that size in
one go? The only files I can think of that are that size are uncompressed
media like .BMP and .WAV.

As for get_bytes(): while the code for it reads the data one byte at a
time (it has to, it's a library routine written in Euphoria with no access
to the interpreter's internals!), Euphoria actually reads 8K at a time into
an internal buffer, so for your 72 MB file, Euphoria only reads the disk
~8800 times.

However, only extremely rich people have computers with more than 64 MB (at
the time of this post ;) ), so you're bound to get some kind of disk
swapping via Euphoria's VM scheme, turning everything to molasses.

I suggest using get_bytes() piecewise, reading in 2 MB (say) at a time.
Most computers don't need to swap to disk at this size.
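
Something along these lines, as a minimal sketch. I'm assuming Euphoria 2.x,
where get_bytes() comes from get.e; CHUNK and process_chunk() are names I
made up for illustration, so swap in whatever your program actually does
with the data:

```euphoria
include get.e   -- assumption: get_bytes() lives here in Euphoria 2.x;
                -- other releases may keep it in a different include file

constant CHUNK = 2 * 1024 * 1024   -- 2 MB per read (hypothetical name)

-- Hypothetical placeholder: do the real work on each piece here.
procedure process_chunk(sequence blk)
    printf(1, "got %d bytes\n", {length(blk)})
end procedure

integer handle
sequence blk

handle = open("HUGEFILE.DAT", "rb")
if handle = -1 then
    puts(2, "Can't open HUGEFILE.DAT\n")
    abort(1)
end if

while 1 do
    blk = get_bytes(handle, CHUNK)
    if length(blk) > 0 then
        process_chunk(blk)
    end if
    if length(blk) < CHUNK then
        exit   -- a short (or empty) read means end of file
    end if
end while

close(handle)
```

Only one chunk is held in blk at any moment, so memory use stays around
CHUNK elements no matter how big the file is.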

] Question to the Euphoric Community: is there anything better out there than
] get_bytes(). Any Windows function?

Nope. It would probably be possible for RDS to implement it in the
interpreter as a new command, but you're not going to get much faster,
even *with* Windows.

I've already asked Rob about increasing the 8k limit on the internal
buffer, and IIRC, it's on his "might do this if I get bored" list... :)

Happy Coding,
Carl (Who had three attempts to type his own name here...)

--
Carl R White -- cyrek- at -bigfoot.com -- http://www.bigfoot.com/~cyrek
 aka Cyrek   --    No hyphens :)    --       Bigfoot URL Alias
