Re: Constructing sequences


Robert Craig wrote:
> 
> Gordon Webster wrote:
> > Aaaaaaaaaaah! Thanks Matt
> > 
> > It's all the individual memory allocations that take the time on the
> > 'append' operation. Using your method of pre-allocating memory in blocks,
> > my 40 second sequence build comes down to 0.05s - an 800x speedup!
> > 
> > BTW: just for fun, when I compiled this, the time was too small to be
> > displayed by the time() function for 100,000 records added.
> > 
> > I will allocate and expand my sequences in blocks from now on.
> 
> That sounds like a good solution, however
> I did a test on my Pentium-4 1.8GHz machine...
> 
> }}}
<eucode>
> integer fn
> object line
> sequence buffer
> 
> -- create file
> fn = open("bigfile.txt", "wb")
> for i = 1 to 100000 do
>     puts(fn, 'A' + rand(repeat(100, 19 + rand(11))))
>     puts(fn, '\n')
> end for
> close(fn) -- file is about 2.5Mb
> 
> -- read file
> fn = open("bigfile.txt", "rb") -- 100,000 lines of 20 to 30 random chars
> 
> atom t
> t = time()
> 
> buffer = {}
> while 1 do
>     line = gets(fn)
>     if atom(line) then
>         exit
>     end if
>     buffer = append(buffer, line)
> end while
> 
> close(fn)
> 
> ? time() - t -- 4.06 seconds, 1.8GHz Pentium-4
> 
> ? length(buffer) -- 100,000
</eucode>
{{{
> 
> With 2.5, ex.exe took 4.06 seconds. 
> With 2.4, ex.exe took 4.34 seconds.
> 
> Maybe your machine is much slower, or maybe you
> were not appending to a simple variable, like "buffer",
> but rather to a subscripted variable. The latter case
> is not optimized as well.
> 
> Regards,
>    Rob Craig
>    Rapid Deployment Software
>    http://www.RapidEuphoria.com
> 


Dear Rob,

You are absolutely right, I was writing to a subscripted variable.

I had created a sequence whose first few elements hold the sequence's
'config', and whose last element is another sequence destined to hold
the actual data records.

In effect I was 'appending' to mysequence[$]. I didn't realise that there
was such a difference between subscripted and unsubscripted variables
for this purpose (and my machine is pretty fast - a 3.2GHz Pentium).
Thanks for the tip. This is something I might never have discovered
for myself.
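
A minimal sketch of the difference, assuming a container laid out as
described above (config elements first, the record list last). The names
`mysequence` and `records` and the dummy data are illustrative only and
are not taken from the actual program. The idea is to pull the last
element out into a simple variable, append to that, and store it back
once at the end:

```euphoria
-- Hypothetical layout: {config1, config2, records}
sequence mysequence, records
mysequence = {"config1", "config2", {}}

-- Pattern that was slow for me: every append targets a subscripted variable
-- for i = 1 to 100000 do
--     mysequence[$] = append(mysequence[$], i)
-- end for

-- Alternative: append to a simple variable, then store it back once
records = mysequence[$]
for i = 1 to 100000 do
    records = append(records, i)    -- the case Rob describes as better optimized
end for
mysequence[$] = records

? length(mysequence[$])
```

The loop body is the same shape as Rob's `buffer = append(buffer, line)`
test; only the final assignment touches the subscripted element.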

This list is a terrific resource and I have had some great help here,
for which I'm very grateful. Coding in Euphoria is certainly a pleasure.
I love how lean, clean and readable my code is and (an important feature
for me) how fast it runs ...

... when you do things the "right" way, as I'm learning today :-)
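
For anyone finding this thread later, the block pre-allocation idea
mentioned in the quoted exchange can be sketched roughly as follows. This
is not Matt's actual code, which is not quoted in this message; the block
size `CHUNK` and the names `buffer` and `count` are assumptions made for
illustration.

```euphoria
-- Hypothetical block size; tune it for your own data
constant CHUNK = 10000

sequence buffer
integer count
buffer = repeat(0, CHUNK)   -- pre-allocate the first block
count = 0

for i = 1 to 100000 do
    if count = length(buffer) then
        -- out of room: grow by one whole block in a single step
        buffer &= repeat(0, CHUNK)
    end if
    count += 1
    buffer[count] = i       -- store the record in the next free slot
end for

buffer = buffer[1..count]   -- trim the unused tail

? length(buffer)
```

The sequence grows once per block rather than once per record, and the
final slice discards any slots that were allocated but never filled.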

I know that you have a "Performance Tips" section in the online docs,
but it might be worth compiling some more of this kind of advice into it.
Trawling through 9 years' worth of the EuForum is not so easy - the search
is lightning quick, but wading through the output can be a slog. It would
really be a shame if a newcomer to Euphoria like myself were to try out
the language without being aware of these things, erroneously conclude
that Perl/Python etc. (pick your favorite interpreter) was faster, and go
looking elsewhere for speedy code.

Best

Gordon
