Re: Constructing sequences
- Posted by Gordon Webster <gwalias-bb at yahoo.com> Jun 27, 2005
Robert Craig wrote:
>
> Gordon Webster wrote:
> > Aaaaaaaaaaah! Thanks Matt
> >
> > It's all the individual memory allocations that take the time on the
> > 'append' operation. Using your method of pre-allocating memory in blocks,
> > my 40 second sequence build comes down to 0.05s - an 800x speedup!
> >
> > BTW: just for fun, when I compiled this, the time was too small to be
> > displayed by the time() function for 100,000 records added.
> >
> > I will allocate and expand my sequences in blocks from now on.
>
> That sounds like a good solution, however
> I did a test on my Pentium-4 1.8GHz machine...
>
> <eucode>
> integer fn
> object line
> sequence buffer
>
> -- create file
> fn = open("bigfile.txt", "wb")
> for i = 1 to 100000 do
>     puts(fn, 'A' + rand(repeat(100, 19 + rand(11))))
>     puts(fn, '\n')
> end for
> close(fn) -- file is about 2.5Mb
>
> -- read file
> fn = open("bigfile.txt", "rb") -- 100,000 lines of 20 to 30 random chars
>
> atom t
> t = time()
>
> buffer = {}
> while 1 do
>     line = gets(fn)
>     if atom(line) then
>         exit
>     end if
>     buffer = append(buffer, line)
> end while
>
> close(fn)
>
> ? time() - t -- 4.06 seconds, 1.8GHz Pentium-4
>
> ? length(buffer) -- 100,000
> </eucode>
>
> With 2.5, ex.exe took 4.06 seconds.
> With 2.4, ex.exe took 4.34 seconds.
>
> Maybe your machine is much slower, or maybe you
> were not appending to a simple variable, like "buffer",
> but rather to a subscripted variable. The latter case
> is not optimized as well.
>
> Regards,
> Rob Craig
> Rapid Deployment Software
> http://www.RapidEuphoria.com
>

Dear Rob,

You are absolutely right, I was writing to a subscripted variable. I had created a sequence whose first few elements are the sequence 'config', and whose last element is another sequence destined to hold the actual data records. In effect I was 'appending' to mysequence[$].
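For anyone who finds this thread later, here is a minimal sketch of the two patterns, not my actual program. The config values, the loop count of 20,000 and the name `records` are made up for illustration; only `mysequence[$]` comes from my description above. The timings will depend on the machine and the interpreter version, so I have not quoted any.

```euphoria
-- Hypothetical layout: the first elements hold config values,
-- the last element holds the data records.
sequence mysequence
sequence records   -- hypothetical helper: a simple (unsubscripted) variable
atom t

mysequence = {"config-a", "config-b", {}}

-- Pattern 1: append directly to a subscripted variable.
-- This is the case Rob says is not optimized as well.
t = time()
for i = 1 to 20000 do
    mysequence[$] = append(mysequence[$], i)
end for
? time() - t

-- Pattern 2: append to a simple variable, then store the result once.
mysequence[$] = {}
records = {}
t = time()
for i = 1 to 20000 do
    records = append(records, i)
end for
mysequence[$] = records
? time() - t

? length(mysequence[$]) -- 20000
```

Both loops build the same 20,000 records. The second one just does its appends on a simple variable and assigns to the subscripted element a single time at the end.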
I didn't realise that there was such a difference between subscripted and unsubscripted variables for this purpose (and my machine is pretty fast - a 3.2GHz Pentium).

Thanks for the tip. This is something I might never have discovered for myself. This list is a terrific resource and I have had some great help here, for which I'm very grateful.

Coding in Euphoria is certainly a pleasure. I love how lean, clean and readable my code is and (an important feature for me) how fast it runs ...

... when you do things the "right" way, as I'm learning today.

I know that you have a "Performance Tips" section in the online docs, but it might be worth compiling some more of this kind of advice into it. Trawling through 9 years' worth of the EuForum is not so easy - the search is lightning quick, but wading through the output can be a slog.

It would really be a shame if a newcomer to Euphoria like myself were to try out the language without being aware of these things, erroneously conclude that Perl/Python etc. (pick your favorite interpreter) was faster, and go looking elsewhere for speedy code.

Best
Gordon