1. map vs append
- Posted by euphoric (admin) Apr 18, 2012
- 1225 views
I just have to report, if you aren't using maps, you're probably missing out on a ton of speed.
I have a program that loads the contents of over 4600 files and creates a cache file of their contents. Before, this was taking upwards of 30 seconds. Not a big deal in the whole scheme of things, but idle fingers cost money.
After, now using maps instead of sequences, it takes less than 2 seconds to complete.
Maps FTW!
2. Re: map vs append
- Posted by Lnettnay Apr 19, 2012
- 1093 views
Could you perhaps show a small example?
3. Re: map vs append
- Posted by euphoric (admin) Apr 19, 2012
- 1081 views
Below, I've shown the actual code that got changed. I stripped most of the guts of the code to show just the changed lines, but kept a few other lines for some context.
sequence -- initially, the 'map' was kept in the standard two-element sequence -- where an element in sequence 1 was related to the same element in sequence 2... the poor man's map-- logs = {{},{}} -- now I keep it in a map map mLogs = map:new() -- the following loop would run over 4600 times (currently); this number grows on a daily basis -- it uses file data in the years[] sequence for t=1 to yCount do -- i was growing the original sequence with append(), as below -- logs[1] = append(logs[1],years[t][D_NAME][1..10]) -- logs[2] = append(logs[2],match( ".GOT", upper( years[t][D_NAME] ) ) > 0 ) -- now i'm using a map map:put( mLogs, years[t][D_NAME][1..10], match( ".GOT", upper( years[t][D_NAME] ) ) > 0 ) end for for t=1 to length(files) do for x=1 to length(TEMP) do text = split( TEMP[x], "\t" ) if length(text) > 2 then -- now, instead of find()ing... -- i = find( text[CODE], logs[1] ) -- I use map:has() i = map:has( mLogs, text[CODE] ) if i > 0 then -- and instead of a sequence query -- if logs[2][i] then -- I use a map:get() if map:get( mLogs, text[CODE] ) then end if end if end if end for end for
There's no way to parse what I'm doing with the data from this snipped example, but these few changes led to a significant speed-up: from 15-20 seconds to less than 2 seconds.
The speed-up, no doubt, comes from the building of the 4,600+-element sequence/map using map:put(), not the query/get changes.
4. Re: map vs append
- Posted by DerekParnell (admin) Apr 19, 2012
- 1070 views
I just have to report, if you aren't using maps, you're probably missing out on a ton of speed.
I have a program that loads the contents of over 4600 files and creates a cache file of their contents. Before, this was taking upwards of 30 seconds. Not a big deal in the whole scheme of things, but idle fingers cost money.
After, now using maps instead of sequences, it takes less than 2 seconds to complete.
I suspect the main speed-up has been the change from using a single sequence to hold the file cache to using multiple sequences.
When you append to a sequence, Euphoria allocates enough space to hold all the existing items in the sequence plus the new item(s), then copies all the existing items and new items to the new space, then removes the old sequence. As a sequence grows, this can have quite a large overhead. In general, avoid appending to large sequences.
A map's internal structure has many sequences, nearly all of them are very sort ones, so appending to these is usually a lot faster.
BTW, if notice you use this type of construct ...
if map:has( themap, key ) then data = map:get( themap, key )
It might be faster if you use ...
data = map:get( themap, key, badvalue ) if not equal(data, badvalue) then
5. Re: map vs append
- Posted by euphoric (admin) Apr 19, 2012
- 1080 views
When you append to a sequence...
Ah! Helpful insight. Thanks, Derek.