1. map vs append

I just have to report, if you aren't using maps, you're probably missing out on a ton of speed.

I have a program that loads the contents of over 4600 files and creates a cache file of their contents. Before, this was taking upwards of 30 seconds. Not a big deal in the whole scheme of things, but idle fingers cost money.

After, now using maps instead of sequences, it takes less than 2 seconds to complete.

Maps FTW!

new topic     » topic index » view message » categorize

2. Re: map vs append

Could you perhaps show a small example?

new topic     » goto parent     » topic index » view message » categorize

3. Re: map vs append

Below, I've shown the actual code that got changed. I stripped most of the guts of the code to show just the changed lines, but kept a few other lines for some context.

sequence 
-- initially, the 'map' was kept in the standard two-element sequence 
-- where an element in sequence 1 was related to the same element in sequence 2... the poor man's map smile 
--	logs = {{},{}}   
 
-- now I keep it in a map 
map mLogs = map:new() 
 
-- the following loop would run over 4600 times (currently); this number grows on a daily basis 
-- it uses file data in the years[] sequence 
for t=1 to yCount do 
-- i was growing the original sequence with append(), as below 
-- 		logs[1] = append(logs[1],years[t][D_NAME][1..10]) 
-- 		logs[2] = append(logs[2],match( ".GOT", upper( years[t][D_NAME] ) ) > 0 ) 
 
-- now i'm using a map 
	map:put( mLogs, years[t][D_NAME][1..10], match( ".GOT", upper( years[t][D_NAME] ) ) > 0 ) 
end for 
 
for t=1 to length(files) do 
	for x=1 to length(TEMP) do 
		text = split( TEMP[x], "\t" ) 
		if length(text) > 2 then 
 
-- now, instead of find()ing... 
-- 			i = find( text[CODE], logs[1] ) 
-- I use map:has() 
			i = map:has( mLogs, text[CODE] ) 
			 
			if i > 0 then 
-- and instead of a sequence query 
-- 				if logs[2][i] then 
-- I use a map:get() 
				if map:get( mLogs, text[CODE] ) then 
				end if 
			end if 
		end if 
	end for 
end for 

There's no way to parse what I'm doing with the data from this snipped example, but these few changes led to a significant speed-up: from 15-20 seconds to less than 2 seconds.

The speed-up, no doubt, comes from the building of the 4,600+-element sequence/map using map:put(), not the query/get changes.

new topic     » goto parent     » topic index » view message » categorize

4. Re: map vs append

euphoric said...

I just have to report, if you aren't using maps, you're probably missing out on a ton of speed.

I have a program that loads the contents of over 4600 files and creates a cache file of their contents. Before, this was taking upwards of 30 seconds. Not a big deal in the whole scheme of things, but idle fingers cost money.

After, now using maps instead of sequences, it takes less than 2 seconds to complete.

I suspect the main speed-up has been the change from using a single sequence to hold the file cache to using multiple sequences.

When you append to a sequence, Euphoria allocates enough space to hold all the existing items in the sequence plus the new item(s), then copies all the existing items and new items to the new space, then removes the old sequence. As a sequence grows, this can have quite a large overhead. In general, avoid appending to large sequences.

A map's internal structure has many sequences, nearly all of them are very sort ones, so appending to these is usually a lot faster.

BTW, if notice you use this type of construct ...

   if map:has( themap, key ) then 
       data = map:get( themap, key )  

It might be faster if you use ...

   data = map:get( themap, key, badvalue )  
   if not equal(data, badvalue) then  
new topic     » goto parent     » topic index » view message » categorize

5. Re: map vs append

DerekParnell said...

When you append to a sequence...

Ah! Helpful insight. Thanks, Derek.

new topic     » goto parent     » topic index » view message » categorize

Search



Quick Links

User menu

Not signed in.

Misc Menu