Re: Constructing sequences
- Posted by Bob Thompson <rthompson at rthompson.karoo.co.uk> Jun 30, 2005
Maybe I’m digressing from the point of the original post but, for me, it raises a question: what are the general rules for fast loading and saving of very complex or huge structures, and how many levels of sub-sequence can be addressed efficiently?

Here’s an example that pre-allocates the structure (structure only) and appends to a subscripted variable. Loading from file this way is only about twice as slow as Robert’s optimised routine, which loads into a simple buffer sequence. (It’s part of a simple spell checker I use.) For speed, it appears to be important to pre-define the structure before reading, and to break the structure down into simple units before saving.

include misc.e  -- sleep()

atom handle

-- create pseudo dictionary file - legible version of Robert's code
handle = open("bigfile.txt", "w")
for i = 1 to 100000 do
    -- one random word of 20 to 30 letters per line
    puts(handle, 'a' + rand(repeat(25, 19 + rand(11))))
    puts(handle, '\n')
end for
close(handle)

-- key for pseudo dictionary hash - code intended for proper lower case words
-- ('a' is 97, so subtracting 96 maps 'a'..'z' to 1..26; 26 pads short words)
function dic_hash_key(sequence key_word)
    integer i, j, k
    i = key_word[1] - 96
    if length(key_word) > 2 then -- most tested at this stage
        j = key_word[2] - 96
        k = key_word[3] - 96
        return {i, j, k}
    elsif length(key_word) > 1 then -- very few tested at or after this stage
        j = key_word[2] - 96
        return {i, j, 26}
    else
        return {i, 26, 26}
    end if
end function

-- read
function read_list_to_hash(sequence file)
    object filed_line
    sequence hash, key
    handle = open(file, "r")
    -- pre-allocate the 26 x 26 x 26 structure of empty buckets
    hash = repeat(repeat(repeat({}, 26), 26), 26)
    while 1 do
        filed_line = gets(handle)
        if sequence(filed_line) then
            -- strip the trailing newline
            filed_line = filed_line[1..length(filed_line) - 1]
            key = dic_hash_key(filed_line)
            hash[key[1]][key[2]][key[3]] = append(hash[key[1]][key[2]][key[3]], filed_line)
        else
            exit -- gets() returns -1 at end of file
        end if
    end while
    close(handle)
    return hash
end function

-- save
procedure save_hash_to_list(sequence file, sequence hash)
    -- open relative to the current directory, as the reader does;
    -- current_dir() & file would omit the path separator
    handle = open(file, "w")
    for h = 1 to length(hash) do
        for i = 1 to length(hash[h]) do
            for j = 1 to length(hash[h][i]) do
                for k = 1 to length(hash[h][i][j]) do
                    puts(handle, hash[h][i][j][k] & '\n')
                end for
            end for
        end for
    end for
    close(handle)
end procedure

sequence dictionary_hash
atom t

-- load dictionary
t = time()
dictionary_hash = read_list_to_hash("bigfile.txt")
? time() - t

-- save dictionary
t = time()
save_hash_to_list("bigfile.txt", dictionary_hash)
? time() - t

sleep(3)
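
Since the structure is meant for a spell checker, here is a minimal sketch of how a lookup against it might work. The helper in_dictionary() and the names table, demo and demo_key are hypothetical and not part of the code above; dic_hash_key() is repeated so the example runs on its own. It assumes words contain only lower-case a-z.

```euphoria
-- Key function as in the post: maps the first three letters of a
-- lower-case word to indices 1..26 (26 pads words shorter than 3).
function dic_hash_key(sequence key_word)
    integer i, j, k
    i = key_word[1] - 96
    if length(key_word) > 2 then
        j = key_word[2] - 96
        k = key_word[3] - 96
        return {i, j, k}
    elsif length(key_word) > 1 then
        j = key_word[2] - 96
        return {i, j, 26}
    else
        return {i, 26, 26}
    end if
end function

-- Hypothetical helper: returns 1 if word is stored in table, else 0.
-- Only the one bucket selected by the key is searched.
function in_dictionary(sequence table, sequence word)
    sequence key
    if length(word) = 0 then
        return 0
    end if
    key = dic_hash_key(word)
    return find(word, table[key[1]][key[2]][key[3]]) != 0
end function

sequence demo, demo_key

-- Build an empty 26 x 26 x 26 table and store one word in it.
demo = repeat(repeat(repeat({}, 26), 26), 26)
demo_key = dic_hash_key("cat")
demo[demo_key[1]][demo_key[2]][demo_key[3]] =
    append(demo[demo_key[1]][demo_key[2]][demo_key[3]], "cat")

? in_dictionary(demo, "cat")  -- prints 1
? in_dictionary(demo, "cab")  -- prints 0
```

The point of the three-level key is that find() scans only the words sharing the same first three letters, rather than the whole list.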