1. Hashing
- Posted by Arthur Adamson <euclid at ISOC.NET> Jan 03, 1998
- 534 views
- Last edited Jan 04, 1998
At 09:47 PM 1/3/98 -0500, you wrote: >Ralf writes: >> Why isn't there a hash-table in Euphoria, I haven't quite been >> thinking about it, (if it would be cleaner to have such a table or not), >> but why did you decide not to... (I'm not in favor of this, not am >> against it, I'll give it a bit more thought..) > >What is a hash table? Here's a quick introduction. > >Suppose you have a sequence s of names and you want >to know if "Ralf" is one of the names. In Euphoria you can >use find("Ralf", s) which will start with s[1] and compare strings >until "Ralf" is found, or the end of the sequence is reached. > >If your list consists of 20 or fewer strings this is probably the fastest >way to search. If your list has 1000 strings it may >take a long time. If you have hundreds or thousands of names >that you want to look up, you should consider hashing. > >The idea behind "hashing" is to organize the list of 1000 as >many short lists, and then search one >of the short lists. Suppose that you took the first character >of the name (converted to upper case) and used that as >an index into a sequence of 26 sequences. All the >names beginning with 'A' would be on the first sequence, all the >names beginning with 'B' would be on the second sequence, etc. >With a nice distribution of names you might then have >26 sequences with about 1000/26 = 38 names per sequence. >After indexing to select the correct sequence, you only >have 38 names to search, not 1000. These techniques are similar to the ones I used in the faster float and string sort progs I posted on the Euphoria home site. I didn't do it very elegantly but the improvement was large. Bye, Art Arthur P. Adamson, The Engine Man, euclid at isoc.net