Re: print.e and Other Questions
- Posted by Derek Parnell <ddparnell at bigpond.com> Mar 04, 2002
5/03/2002 2:51:28 PM, "C. K. Lester" <cklester at yahoo.com> wrote:

>Where can I get print.e?

From the RDS Contributions page, I guess. It is a replacement for Euphoria's print() routine.

>How do I use Euman's hash table?

It gives you a fast way of knowing whether a given word is in the words.txt file or not. You use it by calculating the 'hash' value for the word you are checking, then scanning through the sequence referenced by the first letter of the word and its length, looking to see if the hash value is there. If so, then the word is in the dictionary; otherwise it is not.

Eg.

    theWord = upper(theWord)
    hv = EumsHash(theWord)
    l = theWord[1] - 'A' + 1    -- first letter: 'A' = 1 .. 'Z' = 26
    s = length(theWord)         -- word length
    inDict = 0
    for i = 1 to length(hash_table[l][s]) do
        if hash_table[l][s][i] = hv then
            -- Word is in dict.
            inDict = 1
            exit
        end if
    end for

>Or, rather, what structure is it?

It is a three-level sequence. The first level represents the letters of the alphabet. It is used to group the dictionary words by their initial letter. The second level, that is, the level within each initial letter, represents word length. This groups all words that start with the same letter by word length. The third level is just a list of hash values for the dictionary words.

>I'll be looking at it tonight,

Have fun.

>but maybe somebody can give me a general idea about hash tables.

The general idea behind hashing is to calculate a single value for an item, based on attributes of the item, and then use this value as a sort of index to speed up searches for the item. It is often used in compilers and other word-processing programs that have to keep track of individual words.

A common method is to add up the ASCII values of the letters, divide the total by the number of "bins" you have, and use the remainder to select a bin to put the word into. If you have a huge list of words, this is one way of effectively reducing the number of words you have to scan through to find the one you're after.

Example: Assume I have 5 bins, numbered 0 to 4.

    word         hashvalue   bin
    "CAT"            24       4
    "KANGAROO"       82       2
    "DOG"            26       1
    "ELEPHANT"       81       1
    "LEOPARD"        71       1

(The hash values here are sums of alphabet positions, A=1 to Z=26, rather than of raw ASCII codes.)

Thus "DOG", "ELEPHANT" and "LEOPARD" would all go in bin #1, "CAT" and "KANGAROO" would each have a bin to themselves, and bins #0 and #3 would stay empty. The hard part is getting a good enough hashing algorithm that spreads the words evenly over the bins.

---------
Cheers,
Derek Parnell
ICQ# 7647806
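
P.S. In case it helps, the bin scheme above could be sketched in Euphoria roughly like this. It is only an illustration: letter_hash() and bin_for() are names made up for this example (they are not part of Euman's code), it sums alphabet positions (A=1 to Z=26) as the table does rather than raw ASCII codes, and it assumes the words are already upper-case A to Z.

```euphoria
-- Sketch only: letter_hash() and bin_for() are hypothetical names,
-- not routines from Euman's hash table code.
constant NUM_BINS = 5

-- Sum the alphabet positions (A=1 .. Z=26) of an upper-case word.
function letter_hash(sequence word)
    integer total
    total = 0
    for i = 1 to length(word) do
        total = total + (word[i] - 'A' + 1)
    end for
    return total
end function

-- Bins are numbered 0 to NUM_BINS-1, so the remainder is the bin number.
function bin_for(sequence word)
    return remainder(letter_hash(word), NUM_BINS)
end function

sequence words
words = {"CAT", "KANGAROO", "DOG", "ELEPHANT", "LEOPARD"}

-- Print each word with its hash value and bin number.
for i = 1 to length(words) do
    printf(1, "%s %d %d\n",
           {words[i], letter_hash(words[i]), bin_for(words[i])})
end for
```

For "CAT" this works out to 3 + 1 + 20 = 24, and remainder(24, 5) is 4, so it lands in bin #4. Changing NUM_BINS or the way letter_hash() combines the letters is the easy way to experiment with how evenly the words spread out.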