Re: print.e and Other Questions
- Posted by euman at bellsouth.net Mar 04, 2002
Sorry 'bout that CK,

I only used print.e to look at the output in an external file, so it's safe
to comment this line out. Or you can download the include here:
http://www.rapideuphoria.com/print.zip

Derek more than answered the Q about hash routines. BIG D's the man!

Euman
euman at bellsouth.net

Q: Are we monetarily insane?
A: YES

----- Original Message -----
From: "Derek Parnell" <ddparnell at bigpond.com>
To: "EUforum" <EUforum at topica.com>
Sent: Tuesday, March 05, 2002 12:08 AM
Subject: Re: print.e and Other Questions

> 5/03/2002 2:51:28 PM, "C. K. Lester" <cklester at yahoo.com> wrote:
>
> >Where can I get print.e?
>
> From the RDS Contributions page, I guess. It is a replacement for Euphoria's
> print() routine.
>
> >How do I use Euman's hash table?
>
> It gives you a fast way of knowing if a given word is in the words.txt file
> or not.
>
> You use it by calculating the 'hash' value for the word you are checking on,
> then scanning through the sequence referenced by the first letter of the
> word and its length, looking to see if the hash value is there. If so, then
> the word is in the dictionary, otherwise it is not.
>
> Eg.
>
>   theWord = upper(theWord)
>   hv = EumsHash(theWord)      -- hash value of the upper-cased word
>   l = theWord[1] - 'A' + 1    -- first level: initial letter, 'A' = 1
>   s = length(theWord)         -- second level: word length
>   inDict = 0
>   for i = 1 to length(hash_table[l][s]) do
>       if hash_table[l][s][i] = hv then
>           -- Word is in dict.
>           inDict = 1
>           exit
>       end if
>   end for
>
> >Or, rather, what structure is it?
>
> It is a three-level sequence. The first level represents the letters of the
> alphabet. It is used to group the dictionary words by their initial letter.
> The second level, that's the level within each initial letter, represents
> word length. This groups all words that start with the same letter by word
> size. The third level is just a list of hash values for the dictionary
> words.
>
> >I'll be looking at it tonight,
>
> Have fun.
>
> >but maybe somebody can give me a general idea about hash tables.
>
> The general idea behind hashing is to calculate a single value for an item,
> based on attributes of the item. Then use this value as a sort of index to
> speed up searches for the item. It is often used in compilers and other
> word-processing programs that have to keep track of individual words.
>
> A common method is to add up the ASCII values of each letter, divide this by
> the number of "bins" you have and use the remainder to select a bin to put
> the word into. If you have a huge list of words, this is one method of
> effectively reducing the number of words you have to scan through to find
> the one you're after.
>
> Example: Assume I have 5 bins, numbered 0 to 4. To keep the numbers small,
> each letter counts as its position in the alphabet (A=1 ... Z=26) rather
> than its ASCII value.
>
>   word         hashvalue   bin
>   "CAT"            24       4
>   "KANGAROO"       82       2
>   "DOG"            26       1
>   "ELEPHANT"       81       1
>   "LEOPARD"        71       1
>
> Thus "DOG", "ELEPHANT" and "LEOPARD" would all go in bin #1, "CAT" and
> "KANGAROO" would each have a bin to themselves, and bins 0 and 3 would stay
> empty.
>
> The hard part is getting a good enough hashing algorithm that spreads the
> indexes evenly over the bins.
>
> ---------
> Cheers,
> Derek Parnell
> ICQ# 7647806
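
For anyone who wants to try the bin example from Derek's message, here is a minimal Euphoria sketch. The names letter_hash() and bin_for() are made up for illustration and are not part of print.e or Euman's hash table code. It assumes upper-case words and scores each letter by its position in the alphabet (A=1 ... Z=26) instead of its raw ASCII value, which is what the small hash values in the example table appear to use.

```euphoria
-- Sketch only: letter_hash() and bin_for() are hypothetical helpers,
-- not routines from print.e or from Euman's hash table code.

constant BINS = 5   -- bins are numbered 0 to BINS-1

-- Add up the letter positions (A=1 .. Z=26) of an upper-case word.
function letter_hash(sequence word)
    integer total
    total = 0
    for i = 1 to length(word) do
        total = total + (word[i] - 'A' + 1)
    end for
    return total
end function

-- The remainder after dividing by the number of bins selects the bin.
function bin_for(sequence word)
    return remainder(letter_hash(word), BINS)
end function

constant words = {"CAT", "KANGAROO", "DOG", "ELEPHANT", "LEOPARD"}

-- Print each word with its hash value and the bin it lands in.
for i = 1 to length(words) do
    printf(1, "%s %d bin %d\n",
           {words[i], letter_hash(words[i]), bin_for(words[i])})
end for
```

Changing BINS, or using the raw ASCII value (word[i]) in place of the letter position, is a quick way to see how the spread over the bins changes and to check a table like the one above by hand.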