1. Binary searching in spell checker
- Posted by Robert B Pilkington <bpilkington at JUNO.COM> Apr 28, 1998
- 545 views
Well, I got a chance to try the binary searching in my spell checker, and although it is nice to not have to wait for the dictionary to load, waiting for about half as long for the replacement list to generate for a misspelled word defeats the purpose... :/ Let's see, what were those calculations I made earlier.... Something like 500 checks to generate a replacement list.. With about an average of about 15 checks per word to see if it exists. (Most of them won't exist, only about 8 will be found, and the word will definitely be there in 18 checks, or it doesn't exist.) This would be about 7500 checks... Ouch. That's more than 1/10th of the dictionary. 9 misspelled words and the whole dictionary could be loaded. (BTW, this is all guesswork, but reasonable enough.) I guess I'll keep trying. I'm trying to completely understand the checking stuff but the weird, short variable names aren't helping.. :/ Is there any way to get really FAST reads from a WORDS.TXT style dictionary? (Faster than gets() with SMARTDRV or other caching software enabled.) Any other ideas that could work? Hybrid ideas? (I tried several things to try to get the dictionary filesize smaller, but decoding it all took longer than reading the larger dictionary...) Or should I just give up and stick to making Vector better so it can have a 'final' release? (Well, nevermind that last one.. I'm sure I'll get a couple responses telling me not to worry about the spell checker anymore.... :) _____________________________________________________________________ You don't need to buy Internet access to use free Internet e-mail. Get completely free e-mail from Juno at http://www.juno.com Or call Juno at (800) 654-JUNO [654-5866]
2. Re: Binary searching in spell checker
- Posted by Ralf Nieuwenhuijsen <nieuwen at XS4ALL.NL> Apr 29, 1998
- 513 views
>Is there any way to get really FAST reads from a WORDS.TXT style >dictionary? (Faster than gets() with SMARTDRV or other caching software >enabled.) Use Jaques Deschenes int21.e or something. There he uses an DOS interrupt call to load a file into a memory block. The overhead of converting everything from bytes to 4-byte-machine integers will be lost, and it will be more memory efficient. You could specify the precize amount of data you want to load, not more, not less, at this point Euphoria caches for ya, even if you only want to lookup 1 character. Then it writes everything it has cached back out. And you may want to use some advanced form of hashing. Like having buckets in buckets in buckets.. a set of buckets for every 4 characters, so a word of length 12, can be calculated in once, althrough you need to lookup the 'location' of the next bucket 3 times. Hashing with buckets is a good compromize between a normal up-t'ill-down search and an over-exsessive binary search. Ralf Nieuwenhuijsen nieuwen at xs4all.nl
3. Re: Binary searching in spell checker
- Posted by Daniel Berstein <daber at PAIR.COM> May 01, 1998
- 545 views
> Use Jaques Deschenes int21.e or something. > There he uses an DOS interrupt call to load a file into a memory block. WARNING: I'm not sure about this, but be careful when using DOS call under Windows 95 OSR2. If you have FAT32 you may get weird results. Under Win32 it's safer to use WinAPI. Regards, Daniel Berstein.