Re: $100 Contest Question
- Posted by Derek Parnell <ddparnell at bigpond.com> Mar 02, 2002
----- Original Message -----
From: "Kat" <gertie at PELL.NET>
To: "EUforum" <EUforum at topica.com>
Subject: RE: $100 Contest Question

> On 3 Mar 2002, at 2:52, C. K. Lester wrote:
>
> > Kat wrote:
> > >
> > > My new question: It's taking me 143 seconds to load a
> > > dictionary with the following code:
> >
> > Kat, I'm loading Junko's 50,000-word dictionary in less than a second.
> > What dictionary are YOU using?! :D
>
> One i threw together and premunged just for this. Unfortunately, it's making a
> 4.8Meg file out of the dictionaries, even after deleting all the trailing spaces,
> CR, LF, and duplicates. Using:
>
> writefile = open(dfilename,"wb")
> print(writefile,dictionary)
> close(writefile)
>
> puts this stuff into the file:
>
> {{{50},{67},{71},{72},{75},{77},{78},{84},{88},{65},{73},{65},{66},{66},{67},{68},{68}
> , etc, which is just a listing of the letters of the alphabet, which is what i
> wanted at that point. But i didn't need it represented that way, i wanted it
> done in a way Eu could reload it instantly,, like,, umm, in
> D:\Euphoria\DEMO\MYDATA.EX. As it is saved tho, it is taking up 5x as
> much space and load time as needed. Any ideas?

That is exactly why print() and get() should rarely be used. They are
extremely inefficient: print() writes every character as a decimal number
wrapped in braces and commas, and get() has to parse all of that text back
in.

In my version of reformatting Junko's WORDS.TXT, I went from the original
508,190 bytes to 531,828 bytes. As for speed, it takes me about 6 seconds to
build the internal dictionary from WORDS.TXT, and about 0.5 seconds to build
it from the reformatted words.txt.

For speed, use binary reads and writes, and just use getc() and puts().

I'm not sure if I'm allowed to give any more details of the algorithms I
used, but I did strip out unnecessary delimiters between words.

----------
Derek
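
To make the "binary reads and writes with getc() and puts()" advice concrete, here is a minimal sketch. It is not the format used for the contest entry, which the post does not disclose. The file layout (one length byte followed by the word's letters, so no delimiters are needed), the routine names save_words and load_words, and the file name words.dat are all assumptions made up for illustration.

```euphoria
-- Hypothetical sketch: save and reload a word list with raw byte I/O.
-- Assumed layout (not the contest entry's format): each word is stored as
-- one length byte followed by its letters, so no delimiters are needed.

procedure save_words(sequence fname, sequence words)
    integer fn
    fn = open(fname, "wb")
    if fn = -1 then
        puts(2, "Cannot open " & fname & " for writing\n")
        abort(1)
    end if
    for i = 1 to length(words) do
        puts(fn, length(words[i]))  -- an atom is written as a single byte (length must be < 256)
        puts(fn, words[i])          -- a sequence is written as a run of bytes
    end for
    close(fn)
end procedure

function load_words(sequence fname)
    integer fn, len
    sequence words, word
    fn = open(fname, "rb")
    if fn = -1 then
        return {}                   -- file missing: return an empty list
    end if
    words = {}
    while 1 do
        len = getc(fn)              -- length byte, or -1 at end of file
        if len = -1 then
            exit
        end if
        word = repeat(0, len)
        for i = 1 to len do
            word[i] = getc(fn)      -- read the letters one byte at a time
        end for
        words = append(words, word)
    end while
    close(fn)
    return words
end function

-- Usage: write three words, read them back, and show how many came back.
sequence dict
save_words("words.dat", {"cat", "dog", "euphoria"})
dict = load_words("words.dat")
? length(dict)
```

With this layout each word costs its own length plus one byte on disk, whereas the print() output quoted above spends several characters of braces, digits, and commas on every letter.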