Re: $100 Contest Question


----- Original Message -----
From: "Kat" <gertie at PELL.NET>
To: "EUforum" <EUforum at topica.com>
Subject: RE: $100 Contest Question


>
> On 3 Mar 2002, at 2:52, C. K. Lester wrote:
>
> >
> > Kat wrote:
> > >
> > > My new question: It's taking me 143 seconds to load a
> > > dictionary with the following code:
> >
> > Kat, I'm loading Junko's 50,000-word dictionary in less than a second.
> > What dictionary are YOU using?! :D
>
> One i threw together and premunged just for this. Unfortunately, it's making a
> 4.8Meg file out of the dictionaries, even after deleting all the trailing spaces,
> CR, LF, and duplicates. Using:
>
>   writefile = open(dfilename,"wb")
>   print(writefile,dictionary)
>   close(writefile)
>
> puts this stuff into the file:
>
>
> {{{50},{67},{71},{72},{75},{77},{78},{84},{88},{65},{73},{65},{66},{66},{67},{68},{68}
> , etc, which is just a listing of the letters of the alphabet, which is what i
> wanted at that point. But i didn't need it represented that way, i wanted it
> done in a way Eu could reload it instantly,, like,, umm, in
> D:\Euphoria\DEMO\MYDATA.EX. As it is saved tho, it is taking up 5x as
> much space and load time as needed. Any ideas?

That is exactly why print() and get() should rarely be used. They are
extremely inefficient: print() writes every character out as decimal text
wrapped in braces and commas, and get() then has to parse all of that back
in. In my version of reformatting Junko's WORDS.TXT, I went from the
original 508,190 bytes to 531,828 bytes. In speed differences, it takes me
about 6 seconds to use WORDS.TXT to build the internal dictionary, and
about 0.5 seconds to build it using the reformatted words.txt.

For speed, use binary reads and writes, and just use getc() and puts().
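
To give a rough idea of what I mean, here is a minimal sketch. It is not the
code from my entry. It assumes the dictionary is a flat sequence of words and
that each word goes out as raw characters with a single '\n' after it. The
names save_words(), load_words() and "words.dat" are made up for the example.

```euphoria
-- Sketch only: save_words(), load_words() and "words.dat" are hypothetical.
-- Assumed format: raw characters, one '\n' after each word, nothing else.

procedure save_words(sequence fname, sequence words)
    integer fn
    fn = open(fname, "wb")          -- binary mode, no CR/LF translation
    if fn = -1 then
        puts(2, "cannot open " & fname & " for writing\n")
        abort(1)
    end if
    for i = 1 to length(words) do
        puts(fn, words[i] & '\n')   -- one byte per character, no braces or commas
    end for
    close(fn)
end procedure

function load_words(sequence fname)
    integer fn, c
    sequence words, word
    fn = open(fname, "rb")
    if fn = -1 then
        return {}                   -- treat a missing file as an empty dictionary
    end if
    words = {}
    word = ""
    c = getc(fn)
    while c != -1 do                -- getc() returns -1 at end of file
        if c = '\n' then
            words = append(words, word)
            word = ""
        else
            word = word & c
        end if
        c = getc(fn)
    end while
    close(fn)
    return words
end function

save_words("words.dat", {"CAT", "DOG", "EMU"})
? length(load_words("words.dat"))
```

Written this way, each character costs one byte on disk, where print() spends
several bytes per character on digits, braces and commas. If your dictionary
is nested more deeply than a list of words, you would need your own delimiter
scheme for the extra levels.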

I'm not sure if I'm allowed to give any more details of the algorithms I
used etc..., but I did strip out unnecessary delimiters between words.

----------
Derek
