Re: 20gigabyte string searching

On Sat,  2 Oct 2004 22:26:26 +0000, Mike <vulcan at win.co.nz> wrote:

>Depending on the hash table quality there might be, say, 2 or 3 
>disk movements per URL search.
>
>Mike
>
>PS: This idea seems similar to the one Pete posted but the index is 
>managed in RAM so performance should be much better (at the expense of 
>huge amounts of RAM).

I thought about that. With a hash table of a million entries on disk
and 15 million URLs, the average bucket size will be 15 if the hash
function does its job. Either way, 15 or 2 local disk accesses will
probably be faster than reading a 100-byte URL from the interweb.
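
For anyone who wants to check the arithmetic, here is a rough sketch in Python (not Euphoria, and not code from this thread). The function names, the example.com URLs, and the use of CRC-32 as the hash are all my own assumptions, standing in for "a hash function that does its job". The scaled-down run uses 15,000 made-up URLs in 1,000 buckets so that it finishes quickly.

```python
import zlib


def average_bucket_size(num_keys: int, num_buckets: int) -> float:
    """Average number of keys per bucket, assuming an even spread."""
    return num_keys / num_buckets


def bucket_of(url: str, num_buckets: int) -> int:
    """Map a URL to a bucket index. CRC-32 is only a stand-in hash here."""
    return zlib.crc32(url.encode("utf-8")) % num_buckets


def bucket_sizes(urls, num_buckets: int) -> list:
    """Count how many URLs land in each bucket."""
    sizes = [0] * num_buckets
    for url in urls:
        sizes[bucket_of(url, num_buckets)] += 1
    return sizes


if __name__ == "__main__":
    # The figures from the post: 15 million URLs, 1 million buckets.
    print(average_bucket_size(15_000_000, 1_000_000))  # 15.0

    # Scaled-down simulation with hypothetical URLs, same ratio of 15 per bucket.
    urls = [f"http://example.com/page/{i}" for i in range(15_000)]
    sizes = bucket_sizes(urls, 1_000)
    print("average:", sum(sizes) / len(sizes))
    print("smallest bucket:", min(sizes), "largest bucket:", max(sizes))
```

The average is fixed by the ratio of URLs to buckets. What the hash function controls is how far the largest bucket strays from that average, which is what decides the worst-case number of disk accesses.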

Pete
