Re: Optimizing basic operations

James,

The talk has mostly been about the speed of processing the data. 
 
First of all you need to consider the upload time. Given 10^12 records, 
each with an evaluation integer and 50 numbers, you have a very large 
amount to load, say 200 TB. 
 
According to Tom's Hardware, the fastest off-the-shelf, affordable 
SSDs have data transfer rates of approximately 0.5 GB per second. That 
is to say you would be looking at an upload time of 400,000 seconds or so. 
 
That is roughly 111 hours, or about 4.6 days.  
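
As a rough check of the upload figures, here is a minimal Python 
sketch. The 4 bytes per number is an assumption (the thread does not 
state a width), and units are taken as decimal (1 TB = 10^12 bytes, 
1 GB = 10^9 bytes). The function names are just illustrative.

```python
# Back-of-envelope estimate of data volume and upload time.
RECORDS = 10**12
NUMBERS_PER_RECORD = 51   # 1 evaluation integer + 50 numbers
BYTES_PER_NUMBER = 4      # assumption: 32-bit values


def total_bytes(records=RECORDS, numbers=NUMBERS_PER_RECORD,
                width=BYTES_PER_NUMBER):
    """Raw size of the data set in bytes."""
    return records * numbers * width


def transfer_seconds(n_bytes, bytes_per_second):
    """Time to move n_bytes at a sustained transfer rate."""
    return n_bytes / bytes_per_second


size = total_bytes()                      # about 204 TB under the assumption above
rounded_size = 200 * 10**12               # the round figure used in the post
ssd_rate = 5 * 10**8                      # 0.5 GB per second
seconds = transfer_seconds(rounded_size, ssd_rate)

print(f"data size   : {size / 10**12:.0f} TB")
print(f"upload time : {seconds:,.0f} s")
print(f"            = {seconds / 3600:.1f} hours")
print(f"            = {seconds / 86400:.1f} days")
```

With 8-byte numbers instead, the size and the upload time would both 
double, so the width of each number matters a great deal here.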
 
The processing of the records (assuming random data) could be expected 
to take about 40,000 seconds (based on Derek's numbers). That is to 
say, about 11 hours.  
 
Multiplying the number of machines used is much more effective than  
using a faster processor. E.g. 10 machines would get the load time down  
to 11 hours and the processing down to 1.1 hours.  
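
The scaling can be sketched the same way. This assumes the data is 
split evenly, each machine loads from its own SSD, and there is no 
overhead for distributing the data or combining results (none of 
which is guaranteed in practice). The function name is illustrative.

```python
# Ideal (linear) scaling of load and processing time across machines.
LOAD_SECONDS = 400_000      # single-machine upload estimate from the post
PROCESS_SECONDS = 40_000    # single-machine processing estimate from the post


def per_machine_hours(total_seconds, machines):
    """Wall-clock hours if the work divides evenly over the machines."""
    if machines < 1:
        raise ValueError("need at least one machine")
    return total_seconds / machines / 3600


for machines in (1, 10):
    load = per_machine_hours(LOAD_SECONDS, machines)
    proc = per_machine_hours(PROCESS_SECONDS, machines)
    print(f"{machines:>2} machine(s): load {load:6.1f} h, process {proc:5.1f} h")
```

Either way the load time is about ten times the processing time, so 
a faster processor alone only shortens the smaller part of the job.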
 
Cutting down the amount of data would also help (but, of course, you  
still need to keep the identifying data together with the evaluation). 
