Friday, December 3, 2010

Hard Drive speed matters most

So, you are building a workstation for fMRI analysis, huh? Good for you, and exciting times. I would recommend not getting blinded by too many fancy specs and instead focusing on the hard drive when purchasing new equipment:

Without a doubt, the #1 thing to pay a bit more money for is your hard drive, not the processor or RAM or anything else. fMRI analysis is brutal on hard drives, because you will often have 20,000 small (< 5 MB) files that must be read and written at every processing step until second-level (aka group-level) analysis. I have just compared running the same analysis on two drives: A) a 2 TB Western Digital 'Green' drive, which is designed for storage capacity and energy efficiency rather than quick access (it is 5400 RPM rather than the 7200 RPM that is standard for most hard drives these days); and B) a Seagate 1 TB 7200.12, which is a standard hard drive, comparable I suppose to a Western Digital Blue or a Samsung F3.

The comparison was skewed to help the Western Digital Green: it was in a computer with an i5 760 quad-core processor, a solid-state drive holding 64-bit Windows, and 4 GB of RAM. The Seagate hard drive was in an older computer with a dual-core E5300, 32-bit Windows, and 2 GB of RAM. Both computers were running Windows 7.

Both computers ran an SPM analysis of fMRI data on 88 runs of 12 minutes each (a huge dataset, with 407 image files per run). One might think the newer computer would be a slam dunk to pull ahead, but because the first-level analysis requires loading each of the files one at a time for inclusion in the data matrix, the opposite was true:

Time to set up the design matrix for 1 participant on the old computer with faster hard drive: 1 min, 10 sec
Time to set up the design matrix for 1 participant on the new computer with slower hard drive: 28 minutes

Time to estimate 1 participant on the old computer with faster hard drive: 19 minutes
Time to estimate 1 participant on the new computer with slower hard drive: 18 minutes
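
To put the four timings above side by side, here is the arithmetic as a quick Python sketch. The numbers come straight from the list above; the helper and variable names are just illustrative.

```python
def to_seconds(minutes, seconds=0):
    """Convert a minutes/seconds timing to total seconds."""
    return minutes * 60 + seconds

# Timings reported above, per participant.
setup_fast_drive = to_seconds(1, 10)   # old computer, 7200 RPM drive
setup_slow_drive = to_seconds(28)      # new computer, 5400 RPM drive
estimate_fast_drive = to_seconds(19)   # old computer
estimate_slow_drive = to_seconds(18)   # new computer

# How many times longer the design matrix setup took on the slower drive.
setup_ratio = setup_slow_drive / setup_fast_drive

# Setup plus estimation for one participant on each machine.
total_fast_drive = setup_fast_drive + estimate_fast_drive
total_slow_drive = setup_slow_drive + estimate_slow_drive
total_ratio = total_slow_drive / total_fast_drive

print(f"Setup took {setup_ratio:.0f}x longer on the slower drive")
print(f"Total per participant: {total_fast_drive / 60:.1f} min vs "
      f"{total_slow_drive / 60:.1f} min ({total_ratio:.1f}x)")
```

So the setup step alone took 24 times longer on the Green drive, and setup plus estimation for one participant took roughly 46 minutes instead of roughly 20.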

In pure number crunching (estimation), the two computers are about the same, even though the newer one has the faster processor and more RAM. However, for a large dataset, the difference in setting up the design matrices (which requires reading each file in turn) is staggering. Small-file random seek time is a major rate-limiting step in fMRI analysis, so before you go out and buy a new computer, consider the cheaper upgrade of a new hard drive first!
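
If you want a rough feel for how a drive handles lots of small files, a minimal Python sketch along these lines might help. It is not what SPM does internally; the function name, file names, and default sizes are made up for illustration. Note also that freshly written files will usually still be in the operating system's cache, so this will flatter any drive; for a fair comparison you would want to time reads of real data after a reboot.

```python
import os
import tempfile
import time

def time_small_file_reads(directory, n_files=200, size_bytes=1_000_000):
    """Write n_files dummy files into directory, then time reading
    them back one at a time. Returns (total_bytes_read, seconds)."""
    payload = os.urandom(size_bytes)
    paths = []
    for i in range(n_files):
        # Hypothetical file naming, loosely mimicking one file per volume.
        path = os.path.join(directory, f"vol_{i:05d}.bin")
        with open(path, "wb") as f:
            f.write(payload)
        paths.append(path)

    start = time.perf_counter()
    total_bytes = 0
    for path in paths:
        with open(path, "rb") as f:
            total_bytes += len(f.read())
    elapsed = time.perf_counter() - start

    # Clean up the dummy files.
    for path in paths:
        os.remove(path)
    return total_bytes, elapsed

if __name__ == "__main__":
    # Uses the system temp folder; pass a folder on the drive you want to test.
    with tempfile.TemporaryDirectory() as folder:
        total_bytes, elapsed = time_small_file_reads(folder)
        print(f"Read {total_bytes / 1e6:.0f} MB from 200 files in {elapsed:.2f} s")
```

To compare two drives, point `directory` at a folder on each one and run it with the same settings.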
