The cost of supercomputing

November 12, 2010
rants hpc sysadmin work

So this upcoming week is the big annual supercomputing convention, SC10, down in New Orleans. Since I’m skipping out (anxiously awaiting the arrival of Little Miss Sunshine), I’ve got time to actually try to read through the slew of new product announcements and news coverage. Today I saw this quote on Twitter from hpc_guru and just had to share:

“Cost of the building next generation of supercomputers is not the problem. The cost of running the machines is what concerns engineers.”

That’s one thing that sometimes frustrates me about working with academics who want their own HPC system. For example, you may be looking at an annual facilities cost that’s, say, 10-20% of the original purchase cost of the system. It’s usually a whole lot easier to get funding from the feds or elsewhere for a one-time big purchase than it is to get them to provide an annual budget for operational expenses. I’ve certainly heard horror stories of folks who got a grant to buy a cluster and only talked to the university computing folks after it arrived, at which point they found out there weren’t enough data center resources (floor space, power, cooling) available to unbox the thing and turn it on. Then you end up in situations where they unbox a small handful of rackmount compute nodes and stuff one under each grad student’s desk in order to get something out of it. Not quite the cluster they were hoping for, but that’s arguably better than going back to the grant agency after a few years to tell them you haven’t published anything with the system you bought because you didn’t think to make sure there was a place to put it before you pursued the grant.
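To put rough numbers on that 10-20% figure, here is a minimal Python sketch. The $1,000,000 purchase price and the five-year service life are made-up values for illustration only, and `lifetime_cost` is just a helper name I picked. It also assumes the facilities cost stays flat from year to year, which is a simplification.

```python
def lifetime_cost(purchase_cost, annual_facilities_fraction, years):
    """Purchase price plus flat annual facilities cost over the system's life.

    annual_facilities_fraction is the yearly facilities cost expressed as a
    fraction of the original purchase cost (0.10 for 10%).
    """
    annual_cost = purchase_cost * annual_facilities_fraction
    return purchase_cost + annual_cost * years


if __name__ == "__main__":
    purchase = 1_000_000  # hypothetical cluster price in dollars
    years = 5             # hypothetical service life

    # Low and high ends of the 10-20% range mentioned above.
    for fraction in (0.10, 0.20):
        total = lifetime_cost(purchase, fraction, years)
        print(f"{fraction:.0%} per year over {years} years: ${total:,.0f} total")
```

Under those assumptions, facilities alone add somewhere between half and all of the purchase price again over five years, and that is money the one-time grant never covered.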

A more frequent pet peeve of mine is end users who don’t understand why HPC storage doesn’t cost the same per TB as what they can get from Best Buy. “But I just saw in their ad last week that I can get a 2TB drive for $100. You should give me way more storage on your HPC system than you do because it’s so cheap.” Right. Who cares about performance or scalability or reliability or bit error rate (BER)? Certainly not them, until they start complaining that the system is slow or demanding to know why their data went poof.
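For a feel of why BER matters, here is a rough Python sketch of the odds of hitting at least one unrecoverable read error while reading a 2TB drive end to end. The error rates are assumptions on my part: 1 bit in 10^14 is a figure often quoted on consumer drive datasheets, and 1 in 10^15 is often quoted for enterprise drives, so check the spec sheet for any real drive. The sketch also treats bit errors as independent, which real drives do not guarantee, and `prob_read_error` is just an illustrative helper name.

```python
import math


def prob_read_error(capacity_bytes, bit_error_rate):
    """Probability of at least one unrecoverable bit error in one full read.

    Assumes every bit fails independently with probability bit_error_rate.
    Computes 1 - (1 - ber)^bits using log1p/expm1 to stay accurate for
    very small error rates.
    """
    bits = capacity_bytes * 8
    return -math.expm1(bits * math.log1p(-bit_error_rate))


if __name__ == "__main__":
    capacity = 2 * 10**12  # the 2TB drive from the ad, in decimal bytes

    # Assumed datasheet-style error rates, not measured values.
    for label, ber in (("1 in 10^14", 1e-14), ("1 in 10^15", 1e-15)):
        p = prob_read_error(capacity, ber)
        print(f"BER {label}: {p:.1%} chance of an error per full read")
```

With the assumed consumer-grade rate, that is roughly a one-in-seven chance of a bad read every time the whole drive is read once. Multiply that across hundreds of drives and constant I/O and the price gap starts to make sense.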