Recently, I’ve been thinking that I should write down some of my views on IT. I don’t believe in a black & white world, but in one that’s full of realities, tradeoffs, and compromises. I’ve worked with people who refuse to (or are unable to) recognize that and spend energy trying to dictate instead of collaborate, typically to their own detriment and the frustration of everyone around them. IT exists to support and enable an organization, and should not be an end unto itself.
It’s been a busy week, but I feel I’ve at least got something to show for it. The first big project I’ve worked on has gone public at ITEXPO East 2015 and I can actually talk about it now. The new service is called Switchvox Rescue.
About six months ago, I took a new job and left the position I’d held at the Alabama Supercomputer Center for roughly seven years. I’ve struggled with whether I should write a post about why I chose to leave my old job or focus on the positive and talk about my new job. If you’re curious, this article at the Washington Business Journal covers some of the shenanigans that motivated me to leave my job with CSC.
TL;DR Support for NUMA systems in torque/Moab breaks existing means of specifying shared memory jobs and limits scheduling flexibility in heterogeneous compute environments.
My experience is based on SUSE SLES 11 SP1/SP2 with a stock kernel, so YMMV if you’re running a newer mainline kernel without all the backports.
I tested on two Supermicro systems. The first has an LSI HBA card with 22x 2TB enterprise SATA drives (originally purchased to run OpenSolaris/ZFS). The second has an Adaptec hardware RAID controller with 36x 2TB enterprise SATA drives. Some of the data loss and stability issues I experienced may be attributable to the drives themselves: I later discovered that the “enterprise” drives in the first system were less RAID-friendly than the manufacturer claimed, which eventually led them to replace ALL of my drives with a different model.
Btrfs was a preview in SLES SP1 and is “supported” in SP2, but with major restrictions if you want a supported configuration. Support in SP2 requires that you create btrfs filesystems using YaST and live with the limited options it allows. I’m guessing what you can do via YaST is the subset of features they tested enough to be willing to try and support. I tried using YaST to set up btrfs on one of our systems, but found its constraints too limiting given my use case and the organization I’d settled on in the SP1 days.
So this upcoming week is the big annual supercomputing convention, SC10, down in New Orleans. Since I’m skipping out (anxiously waiting for the arrival of Little Miss Sunshine), I’ve got time to actually try and read through the slew of new product announcements and news coverage. So today I saw this quote on Twitter from hpc_guru and just had to share:
“Cost of the building next generation of supercomputers is not the problem. The cost of running the machines is what concerns engineers.”
Well, this is certainly not something I expected. SGI is one of the few HPC vendors out there that I’m aware of who are still doing neat things with hardware. We’ve got some of their large SMP Itanium boxes on the floor where I work, and I think they’re pretty slick machines. Pricy, but slick. And so far their support is about the best I’ve dealt with. That’s not saying they’re perfect (try getting a CXFS guru on the phone when you need one without sitting on a major outage for several hours), but they generally seem better than most of the other HPC vendors I’ve worked with (IBM, Cray).
Medical researchers at the University of Alabama at Birmingham (UAB) have discovered a new use for scorpion venom – cancer medication. Each year, some 9,000 Americans are diagnosed with malignant glioma, a form of brain cancer that kills about half its victims within a year of diagnosis. Glioma cells work a lot like cockroach muscle cells. And while that fact is pretty disgusting, it also got UAB researchers thinking about the giant Israeli scorpion, whose venom is harmless to humans but deadly to its cockroach prey.
On the drive back from Savannah, Charles and I quickly stopped in to visit the Robins Air Force Base Museum of Aviation. We didn’t spend very long there, but it wasn’t too far out of the way and I had a chance to take a few pics. There were some more unusual planes there that I’d meant to go back and look up info on, but obviously haven’t, since I only just pulled the photos off my camera a few days ago with the rest of the photos from Savannah.
Took a trip for work back in December to provide cluster training for some of the engineers at Gulfstream. Seems like we also did some very minor maintenance on the cluster we administer there. Anyway, I brought the camera along and had a chance to take a few pics while wandering around the riverfront. Only got around to pulling them off my camera today, so here they are: