The Cluster Era – Affordable Supercomputers
In 1999, I installed our first HPC system. It was a 16-node cluster in a single rack, and the customer was a bio-research centre in Norwich.
Each compute node was a 2U rack-mount chassis, and the head node was also 2U.
By today’s standards, it was quite rudimentary, but it had a scheduler and two MPI environments, which I had installed and tested, with some assistance from the chaps from the parallel computing department at Imperial College.
The cluster was running BLAST, which compares a query sequence against a database of known sequences, a staple of genome sequencing work.
This kind of workload parallelises particularly well, as you can partition the query sequence or the database into node-sized chunks and run them in parallel using the appropriate MPI environment.
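The partitioning idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it is not BLAST and it does not use MPI. The toy database, the exact-substring match in `search_chunk`, and the thread pool standing in for cluster nodes are all hypothetical, chosen to show how a database might be split into chunks, searched independently, and the results merged.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, n_chunks):
    """Split items into n_chunks contiguous chunks of near-equal size."""
    base, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def search_chunk(query, chunk):
    """Toy stand-in for a sequence search: exact substring match only."""
    return [name for name, seq in chunk if query in seq]


def parallel_search(query, database, n_nodes):
    """Search each chunk on its own worker, then merge the hits in order."""
    chunks = partition(database, n_nodes)
    # Threads stand in for cluster nodes here (an assumption for illustration).
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partial_hits = list(pool.map(lambda c: search_chunk(query, c), chunks))
    hits = []
    for part in partial_hits:
        hits.extend(part)
    return hits


# Hypothetical toy database of (name, sequence) pairs.
database = [
    ("seq1", "ACGTACGTGA"),
    ("seq2", "TTTTGGGACC"),
    ("seq3", "GGACGTTTAA"),
    ("seq4", "CCCCCCCCCC"),
    ("seq5", "AACGTAACGT"),
]

print(parallel_search("ACGT", database, n_nodes=3))
```

Because each chunk is searched without reference to the others, the only coordination needed is the final merge, which is why this style of job scaled so readily across nodes.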
We left the customer to try their new cluster out for a few weeks, after which I returned to site to check everything was still working as it should.
I asked my contact there if they were happy with the system.
He told me that they had previously done their genome sequencing on a chunky Compaq server running next to his desk.
He said that a run would take two or three weeks, depending on the type of sequence, and after a week or so, he would get nervous that the machine would crash, or they would have a power cut or something, and he would have to start all over again.
“So how long does the new cluster take to run a sequence?” I asked.
“Eleven hours,” came the reply.
The new cluster had revolutionised their work with genome sequencing in that they could now do a run a day, whereas before they were limited to one every few weeks.
Needless to say, we received some more orders from this customer.
The spectacular success of this project gave us the enthusiasm to pursue this further.
Building a cluster of this type was quite time-consuming, as we built each node using separately sourced chassis, system boards, memory, disks and so on to produce a working cluster node.
My boss had an aspiration that if we could supply a reasonably sized cluster at a competitive cost, this would open up the possibility of academic institutions with limited budgets being able to afford their own parallel compute resource without having to beg or borrow compute time from elsewhere.
Taking this a step further, if we could build our own chassis, this would save on material costs.
Further still, if we could build a chassis that held many nodes instead of just one, this would save even more.
Building our own blade systems
My boss contacted a company nearby that specialised in sheet metal fabrication.
The owner of this business was a genius at designing and building things from sheet metal.
After some discussion, we came up with a rack-mountable box with a power supply and backplane installed in it. Each commodity PC system board was mounted on a “blade” that could be hot-plugged into the box.
What we didn’t know at the time was that IBM were having a go at making something similar themselves.
The design phase was about working out how to build a power supply and backplane that allowed a PC system board to be hot-plugged, how to make sure all the components were cooled, and how to ensure all the I/O connections were secure and safe. Once that was done, we were in a position to build some prototypes, which we could give to prospective customers to try out.
As you might expect, academic institutions operate in a collegiate environment, and if you provide good service or if you have an exciting product, word gets around.
Before we knew it, we were inundated with orders for these in-house designed compute clusters.
We started with a chassis box that held five nodes, which was quite low density really, considering the size of the box. So we kept developing the product, and soon we were able to offer a double-density box that held ten system boards per chassis.
CPU cooling was the most challenging part here. We had to find a really efficient low-profile CPU cooler.
With this 10-node design, we spent some time developing it into a sleek, reliable and aesthetically pleasing product.
Then the orders started to come in.
We had orders from pharmaceutical sites, more bioinformatics sites, CFD-related sites and a lot more academic sites.
There was one project that some of the universities in the north of the country were involved in, a grid-computing collaboration that needed a lot of compute power. We were very busy with this one.
We were having to take on more staff and tune the build process to make it slicker and quicker.
One of these systems went into Manchester University. This was when I first met Professor Brian Cox, as he was involved in this project.
This period really marked the point where what had started as an interesting engineering experiment had become a fully-fledged and very busy part of the business.
What began as a handful of commodity PCs had turned into affordable supercomputing, and there was clearly an appetite for it.
There was still plenty more to come, and as the cluster side of the business matured, we began to look at another equally demanding challenge: storage.

Paul Ingram
Senior Principal Consultant
Red Oak Consulting