Ever since I can remember, I’ve had easy access to some form of multi-node compute environment. But after changing jobs earlier this year, I found myself without access to anything resembling a cluster. At first it was only mildly annoying: I made do with virtual machines on my (rather beefy) PC, and I dabbled with public clouds like Azure and Amazon whenever I got a free coupon at some convention. But recently I needed a cluster environment to test something for a potential customer and, much to my chagrin, every virtual environment I tried just did not want to cooperate.
So I made the solemn promise to myself:
I will never be at the mercy of virtual compute ever again!
Well, that was easy… But now of course comes the hard part: making the promise a reality. I’ve never been a DIY kind of person; I much prefer to enlist the help of a home improvement professional. But building a cluster for myself just wasn’t something I was ever going to let anybody else do for me.
So, without further ado, time to articulate some Design Decisions:
1. Intel/AMD x64 based
2. Low electricity consumption
3. Small footprint
4. At minimum twelve nodes
5. Focus on testing and development, not performance
6. Financially reasonable, below 4,000.00 EUR
1. Intel/AMD x64 based
There are quite a few small clusters available for purchase. However, these so-called pico-clusters are usually based on Raspberry Pi or other ARM-based hardware. Even though I think those devices are brilliant, I want to use this cluster to build and test anything that comes my way, and that calls for x64. Firstly, tying in to my profession, I want to be able to set up an Apache Hadoop cluster. Secondly, I want it to be a test bed for the various things I run on my public internet server.
2. Low electricity consumption
Obviously I want to be able to run the cluster without going bankrupt in the process. But additionally, high electricity usage also means high heat output. This cluster will sit next to me in my study and I do not want to be deafened by high-speed fans. Luckily, current advances in x64 chip design are geared more towards lower consumption than towards higher clock speeds.
3. Small footprint
It needs to sit on my desk in my study. A 19 inch rack solution is just not going to work. This means that I will most likely need to build the housing cabinets myself. My earlier disclaimer about not being the DIY type notwithstanding, I’ve always been intrigued by these “scratch builds”. So let’s call this a challenge. We’ll see how far I get…
4. At minimum twelve nodes
I want to be able to have one large, or two smaller clusters. Five slave nodes with one master sounds like a good minimum cluster size. Two of these small clusters would then mean two times six nodes for a total of twelve. Right now, building cluster pods of six nodes each sounds good.
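To double-check the node arithmetic above, here is a minimal Python sketch. The pod layout comes straight from the text; the `hostnames` helper and its naming scheme (`pod1-master1` and so on) are purely hypothetical and only there to make the layout visible.

```python
# Pod layout as described in the text: one master plus five slave nodes.
MASTERS_PER_POD = 1
SLAVES_PER_POD = 5
PODS = 2

nodes_per_pod = MASTERS_PER_POD + SLAVES_PER_POD  # 6 nodes per pod
total_nodes = PODS * nodes_per_pod                # 12 nodes in total


def hostnames(pods, masters, slaves):
    """List one name per node. The naming scheme is an assumption."""
    names = []
    for p in range(1, pods + 1):
        names += [f"pod{p}-master{m}" for m in range(1, masters + 1)]
        names += [f"pod{p}-slave{s}" for s in range(1, slaves + 1)]
    return names


if __name__ == "__main__":
    print(f"{nodes_per_pod} nodes per pod, {total_nodes} nodes in total")
    for name in hostnames(PODS, MASTERS_PER_POD, SLAVES_PER_POD):
        print(name)
```

Running both pods as one large cluster would presumably mean a single master with the remaining nodes as slaves, but that is a choice for later.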
5. Focus on testing and development, not performance
Having a cluster that sits at the top of the TOP500 (http://www.top500.org/) would be amazing, but this would conflict rather badly with points 2, 3 and 6 on the design criteria list. Also, it is quite impossible to set up a cluster that is perfect for the intended workflow, since I have no idea what I’ll be using the cluster for tomorrow, let alone next month.
6. Financially reasonable, below 4,000.00 EUR
At this point, it’s just a stab in the dark of course, but I feel it should be possible to get all the computer hardware for a twelve node cluster for under 2,000.00 EUR. The scratch build housing will require some additional expenses, but that should never be more than the cost of the computer hardware. Just as important: a maximum budget of 4,000.00 EUR is also something I think will comply with the W.A.F. (http://goo.gl/OnXqP4). For anyone who is interested, I will keep a running tally of my expenses available on this blog.
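The budget reasoning above can be sketched in a few lines of Python. The three figures are the ones from the text; the `remaining` helper is a hypothetical stand-in for the running tally and contains no real expenses yet.

```python
# Figures taken from the text (all in EUR).
TOTAL_BUDGET_EUR = 4000.00
HARDWARE_CAP_EUR = 2000.00
NODES = 12

# Rough ceiling per node if the hardware cap is split evenly (about 166.67).
per_node_cap = HARDWARE_CAP_EUR / NODES

# The housing should never cost more than the computer hardware.
housing_cap = HARDWARE_CAP_EUR
worst_case_total = HARDWARE_CAP_EUR + housing_cap


def remaining(expenses):
    """Budget left after a list of expenses (hypothetical tally helper)."""
    return TOTAL_BUDGET_EUR - sum(expenses)


if __name__ == "__main__":
    print(f"Per-node hardware ceiling: {per_node_cap:.2f} EUR")
    print(f"Worst-case total: {worst_case_total:.2f} EUR")
    print(f"Remaining before any purchase: {remaining([]):.2f} EUR")
```

So the worst case of both caps combined lands exactly on the 4,000.00 EUR limit, with roughly 166 EUR of hardware per node to play with.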
To be continued…