EC2 as a build server

For the past year or so I've been using a Slicehost virtual private server running Ubuntu Linux as a build server.

Due to the inherently IO-bound nature of some of my builds and the RAM-starved nature of the slices on offer, I've been forced to upgrade from the 256MB to the 512MB and then to the 768MB slice. I'm not sure if it's a marketing ploy, but you cannot use the server otherwise.

Since last week, I've been running experiments on migrating the builds to EC2 (with S3 for storage). Using EC2 as a build server, especially for a small company, is a perfect fit:

EC2 machines are way more powerful

The smallest EC2 machine has 1.7GB of RAM and the next one up has 7GB. These are serious machines.

Builds are finite

This might not apply to your projects or your company, but I generally do only a few operations per day that would trigger a build.

This means I only need the server for, let's say, 5 builds per day or fewer. Over 20 work days, that adds up to at most 100 builds per month.

So I am paying for a server to be live all the time when I only need it for 100 builds. Assuming a build takes about 1 hour (which it does for my longest project), I only need the server for about 100 hours per month.

It's cheaper

Given the previous paragraph, where I worked out that I only need the server for about 100 hours, it's cheaper to pay for EC2 by the hour. Of course, running an EC2 server full-time would be a lot more expensive than Slicehost, but I don't need it full-time.
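
To make that concrete, here is a back-of-envelope comparison in Python. The prices are placeholder assumptions, not quotes (a few tens of dollars per month for an always-on slice, around $0.10 per hour for an m1.small); substitute your own rates.

    # Back-of-envelope cost comparison: always-on slice vs. on-demand EC2.
    # All prices below are placeholder assumptions -- plug in current rates.
    BUILDS_PER_DAY = 5
    WORK_DAYS_PER_MONTH = 20
    HOURS_PER_BUILD = 1.0

    SLICE_PRICE_PER_MONTH = 45.00   # assumed price of an always-on slice
    EC2_PRICE_PER_HOUR = 0.10       # assumed hourly price of an m1.small

    build_hours = BUILDS_PER_DAY * WORK_DAYS_PER_MONTH * HOURS_PER_BUILD
    ec2_cost = build_hours * EC2_PRICE_PER_HOUR

    print(f"Build hours per month: {build_hours:.0f}")      # ~100 hours
    print(f"Always-on slice:       ${SLICE_PRICE_PER_MONTH:.2f}/month")
    print(f"On-demand EC2:         ${ec2_cost:.2f}/month")   # ~$10/month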

Thus, it's cheaper either to give up Slicehost altogether or to go with a mixed scenario: a much cheaper Slicehost server combined with an EC2 slave running on demand when needed. I'm slowly migrating to the mixed scenario first.
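
A minimal sketch of the on-demand half of that mixed scenario, written in Python against the boto3 library (what you would use today, not what I ran these experiments with). The AMI ID, key pair and build script path are hypothetical placeholders; the idea is simply to launch an instance per build and let it terminate itself when the build is done, so you only pay for the hours used.

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # User-data runs on first boot: do the build, then shut down. Because of
    # the shutdown behaviour flag below, shutting down terminates the instance
    # and billing stops as soon as the build finishes.
    USER_DATA = """#!/bin/bash
    set -e
    /opt/build/run-build.sh   # hypothetical build entry point
    shutdown -h now
    """

    response = ec2.run_instances(
        ImageId="ami-12345678",     # placeholder AMI with the OS and tools
        InstanceType="m1.small",    # the type discussed here; pick a current one today
        MinCount=1,
        MaxCount=1,
        KeyName="build-key",        # placeholder key pair
        UserData=USER_DATA,
        InstanceInitiatedShutdownBehavior="terminate",
    )

    instance_id = response["Instances"][0]["InstanceId"]
    print(f"Launched build slave {instance_id}")

    # Optionally block until the machine is up before watching logs or S3.
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])

In the mixed scenario, the cheap always-on server would run something like this on each build trigger and then pick the build artifacts up from S3.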


But beyond cost, there are also some other clear advantages to using EC2:

You really do a clean build

While this problem shouldn't show up on a properly configured build server, it can still happen: tainted builds. A tainted build is one that, for various reasons, uses some form of unexpected binaries.

When you build on a fresh machine there is nothing there to influence your build. Just the operating system, your tools and your code.

It forces you to take the magic out of the build

When you start with a bare-bones machine, you cannot rely on any assumptions you would unknowingly make on the build server.

On an always-live build server you can easily ssh in and make some manual tweak that will stay there forever but never actually be documented.

This style of whole-world building forces you to document and provide all the build dependencies explicitly.
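
One way to enforce that is to keep the whole setup in a single bootstrap script that a fresh instance runs on first boot, so every dependency has to be spelled out somewhere. A rough sketch of the shape of such a script, again in Python; the package list, S3 bucket, repository URL and build command are all hypothetical placeholders for whatever your project actually needs.

    import subprocess
    import boto3

    # Everything the build needs, spelled out explicitly -- nothing is assumed
    # to already exist on the machine. Names below are placeholders.
    PACKAGES = ["openjdk-11-jdk", "ant", "git"]
    DEPS_BUCKET = "my-build-deps"          # hypothetical S3 bucket of pinned deps
    DEPS_ARCHIVE = "dependencies.tar.gz"

    def run(*cmd):
        """Run a command and fail loudly, so a missing dependency is never silent."""
        subprocess.run(cmd, check=True)

    # 1. Install the tools from the distribution repositories.
    run("apt-get", "update")
    run("apt-get", "install", "-y", *PACKAGES)

    # 2. Fetch the pinned third-party dependencies from S3 and unpack them.
    boto3.client("s3").download_file(DEPS_BUCKET, DEPS_ARCHIVE, "/tmp/" + DEPS_ARCHIVE)
    run("mkdir", "-p", "/opt/build")
    run("tar", "-xzf", "/tmp/" + DEPS_ARCHIVE, "-C", "/opt/build")

    # 3. Check out the code and run the build.
    run("git", "clone", "https://example.com/project.git", "/opt/build/project")
    run("ant", "-f", "/opt/build/project/build.xml", "dist")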

Some first results for my most IO-bound build

It finishes:

After 30 minutes with the 512MB slice (but that started slipping for some reason, hence the upgrade)
After 20 minutes with the 768MB slice.
After 25 minutes of uptime with the EC2 m1.small
After 11 minutes of uptime with the EC2 m1.large
After 15 minutes of uptime with the EC2 m1.large, building everything over a RAM-disk.

The surprising thing here is that on the EC2 m1.large, where I have over 7GB of RAM, a RAM-disk is slower. I assume the reason is that Linux uses the free RAM for disk cache anyway and is smarter about it (i.e. it only caches the JARs, not the whole source and build folder like I did).

The build on the EC2 m1.small seems to be a bit slower, but that total time is uptime: in those 25 minutes I install all the tools, download and unzip more than 1GB of dependencies, and do the build.