tl;dr: C4 instances have a simpler design because they don’t have to support hard drives, which makes them more reliable and cheaper for AWS.
At last year’s re:Invent, AWS announced a new generation of compute-optimized EC2 instances: the C4 instance family. Back then they provided some technical details about these instances, including the fact that they are powered by a custom “Intel Xeon E5-2666 v3” processor built for AWS only, but pricing and an availability date were still missing. Earlier this week the instances became available, and pricing details and the remaining information were published as well.
Most press coverage of this new instance family focused on the custom-built CPU, but the longer I think about it, the more I think that the CPU is just the tip of the iceberg.
Block storage at AWS
AWS provides two different kinds of block storage: so-called ephemeral storage, which is storage directly attached to the host system an instance is running on, and EBS (Elastic Block Store) volumes, which are network-attached disks.
Ephemeral volumes have been available for most instances; only the smallest instance families (T1 and T2) don’t offer them. Depending on the instance you choose, you’ll get a different amount of ephemeral storage, which can be either HDD or SSD backed. The great thing about it: it’s free, because it’s part of the price you already pay for an instance, whether or not you use this storage. But as usual there is a caveat: once you stop an instance, the data stored on these volumes is gone, because there is no host system anymore which could hold the data.
EBS volumes are the means of choice if you want reliable block storage. In the past there have been performance bottlenecks with them, but with the introduction of Provisioned IOPS EBS volumes and gp2 volumes, plus some improvements AWS is going to implement, that shouldn’t be an issue anymore.
One thing that went unnoticed with the C4 instances is that these are the first instances (besides the small T1 and T2 instances) which don’t provide ephemeral storage at all!
I believe that’s a shift in AWS philosophy to decrease their costs and to simplify their server architecture.
We all know that companies like Google or AWS build their own hardware. Facebook goes a different way with its Open Compute Project, but they all try to achieve the same goal: they want hardware that perfectly matches their needs, without features they don’t need, to drive costs down and increase reliability.
AWS started building its own custom network equipment five years ago, and they have several other specialized components as well, like big storage racks and of course the servers behind the EC2 instances. But these servers always came with hard disks included to enable ephemeral storage. With the introduction of the C4 instances, AWS seems to be confident enough that EBS volumes also fit the needs of customers who have been using ephemeral volumes before, so they dropped the ephemeral storage options, which allows them to omit hard drives completely from these servers. That removes one major pain point of such servers: failing drives. That’s a big plus for reliability.
AWS has published technical details about their custom “Intel Xeon E5-2666 v3” processor, but one interesting detail is missing: the TDP (Thermal Design Power), roughly the maximum amount of power the CPU draws under sustained load during normal operation. I wouldn’t be surprised if Intel managed to decrease that by a few watts for AWS, maybe by disabling features AWS will never use. And if AWS doesn’t even need hard drives, they also don’t need SATA ports and need only a limited number of PCIe lanes for the network cards. That would allow them to use a stripped-down platform controller hub, which I believe Intel developed for them as well. So AWS uses a custom processor and might also use a custom platform controller hub, each of which might save them a tiny fraction of the power a server needs. Even a tiny fraction adds up to a lot of saved money at the scale of AWS, which is important for driving costs down and staying competitive.
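To get a feeling for how a few watts per server add up, here is a rough back-of-the-envelope sketch in Python. Every input is a made-up placeholder of mine, not an AWS figure: the watts saved per server, the number of servers and the electricity price are all assumptions, and `annual_power_savings` is just a helper name I picked for this illustration.

```python
HOURS_PER_YEAR = 24 * 365  # ignores leap years, good enough for an estimate


def annual_power_savings(watts_saved_per_server, server_count, usd_per_kwh, pue=1.0):
    """Estimate yearly electricity savings in USD.

    All parameters are hypothetical inputs. `pue` (power usage effectiveness)
    scales the saving to include cooling and distribution overhead; 1.0 means
    no overhead is counted.
    """
    # Watts -> kilowatt-hours per year across the whole fleet
    kwh_saved = watts_saved_per_server * server_count * HOURS_PER_YEAR / 1000
    return kwh_saved * pue * usd_per_kwh


if __name__ == "__main__":
    # Purely illustrative numbers: 5 W saved per server, one million servers,
    # 0.10 USD per kWh. None of these come from AWS.
    savings = annual_power_savings(5, 1_000_000, 0.10)
    print(f"{savings:,.0f} USD per year")
```

With these invented inputs the estimate lands at roughly 4.4 million USD per year. The real numbers are unknown, but the sketch shows why a saving that looks negligible on a single server is worth chasing across a whole fleet.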
The biggest news about the C4 instances isn’t the custom-built processor, but the removal of the ability to use ephemeral volumes with them. By not providing such volumes, AWS eliminated the need to add hard drives to these servers, which removes a major point of failure and also drives down costs by making the design simpler and by maybe saving some watts of power. It’s interesting to see such developments, and I’m pretty sure they’ll enable AWS to continue cutting prices in the future as they did in the past.