Amazon Usage Estimates

Two weeks ago Guy Rosen posted an interesting analysis of the EC2 instance IDs that reveals how many instances (virtual machines) have been launched on EC2 since its beginning in 2006. We’ve also been digging in our records, and I can share some interesting findings.

First of all, Guy’s analysis contains one significant error, due to the limited data set he had access to. Before May 2009 EC2 issued both even and odd instance IDs, not only even ones as he states. From that date EC2 issued only even IDs, until it switched to only odd ones in early September. The even/odd switches don’t seem to correlate with ID boundaries; perhaps Amazon switches between two active/standby reservation systems or something else is going on.

The formula to convert an EC2 ID into a sequential launch number, as far as we can tell, is:

Given an EC2 instance ID of the form i-11223333 (eight hex digits)
Assign p1 the digits marked 1, p2 the digits marked 2, and p3 the digits marked 3
Also assign p31 the first two digits of p3 and p32 the last two digits of p3
  c1 = (p1 ^ p32) ^ 0x69
  c2 = (p2 ^ p31) ^ 0xe5
  c3 =  p3 ^ 0x4000
And finally concatenate c1-c2-c3. (This does not include the even/odd adjustments.)
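
As a rough illustration, the steps above might be sketched in Python as follows. The function name launch_number is ours, and we are assuming that "concatenate" means placing c1, c2, and c3 side by side as hex digits with c1 most significant. Like the formula itself, this sketch leaves out the even/odd adjustments.

```python
import string


def launch_number(instance_id: str) -> int:
    """Apply the formula above to an ID such as 'i-11223333'.

    Hypothetical helper: the bit layout of the result (c1 in the top byte,
    c2 next, c3 in the low 16 bits) is our reading of "concatenate".
    """
    hex_part = instance_id[2:]
    if (not instance_id.startswith("i-") or len(hex_part) != 8
            or any(ch not in string.hexdigits for ch in hex_part)):
        raise ValueError("expected an ID of the form i-XXXXXXXX (8 hex digits)")

    value = int(hex_part, 16)
    p1 = (value >> 24) & 0xFF   # the "11" digits
    p2 = (value >> 16) & 0xFF   # the "22" digits
    p3 = value & 0xFFFF         # the "3333" digits
    p31 = p3 >> 8               # first two digits of p3
    p32 = p3 & 0xFF             # last two digits of p3

    c1 = (p1 ^ p32) ^ 0x69
    c2 = (p2 ^ p31) ^ 0xE5
    c3 = p3 ^ 0x4000

    # Concatenate c1-c2-c3 into a single number.
    return (c1 << 24) | (c2 << 16) | c3


if __name__ == "__main__":
    # "i-12345678" is a made-up ID used only to exercise the arithmetic.
    print(hex(launch_number("i-12345678")))
```

Comparing the numbers this produces for two IDs observed some time apart would give an estimate of the launches in between, subject to the even/odd caveat.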

The upshot of Guy’s error is that he underestimates the launches by almost 2x! Here is a graph showing the instances launched daily since late 2006 that we would postulate based on his formula for instance IDs and what we’ve observed. We compute a total of 15.5 million instances (!) launched to date:

You can see that EC2 has been growing steadily, except for dips during the holidays and a spike in activity in April 2008. That spike was due to Animoto’s scaling to several thousand servers within a few days. We’re a little puzzled about this spike, however, because the instance ID analysis shows about 2x more servers launched than Animoto actually launched (we launched them so we know). We believe this discrepancy to be temporary, but there remain some mysteries in the instance ID allocation.

It’s also important to be clear about what an instance launch means – namely, the launch of a virtual server. It says nothing about what size server is launched (and therefore its cost per hour) or how long that server runs (and therefore how many servers are running concurrently). As a result, an “instance launch” might mean as little as 10 cents in EC2 revenue (one small instance for one hour) or, for example, $7,008 in EC2 revenue (one XL instance run for 365 days), or even more. That’s quite a difference, and makes it challenging to calculate revenues based solely on total instance launch statistics.
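
To make the spread concrete, here is a small Python sketch of the arithmetic behind those two figures. The 10 cents per hour for a small instance comes from the text; the 80 cents per hour for an XL instance is our inference from $7,008 divided by the 8,760 hours in 365 days. Amounts are kept in whole cents to avoid floating-point rounding.

```python
HOURS_PER_YEAR = 24 * 365      # 8,760 hours in 365 days

SMALL_CENTS_PER_HOUR = 10      # from the text: one small instance for one hour
XL_CENTS_PER_HOUR = 80         # inferred from $7,008 / 8,760 hours


def revenue_cents(cents_per_hour: int, hours: int) -> int:
    """Revenue from a single instance launch, in cents."""
    return cents_per_hour * hours


low = revenue_cents(SMALL_CENTS_PER_HOUR, 1)
high = revenue_cents(XL_CENTS_PER_HOUR, HOURS_PER_YEAR)

print(f"one small instance, one hour: ${low / 100:.2f}")
print(f"one XL instance, 365 days:    ${high / 100:,.2f}")
print(f"ratio between the two:        {high // low:,}x")
```

Two launches that look identical in the instance ID sequence can therefore differ in revenue by a factor of tens of thousands.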

Another interesting fact that we have observed is that during 2009 many of the larger EC2 customers have been migrating to larger instance sizes. In earlier days the predominant method of scaling was launching more servers, but we are now seeing more scaling by replacing smaller servers with larger ones. Those XL servers are going like hotcakes! In addition, we see a clear pattern: the larger the server, the longer it runs. A lot of the small servers go as quickly as they came; they’re used for experimentation, development, and testing. Once you launch a large server and fill it up with data, chances are you’ll keep it running for a while. Hold onto your wallet!

Another interesting trend we’ve seen is the improvement in sysadmin-to-server ratio. Our customers who grok the RightScale platform become very effective at managing lots of servers with few people – hundreds to thousands of servers per sysadmin. As a result they use servers aggressively to solve business needs, whether to keep up with exponential traffic or simply to gain flexibility during dev and test. Overall, in terms of all cloud spending, in the last 12 months we’ve observed:

  • Cloud infrastructure spending grew 380% in terms of money spent on cloud provider resources
  • Average cloud costs per customer grew 140%; cloud users on average are spending 2.5X more than a year ago
  • RightScale’s own cloud infrastructure consumption grew 440%

That’s phenomenal growth – and testimony to the value of managed cloud computing.