Cloud Storage: AWS vs Azure vs Google vs SoftLayer

In taking a strategic approach to choosing their cloud infrastructure, enterprises are opting for a mix of public and private clouds to ensure that they are using the most appropriate clouds for their workloads and applications. AWS, Google Cloud Platform, Microsoft Azure, and IBM SoftLayer are among the top public cloud providers according to the RightScale 2016 State of the Cloud Report.

When defining your requirements for each application or workload, you will need to consider a variety of factors to determine which cloud is the best fit. These include features, costs, locations, security and compliance, performance, existing data centers you have in place for your private clouds, and vendors that you have enterprise agreements or discounted pricing with, to give just a few examples.

Among public cloud vendors, the key differences are with storage, container services, and pricing. In a previous post, we covered differences in container services and pricing. In this blog, I’ll drill down on cloud storage.

Want to compare public clouds based on your own specific requirements? RightScale offers Cloud Comparison, a free tool for comparing clouds that enables you to select your own requirements and shows you how many and which of those requirements are supported by each cloud.

Because the features and services offered by each cloud provider are constantly changing, it can be difficult to keep up. Cloud Comparison pulls all of this data together in one place and updates it quarterly. So rather than digging through multiple websites to determine whether a cloud provider has a particular service, region, certification, or SLA term, you can simply select your requirements and see which clouds match. RightScale users can also access Cloud Comparison from within the Cloud Analytics Scenario Builder, which provides pricing and cost comparisons for all of the various instance sizes.

Cloud Storage Options

Cloud providers offer a variety of options for storing data, including:

  • Object storage
  • Block storage
  • Instance/server storage (“ephemeral” storage)
  • Archival storage
  • Content delivery networks (CDN)
  • Queue services
  • Database services
  • Caching services
  • Import/export services

I’ll focus on two of the most commonly used core storage services, object storage and block storage, since almost everyone leverages one or both of these when they first start using the cloud.

Object Storage

AWS Simple Storage Service (S3)

  • Storage abstraction: “buckets”
    • Unlimited number of objects per bucket; 5TB limit per object
  • SLAs:
    • Standard:
      • Availability: 99.99% on yearly basis
      • Durability: 99.999999999% (11 nines) on yearly basis
    • Infrequent Access:
      • Availability: 99.9% on yearly basis
      • Durability: 99.999999999% (11 nines) on yearly basis
  • Encryption: in-flight and at-rest
    • Multiple encryption options: AWS controls keys, user controls keys

S3 is the Simple Storage Service for AWS. S3 uses the term “buckets” to describe the storage abstraction that you can throw your objects into. S3 allows an unlimited number of objects per bucket, and there is a very generous 5TB per-object size limit. Similar to the other cloud vendors, AWS offers different service levels: standard and infrequent access. The standard service level for S3 provides 99.99% availability on a yearly basis, and durability is 11 nines (yes, you read that right). These numbers make S3 very durable and safe: for every 10,000 objects in S3, you can expect to lose one every 10,000,000 years.
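The durability claim can be sanity-checked with a little expected-value arithmetic. This is a rough model, assuming each object has an independent annual loss probability:

```python
# Rough expected-loss model for 11 nines of annual durability.
# Assumes independent, identical loss probability per object.
annual_durability = 0.99999999999            # 11 nines
annual_loss_prob = 1 - annual_durability     # per object, per year

objects = 10_000
expected_losses_per_year = objects * annual_loss_prob

# Average years until a single object loss is expected.
years_per_loss = 1 / expected_losses_per_year  # ~10,000,000 years
```

That is where the “one object in 10,000 every ten million years” figure comes from.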

The other service level for S3 is infrequent access, which provides slightly lower availability at 99.9% with the same 11 nines of durability. There’s a slight price break for infrequent access, so take a look at your use case to see which one makes the most sense.

With all of these storage services you can encrypt data in-flight via SSL and TLS, but you can also encrypt data at-rest. There are several encryption options that AWS provides: server-side encryption where AWS controls the keys, or you can provide your own keys and store them via AWS Key Management Service (KMS). You can also encrypt data yourself client-side and upload the encrypted data to Amazon S3. In this case, you manage the encryption process and the encryption keys using your own tools.
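As a sketch of how these choices typically map onto the `boto3` S3 API, the dictionaries below show `put_object` parameters for each option. They are only assembled locally (nothing is uploaded), and the bucket, key, and KMS alias names are placeholders:

```python
# How the S3 encryption-at-rest choices map to put_object parameters.
# Bucket/key names and the KMS alias are placeholders.

# 1. SSE-S3: AWS manages the encryption keys entirely.
sse_s3 = {"Bucket": "my-bucket", "Key": "report.csv",
          "ServerSideEncryption": "AES256"}

# 2. SSE-KMS: keys are held in AWS Key Management Service.
sse_kms = {"Bucket": "my-bucket", "Key": "report.csv",
           "ServerSideEncryption": "aws:kms",
           "SSEKMSKeyId": "alias/my-app-key"}  # hypothetical key alias

# 3. Client-side: you encrypt before upload, so S3 only ever sees
#    ciphertext and no server-side encryption parameter is needed.
client_side = {"Bucket": "my-bucket", "Key": "report.csv.enc"}

# With boto3 these would be passed as s3.put_object(**sse_kms), etc.
```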

Google Cloud Storage

  • Storage Abstraction: “buckets”
    • Unlimited number of objects per bucket; 5TB limit per object
  • SLAs:
    • Standard: 99.9% monthly for standard
    • Durable reduced availability (DRA): 99.0% monthly
    • Latency for both is milliseconds.
  • Encryption: Same as AWS, but customer-supplied keys are in beta

Google calls its object storage Google Cloud Storage, and like AWS, Google also uses “buckets” as an abstraction. Google provides the exact same limits as AWS: an unlimited number of objects per bucket and a 5TB limit on the size of each individual object. Google also provides three service levels: standard, Durable Reduced Availability (DRA), and Nearline.

Google provides monthly SLAs instead of annual. For standard storage, Google provides a 99.9% monthly uptime guarantee, and the latency is milliseconds to gain access to your objects. For DRA, Google offers 99% monthly uptime, and again, the latency is milliseconds. Nearline has the same 99% uptime as DRA but with a latency of about 3 seconds, so I think of it more as archival storage.

Google automatically encrypts all data both in-flight and at rest. By default, Google Cloud Storage uses its own server-side encryption keys to encrypt data. Alternatively, there is a beta version for a new option that allows you to provide your own encryption keys. Beta means that there may be backward breaking changes in Google’s API. In addition, as with any cloud, you can choose to encrypt data on the client side before you write it to Google Cloud Storage.

Azure Storage

  • Storage Abstraction: “containers” and “blobs”
    • Unlimited number of objects, 500TB limit per storage account; can have multiple storage accounts
  • Service Levels:
    • Locally Redundant Storage (LRS), Zone Redundant Storage (ZRS), Geographically Redundant Storage (GRS) (most comparable to AWS and Google), Read-Access Geo-Redundant (RA-GRS)
  • Encryption: Same via Azure Encryption Extensions (run it on your VM); can be used with Azure Key Vault

Azure uses slightly different nomenclature for storage: “containers” instead of “buckets,” and objects are called “blobs.” Azure offers an unlimited number of objects per container and a 500TB limit per storage account, but you can have multiple storage accounts as well.

Azure takes a different approach to service levels. It has local, zone, and geo-redundancy options as well as read-access geo-redundant. Azure calls them LRS, ZRS, GRS, and RA-GRS. LRS is replicated multiple times within the same data center. ZRS is replicated multiple times within the same zone (that is, multiple data centers within the same geographic region). GRS replicates locally and to a second data center hundreds of miles away. GRS is the most comparable to Google and AWS storage, so if you wanted to compare apples-to-apples prices, you would compare Azure GRS Storage with those of the other providers. And lastly, there’s read-access geo-redundant, which adds read-access to the other geographic region that’s being used as your backup data center. Obviously the price increases as you move toward the more robust redundancy options.
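The four tiers reduce to a small lookup. This sketch encodes Azure's documented replica counts (three local copies, plus three more in a paired region for the geo-redundant tiers) and a toy helper for picking the cheapest tier that meets a requirement:

```python
# Azure Storage redundancy tiers summarized as data.
TIERS = {
    "LRS":    {"copies": 3, "scope": "single datacenter",          "secondary_read": False},
    "ZRS":    {"copies": 3, "scope": "multiple local datacenters", "secondary_read": False},
    "GRS":    {"copies": 6, "scope": "local + paired region",      "secondary_read": False},
    "RA-GRS": {"copies": 6, "scope": "local + paired region",      "secondary_read": True},
}

def pick_tier(need_geo: bool, need_secondary_read: bool) -> str:
    """Pick the least expensive tier that satisfies the requirements
    (price generally rises left to right: LRS -> ZRS -> GRS -> RA-GRS)."""
    if need_secondary_read:
        return "RA-GRS"
    if need_geo:
        return "GRS"
    return "LRS"
```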

As with other clouds, encryption in-flight is supported, and with at-rest you can use Azure Encryption Extensions and you can also store keys in the Azure Key Vault. Azure Encryption Extensions is a bit different than the other clouds because it is a utility that runs on your VM. In contrast, AWS and Google offer server-side encryption so there is no CPU load on your VM to do the encryption. With Azure Encryption Extensions, it’s on the VM you’re running it on, which may be important if you’re trying to milk the most out of your CPU.

SoftLayer Object Storage

  • Based on OpenStack Swift platform
  • Storage abstraction: “containers”
    • Unlimited number of objects per container; 5GB limit per object, but you can store data in chunks and the storage creates a manifest file that knows how to piece it back together; parallel uploads/downloads are allowed, so the limit is a bit misleading
  • Single Service Level
    • Durability: 99.999999999% (11 nines)
  • Replication within a cluster, but no geo-replication
  • Encryption: Third-party tools or customer-implemented tools, nothing built in

SoftLayer Object Storage is based on the OpenStack Swift platform, so if you’re familiar with Swift, you may already know some of the limits and caveats of SoftLayer storage. SoftLayer uses the same storage abstraction term as Azure: containers. It also supports an unlimited number of objects per container. SoftLayer does have a significantly smaller limit per object, at 5GB, but you can store a huge object in multiple chunks and create a manifest file that knows how to automatically piece it back together when you download it. SoftLayer also allows parallel uploads and downloads, so the 5GB limit is a little misleading. You can store bigger objects; you just have to do a bit of finagling to get them broken up and reassembled.
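The chunk-and-manifest approach can be sketched as follows: split the payload into segments no larger than the limit and build a manifest listing them in order, in the spirit of Swift large objects. The segment-naming scheme here is hypothetical, and the helper only builds the plan locally:

```python
import hashlib

SEGMENT_LIMIT = 5 * 1024**3  # Swift's 5GB per-object ceiling

def plan_segments(data: bytes, limit: int = SEGMENT_LIMIT):
    """Split data into <=limit chunks and describe each one in a
    manifest, the way Swift-style large objects are assembled."""
    manifest = []
    for i in range(0, len(data), limit):
        chunk = data[i:i + limit]
        manifest.append({
            "path": f"container/big-object/{i // limit:08d}",  # hypothetical naming
            "etag": hashlib.md5(chunk).hexdigest(),
            "size_bytes": len(chunk),
        })
    return manifest

# Small illustration: 10 bytes with a 4-byte limit -> 3 segments.
parts = plan_segments(b"0123456789", limit=4)
```

Because segments are independent objects, they can be uploaded and downloaded in parallel, which is where the performance benefit comes from.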

SoftLayer only offers a single service level versus the multiple service levels from some of the other providers. SoftLayer advertises the same 11 nines of durability as AWS. SoftLayer offers replication within the cluster — that local data center — but currently offers no geo-replication. For SoftLayer, there’s nothing built into the platform for encryption. You can, of course, leverage third-party partners and tools to do client-side encryption.

Block Storage

Object storage is great for the use case it is designed for: storing data as a self-contained “object” for later retrieval. However, if you need a more standard filesystem configuration (one that is POSIX compliant), then block storage is the appropriate choice.

AWS Elastic Block Storage (EBS)

  • Volume size: 1GB to 16TB (in 1GB increments)
  • Volume types:
    • Magnetic: 100 IOPS on average, bursting to several hundred IOPS (used mostly for storage/snapshotting)
    • General Purpose (SSD): 3 IOPS/GB up to 10,000 IOPS. Throughput limit of 128MB/sec, up to 160MB/sec on larger (> 170GB) volumes
    • Provisioned IOPS (SSD): Up to 20,000 IOPS/volume. Max throughput of 320MB/sec (when used with EBS-optimized instances)
  • Snapshots available across Availability Zones (AZs) but not regions
  • Encrypted EBS volumes of all types are supported

Most AWS users are using EBS in some shape or form. The volume sizes range from 1GB up to 16TB in 1 GB increments, so it’s very granular and lets you get up to a very large volume size. AWS has three different volume types which have gone through name changes over the years, but currently they are called Magnetic, General Purpose, and Provisioned IOPS.

Magnetic is just a typical old-school disk like the ones you were used to before SSD came on the scene. AWS cites 100 IOPS on average, bursting to several hundred IOPS, so this volume type is not appropriate for high-transaction workloads. It is best used for information that you want to store and snapshot but that doesn’t require quick access in terms of IOPS rate. Obviously Magnetic will be a bit cheaper than using the SSD-based block storage.

General Purpose is an SSD-backed storage mechanism. It provides 3 IOPS per GB with up to 10,000 IOPS. For a 3,334 GB (3.3TB) volume, you’re going to get 10,000 IOPS. And everything above that, up to 16TB, will still be at that max of 10,000 IOPS. The throughput ranges from 128MB/sec up to 160MB/sec. There’s some fancy math that goes into calculating the throughput, but to simplify it, for disks up to 170GB you have 128MB/sec throughput while with disks larger than that you get 160MB/sec.
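The General Purpose numbers above reduce to a short formula. This sketch encodes only the rules quoted in the text (3 IOPS/GB capped at 10,000, and a 128MB/sec vs. 160MB/sec throughput split at 170GB); real gp2 behavior has extra details such as burst credits that are omitted here:

```python
def gp2_performance(size_gb: int):
    """Approximate General Purpose (gp2) baseline IOPS and throughput
    from volume size, per the figures quoted in the text."""
    iops = min(3 * size_gb, 10_000)               # 3 IOPS/GB, capped at 10,000
    throughput_mb_s = 160 if size_gb > 170 else 128
    return iops, throughput_mb_s

# A 3,334GB volume hits the 10,000 IOPS ceiling; larger volumes stay there.
```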

The third and final type of EBS is Provisioned IOPS (PIOPS), which is also SSD based. You can get up to 20,000 IOPS/volume if you use PIOPS. You can also increase your max throughput up to 320MB/sec if you use EBS Optimized instances. Basically these are, as the name implies, instances that have been optimized on their backend to give faster access to the EBS infrastructure. There is a slight cost increase associated with these. Some of the newer instance types in AWS are EBS optimized by default, so there is no extra charge. But if you use some of the older legacy instance types with the EBS optimized options you have to pay a little more. It gives you a higher SLA, and you’re going to get more throughput than you’d otherwise expect to get from that Provisioned IOPS volume.

You can snapshot all of these EBS volume types, and those snapshots are available across AZs. So if you have a volume in us-east-1a, for example, and you snapshot it, you can access that snapshot anywhere in US-East (us-east-1a through us-east-1e), but that snapshot is not available in US-West. Snapshots are not available across regions automatically, although AWS does provide the tools to copy your snapshots around; that’s something you need to do yourself.
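With `boto3`, for example, the cross-region copy is an explicit `copy_snapshot` call made against the destination region. The sketch below only assembles the parameters, with a placeholder snapshot ID, and does not make the call:

```python
# Parameters for copying an EBS snapshot from us-east-1 to us-west-2.
# With boto3 you would run this against an EC2 client created in the
# *destination* region:
#   ec2_west = boto3.client("ec2", region_name="us-west-2")
#   ec2_west.copy_snapshot(**copy_args)
copy_args = {
    "SourceRegion": "us-east-1",
    "SourceSnapshotId": "snap-0123456789abcdef0",   # placeholder ID
    "Description": "cross-region copy of web-tier volume",
}
```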

You can encrypt all types of EBS volumes, whether Magnetic, General Purpose, or PIOPS. There’s an option to encrypt when you create a volume and you have the same choices to have AWS manage the keys or do it yourself. The encryption is not done on your VM, so there is no CPU load on your VM to do that encryption.

Google Block Storage (Persistent Disk, “PD”)

  • Volume size: 1GB to 10TB
  • Volume Types:
    • HDD (standard magnetic).
      • IOPS: Up to 3,000 read IOPS/15,000 write IOPS
      • Throughput: 180MB/sec read, 120MB/sec write
    • SSD
      • IOPS: Up to 15,000 IOPS.
      • Throughput: Up to 240MB/sec
  • Snapshots available across all datacenters in the zone, but not across regions
  • All data encrypted in-flight and at-rest by default on all volumes

Volume sizes range from 1GB to 10TB as opposed to 16TB for AWS. Google has two different types: HDD and SSD.  HDD is magnetic storage and is advertised for up to 3,000 read IOPS and 15,000 write IOPS with throughput of 180MB/sec for read and 120MB/sec for write. SSD provides up to 15,000 IOPS and a throughput up to 240MB/sec.

Similar to AWS, snapshots are available across all data centers in the zone but not across regions. You can copy snapshots across regions if you want. By default, Google Persistent Disk, like object storage, encrypts data by default both in-flight and at-rest.
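The two disk types correspond to GCP’s `pd-standard` and `pd-ssd` names. A small lookup encoding the figures quoted above can help pick between them; note the text gives a single 240MB/sec figure for SSD, so applying it to both reads and writes is an assumption here:

```python
# Persistent Disk performance figures as quoted above.
PD_TYPES = {
    "pd-standard": {"read_iops": 3_000,  "write_iops": 15_000,
                    "read_mb_s": 180, "write_mb_s": 120},
    "pd-ssd":      {"read_iops": 15_000, "write_iops": 15_000,
                    "read_mb_s": 240, "write_mb_s": 240},  # write figure assumed
}

def pd_type_for(required_read_iops: int) -> str:
    """Pick the cheaper magnetic type when its read-IOPS ceiling
    suffices; otherwise fall back to SSD."""
    if required_read_iops <= PD_TYPES["pd-standard"]["read_iops"]:
        return "pd-standard"
    return "pd-ssd"
```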

Azure Block Storage

  • Volume size: 1GB to 1TB (significantly smaller on the high end because of how Azure does the back end)
  • Implemented as “Page Blobs.” Reads/writes translated to GETs/PUTs on backend
  • Volume Types:
    • Standard storage
      • IOPS: 500 IOPS/attached disk
      • Throughput: 60MB/sec
    • Premium storage: SSD based (only available to Azure Virtual Machines, not other services)
      • IOPS: Up to 80,000 IOPS
      • Throughput: 2,000MB/sec
  • Snapshots replicated across multiple data centers in the zone, with option for cross-region replication
  • All data encrypted in-flight and at-rest via Azure Encryption Extensions

Azure volume sizes, from 1GB to 1TB, are significantly smaller on the high end because of how Azure implements the back end: as page blobs rather than block blobs. You can create a POSIX-compliant file system on top, and its reads and writes (freads and fwrites) get translated to GETs and PUTs on the back end. This approach allows Azure to use the same backend infrastructure for all of its storage, but because it uses page blobs, you’re limited to a 1TB volume size.
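Concretely, a filesystem-style read becomes a ranged HTTP GET. This sketch shows the translation a page-blob-backed driver would perform, using standard HTTP range semantics; the blob URL is a placeholder, and no request is sent:

```python
def range_get_for_read(blob_url: str, offset: int, length: int):
    """Translate a read(offset, length) into a ranged HTTP GET, the
    way a driver backed by page blobs would."""
    return {
        "method": "GET",
        "url": blob_url,
        # Standard HTTP Range header: inclusive byte range.
        "headers": {"Range": f"bytes={offset}-{offset + length - 1}"},
    }

# A 512-byte read at offset 4096 on a hypothetical blob URL:
req = range_get_for_read(
    "https://example.blob.core.windows.net/vol/disk.vhd",
    offset=4096, length=512)
# req["headers"]["Range"] is "bytes=4096-4607"
```

Writes work the same way in reverse, translated into PUTs against pages of the blob.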

Azure has two different volume types — Standard and Premium. With Standard storage you get 500 IOPS per attached disk and a throughput of about 60MB/sec, which is decent, but not wonderful. Premium storage though, is pretty screaming with up to 80,000 IOPS and amazingly 2,000MB/sec in throughput. It’s SSD based and it’s only available currently to Azure Virtual Machines so you can’t use it with other Azure services.

Azure also provides some additional snapshotting options. Snapshots are replicated across multiple data centers in the zone, but unlike with the other providers, you also have the option for cross-region replication with GRS. With Azure you can specify, for example, that when I snapshot this volume, I want three copies locally and three copies in another data center that is far, far away.

All data is encrypted in-flight and, again, you can do at-rest encryption with the Azure Encryption Extensions, but the CPU hit for that encryption is on your VMs.

SoftLayer Block Storage

  • Volume size: 20GB to 12TB
  • Volume types:
    • Endurance Storage:
      • IOPS: 0.25, 2.0, or 4.0 IOPS/GB, so up to 48,000 IOPS is possible
    • Performance Storage:
      • IOPS: Up to 6,000 IOPS. 100GB volume can support 6,000 IOPS
      • Need 1.5TB of Endurance for same IOPS rate
  • Snapshots replicated across multiple data centers in the zone with option for cross-region replication (Endurance only)
  • Encryption requires third-party tools and/or customer implementation

SoftLayer Block Storage offers volume sizes of 20GB to 12TB, so the smallest size is a little bigger than other cloud providers if that matters to you. SoftLayer has two different volume types — Endurance and Performance. Endurance storage provides either 0.25, 2.0, or 4.0 IOPS per GB. So if you took a 12TB volume, up to 48,000 IOPS is possible.

Performance storage goes up to only 6,000 IOPS. However, with Performance you can get that 6,000 IOPS at only 100GB, so everything at 100GB and above with Performance storage will do 6,000 IOPS. If you were using Endurance storage, you’d need a 1.5TB volume to achieve that same 6,000 IOPS rate. So Endurance is good for use cases that aren’t super-high transactional but that might need a really large volume; Performance is for highly transactional data that you need to read and write very quickly. The downside is that you can’t snapshot Performance storage. You can snapshot Endurance volumes, which will be replicated within the zone and also cross-region, but for Performance storage volumes, you have to take care of that yourself. SoftLayer customers can use Performance for databases where they are already replicating from a master to a slave, or they can leverage the slew of IBM third-party tools to do that data backup.
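The Endurance numbers above follow directly from the per-GB tiers. This sketch encodes only the rates quoted in the text:

```python
def endurance_iops(size_gb: int, tier_iops_per_gb: float) -> int:
    """IOPS for a SoftLayer Endurance volume at a given tier
    (0.25, 2.0, or 4.0 IOPS/GB per the text)."""
    return int(size_gb * tier_iops_per_gb)

# A 12TB volume at the top 4.0 IOPS/GB tier reaches 48,000 IOPS.
max_iops = endurance_iops(12_000, 4.0)

# Size needed to match Performance storage's 6,000 IOPS ceiling
# at that same top tier: 1,500GB, i.e. 1.5TB.
gb_needed = 6_000 / 4.0
```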

And similar to SoftLayer Object Storage, the encryption requires third-party tools or you need to do it yourself.

Cloud Storage Pricing

Although each cloud provider has similar offerings for storage, it’s difficult to get an exact apples-to-apples comparison because of the myriad differences. For pricing, the best approach is to determine the most appropriate option on each of the cloud providers you are considering and then apply the relevant pricing to see what the cost will be for each.
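One simple way to run that comparison is to price your own workload against each candidate’s per-GB rate. The rates below are placeholders purely for illustration, not real prices; plug in current figures from each provider’s pricing page, and remember that real bills also include request and egress charges:

```python
def monthly_object_storage_cost(gb: float, price_per_gb: float) -> float:
    """Flat per-GB monthly cost. Real pricing also tiers by volume and
    charges for requests and data transfer, which are omitted here."""
    return gb * price_per_gb

# Hypothetical per-GB-month rates, for illustration only:
rates = {"aws-s3": 0.030, "google-gcs": 0.026,
         "azure-grs": 0.048, "softlayer": 0.040}

# Cost of storing 500GB on each candidate, cheapest first.
costs = {cloud: monthly_object_storage_cost(500, rate)
         for cloud, rate in rates.items()}
cheapest = min(costs, key=costs.get)
```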

To get more detail on the topics I covered here, watch our on-demand webinar, Compare Clouds: AWS vs. Azure vs. Google vs. SoftLayer.