“Hey, AKSysAdmin. I want to push all our backups to cheap Azure storage. Can you do a proof-of-concept for me and a quick cost write up?”
We are all eagerly awaiting the implementation of Veeam’s Scale-Out Backup Repository Archive Tier functionality in v10. The Archive Tier functionality will allow Veeam customers to leverage “cheap” cloud storage like AWS’s S3 and Glacier and Azure’s rather hilariously named Cool Blob Storage. Once this feature is released Veeam will have native support for this functionality, but in the meantime, if you wanted to use Azure Blob Storage right now, what are your options?
- A “middleware” appliance like NetApp’s AltaVault, Microsoft’s StorSimple or StarWind’s VTL
- Roll your own IaaS solution in Azure
The first option is pretty straightforward. You buy an appliance that provides a storage target for your on-prem Veeam Backup and Replication server and send your Backup Copy jobs to that Backup Repository. Once your backups are located there, “magic” happens that handles the hot/warm/cold tiering of the data out to Azure as well as the conversion from structured data to unstructured data.
The second option is a little more complicated. You’ll need to spin up an Azure IaaS VM and attach storage to it and make it usable to your on-prem Veeam infrastructure in some capacity.
Before we go too much further we should probably talk about the different blob storage types.
Block blobs are pretty much what they sound like: block-based storage of large contiguous files. They work great for things that are not accessed via randomized reads and writes. The individual blocks stored in each blob are referenced by a BlockID and can be uploaded simultaneously, assembled and then committed with a single operation. You can see how well this type of storage lends itself to streaming uses where large files are split into smaller pieces and uploaded or downloaded sequentially. The maximum size (as of writing) for a block blob is about 4.75TBs.
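To make the block mechanics a bit more concrete, here is a minimal Python sketch of how a large file might be carved into blocks before upload. It does not talk to Azure at all; `plan_blocks` is a hypothetical helper of mine, and the two limits (50,000 blocks per blob, 100 MiB per block) are my assumption of the service limits at the time of writing, which is where the roughly 4.75TB ceiling comes from.

```python
import base64
import math

# Assumed service limits as of writing; check current Azure documentation.
MAX_BLOCKS = 50_000
MAX_BLOCK_BYTES = 100 * 1024**2  # 100 MiB


def plan_blocks(file_size, block_size=4 * 1024**2):
    """Return (block_id, offset, length) tuples that cover the whole file."""
    if block_size > MAX_BLOCK_BYTES:
        raise ValueError("block_size exceeds the assumed per-block limit")
    count = math.ceil(file_size / block_size)
    if count > MAX_BLOCKS:
        raise ValueError("too many blocks; use a larger block_size")
    plan = []
    for i in range(count):
        offset = i * block_size
        length = min(block_size, file_size - offset)
        # Block IDs are base64 strings; zero-padding keeps them all the same length.
        block_id = base64.b64encode(f"{i:06d}".encode()).decode()
        plan.append((block_id, offset, length))
    return plan


# Largest blob under the assumed limits, in TiB (a little over 4.75).
max_blob_tib = MAX_BLOCKS * MAX_BLOCK_BYTES / 1024**4
```

Each tuple in the plan could be uploaded independently and in parallel, and the ordered list of block IDs is what would be committed in the final single operation.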
Page blobs are composed of 512-byte pages optimized for random read and write operations. Changes to the pages require immediate commits unlike block blobs. Page blobs work great for things like virtual disks where some other mechanism is organizing the data inside the blob. Incidentally they are used as the underlying storage for Azure IaaS data disks. The maximum size (as of writing) for a page blob is 8TBs.
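The 512-byte page granularity means any write has to line up with page boundaries. A small Python sketch of that alignment arithmetic is below; `align_to_pages` is a hypothetical helper for illustration only, not part of any Azure SDK.

```python
PAGE_SIZE = 512  # bytes per page in a page blob


def align_to_pages(offset, length):
    """Expand a byte range so it starts and ends on page boundaries.

    Returns (aligned_offset, aligned_length).
    """
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    start = (offset // PAGE_SIZE) * PAGE_SIZE
    # Ceiling division to round the end of the range up to the next boundary.
    end = -(-(offset + length) // PAGE_SIZE) * PAGE_SIZE
    return start, end - start
```

A 100-byte change at offset 700, for example, ends up touching the whole page that starts at byte 512, which is why something else (a filesystem inside a virtual disk, say) normally organizes the data inside the blob.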
Azure Blob Storage Tiers: Hot, Cool and Archive
Azure Storage Accounts allow you to group all your various pieces of blob storage together for the purposes of management (and billing!). With Blob and General Purpose v2 Storage Accounts you can elect to use these storage tiers. Cool Blob Storage has lower storage costs (and higher access costs) and is intended for things like short-term backup and disaster recovery data. Archive storage has even lower storage costs and even higher access costs; it is designed for data that can tolerate hours of potential retrieval time, such as long-term backups, secondary backup storage or data that has archival requirements. In order to read data in archive storage the blob needs to be rehydrated, which can take up to 15 hours. Blob size is also a factor in how long that takes.
I should mention that the option to have your blobs stored in locally redundant storage (LRS) or geo-redundant storage (GRS) exists for all of these.
This is all great but how do I use it?
Well, if you went with the first option you break out your wallet for a capital purchase and follow Veeam’s Deployment Guide for AltaVault or the vendor equivalent. The second option is a little more involved. You need to deploy an instance of Veeam’s Cloud Connect for the Enterprise, add some data disks to the resulting Azure IaaS VM, configure them in Windows, set up a Backup Repository using them and finally add the resulting repository to your on-prem install as a Cloud Backup Repository. For the price of the IaaS VM and the underlying storage you now have a cloud-based backup repository using Azure blob storage.
Here’s why you probably don’t want to do this.
Veeam will support Azure Cool Blob storage fairly soon, so does it really make sense to buy a purpose-built “middleware” appliance to bridge the gap until that time comes? I think a few years ago it certainly made sense, but with more and more backup vendors supporting cloud storage natively it seems like the market for these devices will shrink.
Your freshly created Cloud Backup Repository is backed by Azure IaaS data disks which sit on top of page blob storage. Guess what page blobs cannot support? Storage Tiers. If you create a storage account in the cool tier you’ll notice the only container option you have is for blobs. If you try and add a data disk to your IaaS VM you get this:
What if you set up an Azure File Storage container and utilized it instead of a data disk? Same problem. Only blob storage supports archiving tiers.
- Data disks are limited to 4TBs
- Most IaaS VMs only support 15 data disks
- If you need more than 15 data disks your IaaS VM is going to get really expensive
- You have to correctly manage and configure a VM with 15 disks using Storage Spaces
- All your disks are running on page blob storage which is not really that cheap
In summary your solution will be performance and capacity limited right out of the gate, it will be complicated and potentially brittle and it doesn’t take advantage of the pricing of storage tiers making it rather pointless in my opinion.
Why you still may want to do this
If the backup dataset that you want to archive is fairly small this might still make sense, but if that’s the case I would forgo the entire exercise of trying to cram a round peg into a square hole and look very seriously at a DRaaS provider like Iland, where you will get so much more than just cloud storage for your backups for what will likely be a competitive price.
Why even if you still want to do this it’s probably not a good idea
Everything is elastic in the cloud except the bill, and unless you have an accurate picture of what you really need you might be surprised once you get that bill. There are a bunch of things that are not really accounted for in your traditional on-premise billing structure: IP addresses, data transfer between virtual networks, IOPS-limited performance tiers and so on. There’s more to doing the cost analysis than just comparing the cost of storage.
Speaking of which, let’s take a look at the current storage prices and see if they really are “cheap”. These prices are based on the Azure Storage Overview pricing for the WestUS2 region.
Standard Page Blobs (Unmanaged Disks)
| LRS | ZRS | GRS | RA-GRS |
|---|---|---|---|
| $0.045 per GB | N/A | $0.06 per GB | $0.075 per GB |
This also comes with a $0.0005 per 10,000 transaction charge when Standard Page Blobs are attached to a VM as an Unmanaged Disk.
Block Blob Pricing
| Storage capacity | Hot | Cool | Archive |
|---|---|---|---|
| First 50 terabyte (TB) / month | $0.0184 per GB | $0.01 per GB | $0.002 per GB |
| Next 450 TB / month | $0.0177 per GB | $0.01 per GB | $0.002 per GB |
| Over 500 TB / month | $0.017 per GB | $0.01 per GB | $0.002 per GB |
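As a rough illustration of how the capacity bands in that table play out, here is a small Python sketch that prices a given amount of block blob storage per month. I am assuming the bands are graduated (each band is billed at its own rate, like tax brackets) and that a TB is 1024 GB, as in the rest of this post; `graduated_monthly_cost` and the band lists are names I made up for the example.

```python
GB_PER_TB = 1024

# (band size in TB, price per GB per month); None means "everything above".
HOT_BANDS = [(50, 0.0184), (450, 0.0177), (None, 0.017)]
COOL_BANDS = [(None, 0.01)]
ARCHIVE_BANDS = [(None, 0.002)]


def graduated_monthly_cost(total_tb, bands):
    """Monthly storage cost in dollars, assuming graduated band pricing."""
    remaining = total_tb
    cost = 0.0
    for size_tb, price_per_gb in bands:
        in_band = remaining if size_tb is None else min(remaining, size_tb)
        cost += in_band * GB_PER_TB * price_per_gb
        remaining -= in_band
        if remaining <= 0:
            break
    return cost


hot_100tb = graduated_monthly_cost(100, HOT_BANDS)
cool_100tb = graduated_monthly_cost(100, COOL_BANDS)
archive_100tb = graduated_monthly_cost(100, ARCHIVE_BANDS)
```

This only covers the storage line item; the operation and data transfer charges come on top of it.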
There are also some operational charges and data transfer costs:
| Operation | Hot | Cool | Archive |
|---|---|---|---|
| Write Operations* (per 10,000) | $0.05 | $0.10 | $0.10 |
| List and Create Container Operations (per 10,000) | $0.05 | $0.05 | $0.05 |
| Read Operations** (per 10,000) | $0.004 | $0.01 | $5 |
| All other Operations (per 10,000), except Delete, which is free | $0.004 | $0.004 | $0.004 |
| Data Retrieval (per GB) | Free | $0.01 | $0.02 |
| Data Write (per GB) | Free | $0.0025 | Free |
To replace our rather small GFS LTO tape set we’d need somewhere north of 100TBs. The first problem is that being forced onto page blob backed data disks means we won’t even be able to meet our capacity requirements (4TBs per data disk, limited to 15 data disks per IaaS VM = 60TBs).
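The capacity ceiling is simple arithmetic, sketched here in Python using the limits quoted above. The variable names are mine, and the "VMs needed" figure is only a back-of-the-envelope number, not a recommendation to actually build it that way.

```python
import math

DISK_TB = 4        # maximum size of one data disk, per the limits above
MAX_DISKS = 15     # data disks supported by most IaaS VM sizes
REQUIRED_TB = 100  # what we would need to replace the tape set

max_capacity_tb = DISK_TB * MAX_DISKS
shortfall_tb = max(0, REQUIRED_TB - max_capacity_tb)
# How many fully loaded VMs it would take to hold the whole data set.
vms_needed = math.ceil(REQUIRED_TB / max_capacity_tb)
```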
Let’s look at a notional cost even if capacity wasn’t an issue. 100TBs * 1024 = 102,400 GBs * $0.045 = $4,608 per month. This doesn’t include the cost of the IaaS VM and associated infrastructure you may need (IP addresses, Virtual Networks, Site-to-Site VPN, etc.) nor any of the associated transaction charges.
The storage charge is more than expected since we’re not really using the technology as intended. Block blob storage in the archive tier gets us a much more respectable number: 100TBs * 1024 = 102,400 GBs * $0.002 = $204.80 per month. BUT we need to factor in the cost of some type of “middleware” appliance to utilize this storage, so tack on an extra $40-$60k (it’s hard to pin this cost down since it will come via a VAR, so I could be totally off). If we “op-ex” the $50k midpoint over three years it’s an additional $1,389 or so a month, bringing your total to about $1,594 per month for “cheap” storage.
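For anyone who wants to plug in their own numbers, the cloud-side math above can be reproduced with a few lines of Python. The $50,000 appliance price is my assumption (the midpoint of the $40-$60k guess), and the 36-month amortization is the same three-year window used throughout this post.

```python
GB_PER_TB = 1024
MONTHS = 36  # three-year amortization window

capacity_gb = 100 * GB_PER_TB

# Page blob backed data disks at the $0.045 per GB rate.
page_blob_monthly = capacity_gb * 0.045

# Archive tier block blobs at $0.002 per GB.
archive_monthly = capacity_gb * 0.002

# Assumed appliance price: midpoint of the $40-$60k guess.
appliance_price = 50_000
appliance_monthly = appliance_price / MONTHS

archive_total_monthly = archive_monthly + appliance_monthly

print(f"Page blob disks:     ${page_blob_monthly:,.2f} per month")
print(f"Archive + appliance: ${archive_total_monthly:,.2f} per month")
```

None of this includes the IaaS VM, networking or transaction charges, so treat both figures as a floor rather than an estimate of the real bill.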
LTO data tapes. Personally I loathe them, but they have their place, particularly for archiving GFS data sets that are small. A small LTO-6 tape library like Dell’s TL2000 is around $20k, and with 40 LTO-6 tapes for a raw capacity of 100TBs (not including compression) it comes to about $602 per month over three years.
What about on-premise storage? A Dell MD1400 with 12 10TB 7.2K RPM NLSAS drives is somewhere in the $15-$20k range and brings 80TBs of usable storage in a RAID-60 configuration. Allocated out over three years this comes to roughly $555 per month at the top of that price range.
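Since the options above do not all offer the same capacity, it can help to normalize them to a monthly cost per TB. The Python sketch below does that with the figures from this post; the $50,000 appliance price and the $20,000 MD1400 price are my assumptions within the quoted ranges, and the page blob row is notional because a single VM tops out at 60TBs.

```python
MONTHS = 36  # three-year amortization window

# name -> (monthly cost in dollars, capacity in TB)
options = {
    "page blob data disks": (4608.00, 100),  # notional; one VM caps at 60TBs
    "archive blob + appliance": (204.80 + 50_000 / MONTHS, 100),
    "LTO-6 library": (602.00, 100),  # raw capacity, no compression
    "MD1400 shelf": (20_000 / MONTHS, 80),  # usable capacity in RAID-60
}

per_tb = {name: monthly / tb for name, (monthly, tb) in options.items()}

# Cheapest option first.
ranked = sorted(per_tb, key=per_tb.get)

for name in ranked:
    print(f"{name:26s} ${per_tb[name]:6.2f} per TB per month")
```

It is a crude comparison (no power, cooling, admin time or egress charges), but it makes the gap between the options easier to see than the raw monthly totals do.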
The cloud is hard, and no matter how much executive and sales folks push “cloud-first” like it’s some kind of magic bullet, cloud services are a technology like any other with distinct pros and cons, use cases and pitfalls. Getting an accurate picture of how much it will cost to shift a previously capital-expense-based on-premise service to cloud services is actually a fairly difficult task. There is a tremendous number of things that you get “included” in your on-premise capital purchases that you have to pay for every month once that service is in the cloud, and unless you have a good grasp on them you will get a much bigger bill than you expected. I really recommend Sysadmin1138’s post about the challenges of moving an organization to this new cost model if you are considering any significant cloud infrastructure.
If you want to use Azure Blob Storage right now for Veeam the answer is: You can, but it’s not going to work the way you want, it’s probably going to cost more than you think and you’re not really using the technology the way it was intended to be used, which is asking for trouble. You could buy some middleware appliance, but with Scale-Out Backup Repository Archive Tier functionality on the immediate horizon this sounds like a substantial infrastructure investment that you’re only going to get a limited return of business value on. It might make sense to wait.
Finally, a little bit of a disclaimer. I tried to pull the pricing numbers from old quotes that I have (hence the LTO-6 and LTO-8 tapes) to try and keep the math grounded in something like reality. Your prices may vary wildly and I highly encourage you to compare all the different cost options and spend some time trying to capture all of the potential costs of cloud services that may be hidden (i.e., it’s not just paying for storage). Cloud services and their pricing are constantly changing too, so it’s worth checking with Microsoft to get these numbers from the source.
Until next time, stay frosty.