
Using Azure Cool Blob Storage with Veeam 9.5u3

“Hey, AKSysAdmin. I want to push all our backups to cheap Azure storage. Can you do a proof-of-concept for me and a quick cost write up?”

We are all eagerly awaiting the implementation of Veeam’s Scale-Out Backup Repository Archive Tier functionality in v10. The Archive Tier will allow Veeam customers to leverage “cheap” cloud storage like AWS’s S3 and Glacier and Azure’s rather hilariously named Cool Blob Storage. In the meantime, if you want to use Azure Blob Storage right now, what are your options?

  • A “middleware” appliance like NetApp’s AltaVault, Microsoft’s StorSimple or a VTL
  • Roll your own IaaS solution in Azure

The first option is pretty straightforward. You buy an appliance that provides a storage target for your on-prem Veeam Backup and Replication server and send your Backup Copy jobs to that Backup Repository. Once your backups are located there, “magic” happens that handles the hot/warm/cold tiering of the data out to Azure as well as the conversion from structured data to unstructured data.

The second option is a little more complicated. You’ll need to spin up an Azure IaaS VM, attach blob storage to it and make it usable to your on-prem Veeam infrastructure.


Before we go too much further we should probably talk about the different blob storage types.

Block Blobs

These are pretty much what they sound like: block-based storage of large contiguous files. They work great for things that are not accessed via randomized reads and writes. The individual blocks stored in each blob are referenced by a BlockID and can be uploaded/modified/downloaded simultaneously, assembled and then committed with a single operation. You can see how well this type of storage lends itself to streaming services where large files are split into smaller pieces and uploaded or downloaded sequentially. The maximum size (as of writing) for a block blob is about 4.75TBs.
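To make that concrete, here is a minimal sketch of the stage-then-commit workflow using the Microsoft.WindowsAzure.Storage .NET client from PowerShell; the DLL path, connection string, container and file names are all placeholder assumptions:

    # Block blob sketch: stage blocks by BlockID, then commit them in one operation.
    Add-Type -Path "C:\libs\Microsoft.WindowsAzure.Storage.dll"
    $account   = [Microsoft.WindowsAzure.Storage.CloudStorageAccount]::Parse($env:AZURE_STORAGE_CONNECTION_STRING)
    $container = $account.CreateCloudBlobClient().GetContainerReference("backups")
    $blob      = $container.GetBlockBlobReference("archive.vbk")

    $blockIds = @()
    $i = 0
    foreach ($chunk in Get-ChildItem "C:\staging\*.part") {
        # BlockIDs must be base64 strings of equal length before encoding
        $id = [Convert]::ToBase64String([BitConverter]::GetBytes($i)); $i++
        $stream = [IO.File]::OpenRead($chunk.FullName)
        $blob.PutBlock($id, $stream, $null)      # blocks can be uploaded in parallel
        $stream.Dispose()
        $blockIds += $id
    }
    $blob.PutBlockList([string[]]$blockIds)      # one commit assembles the blob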

Page Blobs

Page blobs are composed of 512-byte pages optimized for random read and write operations. Unlike block blobs, changes to pages are committed immediately. Page blobs work great for things like virtual disks where some other mechanism is organizing the data inside the blob, and they are the underlying storage for Azure IaaS data disks. The maximum size (as of writing) for a page blob is 8TBs.
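A comparable sketch for page blobs, reusing the $container reference from the sketch above; note the fixed size at creation and the 512-byte alignment requirement:

    # Page blob sketch: pre-size the blob, then write aligned pages in place.
    $pageBlob = $container.GetPageBlobReference("disk.vhd")
    $pageBlob.Create(1GB)                        # page blobs are created at a fixed size
    $pages  = New-Object byte[] 4096             # length must be a multiple of 512
    $stream = [IO.MemoryStream]::new($pages)
    $pageBlob.WritePages($stream, 0)             # random-offset write, committed immediately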

Azure Blob Storage Tiers: Hot, Cool and Archive

Azure Storage Accounts allow you to group all your various pieces of blob storage together for the purposes of management and billing. With Blob and General Purpose v2 Storage Accounts you can elect to use storage tiers. Cool Blob Storage has lower storage costs (and higher access costs) and is intended for things like short-term backup and disaster recovery data. Archive storage has even lower storage costs (and even higher access costs) and is designed for data that can tolerate hours of potential retrieval time. Archive storage is intended for long-term backups, secondary backup storage or data that has archival requirements. In order to read the data in archive storage, the blob needs to be rehydrated, which can take up to 15 hours. Blob size is also a factor in rehydration time.
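Once a block blob exists, flipping its tier is a one-liner. A hedged sketch using the Azure.Storage PowerShell cmdlets of that era; the account name, key, container and blob names are placeholders:

    # Move an existing block blob into the Archive tier; rehydrate later by
    # setting the tier back to Cool or Hot (and then waiting hours).
    $ctx  = New-AzureStorageContext -StorageAccountName "mybackups" -StorageAccountKey $key
    $blob = Get-AzureStorageBlob -Container "gfs" -Blob "2018-01-full.vbk" -Context $ctx
    $blob.ICloudBlob.SetStandardBlobTier("Archive")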

I should mention that the option to have your blobs stored in locally redundant storage (LRS) or geo-redundant storage (GRS) exists for all of these flavors.


This is all great but how do I use it?

Well, if you go with the first option, you break out your wallet for a capital purchase and follow Veeam’s Deployment Guide for AltaVault (or the vendor equivalent).

The second option is a little more involved. You need to deploy an instance of Veeam’s Cloud Connect for the Enterprise, add some data disks to the resulting Azure IaaS VM, configure them in Windows, set up a Backup Repository using them and finally add the resulting repository to your on-prem install as a Cloud Backup Repository. For the price of the IaaS VM and the underlying storage you now have a cloud-based backup repository using Azure blob storage.
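For a sense of what the “add some data disks” step looks like, here is a rough sketch using the AzureRM-era cmdlets; the resource group, VM name and VHD URI are placeholders:

    # Attach an empty ~4TB unmanaged data disk (a page blob VHD) to the Cloud Connect VM.
    $vm = Get-AzureRmVM -ResourceGroupName "veeam-rg" -Name "veeam-cc01"
    $vm = Add-AzureRmVMDataDisk -VM $vm -Name "repo-disk0" -Lun 0 -Caching None `
            -DiskSizeInGB 4095 -CreateOption Empty `
            -VhdUri "https://mybackups.blob.core.windows.net/vhds/repo-disk0.vhd"
    Update-AzureRmVM -ResourceGroupName "veeam-rg" -VM $vm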

Here’s why you probably don’t want to do this.

Veeam will support Azure Cool Blob storage fairly soon, so you have to ask yourself if it makes sense to buy a purpose-built “middleware” appliance to bridge the gap. A few years ago it would have been a no-brainer, but with more and more backup vendors supporting cloud storage natively, it seems like the market for these devices will shrink.

The second option has some issues as well. Your freshly created Cloud Backup Repository is backed by Azure IaaS data disks, which sit on top of page blob storage. Guess what page blobs don’t support? Storage tiers. If you create a storage account in the cool tier, you’ll notice the only container option you have is for block blobs. And if you try to add a data disk to your IaaS VM using a Blob storage account, the portal throws an error:

Not going to work.

What if you set up an Azure File Storage container and used it instead of a data disk? Same problem. Only block blob storage supports archiving tiers at this point in time.

What if you just provisioned extra data disks for your VM and used Storage Spaces and ReFS to get your storage? Well, that will sort of work (see the sketch after this list), but there are many limitations:

  • Data disks are limited to 4TBs
  • Most IaaS VMs only support 15 data disks
  • If you need more than 15 data disks your IaaS VM is going to get really expensive
  • You have to correctly manage and configure a VM with 15 disks using Storage Spaces
  • All your disks are running on page blob storage which is not really that cheap
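For the curious, the in-guest disk wrangling might look something like this sketch, which simply pools whatever data disks can be pooled; the pool and volume names are placeholders:

    # Pool every attachable data disk, carve one big virtual disk and format it
    # ReFS for use as a Veeam repository.
    $disks = Get-PhysicalDisk -CanPool $true
    New-StoragePool -FriendlyName "VeeamRepo" -StorageSubSystemFriendlyName "Windows Storage*" -PhysicalDisks $disks
    New-VirtualDisk -StoragePoolFriendlyName "VeeamRepo" -FriendlyName "RepoVD" `
        -ResiliencySettingName Simple -UseMaximumSize |
      Get-Disk | Initialize-Disk -PartitionStyle GPT -PassThru |
      New-Partition -UseMaximumSize -AssignDriveLetter |
      Format-Volume -FileSystem ReFS -NewFileSystemLabel "VeeamRepo"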

The “roll-your-own-IaaS” solution will be performance- and capacity-limited right out of the gate. It will be complicated and potentially brittle, and it doesn’t take advantage of the pricing of storage tiers, making it rather pointless in my opinion.

Why you still may want to do this

If the backup dataset you want to archive is fairly small, this might still make sense. But if that’s the case, I would forgo the entire exercise of trying to cram a round peg into a square hole and look very seriously at a DRaaS provider like Iland, where you will get so much more than just cloud storage for your backups at what will likely be a competitive price.

Why it’s probably not a good idea even if you still want to do it

Everything is elastic in the cloud except the bill, and unless you have an accurate picture of what you really need, you might be surprised when that bill arrives. There are a bunch of things that are not really accounted for in your traditional on-premise billing structure: IP addresses, data transfer between virtual networks, IOPS-limited performance tiers and so on. In short, there is a lot more to the cost analysis than just comparing the cost of storage.

Speaking of which, let’s take a look at the current storage prices and see if they really are “cheap”. These prices are based on the Azure Storage Overview pricing for the WestUS2 region of Azure ComCloud.

Standard Page Blobs (Unmanaged Disks)

LRS             ZRS    GRS             RA-GRS
$0.045 per GB   N/A    $0.06 per GB    $0.075 per GB

This also comes with a $0.0005 per 10,000 transactions charge when Standard Page Blobs are attached to a VM as an Unmanaged Disk.


Block Blob Pricing

                          Hot              Cool            Archive
First 50 TB / month       $0.0184 per GB   $0.01 per GB    $0.002 per GB
Next 450 TB / month       $0.0177 per GB   $0.01 per GB    $0.002 per GB
Over 500 TB / month       $0.017 per GB    $0.01 per GB    $0.002 per GB

There are also some operational charges and data transfer costs:

                                                     Hot       Cool      Archive
Write operations (per 10,000)                        $0.05     $0.10     $0.10
List and create container operations (per 10,000)    $0.05     $0.05     $0.05
Read operations (per 10,000)                         $0.004    $0.01     $5
All other operations except Delete, which is free    $0.004    $0.004    $0.004
Data retrieval (per GB)                              Free      $0.01     $0.02
Data write (per GB)                                  Free      $0.0025   Free


To replace our rather small GFS tape set we’d need somewhere north of 100TBs. The first problem: thanks to the limitation requiring page-blob-backed data disks, we can’t even meet our capacity requirements (4TBs per data disk × 15 data disks per IaaS VM = 60TBs).

Putting the capacity issue aside, let’s look at a notional cost just for comparison’s sake: 100TBs * 1024 = 102,400 GBs * $0.045 = $4,608 per month. This doesn’t include the cost of the IaaS VM and associated infrastructure you may need (IP addresses, virtual networks, Site-to-Site VPN, etc.) nor any of the associated transaction charges.

The storage charge is more than expected since we’re not really using the technology as intended. Block blob storage in the archive tier gets us a much more respectable number: 100TBs * 1024 = 102,400 GBs * $0.002 = $204.80 per month. BUT we need to factor in the cost of some kind of “middleware” appliance to utilize this storage, so tack on an extra $40-$60k (it’s hard to pin this cost down since it will come via a VAR, so I could be totally off). If we “op-ex” a $50k appliance over three years, that’s an additional $1,388.89 a month, bringing your total to $1,593.69 per month for “cheap” storage.
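If you want to check my napkin math, it scripts easily enough; the $50k appliance figure below is just the midpoint of my guess above:

    # Back-of-the-napkin monthly costs for ~100TBs, using the prices quoted above.
    $gb        = 100 * 1024          # 100TBs expressed in GBs
    $pageBlob  = $gb * 0.045         # unmanaged page blob storage, LRS
    $archive   = $gb * 0.002         # block blob storage, archive tier
    $appliance = 50000 / 36          # notional $50k middleware appliance over 3 years
    "Page blobs:          {0:C2} per month" -f $pageBlob
    "Archive + appliance: {0:C2} per month" -f ($archive + $appliance)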

OK. Looks like our “cheap” cloud storage may not be as cheap as we thought. Let’s take a look at our on-premise options.

LTO data tapes… personally I loathe them, but they have their place, particularly for archiving small GFS data sets. A 24-slot LTO-6 tape library like Dell’s TL2000 is around $20k, and 40 LTO-6 tapes with a raw capacity of 100TBs (not including compression) bring the total to about $602 per month over three years.

What about on-premise storage? A Dell MD1400 with 12 10TB 7.2K RPM NLSAS drives is somewhere in the $15-$20k range and brings 80TBs of usable storage in a RAID-60 configuration. Allocated out over three years, this comes to roughly $555 per month.

Summary

Technology choices are rarely simple, and no matter how much executive and sales folks push “cloud-first” like it’s some kind of magic bullet, cloud services are a technology like any other with distinct pros and cons, use cases and pitfalls. Getting an accurate picture of how much it will cost to shift a previously capital-expense-based on-premise service to cloud services is actually a fairly difficult task. There is a tremendous number of things that you get “included” in your on-premise capital purchases that you have to pay for every month once that service is in the cloud, and unless you have a good grasp on them you will get a much bigger bill than you expected. I really recommend SysAdmin1138’s post about the challenges of moving an organization to this new cost model if you are considering any significant cloud infrastructure.

If you want to use Azure Blob Storage right now for Veeam, the answer is: you can, but it’s not going to work the way you want, it’s probably going to cost more than you think, and you’re not really using the technology the way it was intended to be used, which is asking for trouble. You could buy a middleware appliance, but with Scale-Out Backup Repository Archive Tier functionality on the immediate horizon, that sounds like a substantial infrastructure investment with a limited return of business value. It might make sense to wait.

Finally, a little bit of a disclaimer. I tried to pull the pricing numbers from old quotes that I have (hence the LTO-6 and LTO-8 tapes) to keep the math grounded in something like reality. Your prices may vary wildly, and I highly encourage you to compare all the different cost options and spend some time trying to capture all of the potential costs of cloud services that may be hidden (i.e., it’s not just paying for the storage). Cloud services and their pricing are constantly changing too, so it’s worth checking with Microsoft to get these numbers from the source.

Until next time, stay frosty.

Scheduling Backups with Veeam Free and PowerShell

Veeam Free Edition is an amazing product. For the low price of absolutely zero you get a whole laundry list of enterprise-grade features: VeeamZip (Full Backups), granular and application-aware restore of items, native tape library support and direct access to NFS-based VM storage using Veeam’s NFS client. One thing that Veeam Free doesn’t include, however, is a scheduling mechanism. We can fix that with a little bit of PowerShell that we run as a scheduled task.

I have two scripts. The first one loads the Veeam PowerShell Snap-In, connects to the Veeam server, gets a list of virtual machines and then backs them up to a specified destination.

 
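A minimal sketch of what such a script might look like; the server name, destination folder and retention value are placeholders:

    # Load Veeam's snap-in, connect, and VeeamZip every VM to a local folder.
    Add-PSSnapin VeeamPSSnapin
    Connect-VBRServer -Server "localhost"

    $destination = "D:\Backups"
    $vms = Find-VBRViEntity -VMsAndTemplates | Where-Object { $_.Type -eq "Vm" }

    foreach ($vm in $vms) {
        # -AutoDelete expires the VeeamZip restore points automatically
        Start-VBRZip -Entity $vm -Folder $destination -Compression 5 -AutoDelete "In1Week"
    }

    Disconnect-VBRServer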

I had Veeam set up on a virtual machine running on the now-defunct HumbleLab. One of the disadvantages of this configuration is that I don’t have separate storage to move the resulting backup files onto. You could solve this by simply using an external hard drive, but I wanted something a little more… cloud-y. I set up Azure Files so I could connect to cheap, redundant and, most importantly, off-site, off-line storage via SMB3 to store a copy of my lab backups. The biggest downside to this is security. Azure Files is really not designed to be a full-featured replacement for a traditional Windows file server. It’s really more of an SMB-as-a-Service offering designed to be programmatically accessed by Azure VMs. SMB3 provides transit encryption, but you would still probably be better off using a Site-to-Site VPN between your on-prem Veeam server and a Windows file server running as a VM in Azure, or by using Veeam’s Cloud Connect functionality. There’s also no functionality replacing or replicating NTFS permissions. The entire “security” of your Azure Files SMB share rests in the storage key. This is OK for a lab but probably not OK for production.

Here’s the script that fires off once a week and copies the backups out to Azure Files. For something like my lab it’s a perfect solution.

 
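A minimal sketch of that weekly copy job; the storage account, share name, key and paths are placeholders:

    # Mount the Azure Files share over SMB3 and copy the newest backups out.
    $share = "\\mybackups.file.core.windows.net\veeam"
    $key   = "<storage account key>"

    net use Z: $share /user:AZURE\mybackups $key

    # Grab this week's VeeamZip files and push them to the share
    Get-ChildItem "D:\Backups\*.vbk" |
        Where-Object { $_.LastWriteTime -gt (Get-Date).AddDays(-7) } |
        Copy-Item -Destination "Z:\"

    net use Z: /delete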

Until next time, stay frosty!

Don’t Build Private Clouds? Then What Do We Build?

Give Subbu Allamaraju’s blog post Don’t Build Private Clouds a read if you have not yet. I think it is rather compelling but also wrong in a sense. In summation: 1) Your workload is not as special as you think it is, 2) your private cloud isn’t really a “cloud” since it lacks the defining scale, resiliency, automation framework, PaaS/SaaS and self-service on-demand functionality that a true cloud offering like AWS, Azure or Google has and 3) your organization is probably doing a poor job of building a private cloud anyway.

Now let’s look at my team – we maintain a small Cisco FlexPod environment – about 14 ESXi hosts, 1.5TBs of RAM and about 250TBs of storage. We support about 600 users and I am primary for the following:

  • Datacenter Virtualization: Cisco UCS, Nexus 5Ks, vSphere, NetApp and CheckPoint firewalls
  • Server Infrastructure: Platform support for 150 VMs, running mostly either IIS or SQL
  • SCCM Administration (although one of our juniors has taken over the day to day tasks)
  • Active Directory Maintenance and Configuration Management through GPOs
  • Team lead responsibilities under the discretion of my manager for larger projects with multiple groups and stakeholders
  • Escalation point for the team, point-of-contact for developer teams
  • Automation and monitoring of infrastructure and services

My day-to-day consists of work supporting these focus areas – assisting team members with a particularly thorny issue, migrating in-house applications onto new VMs, working with our developer teams to address application issues, existing platform maintenance, holding meetings talking about all this work with my team, attending meetings talking about all this work with my managers, sending emails about all this work to the business stakeholders and a surprising amount of tier-1 support (see here and here).

If we waved our magic wand and moved everything into the cloud tomorrow, particularly into PaaS where the real value to cost sweet spot seems to be, what would I have left to do? What would I have left to build and maintain?

Nothing. I would have nothing left to build.

Almost all of my job is working on back-end infrastructure, doing platform support or acting as a human API/”automation framework”. As Subbu states, I am a part of the cycle of “brittle, time-consuming, human-operator driven, ticket based on-premises infrastructure [that] brews a culture of mistrust, centralization, dependency and control”.

I take a ticket saying, “Hey, we need a new VM,” and I run some PowerShell scripts to create and provision said new VM in a semi-automated fashion, then I copy the contents of the older VM’s IIS directory over. I then notice that our developers are passing credentials in plaintext back and forth through web forms and .XML files between different web services, which kicks off a whole week’s worth of work to re-do all their sites in HTTPS. I then set up a meeting to talk about these changes with my team (cross training) and, if we are lucky, someone upstream actually gets to my ticket and these changes go live. This takes about three to four weeks, optimistically.

In the new world our intrepid developer tweaks his Visual Studio deployment settings and his application gets pushed to an Azure WebApp which comes baked in with geographical redundancy, automatic scale-out/scale-up, load-balancing, a dizzying array of backup and recovery options, integration with SaaS authentication providers, PCI/OSI/SOC compliance and the list goes on. This takes all of five minutes.

However, here is where I think Subbu gets it wrong: of our 150 VMs, about 50% belong to those “stateful monoliths”. They are primarily composed of line-of-business applications with proprietary code bases that we don’t have access to, or legacy applications built on things like PowerBuilder that no one understands anymore. They are spread out across 10 to 20 VMs to provide segmentation but have huge monolithic database designs. It would cost us millions of dollars to refactor these applications into designs that could truly take advantage of cloud services in their PaaS form. Our other option would be cloud-based IaaS, which is not that different from the developer’s perspective than what we are currently doing, except that it costs more.

I am not even going to touch on our largest piece of IT spend, a line-of-business application with “large monolithic databases running on handcrafted hardware” in the form of an IBM z/OS mainframe. Now our refactoring cost is in the tens of millions of dollars.


If this magical cloud world comes to pass what do I build? What do I do?

  • Like some kind of carrion lord, I rule over my decaying infrastructure and accumulated technical debt until everything legacy has been deprecated and I am no longer needed.
  • I go full retar… err… endpoint management. I don’t see desktops going away anytime soon despite all this talk of tablets, mobile devices and BYOD.
  • On-prem LAN networking will probably stick around but unfortunately this is all contracted out in my organization.
  • I could become a developer.
  • I could become a manager.
  • I could find another field of work.


Will this magical cloud world come to pass?

Maybe in the real world, but I have a hard time imagining how it would work for us. We are so far behind in terms of technology and so organizationally dysfunctional that I cannot see how moving 60% of our services from on-prem IaaS to cloud-based IaaS would make sense, even if leadership could lay off all of the infrastructure support people like myself.

Our workloads aren’t special. They’re just stupid and it would cost a lot of money to make them less stupid.


The real pearl of wisdom…

“The state of [your] infrastructure influences your organizational culture.” Of all things in that post, I think this is the most perceptive, as it is in direct opposition to everything our leadership has been saying about IT consolidation. The message we have continually been hearing for the last year and a half is that IT Operations is a commodity service – the technology doesn’t matter, the institutional knowledge doesn’t matter, the choice of vendor doesn’t matter, the talent doesn’t matter: it is all essentially the same and it is just a numbers game to find the implementation that is the most affordable.

As a nerd-at-heart I have always disagreed with this position because I believe your technology choices determine what is possible (i.e., if you need a plane but you get a boat, that isn’t going to work out for you), but the insight here that I have never really deeply considered is that your choice of technology drastically affects how you do things. It affects your organization’s cultural orientation to IT. If you are a Linux shop, does that technology choice precede your dedication to continuous integration, platform-as-code and remote collaboration? If you are a Windows shop, does that technology choice precede your stuffy corporate culture of ITIL misery and on-premise commuter hell? How much does our technological means of accomplishing our economic goals shape our culture? How much indeed?


Until next time, keep your stick on the ice.