State of the Cloud Database: April 2018

State of the Cloud Database: April 2018

It's an exciting time to be a database professional. The technology is advancing quickly, large datasets are easier to handle than ever before, and the cloud is opening up new functionality and use cases that we couldn't tackle before.

My fellow database professionals, I tell you today that the state of the cloud database is strong!

Today we explore three of the platform as a service (PaaS) offerings available in Microsoft Azure. Sure, there is always the option to deploy a database solution on infrastructure as a service (IaaS) but that's not fun. You have to manage server and application patches, worry about disks, backups, and a whole host of other things. I mainly work with customers in the traditional Microsoft database world, so we are going to focus there. Azure now has MySQL, PostgreSQL, and NoSQL(Cosmos DB) managed services as well. The scope for this discussion will be the three Microsoft SQL options.

Azure SQL Database

Azure SQL Database was Microsoft's first entry into the database PaaS world and it has come a long way. In this model customers purchase a single database. Need to develop a new application that needs a database, you pay for that database. Need to migrate 4 databases to Azure, you buy 4 databases. Want to move that old Access database back end to something more managed, you buy a database. You get the idea, the service is purchased and managed at a per-database level. Sure, there is a SQL Server that goes along with those databases, but that is really just a logical server. You don't pay for the server and it has very few pieces of functionality that are tied back specifically to the individual databases.

What's been happening lately in Azure SQL Database?

Over the years the SQL Database service has gained a vast amount of functionality and now is very close in terms of functionality to SQL Server. You still won't find server level functionality like SQL Agent, but there is an answer for that now in Azure SQL Database Managed Instances. Last calendar year it was announced that Azure SQL Database would be getting more storage options. Traditionally, when you purchased a database it came with a set of storage to go along with the compute. Buy an S3 and get 250GB of storage. Need 300GB of storage, but only S3 compute? Well, you were out of luck and had to jump all the way to a P1 database with 500GB of storage. Thankfully that has all changed with the announcement of storage add-ons. Announced in August, this feature is currently still in preview. One complaint that has long accompanied the SQL Database service was the sizes available. The maximum size for a SQL Database was only expanded to 1TB in the last year and a half or so. Again, a new development on this front was announced in March to allow for up to 4TB of data. I would only expect to see this number keep rising over the next year. More recently, a pricing model change was made allowing the purchase of a SQL Database by vCore and memory rather than DTU. This additional option has been met with great applause for database professionals used to procuring servers and tracking performance in an on-prem data world. From a security and compliance standpoint the recent ability to bring your own key to Azure SQL for TDE is a great new development. This feature was announced in preview last year and became generally available in the last couple of weeks. Also on the security front is the ability to connect your database to a VNet allowing for an extra layer of security. Remove public IP address whitelisting on the firewall, hook up a VNet and all the traffic will need to flow through that VNet to get to your database. Finally, for users that need expanded backup retention functionality the integration with Long Term Backup Retention has only gotten better. The feature has technically been in preview since October 2016, but the experience continues to improve. Now, with direct portal integration it is easier than ever to hold onto your backups for up to 10 years at a very low cost.

Azure SQL Database Elastic Pools

I wanted to call our elastic pools, not as a separate offering but as a separate piece of functionality in Azure SQL Database. One of the main reasons for this is that it ties into the next topic, Azure SQL Database Managed Instance and could potentially be a point of confusion for some users. Elastic pools are simply another way to buy database compute. It doesn't really change the functionality of Azure SQL Database (at the core of the product anyway) but rather helps procure buckets of compute power as opposed to dedicated database compute power. What do I mean by this? Imagine you need 2 P1 databases, that's to say, 2 databases that each need to use 125 DTUs of compute power. You would pay $930/month ($465/database * 2 databases). Now, assuming the usage pattern for those databases falls into the use case for an elastic pool, you could instead purchase a "pool of 125 DTUs" at the price of $854/month and divide that up among up to 50 databases. The more P1 databases you could fit into that pool, the greater your savings. The reason I bring this up as a possible point of confusion is that we are not purchasing multiple databases for a single set price, but still based on the amount of compute and storage you need. Even thought we are buying sets of databases it still doesn't change the fact that each database is technically an individual, separate entity from the backend service perspective. The Azure SQL service doesn't see the 2 databases in our pool as an instance of SQL Server that is tied together and get us more functionality. We still don't have true cross database query functionality or any of the other features that come along at the instance level of SQL Server that have been mentioned in the previous section of this post. This model change from buying a database and the associated compute to buying a set of compute and associating it will databases that are still independent of one another. In the Azure portal those databases will still show up separately and will be managed separately from one another. So, what if I do need access to those "unsupported features" of Azure SQL? What if I want SQL Agent support, cross database queries, linked server support, or change data capture support? That brings us to our next topic and arguably the best new service for SQL to be introduced to date: Azure SQL Database Managed Instances.

Azure SQL Database Managed Instance

When Azure SQL Databases first launched users immediately wanted more features. What everyone realized they really wanted was a managed service that took care of Windows patches, SQL Server patches, was always running the latest version of SQL Server, handled backups, built in high availability, managed disaster recovery setup with just the click of the mouse, and...all of the functionality of SQL Server that they knew and loved. With Azure SQL Database they got all of that except the last part, they got a subset of functionality. That is all changing! On March 7 the Microsoft team announced it's latest offering in managed database: Azure SQL Database Managed Instance. This new offering provides everything that users have been asking for, a fully managed version of SQL Server that provides access to all the functionality of SQL Server. Unlike Azure SQL Database, this service is purchased at the instance level (as the name would suggest) which means you can put all the databases you want on there for the same price (purchasing compute and sharing among databases is where the similarities with Azure SQL Database Elastic Pools both begins and ends). Now users will get all the benefits of the full version of SQL Server like SQL Agent, database mail, service broker, native database restores (using a .bak file), and change data capture all while getting the benefits of managed service running in the cloud which opens up access to new cloud only features like threat detection. Best of all, if you have existing SQL Licenses that you no longer need on-prem or that you would like to leverage in the cloud instead you can take advantage of the Azure Hybrid Benefit for SQL Server. This allows you to use licenses with Software Assurance to bring down the cost of running SQL Server in the cloud.

Azure SQL Data Warehouse

Finally, we come to SQL Data Warehouse. Up to this point each service has been the SQL Server engine we all know and love. It is what is known as Symmetric Multi-Processing or SMP SQL. SQL Data Warehouse moves the SQL Server engine in to Massively Parallel Processing or MPP world. In this configuration a single control node is the brains of the operation and one or more compute nodes handle the heavy lifting of processing the data. Rather than scaling up, adding more memory or CPU to the server, this model allows us to scale out, adding more servers to separate the workload into smaller, more manageable chunks. If you have heard of or worked with Microsoft's Analytics Platform System (APS) then you will be right at home as this is the cloud version of that setup. Because SQL Data Warehouse is built largely on the same SQL platform many of the features of Azure SQL Database apply here as well. So things like bring your own key and VNet service endpoint improvements that have come to Azure SQL have also come to SQL Data Warehouse. Most recently, Azure Monitor support has been added allowing for a much more robust monitoring experience directly from the Azure portal. Some things that you used to have to go write queries for can now be surfaced in the portal reducing the roll your own code requirements. On the performance front the new Optimized for Compute tier allows the data and compute to sit closer to each other allowing for up to 10x faster queries, up to 5x compute power is available, and it unlocks unlimited columnstore. This compute tier is available now in public preview. Other miscellaneous improvements in the last year include the availability of replicated tables, introduction of identity columns, and increased PolyBase row size from 32kb to 1MB.

Wrap Up

All in all it has been a great year for Azure SQL PaaS database technology. We are absolutely in a better place today than we were a year ago and I believe the advancements in individual database technology combined with the introduction of managed instance provide a solid foundation for all development workloads from lift & shift with Managed Instance all the way to siloed agile development strategies and ISV applications with SQL Database. I'm really looking forward to the advancements the rest of 2018 and early 2019 bring to Azure SQL.