5 Reasons Your Data Warehouse Belongs in Microsoft Azure

By Marcus Crast

By now, you’ve probably heard chatter about Microsoft SQL Data Warehouse (SQL DW), but what is it? And why should you be using it?

Well, to start, it’s a cloud-based, scale-out database capable of processing massive volumes of data—both relational and non-relational. Built on massively parallel processing (MPP) architecture, it’s a beast more than capable of handling your enterprise workload. SQL DW leverages the Microsoft Azure platform, is simple to deploy and gives you the ability to scale compute while keeping your budget under control.

Essentially, SQL DW includes all the architecture and technical goodies I just mentioned, but it presents itself to the developer/user in the same way that Azure SQL Database does. As such, Azure SQL Database is the latest version of SQL Server fully managed by Microsoft.

You can check out Microsoft’s product page to get into the minutia, but for now, let’s talk about the many benefits of SQL DW migration:

The Price is Right

Let’s face it, cost is a huge factor when shopping for relational database storage (and storage in general). But why waste your money running compute on datasets you’re not using? Because SQL DW has a decoupled storage and compute model, it’s solely based on how fast you’ll need to read and write data.

For instance, let’s say you’re loading billions of rows into a database—a process that could take days upon days. But with SQL DW, you can essentially “turn a knob” to speed up load time allowing you to populate and run queries faster. You can even “pause” compute, giving you even more freedom to better manage your cloud costs.

To (Nearly) Infinite Storage and Beyond

Just the DBA work alone in managing a VLDB (Very Large Database) 500GBs or greater is not a task for the faint of heart, not to mention the hardware cost and finding someone qualified enough to make the hardware hum in a cost-effective manner. These massive databases must be spread out with multiple files in the background, so there’s an art and science to determining optimal file layout—a task you need not worry about with SQL DW.

The service does allow you to determine per table how data is distributed across the nodes of the MPP architecture. This doesn’t guarantee distribution will be used or how many nodes it needs; only the data content can determine that.

Let’s look at it this way, if you tell seven members of your team that they don’t have to think about their database file management again—that they don’t have to worry about storage failure, dying network cards or file bloating—you’re going to free up that team (and budget) to work on projects that actually matter.

Scale Compute Faster Than a Speeding Bullet

You can provision a data warehouse in SQL DW in just minutes and scale your compute in seconds to keep up with demand. Unlike other cloud services that can require significant time to scale up, SQL DW truly delivers on the promise of cloud elasticity, making it ideal for batch-based data warehouse workloads.

The fact that compute scales almost instantaneously means that you can use SQL DW to service customers and scale the data tier on-demand—completely independent of storage.

cost-comparison-dashboard
In this sample dashboard image, using an elastic compute and a start-and-stop approach with SQL DW could result in annual savings of $14,999 to $73,886. (click to enlarge)

For example, 10th Magnitude has a client that runs premium tier from 9 a.m. to 5 p.m. and then scales back to standard tier when they don’t need the workload power. Pretty convenient, right?

A True Blue MPP as PaaS

SQL DW is, for all intents and purposes, Platform as a Service (PaaS), so there’s no infrastructure to support and barely any configuration to manage. By combining MPP (Massively Parallel Processing) architecture and Azure storage capabilities, SQL DW can:

  • Grow or shrink storage independent of compute
  • Grow or shrink compute without moving data
  • Pause compute capacity while keeping data intact
  • Resume compute capacity at a moment’s notice

Basically, queries are parallelized and SQL DW can scale up and down the amount of parallelization and compute power. It also scales into the 100TB-plus range in terms of data volume, making it more useful for very large or quickly growing workloads. What’s not to like there?

I Like Big Data and I Cannot Lie

SQL DW supports a direct query model to Azure Blob Storage, allowing you (big) Big Data insights. This means that you can store things like IoT telemetry (data) in storage blobs and then use Transact-SQL (T-SQL) constructs to query that IoT data in the DW with a view or application query.

Are you thinking to yourself, “does not compute?” In essence, built into SQL DW is Microsoft’s blazing-fast PolyBase technology that simplifies and enables distributed analytics, allowing you to run a single T-SQL query using common tools over multiple data sources.

If you’re still not convinced you can improve your bottom line by scaling data storage more effectively in the cloud—without over-provisioning or over-paying—shoot me an email and let’s set up a call to chat.

To stay up-to-date on the latest and greatest in cloud, follow us on Twitter, too.

As always…keep calm and cloud on.