SQL Server on Azure: Why an IaaS VM?

Published by Marco Obinu on 30 December 201930 December 2019

In the previous article, I did an overview of different kinds of SQL Server implementations on Azure, introducing some of the advantages you can leverage while using a PaaS service, like a reduction in TCO of the solution or in administrative effort.

Even if the above advantages can sound promising, there are some limitations in these services that can drive you to the adoption of a SQL Server VM in IaaS. This latter approach requires effort both in the management of the deployed solution and, especially, in the design phase; on the other side, it allows you to use all the features you typically use on-premises and gives you full control over the deployment.

A SQL Server VM on Azure IaaS is ideal when:

You have to deal with legacy apps. PaaS backward compatibility is excellent, especially with the Managed Instances, but you can’t obtain full control over the system. If your app requires a specific SQL Server build or the vendor doesn’t support a PaaS implementation for the SQL stack, you can switch to a SQL VM to control all the configuration aspects of the instance and to use all the product features;
You need to support a hybrid scenario. If you want to extend an existing datacenter, on-premises or distributed in another cloud, for Business Continuity / Disaster Recovery scenarios, you need to use features that actually are supported on SQL VMs only;
You need a high-performance environment for your workloads. Although PaaS services are powerful, they still cannot obtain the same amount of virtual hardware you can assign to VMs.

The reasons shown above are some of the drivers that can lead you to IaaS. But there are some downsides: one of these is the complexity of the solution in terms of design and management.

How to choose a family VM?

One of the biggest challenges deriving from using VMs is the sizing phase. You need to deploy a performant VM to maximize the Return of Investment of the system. The billing model used by Azure is mainly based on consumption, where you pay for resources you use: you don’t want to break the bank while trying a brute force approach in solving performance issues, by adding more virtual hardware.

Sizing of a SQL VM on Azure is all about optimization, and this requires a good understanding of how things work under the hood.

Performance of a virtualized SQL Server instances are generally dependent on:

# of vCore assigned to the VM, and compute power of the CPU physically installed inside the host
Amount of RAM available
Storage performance and configuration
Network bandwidth and latency, especially in the context where data replication is used for HA

The above “numbers” can be obtained from the software vendor, or you measure them by execution an assessment of your existing infrastructure. You can use different tools to collect performance data: I usually use PAL to collect and analyze performance counter on the monitored system; you can also try to use to assess your on-prem environment.

Once determined the tech specs of the system, you need to define which Azure VM family and which size can satisfy your needs.

We have families of VMs specialized for different workload – optimized for CPU tasks, or storage, or memory, or GPU computing and so on – each one with a specific ratio between CPU and RAM assigned, storage capabilities and network throughput.

If more than a family can satisfy your needs, you should select the one which offers more compute power. With SQL VMs, the highest part of the bill is represented by licensing costs, which depends on the number of available cores: your primary goal is to obtain powerful cores, to contain their number and consequently the licensing fees.

You can determine which VM family has the highest compute performance by looking at the Azure Compute Unit indicator reported in the documentation; typically, at a higher ACU value correspond a more performant CPU.

Usually, your choice is between Memory-Optimized VMs with access to Premium Storage – DS_v2, ES_v3, etc. – that offer an optimal CPU:RAM ratio for SQL Server. If there’s a suitable size, I prefer to use DS_v2 VMs; this preference will probably change soon, in favor of new Easv4 VMs, based on AMD EPYC and generally available since Ignite 2019.

Now, it’s time for size definition

Once you determine the right family for your needs, you have to define the size of the machine; a bigger size corresponds to more CPU, more RAM and additional throughput for storage and network. These latter aspects are often underestimated, but they’re fundamental for a production system.

A production SQL Server solution deployed on Azure should only consider Premium SSD disks or Ultra Disks, that can deliver adequate throughput and latency for a top tier workload.

You should plan a storage subsystem composed of several disks able to deliver expected performance, with different cache settings depending on the kind of database file they host. Moreover, your VM size should be big enough to leverage the performance offered by the disks. It’s not so unusual to find VMs that can’t handle all the IOPS or the throughput the disks can deliver, and this usually corresponds in a waste of money in expensive drives that aren’t used efficiently.

On the networking side, it’s essential to have enough bandwidth and low latency, especially in the HA context that will be discussed in the next article. We can maximize results on this front by using a VM that supports Accelerated Networking, and by using Proximity Placement Groups for systems composed by more than one VM.

At this point, you should have defined family and size of the VM; now, it’s the turn for some configuration optimization at the operating system and SQL Server level. This topic is quite long to be discussed here; you can find an excellent article that summarizes all the best practices on the official documentation.

Sound complex, isn’t it? Well, it’s not simple, but it’s nothing prohibitive once you understand the basics of the sizing process. Additionally, you can also use DevOps tools, like ARM templates and PowerShell DSC, to automate the deployment process, reducing the complexity of deployment and standardizing your approach to the configuration of different environments.

To complete the big picture for SQL IaaS VMs, it’s necessary also to introduce HA and cost reduction topics. These will be treated in the next article, so please be patient!

See you again in a few days!