Your Azure SQL Database and Managed Instance is too big

Don’t buy for the future

With the cloud, it’s important to understand the concepts of scaling and elasticity. Your cloud services should be scaled at the right size to handle current workloads, with a small buffer. With Azure SQL Database, there is no need to buy that 32 vCore service, when all you need right now is 4. The rest will be available when you need them. You should be ready to scale up or out when the time is right but not before. With the cloud, you should shift your thinking from a CAPEX operating model to and OPEX operating model.

CAPEX vs OPEX

Capital Expenditure (CAPEX) describes where an up-front purchase is made to provide value into the future. e.g. A Server. Once it is purchased, you don’t need to buy it again and can use it until it outlives is usefulness. It’s a big upfront expense to provide long term value. For something like a database server, you might forecast what size you will need in 5 years and buy accordingly.

Operational Expenditure (OPEX) on the other hand describes ongoing costs and operating expenses. Items such as electrical power, rent and staff salaries come under OPEX. Pay As You Go cloud solutions fall under operational expenditure (OPEX) as they are ongoing costs. With this operating model, you should only pay for what you need, and adjust your scale as driven by demand, not future projections.

Resource utilisation and purchase models

DTU Model

Small to medium single databases can use the DTU purchase model which is a blend of cpu, memory and IO. They come with fixed storage per instance type. When monitoring performance, you should be looking mainly at DTU percentage used and storage percentage used.

If you are averaging above 85% and have regular peaks higher than that for DTU usage, you should scale up. You can adjust this figure based on how relaxed or nervous you are the DTU getting pinned at 100%.

If you are averaging below 50% and have few or no peaks above this, you can scale down. (Providing the lower class has enough storage) Again, adjust this to suit your own needs but there is no need to pay double unless you absolutely need to bake in a lot of spare capacity.

vCore

The vCore purchase model provides more flexibility than the DTU model. You can choose your compute and storage resources separately. There is no way to allocate extra memory however, this is bound to the number of vCores you have. Managed Instances only use the vCore purchase model.

I would apply a similar rule to scaling up or down here based on needs. Any instance using < 50% of CPU is a good candidate for scaling down.

Azure SQL Database Service Tiers

Ensure you are on the right service tier. Azure SQL Single DB comes with a choice of Basic, Standard and Premium.

Basic (DTU)

If you are on the basic service tier, then from a cost optimization process there is not too much more to say. It has the cheapest instance types, and the lowest resources so if it is doing a job for you, then there is no need to change. It only offers 7 days backup retention so be aware.

Standard (DTU)

A large number of workloads will be running on standard. If you are on the standard tier, you can scale down based on utilisation however you should bear a couple of things in mind.

S2 and below does not allow columnstore indexing
S1 instance types and below use HDD storage and this might have an impact on your query durations
- S2 and above uses SSD

Premium (DTU)

Moving to the premium service tier should be well thought out as it includes the most expensive instances, and may mean that you are over provisioned. If you are on a premium instance, look closely at your utilisation, particularly DTU percent used. For seemingly over provisioned instances, you should also take note of a couple of factors before making a decision. Premium is particularly good for IO intensive queries due to a higher average IOPS and a lower IO latency.

Scaling down from premium service tier means:

A decrease in average IOPS from 25 to (1-4) IOPS per DTU
An increase in IO latency from 2ms to 5ms
A new maximum database size of 1 TB

* Changing instance size but staying with premium service tier affects resources only

If you can handle the drop in instance size and/or reduction in service tier, I would recommend scaling down. Remember to keep a closer eye on newly changed instances, and scale back up if performance is no longer acceptable.

https://docs.microsoft.com/en-us/azure/azure-sql/database/service-tiers-dtu

General Purpose (vCore)

General purpose is the budget option but is suitable for a lot of production workloads. If you are currently on GP and you are happy with your performance, then do not change to a higher service tier.

Business Critical (vCore)

Most production workloads can run in Business Critical but you might be over provisioned. 1 read-only replica is built into the price so you can offload read-only workloads to your replica. It doesn’t come cheap though and is more than double the price of a General Purpose instance due to increased hardware and licensing. Scaling down to General Purpose will take away this replica so beware if you use it. A good option may be to scale to a lower Business Critical Instance, so that you still have a read replica built into the price.

Hyperscale (vCore)

Hyperscale is not currently available with Managed instance. (January 2021)

Your instances on Hyperscale are roughly double the cost of General Purpose but less than Business Critical. The main considerations are database size and log write throughput. On an exceptionally busy and large database, hyperscale is often the best choice.

Be aware of the reduced performance and features available if you think you can scale down to General Purpose. Changing tiers down from Hyperscale is not a native or easy task. It isn’t supported so you will need to export and import your data to a new database.

Scaling down to GP means:

Your database must be less than 4TB
You will move to premium remote storage, rather than local SSD
You cannot continue to use In-Memory OLTP
It’s a lot of work

https://docs.microsoft.com/en-us/azure/azure-sql/database/service-tiers-general-purpose-business-critical

Summary

The main thing is to regularly review your instances and make sure you are not paying for too much headroom. Make sure that by scaling down, you still have the functionality you need. If you scale down and performance is not what you expected, you should scale back up. Cost optimization shouldn’t come at the expense of performance. It is about ensuring that you are paying for what you need, and nothing more.

If you missed it, please post 1 in this cost optimization series, please have a read about Azure IaaS SQL Backups.