2020 blog – a year in numbers
Blog Performance
A few thoughts on my blog performance for 2020.
I have more than doubled my page views from 2019 to 2020 so I’m naturally delighted about that. At the time of writing, with 13 days of 2020 to go, I have had 16,773 page views. The average time on each page was 3 minutes and 53 seconds, which tells me that real people are reading the content. I wasn’t really tracking this all year so these numbers were a bit of a surprise to me. In the full year of 2019, I only had 7,018 page views with an average page viewing time of 3 minutes and 26 seconds.
Things are going quite well and I’ve decided to aim for 30,000 page views in 2021. Doubling the numbers again is not going to be easy, but I hope to achieve this with a steady flow of new content and by updating some old material. Breaking it down, it is a modest target of 82 page views per day, so it is achievable if I continue to add good content.
Regular Content
I’ve been trying to put out regular content, although it is not all on the same theme. I tend to find that when I solve an interesting problem, I tell myself that it would make an interesting blog post. So I’ve blogged on Azure, PowerShell, performance tuning and whatever the #tsql2sday topic was that month. My aim for 2021 is to be a bit more specialised, so you will see more posts on Azure SQL DB and Managed Instance, as well as performance tuning. These are areas I would like to work on more in 2021. Of course, I’m going to keep going with the #tsql2sday posts because they usually provide an interesting topic I wouldn’t otherwise think of.
Blog successes
One of my recent posts, on database mail with managed instances, has gained some traction. Although only published in October, it is already in my top 10 page views for the year. My feeling is that this is the sort of thing people are googling for now, the MS documentation isn’t amazing either, and it’s on trend.
One of my first ever blogs, about sending HTML emails using database mail, is at number 9. I find that hard to believe because when I see old dates on blog posts, I tend not to want to read them. I remember publishing this post and wondering if anyone would read it and, even worse, if I would be ridiculed for it.
My number 1 post for the 2nd year running was EC2 SQL Server Backups to Amazon S3. I haven’t read it for a while and I think it might be in need of a major update, so I plan to do that in the next few weeks.
My 2nd most popular post, also for the 2nd year running was How to create a table in AWS Athena. I used Athena a lot in my old job and I found the documentation difficult to understand so I put together my own instructions. I can’t believe that it is still one of my most popular posts and in fact, it grew from 768 views in 2019 to 2741 views in 2020. It’s a sign to me that AWS Athena is gaining a lot of popularity. The post is 3 years old though so I think it needs to be refreshed too.
Blog disappointments
I wrote a full video series about SQL Server on AWS RDS. It wasn’t a hit. In fact, it only accounted for 246 page views. The videos are on YouTube as well and have only had 237 views. The videos were all subtitled to help with accessibility, which took me a long time. My son also did some subtitling and he got some pocket money for helping out. I have to learn from this though: I didn’t appear in the videos, they were purely PowerPoint and demos, and when I think of the videos that I enjoy watching, they always have an engaging presenter.
I’m currently creating a new video series called Fundamentals of Azure SQL DB Managed Instances. I hope to learn from the mistakes of my RDS course, and that it will reach a wider audience.
IaaS++ (Azure SQL Server IaaS Agent Extension)
Most DBAs or cloud practitioners have seen a graph similar to this ⬇. It shows the trade-off in flexibility and responsibility between the different methods of adopting SQL in Azure. SQL on VMs gives you the most flexibility but also the most administrative work. SQL DB single instance handles almost all of the “heavy lifting” (things like backups, OS patching, installation etc), but gives you the least flexibility. Azure SQL DB managed instance lies somewhere in between. SQL on VMs is known as Infrastructure as a Service (IaaS). SQL DB (single DB or managed instance) is known as Platform as a Service (PaaS).
But now there is another option, called the SQL Server IaaS Agent extension. I think of it as IaaS++ as it extends your SQL VMs to give them some of that heavy lifting functionality that the PaaS offerings provide, whilst still allowing you full control over the instance.
What do you get with SQL Server IaaS Agent extension?
The two main items I will go into here are automated backups and automated patching. These come as standard on most PaaS products, with all cloud providers; it is only with the introduction of this “IaaS++” extension that you can now get them for SQL on VMs.
You can also configure storage, high availability, Azure Key Vault integration and R Services, as well as enabling a subscription-wide view of all your instances and license types; however, this post only focuses on automated backups and patching.
Real world scenarios
Patching
My client had fallen behind with patching and needed to ensure that important servers were patched regularly. Enabling automated patching meant that they could have only the important patches applied during an agreed window, and then look at other patches and cumulative updates when it suited them. They had a test environment that mirrored production, with a 3 node availability group cluster (automatic failover was enabled), so I was able to test the solution there before going anywhere near production. The plan was as simple as this:
- Add a 90 minute window at 12:00 for Server1
- Add a 90 minute window at 02:00 for Server2
- Add a 90 minute window at 04:00 for Server3.
This approach allowed 30 minutes at the end of each window for VMs to be restarted before the next VM’s window would start.
- Click on automated patching from the SQL Virtual Machine in Azure Portal.
- Update the toggles to set your patching window.
- Daily or weekly schedules can be chosen.
- If patches are applied, your VM will be restarted.
This approach allowed them to go from 44 outstanding patches to 4 across 3 servers without manual intervention. Failovers happened seamlessly. I’d just urge a word of caution with critical production systems, as this will restart your VMs. Are you ready for that? My advice is to get comfortable with it on non-prod systems before starting on production.
I think it’s a great feature. It’s not always possible to just go to Managed Instance so for those of us who need a full SQL install, this is a handy hybrid.
Backups
Another client was using the Ola Hallengren solution for managing backups. It’s the best solution out there when you need to configure your own backups, but what if your cloud provider will do it for you? This client also didn’t have an experienced DBA, so in this case it is better to let Microsoft do it. What’s more, you can configure a retention period of between 1 and 30 days to stop your storage costs from increasing indefinitely.
Before starting, make sure you don’t have your own backup solution running in parallel.
- Click on automated backups
- Configure the toggles to suit your needs
- Link it to a storage account
- Check your backups are working as expected and can be restored
These tasks can be automated as well using PowerShell or Azure CLI. I’ll maybe cover this in a future blog.
Popular posts on johnmccormack.it
Locks, blocks and deadlocks in SQL Server
Terminology matters: Locks, blocks and deadlocks
I’ve lost count of how many times people have told me there is deadlocking in the database, only to check and see no deadlocks have occurred. In this scenario, it is usually blocking they are trying to describe. As DBAs or developers, it is important to know the difference between locks, blocks and deadlocks.
Read on, or just skip to the video
What are SQL Server locks
Locks are essential for ensuring the ACID properties of a transaction. Various SELECT, DML and DDL commands generate locks on resources. For example, in the course of updating a row within a table, a lock is taken out to ensure the same data cannot be read or modified at the same time. This ensures that only data that is committed to the database can be read or modified. A further update can take place after the initial one, but they cannot be concurrent. Each transaction must complete in full or roll back; there are no half measures.
It should be noted that isolation levels can have an impact on the behaviour of reads and writes, but this is generally how it works when the default isolation level is in use.
Lock types
I don’t want to write a full post about lock types, mainly because the ultimate guide already exists, along with a matrix showing lock compatibility across all possible lock combinations. For a simple explanation of the basics:
- If data is not being modified, concurrent users can read the same data.
- As long as the isolation level is the SQL Server default (Read Committed)
- This behaviour changes however if a higher isolation level such as serializable is being used.
- If data is being modified, the select query will have to wait on acquiring the shared lock it needs to read data.
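You can see these locks for yourself by querying the sys.dm_tran_locks DMV from inside an open transaction. This is just a minimal sketch; dbo.DemoTable and its columns are hypothetical stand-ins for any table you have handy.

```sql
-- Take out a modification lock, then inspect the locks held by this session
BEGIN TRANSACTION;

UPDATE dbo.DemoTable
SET SomeColumn = 1
WHERE Id = 1; -- dbo.DemoTable is a hypothetical example table

SELECT resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID;
-- You should see IX locks at the object/page level and an X lock on the key or RID

ROLLBACK TRANSACTION;
```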
What is blocking
Blocking is the real world impact of locks being taken on resources and other lock types being requested which are incompatible with the existing lock. You need to have locks in order to have blocking. In the scenario where a row is being updated, the lock type of IX or X means that a simultaneous read operation will be blocked until the data modification lock has been released. Similarly, data being read blocks data from being modified. Again, there are exceptions to these based on the isolation level used.
Blocking then is a perfectly natural occurrence within SQL Server. In fact, it is vital to maintain ACID transactions. On a well optimised system, it can be hard to notice and doesn’t cause problems.
Problems occur when blocking is sustained for a longer period of time, as this leads to slower transactions. A typical connection timeout from a web app is 30 seconds, so anything above this leads to lots of exceptions. Even at 10 or 15 seconds, it can lead to frustrated users. Very long blocking can bring entire servers to a standstill until the lead blockers have cleared.
Identifying blocking
I simply use Adam Machanic’s sp_whoisactive stored procedure. You could use sp_who2 if you absolutely can’t use 3rd party scripts, but this proc is pure T-SQL so argue your case.
EXEC sp_whoisactive @find_block_leaders = 1
To kill or not to kill
Sometimes you may have no option but to kill spids in order to clear blocking but it is not desirable. I’m generally a bit happier killing a select query if it is causing blocking, because it won’t result in a DML transaction failing. It might just mean that a report or user query fails.
Multiple identical blockers
If you have multiple blockers and they are all similar or identical, it could mean that an end user is rerunning something that keeps timing out on the app layer. These app timeouts don’t correlate to SQL timeouts, so it can be the case that the user just keeps hitting F5, oblivious that this is making the problem worse. I’m a lot happier killing these spids, but it’s important to tell the end user where possible, so they don’t keep doing the same thing.
It could also be that a piece of code which is called regularly has regressed and no longer completes quickly. You’ll need to fix this or the blocking headache won’t go away.
What are deadlocks?
Deadlocks occur when two or more processes each hold a lock that the other needs, and each is waiting on the other to finish before it can move on. With a scenario like this, something has got to give or they will be in a stand-off until the end of time. They are resolved by SQL Server picking a victim, usually the least expensive transaction to roll back. This is like having one of your blocking queries automatically killed to get things moving again. It’s far from ideal, leads to exceptions and may mean that some data intended for your database never got there.
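If you want to see a deadlock for yourself, the classic recipe is two sessions updating the same two rows in the opposite order. A rough sketch, using hypothetical tables dbo.TableA and dbo.TableB, with each session run in its own SSMS window:

```sql
-- Session 1
BEGIN TRANSACTION;
UPDATE dbo.TableA SET Col1 = 1 WHERE Id = 1; -- locks a row in TableA
-- pause here until Session 2 has run its first update, then:
UPDATE dbo.TableB SET Col1 = 1 WHERE Id = 1; -- now waits on Session 2

-- Session 2
BEGIN TRANSACTION;
UPDATE dbo.TableB SET Col1 = 2 WHERE Id = 1; -- locks a row in TableB
-- then:
UPDATE dbo.TableA SET Col1 = 2 WHERE Id = 1; -- circular wait: deadlock
```

Within a few seconds, one session receives error 1205 and is rolled back as the deadlock victim, while the other completes.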
How to check for deadlocks
I like to use sp_blitzlock from Brent Ozar’s first responder kit. If I’m in firefighting mode, I’ll just check for the previous hour. You can also pick out deadlocks from the SQL Server Error Log, or you can set up extended events to capture them.
-- Deadlocks in last hour
DECLARE @StartDateBlitz DATETIME = (SELECT DATEADD(HH, -1, GETDATE())),
        @EndDateBlitz DATETIME = (SELECT GETDATE())

EXEC sp_BlitzLock @EndDate = @EndDateBlitz, @StartDate = @StartDateBlitz
Simulating blocking
If you want to simulate blocking, you can try this on the Wide World Importers database.
/* Run each of these, in order, in a different SSMS window. */

-- Query 1 (This naughty person went to lunch and never committed their update)
BEGIN TRANSACTION
UPDATE [WideWorldImporters].[Sales].[Customers]
SET CustomerName = 'SpinTail Toys (Head Office)'
WHERE CustomerID = 1
-- COMMIT
-- Only run the commit above after all the queries have been run and you have observed blocking. Query 2 will finish instantly.

-- Query 2 (I just want my select results, but there is an uncommitted transaction blocking me)
SELECT *
FROM [WideWorldImporters].[Sales].[Customers]
WHERE CustomerID = 1

-- Query 3 (Check the wait_info)
USE DBA
EXEC sp_whoisactive @find_block_leaders = 1
-- You should see a wait type of LCK_M_S on your select query. This means the thread is waiting to acquire a shared lock.
The image below shows the output of the 3 queries side by side. Query 1 completes quickly, but notice it is uncommitted. Query 2 will not complete until Query 1 is committed or rolled back. Running Query 3 (sp_whoisactive) lets you know which spids are causing the blocking and which are being blocked.
I’ve tried to keep the post on locks, blocks and deadlocks about the differences. I haven’t gone too technical with isolation levels, wait types or lock compatibility. The post is aimed at newer DBAs and developers to help them grasp the technology and understand the distinct purposes of locks, blocks and deadlocks.
PowerShell Splatting – What was wrong with backticks?
T-SQL Tuesday
This month’s T-SQL Tuesday is hosted by Lisa Griffin Bohm (b|t). Lisa asks: “This month, I’d like those of you who have presented, or written a presentation, to share something technical THAT DID NOT RELATE to the topic of the presentation, that you’ve learned in writing or giving the presentation.” I’m going to write about how I came across PowerShell splatting and how it made me better at PowerShell, despite presenting on a cloud topic.
PowerShell splatting
At DataScotland 2019, I did a presentation on AWS RDS for SQL Server. The technical content was about how RDS works, what you can do with it, how to provision it etc. As part of my demos, I decided to use AWS PowerShell commands. When I had given this presentation at previous events, I had used AWS CLI, so I had to update my code examples. I’m ok with PowerShell, but I’m not an expert. I just wanted to show that there were multiple ways to interface with AWS.
My code was full of backticks. You could say I was daft about backticks. I loved them and thought they made my PowerShell code readable because they stopped the lines from running off the monitor. Then someone asked me why I didn’t use PowerShell splatting. “Whatting?” I asked. I had never heard of splatting.
At the break, I spoke to a couple of people who were more experienced in PowerShell than me. They advised that PowerShell splatting was the way to go for large commands because they make the code more readable. More readable was definitely something I was interested in so I decided to go away and learn about splatting, and update my code for future events.
So what is PowerShell Splatting?
Rather than passing a long list of parameters into a command, you can create a variable in advance to hold these values. The variable is an array or a hash table and includes as many parameters as you need. If you need to pass in parameter names and values, use a hash table. If you just need to pass in a list of parameter values, use an array. Then, when you run the command, you simply pass in the hash table (or array) instead of all the individual parameters.
Example command before splatting
New-RDSDBInstance -dbinstanceidentifier "datascotland-posh" -region "eu-central-1" -VpcSecurityGroupId "sg-00a1234g567c3d4ab" `
    -allocatedstorage 20 -dbinstanceclass "db.t2.micro" -engine "sqlserver-ex" `
    -masterusername "rds_name" -masteruserpassword "secure_pw_here" -availabilityzone "eu-central-1a" `
    -port 50000 -engineversion "14.00.3049.1.v1"
Example command after splatting
$HashArguments = @{
    dbinstanceidentifier = "datascotland-posh"
    region = "eu-central-1"
    VpcSecurityGroupId = "sg-00a1234g567c3d4ab"
    allocatedstorage = 20
    dbinstanceclass = "db.t2.micro"
    engine = "sqlserver-ex"
    masterusername = "rds_name"
    masteruserpassword = "secure_pw_here"
    availabilityzone = "eu-central-1a"
    port = 50000
    engineversion = "14.00.3049.1.v1"
}
New-RDSDBInstance @HashArguments
At a glance
As you can see, the 2nd example, which uses splatting, is easier to read and you can pick out the value of each parameter at a glance. It was worth learning in order to make my code more readable and improve my overall PowerShell knowledge.
“Writing this post has just made me realise that I should update my RDS course ↙ as the examples in it don’t use splatting. 🤦♂️”