
3 things a new DBA should learn first

6th November 2019 By John McCormack

Why would I put a post together about the 3 things a new DBA should learn first? Well, I was asked by a colleague what I thought she should be focusing on learning and mastering as a newish SQL DBA. She was looking to become more experienced and confident in her role. Obviously this is an opinion piece, but I felt it was worth sharing. The 3 suggestions are not in any particular order, and of course not everyone will agree, but it's an opinion based on my own experience as a SQL DBA. (Feel free to disagree in the comments; I'm sure people will feel strongly about some items I've omitted.)

  • What is happening on the server right now
  • Backup and restore of SQL Databases
  • Scripting, i.e. T-SQL and PowerShell

What is happening on the SQL server right now

Help, the SQL Server is on fire and my really important process is not working. Why is the server so slow?

This scenario will happen to you. Invariably, the person (often a manager) will stand over you and expect you to knock out one or two lines of T-SQL wizardry to get things running smoothly again. First of all, I should say that in a perfect world, you would ask them to raise a ticket and you would work on it according to its priority against your other tasks. Then you could let them know what the issue was and what can be done to prevent it happening again. But we rarely work in a perfect world. In this scenario, you need one primary line of T-SQL to get started.

EXEC sp_whoisactive

The procedure sp_whoisactive doesn't come with SQL Server. It is a community script created by Adam Machanic, and you can download it from GitHub. If you don't have it installed on your SQL Servers, it is something you should really consider, as it gives much more useful and readable information than sp_who2 and it's a lot easier than pulling together your own code. It's mature, very safe and has been installed on SQL Servers worldwide since 2007.

sp_whoisactive offers loads of optional parameters that allow you to customise and sort the output according to your own preference, but just running the proc on its own without parameters will give you a list of everything that is executing at that point in time, ordered by duration descending. If you see things running for minutes that usually take seconds, maybe you need to see if they are blocking other transactions.
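
As a quick illustration (these parameter names come from the sp_whoisactive documentation, but treat the exact values as a sketch rather than a recommendation), you can pull in estimated query plans and change the sort column:

-- Include query plans in the output and sort by CPU rather than duration
EXEC sp_whoisactive @get_plans = 1, @sort_order = '[CPU] DESC'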

[Image: example output from sp_whoisactive]

One parameter I find really useful during an incident is @find_block_leaders. By running EXEC sp_whoisactive @find_block_leaders = 1, you can see exactly how many sessions are being blocked by each blocking session. In the example below, you can see that an INSERT transaction in session_id 52 is blocking 5 other sessions. Each of these is trying to read from the table with an open insert transaction, so they are blocked. You either need to wait for 52 to finish, or you need to kill it in order for the other transactions to move on.

[Image: example output from sp_whoisactive @find_block_leaders]

A quick note on killing spids: I really only recommend this if you know what the process is and you have an idea of how long it will take to roll back. (Remember, those other 5 spids are still blocked until the rollback completes, and rollback is a single-threaded process.)
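
If you do decide to kill the blocking session, T-SQL like the following (using session 52 from the example above) lets you do it and then keep an eye on the rollback:

-- Kill the blocking session (only if you know what it is and can afford the rollback)
KILL 52;
-- Check rollback progress; reports percent complete and estimated time remaining
KILL 52 WITH STATUSONLY;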

Of course, it might not be blocking, and in that case, you will need more scripts to analyse what is running regularly, how queries are performing and whether any optimisations are needed. This will need to be another blog post as I want to keep this one fairly succinct.

Backup and restore of SQL Databases

Knowing how to back up a database is an essential skill of a DBA, certainly one of the top required skills. Equally important is knowing how to restore your database backups. This is something you should practise regularly to ensure that when/if the time comes that you need to act quickly, you are well rehearsed. Trying to restore a database to a point in time in the middle of a P1 emergency is the stuff of nightmares if you haven't done it before.
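
If you want something concrete to rehearse, a minimal point in time restore looks roughly like this (the database name, file paths and STOPAT time are all made up for illustration):

-- Restore the most recent full backup, leaving the database ready for log restores
RESTORE DATABASE [Orders] FROM DISK = N'D:\Backup\Orders_Full.bak' WITH NORECOVERY, REPLACE;
-- Apply the log backup, stopping just before the incident, then bring the DB online
RESTORE LOG [Orders] FROM DISK = N'D:\Backup\Orders_Log.trn' WITH STOPAT = '2019-11-06 14:55:00', RECOVERY;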

Learn the different types of backups available and how often you should be doing each of them. This will vary depending on your business needs. Even on the same instance, some databases may need point in time recovery and others won't. For example, it might be fairly acceptable to back up your small master database once per day, but you cannot afford to lose more than 5 minutes of orders in the event of your instance going down. In the case of your orders database, you will need a transaction log backup every 5 minutes. Depending on the size of your database, you will be looking at a combination of regular full and differential backups, or just regular full backups (as well as your transaction log backups, of course).
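
In T-SQL, that strategy boils down to three commands (the database name, paths and frequencies here are purely illustrative):

-- Weekly full backup
BACKUP DATABASE [Orders] TO DISK = N'D:\Backup\Orders_Full.bak';
-- Daily differential backup (changes since the last full)
BACKUP DATABASE [Orders] TO DISK = N'D:\Backup\Orders_Diff.bak' WITH DIFFERENTIAL;
-- Transaction log backup every 5 minutes (requires the FULL recovery model)
BACKUP LOG [Orders] TO DISK = N'D:\Backup\Orders_Log.trn';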

If you are on a PaaS database such as Azure SQL Database or AWS RDS, the backups can be done automatically without any administration effort, but you will still want to practise restoring to a new instance, possibly into a new region or availability zone in the event of a disaster.

Other backup related topics to look into are compression, encryption, striping and retention management.
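
As a taster, striping and compression can be combined in one statement (the paths again are just examples):

-- Stripe the backup across two files on different drives and compress it
BACKUP DATABASE [Orders]
TO DISK = N'D:\Backup\Orders_1.bak',
   DISK = N'E:\Backup\Orders_2.bak'
WITH COMPRESSION;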

Scripting, i.e. T-SQL and PowerShell

This may be a strong opinion, but I believe point and click DBAs are going the way of the dinosaurs. SQL Server Management Studio (SSMS) still makes it possible to do lots of work without knowing much T-SQL or PowerShell, but this does not scale. If you end up managing tens or even hundreds of SQL Servers, scripting will be your friend. The more tedious work you can automate, the more time you have to work on more interesting tasks.

Learn T-SQL

Despite me saying that point and click is not the way to go, SSMS is very useful for providing the T-SQL code for how to do something. Say I wanted to create a new SQL login that doesn't expire and give it sysadmin permissions. The first time I do this, I can step through the GUI, but instead of clicking OK, I can click Script and the code is opened in SSMS for me to review and run. I can also save this as a .sql script and either use it as a template for creating future SQL logins, or refer to it often enough that I learn the syntax.

[Image: script out create login from SSMS]

USE [master]
GO
CREATE LOGIN [john] WITH PASSWORD=N'ae34bhijkfgcd5', DEFAULT_DATABASE=[master], CHECK_EXPIRATION=OFF, CHECK_POLICY=OFF
GO
ALTER SERVER ROLE [sysadmin] ADD MEMBER [john]
GO

Learn PowerShell

If this seems daunting, please don't worry. There is a huge community project called dbatools that has more functionality for managing SQL Server than the official Microsoft SQL Server module. It's a great way to start running PowerShell commands and building your knowledge as you go.

You can start with commands as and when you need them and build up your knowledge from there. The website is extremely useful, and by tagging @psdbatools on Twitter, you will usually get help pretty quickly from someone involved. As your skills and confidence increase, you may even choose to add code to the project by making a pull request. As I write this post, dbatools has over 15,000 commits from 175 contributors. (I'm one of them, in a VERY small way.)
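
To give a flavour of dbatools (the cmdlets below are real, but the instance names and backup path are made up), here are a couple of one-liners that would be tedious to point and click:

# List every database on two instances at once
Get-DbaDatabase -SqlInstance SQL01, SQL02 | Select-Object SqlInstance, Name, Status

# Back up all user databases on an instance to a share
Backup-DbaDatabase -SqlInstance SQL01 -Path '\\backupserver\sqlbackups'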

Summary

There are more than 3 things a new DBA should learn, a lot more, but these 3 items will help you remain calm in a crisis and let you automate work, meaning you can be more efficient and have time to spend on interesting work.

Further reading

  • A brief history of activity monitoring
  • DBATools in a month of lunches
  • Backup and restore of SQL Server Databases


Efficient maintenance of SSISDB

7th August 2019 By John McCormack

Maintenance of SSISDB within SQL Server

The SSIS Server Maintenance Job is used for maintenance of SSISDB. It manages the retention of operations records in SSISDB. I noticed it had been turned off by someone last year and it hadn't run since. As a result, SSISDB had become bloated and there was only 10MB left on the data drive, meaning the database could no longer auto-grow.

Nobody was aware. Why would they be? After all, nothing was failing. We didn't have disk space monitoring enabled, so the only time we found out there was a problem was when the disk had filled up.

I made 2 unsuccessful attempts at running the SSIS Server Maintenance Job. After several hours of processing and still no available free space in the database, I knew the job wasn't coping with the sheer number of rows it had to delete. The deletes all happen from the parent table (internal.operations) and then from all child tables using cascading deletes. This approach maintains referential integrity but is not great for performance.

Due to this, I needed a new approach to the maintenance of SSISDB. As we hadn't maintained these tables for 13-14 months, I was asking too much of SQL Server to let me delete everything at once. (Truncates wouldn't do because I had to keep the last 2 weeks' data.)

A bit of investigation showed me that these were the related tables.

  • internal.event_message_context
  • internal.event_messages
  • internal.executable_statistics
  • internal.execution_data_statistics
  • internal.execution_component_phases
  • internal.execution_data_taps
  • internal.execution_parameter_values
  • internal.execution_property_override_values
  • internal.executions
  • internal.operation_messages
  • internal.extended_operation_info
  • internal.operation_os_sys_info
  • internal.validations
  • internal.operation_permissions
  • internal.operations

My approach

  1. Amend the retention period (days) in catalog.properties to 400 (because 14 was unmanageable with > 400 days of history)
  2. Write a delete script or find a reliable one that does this work due to SSISDB’s native stored procedures failing to cope
  3. Ensure SSISDB is in SIMPLE recovery model because it will reduce t-log growth
  4. Run the delete script and see how it performs and how much space is freed up in order that the days for deletions can be optimised
  5. Repeat steps 1-4 (each time lowering retention period (days)) until I achieve my target retention period of 14
  6. Ensure this never happens again 😎 (because it’s no fun getting 300 failure emails an hour)

1. Reduce retention period (days) in catalog.properties to 400 (this allowed me to delete rows based on only 22,000 IDs)

To do this in T-SQL:

EXEC [SSISDB].[catalog].[configure_catalog] @property_name=N'RETENTION_WINDOW', @property_value=400

To do this in SSMS:

Right click SSISDB under Integration Services Catalogs in SQL Server Management Studio. Then, amend Retention Period (days) to a suitable initial value, in my case 400 days.

[Image: SSISDB Catalog Properties]

2. Script out or find a reliable script that does this work manually.

I struck gold with a superb script from Tim Mitchell which honours the Retention Period (days) value. I decided this was better than writing my own. Please follow the link to review the script along with other useful information (or likewise, get the latest version from Tim's GitHub).

3. Ensure SSISDB is in SIMPLE recovery model (as it helps limit transaction log growth)

SELECT DB_NAME(database_id) AS DBName, recovery_model_desc
FROM sys.databases
WHERE DB_NAME(database_id) = 'SSISDB'

If it's not, I recommend you ALTER the database to SIMPLE recovery in order to minimise logging. It's not an essential step, but it will save bloating your transaction log.

ALTER DATABASE [SSISDB] SET RECOVERY SIMPLE WITH NO_WAIT

4. Run script (from step 2) and see how it performs and how much space is freed up

SELECT NAME,
size/128/1024.0 AS FileSizeGB,
size/128/1024.0 - CAST(FILEPROPERTY(NAME, 'SpaceUsed') AS INT)/128/1024.0 AS FreeSpaceGB,
(size/128/1024.0 - CAST(FILEPROPERTY(NAME, 'SpaceUsed') AS INT)/128/1024.0) / (size/128/1024.0) * 100 AS PercentFree
FROM SSISDB.sys.database_files
WHERE NAME = 'SSISdata';

5. Repeat steps 1-4 (each time lowering retention period (days)) until you achieve your target retention period

I started small by reducing the retention period (days) to 390, which allowed me to measure the time and impact of removing 10 days of data. I then went down to a retention period of 360 days. I found 14 days to be the best performing decrement, so I kept reducing by that amount until I had only 14 days of data remaining. Following this, I kept the new script in place, scheduled nightly via SQL Agent - see the sketch below. There was no need to continue using the SSISDB cleanup stored procedure internal.cleanup_server_retention_window.

[Image: SSISDB purge job duration]
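
For reference, scheduling the cleanup nightly in T-SQL looks something like this (the job name, schedule and the command in the job step are illustrative; point the step at whichever script you deployed in step 2):

USE [msdb]
GO
EXEC dbo.sp_add_job @job_name = N'SSISDB nightly cleanup';
EXEC dbo.sp_add_jobstep @job_name = N'SSISDB nightly cleanup',
    @step_name = N'Run SSISDB cleanup script',
    @subsystem = N'TSQL',
    @database_name = N'SSISDB',
    @command = N'EXEC dbo.ssisdb_cleanup;'; -- hypothetical proc name, use your script from step 2
EXEC dbo.sp_add_jobschedule @job_name = N'SSISDB nightly cleanup',
    @name = N'Nightly at 1am',
    @freq_type = 4,              -- daily
    @freq_interval = 1,          -- every day
    @active_start_time = 010000; -- 01:00:00
EXEC dbo.sp_add_jobserver @job_name = N'SSISDB nightly cleanup';
GO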

6. Ensure this never happens again

They say prevention is better than cure. Here are some ideas on how to implement this and ensure that your maintenance of SSISDB is ongoing:

  • SQL Agent job to check if the date of the oldest id is greater than expected (and alert if not) - see the sketch after this list
  • Write a check in PowerShell or, even better, consider writing a command for dbatools.
  • Professional monitoring tools may alert on this, but I haven't checked.
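
For the first idea, here is a minimal sketch of the check (assuming a 14 day retention target plus some slack; severity 16 fails the job step so your normal Agent alerting can pick it up):

-- Alert if the oldest row in SSISDB is older than we would expect with cleanup running
DECLARE @oldest datetime;
SELECT @oldest = CAST(MIN(created_time) AS datetime) FROM SSISDB.internal.operations;
IF @oldest < DATEADD(DAY, -20, GETDATE())
    RAISERROR('SSISDB cleanup appears to have stopped working - investigate.', 16, 1);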

Further reading

  • My other posts which mention SSIS
  • SSISDB best practices

Alert if your availability group fails over

10th July 2019 By John McCormack

Here is a simple way to send an alert if your Always On Availability Group fails over. You can run the T-SQL below or set it up visually in SQL Agent.

USE [msdb]

/*
Sends an alert if the AG group fails over.
I've limited the requests to one alert per group (by specifying the DB name in @event_description_keyword).
Let me know what you think of this approach, I found it better than multiple alerts for each DB in the Availability Group
*/

EXEC msdb.dbo.sp_add_alert @name=N'AG role change - AG1',
@message_id=1480,
@severity=0,
@enabled=1,
@delay_between_responses=0,
@include_event_description_in=1,
@database_name=N'',
@event_description_keyword=N'PRD_DB1',
@notification_message=N'There has been a failover of your Always On Availability Group AG1 - Please investigate.',
@job_id=N'00000000-0000-0000-0000-000000000000'

-- Add an operator if it doesn't already exist
EXEC msdb.dbo.sp_add_operator @name=N'John McCormack',
@enabled=1,
@pager_days=0,
@email_address=N'john.mccormack@example.com'

-- Add a notification
EXEC msdb.dbo.sp_add_notification @alert_name=N'AG role change - AG1', @operator_name=N'John McCormack', @notification_method = 1

 


Distributed Replay Error: Failed to set proper database for the connection

3rd May 2019 By John McCormack

Distributed Replay Error: Failed to set proper database for the connection – Troubleshooting dreplay.exe

If you see "Failed to set proper database for the connection" in your replayresult.trc, it is worth looking at the row below it (in Profiler).

[Image: Distributed Replay Error: Failed to set proper database for the connection]

Check if the TextData for the row below (with the same ReplaySequence) would run from the database listed in the DatabaseName column.

Hint: either it's a database that you haven't restored to your test environment, or you have named the databases differently to production, e.g. you have prefixed the names with perf_.
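
If you would rather query the trace than scroll through Profiler, fn_trace_gettable will read it into SSMS (the file path here is just an example):

-- Load the replay result trace and check which database each statement targeted
SELECT StartTime, DatabaseName, TextData
FROM sys.fn_trace_gettable(N'C:\DReplay\replayresult.trc', DEFAULT)
ORDER BY StartTime;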

 

Related posts

  • https://johnmccormack.it/2016/10/error-dreplay-could-not-find-any-resources-appropriate-for-the-specified-culture-or-the-neutral-culture/

Useful links

  • https://docs.microsoft.com/en-us/sql/tools/distributed-replay/sql-server-distributed-replay
  • https://www.sqlservercentral.com/blogs/unusual-errors-with-distributed-replay

Filed Under: Distributed Replay Tagged With: Distributed replay, dreplay

Test read intent connections to an AG Listener

19th April 2019 By John McCormack

To test read intent connections to an AG Listener, I prefer to use SQLCMD but you can also test easily using SSMS.

SQLCMD

The -Kreadonly switch is your key to success here, but remember to also specify the database using -d. When not set (and with an initial catalog of master for my login), I found I always got the primary instance back during my check. This simple omission cost me hours of troubleshooting, because I was convinced my listener wasn't working correctly. In fact, I just wasn't testing it correctly.

Read/Write Connections
sqlcmd -S "SQL01-AG1-list" -d WideWorldImporters -E -q "SELECT @@SERVERNAME;"

Read only Connections
sqlcmd -S "SQL01-AG1-list" -d WideWorldImporters -E -q "SELECT @@SERVERNAME;" -Kreadonly

The instance that you are connected to will show in the command prompt. Type exit to leave sqlcmd.

SSMS

In object explorer, click Connect and choose Database Engine.

Then, in the bottom right hand side of the dialog box, click on Options >>

In Connection Properties, Click Connect to database and then <Browse server ..>. Choose a DB that is in your availability group.

[Image: Connect To Server SSMS]

Then click on Additional Connection Parameters and type in ApplicationIntent=ReadOnly

[Image: ApplicationIntent=ReadOnly SSMS]

Click connect and run SELECT @@SERVERNAME and you should expect to see the instance name of your secondary replica, providing you have set up read only routing correctly.

If you change connection and remove ApplicationIntent=ReadOnly from the Additional Connection Parameters, you should see the result as the name of your primary instance.
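
If the routing isn't behaving as expected, it is worth confirming how the read only routing list is actually configured. A query along these lines against the AG DMVs (run it on the primary) shows each replica's routing order:

-- Show the read only routing list for each replica in the availability group
SELECT ar.replica_server_name,
       rl.routing_priority,
       ar2.replica_server_name AS routed_to,
       ar2.read_only_routing_url
FROM sys.availability_read_only_routing_lists rl
JOIN sys.availability_replicas ar ON rl.replica_id = ar.replica_id
JOIN sys.availability_replicas ar2 ON rl.read_only_replica_id = ar2.replica_id
ORDER BY ar.replica_server_name, rl.routing_priority;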

Summary

Hopefully these 2 simple techniques to test read intent connections to an AG Listener will be useful and help save you time. It's a simple blog post, but I wanted to write it because I was looking at the problem in too much depth and missing the obvious mistake in my database context.

Popular posts

  • https://johnmccormack.it/2019/03/put-tempdb-files-on-d-drive-in-azure-iaas/
  • https://johnmccormack.it/2017/12/ec2-sql-server-backups-to-amazon-s3/
  • https://johnmccormack.it/2018/10/glasgow-super-meetup-aws-athena-presentation/
