Scripts Archives - John McCormack DBA

Free space in SQL server filegroups query

25th March 2022 By John McCormack 1 Comment

I just realised that in all my scripts that I use on a regular basis, I didn’t have one for working out free space in SQL Server filegroups. It’s not something that comes up too often but it’s handy to know. For methods of working out space in individual files, you could refer to this post on mssqltips.

-- Free space by filegroup
SELECT
FILEGROUP_NAME(data_space_id) as FilegroupName,
SUM(size/128.0) AS CurrentSizeMB, 
SUM(size/128.0 - CAST(FILEPROPERTY(name, 'SpaceUsed') AS INT)/128.0) AS FreeSpaceMB,
(
	SUM(size/128.0 - CAST(FILEPROPERTY(name, 'SpaceUsed') AS INT)/128.0) / -- FreeSpaceMB
	(SUM(size/128.0))  -- CurrentSizeMB
	* 100 -- Convert to percentage
) AS FilegroupPercentFree,
COUNT(*) as NumberOfFilesInFilegroup
FROM sys.database_files
WHERE data_space_id <> 0
GROUP BY FILEGROUP_NAME(data_space_id);

POST #100

I just realised as well that this is post #100 on my blog. I started it just to save some of my queries and ideas. Post #1 was a really simple query but it got me going.

How to change the slow query log threshold on RDS

19th August 2021 By John McCormack 1 Comment

What is the slow query log

Before we discuss how to change the slow query log threshold on RDS, let’s quickly establish what the slow query log is.

The slow query log will record all queries which are above the threshold level. The default value is 10 (seconds) but you can set it higher or lower depending on your requirements. It is useful for finding slow queries and allows you to pick out candidates for tuning.

If you set the threshold too low, it can increase I/O overhead on your instance and use a lot of valuable disk space. If you set it too high, it might not capture enough useful information.

AWS Web console

If you are doing this change as a one off, it might be simpler to just use the AWS web console.

Log into AWS
Select RDS
Find out which parameter group your instance is in
1. Select instance then click configuration.
2. Scroll down to see the parameter group
3. It will be a hyperlink so just click on it
In the parameter group page, use the search box and search for slow_query_log
1. Ensure it is set to 1
2. If not, click edit parameters and change the value to 1.
Change search box to long_query_time
1. Set it to any value between 1 and 31536000
2. Save changes

AWS CLI

aws rds modify-db-parameter-group --db-parameter-group-name "primary-mysql-5point7" --parameters "ParameterName='long_query_time',ParameterValue=1,ApplyMethod=immediate" --profile dev
# If you don't have multiple profiles set, leave out --profile dev.

AWS PowerShell

$HashArguments = @{
    DBParameterGroupName = "mysql5point7-monster-param-grp-ci"
    Parameter=@{ParameterName="long_query_time";ParameterValue="10";ApplyMethod="Immediate"}
    Profilename = "Dev"
    }
Edit-RDSDBParameterGroup @HashArguments

No restart needed

Changing either slow_query_log or long_query_time parameters can both be safely done without the need to restart your instance. This is because RDS defines them as Dynamic parameters. Take note when changing parameters that are defined as Static as they will require a restart. This either means short but immediate downtime or waiting until your next maintenance window.

PowerShell Splatting – What was wrong with backticks?

8th December 2020 By John McCormack Leave a Comment

T-SQL Tuesday

This month’s T-SQL Tuesday is hosted by Lisa Griffin Bohm. (b|t). Lisa asks “This month, I’d like those of you who have presented, or written a presentation, to share something technical THAT DID NOT RELATE to the topic of the presentation, that you’ve learned in writing or giving the presentation.” I’m going to write about how I came across PowerShell Splatting and how it made me better at PowerShell, despite presenting on a cloud topic.

Powershell splatting

At DataScotland 2019, I did a presentation on AWS RDS for SQL Server. The technical content was about how RDS works, what you can do with it, and how to provision it etc. As part of my demos, I decided to use AWS PowerShell commands. When I had made this presentation at previous events, I had used AWS CLI so I had to update my code examples. I’m ok with PowerShell, but I’m not an expert. I just wanted to show that were were multiple ways to interface with AWS.

My code was full of backticks. You could say I was daft about backticks. I loved them and thought they made my PowerShell code readable because they stopped the lines from running off the monitor. Someone asked me why I don’t use PowerShell splatting? “Whatting” I asked? I had never heard of splatting.

At the break, I spoke to a couple of people who were more experienced in PowerShell than me. They advised that PowerShell splatting was the way to go for large commands because they make the code more readable. More readable was definitely something I was interested in so I decided to go away and learn about splatting, and update my code for future events.

So what is PowerShell Splatting?

Rather than passing a long list of parameters into commands, you can create a variable in advance to hold these values. The variable is an array or a hash table and includes as many parameters as you need. If you need to pass in parameter names and values, use a hash table. If you just need to pass in a list of parameter values, you should use an array. Then, when you run the command, you simply pass in the hash table parameter instead of all the individual parameters.

Example command before splatting

New-RDSDBInstance -dbinstanceidentifier "datascotland-posh" -region "eu-central-1" -VpcSecurityGroupId "sg-00a1234g567c3d4ab" `
    -allocatedstorage 20 -dbinstanceclass "db.t2.micro" -engine "sqlserver-ex" `
    -masterusername "rds_name" -masteruserpassword "secure_pw_here" -availabilityzone "eu-central-1a" `
    -port 50000 -engineversion "14.00.3049.1.v1"

Example command after splatting

$HashArguments = @{
dbinstanceidentifier= "datascotland-posh"
region = "eu-central-1"
VpcSecurityGroupId = "sg-00a1234g567c3d4ab"
allocatedstorage = 20
dbinstanceclass = "db.t2.micro"
engine = "sqlserver-ex"
masterusername = "rds_name"
masteruserpassword = "secure_pw_here"
availabilityzone = "eu-central-1a"
port = 50000
engineversion = "14.00.3049.1.v1"
}
New-RDSDBInstance @HashArguments

At a glance

As you can see, the 2nd example which uses splatting is easier to read and you can pick out the value of each parameter at a quick glance. It was worth learning to make my code more readable and improve my overall PowerShell knowledge.

“Writing this post has just made me realise that I should update my RDS course ↙ as the examples in it don’t use splatting. 🤦‍♂️”

SQL Server on Amazon RDS (Free Course)

Logical query processing

30th September 2020 By John McCormack Leave a Comment

Logical query processing

Logical query processing describes the way in which the SQL Server processes a select query. This post used to be called “How does the SQL Server engine process my select query” but no-one was really searching for that. It was originally posted in Feb 2016, and was updated on September 2020.

Declarative Language

T-SQL is the dialect of SQL used with SQL Server and is a declarative language. A declarative language takes the user’s request (or query) and works out the best way how to return the data. This differs from an imperative language in which the programmer tells the program how it should process the statement. With this is mind, it should be fairly easy with some simple syntax just to tell SQL Server what we want and get the results back.

An example in English could be “Bring me all my black shoes from the cupboard”. In SQL (and with a bit of stretch of the imagination), this would be written as:

SELECT item FROM dbo.Cupboard
WHERE item = 'shoes'
AND colour = 'black';

Despite the way we write the query, the SQL Server engine has other ideas about the processing order. The table below shows the order in English versus the order in SQL.

ENGLISH	SQL SERVER
SELECT	FROM
FROM	WHERE
WHERE	SELECT

So conversely, if we were to say this in English, it would be something like “From the cupboard, find all the black shoes and bring them to me”.

Why is it done this way?

SQL Server uses Logical query processing. First, I should point out we will be discussing 6 main clauses (but there are more). They are listed below in logical order with a description. If you read all 6 descriptions, it should be fairly clear why SQL Server processes the select query in this order. e.g. You cannot select something if the place it supposed to be (a table) does not exist. So we go to the table first.

THE MAIN CLAUSES (In logical order)	DESCRIPTION
FROM	The FROM clause is executed first. This also includes JOIN which is a way of combining data from multiple tables into one result set. Thinking about this logically, the table(s) being queried must be identified first or we would have nothing to SELECT from.
WHERE	A clause such as `WHERE WHERE item = 'shoes'`means that rows are filtered. Rows are not returned if the item does not equal shoes. Remember you must refer to the column names rather than an alias you have given as the WHERE clause is evaluated before the SELECT.
GROUP BY	You can retrieve a set of all distinct row values based on what is included in the SELECT. e.g. `SELECT Item, Colour FROM [TestDB].[dbo].[Cupboard] GROUP BY Item,Colour` This can also be aggregated. In this example, I show how many times these combinations of values are repeated throughout the table by using COUNT(). `SELECT Item, Colour, Count() AS [Count] FROM [TestDB].[dbo].Cupboard] GROUP BY Item,Colour`
HAVING	The HAVING clause is similar to WHERE in that it filters the data based on a predicate. The main difference is that it is evaluated AFTER the data has been grouped (and so is evaluated per group) unlike a WHERE clause which you can see is evaluated before a GROUP BY. If we enhance the example above to include a HAVING clause (say for groups with a count > 1), the syntax would be: `SELECT Item, Colour, COUNT() as [Count] FROM [TestDB].[dbo].[Cupboard] GROUP BY Item,Colour HAVING COUNT() > 1`
SELECT	Whilst it is written at the beginning of the T-SQL query, the SELECT clause is evaluated near the end. Remember to include a DISTINCT after your SELECT clause if you want to ensure you return no duplicates. At this stage, you are returning a relational result. You have not guaranteed the order in which you want to see the result set. This means the same query could return the same results in a different order time and time again.
ORDER BY	The ORDER BY clause allows you to present the data in a deterministic way. e.g.[sql]ORDER BY [Count] Desc[/sql] Without an order by, there is no guarantee that the same query will return the rows in same order if executed more than once. Notice here that because this is evaluated last and unlike the WHERE clause, you can refer to column Aliases because they have now been created.

Automate Your Stress Away – Run sp_whoisactive on a schedule

8th September 2020 By John McCormack 1 Comment

Automate collection of sp_whoisactive on a schedule

This post is part of the September 2020 edition of #tsql2sday. Thanks to Elizabeth Noble for hosting (blog|twitter) and for coming up with a great topic. SQL Server has a great tool for scheduling repeatable tasks called SQL Agent. SQL Server also has a great community, who have written SQL Scripts so established and well known that some might think they come with SQL Server (or at least should). This post combines using SQL Agent with one of those invaluable community scripts, to help solve a real world problem.

When I took over a fairly troublesome SQL Server, I was always being asked what was running when the SQL Server was slow. That was fine when it was during the day and I could look. But often I would be asked, why was the server slow at 9pm last Friday? Or some other time. It’s not always the easiest thing to do as SQL Server doesn’t natively keep this information for you.

A quick but effective way to automate this is to download and schedule Adam Machanic’s sp_whoisactive stored procedure. I run this every 10 minutes but I have seen it scheduled as often as every 30-60 seconds. It depends on your system and your needs.

How to automate running sp_whoisactive on a schedule

Download sp_whoisactive from GitHub if you don’t have it already
Run the code to create the stored procedure on your SQL Server
Use my t-sql below to create an agent job that collects the data (and deletes older rows)
Change any of the parameters to suit your own needs
Review data next time you are asked “What was running at X time”? by simply querying the table

Create SQL Agent Job

! Important ! I use a utility database called DBA for stored procedures such as sp_whoisactive. If you don’t have a database called DBA, please either create one or change the reference to the name of the database you want to create your table in.

USE [msdb]

GO

BEGIN TRANSACTION
DECLARE @ReturnCode INT
SELECT @ReturnCode = 0

IF NOT EXISTS (SELECT name FROM msdb.dbo.syscategories WHERE name=N'[Uncategorized (Local)]' AND category_class=1)
BEGIN
EXEC @ReturnCode = msdb.dbo.sp_add_category @class=N'JOB', @type=N'LOCAL', @name=N'[Uncategorized (Local)]'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

END

DECLARE @jobId BINARY(16)
EXEC @ReturnCode = msdb.dbo.sp_add_job @job_name=N'sp_whoisactive_log_to_table',
@enabled=1,
@notify_level_eventlog=0,
@notify_level_email=0,
@notify_level_netsend=0,
@notify_level_page=0,
@delete_level=0,
@description=N'Log output of sp_whoisactive every x minutes',
@category_name=N'[Uncategorized (Local)]',
@owner_login_name=N'sa', @job_id = @jobId OUTPUT
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Write output to table',
@step_id=1,
@cmdexec_success_code=0,
@on_success_action=3,
@on_success_step_id=0,
@on_fail_action=2,
@on_fail_step_id=0,
@retry_attempts=0,
@retry_interval=0,
@os_run_priority=0, @subsystem=N'TSQL',
@command=N'IF NOT EXISTS ( SELECT 1 FROM dba.sys.tables WHERE NAME = ''sp_whoisactive_log'')
BEGIN
CREATE TABLE dba.dbo.sp_whoisactive_log
(
id INT IDENTITY(1,1) PRIMARY KEY,
[dd hh:mm:ss:mss] VARCHAR(20) NULL,
session_id SMALLINT NOT NULL,
sql_text XML NULL,
login_name sysname NULL,
wait_info NVARCHAR(4000) NULL,
CPU VARCHAR(30) NULL,
tempdb_allocations VARCHAR(30) NULL,
tempdb_current VARCHAR(30) NULL,
blocking_session_id SMALLINT NULL,
blocked_session_count VARCHAR(30) NULL,
reads VARCHAR(30) NULL,
writes VARCHAR(30) NULL,
physical_reads VARCHAR(30) NULL,
query_plan XML NULL,
locks XML NULL,
used_memory VARCHAR(30) NULL,
[status] VARCHAR(30) NULL,
open_tran_count VARCHAR(30) NULL,
percent_complete VARCHAR(30) NULL,
[host_name] sysname NULL,
[database_name] sysname NULL,
[program_name] sysname NULL,
start_time datetime NULL,
login_time datetime NULL,
request_id SMALLINT NULL,
collection_time datetime NULL
)
CREATE INDEX idx_collection_time ON dba.dbo.sp_whoisactive_log (collection_time)
END

/* Load data into table.
If you want to change parameters, you will likely need to add columns to table
*/
EXEC sp_whoisactive @get_locks = 1, @find_block_leaders = 1, @get_plans = 1, @destination_table = ''sp_whoisactive_log''',
@database_name=N'DBA',
@flags=0
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Delete old rows',
@step_id=2,
@cmdexec_success_code=0,
@on_success_action=1,
@on_success_step_id=0,
@on_fail_action=2,
@on_fail_step_id=0,
@retry_attempts=0,
@retry_interval=0,
@os_run_priority=0, @subsystem=N'TSQL',
@command=N'DELETE dba.dbo.sp_whoisactive_log
WHERE collection_time < DATEADD(MONTH,-1,GETDATE())',
@database_name=N'master',
@flags=0
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_update_job @job_id = @jobId, @start_step_id = 1
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobschedule @job_id=@jobId, @name=N'Every x minutes',
@enabled=1,
@freq_type=4,
@freq_interval=1,
@freq_subday_type=4,
@freq_subday_interval=10,
@freq_relative_interval=0,
@freq_recurrence_factor=0,
@active_start_date=20200209,
@active_end_date=99991231,
@active_start_time=0,
@active_end_time=235959
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
COMMIT TRANSACTION
GOTO EndSave
QuitWithRollback:
IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION
EndSave:
GO

This job will run sp_whoisactive on a schedule every 10 minutes and record the output in the table dba.dbo.sp_whoisactive_log. It also deletes any rows older than 1 month.

Hope this helps
John

POST #100

What is the slow query log

AWS Web console

AWS CLI

AWS PowerShell

No restart needed

Further reading

T-SQL Tuesday

Powershell splatting

So what is PowerShell Splatting?

At a glance

Logical query processing

Declarative Language

Why is it done this way?

Automate collection of sp_whoisactive on a schedule

How to automate running sp_whoisactive on a schedule

Create SQL Agent Job