This post is about database recovery models for SQL Server databases. Having the correct recovery model for a database is crucial to your backup and restore strategy. It also determines whether you need to manage the transaction log yourself or can leave that task to SQL Server. Let's look at the various recovery models and how they work.
The three database recovery models for SQL Server:
Full recovery model
Every operation is logged and written to the transaction log. This is required for very low or no data loss requirements and means that regular log backups are required. If they are not taken, the transaction log will grow indefinitely (or until it fills up the disk or hits the maximum size specified). If this occurs, a 9002 error is generated, and no further data modifications can take place until the log is backed up or truncated. The full recovery model is usually used in production, especially for OLTP workloads.
Without a full backup being taken, a database in the full recovery model is treated as if it were in the simple recovery model: the log is automatically truncated. So it is imperative to take a full backup of your database when you first use full recovery. (A differential backup would also suffice if there has previously been a full backup that would form part of your backup chain.)
Choose the full recovery model if you want or need to do any of the following: recover to an arbitrary point in time; use features that require it, such as log shipping, database mirroring or Always On availability groups; or meet very low or zero data loss requirements.
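As a minimal sketch (the database name and backup paths are placeholders), switching a database to full recovery and starting the log chain looks like this:

-- Switch the database to the full recovery model
ALTER DATABASE SalesDB SET RECOVERY FULL;
-- Take a full backup to start the log chain; until this runs, the log is still auto-truncated
BACKUP DATABASE SalesDB TO DISK = N'D:\Backups\SalesDB.bak';
-- Regular log backups stop the transaction log growing indefinitely
BACKUP LOG SalesDB TO DISK = N'D:\Backups\SalesDB.trn';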
Bulk-logged recovery model
For minimally logged operations, only the extents that are modified are recorded in the transaction log, rather than every individual change. This means log backups are reliant upon the database's data files. Certain types of transaction can be minimally logged, including BULK INSERT, bcp, SELECT INTO, and index creation and rebuild operations.
You are also required to take log backups with the bulk-logged recovery model. The log backups refer to the Bulk Changed Map (BCM) to identify the modified extents that need to be backed up. Read up more on BCM: Further Reading
Choose the bulk-logged recovery model if you want or need to do any of the following: minimise log growth during large bulk operations; or temporarily switch out of full recovery for a bulk load without breaking the log backup chain. Bear in mind that you cannot restore to a point in time within a log backup that contains minimally logged operations.
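A common pattern (sketched here with placeholder names, assuming the database normally runs in full recovery) is to bracket the bulk load with log backups:

-- Take a log backup so point-in-time recovery is possible right up to the load
BACKUP LOG SalesDB TO DISK = N'D:\Backups\SalesDB_before_load.trn';
ALTER DATABASE SalesDB SET RECOVERY BULK_LOGGED;
-- Run the minimally logged work here, e.g. BULK INSERT or an index rebuild
ALTER DATABASE SalesDB SET RECOVERY FULL;
-- This log backup picks up the extents recorded in the BCM
BACKUP LOG SalesDB TO DISK = N'D:\Backups\SalesDB_after_load.trn';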
Simple recovery model
The transaction log is automatically managed by SQL Server and cannot be backed up. SQL Server reclaims the space used by automatically truncating the log. This recovery model is mostly used for non-production environments. Although SQL Server manages the transaction log in the simple recovery model, there is still a chance your t-log could fill up. This can happen when open transactions are left in place (neither committed nor rolled back) or with long-running transactions.
Choose the simple recovery model in the following scenarios: non-production environments; databases where losing the changes made since the last full or differential backup is acceptable; or databases that can easily be rebuilt or reloaded from their source. Point-in-time recovery is not possible.
The default recovery model is determined by the recovery model of the model database but can be changed at any time using an ALTER DATABASE command. If you do change the recovery model, you’ll need to understand the impact on the backup chain.
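For example (SalesDB is a placeholder name), you can check the default your instance will hand out and change a database's recovery model like this:

-- The model database's recovery model becomes the default for new databases
SELECT name, recovery_model_desc FROM sys.databases WHERE name IN (N'model', N'SalesDB');
-- Change the recovery model at any time, then review your backup chain
ALTER DATABASE SalesDB SET RECOVERY SIMPLE;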
There's a handy dbatools PowerShell command, Test-DbaDbRecoveryModel, for checking that the databases you think are in the full recovery model actually are.
Test-DbaDbRecoveryModel -SqlInstance localhost | Select InstanceName, Database, ConfiguredRecoveryModel, ActualRecoveryModel | Out-GridView
Note: In the output, you can see that the NebulaDataSolutions database is configured for FULL recovery but is actually in SIMPLE. This is because there is no valid full backup for NebulaDataSolutions.
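If you prefer plain T-SQL to dbatools, a rough equivalent (a sketch rather than an exhaustive check, since it ignores broken log chains) is to compare sys.databases against the msdb backup history:

-- Databases configured for FULL recovery that have never had a full backup
-- are effectively still behaving as if they were in simple recovery
SELECT d.name, d.recovery_model_desc
FROM sys.databases AS d
WHERE d.recovery_model_desc = N'FULL'
  AND NOT EXISTS (SELECT 1 FROM msdb.dbo.backupset AS b
                  WHERE b.database_name = d.name AND b.type = 'D');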
I was asked by a colleague why his where clause wasn’t being selective when filtering on a space value. The column was a char(1) data type. To understand the curious case of the space in char(1), we need to understand how the char data type works and also a bit more about the need for it in this scenario.
My colleague had built a user table which included a gender column. A char(1) was chosen because one letter could be used to record the gender of the user. It was also small, which helped with performance and the overall size of the table. All non-specified or non-binary genders were assigned a ' ' (a space). The assumption was that the space character would behave the same as an M or F when used in the WHERE clause.
A char is a fixed-length data type, and any value that falls short of the fixed length is padded out with spaces. This means that inserting '' (an empty string) into the gender column has the same effect as entering ' ' (a space). This can be tested using the LEN function:
DECLARE @tbl TABLE (name char(10))
INSERT INTO @tbl VALUES ('')      -- an empty string
INSERT INTO @tbl VALUES (' ')     -- a single space
INSERT INTO @tbl VALUES ('John')
INSERT INTO @tbl VALUES ('John ') -- a trailing space
SELECT name, LEN(name) AS len_name FROM @tbl
The pad spaces do not affect the LEN of the char values: LEN excludes trailing spaces, so the empty string and the single space both return 0, and both 'John' values return 4.
CREATE TABLE dbo.People (Description nvarchar(20), Gender char(1))
INSERT INTO dbo.People (Description, Gender) VALUES (N'Male', 'M')
INSERT INTO dbo.People (Description, Gender) VALUES (N'Female', 'F')
INSERT INTO dbo.People (Description, Gender) VALUES (N'Other with space', ' ')
INSERT INTO dbo.People (Description, Gender) VALUES (N'Other without space', '')
INSERT INTO dbo.People (Description, Gender) VALUES (N'Other as O', 'O')
-- Select all rows and show the LEN of the gender column
SELECT Description, Gender, LEN(Gender) AS Len_Gender FROM dbo.People
-- We get the same result with either query: SQL Server pads char values and
-- ignores trailing spaces when comparing strings, so ' ' and '' are treated as equal
SELECT * FROM dbo.People WHERE Gender = ' '
SELECT * FROM dbo.People WHERE Gender = ''
-- We get the row we want when we use a value such as 'O' instead of a space
SELECT * FROM dbo.People WHERE Gender = 'O'
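To see why the space and the empty string behave identically, compare LEN with DATALENGTH: char(1) pads the empty string to a single space on insert, so both rows store exactly the same byte.

-- DATALENGTH shows every value occupies 1 byte; '' and ' ' are stored identically
SELECT Description, Gender, LEN(Gender) AS Len_Gender, DATALENGTH(Gender) AS DataLength_Gender
FROM dbo.People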
The best option is to use a different character such as 'O'; anything other than a space in char(1) would work. If you aren't too concerned about the space used by this column, a longer varchar data type would allow users to enter anything they want. Failing that, a tinyint would allow 256 options and still use only 1 byte.
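As a rough sketch of the tinyint approach (the table and code values here are made up for illustration):

-- A 1-byte code column with a lookup table avoids the space problem entirely
CREATE TABLE dbo.GenderLookup (GenderCode tinyint PRIMARY KEY, Description nvarchar(20) NOT NULL)
INSERT INTO dbo.GenderLookup VALUES (0, N'Not specified'), (1, N'Male'), (2, N'Female'), (3, N'Other')
CREATE TABLE dbo.Users (UserName nvarchar(50), GenderCode tinyint REFERENCES dbo.GenderLookup (GenderCode))
-- Filtering on 0 is unambiguous, unlike filtering on a space
SELECT * FROM dbo.Users WHERE GenderCode = 0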
Yes, I know there's a whole discussion about whether we should be storing gender and what the possible values should be, but this is simply a post about using char(1).
Slightly tenuous link: https://johnmccormack.it/2016/02/how-does-the-sql-server-engine-process-my-select-query/
The SSIS Server Maintenance Job is used for maintenance of SSISDB. It manages the retention of operations records in SSISDB. I noticed it had been turned off by someone last year and it hadn’t run since. Therefore, SSISDB had become bloated and there was only 10MB left on the data drive meaning the database could no longer auto grow.
Nobody was aware, why would they be? After all, nothing was failing. We didn’t have disk space monitoring enabled so the only time we found out there was a problem was when the disk had filled up.
I made 2 unsuccessful attempts at running the SSIS Server Maintenance Job. After several hours of processing and still no available free space in the database, I knew the job wasn't coping with the sheer number of rows it had to delete. The deletes all happen from the parent table (internal.operations) and then from all child tables using cascading deletes. This approach maintains referential integrity but is not great for performance.
Due to this, I needed a new approach to the maintenance of SSISDB. As we hadn't maintained these tables for 13/14 months, I was asking too much of SQL Server to let me delete everything at once. (Truncates wouldn't do because I had to keep the last 2 weeks' data.)
A bit of investigation showed me which tables were related, all of them children of internal.operations.
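One way to find them yourself is to ask SQL Server which foreign keys reference internal.operations (a quick sketch, run inside SSISDB):

-- List the child tables whose foreign keys cascade from internal.operations
USE SSISDB;
SELECT OBJECT_SCHEMA_NAME(fk.parent_object_id) AS SchemaName,
       OBJECT_NAME(fk.parent_object_id) AS ChildTable,
       fk.delete_referential_action_desc
FROM sys.foreign_keys AS fk
WHERE fk.referenced_object_id = OBJECT_ID(N'internal.operations');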
The first step was to set the retention period to an initial value just beyond the oldest data, in my case 400 days. To do this in T-SQL:
EXEC [SSISDB].[catalog].[configure_catalog] @property_name=N'RETENTION_WINDOW', @property_value=400
To do this in SSMS:
Right-click SSISDB under the Integration Services Catalogs node in SQL Server Management Studio, then amend Retention Period (days) to a suitable initial value – in my case, 400 days.
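Either way, you can confirm the value that is currently configured by querying catalog.catalog_properties:

-- Check the retention window currently configured for SSISDB
SELECT property_name, property_value
FROM SSISDB.catalog.catalog_properties
WHERE property_name = N'RETENTION_WINDOW';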
I struck gold with a superb script from Tim Mitchell which honours the Retention Period (days) value. I decided this was better than writing my own. Please follow the link to review this script along with other useful information (or likewise, get the latest version from Tim's GitHub).
-- Check the current recovery model of SSISDB
SELECT DB_NAME(database_id) AS DBName, recovery_model_desc
FROM sys.databases
WHERE DB_NAME(database_id) = 'SSISDB'
-- Switch to simple recovery to keep log growth manageable during the big deletes
ALTER DATABASE [SSISDB] SET RECOVERY SIMPLE WITH NO_WAIT
-- Check the size and free space of the SSISDB data file
SELECT NAME,
       size/128/1024.0 AS FileSizeGB,
       size/128/1024.0 - CAST(FILEPROPERTY(NAME, 'SpaceUsed') AS INT)/128/1024.0 AS FreeSpaceGB,
       (size/128/1024.0 - CAST(FILEPROPERTY(NAME, 'SpaceUsed') AS INT)/128/1024.0) / (size/128/1024.0) * 100 AS PercentFree
FROM SSISDB.sys.database_files
WHERE NAME = 'SSISdata';
I started small by reducing the retention period (days) to 390, which allowed me to measure the time and impact of removing 10 days of data. I then went down to a retention period of 360 days. I found 14-day decrements performed best, so I kept stepping down until only 14 days of data remained. Following this, I kept the new script in place, scheduled nightly via SQL Agent. There was no need to continue using the SSISDB cleanup stored procedure internal.cleanup_server_retention_window.
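If you would rather script the decrements than adjust the value by hand, a rough sketch of the same approach using the built-in cleanup procedure (I used Tim's script instead, but the shape of the loop is the same) might look like this:

-- Walk the retention window down in 14-day steps, cleaning up after each change
DECLARE @retention int = 390;
WHILE @retention >= 14
BEGIN
    EXEC SSISDB.catalog.configure_catalog
        @property_name = N'RETENTION_WINDOW',
        @property_value = @retention;
    EXEC SSISDB.internal.cleanup_server_retention_window;
    IF @retention = 14 BREAK;
    SET @retention = @retention - 14;
    IF @retention < 14 SET @retention = 14;
END;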
They say prevention is better than cure. Here are some ideas for making sure the maintenance of SSISDB is ongoing: leave the SSIS Server Maintenance Job enabled and alert if it fails or stops running; set the retention window to a value you can live with from day one; and enable disk space monitoring so a filling drive is spotted long before it fills.
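For example, a quick check that the maintenance job is still enabled and has run recently (a sketch against msdb; the job name is the default one SSIS creates):

-- Alert if the SSIS Server Maintenance Job has been disabled or has stopped running
SELECT j.name, j.enabled, MAX(h.run_date) AS last_run_date
FROM msdb.dbo.sysjobs AS j
LEFT JOIN msdb.dbo.sysjobhistory AS h ON h.job_id = j.job_id
WHERE j.name = N'SSIS Server Maintenance Job'
GROUP BY j.name, j.enabled;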
AZCopy is a useful command line utility for automating the copying of files and folders to Azure Storage Account containers. Specifically, I use AZCopy for SQL Backups but you can use AZCopy for copying most types of files to and from Azure.
In this blog post example (which mirrors a real world requirement I had), the situation is that whilst I need to write my SQL backups over a network share, I also want to push them up to Azure Storage (in a different region) to allow developers quicker downloads/restores. This is why I need to use AZCopy. If I only needed my backups to be written to Azure, I could have used BACKUP TO URL instead.
I use a Shared Access Signature (SAS token) to authorise access to the storage account container. You can create a SAS token for either access key by clicking the Generate SAS and connection string button under Shared access signature in the storage account blade.
For this scenario, my SAS token is limited to write / add / create operations and cannot be used to read or delete data. This limits the scope for security reasons, but you can choose whatever permissions meet your needs.
SAS tokens have an expiry date. I've set mine to expire on 16th August 2019, so I will need to create a new token then and update my AZCopy command if I want the automated copying via a scheduled script to continue.
Storage Account: johnbox
Container: instance1
The basic copy command is simple. An example is:
azcopy cp "<source location>" "<destination location>"
To use a SAS token, we append it to the destination URL:
azcopy cp "C:\temp\johns_backup.bak" "https://johnbox.blob.core.windows.net/instance1?sas-token"
However, the azcopy sync option is more efficient because it won’t force a copy over the network if the source and destination already match. The example below also syncs the full folder, not just a specific file.
azcopy sync "C:\temp" "https://johnbox.blob.core.windows.net/instance1?sas-token"
I chose to schedule this in a SQL Agent job, with the job step placed directly after the step where I back up to my network share. You could also use Windows Task Scheduler if you prefer.
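A sketch of adding that step to an existing job (the job name here is hypothetical; the command is the sync example above):

-- Add a CmdExec step that pushes the backup folder to Azure after the backup step
USE msdb;
EXEC dbo.sp_add_jobstep
    @job_name = N'Nightly backups',   -- hypothetical job name
    @step_name = N'AZCopy sync to Azure',
    @subsystem = N'CmdExec',
    @command = N'azcopy sync "C:\temp" "https://johnbox.blob.core.windows.net/instance1?sas-token"';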
Download AZCopy from https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10. The page also gives instructions on how to run and authenticate AZCopy commands.
If you have time, please have a look at my other Azure posts.