John McCormack DBA

SQL Server Databases and Cloud


John McCormack's blogs

Running SQL Server on RDS SQLBits presentation

2nd March 2019 By John McCormack

Running SQL Server on RDS

I presented Running SQL Server on RDS at SQLBits. It was a lightning talk, so only 5 minutes long.

You can download my slides from here:

Running SQL Server on RDS

Further interesting links:

My more detailed post on running SQL Server on AWS (RDS vs EC2)

Amazon RDS for SQL Server FAQs

SQL Server Agent Information

Filed Under: AWS RDS, AWS SQL Server, front-page Tagged With: presentation, rds, SQL server, sqlbits

Outstanding batch wouldn’t clear on sys.dm_fts_outstanding_batches

16th February 2019 By John McCormack

Outstanding batch wouldn’t clear on sys.dm_fts_outstanding_batches

I had an issue where an outstanding batch wouldn’t clear on sys.dm_fts_outstanding_batches. This meant the full-text index was not working when queried using the CONTAINSTABLE syntax. I used a series of queries from Glenn Berry’s post to help me identify the problem with my full-text index.

sys.dm_fts_outstanding_batches

I restarted the Full-Text service with no success. This seemed like the natural thing to do and my colleagues said that it had worked in the past when FTS had stalled. Unfortunately, this didn’t help the situation.

I then rebuilt the Full-Text Catalog, again with no success. The rebuild took 5 minutes because it was blocked; it would normally have been much quicker.

As the full-text index was built on an indexed view, my next idea was to rebuild the clustered index of the view. To my surprise, this actually solved the problem and I no longer had any outstanding FTS batches. My query using CONTAINSTABLE returned results successfully and the developers confirmed their application was fixed. In my opinion, this is well worth a try.
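For anyone who wants to try the same sequence, here is a minimal T-SQL sketch. The catalog name (ftCatalog), view name (dbo.vSearchData) and index name (IX_vSearchData) are hypothetical placeholders; substitute your own object names.

```sql
-- Check for stuck full-text population batches
SELECT * FROM sys.dm_fts_outstanding_batches;

-- Rebuilding the full-text catalog did not help in my case,
-- but it is a reasonable earlier step (catalog name is a placeholder)
ALTER FULLTEXT CATALOG ftCatalog REBUILD;

-- Rebuilding the clustered index on the indexed view is what
-- cleared the outstanding batches (object names are placeholders)
ALTER INDEX IX_vSearchData ON dbo.vSearchData REBUILD;
```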

sys.dm_fts_outstanding_batches cleared

The point of sharing this post is the hope that it will help someone else with the same issue. It’s not my greatest ever post and I’m not certain why the index rebuild helped, but it’s been a good lesson that I need to learn more about Full-Text Indexing for SQL Server. Anything else I learn will be added to this blog post.

Any suggestions for good resources in the comments would be much appreciated, as this looks like an area that has not been covered extensively by other bloggers.

Filed Under: front-page, Guides Tagged With: FTS, Full-Text Catalog, Full-Text Index, Full-Text outstanding batches

A month in the life of a DBA contractor

8th February 2019 By John McCormack

I left what I considered to be a great permanent job to become a DBA contractor in Glasgow. It was one where I could pick some of my own projects, work with new and interesting technology, and work in a team which collaborated well, liked each other and was good at sharing knowledge.

My reasons for moving on were two-fold. First of all, I wanted some career progression and it unfortunately just wasn’t available in my permanent job. Secondly, I wanted to work more with SQL Server and Azure. My last job started out as a SQL DBA but, as a lot of people in this line of work will know, the duties have evolved considerably over time. This is mostly due to large-scale cloud adoption. In my case, my workplace started heavily using AWS along with other RDBMSs and data platforms (RDS MySQL & Aurora, Elasticsearch, Elastic MapReduce (EMR), Glue and Athena). That gives you a lot to learn. Don’t get me wrong, I threw myself into learning about cloud solutions and I loved working with AWS (hopefully I will again), but as time went on, I was worried I would start to lose some of my SQL Server knowledge.

So what has the first month been like?

I have been working exclusively on the Microsoft Data Platform. At my first client, I’ve worked with SQL Server on-premises, SQL Server running on Azure VMs, Azure SQL DB, Azure Analysis Services and Azure SQL Data Warehouse.

On-Premises

The on-prem SQL Servers mostly host the databases of legacy applications and 3rd party vendor products. Some of the versions are 'rather old', although these instances actually give the business the fewest headaches. I’ve worked on decommissioning unused databases that were still online, server-side traces and native backups.

Azure VMs running SQL Server (IaaS)

These host the databases that have the most active development work. I’ve looked at performance issues and SQL Server configurations to help improve performance. I’ve shared best practice with the team in terms of tempdb and helped to reduce the volume of unnecessary emails from servers. (You know the ones: alerts someone set up years ago that never prompt any action from the DBAs.)
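The post doesn’t include the exact tempdb queries used, but as a hedged illustration, a common starting point for a tempdb review is to confirm it has multiple, equally sized data files:

```sql
-- Review tempdb file layout: multiple equally sized data files
-- help reduce allocation contention (a common best practice).
-- size is reported in 8 KB pages, hence the conversion to MB.
SELECT name,
       type_desc,
       size * 8 / 1024 AS size_mb,
       CASE max_size
           WHEN -1 THEN 'unlimited'
           ELSE CAST(max_size * 8 / 1024 AS varchar(20))
       END AS max_size_mb
FROM tempdb.sys.database_files;
```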

Azure Analysis Services (PaaS)

This was fun. I only did some basic configuration and permissions but as I’ve never used Analysis Services much over the years, it was cool to see what was involved.

Azure SQL Data Warehouse (PaaS)

It was great to get my hands on tech like this. I spent time modernising their manual point-and-click refresh process with PowerShell. This was a bit of a learning curve but extremely rewarding, and it will make future refreshes much easier. (I’ll share the code once I’m confident it’s good; I’d still like a few more run-throughs before doing this.) Next month, I’d like to implement a self-service process for the developers that allows them to do their own refreshes, freeing up DBA time to work on more critical items.

Azure SQL DB (PaaS)

So far, I haven’t had to do too much with Azure SQL DB on this project. I’ve listed out the instances and instance types using PowerShell. My plan is to review whether these are right-sized or whether efficiencies can be made, and I’ll also make sure the alerting is set up correctly.

Plans for next month

I’d like to offer some in-depth server health reviews and work with the developers to help them make their code run faster. There is a lot of blocking due to long-running stored procedures and sub-optimal code. Improving a few of the big hitters will make the developers much happier and will ease the strain on some of our servers – at least that’s the idea.
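A quick way to see that blocking as it happens (a minimal sketch; the post doesn’t specify the exact tooling used) is to query sys.dm_exec_requests for blocked sessions:

```sql
-- Show currently blocked requests and what is blocking them
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time,          -- milliseconds spent waiting so far
       t.text AS running_sql
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;
```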

Further Reading

  • Azure SQL Data Warehouse
  • PowerShell cmdlets and REST APIs for SQL Data Warehouse

Filed Under: Azure, Azure SQL Analysis Services, Azure SQL Data Warehouse, Azure SQL DB, front-page, Personal

How do I run SQL Server on AWS?

6th November 2018 By John McCormack

How do I run SQL Server on AWS?

Running SQL Server on AWS can be done in two ways.

  • Relational Database Service (RDS): AWS’s managed solution, where some of the administration (maintenance, backups and patching) is handled for you.
  • EC2: Your very own virtual machine in the cloud. With EC2, you manage SQL Server just like you would on-premises. This gives you full control over your SQL instance.

Why not just use Azure?

Azure may be the right choice for you. The purpose of this post is to show that you do have options and that Azure is not the only choice. Really, you should weigh up what you need from your cloud provider and make a choice. Managing a multi-cloud environment can be challenging. If you are working for a company already heavily invested in AWS, it may make more sense to stay with AWS. Even if not, AWS’s SQL Server offering is mature and RDS’s features are improving all the time, so you should consider what you can achieve with AWS.

Differences/Similarities/Pros and Cons

Feature                                      EC2    RDS
Use SSMS                                      ✔      ✔
Use database as source or target for SSIS    ✔      ✔
Full control over the instance                ✔
Automated backups                                    ✔
Automated patching                                   ✔
Always On Availability Groups                 ✔      ✔
AWS-managed Multi-AZ deployment                      ✔
Log Shipping                                  ✔
Replication                                   ✔
Database Mail                                 ✔
Linked Servers                                ✔

A full list of included and excluded features can be viewed at https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_SQLServer.html 

RDS key facts

  • Limited control
    • The master user name you use when you create a DB instance is a SQL Server Authentication login that is a member of the processadmin, public, and setupadmin fixed server roles. It is not a member of the sysadmin fixed server role.
    • Anyone who creates a database is assigned the db_owner role on that database.
  • More administration tasks are handled automatically, e.g. backups, OS patching and SQL patching
  • Bring Your Own Licence is no longer available on RDS
  • Licence included (covers Multi-AZ)
    • Standard
    • Web (must only be used for public and internet-accessible webpages, websites, web applications, and web services)
    • Enterprise
  • Backups
    • Backups can still be pushed to S3
    • Native backup and restore using .bak files to/from S3
    • Versions: 2008 R2 – 2017
  • Limited to 30 databases per instance
  • Windows/Mixed authentication
    • http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_SQLServerWinAuth.html
  • Multi-AZ is very simple to set up
  • TDE for data at rest
  • SSL for data in transit
  • No Database Mail
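Native backup and restore on RDS is driven by stored procedures in msdb rather than the usual BACKUP/RESTORE statements. A minimal sketch, assuming a hypothetical database name and S3 bucket (your option group must have the SQLSERVER_BACKUP_RESTORE option enabled):

```sql
-- Back up a database from RDS to S3
exec msdb.dbo.rds_backup_database
    @source_db_name = 'MyDatabase',
    @s3_arn_to_backup_to = 'arn:aws:s3:::my-backup-bucket/MyDatabase.bak';

-- Check progress of the backup/restore task
exec msdb.dbo.rds_task_status @db_name = 'MyDatabase';

-- Restore a .bak file from S3 into RDS
exec msdb.dbo.rds_restore_database
    @restore_db_name = 'MyDatabase',
    @s3_arn_to_restore_from = 'arn:aws:s3:::my-backup-bucket/MyDatabase.bak';
```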

EC2 approach

  • Full control of instance
  • Bring your own licence (BYOL) options
  • Licence included (AMI)
    • Standard
    • Enterprise
  • High Availability / Disaster Recovery (HADR)
    • You can use Always On Availability Groups
      • Automatic failover between AZs
      • Synchronous commit
      • Latency of about 1 ms to 2 ms
    • Manual failover between Regions is an option for increased resiliency

Backups

On EC2, you are responsible for all of your own backups. Backups can be pushed to S3 to reduce costs; I have a full post dedicated to SQL Server backups on EC2. You could also store backups on EBS volumes attached to your instance, however this is much more expensive than storing them on S3.

RDS will handle backups for you. You can restore to a point in time in RDS within the backup retention period you configure. Transaction log backups are taken every 5 minutes, so you may experience up to 5 minutes of data loss. All databases are restored to within 1 second of each other. The downside to this is that transactions spanning multiple databases may not be recovered consistently.

Summary

Features are being added to RDS all the time, however if you need features on the excluded list, you would need to go with an EC2 installation. On the other hand, if your business doesn’t have DBAs or database specialists, RDS may be the best option as AWS takes care of the installation and maintenance for you.

Resources

  • Microsoft SQL Server on Amazon RDS
  • Importing and Exporting SQL Server Databases
  • Restoring a DB Instance to a Specified Time
  • EC2 SQL Server Backups to Amazon S3

Filed Under: AWS RDS, front-page

Glasgow Super Meetup – AWS Athena Presentation

26th October 2018 By John McCormack

The Glasgow Super Meetup was a joint event between Glasgow Azure User Group, Glasgow SQL User Group and Scottish PowerShell & DevOps User Group. I did an AWS Athena Presentation to the group.

Speaking about AWS Athena at the Glasgow Super Meetup might seem like an odd choice, since most attendees use Azure heavily or are more interested in SQL Server. However, I was pleasantly surprised by the interest people took in the subject matter. It was only a lightning talk, so there wasn’t time to answer questions during the session, but I was asked a number of questions during the break by attendees.

I showed how tables can be archived out of the database and into S3 at a fraction of the price, yet the data can still be queried if needed using Athena. I stressed that Athena isn’t intended as a replacement for an RDBMS, and as such queries will be slower than SQL Server. However, it is much cheaper to store large amounts of data in flat files in object storage (such as S3) than in the expensive block storage used with databases. So if the use case fits, such as infrequently accessed archive data, it is something to consider. I’ve uploaded my slides and also linked to a recording of the event. If you want to try the code, you’ll find it below.

Slides | Recording

Demo

Description

As a proof of concept, I wanted to export the data from the Sales.SalesOrderHeader table in AdventureWorks2012 to flat files using BCP. The data is partitioned into unique days using the OrderDate column, exported to the local file system and then uploaded to Amazon S3. The next steps are creating a table in Athena, querying it to review the data, and validating that the correct data has been uploaded.

Code

  1. Run a SELECT query with dynamic SQL to generate the PowerShell and BCP commands. (Run the query, then select/copy the full column and paste into PowerShell.)
    • SELECT DISTINCT
      OrderDate,
      'New-Item -ItemType directory -Path C:\Users\jmccorma\Documents\SQL_to_S3_Demo\Output_Files\year='+CONVERT(varchar(4), OrderDate, 102)+'\month='+CONVERT(varchar(2), OrderDate, 101)+'\day='+CONVERT(varchar(2), OrderDate, 103)+' -ErrorAction SilentlyContinue' as PoSH_command,
      'bcp "SELECT SalesOrderID, RevisionNumber, OrderDate, DueDate, ShipDate, Status, OnlineOrderFlag, SalesOrderNumber, PurchaseOrderNumber, AccountNumber, CustomerID, SalesPersonID, TerritoryID, BillToAddressID, ShipToAddressID, ShipMethodID, CreditCardID, CreditCardApprovalCode, CurrencyRateID, SubTotal, TaxAmt, Freight, TotalDue, Comment, rowguid, ModifiedDate FROM [AdventureWorks2012].[Sales].[SalesOrderHeader] WHERE OrderDate = '''+convert(varchar, OrderDate, 23)+'''" queryout "c:\users\jmccorma\Documents\SQL_to_S3_Demo\Output_Files\year='+CONVERT(varchar(4), OrderDate, 102)+'\month='+CONVERT(varchar(2), OrderDate, 101)+'\day='+CONVERT(varchar(2), OrderDate, 103)+'\SalesOrderHeader.tsv" -c -t\t -r\n -T -S localhost\SQLEXPRESS' as bcp_command
      FROM [AdventureWorks2012].[Sales].[SalesOrderHeader]
  2. Highlight the PoSH_command column, copy it and paste it into a PowerShell window
  3. Highlight the bcp_command column, copy it and paste it into a PowerShell or command window
  4. Upload from the local file system to AWS S3. You must have an S3 bucket created for this and you must have configured an IAM user in AWS to do this programmatically. You can upload manually using the AWS console if you prefer.
    • aws s3 sync C:\SQL_to_S3_Demo\Output_Files s3://athena-demo-usergroup/ (change to your local file location and your S3 bucket)
  5. Create database and table in Athena (copy code into AWS console) and load partitions
    • CREATE DATABASE adventureworks2012;
    • -- Athena table created by John McCormack for Glasgow User Group
      CREATE EXTERNAL TABLE `SalesOrderHeader`(
      `SalesOrderID` INT,
      `RevisionNumber` TINYINT,
      `OrderDate` TIMESTAMP,
      `DueDate` TIMESTAMP,
      `ShipDate` TIMESTAMP,
      `Status` TINYINT,
      `OnlineOrderFlag` BOOLEAN,
      `SalesOrderNumber` STRING,
      `PurchaseOrderNumber` STRING,
      `AccountNumber` STRING,
      `CustomerID` INT,
      `SalesPersonID` INT,
      `TerritoryID` INT,
      `BillToAddressID` INT,
      `ShipToAddressID` INT,
      `ShipMethodID` INT,
      `CreditCardID` INT,
      `CreditCardApprovalCode` STRING,
      `CurrencyRateID` INT,
      `SubTotal` DECIMAL(12,4),
      `TaxAmt` DECIMAL(12,4),
      `Freight` DECIMAL(12,4),
      `TotalDue` DECIMAL(12,4),
      `Comment` STRING,
      `rowguid` STRING,
      `ModifiedDate` TIMESTAMP
      )
      PARTITIONED BY (
      `year` string,
      `month` string,
      `day` string)
      ROW FORMAT DELIMITED
      FIELDS TERMINATED BY '\t'
      ESCAPED BY '\\'
      LINES TERMINATED BY '\n'
      LOCATION
      's3://athena-demo-usergroup/'
      TBLPROPERTIES (
      'has_encrypted_data'='false',
      'skip.header.line.count'='0')
    • MSCK REPAIR TABLE salesorderheader;
  6. Run these queries in SSMS and Athena to verify the data is the same
    • This performs a row count and checks the sum of one particular column (territoryid). This is a fairly rudimentary check and not guaranteed to catch every difference, but it is a simple way of gaining a degree of confidence in the exported data.
    • -- Validate Athena data is correct
      -- Athena
      SELECT COUNT(*) as row_count, SUM(territoryid) as column_sum FROM "adventureworks2012"."salesorderheader"
      WHERE year='2014'
      AND month = '01'
      AND day = '23';

      -- SQL Server
      SELECT COUNT(*) as row_count, SUM(territoryid) as column_sum FROM adventureworks2012.sales.salesorderheader
      WHERE OrderDate = '2014-01-23 00:00:00.000'
  7. Now that it is uploaded, you can query it any way you like in Athena. It is worth noting that partitioning improves the performance of the query and makes it cheaper, because less data is scanned. If you partition the data, you should use the partition keys in your query, otherwise it will scan all of the data. Note the difference between the two queries below.
    • -- Not using partition (12 seconds - scanned 7.53MB)
      SELECT * FROM "adventureworks2012"."salesorderheader"
      WHERE OrderDate = CAST('2014-01-23 00:00:00.000' as TIMESTAMP);

      -- Using partition (1.8 seconds - scanned 15.55KB - 1/6 of the duration and 1/495 of the cost)
      SELECT * FROM "adventureworks2012"."salesorderheader"
      WHERE year='2014'
      AND month = '01'
      AND day = '23';

Further resources:

  • https://aws.amazon.com/athena/ 
  • https://aws.amazon.com/s3/

Filed Under: AWS, AWS Athena, front-page


John McCormack · Copyright © 2025

 
