
Search Results for: ssis

How to easily upload, download and migrate SSIS Packages with DTUTIL

9th June 2023 By John McCormack 2 Comments

SSIS and DTUTIL background


SQL Server Integration Services (SSIS) is a powerful tool, but migrating packages between SQL Servers can be a slow and thankless task if you don’t use automation. The single best way to do this is with DTUTIL, a command-line utility provided directly by Microsoft. When it comes to moving thousands of packages, I would only use DTUTIL because it can achieve in minutes what would take days to achieve using point and click.


Migrating a single package

Migrating an individual package directly from one SQL Server to another is also possible with DTUTIL. Here is an example command:

dtutil /SQL "SourceFolder/SourcePackage" /SOURCESERVER "SourceServer" /COPY SQL;"DestinationFolder/DestinationPackage" /DESTSERVER "DestinationServer"
  • Specifies the source package location and server with /SQL "SourceFolder/SourcePackage" and /SOURCESERVER "SourceServer".
  • Copies the package to the destination SQL Server with /COPY SQL;"DestinationFolder/DestinationPackage".
  • Identifies the target SQL Server with /DESTSERVER "DestinationServer".
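
One flag worth knowing about: by default, DTUTIL asks for confirmation before overwriting an existing destination package. As far as I recall, the /QUIET option suppresses these prompts (check dtutil /? on your build), which matters when you are pasting in hundreds of generated commands:

dtutil /SQL "SourceFolder/SourcePackage" /SOURCESERVER "SourceServer" /COPY SQL;"DestinationFolder/DestinationPackage" /DESTSERVER "DestinationServer" /QUIET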


Migrating multiple packages

This process can be scripted similarly for multiple package migrations. The easiest way is to generate the commands using T-SQL, print them to screen, then copy and paste them into a command window.

If you run the command below on the source server, it will list out a command to migrate every package.

DECLARE 
	@sourceserver sysname = N'DBSERVER1'
	,@destserver sysname = N'DBSERVER2'
	,@foldername sysname = N'DATAWAREHOUSE'
	
SELECT 
'DTUTIL /SQL "'+sspf.foldername+'\'+ssp.[name]+'" /SOURCESERVER "'+@sourceserver+'" /COPY SQL;"'+sspf.foldername+'\'+ssp.[name]+'" /DESTSERVER "'+@destserver+'"' as cmd
FROM msdb.dbo.sysssispackages ssp
JOIN msdb.dbo.sysssispackagefolders sspf
ON ssp.folderid = sspf.folderid
-- WHERE sspf.foldername = @foldername -- Uncomment line to only use specific folder
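
For illustration, with the variable values above, each row of output is a ready-to-run command along these lines (LoadCustomer is a hypothetical package name):

DTUTIL /SQL "DATAWAREHOUSE\LoadCustomer" /SOURCESERVER "DBSERVER1" /COPY SQL;"DATAWAREHOUSE\LoadCustomer" /DESTSERVER "DBSERVER2"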


Upload SSIS packages from disk

Uploading an SSIS package from your disk to SQL Server using DTUTIL is quite straightforward. The command line syntax you will use looks like this:

dtutil /FILE c:\mypackage.dtsx /DestServer DBSERVER2 /COPY SQL;"foldername/mypackage"
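
To upload a whole folder of packages, you can generate the commands with T-SQL in the same spirit as the migration script. Here is a minimal sketch using the undocumented xp_dirtree procedure; the path, destination server and SSIS folder name are hypothetical, so adapt them to your environment:

DECLARE @files TABLE (subdirectory nvarchar(512), depth int, [file] int);

-- List the contents of the folder (depth 1, include files)
INSERT INTO @files
EXEC master.sys.xp_dirtree N'C:\packages', 1, 1;

SELECT 'dtutil /FILE "C:\packages\' + subdirectory
     + '" /DestServer DBSERVER2 /COPY SQL;"foldername/'
     + REPLACE(subdirectory, N'.dtsx', N'') + '"' AS cmd
FROM @files
WHERE [file] = 1
AND subdirectory LIKE N'%.dtsx';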


Download SSIS packages to disk (Great for backup)

Downloading an SSIS package to your disk from SQL Server using DTUTIL is also straightforward. The command line syntax you will use looks like this:

dtutil /SQL mypackage /COPY FILE;c:\destPackage.dtsx

Again, to download multiple packages, a script that lists out and provides a command would be the most efficient way to proceed. Try adapting the ‘migrate’ script above.
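
As a starting point, that adaptation might look something like this, generating one download command per package into a local backup folder (the server name and backup path are hypothetical):

DECLARE @sourceserver sysname = N'DBSERVER1'

SELECT
'DTUTIL /SQL "'+sspf.foldername+'\'+ssp.[name]+'" /SOURCESERVER "'+@sourceserver+'" /COPY FILE;"C:\SSISBackups\'+ssp.[name]+'.dtsx"' as cmd
FROM msdb.dbo.sysssispackages ssp
JOIN msdb.dbo.sysssispackagefolders sspf
ON ssp.folderid = sspf.folderid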


Summary

In conclusion, DTUTIL is a highly useful tool for managing SSIS package migration. While it might take a bit of practice to get comfortable with the syntax, its flexibility and efficiency make it worth the effort. Remember to always check your commands before executing them to prevent any unwanted actions. A full list of commands is available from Microsoft Learn.

And that’s it, folks! I don’t have much other SSIS content, only one other post, which is more of a hack. With these simple and effective DTUTIL commands, you can now manage your SSIS packages with ease. Feel free to share this blog post with anyone who might find it helpful, and don’t hesitate to leave a comment below if you have any questions or want to share your own experiences with DTUTIL.

Filed Under: front-page, SSIS

Search and replace inside SSIS package

3rd November 2016 By John McCormack 1 Comment

Search and replace inside SSIS package

I recently had a situation where I needed to search and replace a string inside an SSIS package. The reason was that about 25 Execute SQL Tasks within the package included hard-coded T-SQL pointing at a decommissioned network share. I needed to change the T-SQL in each task to point to a new network share. I know the package should have used a variable for this, but I needed a quick fix.

Once you realise that your SSIS package is just XML, made to look graphical by SSDT or BIDS, it becomes clear that this is a very simple process. It is just like doing a replace-all in Notepad, which most people have done many times. This is much easier and quicker than opening every Execute SQL Task and making each change manually.

Steps

  1. Make a copy of your SSIS package (just in case)
  2. Right click on the .dtsx file and choose Open with -> Notepad
  3. In notepad, press Ctrl-H
    1. Put the old value in Find What
    2. Put the new value in Replace with
    3. Click Replace All
  4. Save your package
  5. Open in SSDT or BIDS to confirm your changes have worked.

 

Filed Under: front-page, Guides Tagged With: bids, SQLNEWBLOGGER, ssdt, ssis

Deleting data from a data warehouse

9th January 2018 By John McCormack Leave a Comment

This post about deleting data from a data warehouse is my first post in the #tsql2sday series. This month’s host is Arun Sirpal. (blog)

T-SQL Tuesday #98 – Your Technical Challenges Conquered

Please write about and share with the world a time when you faced a technical challenge that you overcame.

Deleting data from a data warehouse

I was tasked with deleting a large amount of data from our data warehouse. We had sold off a small part of our company (based in a different country), so all associated data had to be removed from our data warehouse. The data warehouse pooled data from our various OLTP sources and stored it in one place.

Challenges

  • Find out the scope of the data
  • Identify the data to be purged
  • Identify the order for the purge to take place
    • Why was this important?
  • Create a process for deleting data that will not interfere with our daily loads
  • Test, and then deploy with confidence

Find out the scope of the data & Identify the data to be purged
  1. I had to identify which tables held the data. I created a tracking table and inserted the names of all of the tables in our two databases which held customer data.
  2. Looping through all of the tables, I identified which tables had 0 rows where sourceid = 5 and marked these as completed in my tracking table.
  3. The remaining tables containing rows where sourceid = 5 would be the tables where the purge was needed.

Identify the order for the purge to take place

It was not possible or practical to just delete the data based on a list of table names. I had to consider referential integrity and identify a precedence order in which to carry out the deletes, table by table. Many of our tables had foreign key constraints, so SQL Server prevents you from deleting out of order. Some tables had multiple foreign key relationships, and some went down through many levels, which made determining the order of deletions a difficult task.

Explanation:
If you try to delete a row in a primary key table, the delete will fail when the primary key value corresponds to a value in the foreign key constraint of another table. To make this change, you must do your delete of the foreign key data in the foreign key table first, and then delete from the primary key table.

I wrote a stored procedure which reviewed all of the foreign keys in the database and identified the correct order in which tables could be deleted from. Some tables had no foreign key relationships so these could be done in any order. They were the lowest precedence and were done first. The next set of tables were those foreign key tables referenced by a table with a primary key, but did not reference any other tables. These tables were processed next. This process continued, creating a hierarchy which eventually allowed me to identify the correct order in which the tables could be processed.
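
I no longer have that stored procedure to hand, but the core idea can be sketched in a few lines of T-SQL: walk sys.foreign_keys recursively, assigning each table a level based on how many layers of referencing tables sit beneath it. This minimal sketch ignores foreign key cycles, which need special handling:

WITH refs AS (
    -- distinct referencing -> referenced table pairs, ignoring self references
    SELECT DISTINCT
        OBJECT_NAME(fk.parent_object_id)     AS referencing_table,
        OBJECT_NAME(fk.referenced_object_id) AS referenced_table
    FROM sys.foreign_keys fk
    WHERE fk.parent_object_id <> fk.referenced_object_id
),
lvl AS (
    -- level 0: tables no other table references; safe to delete from first
    SELECT t.name AS table_name, 0 AS delete_level
    FROM sys.tables t
    WHERE NOT EXISTS (SELECT 1 FROM refs r WHERE r.referenced_table = t.name)
    UNION ALL
    -- a referenced table must wait until the tables referencing it are cleared
    SELECT r.referenced_table, l.delete_level + 1
    FROM refs r
    JOIN lvl l ON l.table_name = r.referencing_table
)
SELECT table_name, MAX(delete_level) AS delete_precedence
FROM lvl
GROUP BY table_name
ORDER BY delete_precedence;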

Create a process for deleting data that will not interfere with our daily loads

Over the years I have used SSIS for many large administration projects. It is useful where a complex workflow has to be identified and where there are many moving parts. SSIS can do so much more than just ETL. With this in mind, I created an SSIS package which would contain all of the code.

A main outline of the SSIS package

Steps that create objects only do so if the object doesn’t already exist.

  • Create and pre-populate (with table names and expected row counts) a metadata tracking table. As the package was running through, this would allow me to tell at a glance how many rows had been deleted, how many tables had been completed and how far the task had progressed overall. This information was important and also allowed me to provide regular updates to management.
  • Create a stored procedure which would determine the order in which tables should be purged.
  • Run the stored procedure and update the metadata tracking table with the precedence values.
  • Delete the rows. The deletes would be done one table at a time, in optimised batches following the precedence order set out in the metadata tracking table to prevent errors (a sketch of the batching pattern follows this list). The number of rows deleted for each table was updated in the tracking table.
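
The batching pattern itself is simple enough to sketch. This is not the original package code, just a minimal illustration of deleting in batches; the table name, source id and batch size are hypothetical:

DECLARE @batchsize int = 50000;
DECLARE @rows int = 1;

WHILE @rows > 0
BEGIN
    DELETE TOP (@batchsize)
    FROM dbo.FactSales        -- hypothetical table name
    WHERE sourceid = 5;

    SET @rows = @@ROWCOUNT;   -- loop until no matching rows remain
END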

With the package created, I set up a SQL Agent job with a schedule outside of the data load times. I also added a step to ensure the loads were not still running when the job ran.

Test, and then deploy with confidence

With a working process created, I set about testing it at scale to ensure a smooth run by the time the package targeted our production database. Fortunately, we had 4 full-scale dev and QA environments, each with a full set of data. There were slight differences in data volumes due to refresh dates, and some newer tables existed that weren’t in production.

Having so many environments allowed me to focus on batch size to find the most efficient number of rows to delete per run. I set this as a user variable in SSIS, which allowed me to pass in the value via the agent job step. If I felt it needed adjusting, I could just change the job step without amending the package.

Overall Result

The production run completed over a couple of weeks. In that time, several billion rows of data over hundreds of tables were deleted.

Alternative options

As I was confident all related data would be deleted, I could have disabled the Foreign Key constraints and deleted in any order. I decided against this for the following reasons.

  • The full purge would take two weeks. As each of the tables also held data that should remain, and also because we would still be running daily loads, I felt this was unacceptable as it could lead to periods of time where the referential integrity was in doubt.
  • I could forget to re-enable some constraints.
  • Once re-enabled, the foreign key constraints would be untrusted. They would need to be checked to reset the is_not_trusted value to 0 in sys.foreign_keys.

Filed Under: front-page, Guides, T-SQL Tagged With: #tsql2sday, tsqltuesday

Efficient maintenance of SSISDB

7th August 2019 By John McCormack 3 Comments

Maintenance of SSISDB within SQL Server

The SSIS Server Maintenance Job is used for maintenance of SSISDB. It manages the retention of operations records in SSISDB. I noticed it had been turned off by someone last year and it hadn’t run since. As a result, SSISDB had become bloated, and there was only 10MB left on the data drive, meaning the database could no longer auto grow.

Nobody was aware, why would they be? After all, nothing was failing. We didn’t have disk space monitoring enabled so the only time we found out there was a problem was when the disk had filled up.

I made 2 unsuccessful attempts at running the SSIS Server Maintenance Job. However, after several hours of processing and still no available free space in the database, I knew the job wasn’t coping with the sheer number of rows it had to delete. The deletes all happen from the parent table (internal.operations) and then from all child tables using cascading deletes. This approach maintains referential integrity but is not great for performance.

Due to this, I needed a new approach to the maintenance of SSISDB. As these tables hadn’t been maintained for 13 to 14 months, I was asking too much of SQL Server to let me delete everything at once. (Truncates wouldn’t do, because I had to keep the last 2 weeks’ data.)

A bit of investigation showed me that these were the related tables.

  • internal.event_message_context
  • internal.event_messages
  • internal.executable_statistics
  • internal.execution_data_statistics
  • internal.execution_component_phases
  • internal.execution_data_taps
  • internal.execution_parameter_values
  • internal.execution_property_override_values
  • internal.executions
  • internal.operation_messages
  • internal.extended_operation_info
  • internal.operation_os_sys_info
  • internal.validations
  • internal.operation_permissions
  • internal.operations

My approach

  1. Amend the retention period (days) in catalog.properties to 400 (because 14 was unmanageable with > 400 days of history)
  2. Write a delete script, or find a reliable one that does this work, because SSISDB’s native stored procedures couldn’t cope
  3. Ensure SSISDB is in SIMPLE recovery model because it will reduce t-log growth
  4. Run the delete script and see how it performs and how much space is freed up in order that the days for deletions can be optimised
  5. Repeat steps 1-4 (each time lowering retention period (days)) until I achieve my target retention period of 14
  6. Ensure this never happens again 😎 (because it’s no fun getting 300 failure emails an hour)

1. Reduce retention period (days) in catalog.properties to 400 (this allowed me to delete rows based on only 22,000 IDs)

To do this in T-SQL:

EXEC [SSISDB].[catalog].[configure_catalog] @property_name = N'RETENTION_WINDOW', @property_value = 400

To do this in SSMS:

Right click SSISDB from Integration Services Catalog in SQL Server Management Studio. Then, amend Retention Period (days) to a suitable initial value – in my case 400 days.


2. Script out, or find a reliable script that does this work

I struck gold with a superb script from Tim Mitchell which honours the Retention Period (days) value. I decided this was better than writing my own. Please follow the link to review this script along with other useful information (or likewise, get the latest version from Tim’s Github).

3. Ensure SSISDB is in SIMPLE recovery model (as it helps with transaction log)

SELECT DB_NAME(database_id) AS DBName, recovery_model_desc
FROM sys.databases
WHERE DB_NAME(database_id) = 'SSISDB'

If it’s not, I recommend you ALTER the database to SIMPLE recovery in order to minimise logging. It’s not an essential step, but it will save bloating your transaction log.

ALTER DATABASE [SSISDB] SET RECOVERY SIMPLE WITH NO_WAIT

4. Run script (from step 2) and see how it performs and how much space is freed up

SELECT NAME,
size/128/1024.0 AS FileSizeGB,
size/128/1024.0 - CAST(FILEPROPERTY(NAME, 'SpaceUsed') AS INT)/128/1024.0 AS FreeSpaceGB,
(size/128/1024.0 - CAST(FILEPROPERTY(NAME, 'SpaceUsed') AS INT)/128/1024.0) / (size/128/1024.0) * 100 AS PercentFree
FROM SSISDB.sys.database_files
WHERE NAME = 'SSISdata';

5. Repeat steps 1-4 (each time lowering retention period (days)) until you achieve your target retention period

I started small by reducing the retention period (days) to 390. This allowed me to measure the time and impact of removing 10 days of data. I then went down to a retention period of 360 days. I found 14-day decrements performed best, so I kept going until only 14 days of data remained. Following this, I kept the new script in place, scheduled nightly via SQL Agent. There was no need to continue using the SSISDB cleanup stored procedure internal.cleanup_server_retention_window.


6. Ensure this never happens again

They say prevention is better than cure. Here are some ideas on how to implement this and ensure that your maintenance of SSISDB is ongoing:

  • SQL Agent Job to check if the date of the oldest ID is greater than expected, and alert if not (a sketch follows this list)
  • Write a check in PowerShell or even better, consider writing a command for DBATools.
  • Professional monitoring tools may alert on this but I haven’t checked.
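
For the SQL Agent check, something like this minimal sketch could work, run on a schedule with the job set to alert on failure. The 2-day buffer and 14-day window are assumptions to adjust; catalog.operations is the public view over internal.operations:

DECLARE @retention_days int = 14; -- expected retention window
DECLARE @buffer_days    int = 2;  -- allow a little slack before alerting

IF EXISTS (
    SELECT 1
    FROM SSISDB.catalog.operations
    WHERE created_time < DATEADD(DAY, -(@retention_days + @buffer_days), SYSDATETIMEOFFSET())
)
    RAISERROR('SSISDB cleanup appears to have stalled: operations older than the retention window exist.', 16, 1);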

Further reading

  • My other posts which mention SSIS
  • SSISDB best practices

 

Filed Under: front-page, Guides, SQL Server Tagged With: sql, SQL server, ssis, ssisdb

How do I run SQL Server on AWS?

6th November 2018 By John McCormack 3 Comments

How do I run SQL Server on AWS?

Running SQL Server on AWS can be done in 2 ways.

  • Relational Database Service (RDS): AWS’s managed solution where some of the administration (maintenance, backups and patching) is handled for you.
  • EC2: Your very own virtual machine in the cloud. With EC2, you manage SQL Server, just like you would do on-premises. This gives you full control over your SQL instance.

Why not just use Azure?

Azure may be the right choice for you. The purpose of this post is to show that you do have options and that Azure is not the only choice. You should weigh up what you need from your cloud provider and make a choice. Managing a multi-cloud environment can be challenging, so if you work for a company already heavily invested in AWS, it may make more sense to stay with AWS. Even if not, AWS’s SQL Server offering is mature and RDS’s features are improving all the time, so you should consider what you can achieve with AWS.

Differences/Similarities/Pros and Cons

Feature                                   | EC2 | RDS
------------------------------------------|-----|-----
Use SSMS                                  |  ✔  |  ✔
Use database as source or target for SSIS |  ✔  |  ✔
Full control over the instance            |  ✔  |
Automated backups                         |     |  ✔
Automated patching                        |     |  ✔
Always On Availability groups             |  ✔  |  ✔
AWS-managed Multi AZ deployment           |     |  ✔
Log Shipping                              |  ✔  |
Replication                               |  ✔  |
Database mail                             |  ✔  |
Linked Servers                            |  ✔  |

A full list of included and excluded features can be viewed at https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_SQLServer.html 

RDS key facts

  • Limited control
    • The master user name you use when you create a DB instance is a SQL Server Authentication login that is a member of the processadmin, public, and setupadmin fixed server roles. It is not a member of the sysadmin fixed server role.
    • Anyone who creates a db is assigned the db_owner role on that DB.
  • More administration tasks are handled automatically e.g. backups, OS patching and SQL patching
  • Bring Your Own licence is no longer available on RDS
  • License included (covers multi AZ)
    • Standard
    • Web (Must only be used for public and internet-accessible webpages, websites, web applications, and web services.)
    • Enterprise
  • Backups
    • Can still be pushed to S3
    • Native backup and restore using .bak files to/from S3 (see the example after this list)
    • Versions: 2008 R2 – 2017
  • Limited to 30 databases per instance
  • Windows/Mixed authentication
    • http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_SQLServerWinAuth.html
  • Multi AZ is very simple to set up
  • TDE at rest
  • SSL for data in transit
  • No database Mail
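
The native backup and restore to S3 mentioned above is driven by RDS-specific stored procedures in msdb. A rough example follows; the procedure names come from the AWS documentation, the database name and S3 ARN are hypothetical, and the feature must first be enabled via an option group:

-- Back up a database to S3
exec msdb.dbo.rds_backup_database
    @source_db_name = 'MyDatabase',
    @s3_arn_to_backup_to = 'arn:aws:s3:::my-bucket/MyDatabase.bak';

-- Restore a database from S3
exec msdb.dbo.rds_restore_database
    @restore_db_name = 'MyDatabase',
    @s3_arn_to_restore_from = 'arn:aws:s3:::my-bucket/MyDatabase.bak';

-- Track progress of either task
exec msdb.dbo.rds_task_status @db_name = 'MyDatabase';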

EC2 approach

  • Full control of instance
  • Bring your own licence (BYOL) options
  • Licence included (AMI)
    • Standard
    • Enterprise
  • High Availability and Disaster Recovery (HADR)
    • You can use Always On Availability Groups
      • Automatic failover between AZs
      • Synchronous commit
      • Latency about 1ms to 2 ms
    • Manual failover between Regions is an option for increased resiliency

Backups

With EC2, you are responsible for all of your own backups. Backups can be pushed to S3 to reduce costs. I have a full post dedicated to SQL Server backups on EC2. You could also store backups on EBS volumes attached to your instance, however this is much more expensive than storing them on S3.

RDS will handle backups for you. You can restore to a point in time in RDS within the backup window you configure. Transaction logs are taken every 5 minutes, so you may experience up to 5 minutes of data loss. All databases are restored to within 1 second of each other. The downside to this is that transactions spanning multiple databases may not be recovered consistently.

Summary

Features are being added to RDS all the time, however if you need features on the excluded list, you would need to go with an EC2 installation. If your business size doesn’t allow for DBAs or database specialists, RDS may be the best option, as AWS takes care of the installation and maintenance for you.

Resources

  • Microsoft SQL Server on Amazon RDS
  • Importing and Exporting SQL Server Databases
  • Restoring a DB Instance to a Specified Time
  • EC2 SQL Server Backups to Amazon S3

Filed Under: AWS RDS, front-page
