Incremental Loads in SSIS are often used to keep data between two systems in sync with one another. They are used in cases when source data is being loaded into the destination on a repeating basis, such as every night or throughout the day.
Typical incremental load processes within SSIS involve lookups to find changes and staging tables in order to update a database due to limitations with OLE DB/ADO.NET Destination components. This limits the data we are able to work with and can take more time to develop.
At KingswaySoft, many of our SSIS components include features that enable developers to easily develop performant incremental load solutions within just a couple of steps. In this post we will be looking at how this can be achieved using our Premium ADO.NET components within a single source to destination data flow without the need for any lookups or staging tables.
Traditional SSIS Incremental Load vs Upsert method
Simple data flow using SSIS Productivity Pack and Upsert without lookups, staging tables, or coding.
Our Premium ADO.NET components offer greater metadata handling and flexibility when working with relational databases. The destination component also includes the ability to Insert, Update, Delete, Upsert, or to insert a custom command to write data to a database.
In this post we will be using Upsert to assist with incremental load in SSIS. Upsert is a combination of Update and Insert. When writing to your target system with the Upsert action, the component will check if a record already exists in the system based on the matching field you specified. If it finds a match, it will update that record, if it does not, it will create a new record.
What this means for your incremental load development is less time developing your package and greater performance. You no longer need to perform a lookup and conditional split before the destination, you have the option of going directly from the source to the destination and the Upsert action is completely configured in 2 simple clicks.
Let’s take a closer look at the entire process for easy incremental loads in SSIS.
First we will create two connection managers (source & destination) using the out of box ADO.NET connection manager and selecting the database we wish to retrieve from.
Next, we will create a data flow task and enter into it. For this example we will drag out the Premium ADO.NET Source to pull from our source system though any other source you need to work with would also work.
From here we will configure our component, connecting to the database and selecting the table we wish to read from. Notice a SQL Query is already generated for us based on the columns stored in that table. The table we are using in this example is tracking changes using the field labeled ModifiedOn, this is a datetime field the database is configured to update when a row is added or modified. We can use this field to filter out any data that has not been updated since the last pull which is being stored in an SSIS variable. This will greatly help with performance when we write to our target system later on.
Now that we have our source configured we can connect it directly to the Premium ADO.NET Destination component.
In configuring our destination component we will select the 'Target Database' connection manager we established earlier. Next we will select Upsert as our write action. As mentioned this action a combination of Insert and Update which is what is allowing us to skip the lookup & conditional split steps traditionally used in incremental data loads in SSIS. From there we can select the appropriate table or create a new one. We will also select 'Use Bulk' to speed up performance (note as of the writing of this blog post, this option is only supported when working with SQL or Oracle databases).
Over on the Columns Page, you will notice any fields from the source with the same name as the destination have already been mapped. We can then use the drop-downs to complete any of the additional mappings. Next we need to select our key field. The key field is what the component will use to match records to check if the record exists in the target system, it will need to find an exact match. You can select 1 or multiple fields for this case. In our example we have specified the ContactID to be our key field.
We can then complete my component configuration and execute our task.
The majority of our other destination components also support Upset capability meaning this process can be replicated for performing incremental loads across many different applications. Some components may also include additional features to further enhance performance with incremental loads. Likewise, many of our other source components also support some form of filtering to assist with lowering the number of records we are working with to only those that have been changed.
This is a very simple and straight forward incremental load using SSIS and assumes records are not being removed from either database. For handling deleted records and other more advanced requirements there are many other tricks you can use using the SSIS Productivity Pack, we will cover some of them our next blog post.
This solution requires SSIS Productivity Pack, you can click here to download the free developer's license to get started.