Often when developing ETL packages there are times when you need to input a value to your target not found within your source system. When this occurs lookups are used, the general idea of the lookup is to retrieve the value you need by matching one or more fields from your source system to the corresponding field(s) in another system (lookup table). When a match is found, the lookup will pull the value you need from the lookup table.
As a basic example, a source system has FirstName, LastName, and Email however the target system requires the CustomerID field to be mapped. Since my source system does not have the CustomerID value, we can perform a lookup to find that value. To perform a lookup we pull a table we know has the CustomerID field we need to use at the lookup table, we then match a field we do have in our source to the corresponding field in our lookup table. For this example, based on the data we have available we would likely want to perform this matching on the email field. The lookup will use the email in the source and match it to a row with the same email in the lookup table. Once it finds a match, it will pull the CustomerID from that row so we have it available in our pipeline.
This is a very straight forward example however lookups can quickly become much more elaborate. It is not uncommon to see several lookups within one data flow in order to accommodate various conditions that need to be met. This leads to a rather complex data flow that can be difficult to set-up properly and maintain effectively.
At KingswaySoft we want to make a developer's job easier, this is why we added a lookup feature to our Premium ADO.NET Destination component in our most recent release. The lookup feature helps simplify many of these complexities and offers significant advantages by easily accommodating multiple lookup conditions as well as the ability to map mismatched data types. Instead of one or more individual lookup components, Premium ADO.NET Destination offers a built-in lookup functionality to perform multiple lookups directly within the target destination component.
Let’s take the same simple use case as an example. Here we are retrieving data from a source and inserting it to a database, we want to find the CustomerID so we perform a lookup.
Using a lookup component our data flow might look something like below:
In this data flow we’re retrieving the CustomerID from our lookup table when the email from the source matches the email from our lookup table. However our source includes both new and updated rows so there will be cases when no match is found. To handle this we need to configure an additional output for when no match is found.
We now have 2 outputs we need to configure; the match output will have rows that need to be updated in our database, the no match output has new rows that need to be created.
For the no match output we can perform a simple Insert using one of the standard database destination components (OLE DB, ADO.NET, or ODBC). However it becomes a bit trickier for the update records, the OLE DB Command component will need to be used and a SQL statement will needed to be written in order to perform an update.
This is just a very basic lookup example which requires a couple of components and a number of steps that would increase exponentially as more conditions are introduced. Using Premium ADO.NET Destination and with built-in lookup for the same example, we need only 1 component and with a couple simple steps requiring no coding. Our data flow will look like the following:
In this example, the lookup and create & update can all be handled within a single Premium ADO.NET Destination. Let’s take a closer look:
On the first page we make use of the Upsert write action which will allow us to both update existing records and create new records based on a matching criteria we specify, this is all handled natively by the component without us needing to generate a query. Configuring the upsert matching criteria is done on our columns page by selecting a key field. Our column mappings and lookups will also be configured on this page as show in the next screenshot.
Here we have email specified as our key field to perform our Upsert matching on. This means prior to writing a row, the component will first look for that email in our target, if it finds a match it will update that row, if it does not find a match it will create a new row.
You can also see which fields we’re performing lookups on. When performing the mappings, we have the option of selecting <lookup> instead of an input field item. When <lookup> is selected, we can then launch the Lookup Editor for that column.
When configuring the lookup the editor will allow you to select any table from the currently selected connection manager to perform the lookup. You will then select the field that should be retrieved and written to the target once the match is found. In this example we need to populate the ParentCustomerId field; to do so we are returning CustomerID from our lookup tale (LeadBase) and using the email field as the lookup condition to perform the matching. When the component finds a match between rows it will retrieve the CustomerID from our lookup table and pass it through to the ParentCustomerID field when writing to our target.
To find a match you can enter as many conditions as you need, you can perform the lookup based on one lookup field or multiple with support for AND as well as OR conditions. The lookup feature also supports various operators and the ability to match based on a field, SSIS variable, or a static value, as well as the option to provide a default value if no match is found. This allows the component to easily handle some complex lookup requirements.
We can also configure additional lookup fields within this same component. This means instead of a data flow with several lookups, joins, and destinations which can become time consuming to develop and maintain; we can drastically simplify this with Premium ADO.NET Destination:
Here we have all the complexities of the original data flow with multiple lookups and conditions easily in one component. This makes it much easier to develop initially and maintain later on as a quick look will tell us all the lookups we’re making and we can make modifications to a lookup without concern of disrupting the downstream pipeline.
In addition to offering lookup and upsert writing to drastically simplify SSIS data flows and development, Premium ADO.NET offers many other features to take advantages of including:
- Multiple write actions: Insert, Update, Upsert, Delete, Full Sync, Custom Command
- Greater Metadata handling: including support for mapping and performing lookup on different data types and the ability to refresh metadata instead of having to rebuild the component when there are metadata changes in the system.
- Lookup supports
- Multiple lookups, conditions, operators.
- Ability to match based on column field, variable, or static value.
- Mapping mismatched data types.
- Ability to provide a default value if no match is found
- Duplicate handling options
- Support for Bulk writing
- Supports SQL, ODBC, OLE DB, Oracle Client, ODP.NET Managed & Unmanaged, and MySQL data providers.
These are just some of the features included within the component, you can refer to Premium ADO.NET Destination documentation for further details on the features and how to use the component.
We hope you find the lookup feature within the destination useful. We have similar functionality available within our Microsoft Dynamics 365 CE/CRM/CDS Destination, Salesforce Destination, and SharePoint Destination components. For other cases when you require lookup we also offer a Premium Lookup component which will allow you to use any data input as the lookup source, supports retrieving multiple fields from the lookup table, and includes many unique matching options.
Please feel give Premium ADO.NET Destination a try and let us know what you are able to achieve with it!