Composing and Decomposing Data in SSIS

02 October 2020
KingswaySoft Team

In a perfect world, integrating data from one application or database system to another should be a straight-forward and simple process. For instance, if you are integrating client information, you should map the client's name in the first system to the name field in the second one with ease. This simple approach should also apply to the client's address, telephone number, email address and any other data collected. In theory, this should all work out well; however, there are situations where it is not always that effortless.

Let us consider this example:

Delimited Data and Tabular Data

Table A has multiple records, and each record belongs to a different set of groups. However, the group assignment for a certain record is comma-separated data values in a single cell representing all the groups a record can be a member of. This format is not compatible with the target integration system; therefore, data transformation is required so that each row can represent only one group assignment as shown in Table B.

A real-life example of this scenario could be a client with multiple phone numbers. The numerous telephone details are stored within a single field on the source application, separated by a comma. As for the destination application or database, each telephone number needs to be in its own field. This makes it challenging to extract the data into a typical tabular form to integrate successfully. 

This blog post will demonstrate how using our latest components within the SSIS Productivity Pack, Composition and Decomposition, users can parse each individual value from a delimited text form with ease and vice versa.

Using Decomposition Component

Continuing with our multiple telephone number example mentioned earlier, here is an SSIS demonstration where each contact could have multiple phone numbers. All the phone numbers for a given contact are included in a single cell, separated by a comma. By enabling Data Viewer, we can examine the input records' format and see how this data is represented.

Default Output Data Viewer of Delimited Data

As shown above, this may not be the standard tabular format we expect to integrate with another database system or application with a defined schema. Using the out-of-the-box components of SSIS, a possible solution for this problem would be to use a SQL script to turn the comma-separated string into individual rows from the Source itself. However, this would not work if the source system is not a database. A second possible solution would be using a Script component to parse this data; however, this will require writing your own C# or VB code, which can be complicated and harder to maintain in integration packages.

As of version 20.1 of our SSIS Productivity Pack, we have added new Composition and Decomposition components, which can be used to work with these types of data formats. The Decomposition component is a data flow transformation component used to split delimited input values into multiple rows. Back to our example, attach the source component to the Decomposition component. Configure the component as follows:

Decomposition Component Editor

First, select the Input Column that will include the multiple data values; in this case, this is the PhoneNumber column. Next, select the format of the column by working with the Source Type option. Choose from the following options:

  • Delimited Text - If the input is in delimited text format.
  • JSON Array - If the input is a JSON array.
  • XML Elements - If the input is XML format.

In this case, the input is a "Delimited Text" and thus, configure the Delimited Text settings based on the format expected. Finally, in the Output Settings section, choose to have the parsed data values in its own new Column, or replace the existing Input Column with the new data values.

Once the Decomposition Component has been configured, enable Data Viewer to examine the format of the output. As you can see, the data has been parsed and split into different rows.

Decomposition Component Data Flow Results

Now each record will represent a contact along with a single Phone Number value. If there is a contact with multiple phone numbers, there will be multiple records for the same contact; each contains a single phone number.

Using Composition Component

The Composition component performs the exact opposite use case of the example discussed earlier. Here the dataset includes repeating rows for the same contact, each with a different phone number. The goal is to group these data values in a single cell based on a given key. So the input data is represented in the following format:

Default Output Data Viewer of Tabular Data

In this case, use our Composition component, a data flow transformation component used to combine/merge multiple rows into one row based on a defined set of keys. Therefore, attach the source component to our Composition component and configure the component as follows:

Composition Component Editor

Select the Input Column that will include the data values to group together. In this scenario, this is the PhoneNumber column. Next, select the column's format by working with the Destination Type option consisting of the same three settings mentioned earlier (Delimited Text, JSON Array, XML Elements).

Select the Key Columns which will be the criteria of which values should be grouped. In this example, it is the ID field. This means that records that share the same value for the ID field will have their PhoneNumber values grouped and separated by a comma.

There are also additional options that can be configured based on the format of the input data. For instance, the "Input Records Are Scrambled" option can be enabled if the input records are not sorted based on the Key Field. Additionally, the Output Settings section can output the combined data values in its own new Column or replace the existing Input Column with the new data values.

With the Composition component configured, enable Data Viewer to examine the output format to show how the data has been grouped in a single row based on a common ID value.

Composition Component Data Flow Results

This way, each record will represent a contact and all the PhoneNumber values that belong to it based on the ID field as a key.

Closing Notes

With our new Composition and Decomposition components, working with delimited or tabular-formatted data is now easier than ever before. Whether looking to extract each individual data value into a typical tabular format from a delimited-text formatted cell (use Decomposition) or group data values together into a single cell where values are separated by a delimiter (use Composition), our SSIS Productivity Pack will provide you with the tools needed to integrate virtually any application or database system together, regardless of how the data is formatted.

Archive

October 2024 1 September 2024 1 August 2024 2 July 2024 1 June 2024 1 May 2024 1 April 2024 2 March 2024 2 February 2024 2 January 2024 2 December 2023 1 November 2023 1 October 2023 2 August 2023 1 July 2023 2 June 2023 1 May 2023 2 April 2023 1 March 2023 1 February 2023 1 January 2023 2 December 2022 1 November 2022 2 October 2022 2 September 2022 2 August 2022 2 July 2022 3 June 2022 2 May 2022 2 April 2022 3 March 2022 2 February 2022 1 January 2022 2 December 2021 1 October 2021 1 September 2021 2 August 2021 2 July 2021 2 June 2021 1 May 2021 1 April 2021 2 March 2021 2 February 2021 2 January 2021 2 December 2020 2 November 2020 4 October 2020 1 September 2020 3 August 2020 2 July 2020 1 June 2020 2 May 2020 1 April 2020 1 March 2020 1 February 2020 1 January 2020 1 December 2019 1 November 2019 1 October 2019 1 May 2019 1 February 2019 1 December 2018 2 November 2018 1 October 2018 4 September 2018 1 August 2018 1 July 2018 1 June 2018 3 April 2018 3 March 2018 3 February 2018 3 January 2018 2 December 2017 1 April 2017 1 March 2017 7 December 2016 1 November 2016 2 October 2016 1 September 2016 4 August 2016 1 June 2016 1 May 2016 3 April 2016 1 August 2015 1 April 2015 10 August 2014 1 July 2014 1 June 2014 2 May 2014 2 February 2014 1 January 2014 2 October 2013 1 September 2013 2 August 2013 2 June 2013 5 May 2013 2 March 2013 1 February 2013 1 January 2013 1 December 2012 2 November 2012 2 September 2012 2 July 2012 1 May 2012 3 April 2012 2 March 2012 2 January 2012 1

Tags