Find and Merge CRM Duplicates With Ease Using SSIS Productivity Pack

04 November 2016
Chen Huang
Data quality is critical to your application system's user adoption. It is important to make sure that the data in your application system is accurate, complete, and consistent across the application. However, identifying and eliminating duplicate data is not a trivial job without a proper tool, it can be an extraordinary effort, and it can be a frustrating process when not done properly. Living in the data integration space, we understand how important it is to keep your data clean. For this particular reason, our team has developed a comprehensive solution that can help work with the situation, in a productive way.

When it comes to data quality, Microsoft Dynamics CRM (or Dynamics 365 for Sales as the latest name) provides some out-of-box tools that can help work with duplicate data. It allows you to set up Duplicate Detection Rules which can help reduce the opportunity of importing duplicate records, but this doesn’t guarantee eliminating the duplicates that are already in the system, and also your CRM user could always choose to ignore those duplicate warning by entering duplicate data through CRM UI. In addition to that, the out-of-box duplicate detection capabilities are limited to certain matching options which makes it very inflexible.

With that being said, we saw a big need for a better duplicate detection tool, so we recently introduced a Duplicate Detector component in our SSIS Productivity Pack, which is a collection of premium SSIS components that help achieve more when using SSIS as your integration platform.

First, let’s have a quick look at the Duplicate Detector component.

What is Duplicate Detector?

Duplicate Detector is an SSIS component which can be used to scan incoming rows from a data source and identify duplicate rows. It comes with 7 different matching types to meet your different duplicate detection needs. (Click here for the detailed matching algorithm of the component)

SSIS Duplicate Detector

You would typically connect the Duplicate Detector component after an SSIS source component which reads data from the source system that you would like to analyze the duplicates. You will notice that the component comes with two outputs, which are Duplicate Rows and Unique Rows.

The Duplicate Rows output has four additional fields which can be used for data quality analysis:

  • GroupID – A unique GroupID value is given to each group of duplicate records.
  • Richness Score - Each row is given a richness score which represents data quality with the value = (number of fields are not NULL )/(total number of fields).  
  • Richness Rank - Each row in a duplicate group is given a rank based on the richness score.
  • Similarity Score – The similarity percentage of the duplicate rows compared to the top-ranked duplicate row in the group.

Find and Merge CRM Duplicates

Finding CRM duplicates and merging them accordingly becomes a simplified process using KingswaySoft Duplicate Detector and CRM Destination Component.

By taking advantage of the data quality analysis feature of our Duplicate Detector component, the top-ranked duplicate record containing the richest information within the group, can be considered the target record that is being merged into.

  • RichnessRank == 1: the target records that is being merged into.
  • RichnessRank!= 1: the subordinate records of the merging.

Then we use Left Join on GroupCode to find the GUID of the target duplicate record and the GUID of the subordinate records.

The last step is to merge the duplicates in CRM by using CRM destination component, the configuration is extremely simple with just a few clicks to map the input columns to the destination fields. 

The data flow would be something like this:

manage dynamics crm duplicates

Sample SSIS Package

A sample SSIS package can be download here.

The sample package was prepared in an SQL Server 2008 R2 format, and it should automatically upgrade if you are using any newer version of SSIS (you might need to do some layout adjustment to make it look better).

Archive

November 2024 3 October 2024 1 September 2024 1 August 2024 2 July 2024 1 June 2024 1 May 2024 1 April 2024 2 March 2024 2 February 2024 2 January 2024 2 December 2023 1 November 2023 1 October 2023 2 August 2023 1 July 2023 2 June 2023 1 May 2023 2 April 2023 1 March 2023 1 February 2023 1 January 2023 2 December 2022 1 November 2022 2 October 2022 2 September 2022 2 August 2022 2 July 2022 3 June 2022 2 May 2022 2 April 2022 3 March 2022 2 February 2022 1 January 2022 2 December 2021 1 October 2021 1 September 2021 2 August 2021 2 July 2021 2 June 2021 1 May 2021 1 April 2021 2 March 2021 2 February 2021 2 January 2021 2 December 2020 2 November 2020 4 October 2020 1 September 2020 3 August 2020 2 July 2020 1 June 2020 2 May 2020 1 April 2020 1 March 2020 1 February 2020 1 January 2020 1 December 2019 1 November 2019 1 October 2019 1 May 2019 1 February 2019 1 December 2018 2 November 2018 1 October 2018 4 September 2018 1 August 2018 1 July 2018 1 June 2018 3 April 2018 3 March 2018 3 February 2018 3 January 2018 2 December 2017 1 April 2017 1 March 2017 7 December 2016 1 November 2016 2 October 2016 1 September 2016 4 August 2016 1 June 2016 1 May 2016 3 April 2016 1 August 2015 1 April 2015 10 August 2014 1 July 2014 1 June 2014 2 May 2014 2 February 2014 1 January 2014 2 October 2013 1 September 2013 2 August 2013 2 June 2013 5 May 2013 2 March 2013 1 February 2013 1 January 2013 1 December 2012 2 November 2012 2 September 2012 2 July 2012 1 May 2012 3 April 2012 2 March 2012 2 January 2012 1

Tags