Using the Data Anonymizer Component

The Data Anonymizer component is a transformation component used to mask data. Select the columns to mask and choose an anonymization type for each. The values in these columns will be replaced with randomly generated data based on the anonymization type selected.

Columns Page

Select a column in the data grid view and configure its anonymization settings in the property grid to the right.

Data Anonymizer

Anonymizer Properties

Anonymizer properties for the generated field. These values are configurable.

  • Anonymization Type: The type of data to generate.

    The following Anonymization Types are available for selection. Some Anonymization Types have additional properties that can be configured:

    <Ignore>

    The existing value of the column will remain.

    <Null>

    Every record will be NULL.

    <Variable>

    A variable supplies the value. Specify it in the “Variable name” field under Variable Properties.

    Address Line 1

    (since v9.0)

    This generates a street address. E.g. 123 Main Street. There is one additional parameter:

    • Include Address Line 2: Enabling this will generate more precise addresses. E.g. 123 Main Street Unit 04

    Address Line 2

    (since v9.0)

    This generates street address line 2 information. E.g. Unit 32, PO Box 4560, etc.

    Bool

    This generates a Boolean value. E.g. True or False.

    City

    This generates a city name. E.g. Toronto, New York, etc. Additional parameters include:

    • Country (since v9.0): Select a country from the drop-down. The city generated will be in the selected country. Select the <Random> option to generate a city from any country.
    • State / Province (since v9.0): After selecting a country, select a state/province from the drop-down. The city generated will be in the selected state/province. Select the <Random> option to generate a city from any state/province.
    Company Catch Phrase

    This generates a company catch phrase.

    Company Name

    This generates a company name.

    Country

    This generates a country name. E.g. United States, Canada, etc.

    Country (ISO Alpha 2)

    (since v9.0)

    This generates an ISO-3166 Alpha-2 code. E.g. US, CA, etc.

    Country (ISO Alpha 3)

    (since v9.0)

    This generates an ISO-3166 Alpha-3 code. E.g. USA, CAN, etc.

    Country (ISO Numeric)

    (since v9.0)

    This generates an ISO-3166 Numeric code. E.g. 840, 124, etc.

    Credit Card CVV

    (since v9.0)

    This generates a credit card CVV. E.g. 100, 435, etc. Additional parameters include:

    • Credit Card Type: Select from a list of supported credit card types to auto-fill the Format parameter. Select the <Random> option and leave the Format parameter blank to generate any CVV.
    • Format: Specify a format to use when generating the CVV. This will convert all '#' to random numbers.

    Credit Card Number

    (since v9.0)

    This generates a credit card number. E.g. 3962 650752 6504. There is one additional parameter:

    • Credit Card Type: Select from a list of supported credit card types. The number generated will respect the credit card type's number range and number format. Select the <Random> option to generate a credit card number from any credit card type.

    Credit Card Type

    (since v9.0)

    This generates a random credit card type. E.g. American Express, Visa, etc.

    Custom

    This option allows you to specify your own anonymous values. Additional parameters include:

    • List Of Values: Specify the available anonymous values.
    • Delimiter: Specify the delimiter for your list.
    • Spawn Order: Specify the order of the anonymous values - Random or Sequential.
    Date Time

    This generates a date time value. Additional parameters include:

    • Minimum Year: Specify the minimum year for the generated date time value.
    • Maximum Year: Specify the maximum year for the generated date time value.
    Domain Name

    This generates a domain name. E.g. emard.com, armstrong.co.uk, etc.

    Email (Business)

    (since v9.0)

    This generates a business email address. There is one additional parameter:

    • Company: Specify a static company name for the generated email address value.

    Email (Personal)

    (since v9.0)

    This generates a personal (free) email address. E.g. john.doe@yahoo.com, emily.smith@gmail.com, etc. There is one additional parameter:

    • Domain: Specify a static domain for the generated email address value.

    Facebook URL

    (since v9.0)

    This generates a fake Facebook profile URL. E.g. https://facebook.com/john.doe91, https://facebook.com/emily.smith48, etc.

    File Content

    Use this anonymization type to randomly select files from the specified directory. Additional parameters include:

    • Path To Parent Directory: Specify the path of the parent directory which contains sample files.
    • File Selector: Specify wildcard characters to select files.
    • Include Subdirectories: Specify whether to read files under sub-folders.
    First Name

    This generates a first name. E.g. John, Emily, etc. There is one additional parameter:

    • Gender (since v9.0): Specify a gender from the drop-down to limit the names that can be generated to the selected gender. Select the <Random> option to generate any name regardless of gender.

    Full Name

    This generates a full name. E.g. John Doe, Emily Smith, etc. There is one additional parameter:

    • Gender (since v9.0): Specify a gender from the drop-down to limit the names that can be generated to the selected gender. Select the <Random> option to generate any name regardless of gender.
    GUID

    This generates a GUID value. E.g. 443a9a58-9afc-aec0-7570-fbaf61f49553, 70a0754f-dbe8-20a9-4e03-9c17d05d16d3, etc.

    Gender

    (since v9.0)

    This generates a gender. E.g. Male or Female.

    IPv4 Address

    This generates an IPv4 address. E.g. 112.2.191.50, etc.

    IPv6 Address

    This generates an IPv6 address. E.g. a84:902f:e8ab:e255:e46c:182b:7a27:2bee, etc.

    Identification Number

    (since v9.0)

    This generates an identification number using the Luhn algorithm. E.g. 748-81-8416, 482-31-7146, etc. Additional parameters include:

    • Type: Select one of the preset identification number types to auto-fill the Format parameter. Select the <Custom> option to specify your own Format.
    • Format: Specify a format to use when generating the identification number. This will convert all '#' to random numbers.
    Incremental Value

    This outputs numbers incrementally. Additional parameters include:

    • Starting Value: Specify the starting number.
    • Incremental Value: Specify the value added to the last generated number.
    Last Name

    This generates a last name. E.g. Smith, Kiehn, etc.

    Name Prefix

    (since v10.0)

    This generates a name prefix. E.g. Mr, Miss, etc.

    Name Suffix

    (since v10.0)

    This generates a name suffix. E.g. PhD, Sr, etc.

    Number

    This generates a decimal number. The number of decimal places is represented by the Decimal Places parameter. Additional parameters include:

    • Minimum Value: Specify the minimum value of the number generation range.
    • Maximum Value: Specify the maximum value of the number generation range.
    • Decimal Places: Specify the number of decimal places. If decimal places are not desired, set this option to 0.
    Paragraph

    This generates a paragraph of fake/meaningless text.

    Random String

    This generates a random string. Additional parameters include:

    • Valid Characters: Specify the valid characters.
    • Minimum Length: Specify the minimum length of the random string.
    • Maximum Length: Specify the maximum length of the random string.

    Regex

    (since v9.0)

    This generates a string based on a regular expression. For instance, the regular expression [A-Z]{3}-[0-9]{3} will generate values like YMV-718, FOK-151, etc. There is one additional parameter:

    • Pattern: The regular expression to generate a string from.
    Row Index

    This generates a row index. Additional parameters include:

    • Starting Value: Specify the starting row index.
    • Incremental Value: Specify the value added to the last row index.
    Sentence

    This generates a sentence of fake/meaningless text.

    State/Province

    (since v9.0)

    This generates a state/province. E.g. Ontario, Texas, etc. There is one additional parameter:

    • Country: Select a country from the drop-down. The state/province generated will be in the selected country. Select the <Random> option to generate a state/province from any country.

    State/Province Abbreviation

    (since v9.0)

    This generates a state/province abbreviation. E.g. ON, TX, etc. There is one additional parameter:

    • Country: Select a country from the drop-down. The state/province abbreviation generated will be in the selected country. Select the <Random> option to generate a state/province abbreviation from any country.

    Static Value

    (since v9.0)

    Every record will have a static value. There is one additional parameter:

    • Value: Specify the static value to use.

    Street Name

    (since v9.0)

    This generates a street name. E.g. Main Street, Mills Drive, etc.

    Twitter URL

    (since v9.0)

    This generates a fake Twitter profile URL. E.g. https://twitter.com/john_doe, https://twitter.com/emily_smith, etc.

    URL

    This generates a URL. E.g. http://www.walker.info, http://www.braun.com, etc.

    Zip/Postal Code

    (since v9.0)

    This generates a zip/postal code. E.g. 66582, A1E 4G7, etc. Additional parameters include:

    • Country: Select a country from the drop-down to auto-fill the Format parameter. Select the <Random> option and leave the Format parameter blank to generate any country's zip/postal code. Note that some countries do not have a zip/postal code so doing this will result in some rows having no value.
    • Format: Specify a format to use when generating the zip/postal code. This will convert all '#' to random numbers and all '?' to random letters.
  • Random Seed: The random seed to use when generating random data. Specifying a seed will ensure that the same data is generated every time the component is executed. Specify 0 to not use a random seed (data will differ with each execution).
  • Use Source Data Affinity: Enabling this option will use both the random seed and the input value to generate random data.
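
Several of the anonymization types above (Credit Card CVV, Identification Number, Zip/Postal Code) take a Format parameter in which '#' becomes a random number and, for Zip/Postal Code, '?' becomes a random letter. The Python sketch below illustrates that substitution and the standard Luhn checksum mentioned for Identification Number. It is not the component's implementation: the function names expand_format and luhn_valid are hypothetical, the choice of uppercase letters is an assumption, and this page does not describe how the component applies the Luhn algorithm internally.

```python
import random
import string


def expand_format(fmt: str, rng: random.Random) -> str:
    """Replace each '#' with a random digit and each '?' with a random letter."""
    out = []
    for ch in fmt:
        if ch == "#":
            out.append(rng.choice(string.digits))
        elif ch == "?":
            # ASSUMPTION: uppercase letters; the page only says "random letters".
            out.append(rng.choice(string.ascii_uppercase))
        else:
            out.append(ch)  # literal characters pass through unchanged
    return "".join(out)


def luhn_valid(value: str) -> bool:
    """Return True if the digits in value pass the standard Luhn checksum."""
    digits = [int(c) for c in value if c.isdigit()]
    total = 0
    # Walk the digits right to left, doubling every second one.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return bool(digits) and total % 10 == 0


rng = random.Random(42)
postal_code = expand_format("?#? #?#", rng)  # same shape as A1E 4G7
print(postal_code)
print(luhn_valid("748-81-8416"))  # sample value from the table above
```

Both sample identification numbers listed for the Identification Number type (748-81-8416 and 482-31-7146) pass this checksum.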
Column Properties

Column properties for the field. These values are NOT configurable.

Variable Properties

This section is enabled when <Variable> is selected as the Anonymization Type (spawn type). A variable or parameter can then be chosen.

Source Data Affinity (since v9.0)

The source data affinity section allows you to easily edit the Use Source Data Affinity property of the selected column. A column set to use source data affinity will use both the random seed and the input value to generate random data. After you have configured this option, you can click the Apply to all columns button to set the Use Source Data Affinity property of all columns.

Random Seed (since v9.0)

The random seed section allows you to easily edit the Random Seed property of the selected column. A column with a random seed specified will ensure that the same data is generated every time the component is executed. A random seed of 0 means no random seed will be used and the data will differ with each execution. After you have specified a seed, you can click the Apply to all columns button to set the Random Seed property of all columns.
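
The effect of the Random Seed and Use Source Data Affinity properties can be illustrated with a small Python sketch. It is only an approximation under stated assumptions: the names generate_column and generate_with_affinity and the sample name list are hypothetical, and hashing the seed together with the input value is one possible way to combine them, not necessarily what the component does.

```python
import hashlib
import random

SAMPLE_NAMES = ["John", "Emily", "Parker", "Denis"]  # illustrative values only


def generate_column(row_count: int, seed: int) -> list:
    """Generate row_count values; a non-zero seed makes the result repeatable."""
    # ASSUMPTION: a seed of 0 is modelled as an unseeded generator,
    # so the data differs with each execution.
    rng = random.Random(seed) if seed != 0 else random.Random()
    return [rng.choice(SAMPLE_NAMES) for _ in range(row_count)]


def generate_with_affinity(input_value: str, seed: int) -> str:
    """Derive the output from both the seed and the input value."""
    # ASSUMPTION: a hash of seed and input stands in for the component's logic.
    key = f"{seed}:{input_value}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(SAMPLE_NAMES)
    return SAMPLE_NAMES[index]


# The same seed yields the same column on every run.
print(generate_column(3, seed=7) == generate_column(3, seed=7))
# With affinity, the same input value and seed yield the same output.
print(generate_with_affinity("alice", 7) == generate_with_affinity("alice", 7))
```

In this sketch, affinity means that equal source values map to equal generated values for a given seed, regardless of the row in which they appear.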

Entities (since v9.0)

Entities allow the data generated to be connected, increasing the quality of the data. When entities are not configured the data is disconnected like so:

First Name Last Name Email
John Doe parker.brown@yahoo.com
Emily Smith denis.russel@gmail.com

Notice the Email field is disconnected from the First Name and Last Name fields. Here is the same example data with a person entity configured:

First Name Last Name Email
John Doe john.doe@yahoo.com
Emily Smith emily.smith@gmail.com

If connected data is desired, launch the Entity Editor by clicking the Configure Entities... button. The Entity Editor will allow you to add, remove, and edit entities. After the entities have been configured, an icon will appear beside each column that is associated with an entity identifying what entity it is associated with.
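
The difference between disconnected and connected data shown in the two tables above can be sketched in Python as follows. This is an illustration only: the Person class, the sample lists, and the first.last@domain email pattern are assumptions taken from the example rows, not the component's actual generation rules.

```python
import random
from dataclasses import dataclass

FIRST_NAMES = ["John", "Emily"]
LAST_NAMES = ["Doe", "Smith"]
EMAIL_DOMAINS = ["yahoo.com", "gmail.com"]


@dataclass
class Person:
    first_name: str
    last_name: str
    email: str


def disconnected_person(rng: random.Random) -> Person:
    """Each column is generated independently, so the email may not match the name."""
    email = f"{rng.choice(FIRST_NAMES)}.{rng.choice(LAST_NAMES)}@{rng.choice(EMAIL_DOMAINS)}"
    return Person(rng.choice(FIRST_NAMES), rng.choice(LAST_NAMES), email.lower())


def connected_person(rng: random.Random) -> Person:
    """The email is derived from the already generated first and last name."""
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    email = f"{first}.{last}@{rng.choice(EMAIL_DOMAINS)}".lower()
    return Person(first, last, email)


rng = random.Random(1)
person = connected_person(rng)
print(person)
```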

Expression fx Button

Click the fx button to launch the SSIS Expression Editor, which enables dynamic updates of the property at run time.

Generate Documentation Button

Click the Generate Documentation button to generate a Word document that describes the component's metadata, including relevant mappings.

Error Handling Page

The Error Handling page allows you to specify how errors should be handled when they happen.

Data Anonymizer Editor - Error Handling

There are three options available.

  1. Fail on error
  2. Redirect rows to error output
  3. Ignore error

When the Redirect rows to error output option is selected, rows that failed to be anonymized will be redirected to the 'Error Output' output of the Transformation Component. As indicated in the screenshot below, the green output connection represents rows that were successfully anonymized, and the red 'Error Output' connection represents rows that were erroneous.

Data Anonymizer Editor - Error Output
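
The three options can be pictured as branches in a row-processing loop. The following Python sketch is illustrative only: the names ErrorMode, anonymize_value, and process_rows are hypothetical and not part of the component, and the handling of Ignore error (the original row continues to the default output) is an assumption, since this page does not describe what happens to ignored rows.

```python
from enum import Enum


class ErrorMode(Enum):
    FAIL = "Fail on error"
    REDIRECT = "Redirect rows to error output"
    IGNORE = "Ignore error"


def anonymize_value(value):
    """Hypothetical stand-in for an anonymization step that can fail."""
    if value is None:
        raise ValueError("cannot anonymize a missing value")
    return "X" * len(value)


def process_rows(rows, mode):
    """Return (default_output, error_output) for the chosen error mode."""
    default_output, error_output = [], []
    for row in rows:
        try:
            default_output.append(anonymize_value(row))
        except ValueError as exc:
            if mode is ErrorMode.FAIL:
                raise  # stop processing at the first error
            if mode is ErrorMode.REDIRECT:
                error_output.append((row, str(exc)))
            else:
                # ASSUMPTION: an ignored row continues unchanged.
                default_output.append(row)
    return default_output, error_output


ok, failed = process_rows(["abc", None, "de"], ErrorMode.REDIRECT)
print(ok, failed)
```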

Entity Editor Page

The Entity Editor allows you to add, remove, and edit entities.

Data Anonymizer - Entity Editor

Add Entity

The Add Entity button allows you to add up to 5 entities for each entity type. There are currently 3 available entity types:

  • Person: Generates connected person data. The Person entity has the following properties:
    • Gender: Generates a random gender (Male or Female)
    • First Name: Generates a random first name based on the Gender property
    • Last Name: Generates a random last name
    • Full Name: Generates a full name based on the First Name and Last Name properties
    • Company Name: Generates a random company name
    • Business Email: Generates a business email based on the Full Name and Company Name properties
    • Personal Email: Generates a personal email based on the Full Name property
    • Facebook URL: Generates a Facebook URL based on the Full Name property
    • Twitter URL: Generates a Twitter URL based on the Full Name property
  • Address: Generates connected Address data. The Address entity has the following properties:
    • Country: Generates a random country
    • State / Province: Generates a random state/province based on the Country property
    • City: Generates a random city based on the Country and State / Province properties
    • Zip / Postal Code: Generates a random zip/postal code based on the Country property
  • Credit Card: Generates connected credit card data. The Credit Card entity has the following properties:
    • Card Type: Generates a random credit card type
    • Card Number: Generates a random credit card number based on the Card Type property
    • CVV: Generates a random credit card CVV based on the Card Type property

After you add an entity, the Entity Editor will auto-map existing columns by spawn type and name to the properties of the entity. An entity property can be mapped with the following values:

  • <Ignore>: The property will not have a column associated with it. However, if this entity property is used to generate other entity properties it will still be generated during runtime.
  • Existing Column: The property will be used to generate the value of the mapped column during runtime.

To quickly map or unmap an entity property, click the Map/Unmap button (the broken link icon to the right of the entity property). Mapping first tries to match by spawn type. If no match is found, it then tries to match by name. If there is still no match, it sets the mapping to <Add New Column>. Unmapping always sets the mapping to <Ignore>.
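
The mapping order described above (spawn type first, then name, then <Add New Column>) can be sketched in Python. This is a hedged illustration, not the Entity Editor's code: the function names are hypothetical, the name comparison (case-insensitive, ignoring spaces) is an assumption, and so is picking the first matching column when several qualify.

```python
ADD_NEW_COLUMN = "<Add New Column>"
IGNORE = "<Ignore>"


def normalize(name: str) -> str:
    """ASSUMPTION: names are compared ignoring case and spaces."""
    return name.replace(" ", "").lower()


def map_property(property_name: str, property_spawn_type: str, columns: dict) -> str:
    """Pick a mapping for an entity property.

    columns maps each existing column name to its current spawn type.
    """
    # Step 1: try to match by spawn type.
    for column_name, spawn_type in columns.items():
        if spawn_type == property_spawn_type:
            return column_name
    # Step 2: fall back to matching by name.
    for column_name in columns:
        if normalize(column_name) == normalize(property_name):
            return column_name
    # Step 3: nothing matched.
    return ADD_NEW_COLUMN


def unmap_property() -> str:
    """Unmapping always results in <Ignore>."""
    return IGNORE


columns = {"fname": "First Name", "LastName": "<Ignore>", "notes": "Paragraph"}
print(map_property("First Name", "First Name", columns))
print(map_property("Last Name", "Last Name", columns))
print(map_property("City", "City", columns))
```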

If the spawn type of the mapped column does not match the spawn type of the entity property, a warning icon will appear beside the entity property. To resolve the warning, click the icon. This changes the spawn type of the mapped column to the spawn type of the entity property when the Entity Editor is saved (that is, when the OK button is clicked). Resolving these warnings is not required, because a custom spawn type may be intentional when generating entity data. However, it risks generating disconnected data, which defeats the purpose of the entity.

You can remove an entity by clicking the Delete Entity button (top right of the entity).

Map Unmapped Properties

The Map Unmapped Properties button will perform the auto-map function on all properties of all entities.

Clear Mappings

The Clear Mappings button will set all properties of all entities to <Ignore>.

OK

The OK button will save and exit the Entity Editor. Any changes to the entities or columns will happen when this button is clicked.

Cancel

The Cancel button will exit the Entity Editor without saving.