Using the Address Parser Component
The Address Parser Component is an SSIS data flow pipeline component that can be used to extract address details from free-form text. As it is a transformation component, the Address Parser component requires input from a source component in the upstream data flow.
There are three pages that can be configured:
- General
- Outputs
- Error Handling
General Page
The General page allows general properties to be set for the component. The available properties depend on the provider chosen in the component.
- Provider
Two providers are supported:
- NetAddress
- LibPostal
Note: NetAddress only supports US and Canada addresses.
- Address Input Column
Select the field from the upstream data flow that contains the address data.
- Download Data File… (Available only when LibPostal provider is selected)
Click the Download Data File button to download the data file with the extracted data.
Note: LibPostal data files are not up to date.
- City, State, Zip Code Input Column
Use the City, State, Zip Code Input Column to choose an upstream column that contains the City, State and Zip Code details.
- Are Address / City, State, Zip Combined (Available only when City, State, Zip Code Input Column field is not specified)
Enable this option when the address and the City, State and Zip Code details arrive in a single input field. You can then specify a delimiter to identify the City, State and Zip Code details.
- Address / City, State, Zip Delimiter (Available only when Are Address / City, State, Zip Combined is enabled)
When the “Are Address / City, State, Zip Combined” property is set to “True”, provide the delimiter that identifies the City, State and Zip Code in the combined input field data.
- Are Street Number / Suite Combined
Set this Boolean property (True/False) to indicate whether the Street_Number property also contains a Suite_Number.
- Numeric Street Conversion
Set this Boolean property (True/False) to indicate whether to convert a spelled-out ordinal street name to an ordinal number (“Third” converts to “3rd”, etc.).
- Do Not Convert Numeric Address to Ordinal
Set this Boolean property (True/False) to indicate whether to leave a numeric street name as it is instead of converting it to an ordinal number (the conversion turns “3” into “3rd”, etc.).
- Convert Box to PO Box
Set this Boolean property (True/False) to indicate whether to convert "Box" to "PO Box" when "Box" is the only address found.
- Output Capitalization Strategy
Four options are supported to indicate your capitalization preference for the output address:
- None
- Upper
- Lower
- Mixed
- Refresh Component
Click the Refresh Component button to refresh the metadata on the Outputs page.
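To make the General page options more concrete, here is a minimal Python sketch of the kinds of transformations they describe: splitting a combined field on a delimiter, converting spelled-out and numeric street names to ordinals, and applying a capitalization strategy. It is not the component's implementation. All function names are hypothetical, and two behaviors are assumptions: the split happens at the first occurrence of the delimiter, and "Mixed" capitalizes the first letter of each word.

```python
import string
from typing import Tuple

# Illustrative only: hypothetical helpers, not the Address Parser's own code.
SPELLED_ORDINALS = {
    "first": 1, "second": 2, "third": 3, "fourth": 4, "fifth": 5,
    "sixth": 6, "seventh": 7, "eighth": 8, "ninth": 9, "tenth": 10,
}


def split_combined(value: str, delimiter: str) -> Tuple[str, str]:
    """Split a combined field into (address, city_state_zip).

    Assumption: the split happens at the first occurrence of the delimiter.
    """
    address, _, rest = value.partition(delimiter)
    return address.strip(), rest.strip()


def to_ordinal(number: int) -> str:
    """Turn a number into its ordinal form, e.g. 3 -> '3rd', 11 -> '11th'."""
    if 10 <= number % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(number % 10, "th")
    return f"{number}{suffix}"


def spelled_to_ordinal(word: str) -> str:
    """Turn 'Third' into '3rd'; unrecognized words are returned unchanged."""
    number = SPELLED_ORDINALS.get(word.lower())
    return to_ordinal(number) if number is not None else word


def apply_capitalization(text: str, strategy: str) -> str:
    """Apply one of the four strategies: None, Upper, Lower, Mixed."""
    if strategy == "Upper":
        return text.upper()
    if strategy == "Lower":
        return text.lower()
    if strategy == "Mixed":
        # Assumption: "Mixed" means the first letter of each word is capitalized.
        return string.capwords(text)
    return text  # "None": leave the text as it is


address, city_state_zip = split_combined(
    "123 Third Street | Springfield, IL 62704", "|"
)
print(address)                                  # 123 Third Street
print(city_state_zip)                           # Springfield, IL 62704
print(spelled_to_ordinal("Third"))              # 3rd
print(to_ordinal(3))                            # 3rd
print(apply_capitalization("123 main st", "Mixed"))  # 123 Main St
```

The real providers handle many more cases (abbreviations, international formats, larger ordinals), so treat this only as a way to reason about what each property changes in the output.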
Outputs Page
The Outputs page of the Address Parser component lets you choose which fields are returned in the output.
On the top left of the grid, the checkbox can be used to toggle the selection of all available fields. This is a quick way to check or uncheck all available fields.
If the Show Passthrough Columns checkbox is checked, all of the input columns will be shown in the output grid. These columns cannot be toggled off and are available only to give a better idea of what the output will look like.
The Outputs page grid consists of:
- Field Name: The column that will be extracted from the input data.
- Data Type: The data type of this field.
Error Handling Page
The Error Handling page allows you to specify how errors should be handled when they occur.
Three options are available:
- Fail on error
- Redirect rows to error output
- Ignore error
When the Redirect rows to error output option is selected, rows that the Address Parser component failed to process will be redirected to the ‘Error Output’ output of the component. As indicated in the screenshot, the green output connection represents rows that were processed successfully, and the red ‘Error Output’ connection represents erroneous rows. The ‘ErrorMessage’ output column found in the ‘Error Output’ may contain the error message that was reported by the server or the component itself.
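The three error handling options can be pictured as three ways of reacting to a row that fails. The Python sketch below models that routing outside of SSIS. The function names, the mode strings and the toy parser are all hypothetical, and the behavior shown for "Ignore error" (the row continues to the default output unparsed) is an assumption rather than documented behavior.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]


def run_with_error_handling(
    rows: List[Row],
    parse: Callable[[Row], Row],
    mode: str,
) -> Tuple[List[Row], List[Row]]:
    """Return (default_output, error_output) for the chosen mode.

    mode is one of "fail", "redirect" or "ignore" (hypothetical names).
    """
    default_output: List[Row] = []
    error_output: List[Row] = []
    for row in rows:
        try:
            default_output.append(parse(row))
        except Exception as exc:
            if mode == "fail":
                # Fail on error: stop processing at the first bad row.
                raise
            if mode == "redirect":
                # Redirect: send the row to the error output with a message.
                error_output.append({**row, "ErrorMessage": str(exc)})
            elif mode == "ignore":
                # Assumption: the row passes through without parsed fields.
                default_output.append(row)
            else:
                raise ValueError(f"unknown mode: {mode}") from exc
    return default_output, error_output


def toy_parse(row: Row) -> Row:
    """Stand-in for the parser: rejects blank addresses."""
    if not row["Address"].strip():
        raise ValueError("empty address")
    return {**row, "Parsed": row["Address"].upper()}


rows = [{"Address": "1 Main St"}, {"Address": " "}]
good, bad = run_with_error_handling(rows, toy_parse, "redirect")
print(len(good), len(bad))       # 1 1
print(bad[0]["ErrorMessage"])    # empty address
```

With "redirect", the failing row carries an ‘ErrorMessage’ value that a downstream step can log or inspect, which mirrors the role of the red ‘Error Output’ connection in the data flow.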