HBase Destination Component
The HBase Destination Component is an SSIS data flow pipeline component that writes data to HBase. You can insert, update, upsert, and delete records with this component. It has three configuration pages:
- General
- Columns
- Error Handling
The General page is used to specify general settings for the HBase Destination component. The Columns page is used to manage columns from the upstream component. The Error Handling page allows you to specify how errors should be handled when they occur.
General Page
The General page allows you to specify general settings for the component.
- Connection Manager
The HBase Destination Component requires a connection in order to connect to HBase. The Connection Manager drop-down will show a list of all connection managers that are available to your current SSIS packages.
- Namespace
The Namespace drop-down lists the available namespaces. You can choose one from the list or type a namespace directly into the field.
- Table
The Table drop-down displays a list of available tables for the instance specified in the Connection Manager.
- Create Table...
This command will launch the HBase Table Creator, which you can use to create a database table based on the input columns from upstream components.
It auto-generates a command based on the selected Connection Manager and Input Columns to create a new table. You can further customize the command to suit your needs, and then click the 'Execute Command' button. You will be informed if the command was executed successfully or not, and the table you created will be selected in the Destination Table property.
- Namespace
The Namespace drop-down lists the available namespaces. You can choose one from the list or type a namespace directly into the field.
- Table
Specify the name of the table to create.
- Time to Live (seconds)
The Time to Live (seconds) option allows you to set cells to expire after a number of seconds.
- Grid
The Grid section includes the Column Family, Column Name, and Column Type.
- Drop Table...
This option will remove the table from the database.
- Discover Metadata...
The HBase Metadata Discovery dialogue allows you to configure the following settings:
- Namespace: Specify the namespace of the table.
- Table: The Table drop-down displays all available tables.
- Batch Size: The Batch Size option allows you to specify how many records you want to send to the target database server at a time. The default value is 0, which means the SSIS buffer size will be used.
- Pages to Scan: This option allows you to specify the maximum number of retrieved pages that will be scanned.
- Preview: The Preview button displays the metadata in tables with a drop-down that allows you to select the data type. If preview data does not render correctly, the wrong data type may be selected.
- Batch Size
The Batch Size option allows you to specify how many records to submit to HBase at a time (that is, per service call).
- Action
Choose the action to perform from the drop-down list:
- Insert
- Update
- Upsert
- Delete
- Refresh Component Button
Clicking the Refresh Component Button will bring up a prompt for you to confirm the refresh. After clicking “OK”, it will remove any existing columns and add any columns it can find in the specified table.
- Reset Columns Button
Clicking the Reset Columns button will bring up a prompt for you to confirm the reset. After clicking “OK”, it will remove any existing columns and replace them with the Input Columns.
- Map Unmapped Fields Button
When you click this button, the component tries to map any unmapped attributes by matching their names with the input columns from upstream components.
- Clear All Mappings Button
When you click this button, the component clears all of your mappings in the destination component.
- Expression fx Icon
Click the blue fx icon to launch the SSIS Expression Editor, which enables dynamic updates of the property at run time.
- Generate Documentation Icon
Click the Generate Documentation icon to generate a Word document that describes the component's metadata, including the relevant mappings.
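To illustrate the Batch Size setting described on this page, the following is a minimal Python sketch of splitting incoming rows into groups of at most a given size, so that each group could be sent in one service call. It is not the component's implementation. The function name `batches` and the decision to reject non-positive sizes are assumptions made for illustration only.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches(rows: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield lists of at most batch_size rows, in their original order.

    Hypothetical helper: it only illustrates the idea of a batch size and
    does not reflect how the component groups rows internally.
    """
    if batch_size <= 0:
        # Assumption for this sketch: a non-positive size is rejected.
        raise ValueError("batch_size must be positive")

    batch: List[T] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # The final batch may hold fewer rows than batch_size.
        yield batch


if __name__ == "__main__":
    for group in batches(["r1", "r2", "r3", "r4", "r5"], 2):
        print(group)
```

With five rows and a batch size of 2, the sketch produces three groups, the last of which holds a single row. A larger batch size means fewer calls but more rows per call.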
Columns Page
The Columns page of the HBase Destination Component allows you to map the columns from upstream components to the HBase Fields.
On the Columns page, you will see a grid that contains four columns as shown below.
The Columns Page grid consists of:
- Input Column: You can select an input column from an upstream component here.
- HBase Field: The HBase column that the input column's data will be written to.
- Data Type: The data type of this field.
- Properties window for the field selected:
- CodePage: Specify the Code Page of the field.
- Data type: The data type can be changed accordingly.
- Length: Specify the length of the field. If the data type is a string, the length specified here is the maximum size.
- Name: Specify the column name.
- Precision: Specify the number of digits in a number.
- HBase Datatype: Specify the HBase data type by choosing one of the following from the drop-down:
- None
- String
- Hash
- List
- Sorted Set
- Complex
- JSON
- Scale: Specify the number of digits to the right of the decimal point in a number.
- Column Family: Specify the column family to define the logical and physical grouping of columns.
- Import External Columns: Option to import the columns and their properties from a file.
- Export External Columns: Option to save the columns and their properties to a JSON file for later reuse.
- + sign: Add field to HBase.
- - sign: Remove field from HBase.
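The name-based matching performed by the Map Unmapped Fields button can be pictured with a short Python sketch. This is only an illustration under stated assumptions: the function `map_unmapped_fields` is hypothetical, and the case-insensitive comparison is an assumption, since the documentation says only that names are matched.

```python
from typing import Dict, List, Optional


def map_unmapped_fields(
    mappings: Dict[str, Optional[str]],
    input_columns: List[str],
) -> Dict[str, Optional[str]]:
    """Return a copy of mappings with unmapped fields filled in by name.

    mappings maps each HBase field name to an input column name, or to
    None when the field is unmapped. Existing mappings are left untouched.
    """
    # Assumption: names are compared case-insensitively.
    by_name = {name.lower(): name for name in input_columns}

    result = dict(mappings)
    for field, mapped in mappings.items():
        if mapped is None:
            # Stays None when no input column has a matching name.
            result[field] = by_name.get(field.lower())
    return result


if __name__ == "__main__":
    current = {"id": "RowKey", "name": None, "email": None}
    print(map_unmapped_fields(current, ["Name", "Phone"]))
```

In this example the existing mapping for `id` is kept, `name` is matched to the input column `Name`, and `email` stays unmapped because no input column shares its name.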
Error Handling Page
The Error Handling page allows you to specify how errors should be handled when they happen.
There are three options available.
- Fail on error
- Redirect rows to error output
- Ignore error
When the Redirect rows to error output option is selected, rows that could not be written to HBase are redirected to the ‘Error Output’ output of the Destination Component. As indicated in the screenshot below, the blue output connection represents rows that were written successfully, and the red ‘Error Output’ connection represents rows that failed. The ‘ErrorMessage’ output column in the ‘Error Output’ may contain the error message reported by the server or by the component itself.
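The difference between the three options can be sketched in a few lines of Python. This is a conceptual illustration rather than the component's code. The function `write_rows`, the `write` callback, and the option strings `"fail"`, `"redirect"`, and `"ignore"` are hypothetical names chosen for this sketch. Only the `ErrorMessage` column name comes from the description above.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


def write_rows(
    rows: List[Row],
    write: Callable[[Row], None],
    on_error: str = "fail",
) -> Tuple[List[Row], List[Row]]:
    """Write rows one by one and return (written_rows, error_rows).

    on_error is one of "fail", "redirect", or "ignore" (hypothetical
    names standing in for the three Error Handling options).
    """
    if on_error not in ("fail", "redirect", "ignore"):
        raise ValueError(f"unknown on_error option: {on_error!r}")

    written: List[Row] = []
    errors: List[Row] = []
    for row in rows:
        try:
            write(row)
            written.append(row)
        except Exception as exc:
            if on_error == "fail":
                # Fail on error: stop processing at the first failure.
                raise
            if on_error == "redirect":
                # Redirect: keep the row and attach the error message.
                errors.append({**row, "ErrorMessage": str(exc)})
            # Ignore error: the failed row is dropped and processing continues.
    return written, errors


if __name__ == "__main__":
    def fake_write(row: Row) -> None:
        # Stand-in for a real write; rejects rows that have no key.
        if row.get("key") is None:
            raise ValueError("missing row key")

    ok, failed = write_rows([{"key": "a"}, {"key": None}], fake_write, "redirect")
    print(ok, failed)
```

With the redirect option, the failed row appears in the second list together with its error message, which mirrors how the ‘Error Output’ carries the failed rows downstream for logging or reprocessing.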