HBase Source Component

The HBase Source Component is an SSIS data flow pipeline component that can be used to read/retrieve data from HBase.

The component includes the following two pages to configure how you want to read data from HBase.

  • General
  • Columns

General Page

The General page of the HBase Source Component allows you to specify the general settings of the component.

Connection Manager

The HBase Source Component requires a connection in order to connect to an HBase instance. The Connection Manager drop-down will show a list of all connection managers that are available to your current SSIS packages.

Namespace Separator

The Namespace Separator drop-down lists the available namespaces. Select one from the list, or type the namespace directly into the field.

Table

The Table drop-down lists all available tables. When you select a table, the component discovers and retrieves its metadata.
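In HBase itself, a table outside the default namespace is addressed by a qualified name of the form namespace:table, and a table named without a namespace belongs to the namespace called "default". The following minimal Python sketch shows how such a name breaks down. The function name split_table_name and the sample table names are hypothetical, and the sketch illustrates the HBase naming convention only, not how the component resolves names internally.

```python
def split_table_name(qualified_name, separator=":"):
    """Split an HBase-style qualified table name into (namespace, table).

    Hypothetical helper for illustration; ':' is the separator HBase
    uses between namespace and table name.
    """
    namespace, found, table = qualified_name.partition(separator)
    if not found:
        # No separator present: HBase treats the table as part of
        # the built-in "default" namespace.
        return "default", qualified_name
    return namespace, table


print(split_table_name("sales:orders"))
print(split_table_name("orders"))
```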

HBase Metadata Discovery

The HBase Metadata Discovery dialogue allows you to configure the following settings:

  • Namespace: Specify the namespace that contains the table.
  • Table: The Table drop-down displays all available tables.
  • Batch Size: Specify how many records to retrieve from HBase at a time. The default value is 0, which means the SSIS buffer size is used.
  • Pages to Scan: This option allows you to specify the maximum number of retrieved pages that will be scanned.
  • Preview: The Preview button displays the discovered metadata in a table, with a drop-down for selecting the data type of each field. If the preview data does not render correctly, the wrong data type may have been selected.
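
HBase stores every cell value as a raw byte array, which is why the data type chosen during discovery determines whether the preview looks right. The Python sketch below illustrates the effect under one assumption: the value was written as an 8-byte big-endian long, the layout produced by HBase's Bytes.toBytes(long). The sample value and variable names are hypothetical, and the component's own decoding logic is not shown here.

```python
import struct

# Assumption: the cell holds a long written as 8 big-endian bytes.
raw = struct.pack(">q", 1234567890123)

# Decoded with the matching type, the original number comes back.
as_long = struct.unpack(">q", raw)[0]

# Decoded as a string, the same bytes turn into unreadable characters.
as_text = raw.decode("utf-8", errors="replace")

# Decoded as a double, the same bytes give an unrelated number.
as_double = struct.unpack(">d", raw)[0]

print(as_long)
print(repr(as_text))
print(as_double)
```

Only the first interpretation recovers the stored value; the other two correspond to a preview that "does not render correctly".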

Batch Size

Specify the batch size, that is, the number of records read from the HBase instance at a time.
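
As a rough illustration of what a batch size means when reading, the Python sketch below groups a stream of rows into fixed-size batches. The function read_in_batches and the sample rows are hypothetical; this is a generic pattern, not the component's implementation, and it does not model the default value of 0.

```python
from itertools import islice


def read_in_batches(rows, batch_size):
    """Yield lists of at most batch_size rows from any iterable."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    iterator = iter(rows)
    while True:
        # Pull the next batch_size rows; the final batch may be shorter.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


for batch in read_in_batches(range(7), 3):
    print(batch)
```

A larger batch size means fewer round trips but more rows held in memory at once.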

Filter

Expression fx Icon

Click the blue fx icon to launch the SSIS Expression Editor, which lets you update the property dynamically at run time.

Generate Documentation Icon

Click the Generate Documentation icon to generate a Word document that describes the component's metadata, including the relevant mappings.

Columns Page

The Columns page of the HBase Source Component shows you all available attributes from the table that you specified on the General page.

At the top left of the grid is a checkbox that toggles the selection of all available fields. This is a quick way to check or uncheck every field at once.

The Columns Page grid consists of:

  • HBase Field: The column that will be retrieved from HBase.
  • Data Type: The data type of this field.
  • Properties window for the field selected:
    • + sign: Add field to HBase.
    • - sign: Remove field from HBase.
    • Column Properties
      • CodePage: Specify the code page of the field.
      • Data type: The data type can be changed as needed.
      • Length: Specify the length of the field. If the data type is a string, the length specified here is the maximum size.
      • Name: Specify the column name.
      • Precision: Specify the number of digits in a number.
      • Scale: Specify the number of digits to the right of the decimal point in a number.
      • Column Family: Specify the column family, which defines the logical and physical grouping of columns in HBase.
      • HBase Datatype: Select the HBase data type from the drop-down.
  • Import External Columns: Option to import the columns and their properties from a file.
  • Export External Columns: Option to save the columns and their properties to a JSON file for later reuse.
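
The Precision and Scale properties listed above can be illustrated with a short worked example. The Python sketch below derives both values from a decimal literal. The function precision_and_scale is a hypothetical helper for illustration and assumes a finite decimal value.

```python
from decimal import Decimal


def precision_and_scale(text):
    """Return (precision, scale) for a finite decimal literal.

    precision: number of digits in the number
    scale:     number of digits to the right of the decimal point
    """
    _sign, digits, exponent = Decimal(text).as_tuple()
    if exponent >= 0:
        # Whole number: no fractional digits; trailing zeros implied
        # by the exponent still count toward the precision.
        return len(digits) + exponent, 0
    scale = -exponent
    # Values such as 0.05 need at least `scale` digits overall.
    return max(len(digits), scale), scale


print(precision_and_scale("12345.678"))
print(precision_and_scale("0.05"))
print(precision_and_scale("100"))
```

For instance, 12345.678 has eight digits in total, three of which follow the decimal point, so a column holding it needs a precision of at least 8 and a scale of at least 3.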