Using the Databricks Connection Manager

The Databricks Connection Manager is an SSIS connection manager component that can be used to establish connections with Databricks.

To add a Databricks connection to your SSIS package, right-click the Connection Managers area in your Visual Studio project and choose "New Connection..." from the context menu. The "Add SSIS Connection Manager" window will open. Select the "Databricks" item to add the new Databricks connection manager.

New Connection

Add Databricks Connection

The Databricks Connection Manager contains the following two pages, which let you configure how to connect to Databricks.

  • General
  • Advanced Settings

General Page

The General page on the Databricks Connection Manager allows you to specify general settings for the connection.

Databricks Connection Manager

Authentication
Databricks Host

The Databricks Host option specifies the URL of the Databricks workspace that hosts your Databricks instance.

Authentication Mode

There are four Authentication Modes available:

  • Azure Authorization Code
  • Azure Service Principal
  • OAuth M2M (machine-to-machine)
  • Personal Token
Azure Authorization Code
Generate Token File
This button allows you to log in to the service endpoint and authorize your app to generate a token.
Databricks Connection Manager - Generate Token File
Here you can enter the Tenant ID, Client ID, Client Secret, and Redirect URL. The Redirect URL must be the one that you specified in the App settings.
  • PKCE: The PKCE (Proof Key for Code Exchange) option may be enabled for the PKCE App Type.
  • Use Default Browser to Sign In: When this option is checked, the Sign In and Authorize button will open your default web browser to complete the OAuth authentication. When this option is unchecked, the Sign In and Authorize button will complete the entire OAuth authentication process inside the toolkit.
  • Sign In and Authorize: This button allows you to log in to the service endpoint and authorize your app to generate a token.
Path To Token File
The path to the token file. This field supports both file system paths and Azure Blob Storage Shared Access Signature (SAS) URLs.
Token File Password
The password to the token file.
Azure Service Principal

Databricks Connection Manager - Azure Service Principal
Client ID
The Client ID option allows you to specify the unique ID that identifies the application making the request.
Client Secret
The Client Secret option allows you to specify the client secret belonging to your app.
Tenant ID
The Tenant ID option allows you to specify the unique ID that identifies the tenant you are connecting to.
OAuth M2M (machine-to-machine)

Databricks Connection Manager - OAuth M2M
Client ID
The Client ID option allows you to specify the unique ID that identifies the application making the request.
Client Secret
The Client Secret option allows you to specify the client secret belonging to your app.
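To show how a Client ID and Client Secret are typically used in an OAuth client-credentials (M2M) exchange, here is a minimal Python sketch that builds, but does not send, a token request. The `/oidc/v1/token` path and the `all-apis` scope follow Databricks' published OAuth M2M guidance as we understand it; treat them as assumptions and verify them for your workspace. The helper name `build_m2m_token_request` is hypothetical, and this is not a description of how the connection manager works internally.

```python
import base64
import urllib.parse
import urllib.request


def build_m2m_token_request(
    host: str, client_id: str, client_secret: str
) -> urllib.request.Request:
    """Build (without sending) a client-credentials token request."""
    base = host.rstrip("/")
    if not base.startswith("https://"):
        base = "https://" + base
    # Form-encoded body; scope value is an assumption (see lead-in)
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode("ascii")
    # Client ID and secret are sent as HTTP Basic credentials
    basic = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        url=base + "/oidc/v1/token",  # assumed workspace-level token endpoint
        data=body,
        headers={
            "Authorization": "Basic " + basic,
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder host and credentials, for illustration only
    req = build_m2m_token_request("example.cloud.databricks.com", "my-id", "my-secret")
    print(req.get_method(), req.full_url)
```

Passing the returned request to `urllib.request.urlopen` would perform the actual exchange; the sketch stops short of that so it can be inspected without network access.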
Personal Token
Databricks Connection Manager - Personal Token
To request a personal token, go to Settings | Developer | Access tokens | Manage in your Databricks workspace, then click Generate new token.
Statement Execution Settings
Polling Rate
The Polling Rate option determines how frequently the connection manager polls the job for its status.
Polling Timeout
The Polling Timeout option allows you to specify the number of seconds the component will wait before it times out.
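The interaction between a polling rate and a polling timeout can be sketched as a simple loop. The Python example below is illustrative only: it assumes both values are expressed in seconds, uses state names modeled on the Databricks Statement Execution API (`PENDING`, `RUNNING`, `SUCCEEDED`), and relies on a hypothetical `get_state` callback in place of a real status request.

```python
import time
from typing import Callable


def wait_for_statement(
    get_state: Callable[[], str],
    polling_rate_secs: float,
    polling_timeout_secs: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_state() until it leaves PENDING/RUNNING or the timeout elapses."""
    deadline = clock() + polling_timeout_secs
    while True:
        state = get_state()
        if state not in ("PENDING", "RUNNING"):
            return state  # terminal state reached
        if clock() >= deadline:
            raise TimeoutError(
                f"statement still {state} after {polling_timeout_secs} seconds"
            )
        sleep(polling_rate_secs)  # wait one polling interval before asking again


if __name__ == "__main__":
    # Simulated job that finishes on the third status check
    states = iter(["PENDING", "RUNNING", "SUCCEEDED"])
    print(wait_for_statement(lambda: next(states), 0.1, 5))
```

A shorter polling rate detects completion sooner at the cost of more status requests, while the timeout bounds how long a stalled job can hold up the package.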
Test Connection

After all the connection information has been provided, you may click the Test Connection button to test if the connection settings entered are valid.

Advanced Settings Page

The Advanced Settings page on the Databricks Connection Manager allows you to specify some advanced and optional settings for the connection.

Databricks Connection Manager - Advanced Settings

Proxy Server Settings
Proxy Mode

The Proxy Mode option allows you to specify how you want to configure the proxy server setting. There are three options available.

  • No Proxy
  • Auto-detect (Using system-configured proxy)
  • Manual
Proxy Server

The Proxy Server option allows you to specify the name of the proxy server for the connection.

Port

The Port option allows you to specify the port number of the proxy server for the connection.

Username (Proxy Server Authentication)

The Username option (under Proxy Server Authentication) allows you to specify the proxy user account.

Password (Proxy Server Authentication)

The Password option (under Proxy Server Authentication) allows you to specify the proxy user's password.

Note: The Proxy Password is not included in the connection manager's ConnectionString property by default. This is by design for security reasons. However, you can include it in your ConnectionString if you want to parameterize your connection manager. The format would be ProxyPassword=myProxyPassword; (make sure you have a semicolon as the last character). It can be anywhere in the ConnectionString.
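If you assemble the ConnectionString outside the designer, the ProxyPassword format described above can be applied with a small helper. This Python sketch is only an illustration: the function name `with_proxy_password` is hypothetical, the `Key1=value1` pairs are placeholders rather than real connection manager keys, and passwords that themselves contain a semicolon are not handled.

```python
def with_proxy_password(connection_string: str, proxy_password: str) -> str:
    """Append ProxyPassword=<value>; to a connection string.

    Ensures the existing string ends with a semicolon first, and always
    terminates the ProxyPassword entry with a semicolon.
    """
    base = connection_string.strip()
    if base and not base.endswith(";"):
        base += ";"
    return f"{base}ProxyPassword={proxy_password};"


if __name__ == "__main__":
    # Placeholder key/value pairs stand in for the real connection settings
    print(with_proxy_password("Key1=value1;Key2=value2", "myProxyPassword"))
```

Because the entry can appear anywhere in the ConnectionString, appending it at the end is simply the easiest placement; in practice the password would come from a secured parameter rather than a literal.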

Misc
Timeout (secs)

The Timeout (secs) option allows you to specify a timeout value in seconds for the connection. The default value is 120 seconds.

Retry on Intermittent Errors

This option is designed to help recover from intermittent outages or disruptions of service so that the integration does not have to be stopped because of such temporary issues. Enabling this option allows service calls to be retried upon certain types of failure. A service call may be retried up to 3 times before an exception is fired. Retries occur after 0 seconds, 15 seconds, and 60 seconds.

Warning: Although we have carefully designed this feature so that such retries should only happen when it is deemed safe to do so, on some extreme occasions such retried service calls could result in the creation of duplicate data.
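The retry schedule described above (up to three retries, after 0, 15, and 60 seconds) can be sketched in Python as follows. This illustrates the pattern only and is not the component's actual code; `call_with_retries` and the `is_transient` predicate are hypothetical, and which failures count as intermittent is an assumption left to the caller.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")

# Delays before the first, second, and third retry, in seconds
RETRY_DELAYS_SECS = (0, 15, 60)


def call_with_retries(
    call: Callable[[], T],
    is_transient: Callable[[Exception], bool],
    delays: Sequence[float] = RETRY_DELAYS_SECS,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(); on a transient failure, retry once per entry in delays."""
    attempt = 0
    while True:
        try:
            return call()
        except Exception as exc:
            # Give up when retries are exhausted or the error is not transient
            if attempt >= len(delays) or not is_transient(exc):
                raise
            sleep(delays[attempt])
            attempt += 1


if __name__ == "__main__":
    outcomes = iter([ConnectionError("temporary outage"), "ok"])

    def flaky() -> str:
        item = next(outcomes)
        if isinstance(item, Exception):
            raise item
        return item

    # Succeeds on the first retry, which happens after a 0-second delay
    print(call_with_retries(flaky, lambda e: isinstance(e, ConnectionError)))
```

The sketch also shows why duplicates are possible: if a call succeeds on the server but the response is lost, the retry repeats a write that already happened.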