Documentation
Plugins
Destinations
Snowflake
Overview

Snowflake Destination Plugin

Latest: v3.2.10

The snowflake plugin helps you sync data to your Snowflake data warehouse.

There are two ways to sync data to Snowflake:

  1. Direct (easy but not recommended for production or large data sets): This is the default mode of operation where CQ plugin will stream the results directly to the Snowflake database. There is no additional setup needed apart from authentication to Snowflake.

  2. Loading via CSV/JSON from a remote storage: This is the standard way of loading data into Snowflake, it is recommended for production and large data sets. This mode requires a remote storage (e.g. S3, GCS, Azure Blob Storage) and a Snowflake stage to be created. The CQ plugin will stream the results to the remote storage. You can then load those files via a cronjob or via SnowPipe. This method is still in the works and will be updated soon with a guide.

Example Config

This example sets the connection string to a value read from the SNOWFLAKE_CONNECTION_STRING environment variable:

kind: destination
spec:
  name: snowflake
  path: cloudquery/snowflake
  version: "v3.2.10"
  write_mode: "append"
  spec:
    connection_string: "${SNOWFLAKE_CONNECTION_STRING}"
    # Optional parameters
    # migrate_concurrency: 1
    # batch_size: 1000 # 1K entries
    # batch_size_bytes: 4194304 # 4 MiB

The Snowflake destination utilizes batching, and supports batch_size and batch_size_bytes.

Authentication

Authentication of the connection to Snowflake can be specified using:

  • A username and password in the DSN:

    kind: destination
    spec:
      name: snowflake
      ...
      spec:
        connection_string: "user:pass@account/db/schema?warehouse=wh"
  • A private key inline:

    kind: destination
    spec:
      name: snowflake
      ...
      spec:
        connection_string: "user@account/database/schema?warehouse=wh"
        private_key: |
          -----BEGIN PRIVATE KEY-----
          MIIEvgIBADANBgkqhkiG9w0BAQEFAASCBKgwggSkAgEAAoIBAQC2ajPRIbPtbxZ1
          3DONLA02eZJuCzsgIkBWov/Me5TL6cKN0gnY+mbA8OnNCH+9HSzgiU9P8XhTUrIN
          85diD+rj6uK+E0sSyxGk6HG17TyR5oBq8nz2hbZlbaNi/HO9qYoHQgAgMq908YBz
          ...
          DUmOIrBYEMf2nDTlTu/QVcKb
          -----END PRIVATE KEY-----
  • A private key included from a file:

    kind: destination
    spec:
      name: snowflake
      ...
      spec:
        connection_string: "user@account/database/schema?warehouse=wh"
        private_key: "${file:./private.key}"

    where ./private.key is PEM-encoded private key file with contents of the form:

    -----BEGIN PRIVATE KEY-----
    MIIEvgIBADANBgkqhkiG9w0BAQEFAASCBKgwggSkAgEAAoIBAQC2ajPRIbPtbxZ1
    3DONLA02eZJuCzsgIkBWov/Me5TL6cKN0gnY+mbA8OnNCH+9HSzgiU9P8XhTUrIN
    85diD+rj6uK+E0sSyxGk6HG17TyR5oBq8nz2hbZlbaNi/HO9qYoHQgAgMq908YBz
    ...
    DUmOIrBYEMf2nDTlTu/QVcKb
    -----END PRIVATE KEY-----

Private Key Authentication Setup

The Snowflake guide for Key Pair Authentication (opens in a new tab) demonstrates how to create an RSA private key with the ability to authenticate as a Snowflake user.

Note that encrypted private keys are not supported by the Snowflake Go SQL driver, and hence not supported by the CloudQuery Snowflake plugin. You can decrypt a private key in file enc.key and store it in a file dec.key using the following command:

openssl pkcs8 -topk8 -nocrypt -in enc.key -out dec.key

Snowflake Spec

This is the top level spec used by the Snowflake destination plugin.

  • connection_string (string) (required)

    Snowflake connection_string.

    Example:

    # user[:password]@account/database/schema?warehouse=user_warehouse[&param1=value1&paramN=valueN]
    # or
    # user[:password]@account/database?warehouse=user_warehouse[&param1=value1&paramN=valueN]
    # or
    # user[:password]@host:port/database/schema?account=user_account&warehouse=user_warehouse[&param1=value1&paramN=valueN]
    # or
    # host:port/database/schema?account=user_account&warehouse=user_warehouse[&param1=value1&paramN=valueN]

    From Snowflake documentation:

    account - Name assigned to your Snowflake account. If you are not on us-west-2 or AWS deployment, append the region and platform to the end, e.g., <account>.<region> or <account>.<region>.<platform>.

  • private_key (string) (optional)

    A PEM-encoded private key for connecting to snowflake. Equivalent to adding authenticator=snowflake_jwt&privateKey=... to the connection_string but parses, validates, and correctly encodes the key for use with snowflake.

  • migrate_concurrency (integer) (optional) (default: 1)

    By default, tables are migrated one at a time. This option allows you to migrate multiple tables concurrently. This can be useful if you have a lot of tables to migrate and want to speed up the process.

    Setting this to a negative number means no limit.

  • batch_size (integer) (optional) (default: 1000)

    Number of records to batch together before sending to the database.

  • batch_size_bytes (integer) (optional) (default: 4194304 (= 4 MiB))

    Number of bytes (as Arrow buffer size) to batch together before sending to the database.

Underlying library

We use the official github.com/snowflakedb/gosnowflake (opens in a new tab) package for database connection.