Snowflake

Power Kubit with Secure Data Sharing

Introduction

Snowflake allows you to share a databases, schemas, tables and views with Kubit through a Secure Data Share . With this approach Kubit acts as a Consumer Account and can query data that has been imported with the Provider Account, but cannot perform any of the DML tasks that are allowed in a full account, such as data loading, insert, update, and similar data manipulation operations.

Steps

Requirements

Note that both the Provider and Consumer accounts of the Secure Data Share need to be on the same cloud infrastructure and in the same region. In case your Snowflake account does not match any combination in the table below we can provision a new matching account in a matter of hours.

RegionCloud VendorAccount LocatorAccount Name
US West 2AWSNK15162NK15162
US East 1AWSKUBITKUBIT
US East 2AWSSL59270KUBIT_US_EAST_2_AWS
EU Central 1AWSPC52456KUBIT_EUROPE_CENTRAL_1_AWS
EU West 1AWSTJ95313KUBIT_EU_WEST_1_AWS
EU West 4GCPKU21135KUBIT_EUROPE_WEST_4_GCP
EU NorthAzureGJ34763KUBIT_NORTHEUROPE_AZURE
AU EastAzureFZ59063KUBIT_AU_EAST_AZURE

1. Preparation

Identify which resources will be shared with Kubit.

  • Databases and Schemas
  • Tables
  • (Secure) Views

2. Create a Data Share

Here is an example script (sharing to nk15162 on US West Coast)

use role accountadmin;
create share <data_share>;
resource "snowflake_share" "kubit" {
  name     = "kubit_share"
  comment  = "Shared with Kubit"
  accounts = ["KUBIT_EUROPE_CENTRAL_1_AWS"] // use Account Name
}

3. Grant Usage Permissions

grant usage on database <DB> to share <data_share>;
grant usage on schema <SCHEMA> to share <data_share>;
grant select on all tables in schema <SCHEMA> to share <data_share>;
grant select on view <VIEW> to share <data_share>;
alter share <data share> set accounts = nk15162; -- use Account Locator

📘

Snowflake also offers a web interface to review and manage data shares. For details on this topic, please review Snowflake’s documentation here. Please also note, the following cases which require additional configuration. Here is how to:

Best Practices

Clustering Keys

  1. Kubit works with time-series data so clustering by date of the fact tables is essential for performance for tables bigger than 1TB.
  2. Since most Kubit reports also filter by event name it is beneficial to add the event name as a second cluster column.

Here's Snowflake's reference for Clustering Keys .

Troubleshooting

A view or function being shared cannot reference objects from other databases.

The error could be due:

  1. A missing privilege REFERENCE_USAGE on the database where the objects reside which is referenced in the view. Solution:
grant reference_usage on database <database_name> to share <data_share>;
  1. An object from a database on top of an INBOUND share can not be directly added or referenced to an object in the OUTBOUND share. There is no solution to this limitation - you can't share an INBOUND share directly. You have to copy the data into a table in order to use it in an OUTBOUND Share.
  2. If the referenced object or the shared object in the OUTBOUND share has a making policy applied, where the masking policy resides in a database other than the shared object. Solution:
grant reference_usage on database <masking_policy_database> to share <data_share>;
  1. You're referencing more than one database in your OUTBOUND share. You can review all the references with the following function:
select * from table(get_object_references(database_name=>'DB', schema_name=>'SCHEMA', object_name=>'VIEW'));

Solution: Please verify the required USAGE privileges were granted. See step 3 of this guide.

Recreate a shared view

When you recreate a view, even with the CREATE OR REPLACE statement, existing grants might be lost. Please, make sure to include the COPY GRANTS parameter. Once the new view is created you can review its grants with the following command:

SHOW GRANTS ON VIEW <view_name>;

Avoid select * in shared view definitions

Using select * in your view definitions may lead to query failures once the references table schema changes. Here's a typical error message:

View definition for 'Database.Schema.Resource_Name' declared X column(s), but view query produces Y column(s).

Please do not use select * in shared view definition in order to prevent such errors.