Skip to content

Snowflake Deployer

DataOps python framework for managing Snowflake environment.

All code stored in source control & deployed directly to Snowflake.

Reason for this project

There's 2 primary automated deployment methods for data warehouses. Terraform & Migration based tools (schemachange, flyway, etc).

Why Terraform is not ideal

Terraform is fantastic for slowly changing infastructure projects like aws cloud infrastructure deployments that have a single location for managing changes. Data warehouse changes have a much higher number of changes and often need to come from different applications. Some objects are managed by the replication tool, others by the transformation tool and others by the meta data management tool. Terraform simply wasn't meant to manage a complex environment like this. Given the "state" is stored in a seperate state file, state drift becomes a real problem with data warehousing and troubleshooting state drift is an extremely time consuming task.

Why Migration tools are not ideal

Because of the issues with Terraform, most companies turn to migration based tools. The problem here is scale as every statement needs to be written out in SQL, and every update needs an ALTER statement. This quickly becomes impossible to determine the state of any object at any one time. Some work arounds can be done with using CREATE or REPLACE statements, but this only goes so far as constanly replacing objects leads to issues with governance policies & tags, time travel, and dependencies.

Best of both worlds

The snowflake-deployer uses the best of worlds. A yml state based config files for each object with the flexibility that migration based tools give.

CREATES and ALTERS

The snowflake-deployer converts the yml config into CREATE or ALTER statements based on current objects within the database.

File Hash & Object Hash

File hashes and object hashes are both store and calculated at run time. So if either a file changed or an object changed from another application (ie. FiveTran loading a table and adding a column or a transformation removing tags), the deployer will detect that the object needs to be updated. This also allows the deployment to ignore objects that match the config file to prevent every object to be deployed on every run.

Deploy & Import

The Snowflake Deployer includes both a "deployer" for pushing yaml config to the database. As well as an "import" process for reverse engineering an existing database to config files.

Deployment Tags

Tags can specify objects only be deployed for certain environments.

Within each object config, add the following code (list of environments to deploy to):

DEPLOY_ENV:
- dev

Within each environment config file, add the following code (configured environment name):

DEPLOY_ENV: dev

Jinja Variables

Jinja variables can be used throughout the config files for parametization and cross file depency references.

Parametization

Any value in the config files can use a jinja variable for parametization by environment.

Example:

Object yaml - The "COMMENT" config references the "my_comment" variable.

COMMENT: {{my_comment}}

Config - the "deploy_config_dev.yml" environment config file stores the value of the my_comment value.

VARS:
- my_comment: some dev comment

NOTE: The import process will not bring in variables (no way to reverse engineer a variable reference). However, the import process will ignore any values that already have a variable reference.

References

ref - Object reference based on the naming convention of {db}{schema}{object_name}

role - Specific role reference based on the naming convention of {role}

Example

TAGS:
- {{ref('CONTROL__GOVERNANCE__ENV')}}: {{env}}
GRANTS:
- {{role('PROCESSING')}}: APPLY

NOTE: The import process will bring in jinja ref and role references.

Parallelization

Snowflake Deployer uses the jinja ref's to determine dependencies and creates a deployment parrallelization path. The MAX_THREADS parameter in the environment config file controls how many active threads can be running at once.

Commands

Deployment - deploy config files to Snowflake based on configuration in config file

snowflake-deploy deploy -c deploy_config_dev.yml

Import - Reverse engineer existing Snowflake account into yml config files

snowflake-deploy deploy -c deploy_config_dev.yml

Keys - create Public/Private RSA Tokens for authentication

snowflake-deploy keys -p Th1sI$@Pa$$w0d

Sample Project

See sample project for an example structure.

https://github.com/metaopslabs/snowflake_deploy_example/tree/main