Snowflake is one of the most popular Data Warehouses, used for storage, processing, analysis, and app development (Snowflake Apps). The challenges encountered in manual tuning of Snowflake can be addressed by Terraform Snowflake integration.

Terraform, an Infrastructure as Code (IaC) tool, automates Snowflake deployment. It lets you manage Snowflake objects such as roles, grants, and users programmatically.

Terraform Snowflake Integration helps you manage your Snowflake infrastructure and resources from one place. The Terraform Snowflake provider makes it easy to manage Snowflake objects as code.

With Terraform Snowflake provider, you can avoid situations where someone accidentally grants a user the wrong role manually.

This blog will introduce you to Snowflake and Terraform along with their key features. Furthermore, a step-by-step guide to performing Terraform Snowflake Integration is provided.

What is Snowflake?

Snowflake is one of the most popular enterprise-grade Cloud-based Data Warehouses that brings simplicity to its users without reducing the features being offered in any way. 

It is capable of automatically scaling resources up and down based on data requirements to ensure that users get the right balance of performance and cost. 

Snowflake is known as the ‘Near-Zero Management’ platform, as it requires very little tuning to get the best performance out of it.

The key objects in Snowflake include Warehouses, Roles, Databases, Schemas, Tables, and views along with Grants that control these objects. 

The Snowflake web interface or the Snowflake SQL is used to work with these Snowflake objects. 

These Snowflake objects can be created, modified, and deleted. The Snowflake account configuration is done manually with the help of SQL snippets.

Read more about the 7 Comprehensive Aspects of Snowflake Features.

Introduction to Infrastructure as Code

Infrastructure as Code (IaC) tackles the problems incurred in the manual tuning of Snowflake. Instead of tuning each object one by one, Infrastructure as Code is used to make the changes all at once.

With Infrastructure as Code, you change resources by updating configuration files and deploying the changes using a Command Line Interface (CLI).

You can understand the state of your resources simply by reading the configuration file. You can also make changes the same way you change application code, with the benefit of git history, pull requests, and code review.

Pertaining to Snowflake, we need Infrastructure as Code to manage the key Snowflake resources like databases and users. 

Snowflake itself handles the actual configuration of the underlying virtual machines. There are a variety of Infrastructure as Code (IaC) tools, such as Chef, Puppet, Ansible, and Terraform.

Simplify Snowflake ETL and Analysis with Hevo’s No-code Data Pipeline

A fully managed No-code Data Pipeline platform like Hevo Data helps you integrate data from 100+ data sources (including 30+ Free Data Sources) to a destination of your choice such as Snowflake in real-time in an effortless manner. 

Hevo takes care of all your data preprocessing needs required to set up the integration and lets you focus on key business activities and draw a much more powerful insight on how to generate more leads, retain customers, and take your business to new heights of profitability. 

It provides a consistent & reliable solution to manage data in real-time and always have analysis-ready data in your desired destination.


What is Terraform? 

Terraform is one of the most widely used Infrastructure as Code (IaC) tools, and it is compatible with major cloud providers like AWS, GCP, etc.

Terraform is declarative and can help you define the resources and configurations you need. It manages dependencies, notes the previous state, and makes all the appropriate changes to align to the new desired state.
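
To make the idea of dependency management concrete, here is a minimal Python sketch of ordering resources so that each one is created after the things it depends on. It is only an illustration of the concept, not how Terraform is implemented internally, and the resource names and the creation_order helper are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical resources mapped to the resources they depend on.
# These names are illustrative and not taken from a real configuration.
dependencies = {
    "snowflake_database.analytics": set(),
    "snowflake_role.analyst": set(),
    "snowflake_schema.reporting": {"snowflake_database.analytics"},
    "snowflake_user.example_user": {"snowflake_role.analyst"},
}


def creation_order(deps):
    """Return resources in an order where every dependency comes first."""
    return list(TopologicalSorter(deps).static_order())


order = creation_order(dependencies)
print(order)
```

In this sketch the schema always comes after its database and the user after its role, which is the kind of ordering a declarative tool works out for you.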

Key concepts of Terraform

Some of the key concepts of Terraform are as follows:

Terraform Resources and .tf Files

Terraform resources are declared in the Terraform configuration language, and each resource maps to a component of your infrastructure. Resources are configured in .tf config files, which live in your Terraform directory. For Snowflake, resource types such as snowflake_user are defined by the CZI provider.

The .tfstate File

The current state of your resources is stored in the .tfstate file. You should not edit this file manually; the Terraform CLI commands modify it for you.
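
Although you should not edit the state file by hand, it is plain JSON and can be read. The sketch below lists the resource addresses recorded in a state document. The sample state is a trimmed, hypothetical example (real files carry more fields), and list_resource_addresses is an illustrative helper name.

```python
import json

# A trimmed, hypothetical state document for illustration only.
# Real .tfstate files contain additional fields that are omitted here.
SAMPLE_STATE = """
{
  "version": 4,
  "resources": [
    {"mode": "managed", "type": "snowflake_user", "name": "example_user",
     "instances": [{"attributes": {"name": "EXAMPLE_USER"}}]},
    {"mode": "managed", "type": "snowflake_role", "name": "public",
     "instances": [{"attributes": {"name": "PUBLIC"}}]}
  ]
}
"""


def list_resource_addresses(state_text):
    """Return 'type.name' for every resource recorded in the state text."""
    state = json.loads(state_text)
    return [f'{r["type"]}.{r["name"]}' for r in state.get("resources", [])]


addresses = list_resource_addresses(SAMPLE_STATE)
print(addresses)
```

In day-to-day use, the terraform state list command reports these addresses without any custom code.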

terraform plan and terraform apply

The terraform plan command compares your current .tf files to your .tfstate file and provides an output report.

This output report states how the configuration defined in the .tf file differs from the state of your current production infrastructure. The terraform apply implements the changes mentioned by the terraform plan in the current infrastructure.
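
The plan and apply cycle can be pictured with a toy Python model: one function compares the desired configuration with the recorded state and lists the actions needed, and another carries those actions out. This is a simplified sketch of the idea rather than Terraform's actual algorithm; toy_plan, toy_apply, and the sample users are all hypothetical.

```python
def toy_plan(desired, current):
    """Compare desired config with recorded state and list required actions."""
    actions = []
    for name in sorted(desired.keys() - current.keys()):
        actions.append(("create", name))
    for name in sorted(desired.keys() & current.keys()):
        if desired[name] != current[name]:
            actions.append(("update", name))
    for name in sorted(current.keys() - desired.keys()):
        actions.append(("delete", name))
    return actions


def toy_apply(desired, current, actions):
    """Return the new state after carrying out the planned actions."""
    new_state = dict(current)
    for action, name in actions:
        if action == "delete":
            del new_state[name]
        else:  # "create" and "update" both take the desired definition
            new_state[name] = desired[name]
    return new_state


# Hypothetical desired configuration (.tf) and recorded state (.tfstate).
desired = {
    "snowflake_user.alice": {"default_role": "ANALYST"},
    "snowflake_user.bob": {"default_role": "PUBLIC"},
}
current = {
    "snowflake_user.alice": {"default_role": "PUBLIC"},
    "snowflake_user.carol": {"default_role": "PUBLIC"},
}

planned = toy_plan(desired, current)
new_state = toy_apply(desired, current, planned)
print(planned)
```

After the apply step, the state matches the desired configuration, which is the outcome terraform apply aims for.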

Purpose of Terraform Snowflake Integration

A few notable problems come up when a Snowflake account is tuned manually.

These problems are the reason the Terraform Snowflake provider exists. Some of the issues with a hand-crafted Snowflake account are:

  • The manual object creation process is error-prone. There is no standard protocol followed while creating new objects in Snowflake.
  • A lot of confusion occurs in an organization while deciding the hierarchy in making changes in Snowflake.
  • The changes made are recorded in the QUERY_HISTORY view. Parsing these history logs is tedious and requires knowledge of SQL and Snowflake.
  • Writing SQL commands by hand for routine changes, such as adjusting the default query timeout in Snowflake, is really tedious.

Terraform Snowflake Integration: Steps

Bringing an existing Snowflake account under the management of Terraform requires wrangling all the pre-existing users, warehouses, objects, and schemas into your .tf and .tfstate files. This is a two-step process.

Step 1

Generate a .tfstate file that reflects the current state of your resources. Use terraform import to import the existing resources into your .tfstate file. For example, to import your PUBLIC role with the CZI Snowflake provider, use the following command.

$ terraform import snowflake_role.public "PUBLIC"

One drawback of terraform import is that batch importing is not possible: you can import only one resource at a time.
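
One common workaround is to script the repetition. The sketch below builds one terraform import command per resource from a list of pairs. The resource list and the helper names are hypothetical, and the function that actually runs the commands is defined but not called here, since it needs a configured Terraform working directory.

```python
import shlex
import subprocess

# Hypothetical (resource address, Snowflake identifier) pairs to import.
RESOURCES = [
    ("snowflake_role.public", "PUBLIC"),
    ("snowflake_role.sysadmin", "SYSADMIN"),
    ("snowflake_user.example_user", "EXAMPLE_USER"),
]


def build_import_commands(resources):
    """Build one `terraform import ADDRESS ID` argument list per resource."""
    return [
        ["terraform", "import", address, identifier]
        for address, identifier in resources
    ]


def run_imports(commands):
    """Run each import in turn; stops at the first failing command."""
    for command in commands:
        subprocess.run(command, check=True)


commands = build_import_commands(RESOURCES)
for command in commands:
    # Print the commands for review instead of running them.
    print(shlex.join(command))
```

Each resource still needs a matching resource definition in a .tf file before its import can succeed, which is what Step 2 covers.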

Step 2 

To complete what Step 1 started, you have to add a resource definition to your .tf files for every imported resource. Each definition has to be written manually, which becomes tedious when you need one for every pre-existing user.

At the time of writing, Terraform v0.13 cannot generate these resource definitions automatically, although the Terraform documentation mentions plans to address this.

Snowglobe was introduced to overcome the drawbacks of the traditional approach, namely running terraform import many times and writing resource definitions in bulk. It makes the process of Terraform Snowflake integration easier.

Terraform Snowflake Integration: Using Snowglobe

Snowglobe is a Python package that programmatically generates the resource definitions for existing Snowflake objects and resources.

Snowglobe appends these definitions to the .tf file. It also calls terraform import to bring each resource into the .tfstate file.

Snowglobe uses Snowflake SQL to fetch all the Snowflake objects and the current state of each resource. It then generates the resource configuration block and runs terraform import for the object.

The following example illustrates the use of Snowglobe in Terraform Snowflake integration.

You have to write a class for each resource type to import. In this example, a class CZISnowflakeUser is defined. Attributes of an object of this class match the properties of the relevant resource types from the Terraform provider.

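from typing import Optional


class CZISnowflakeUser:
    """One Snowflake user, mirroring the provider's snowflake_user resource."""

    def __init__(
        self,
        name: str,
        comment: Optional[str] = None,
        default_namespace: Optional[str] = None,
        # ... (remaining properties elided)
    ):
        self.name = name
        self.comment = comment
        self.default_namespace = default_namespace
        ...

    def alias_resource(self):
        # Local name used for the resource block in the .tf file
        return self.name.lower()

    def snowflake_provider_resource(self):
        # Terraform resource type this class maps to
        return "snowflake_user"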

The Snowflake Python connector is used to execute SHOW USERS and load all Snowflake users into a Pandas data frame. The CZISnowflakeUser class includes a method that parses a row from the data frame into a CZISnowflakeUser object.
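
The row-parsing step could look roughly like the sketch below. It is not Snowglobe's actual code: the UserRecord class, the from_row method, and the sample rows are hypothetical, and the column names are assumed to match the SHOW USERS output. Plain dictionaries stand in for data frame rows so the example runs without a Snowflake connection.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserRecord:
    """Hypothetical stand-in for a parsed Snowflake user."""

    name: str
    comment: Optional[str] = None
    default_namespace: Optional[str] = None

    @classmethod
    def from_row(cls, row):
        """Build a record from one row, treating empty strings as missing."""

        def clean(value):
            return value if value not in ("", None) else None

        return cls(
            name=row["name"],
            comment=clean(row.get("comment")),
            default_namespace=clean(row.get("default_namespace")),
        )


# Hypothetical rows; the column names are an assumption for illustration.
rows = [
    {"name": "EXAMPLE_USER", "comment": "This is an example user",
     "default_namespace": "PUBLIC"},
    {"name": "SERVICE_USER", "comment": "", "default_namespace": ""},
]

users = [UserRecord.from_row(row) for row in rows]
print(users)
```

Normalizing empty values to None at this stage makes it easy to leave unset properties out of the generated resource definition later.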

The parsed CZISnowflakeUser object is then passed through an obj_to_terraform parser, which converts the object into a Terraform-friendly resource definition string and appends it to the users.tf file.

resource "snowflake_user" "example_user" {  
  name                 = "EXAMPLE_USER"  
  comment              = "This is an example user"       
  default_namespace    = "PUBLIC"
  default_role         = "PUBLIC"
  default_warehouse    = "DEMO_WH"   
  disabled             = false  
  login_name           = "EXAMPLE_USER"  
  must_change_password = false
}
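
A converter that produces a block like the one above might look roughly like this. The function reuses the obj_to_terraform name from the text, but its signature and body are assumptions made for illustration, not Snowglobe's implementation. It handles only flat string, number, and boolean attributes.

```python
def obj_to_terraform(resource_type, alias, attributes):
    """Render a flat attribute dict as a Terraform resource block string.

    Sketch only: nested blocks, lists, and "${" sequences inside strings
    are not handled.
    """

    def render(value):
        if isinstance(value, bool):
            return "true" if value else "false"
        if isinstance(value, (int, float)):
            return str(value)
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'

    # Skip unset properties so they do not appear in the definition.
    present = {k: v for k, v in attributes.items() if v is not None}
    width = max((len(k) for k in present), default=0)

    lines = [f'resource "{resource_type}" "{alias}" {{']
    for key, value in present.items():
        lines.append(f"  {key.ljust(width)} = {render(value)}")
    lines.append("}")
    return "\n".join(lines) + "\n"


block = obj_to_terraform(
    "snowflake_user",
    "example_user",
    {
        "name": "EXAMPLE_USER",
        "comment": "This is an example user",
        "disabled": False,
        "default_role": None,  # unset, so it is left out
    },
)
print(block)
```

The returned string can then be appended to a file such as users.tf by opening it in append mode.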

A terraform import command is then generated and executed, importing the user into the .tfstate file.

A single main function ties everything together: it runs the parsing for every resource type and generates the final .tfstate file and the related .tf files. The output captures the current state of the existing Snowflake account in the Terraform language.

Once the Terraform Snowflake integration is in place, any later change to the .tf files followed by a terraform apply will make that change to the corresponding Snowflake resource.

Benefits of Using Terraform with Snowflake

Terraform Snowflake provider helps users in effectively managing Snowflake. Let’s take a look at some of the benefits of using Terraform with Snowflake.

  • Terraform leverages Infrastructure as Code to manage your Snowflake Account, allowing you to configure the state of Snowflake objects. This makes it easier to manage Snowflake.
  • Terraform Snowflake integration speeds up manual and repetitive tasks, saving you a lot of valuable time.
  • Furthermore, you can set up storage in your Cloud provider and add it to Snowflake as an external stage. You can then connect this storage with Snowpipe.
  • Standardization of roles and privileges keeps you updated with the current state of Snowflake environments such as schema changes, size of Data Warehouse, user access, etc.

Frequently Asked Questions (FAQs)

Should I learn AWS before Terraform?
Terraform is typically used in parallel with some public Cloud providers such as AWS, GCP, etc. Although it’s not mandatory, it is recommended to have knowledge of the underlying resources that you are creating with Terraform.

Is Terraform only for Cloud?
Terraform is an open-source IaC tool known for being Cloud-agnostic. It is used mostly for managing public Cloud infrastructures such as GCP, AWS, and Azure. However, Terraform can also be used for on-premises infrastructure including VMware vSphere and OpenStack.

What is a Terraform provider?
A Terraform provider is a plugin that lets users manage external APIs. Terraform providers such as the AWS provider and the Cloud-init provider act as a translation layer between Terraform and a variety of Cloud providers, Databases, and services.

What language is supported by Terraform?
Terraform allows users to define and provide data center infrastructure using HashiCorp Configuration Language (HCL). HashiCorp Configuration Language (HCL) is a unique declarative configuration language designed to be used with HashiCorp tools, notably Terraform. Alternatively, users can also use JSON to define data center infrastructure.
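
Because JSON is accepted as an alternative syntax, configuration can also be produced from a script. The sketch below builds a resource definition as a Python dictionary and serializes it. The user values are hypothetical, and Terraform expects JSON configuration in files whose names end in .tf.json.

```python
import json

# Hypothetical user definition expressed in Terraform's JSON syntax:
# resource -> resource type -> local name -> arguments.
config = {
    "resource": {
        "snowflake_user": {
            "example_user": {
                "name": "EXAMPLE_USER",
                "comment": "This is an example user",
                "disabled": False,
            }
        }
    }
}

rendered = json.dumps(config, indent=2)
print(rendered)
```

Writing the rendered text to a file such as users.tf.json would let Terraform read it alongside regular .tf files.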

What is the difference between Terraform and Kubernetes?
Kubernetes is an open-source container orchestration system for automating deployment, scaling, and management of containerized applications like Docker containers. Terraform, on the other hand, is an open-source Infrastructure as Code tool that allows Developers to safely and seamlessly create, update, and improve Cloud infrastructure through its consistent CLI workflow.

Conclusion

Snowflake is a major player in the Cloud Data Warehousing industry and it describes itself as a “Near-Zero Management” platform. Although there are minimal knobs to turn, some fine-tuning is still required to achieve stellar performance. This is where Terraform Snowflake Integration comes in. It helps you to completely manage your Snowflake infrastructure and resources.

This blog introduced you to the Snowflake Data Warehouse, Infrastructure as Code, and Terraform. It then walked you through setting up a Terraform Snowflake integration and explained its various aspects.

Also, if you’re using Lambda functions to serve web pages, process data streams, and call APIs, you might want to explore Terraform Lambda.

With your Snowflake Warehouse live and running, you’ll need to extract and ingest data from multiple sources to Snowflake in order to carry out an effective Data Analysis.

However, extracting complex data from a diverse set of data sources into Data Warehouses like Snowflake can be quite challenging and tiresome. So, you can use an easier alternative like Hevo. Hevo can seamlessly automate the process of Data Ingestion in Snowflake with its No-code Data Pipeline.

Would you like to take Hevo for a test? Sign up for a 14-day free trial and experience the feature-rich Hevo suite firsthand. You can also have a look at the pricing to choose the right plan for your business needs.

Tell us about your experience of setting up the Terraform Snowflake Integration! Share your thoughts with us in the comments section below.

Business Analyst, Hevo Data

Sherley is a data analyst with a keen interest in data analysis and architecture and a flair for writing technical content. He has experience writing articles on various topics related to data integration and infrastructure.