Terraform Snowflake Integration – 2 Easy Steps

on Automation, CRM Automation, CRMs, Data Warehouse, Data Warehouses, Snowflake, Tutorials • September 30th, 2021

Snowflake is one of the most popular data warehouses used for storage, processing, and analysis. The challenges encountered in manual tuning of Snowflake can be addressed by Terraform Snowflake integration. Terraform, an Infrastructure as Code (IaC) tool, automates Snowflake deployment. You can manage Snowflake objects like roles, grants, and users programmatically using Terraform. Terraform Snowflake Integration will completely manage your Snowflake infrastructure and resources. The Terraform Snowflake provider by the Chan Zuckerberg Initiative (CZI) has made it easier to manage all the Snowflake objects. With the Terraform Snowflake provider, you can avoid situations where someone accidentally grants a user the wrong role manually.

In this blog, you will be introduced to Snowflake and Terraform along with their key features and concepts. The steps involved in Terraform Snowflake Integration will be discussed in detail, with emphasis on the Terraform Snowflake provider.

Introduction to Snowflake

Terraform Snowflake Integration: Snowflake Logo
Image Source: Wikipedia

Snowflake is one of the most popular enterprise-grade Cloud-based Data Warehouses, offering simplicity to its users without compromising on features. It is capable of automatically scaling resources up and down based on data requirements to ensure that users get the right balance of performance and cost. Snowflake is known as the ‘Near-Zero Management’ platform, as it requires very minimal tuning to get the best of its performance.

The key objects in Snowflake include Warehouses, Roles, Databases, Schemas, Tables, and Views, along with Grants that control access to these objects. The Snowflake web interface or Snowflake SQL is used to create, modify, and delete these objects. The Snowflake account configuration is done manually with the help of SQL snippets.

Key Features of Snowflake

The key features of Snowflake are as follows:

  • Advanced Scalability: Snowflake houses functionalities that allow users to practically create an unlimited number of Virtual Warehouses with each one running its workload against the data in its database.
  • Robust Security: Snowflake ensures that all data stored in its Data Warehouses is secure by implementing a wide variety of industry-leading and powerful security features, such as Multi-factor Authentication, Automatic 256-bit AES Encryption, etc. Snowflake is also compliant with numerous enterprise-grade data security standards such as PCI DSS, HIPAA, SOC 1, and SOC 2.
  • Automated Performance Tuning: Snowflake offers its users an Automatic Query Performance Optimization mechanism backed by a robust Dynamic Query Optimization Engine in their Cloud Services Layer.

Introduction to Infrastructure as Code

Infrastructure as Code (IaC) comes to the rescue to tackle the problems incurred in manual tuning of Snowflake. Instead of tuning each object one by one, Infrastructure as Code is used to make the changes at once. Infrastructure as Code helps you make changes to resources by updating configuration files and deploying the changes using the Command Line Interface (CLI).

Using Infrastructure as Code, you can easily understand the state of your resources by reading the configuration file. You can make changes with the same ease as with code, gaining Git history, pull requests, and code reviews.

Pertaining to Snowflake, you need Infrastructure as Code to manage the key Snowflake resources like databases and users. The actual configuration of your virtual machines is done by Snowflake itself. There are a variety of Infrastructure as Code (IaC) tools like Chef, Puppet, Ansible, and Terraform.

Simplify Snowflake ETL and Analysis with Hevo’s No-code Data Pipeline

A fully managed No-code Data Pipeline platform like Hevo Data helps you integrate data from 100+ data sources (including 30+ Free Data Sources) to a destination of your choice such as Snowflake in real-time in an effortless manner. Hevo, with its minimal learning curve, can be set up in just a few minutes, allowing users to load data without having to compromise performance. Its strong integration with umpteen sources provides users with the flexibility to bring in data of different kinds in a smooth fashion without having to code a single line.

Hevo takes care of all your data preprocessing needs required to set up the integration and lets you focus on key business activities and draw much more powerful insights on how to generate more leads, retain customers, and take your business to new heights of profitability. It provides a consistent & reliable solution to manage data in real-time and always have analysis-ready data in your desired destination.

Get Started with Hevo for Free

Check out what makes Hevo amazing:

  • Real-Time Data Transfer: Hevo with its strong Integration with 100+ Sources (including 30+ Free Sources), allows you to transfer data quickly & efficiently. This ensures efficient utilization of bandwidth on both ends.
  • Data Transformation: It provides a simple interface to perfect, modify, and enrich the data you want to transfer. 
  • Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
  • Tremendous Connector Availability: Hevo houses a large variety of connectors and lets you bring in data from numerous Marketing & SaaS applications, databases, etc., such as HubSpot, Marketo, MongoDB, Oracle, Salesforce, Redshift, etc. in an integrated and analysis-ready form.
  • Simplicity: Using Hevo is easy and intuitive, ensuring that your data is exported in just a few clicks. 
  • Completely Managed Platform: Hevo is fully managed. You need not invest time and effort to maintain or monitor the infrastructure involved in executing codes.
  • Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
Sign up here for a 14-Day Free Trial!

Introduction to Terraform 

Image Source: dev.classmethod

Terraform is one of the most preferred Infrastructure as Code (IaC) tools. It is compatible with major cloud providers like AWS and GCP. Terraform is declarative and can help you define the resources and configurations you need. Terraform manages dependencies, notes the previous state, and makes all the appropriate changes to align to the new desired state.

Key concepts of Terraform

Terraform Resources and .tf Files

Terraform resources are declared using the Terraform configuration language and map to the components of your infrastructure. These resources are configured in .tf config files, which live in your Terraform directory. For Snowflake, Terraform resources are defined by resources such as snowflake_user in the CZI provider.

The .tfstate File

The current state of your resources is recorded in the .tfstate file. This file should not be edited manually; the Terraform CLI commands will modify it for you.

terraform plan and terraform apply

The terraform plan command compares your current .tf files to your .tfstate file and provides an output report. This report states how the configuration defined in the .tf files differs from the state of your current production infrastructure. The terraform apply command implements the changes identified by terraform plan in the current infrastructure.

Purpose of Terraform Snowflake Integration

There are a few notable problems encountered in the course of manual tuning of a Snowflake account. These issues create a purpose for the Terraform Snowflake provider. To list a few problems with a hand-crafted Snowflake account:

  • The manual object creation process is error-prone. There is no standard protocol followed while creating new objects in Snowflake.
  • A lot of confusion occurs in an organization while deciding the hierarchy for making changes in Snowflake.
  • The changes made are recorded in the QUERY HISTORY view. Parsing these history logs is hectic and needs knowledge of SQL and Snowflake.
  • Writing SQL commands to change settings such as the default query timeout in Snowflake is really tedious.

Terraform Snowflake Integration

Bringing existing Snowflake users under the management of Terraform requires wrangling all the pre-existing users, warehouses, objects, and schemas into your .tf and .tfstate files. This is a two-step process.

Step 1: Generate a .tfstate file that reflects the current state of your resources. Use terraform import to import the existing resources into your .tfstate file. For example, to import your PUBLIC role using the CZI Snowflake provider, the following command is used.

$ terraform import snowflake_role.public "PUBLIC"

One drawback of terraform import is that batch importing is not possible; you are allowed to import only one resource at a time.
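Because each invocation of terraform import handles exactly one resource, one common workaround is a small script that loops over every resource you want to bring under management. Here is a minimal sketch in Python; the resource addresses and Snowflake identifiers are hypothetical examples, not taken from a real account:

```python
import subprocess

def build_import_commands(resources):
    """Build one `terraform import` command per resource.

    `resources` maps a Terraform resource address to the
    Snowflake identifier it should be imported from.
    """
    return [
        ["terraform", "import", address, snowflake_id]
        for address, snowflake_id in resources.items()
    ]

# Hypothetical pre-existing Snowflake objects.
resources = {
    "snowflake_role.public": "PUBLIC",
    "snowflake_user.example_user": "EXAMPLE_USER",
}

for cmd in build_import_commands(resources):
    print(" ".join(cmd))
    # Uncomment to actually run each import against your account:
    # subprocess.run(cmd, check=True)
```

Each command still imports a single resource; the loop merely removes the manual repetition.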

Step 2: To complete the process started in Step 1, you have to generate resource definitions in your .tf files. You will have to manually write a resource definition for every resource, which is a tedious process for all the pre-existing users.

As of the latest version, v0.13, Terraform is not able to generate these resource definitions automatically, but Terraform has mentioned addressing this problem in its latest docs.

In order to overcome the disadvantages of the traditional Terraform Snowflake Integration, namely running terraform import multiple times and writing bulk resource definitions, Snowglobe was introduced. Snowglobe makes the process of Terraform Snowflake integration easier.

Terraform Snowflake Integration: Using Snowglobe

Snowglobe is a Python package that programmatically generates the resource definition files for the existing Snowflake objects and resources. Snowglobe appends these results to the .tf file and calls terraform import to bring all these resources and their definitions into the .tfstate file. Snowflake SQL is used to fetch all the Snowflake objects and the current state of each resource. Snowglobe then generates the resource configuration block and runs terraform import for each object.

The following example illustrates the use of Snowglobe in Terraform Snowflake integration.

You have to write a class for each resource type to import. In this example, a class CZISnowflakeUser is defined. The attributes of an object of this class match the properties of the relevant resource type from the Terraform provider.

class CZISnowflakeUser:
    def __init__(
        self,
        name: str,
        comment: str = None,
        default_namespace: str = None,
        ...
    ):
        self.name = name
        self.comment = comment
        self.default_namespace = default_namespace
        ...

    def alias_resource(self):
        return f"{self.name.lower()}"

    def snowflake_provider_resource(self):
        return "snowflake_user"

The Snowflake Python connector is used to execute SHOW USERS and import all Snowflake users into a Pandas DataFrame. The CZISnowflakeUser class includes a method to parse a row from the DataFrame into a CZISnowflakeUser object. This parsed CZISnowflakeUser object is then passed through an obj_to_terraform parser, which coerces the object into a Terraform-friendly resource definition string and appends it to the users.tf file.
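A minimal sketch of that row-parsing step might look like the following. A plain dict stands in for a DataFrame row here, since both support lookup by column name, and the column names are assumptions about the SHOW USERS output rather than a guaranteed schema:

```python
class CZISnowflakeUser:
    """Simplified stand-in for the class defined above."""

    def __init__(self, name, comment=None, default_namespace=None):
        self.name = name
        self.comment = comment
        self.default_namespace = default_namespace

    @classmethod
    def from_row(cls, row):
        # `row` may be a dict or a pandas Series; both support
        # `.get()` lookups by column name. Empty strings from
        # SHOW USERS are normalized to None.
        return cls(
            name=row["name"],
            comment=row.get("comment") or None,
            default_namespace=row.get("default_namespace") or None,
        )

# One row as it might come back from SHOW USERS.
row = {
    "name": "EXAMPLE_USER",
    "comment": "This is an example user",
    "default_namespace": "PUBLIC",
}
user = CZISnowflakeUser.from_row(row)
print(user.name)  # EXAMPLE_USER
```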

resource "snowflake_user" "example_user" {  
  name                 = "EXAMPLE_USER"  
  comment              = "This is an example user"       
  default_namespace    = "PUBLIC"
  default_role         = "PUBLIC"
  default_warehouse    = "DEMO_WH"   
  disabled             = false  
  login_name           = "EXAMPLE_USER"  
  must_change_password = false
}
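A resource block like the one above can be produced by a small serializer. The following sketch is an assumption about how such an obj_to_terraform helper might look; the stub User class is hypothetical and simply mirrors the alias_resource and snowflake_provider_resource methods shown earlier:

```python
class User:
    """Minimal stand-in for the CZISnowflakeUser class above."""

    def __init__(self, name, comment=None, default_namespace=None,
                 disabled=False):
        self.name = name
        self.comment = comment
        self.default_namespace = default_namespace
        self.disabled = disabled

    def alias_resource(self):
        return self.name.lower()

    def snowflake_provider_resource(self):
        return "snowflake_user"


def obj_to_terraform(obj):
    """Render a user-like object as an HCL resource block."""
    attrs = {
        "name": obj.name,
        "comment": obj.comment,
        "default_namespace": obj.default_namespace,
        "disabled": obj.disabled,
    }
    lines = [
        f'resource "{obj.snowflake_provider_resource()}" '
        f'"{obj.alias_resource()}" {{'
    ]
    for key, value in attrs.items():
        if value is None:
            continue  # skip attributes that were never set
        if isinstance(value, bool):
            rendered = "true" if value else "false"
        else:
            rendered = f'"{value}"'
        lines.append(f"  {key} = {rendered}")
    lines.append("}")
    return "\n".join(lines) + "\n"


user = User("EXAMPLE_USER", comment="This is an example user",
            default_namespace="PUBLIC")
print(obj_to_terraform(user))
```

In Snowglobe's workflow, the resulting string would then be appended to users.tf before terraform import is run for the object.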

A terraform import command is then generated and executed, importing the user into the .tfstate file.

A single main function calls everything, parsing all resource types and generating the final .tfstate file and the related .tf files. The output files hold the current state of the existing Snowflake account in the Terraform language.

With Terraform Snowflake integration in place, any later changes to the .tf files followed by a corresponding terraform apply will make those changes to the Snowflake resources.

Conclusion

From this blog, you have learned about Snowflake and Terraform along with their key features and concepts, the purpose of Terraform Snowflake Integration, and the steps involved in setting it up.

If you are using Snowflake as a Data Warehouse in your firm and searching for an alternative to Manual Data Integration, then Hevo can seamlessly automate this for you. Hevo, with its strong integration with 100+ sources, allows you to not only export & load data but also transform & enrich your data & make it analysis-ready in a jiffy.

Visit our Website to Explore Hevo

Would you like to take Hevo for a test?

Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand. You can also have a look at the unbeatable pricing that will help you choose the right plan for your business needs.

Tell us about your experience of setting up the Terraform Snowflake Integration! Share your thoughts with us in the comments section below.
