Steps to Achieve Snowflake Replication: A Comprehensive Guide

• December 9th, 2020

Snowflake is a popular Cloud Data Warehousing solution that has been implemented by scores of well-known firms, including Fortune 500 companies, as their Data Warehouse provider and manager.

This article describes Snowflake replication in detail.

Hevo, A Simpler Alternative to Integrate your Data for Analysis

Hevo offers a faster way to move data from databases or SaaS applications into your data warehouse to be visualized in a BI tool. Hevo is fully automated and hence does not require you to code.

Check out some of the cool features of Hevo:

  • Completely Automated: The Hevo platform can be set up in just a few minutes and requires minimal maintenance.
  • Real-time Data Transfer: Hevo provides real-time data migration, so you can have analysis-ready data always.
  • 100% Complete & Accurate Data Transfer: Hevo’s robust infrastructure ensures reliable data transfer with zero data loss.
  • Scalable Infrastructure: Hevo has in-built integrations for 100+ sources that can help you scale your data infrastructure as required.
  • 24/7 Live Support: The Hevo team is available round the clock to extend exceptional support to you through chat, email, and support calls.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  • Security: Hevo is SOC II, GDPR, and HIPAA compliant. Hevo also enables top-grade security with end-to-end encryption, two-factor authentication, and more.
  • Live Monitoring: Hevo allows you to monitor the data flow so you can check where your data is at a particular point in time.
Sign up here for a 14-day free trial!

Introduction

In today’s competitive business environment, enterprises need their systems to be available 24 hours a day, 7 days a week. A database failure can occur for many reasons, including viruses, natural disasters, or network issues. Even a few minutes of downtime can have a significant financial impact.

Businesses operating in industries such as finance, healthcare, technology, and retail need to have high availability of data and a mechanism for quick recovery in case of failure. 

Enterprises normally have plans for business continuity and disaster recovery. However, fully restoring data and systems after a failure can still take a long time, and that delay can lead to loss of revenue.

The Snowflake team understands this, and that’s why they introduced the idea of database replication to ensure consistent data availability. 

Prerequisites

This is what you need for this article:

  • A Snowflake Account. 

Part 1: What is Snowflake?

Snowflake is an analytic data warehouse offered as Software-as-a-Service. It is designed to be faster, easier to use, and more flexible than traditional data warehouse offerings.

Snowflake’s data warehouse does not rely on an existing database or “big data” software platform like Hadoop. Instead, it uses a new SQL database engine with a unique architecture that is specifically designed for the cloud. 

To the user, Snowflake seems to be similar to the other enterprise data warehouses, but it has unique capabilities and additional functionalities. 

Part 2: Steps to Achieve Snowflake Replication

This section covers the steps needed to replicate databases across Snowflake accounts located in different regions. Here are the steps to achieve Snowflake replication:

Step 1: Link your Organization Accounts

Before configuring database replication, you should first link two or more accounts in your organization. To do this, contact the Snowflake support team.

To see the accounts in your organization that are enabled for replication, use the following command:

show replication accounts;

Step 2: Promote a Local Database to Serve as the Primary Database

You should now modify an existing transient or permanent database to act as the primary database. 

Just use the ALTER DATABASE … ENABLE REPLICATION TO ACCOUNTS command. 

You can use a comma (,) to separate the names of accounts in your organization that can store a replica of this database, and users in those accounts will be able to query objects in the secondary database. 

For example:

alter database mydb enable replication to accounts aws_us_east_1.myaccount1, azure_westeurope.myaccount2;

The above command promotes the local database mydb to serve as the primary database and specifies that the accounts myaccount1 and myaccount2 can each store a replica of this database. 
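As a quick check, the following sketch lists the replication-enabled databases visible to the current account. It assumes you are working with the ACCOUNTADMIN role, and the LIKE pattern simply reuses the mydb name from the example above.

```sql
-- Assumption for this sketch: replication settings are managed with the ACCOUNTADMIN role.
USE ROLE ACCOUNTADMIN;

-- List the primary and secondary databases available for replication, filtered to mydb.
SHOW REPLICATION DATABASES LIKE 'mydb';
```

If the promotion worked, the output should include a row for mydb that is flagged as a primary database, together with the accounts it may be replicated to.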

Step 3: Enable Failover for a Primary Database

Use the ALTER DATABASE … ENABLE FAILOVER TO ACCOUNTS statement to enable failover for the primary database to one or more accounts in your organization. A replica of the primary database in any of these accounts can then be promoted to serve as the primary database.

Note that you can enable failover for the primary database either before or after the creation of the replica of the primary database in a specified account. 

For example:

alter database mydb enable failover to accounts aws_us_east_1.myaccount1, azure_westeurope.myaccount2;

The above command enables failover for primary database mydb to accounts myaccount1 and myaccount2. 

Note that the above command MUST be executed in the account where the primary database is stored. 

For example, if the primary database is stored in the account myaccount3, the command must be executed in that account. 
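Once failover is enabled and a replica exists in a target account, that replica can be promoted if the primary database becomes unavailable. Here is a minimal sketch, assuming a secondary database named mydb already exists in myaccount2 and that the statement is run in that account:

```sql
-- Run in the account that holds the secondary database (myaccount2 in this sketch).
-- Promotes the local replica of mydb to serve as the primary database.
ALTER DATABASE mydb PRIMARY;
```

After the promotion, the database in myaccount2 accepts writes, and the former primary database takes on the role of a secondary database.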

Step 4: Create a Secondary Database

You should create a replica of an existing primary database in the same account that stores the primary database, or in a different account. 

Note that the secondary database can only be created in an account that was specified in the ALTER DATABASE … ENABLE REPLICATION TO ACCOUNTS command in Step 2 above.

Execute the CREATE DATABASE … AS REPLICA OF statement in each target account to create a replica of the primary database. 

For example:

CREATE DATABASE mydb
 AS REPLICA OF aws_us_west_2.myaccount1.mydb
 AUTO_REFRESH_MATERIALIZED_VIEWS_ON_SECONDARY = TRUE;

The above command creates a replica of the primary database mydb in the account myaccount2 with automatic refreshing of materialized views in the replica enabled. 

The command should be executed in myaccount2. 

You can then use the following command to refresh each secondary database from a snapshot of its primary database:

ALTER DATABASE mydb REFRESH;
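In practice, refreshes are usually run on a schedule rather than by hand. Here is a minimal sketch using a Snowflake task. The task name refresh_mydb_task, the warehouse mywh, and the 10-minute interval are placeholders for illustration, not values from this article:

```sql
-- Run in the account that holds the secondary database.
-- Create the task in a regular database and schema, since secondary databases are read-only.
CREATE TASK refresh_mydb_task
  WAREHOUSE = mywh            -- hypothetical warehouse name
  SCHEDULE = '10 MINUTE'      -- hypothetical refresh interval
AS
  ALTER DATABASE mydb REFRESH;

-- Tasks are created in a suspended state, so resume the task to start the schedule.
ALTER TASK refresh_mydb_task RESUME;
```

Choose the interval based on how stale the replica is allowed to become, keeping in mind that more frequent refreshes mean more data transfer and compute usage.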

If you need to monitor the progress of the secondary database refresh, run the following command:

select * from table(information_schema.database_refresh_progress(mydb));
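To review refreshes that have already finished, rather than the one in progress, Snowflake also provides a refresh history table function. Here is a short sketch, assuming the same secondary database name mydb:

```sql
-- Run in the account that holds the secondary database.
-- Returns the refresh history recorded for the secondary database mydb.
SELECT *
  FROM TABLE(information_schema.database_refresh_history(mydb));
```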

Part 3: Limitations

The following are the challenges of Snowflake replication:

  1. Replication can be demanding in terms of time, money, and bandwidth. 
  2. Replication means hosting duplicate information, which comes with additional costs. 
  3. Data replication creates network traffic, which may slow down processing speeds. 
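To get a feel for these costs, you can query the replication usage history for a database. Here is a minimal sketch, assuming a database named mydb and a seven-day window chosen purely for illustration:

```sql
-- Credits used and bytes transferred for replicating mydb over the last 7 days.
SELECT *
  FROM TABLE(information_schema.replication_usage_history(
    date_range_start => DATEADD(d, -7, CURRENT_DATE),
    date_range_end   => CURRENT_DATE,
    database_name    => 'mydb'));
```

Reviewing this output alongside your refresh schedule can help you decide whether the refresh frequency is worth its cost.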

Part 4: Use Hevo Data

Hevo Data provides its users with a simpler platform for integrating data for analysis. 

It is a no-code data pipeline that can help you combine data from multiple sources.

It provides you with a consistent and reliable solution to managing data in real-time, ensuring that you always have analysis-ready data in your desired destination. 

This leaves you free to focus on key business needs and perform insightful analysis using BI tools. 

Try Hevo out for yourself. Sign up for the 14-day free trial!

Conclusion

This is what you’ve learned in this article:

  • You’ve learned more about Snowflake. 
  • You’ve learned how to perform Snowflake replication.

Share your experience of working with Snowflake Replication in the comments section below.
