Amplitude Snowflake Integration: Simplified 101

April 14th, 2022

Business intelligence (BI) tools like Google Data Studio, Power BI, and Tableau help organizations understand data and make effective business decisions. But, with BI tools, you cannot effectively map your consumers’ journey and build users’ profiles from engagements on websites and applications.

However, it is essential to create users’ profiles for having an in-depth analysis of your customers as well as their experience with your software solutions. With Amplitude, user profiling becomes easy due to superior features like Retention and Cohorts. You can further use Amplitude with data sources like Snowflake for identity resolutions, data governance, and better analysis.

This article talks about Amplitude Snowflake Integration. In addition to that, it gives a brief introduction to Amplitude and Snowflake.

Prerequisites

  • Basic understanding of BI tools

What is Amplitude?

To maintain a competitive advantage and attract and retain customers, business leaders must take a digital-first approach. This mindset shifts away from simply getting customers in the door and toward creating a digital experience that adds value to the customer.

Product analytics allows you to create that digital experience without having to make any educated guesses. It provides you with concrete data to help you improve conversions, retention, and revenue.

Developed in 2012, Amplitude is a product analytics platform used for mapping the customer’s journey on websites and applications. Product analysis is a process to understand how customers get engaged with your products or services. Although Amplitude is used mainly for the Saas (Software as a Service) businesses that host an application online and make it available for the customers through subscriptions, you can use it for almost every business.

For example, if you own an e-commerce website, Amplitude can keep track of your customers’ purchases and the products they have added to their carts. You can notice their behavior by analyzing how many customers have added your products to their cart or have purchased your products after signing up for your website. As a result, Amplitude helps businesses to understand interactions between customers and products in real-time. Such capabilities allow organizations to identify shortcomings in products by analyzing user experience and eliminating bottlenecks by optimizing the website or application. 
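Tracking like this is typically done by sending events to Amplitude's HTTP V2 API. As an illustrative sketch, here is how such a request body could be assembled in Python; the API key and property names are placeholders, not values from this article:

```python
import json

# Placeholder key for illustration only; use your project's real API key.
API_KEY = "YOUR_AMPLITUDE_API_KEY"

def build_event(user_id, event_type, event_properties=None):
    """Build a single event in the shape expected by Amplitude's HTTP V2 API."""
    return {
        "user_id": user_id,
        "event_type": event_type,
        "event_properties": event_properties or {},
    }

def build_payload(events):
    """Wrap events in the top-level request body for POST /2/httpapi."""
    return json.dumps({"api_key": API_KEY, "events": events})

# Example: record an "Add To Cart" event for an e-commerce visitor.
payload = build_payload([
    build_event("customer-42", "Add To Cart", {"product_id": "sku-123"}),
])
print(payload)
```

The resulting JSON string would be POSTed to Amplitude's HTTP API endpoint; the actual network call is omitted here.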

Amplitude describes itself as a Digital Optimization System. Powered by the proprietary Amplitude Behavioral Graph, it lets businesses see and predict which features and actions translate to business outcomes – from loyalty to lifetime value – and intelligently adapt each experience in real time based on these insights.

Snowflake, covered below, lets your company choose where your data is geographically stored across regional, national, and international operations, and your computing resources are provisioned according to regions as well. Snowflake supports three global geographic segments (North America, Europe, and the Asia Pacific) across all Snowflake-supported cloud platforms.

Key Features of Amplitude

  • Event Segmentation: Event segmentation in Amplitude creates charts from events that show what your users are doing with your products. It can help you:
    • Measure the top events performed over a selected period.
    • Analyze how often the events are fired.
    • Count the users who fire events in your product.
    • Identify which users are firing those events.
  • Event Module: Events in Amplitude are actions taken in your product, such as pushing a button, completing a level, or making a payment. To understand how users engage with your product, you should aim to track between 15 and 200 events. Amplitude can also track inactive events: actions that are not taken directly by end users but still occur within the website or application, like push notifications.
  • Retention Analysis: Retention Analysis in Amplitude refers to analyzing how customers return to your product after taking a particular action. You can also choose not to pin retention to a particular event: Amplitude can compute retention data by comparing the date of a start event to that of an end event. The Retention Analysis chart shows the percentage of active users who fired the events you are interested in over a specified interval, such as daily, weekly, or monthly.
  • User Cohorting: User Cohorting in Amplitude helps you identify the behavior of your customers. A cohort is a group of customers who share a common trait or set of traits, defined by the customers' actions within a specified period; customers who fired the same events are grouped into the same cohort. For example, if you own an OTT platform, users who watched 4 to 5 consecutive episodes of a particular series after signing in fall into the same cohort.
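To make the retention idea above concrete, here is a toy Python sketch (not Amplitude's actual implementation) that computes day-N retention from start-event dates and return-event activity:

```python
from datetime import date, timedelta

def day_n_retention(start_dates, active_dates, n):
    """Percentage of users active exactly n days after their start event.

    start_dates: {user_id: date of the start event}
    active_dates: {user_id: set of dates on which the user fired the return event}
    """
    cohort = list(start_dates)
    if not cohort:
        return 0.0
    retained = sum(
        1 for user in cohort
        if start_dates[user] + timedelta(days=n) in active_dates.get(user, set())
    )
    return 100.0 * retained / len(cohort)

# Two users signed up on April 1st; only one came back a week later.
starts = {"a": date(2022, 4, 1), "b": date(2022, 4, 1)}
activity = {"a": {date(2022, 4, 8)}, "b": set()}
print(day_n_retention(starts, activity, 7))  # 50.0
```

A real retention chart would compute this for every n in a window (day 1, day 7, day 30, and so on) and plot the resulting curve.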

What is Snowflake?

Developed in 2012, Snowflake is a data warehouse that runs on cloud infrastructures like AWS, Google Cloud Platform, and Microsoft Azure. The platform offers services such as data engineering, data lakes, data warehousing, data applications, data sharing, data analytics, and more. When data is loaded, Snowflake organizes it into an optimized, compressed, columnar format. Snowflake is easy to use because it operates on data with standard SQL queries. You can start using Snowflake by registering for a 30-day free trial with an email address and other credentials.

Snowflake provides the Data Cloud, a global network that allows thousands of organizations to mobilize data at near-unlimited scale, concurrency, and speed. Organizations can use the Data Cloud to unify their siloed data, discover and securely share governed data, and run a variety of analytic workloads. Snowflake provides a unified and seamless experience across multiple public clouds, regardless of where data or users reside. Snowflake’s platform is the heart of the Data Cloud, enabling data warehousing, data lakes, data engineering, data science, data application development, and data sharing.

Snowflake is a fully managed service that is simple to use yet can power a near-unlimited number of concurrent workloads. Whether you are a business or technology professional, it offers the performance, flexibility, and near-infinite scalability to easily load, integrate, analyze, and securely share your data. Data warehousing, data lakes, data engineering, data science, data application development, and securely sharing and consuming shared data are all possible with Snowflake.

Snowflake's architecture offers similar flexibility with big data. Snowflake separates storage from compute, so businesses that need a lot of storage but few CPU cycles, or vice versa, don't have to pay for an integrated bundle that includes both. Users can scale up or down according to their needs and pay only for the resources they use. Storage is billed in terabytes per month, while compute is billed per second.
In fact, the Snowflake architecture is made up of three layers (storage, compute, and services), each of which can be scaled independently.
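As a back-of-the-envelope illustration of the pay-per-use model described above, the sketch below estimates a monthly bill from storage (billed per terabyte-month) and compute (billed per second, via warehouse credits). The prices are placeholder figures, not Snowflake's actual rates:

```python
def monthly_cost(storage_tb, compute_seconds, credits_per_hour,
                 price_per_tb=23.0, price_per_credit=2.0):
    """Rough monthly bill: storage billed per TB-month, compute per second.

    price_per_tb and price_per_credit are illustrative placeholders,
    not Snowflake's actual pricing.
    """
    storage = storage_tb * price_per_tb
    credits = (compute_seconds / 3600.0) * credits_per_hour
    compute = credits * price_per_credit
    return storage + compute

# 10 TB stored, 2 hours of compute on a warehouse burning 4 credits/hour:
print(monthly_cost(10, 2 * 3600, 4))  # 10*23 + 8*2 = 246.0
```

The point of the model: halving your idle compute time cuts the compute term directly, without touching the storage term, which is exactly what the separation of storage and compute makes possible.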

The Data Cloud manages more than 250 PB of data for Snowflake customers, with more than 515 million data workloads running every day.
According to Forrester's Total Economic Impact™ study, Snowflake customers can expect a 612 percent ROI and total benefits of more than $21 million over three years.
Technology partners and system integrators can assist you in deploying Snowflake, from data management to analytics.

Key Features of Snowflake

  • Connectors and Drivers: Snowflake provides users with an extensive set of client connectors and drivers. While the connectors include Python and Spark connectors, the set of drivers includes Node.js, Go Snowflake, .NET, JDBC, ODBC, PHP PDO driver, and more.
  • Data Import and Data Export: With Snowflake, you can load and unload data in and out of tables in the following ways:
    • Load data from compressed files.
    • Load data files like CSV, TSV, and more.
    • Load data files in JSON, Avro, ORC, Parquet, and XML format.
    • Load data from cloud or local files using the Snowflake web interface or command-line client.
  • Data Sharing: With secure data sharing in Snowflake, you can share selected objects in your account's database, like Tables, External Tables, Secure Views, and Secure Materialized Views, with other Snowflake accounts.
  • SQL Operations: Snowflake supports the following SQL operations:
    • Table and View DDL
    • Standard DML
    • Database and schema DDL
    • Advanced DML
    • Transactions
    • Lateral Views and more
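Loading staged files into a table, as in the Data Import feature above, is done in Snowflake with a COPY INTO statement. Here is a minimal Python sketch that assembles such a statement; the table and stage names are hypothetical, and in real use they must match objects that already exist in your account:

```python
def copy_into_sql(table, stage, file_format="CSV", compression="GZIP"):
    """Assemble a Snowflake COPY INTO statement for loading staged files.

    Identifiers are illustrative placeholders; validate or quote them
    before using this pattern with untrusted input.
    """
    return (
        f"COPY INTO {table} "
        f"FROM @{stage} "
        f"FILE_FORMAT = (TYPE = {file_format} COMPRESSION = {compression});"
    )

print(copy_into_sql("events", "my_stage", file_format="JSON"))
```

The generated statement would then be executed through a Snowflake client such as the web interface, SnowSQL, or a connector/driver from the list above.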

Explore These Methods for Amplitude Snowflake Integration

Amplitude is well-known for its collaborative analytics, which helps businesses track visitors. Snowflake, meanwhile, is a data warehouse featuring storage and compute separation, scalable on-the-fly compute, data sharing, and data cloning. Integrated together, moving data from Amplitude to Snowflake can solve some of the biggest data problems for businesses.

Method 1: Amplitude To Snowflake Connection Using Hevo

Hevo Data, an Automated Data Pipeline, provides a hassle-free solution to connect Amplitude to Snowflake within minutes through an easy-to-use no-code interface. Hevo is fully managed and completely automates the process of not only loading data from 100+ sources (including 40+ free sources) like Amplitude but also enriching the data and transforming it into an analysis-ready form, without your having to write a single line of code.

Method 2: Using Custom Code to Move Data from Amplitude to Snowflake

This method would be time-consuming and somewhat tedious to implement. Users will have to write custom codes to enable two processes, streaming data from Amplitude and ingesting data into Snowflake. This method is suitable for users with a technical background.

Understanding Amplitude to Snowflake Migration Methods

These are the methods you can use to migrate data from Amplitude to Snowflake:

Method 1: Amplitude To Snowflake Connection Using Hevo

Hevo provides you with a truly efficient and fully-automated solution to manage data in real-time and always have analysis-ready data at your data warehouse.

Using Hevo Data, you can connect Amplitude to Snowflake in the following 2 steps:

  • Step 1: Configuring Amplitude Analytics as a Source
    • Log in to your Amplitude account.
    • Navigate to "Settings" from the left navigation menu.
    • In the "Org Settings", click on the "Projects" option.
    • Select the project whose data you want to export to Snowflake.
    • Copy the "API Key" and "Secret Key" from the project details.
    • Log in to your Hevo account.
    • Click on the "PIPELINES" option in the "Asset Palette".
    • Click on the "+ Create" button in the "Pipelines List View".
    • Select the "Amplitude Analytics" source on the "Select Source Type" page.
    • This opens "Configure your Amplitude Analytics Source"; provide the required details here.
    • Click on "TEST & CONTINUE" to proceed and set up the destination.
  • Step 2: Configuring Snowflake as a Destination: Establish a connection to Snowflake by providing information about your Snowflake account and its credentials, such as Database Name, Username, and Password.

Method 2: Using Custom Code to Move Data from Amplitude to Snowflake

Using Custom Code for Amplitude Snowflake Integration, you have to implement the following:

Amplitude Snowflake Integration: Export Amplitude Data to Snowflake

Amplitude users can load their event data into their Snowflake account by setting up a recurring sync from the Amplitude UI. Users can also manually initiate a sync of their historical Amplitude data. A sync is a "synchronization" between databases that keeps them consistent. This feature is available only to paid users; the free trial does not include it.

Amplitude Snowflake Integration: Set Up a Recurring Data Export to Snowflake 

It is an important step in Amplitude Snowflake Integration: a simple process in which each sync completes within 10 to 15 minutes.

Follow the below steps to set up recurring export of your Amplitude data to Snowflake. It is assumed that users have logged in to Amplitude as paid users.

  • Step 1: Navigate to the Data Destinations tab.
  • Step 2: Click on the Snowflake panel under the ‘Add More Destinations‘ tab. It will open the Export Data to Snowflake page.
  • Step 3: Under the Export Data to Snowflake tab, select the data you want to export to Snowflake. You can export data like event data, merged Amplitude IDs, or even both. 
  • Step 4: Review the Event Table and Merge IDs table schemas and click on Next. It will open the ‘Setup Export’ tab.
  • Step 5: Enter the below details in the ‘Snowflake Credentials For Amplitude Section’.

Account name: This is the name of your Snowflake account. It is the first part of your Snowflake URL, i.e., everything before 'snowflakecomputing.com'.

Warehouse: It is the warehouse that Amplitude will use to load data.

Database: It is the database where the data will be stored. 

Username: It is the username that Amplitude uses to connect to the Snowflake account.

Password: It is the password associated with the username.

Note: Credentials are case-sensitive, so enter them carefully.
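Since the account name is the part of your Snowflake URL that precedes 'snowflakecomputing.com', a small helper can extract it. This is an illustrative sketch, and the example URL is hypothetical:

```python
def account_name_from_url(url):
    """Extract the Snowflake account name: everything in the hostname
    before the '.snowflakecomputing.com' suffix."""
    host = url.replace("https://", "").split("/")[0]
    suffix = ".snowflakecomputing.com"
    if not host.endswith(suffix):
        raise ValueError(f"not a Snowflake URL: {url}")
    return host[: -len(suffix)]

print(account_name_from_url("https://xy12345.us-east-1.snowflakecomputing.com"))
# xy12345.us-east-1
```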

  • Step 6: To the right of the Snowflake credentials section, Amplitude dynamically creates a query that generates the required Snowflake objects. Copy and run that query in your Snowflake account.
  • Step 7: Click on Next. Amplitude will try to upload the test data using the above credentials. If the upload is successful, click on Finish.

After the 7th step, all future events will automatically be sent to Snowflake.

Amplitude Snowflake Integration: Exporting Historical Amplitude Data to Snowflake

  • Step 1: Click on Export Data and select a date range to export your historical Amplitude data into Snowflake.
  • Step 2: Depending on your data size, warehouse size, cluster count, network bandwidth, and the number of concurrent historical data exports, this process can take from a day to several weeks.
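If you prefer to pull historical event data yourself, as in the custom-code method, Amplitude also exposes an Export API that takes a date range in hourly granularity. A sketch of building such a request URL follows; authentication with your project's API and secret keys, and the actual HTTP call, are omitted:

```python
from datetime import datetime

EXPORT_ENDPOINT = "https://amplitude.com/api/2/export"

def export_url(start, end):
    """Build an Amplitude Export API URL for a historical date range.

    The API expects start/end in YYYYMMDDTHH format and authenticates
    the request with the project's API key and secret key (not shown).
    """
    fmt = "%Y%m%dT%H"
    return f"{EXPORT_ENDPOINT}?start={start.strftime(fmt)}&end={end.strftime(fmt)}"

print(export_url(datetime(2022, 4, 1, 0), datetime(2022, 4, 30, 23)))
# https://amplitude.com/api/2/export?start=20220401T00&end=20220430T23
```

The response is a zip archive of gzipped JSON event files, which you could then stage and load into Snowflake with COPY INTO.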

Amplitude Snowflake Integration: Connect Amplitude to Snowflake

The Amplitude Snowflake integration also lets you ingest Snowflake data directly into your Amplitude project.

Depending on your network policy, you may need to whitelist the following IP addresses so that Amplitude's servers can access your Snowflake instance.

  • 52.33.3.219
  • 35.162.216.242
  • 52.27.10.221
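If your Snowflake account enforces a network policy, these addresses must appear in its allowed list. The sketch below assembles the corresponding CREATE NETWORK POLICY statement; the policy name is a placeholder, and the output would be run in Snowflake with the required privileges:

```python
AMPLITUDE_IPS = ["52.33.3.219", "35.162.216.242", "52.27.10.221"]

def network_policy_sql(policy_name, ips):
    """Assemble a Snowflake CREATE NETWORK POLICY statement
    that allows connections from the given IP addresses."""
    allowed = ", ".join(f"'{ip}'" for ip in ips)
    return (
        f"CREATE NETWORK POLICY {policy_name} "
        f"ALLOWED_IP_LIST = ({allowed});"
    )

print(network_policy_sql("amplitude_access", AMPLITUDE_IPS))
```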

Amplitude Snowflake Integration: Adding Snowflake as a Data Source

Follow the below steps to add Snowflake as a data source for Amplitude Snowflake Integration.

  • Step 1: Navigate to the Sources section and make sure you have selected the correct project from the dropdown list. Click on the 'I want to import data into Amplitude' tab, then click on Snowflake.
  • Step 2: Amplitude will then launch the configuration wizard, where you will enter the required credentials for the Snowflake instance.

Account: It is your Snowflake account name.

Database: It is the name of the database where Amplitude can find data.

Warehouse: It is used by Amplitude to load the data.

Username: It is used by Amplitude for authentication.

Password: It is used by Amplitude for authentication. 

  • Step 3: Copy and run the auto-generated SQL code on the right side of the page to give Amplitude the proper access to your Snowflake instance.
  • Step 4: After running the code, click on Next to test the connection. If all the credentials were entered correctly, the test will succeed.
  • Step 5: Click on Next to move on to the data selection stage.
  • Step 6: You have several configuration options to choose from here.
    • Type of data: Tells Amplitude whether you will be ingesting event data or user property data.
    • Type of import:
      • Full Sync: Amplitude periodically imports the entire dataset, even if the data has already been imported. This is recommended for datasets whose rows change over time.
      • Time-based: Amplitude periodically ingests the most recent rows in the data, as determined by the Timestamp column. The first import brings in all available data, while subsequent imports ingest any data with timestamps after the most recent import. To make this work, you need to include the timestamp column in the output of your SQL statement.
    • Frequency: You can choose from several scheduling options ranging from five minutes to one month.
    • SQL query: The SQL query that Amplitude uses to determine which data is ingested.
  • Step 7: After setting the configuration options, click on Test SQL to see how the data comes through from your Snowflake instance. If there are errors, they will appear below the Test SQL button.
  • Step 8: If there are no errors, click on Finish. A notification will indicate that you have successfully established a Snowflake source, and you will be directed to the Sources listing page, where you will see the newly created Snowflake source.
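The time-based import described in Step 6 can be pictured with a small simulation: keep only the rows whose timestamp falls after the most recent import. This is a toy sketch, not Amplitude's actual code:

```python
from datetime import datetime

def incremental_rows(rows, last_import):
    """Simulate a time-based import: return only rows whose
    timestamp is later than the most recent import time."""
    return [r for r in rows if r["timestamp"] > last_import]

rows = [
    {"event": "Sign Up", "timestamp": datetime(2022, 4, 1)},
    {"event": "Purchase", "timestamp": datetime(2022, 4, 10)},
]

# Only the Purchase event is newer than the last import on April 5th.
print(incremental_rows(rows, datetime(2022, 4, 5)))
```

This is why the timestamp column must appear in your SQL query's output: without it, Amplitude has no way to tell new rows from already-imported ones.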

Conclusion

In this article, you learned how to integrate Amplitude with Snowflake, focusing mainly on exporting data from and importing data into Amplitude. Besides connecting Amplitude with Snowflake accounts, you can also explore connecting Amplitude with tools like Amazon Redshift, Looker, Gatsby, Segment, and more. Due to its superior features and connectivity, several companies, including Microsoft, PayPal, Autodesk, Twitter, Cisco Systems, and Accenture, use Amplitude in their day-to-day operations.

Hevo Data is an Official Snowflake Partner. It is a No-Code Data Pipeline that offers a faster way to move data from 100+ Data Sources including 40+ Free Sources, into your Data Warehouse such as Snowflake. Hevo is fully automated and hence does not require you to code.

Want to take Hevo for a spin?

SIGN UP and experience the feature-rich Hevo suite first hand. You can also have a look at the unbeatable pricing that will help you choose the right plan for your business needs.

Share your experience with Amplitude Snowflake Connector in the comments section below!
