Shopify to Snowflake – Move Data in Minutes

on Tutorial • February 28th, 2020 • Write for Hevo

Companies are on a quest to become more data-driven so that they can provide their customers with a seamless and engaging experience. There is a lot of data being generated by Shopify stores and some stores are looking to eliminate their data silos by integrating these massive amounts of data from Shopify to Snowflake. Before we start discussing the process involved to move data, let us understand a little about these individual platforms.

Snowflake Overview

The Snowflake platform allows users to have a petabyte-scale database and infinite compute scale with no management overhead. Data in Snowflake can only be accessed through SQL query operations. Snowflake manages all aspects of data storage: organization, structure, compression, metadata, statistics, etc.

Snowflake is well-positioned to assist customers to take advantage of the cloud because:

  • It is a very easy product to learn. You only need to know SQL.
  • It provides a great amount of elasticity and flexibility.
  • There is a lot of new innovation happening in the cloud that is not necessarily available on-premise.
  • It is an extremely fast product in terms of performance and so customers can adopt Snowflake and put new use cases and workloads into Snowflake very quickly and very easily. This is very different from the past where businesses had to plan for the purchase of hardware and a whole lot of services wrapped around the complexity of integration of their big data and analytics. 

Shopify Overview

Shopify is a managed e-commerce platform for online and retail POS systems. Shopify enables businesses to set up and manage online stores, accept credit card payments, and track and respond to orders.

Although Shopify provides detailed analytics on their application, businesses would need to combine this input with data from marketing or sales applications that they use. This will allow them to perform deeper analytics and uncover meaningful insights that grow the business. 

In this post, we will be looking at methods to extract data from Shopify to use it for further analysis in Snowflake. 

Moving Data from Shopify to Snowflake

This blog will cover the following approaches to move data from Shopify to Snowflake:

Approach 1: Using Hevo Data (Official Snowflake Partner for Data Integration)

Hevo’s simple graphical interface paired with its out-of-the-box integrations with Shopify and Snowflake enable you to move data without having to write any code. Since Hevo is completely managed, you would not have any dependency on engineering resources. This will allow you to drastically bring down the project timelines and get insights fast.

Approach 2:  Using Shopify API and Snowflake

You would need to invest in some engineering resources that can understand Shopify APIs and build code to extract data from here. Once the infrastructure is ready, you would need to test, deploy and validate the consistency of data. Given this is a manual process, the time to insights would be high.

Let us discuss both these approaches in detail in this blog. 

Shopify to Snowflake: Move Data Using Custom Code

1. Pull data from Shopify’s servers using the Shopify REST API

Shopify exposes its complete platform to developers through their Web API. The API can be accessed through HTTP using tools like CURL or Postman. The Shopify API returns JSON-formatted data. To get this data, we need to make a request to the Event endpoint like this. 

GET /admin/events.json?filter=Order,Order Risk,Product,Transaction

This request will pull all the events that are related to Products, Orders, Transactions created for every order that results in an exchange of money, and Fraud analysis recommendations for these orders. The response will be in JSON.

{
  "transactions": [
    {
      "id": 457382019,
      "order_id": 719562016,
      "kind": "refund",
      "gateway": "bogus",
      "message": null,
      "created_at": "2020-02-28T15:43:12-05:00",
      "test": false,
      "authorization": "authorization-key",
      "status": "success",
      "amount": "149.00",
      "currency": "USD",
      "location_id": null,
      "user_id": null,
      "parent_id": null,
      "device_id": iPad Mini,
      "receipt": {},
      "error_code": null,
      "source_name": "web"
    },
    {
      "id": 389404469,
      "order_id": 719562016,
      "kind": "authorization",
      "gateway": "bogus",
      "message": null,
      "created_at": "2020-02-28T15:46:12-05:00",
      "test": false,
      "authorization": "authorization-key",
      "status": "success",
      "amount": "201.00",
      "currency": "USD",
      "location_id": null,
      "user_id": null,
      "parent_id": null,
      "device_id": iPhoneX,
      "receipt": {
        "testcase": true,
        "authorization": "123456"
      },
      "error_code": null,
      "source_name": "web",
      "payment_details": {
        "credit_card_bin": null,
        "avs_result_code": null,
        "cvv_result_code": null,
        "credit_card_number": "•••• •••• •••• 6183",
        "credit_card_company": "Visa"
      }
    },
    {
      "id": 801038806,
      "order_id": 450789469,
      "kind": "capture",
      "gateway": "bogus",
      "message": null,
      "created_at": "2020-02-28T15:55:12-05:00",
      "test": false,
      "authorization": "authorization-key",
      "status": "success",
      "amount": "90.00",
      "currency": "USD",
      "location_id": null,
      "user_id": null,
      "parent_id": null,
      "device_id": null,
      "receipt": {},
      "error_code": null,
      "source_name": "web"
    }
  ]
}

2. Preparing Data for Snowflake

Snowflake natively supports semi-structured data, which means semi-structured data can be loaded into relational tables without requiring the definition of a schema in advance. For JSON, each top-level, complete object is loaded as a separate row in the table. As long as the object is valid, each object can contain newline characters and spaces.

Typically, tables used to store semi-structured data consist of a single VARIANT column. Once the data is loaded, you can query the data like how you would query structured data. 

3. Uploading JSON Files to Amazon S3

To upload your JSON files to Amazon S3, you must first create an Amazon S3 bucket to hold your data. Use the AWS S3 UI to upload the files from local storage.

4. Create an External Stage

An external stage specifies where the JSON files are stored so that the data can be loaded into a Snowflake table.

create or replace stage your_s3_stage url='s3://{$YOUR_AWS_S3_BUCKET}/'
credentials=(aws_key_id='{$YOUR_KEY}' aws_secret_key='{$YOUR_SECRET_KEY}')
encryption=(master_key = '5d24b7f5626ff6386d97ce6f6deb68d5=')
file_format = my_json_format;

5. Pull Data into Snowflake

use role dba_shopify;
create warehouse if not exists load_wh with warehouse_size = 'small' auto_suspend = 300 initially_suspended = true;
use warehouse load_wh;
use schema shopify.public;

/*------------------------------------------
Load the pre-staged shopify data from AWS S3
------------------------------------------*/
list @{$YOUR_S3_STAGE};

/*-----------------------------------
Load the data
-----------------------------------*/
copy into shopify from @{$YOUR_S3_STAGE}

6. Validation

Following the data load, verify that the correct files are present on Snowflake.

select count(*) from orders;

select * from orders limit 10;

Shopify to Snowflake: Limitations Using Custom Code

  • Pulling the data correctly from Shopify servers is just a single step in the process of defining a data pipeline for custom analytics. There are other issues that you have to consider like how to respect API rate limits, handle API changes, etc. 
  • If you would like to have a complete view of all the available data then you will have to create a much complex ETL process which includes 35+ Shopify resources.
  • The above process can only help you bring data from Shopify in batches. If you are looking to load data in real-time, you would need to configure cron jobs and write extra lines of code to achieve that. 
  • Using the REST API to pull data from Shopify can be cumbersome. If Shopify changes the API or Snowflake is not reachable for a particular duration, any such anomalies can break the code and result in irretrievable data loss.
  • In case you would need to transform your data before loading to the warehouse – eg: you would want to standardize time zones or unify currency values to a single denomination, then you would need to write more code to achieve this.

An easier way to overcome the above limitations:

Shopify to Snowflake: An Easier Alternative – Hevo Data

Hevo Data Integration platform can help you overcome the above limitations and move data from Shopify to Snowflake in just a few minutes.

Here are the steps:

  1. Authenticate, connect and configure your Shopify data source
  2. Connect your Snowflake data warehouse account and start moving data

That is it. Hevo will now take charge and ensure that your data is reliably loaded from Shopify to Snowflake in real-time. Sign up for a 14-day free trial to start moving your Shopify data.

Here are more reasons to explore Hevo: 

  1. Low Effort, Low Maintenance – Once you configure the account, Hevo will automatically take care of all the processes in the background. This will also eliminate any additional maintenance overheads and frees up time to focus on other important initiatives.
  2. Real-time Data Load – Hevo’s real-time streaming architecture ensures that your Shopify data within your Snowflake data warehouse is always up-to-date.
  3. More Data Sources – In addition to Shopify, Hevo can bring data from a large array of sources – Sales and Marketing apps, Databases, Analytics apps and so on. Hevo will ensure that you have a way to bring any data to Snowflake and derive meaningful insights. 
  4. Ability to Transform Data – Hevo has built-in data transformation capabilities that allow you to build SQL queries to transform data within your Snowflake data warehouse. This will ensure that you always have analysis-ready data.

Hevo integrates with Shopify seamlessly and brings data to Snowflake without the added complexity of writing and maintaining ETL scripts.

Sign up for a 14-day risk-free trial with Hevo. This will give you an opportunity to experience Hevo’s simplicity so that you enjoy an effortless data load from Shopify to Snowflake.

What are your thoughts on moving data from Shopify to Snowflake? Let us know in the comments. 

No-code Data Pipeline for Snowflake