Shopify to BigQuery: Steps to Load Data in Minutes

Tutorial • February 25th, 2020

You have your complete eCommerce store set up on Shopify. You collect data on the orders placed, carts abandoned, products viewed and so on. You now want to move all of this data on Shopify to a robust data warehouse such as Google BigQuery so that you can combine this information with data from many other sources and gain deep insights. Well, you have landed on the right blog. This blog talks about moving data from Shopify to BigQuery for analytics. Before we dive in, let us briefly understand how these systems operate.

Understanding Shopify

With the advent of the internet and cloud, Shopify has emerged as one of the easiest ways for individuals and businesses to set up an online eCommerce store. Shopify frees the user from the technical complexities and development effort, making it simple to set up a business online. Shopify has a range of products that handle everything from landing page setup to payments to reporting.

Although Shopify provides a range of analytics reports, it is not always easy to unify Shopify data with the other data that a business may have. For example, advertising data from Facebook Ads or Google Ads might have to be combined with Shopify orders to understand the true ROI of these campaigns. That is where the need to move data from Shopify to BigQuery arises.

Understanding Google BigQuery

Google BigQuery, a cloud data warehouse solution offered by Google, allows users to write SQL-like queries to answer deep analytical questions. BigQuery's speed of execution, combined with its pay-for-what-you-use pricing model, makes it an attractive option for modern businesses that want to run analytics on large data sets.

Businesses often load data from different sources (Shopify for eCommerce, Facebook/Google Ads for advertising, Intercom for support and more) into BigQuery and then answer deeper analytical questions by querying this combined data.

Moving Data from Shopify to BigQuery

There are many approaches one could take to move data from Shopify to BigQuery. In this post, we will discuss the following:

Approach 1: Using Hevo – A Completely Managed Data Integration Platform

Hevo provides a point-and-click interface that allows you to move data without writing any code. Since Hevo is fully managed, the setup time is minimal and you will have your Shopify data in BigQuery in minutes.

Approach 2: Writing Custom Code Using Shopify API

You would need to invest in engineering resources who understand the Shopify APIs and can build code to extract data from them. Once the infrastructure is ready, you would need to test, deploy and validate the consistency of the data. Given that this is a manual process, the time to insights would be high.

Let us discuss both these approaches in detail in this blog. 

Shopify to BigQuery – Automating the Data Load using Hevo

With Hevo, bringing data from Shopify to BigQuery becomes a cakewalk. Here are the steps: 

  1. Configure a Webhook endpoint to load data from Shopify
  2. Connect Google BigQuery account and start moving data

Hevo takes care of automatically mapping all your data to relevant tables in BigQuery, giving you access to Shopify data in real time. Sign up for a risk-free 14-day trial with Hevo and experience a hassle-free data migration from Shopify to BigQuery.

Additionally, Hevo helps you clean, transform and enrich data both before and after moving it to the data warehouse, ensuring that you have analysis-ready data in your warehouse at any point.

Here are the other advantages: 

  1. Simplicity – Hevo is super intuitive, super simple to use. With Hevo, you can start loading data from Shopify to BigQuery in just a few clicks.
  2. Real-time Data – Hevo’s real-time streaming architecture ensures that you move data from Shopify to BigQuery instantly, without any delay. This allows you to gain meaningful insights in real-time.
  3. Reliable Data Load – Hevo’s fault-tolerant architecture ensures that your data is loaded in a consistent and reliable fashion without any loss of data.
  4. Scalability – Hevo is built to handle data of any scale. Additionally, Hevo can bring data from a multitude of data sources. Both these aspects ensure that Hevo can help you scale your data infrastructure as your data needs grow.

Shopify to BigQuery – Writing Custom Code Using Shopify API

This approach broadly involves 2 steps:

  1. Pulling Data from Shopify using API calls
  2. Loading data into BigQuery

Pulling Data from Shopify: 

Shopify exposes a wide set of RESTful APIs to its customers to extract data on orders, cart updates, inventory status, customers and more. You can explore the complete set of APIs here.

Some things to note when using Shopify APIs: 

  1. Since Shopify allows you to extract a wide variety of data sets, you would first need to list the data points you need and learn about the APIs that will help you extract this data.
  2. To make a Shopify API call, you need to authenticate. This can be done by generating a username and password from the Shopify admin account and sending them with each request using HTTP basic authentication.

Once you have these prerequisites addressed, an API call to extract product data from Shopify would look like this (your-store is a placeholder for your shop's subdomain):

curl --user user:password -X GET "https://your-store.myshopify.com/admin/api/2019-10/products.json?limit=100"

Shopify imposes limits on the amount of data that it sends out in a single API call. The default page size is 50 if the limit parameter is not given, and the maximum is 250. The API call made above returns up to 100 products.

If you need to extract more data, then you would need to store the ID value of the last product that the API returned and then use this id to get the next data set. 

This is done as below.

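curl --user user:password -X GET "https://your-store.myshopify.com/admin/api/2019-10/products.json?limit=100&since_id=632910392" -o products.json

Where since_id is the last product id that was received on the previous API call, and your-store stands in for your shop's subdomain. The URL is quoted so that the shell does not interpret the & character.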
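The since_id loop described above can be sketched in Python. This is a minimal sketch and not an official client: fetch_page and iter_products are hypothetical helper names, the shop URL passed in is a placeholder, and starting from since_id=0 is an assumption made so that every page is requested in the same way.

```python
import base64
import json
import urllib.request
from typing import Callable, Dict, Iterator, List

# Endpoint path taken from the curl examples in this post.
API_PATH = "/admin/api/2019-10/products.json"


def fetch_page(shop_url: str, user: str, password: str,
               limit: int, since_id: int) -> List[Dict]:
    """Fetch one page of products using HTTP basic authentication."""
    url = f"{shop_url}{API_PATH}?limit={limit}&since_id={since_id}"
    request = urllib.request.Request(url)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request) as response:
        return json.load(response)["products"]


def iter_products(get_page: Callable[[int], List[Dict]]) -> Iterator[Dict]:
    """Yield every product, asking for pages until an empty one comes back."""
    since_id = 0  # assumption: start below the smallest product id
    while True:
        page = get_page(since_id)
        if not page:
            return
        yield from page
        # Remember the id of the last product to request the next page.
        since_id = page[-1]["id"]
```

To run it against a store, you might pass a small wrapper such as lambda since_id: fetch_page("https://your-store.myshopify.com", "user", "password", 100, since_id) to iter_products. Keeping the network call separate from the loop also makes the paging logic easy to test with canned data.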

Once you have successfully executed this API call, you will receive a response in JSON format. This is a nested JSON document that contains all the information related to the products, such as title, description and images, and more importantly the variants sub-JSON, which provides all the variant-specific information such as inventory_quantity, barcode and product price.
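Because the variants are nested inside each product, it can help to flatten the response into one row per variant before loading. The sketch below assumes the field names id, title, variants, barcode, price and inventory_quantity; variant_rows and to_ndjson are hypothetical helpers, and the set of columns kept is only an example. The output is newline-delimited JSON, which is the JSON layout BigQuery expects for file loads.

```python
import json
from typing import Dict, Iterable, Iterator


def variant_rows(products: Iterable[Dict]) -> Iterator[Dict]:
    """Yield one flat row per variant, carrying a few product fields along."""
    for product in products:
        for variant in product.get("variants", []):
            yield {
                "product_id": product["id"],
                "product_title": product.get("title"),
                "variant_id": variant["id"],
                "barcode": variant.get("barcode"),
                "price": variant.get("price"),
                "inventory_quantity": variant.get("inventory_quantity"),
            }


def to_ndjson(rows: Iterable[Dict]) -> str:
    """Serialize rows as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(row) for row in rows)
```

Products without any variants produce no rows in this sketch, so a real pipeline would need to decide whether to keep them in a separate products table.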

Moving Data to BigQuery: 

Now that you have extracted the data from Shopify, you will next need to load it to BigQuery. 

You can load data into BigQuery directly using an API call, or you can upload the CSV / JSON file generated in the previous step using the BigQuery UI.

Before you move data to BigQuery, make sure that the data types returned by the API response are supported by BigQuery. You can read more about BigQuery data types and conversions here.
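One way to make this check concrete is to validate each row locally against the schema you intend to use. The sketch below is only an illustration: SCHEMA is a hypothetical three-column schema written in the name/type/mode layout used by BigQuery schema files, schema_errors is a made-up helper, and only the INTEGER, STRING and NUMERIC types are handled.

```python
from decimal import Decimal, InvalidOperation
from typing import Dict, List

# Hypothetical target schema; adjust names and types to your own table.
SCHEMA = [
    {"name": "product_id", "type": "INTEGER", "mode": "REQUIRED"},
    {"name": "product_title", "type": "STRING", "mode": "NULLABLE"},
    {"name": "price", "type": "NUMERIC", "mode": "NULLABLE"},
]


def _fits(value, bq_type: str) -> bool:
    """Return True if a Python value is acceptable for the given type."""
    if bq_type == "INTEGER":
        return isinstance(value, int) and not isinstance(value, bool)
    if bq_type == "STRING":
        return isinstance(value, str)
    if bq_type == "NUMERIC":
        try:
            return Decimal(str(value)).is_finite()
        except InvalidOperation:
            return False
    return False  # types not covered by this sketch are rejected


def schema_errors(row: Dict, schema: List[Dict] = SCHEMA) -> List[str]:
    """List every field in the row that would not fit the schema."""
    errors = []
    for field in schema:
        value = row.get(field["name"])
        if value is None:
            if field["mode"] == "REQUIRED":
                errors.append(f"{field['name']}: missing required value")
        elif not _fits(value, field["type"]):
            errors.append(
                f"{field['name']}: {value!r} is not a valid {field['type']}"
            )
    return errors
```

Rows that return an empty list pass the local check; anything else can be logged or fixed before the file is uploaded.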

Steps to Load Data to BigQuery: 

  1. Upload the prepared data to Google Cloud Storage (using gsutil or the Web Console)
  2. Create a BigQuery table to store this data
  3. Load to BigQuery from your GCS bucket using BigQuery’s command-line tool (bq) or any cloud SDK. Read more about bq commands here.
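The three steps above can be scripted by assembling the gsutil and bq commands and running them in order. This is a sketch under assumptions: the bucket, dataset, table and file names are placeholders, the data file is assumed to be newline-delimited JSON, and build_load_commands and run_all are hypothetical helpers. By default the commands are only printed so you can review them before anything runs.

```python
import os
import subprocess
from typing import List


def build_load_commands(local_file: str, bucket: str, dataset: str,
                        table: str, schema_file: str) -> List[List[str]]:
    """Build the upload, create-table and load commands in order."""
    gcs_uri = f"gs://{bucket}/{os.path.basename(local_file)}"
    target = f"{dataset}.{table}"
    return [
        # Step 1: copy the prepared file to Google Cloud Storage.
        ["gsutil", "cp", local_file, gcs_uri],
        # Step 2: create the table from a JSON schema file.
        ["bq", "mk", "--table", target, schema_file],
        # Step 3: load the file from the bucket into the table.
        ["bq", "load", "--source_format=NEWLINE_DELIMITED_JSON",
         target, gcs_uri],
    ]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command, or execute it when dry_run is False."""
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)
```

Running with dry_run=False requires the Google Cloud SDK to be installed and authenticated. Because check=True is set, the script stops at the first failing command.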

We recommend you read through this guide, which can help you with loading data to Google BigQuery.

Limitations of the above approach:

  1. Effort Intensive: Moving data from Shopify to BigQuery using custom code requires you to learn and bring together many different technologies. Given the learning curve involved, your data projects' timelines can be affected.
  2. Not Real-time: The process mentioned above does not bring in data in real time. You would have to configure a cron job and write extra code to approximate it.
  3. Data Consistency: Since you are extracting business-critical data out of Shopify, you would need to set up a separate process to validate the data received in BigQuery. Any inconsistency would negatively impact the decisions you may take as a business.
  4. Cannot Transform Data: Many times, you would encounter use cases where you need to convert currencies or standardize time zones in order to perform efficient analytics. The above approach does not cover that.
  5. Maintenance Overhead: If Shopify makes any changes to the API, or Google BigQuery is down, a break in your script can result in data loss. Hence, you would need to constantly monitor and maintain the pipeline.
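To give a sense of the extra code that point 4 implies, here is a minimal sketch of one such transformation: standardizing timestamps to UTC. It assumes the timestamps are ISO 8601 strings with a UTC offset and that the rows carry created_at and updated_at fields; to_utc and normalize_row are hypothetical helpers.

```python
from datetime import datetime, timezone
from typing import Dict, Iterable


def to_utc(timestamp: str) -> str:
    """Convert an ISO 8601 timestamp with an offset to UTC."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        raise ValueError(f"timestamp has no UTC offset: {timestamp}")
    return parsed.astimezone(timezone.utc).isoformat()


def normalize_row(row: Dict,
                  fields: Iterable[str] = ("created_at", "updated_at")) -> Dict:
    """Return a copy of the row with the listed timestamp fields in UTC."""
    wanted = set(fields)
    return {
        key: to_utc(value) if key in wanted and value else value
        for key, value in row.items()
    }
```

Currency conversion would need a similar step plus a source of exchange rates, which is not covered here.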

The above challenges can be overcome by implementing Hevo, which can get the job done in just a few clicks. Sign up for a 14-day free trial to experience a hassle-free data load from Shopify to Google BigQuery.
