So, you’re a Contentful user, right? It’s nice to talk to someone who knows the importance of content and its optimization for their business. Your team can work directly on their content workflows and collaborate easily without needing any coding knowledge. That’s commendable!

At times, you’ll need to move data about the operations performed on your content from Contentful to a data warehouse, for example by replicating data from Contentful to BigQuery. That’s where you come in. You take on the responsibility of replicating data from Contentful to a centralized repository so that analysts and key stakeholders can make business-critical decisions quickly.

Give yourself a high-five! We’ve prepared a simple, straightforward guide that will help you replicate data from Contentful to BigQuery.

Note that currently, Hevo doesn’t support Contentful as a Source.

How to Replicate Data From Contentful to BigQuery?

You have to run separate exports for the different types of data in Contentful. This replicates your data as JSON files.

Follow along to replicate data from Contentful to BigQuery in JSON format:

Step 1: Export data from Contentful

You can extract information about several kinds of operations in Contentful, such as creating, publishing, and archiving, using webhooks. Contentful also offers several REST APIs for accessing and manipulating your content.
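
For instance, if you’d rather pull content directly over REST than rely on webhooks or the CLI, a request to Contentful’s Content Delivery API looks roughly like the following (the space ID and access token are placeholders you’d replace with your own values):

curl "https://cdn.contentful.com/spaces/<SPACE_ID>/environments/master/entries?access_token=<DELIVERY_API_TOKEN>"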

  • In this example, the export is performed using the Contentful CLI tool. You first need to install the CLI tool on your system.
  • In your command line, run the following command: contentful space export [options].
  • The export options can also be provided in a JSON configuration file instead of being passed on the command line (a sample configuration file is shown after the JSON excerpt below).
  • Now, you can run the export using the following command:
contentful space export --config example-config.json
  • The Contentful data will look similar to the following:
{
  "snapshot": {
    "name": "Landing Page",
    "fields": [
      {
        "id": "title",
        "name": "Title",
        "required": true,
        "localized": true,
        "type": "Text"
      },
      {
        "id": "body",
        "name": "Body",
        "required": true,
        "localized": true,
        "type": "Text"
      }
    ],
    "sys": {
      "firstPublishedAt": "2017-11-15T13:38:11.311Z",
      "publishedCounter": 2,
      "publishedAt": "2017-11-15T13:38:11.311Z",
      "publishedBy": {
        "sys": {
          "type": "Link",
          "linkType": "User",
          "id": "4FLrUHftHW3v2BLi9fzfjU"
        }
      },
      "publishedVersion": 9
    }
  },
  "sys": {
    "space": {
      "sys": {
        "type": "Link",
        "linkType": "Space",
        "id": "yadjklj1kx9rmg0"
      }
    },
    "type": "Snapshot",
    "id": "category",
    "createdBy": {
      "sys": {
        "type": "Link",
        "linkType": "User",
        "id": "4FLrUHfthjkHW3v2BLi9fzfjU"
      }
    },
    "createdAt": "2022-11-18T11:29:46.809Z",
    "snapshotType": "post",
    "snapshotEntityType": "ContentType"
  }
}
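
For reference, the example-config.json passed to the export command above simply holds your space credentials and export options in one place. A minimal sketch might look like the one below; the space ID, token, and file paths are placeholders, and the exact option names are listed in the Contentful CLI export documentation:

{
  "spaceId": "<YOUR_SPACE_ID>",
  "environmentId": "master",
  "managementToken": "<YOUR_MANAGEMENT_TOKEN>",
  "exportDir": "./contentful-export",
  "contentFile": "contentful-data.json"
}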

Step 2: Prepare the data

Often, you won’t have a pre-defined data structure, and you’ll have to create a schema for your data tables from scratch. For every value in the response, identify its data type and build a table that can receive it accordingly. You can follow Contentful’s official documentation to identify the fields and their data types.
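
For example, if you only wanted to land the snapshot metadata from the export above in a table, a BigQuery JSON schema file for it might look like the sketch below. The table layout and field names here are purely illustrative; adapt them to the fields your analysts actually need:

[
  {"name": "snapshot_id", "type": "STRING", "mode": "REQUIRED"},
  {"name": "name", "type": "STRING", "mode": "NULLABLE"},
  {"name": "snapshot_type", "type": "STRING", "mode": "NULLABLE"},
  {"name": "created_at", "type": "TIMESTAMP", "mode": "NULLABLE"},
  {"name": "published_at", "type": "TIMESTAMP", "mode": "NULLABLE"},
  {"name": "published_version", "type": "INTEGER", "mode": "NULLABLE"}
]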

Step 3: Load data into BigQuery

You need to use the bq command-line tool, particularly the bq load command, to upload data to your datasets and define schema and data type information in Google BigQuery. You can refer to the following syntax.

bq --location=<LOCATION> load \
--source_format=<FORMAT> \
<DATASET.TABLE> \
<PATH_TO_SOURCE> \
<SCHEMA>
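
For instance, assuming a dataset named contentful_data, a table named snapshots, a newline-delimited JSON file prepared from the export (bq load expects one JSON record per line, so you may need to flatten the exported file first), and a schema file like the one sketched in Step 2, the command could look like this; all of these names are placeholders for illustration:

bq --location=US load \
--source_format=NEWLINE_DELIMITED_JSON \
contentful_data.snapshots \
./snapshots.json \
./snapshot_schema.json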

You can refer to the official BigQuery documentation on loading JSON data for more in-depth information.

This process will load your desired JSON files into Google BigQuery in a pretty straightforward way.

The above 3-step guide replicates data from Contentful to BigQuery effectively. It is optimal for the following scenarios:

  • One-Time Data Replication: This method suits your requirements if your business teams need the data only once in a while.
  • Limited Data Transformation Options: Manually transforming data in JSON files is difficult & time-consuming. Hence, this method is ideal if the data in your JSON files is clean, standardized, and already in an analysis-ready form.
  • Dedicated Personnel: If your organization has dedicated people who have to perform the manual downloading and uploading of JSON files, then accomplishing this task is not much of a headache.
  • Coding Knowledge: To perform this replication, you need to be familiar with writing bq commands for BigQuery.

Solve your data replication problems with Hevo’s reliable, no-code, automated pipelines with 150+ connectors.
Get your free trial right away!

However, as the number of data sources increases, you would have to spend a significant portion of your engineering bandwidth creating new data connectors. Before you can even start your analysis, you need to formulate custom data transformations for filtering, cleaning & standardizing your data. As your data grows exponentially with your scaling business and more sources come in, so does the requirement of building a custom data pipeline for each source.

An easier approach is to opt for a no-code solution that completely manages and maintains the data pipelines for you. Choosing a cloud-based tool like Hevo allows you to focus completely on your business analysis without worrying about the data transfer process.

You can simply perform complex data transformations on the fly and quickly set up the data pipeline in a matter of minutes without any need for prior technical knowledge.

You can take our 14-day free trial to experience a better way to manage data pipelines.

Get started for Free with Hevo!

Summing It Up 

Exporting & uploading JSON files is your go-to solution when your data analysts require fresh data from Contentful only once in a while. The bq command-line tool allows you to copy data from a JSON file into BigQuery easily. This method is a good choice if you rarely need to copy data and require little to no data transformation. However, when you need to frequently replicate data from multiple sources with complex transformations for complete business analysis, Hevo is the right choice for you!

Now, you don’t need to bite the bullet and spend months developing & maintaining custom data pipelines. You can make all the hassle go away in minutes by taking a ride with Hevo Data’s automated no-code data pipeline.

Its 150+ plug-and-play native integrations will help you replicate data smoothly from multiple tools to a destination of your choice. Its intuitive UI will help you navigate the platform easily. And with its pre-load transformation capabilities, you don’t even need to worry about manually finding errors or cleaning & standardizing your data.

With a no-code data pipeline solution at your service, companies spend less time calling APIs, referencing data, and building pipelines, and more time gaining insights from their data.

Skeptical? Why not try Hevo for free and decide for yourself? With Hevo’s 14-day free trial, you can build a data pipeline from any of 150+ data sources to BigQuery and try out the experience.

Here’s a short video that will guide you through the process of building a data pipeline with Hevo.

We’ll see you again the next time you want to replicate data from yet another connector to your destination. That is, if you haven’t switched to a no-code automated ETL tool already.

We hope you have found the appropriate answer to the query you were searching for. Happy to help!

Manisha Jena
Research Analyst, Hevo Data

Manisha Jena is a data analyst with over three years of experience in the data industry and is well-versed with advanced data tools such as Snowflake, Looker Studio, and Google BigQuery. She is an alumna of NIT Rourkela and excels in extracting critical insights from complex databases and enhancing data visualization through comprehensive dashboards. Manisha has authored over a hundred articles on diverse topics related to data engineering, and loves breaking down complex topics to help data practitioners solve their doubts related to data engineering.
