What is BigQuery Data Transfer Service? An Easy Guide

October 21st, 2020


Today, companies generate and store huge volumes of data. Storing and querying such volumes of data can be costly and time-consuming, especially for a company that doesn’t have the right hardware and infrastructure. 

Google understands this, and that’s why they developed BigQuery. It is an enterprise data warehouse that uses the processing power of Google’s infrastructure to enable super-fast SQL queries. So, you can move data from your data warehouse to BigQuery for improved performance. 

The BigQuery Data Transfer Service automates the process of moving data from various data sources into BigQuery. In this article, we will be discussing the BigQuery Data Transfer Service in detail. 


Prerequisites for Creating a Data Transfer Service in BigQuery

To create a Data Transfer in BigQuery, you need the following:

  • A BigQuery Account with access to a BigQuery Project.
  • Access to a Data Source from which data will be transferred into BigQuery. For this demonstration, Amazon S3 and Amazon Redshift are chosen as Data Sources.
  • A decision on the “Schedule Options”, which set how frequently the Data Transfer runs. This allows BigQuery to automatically pull data from the source at a designated time.
  • The format in which data needs to be extracted from the source.
  • Access to a Destination Dataset in BigQuery where the data extracted from the source will be stored.

What is BigQuery?

BigQuery is a serverless, cost-effective, and highly scalable multi-cloud Data Warehouse designed and developed for business agility. It was developed by Google, which means it uses the processing power of Google’s infrastructure. BigQuery has built-in machine learning capabilities that you can use to understand your data better. 

Here are the three main ways to use BigQuery:

  • Load and export data

You can easily and quickly load your data into BigQuery. BigQuery will process your data, after which you can export it for further analysis. 

  • Query and view data

You can run interactive queries in BigQuery. You can also execute batch queries and create views, which act as virtual tables over your data. 

  • Manage data

A table or view must belong to a dataset, so you need to create at least one BigQuery dataset before loading data into BigQuery. Check out our Ultimate Guide to BigQuery Dataset for easy understanding.

With BigQuery, you can list jobs, projects, datasets, and tables. It’s easy to get information about any of these and patch or update your datasets. BigQuery makes it easy for you to delete and manage any data that you upload. 
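
For example, here is a minimal sketch of querying and managing data with the official google-cloud-bigquery Python client. It assumes the library is installed and Application Default Credentials are configured; the public usa_names dataset is used purely for illustration:

```python
# A minimal sketch using the google-cloud-bigquery client library
# (pip install google-cloud-bigquery).
from google.cloud import bigquery

client = bigquery.Client()  # picks up the default project from your credentials

# Run an interactive query against a public dataset.
query = """
    SELECT name, SUM(number) AS total
    FROM `bigquery-public-data.usa_names.usa_1910_2013`
    GROUP BY name
    ORDER BY total DESC
    LIMIT 5
"""
for row in client.query(query).result():
    print(row.name, row.total)

# Basic data management: list the datasets in the current project.
for dataset in client.list_datasets():
    print(dataset.dataset_id)
```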

You can use BigQuery to create dashboards and reports that you can use to analyze your data and understand it better. BigQuery is also a good tool for real-time analytics. 

To learn more about Google BigQuery, click here.

Key Features of Google BigQuery

Below are some of the top Google BigQuery features that set it apart from other cloud-native tools: 

  • Date Functions: These may sound a bit standard, but they are handy when converting dates from multiple sources into a single format for advanced analytics. You can also use date functions to set up automatically updated reports that trigger mailings.
  • Aggregate Functions: With this feature, you can quickly get a summary of the data in a particular table. You can learn more about Google BigQuery Aggregate Functions by clicking here.
  • Window Functions: Like Aggregate Functions, these carry out summary calculations over your data. The difference is that they operate over a specified window of rows rather than collapsing the entire result set into one summary row.
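
To make the distinction concrete, here is a minimal sketch contrasting the two, assuming the google-cloud-bigquery Python client and a hypothetical sales table (my_project.my_dataset.sales with region, sale_date, and amount columns):

```python
# A minimal sketch; the table and column names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()

# Aggregate function: collapses the table into one summary row per region.
agg_sql = """
    SELECT region, SUM(amount) AS total_sales
    FROM `my_project.my_dataset.sales`
    GROUP BY region
"""

# Window function: keeps one row per sale and adds a running total per region.
# DATE_TRUNC is one of the date functions mentioned above.
window_sql = """
    SELECT
      region,
      DATE_TRUNC(sale_date, MONTH) AS sale_month,
      SUM(amount) OVER (PARTITION BY region ORDER BY sale_date) AS running_total
    FROM `my_project.my_dataset.sales`
"""

for row in client.query(agg_sql).result():
    print(row.region, row.total_sales)

for row in client.query(window_sql).result():
    print(row.region, row.sale_month, row.running_total)
```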

Simplify your Data Analysis with Hevo’s No-code Data Pipelines

Hevo Data, a No-code Data Pipeline, helps you transfer data from 100+ sources to BigQuery to visualize it in your desired BI tool. Hevo is fully managed and completely automates the process of not only loading data from your desired source but also enriching the data and transforming it into an analysis-ready form without having to write a single line of code. Its fault-tolerant architecture ensures that the data is handled in a secure, consistent manner with zero data loss.

Get Started with Hevo for Free

Check out what makes Hevo amazing:

  • Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects schema of incoming data and maps it to the destination schema.
  • Minimal Learning: Hevo, with its simple and interactive UI, is extremely easy for new customers to work with and perform operations on.
  • Hevo Is Built To Scale: As the number of sources and the volume of your data grows, Hevo scales horizontally, handling millions of records per minute with very little latency.
  • Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
  • Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
  • Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular point in time.
Sign up here for a 14-Day Free Trial!

What is BigQuery Data Transfer Service?

The BigQuery Data Transfer Service is a feature of BigQuery that automates data movement from external sources into BigQuery. It lets you pull data from multiple Data Sources on a fully managed, scheduled basis. 

Examples of such data sources include:

  • Google Marketing Platform
  • YouTube
  • Google Ads
  • Partner SaaS applications
  • Amazon S3
  • Teradata

The Data Transfer Service makes BigQuery a great tool if you need to merge data from different sources into one place. 

Here are the different ways through which you can access the BigQuery Data Transfer Service:

  • Cloud Console
  • bq command-line tool
  • classic BigQuery web UI
  • BigQuery Data Transfer Service API

In the next few sections, I will be showing you how to transfer data from an external cloud storage provider (Amazon S3) and from a data warehouse (Amazon Redshift) using the BigQuery Data Transfer Service. 

To learn more about Data Migration, click here.

What are the Supported Data Sources?

BigQuery currently supports Data Transfers from a range of Data Sources, including Google SaaS applications (such as Google Ads, YouTube, and the Google Marketing Platform), external cloud storage providers (such as Amazon S3), and data warehouses (such as Teradata and Amazon Redshift).

On top of that, BigQuery also supports several third-party transfers that are available in the Google Cloud Marketplace.

How to Access the BigQuery Data Transfer Service?

You can access the BigQuery Data Transfer Service through the Cloud Console, the bq command-line tool, or the BigQuery Data Transfer Service API.

  • Cloud Console: The Google Cloud Console provides a Graphical User Interface that you can use to create and manage BigQuery resources. To access the BigQuery page, open the Cloud Console and click on BigQuery.
  • bq command-line tool: The bq command-line tool is a Python-based command-line tool for BigQuery. 
  • BigQuery Data Transfer Service API: The API lets you programmatically create and manage transfer configurations, so you can schedule queries or regularly transfer external data from SaaS applications into BigQuery.
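
For example, here is a minimal sketch of listing existing transfer configurations through the API with the google-cloud-bigquery-datatransfer Python client; the project ID and location are placeholders:

```python
# A minimal sketch using the google-cloud-bigquery-datatransfer client library
# (pip install google-cloud-bigquery-datatransfer).
from google.cloud import bigquery_datatransfer

client = bigquery_datatransfer.DataTransferServiceClient()
parent = "projects/my-project/locations/us"  # placeholder project and location

# List the transfer configurations set up in this project and location.
for config in client.list_transfer_configs(parent=parent):
    print(config.display_name, config.data_source_id, config.schedule)
```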

How to Edit a Data Transfer in BigQuery?

Once you have created a Data Transfer in BigQuery, you can modify it by following the steps below.

  • Open your BigQuery Account.
  • Open the Project whose Data Transfer you want to edit.
  • From the sidebar, locate and click on “Data transfers“.
  • Click on the Data Transfer Service you want to edit.
  • Locate and click on the “Edit” button to edit the Data Transfer.
  • You can now change the name of your Data Transfer, schedule it as per your need, and configure the Data Transfer settings.
  • Once you’re done with the modifications, click on “Save“.
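
The same edit can also be made programmatically. Here is a minimal sketch, assuming the google-cloud-bigquery-datatransfer client; the transfer config name is a placeholder:

```python
# A minimal sketch; only the fields named in the update mask are changed.
from google.cloud import bigquery_datatransfer
from google.protobuf import field_mask_pb2

client = bigquery_datatransfer.DataTransferServiceClient()

transfer_config = bigquery_datatransfer.TransferConfig(
    name="projects/my-project/locations/us/transferConfigs/123abc",  # placeholder
    display_name="My renamed transfer",
    schedule="every 24 hours",
)

updated = client.update_transfer_config(
    transfer_config=transfer_config,
    update_mask=field_mask_pb2.FieldMask(paths=["display_name", "schedule"]),
)
print("Updated:", updated.display_name)
```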

What is the Pricing of BigQuery Data Transfer Service?

The BigQuery Data Transfer Service charges monthly on a prorated basis. BigQuery offers different pricing for different Data Sources and Data Warehouses.

For all data transfers, standard BigQuery storage and query pricing applies after the data is transferred to BigQuery. However, extracting the data and loading it into a Cloud Storage bucket or into BigQuery is free.

You may incur additional storage costs with Teradata migrations, as data is not automatically deleted from your Cloud Storage bucket after it is uploaded to BigQuery. You have to manually delete the data from your Cloud Storage bucket to avoid additional storage costs.
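
For example, here is a minimal sketch of that cleanup with the google-cloud-storage Python client; the bucket name and prefix are placeholders:

```python
# A minimal sketch using the google-cloud-storage client library
# (pip install google-cloud-storage).
from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-migration-staging-bucket")  # placeholder bucket name

# Delete every object under the staging prefix to stop further storage charges.
for blob in bucket.list_blobs(prefix="teradata-staging/"):  # placeholder prefix
    blob.delete()
```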

BigQuery Data Transfer Service offers plans & pricing for different use cases and business needs; check them out!

How to Migrate Data from Amazon S3 into BigQuery?

You can use the BigQuery Data Transfer Service to move data from Amazon S3 to BigQuery. 

The person creating the transfer must have the following permissions in BigQuery:

  • bigquery.transfers.update permissions to create the transfer
  • bigquery.datasets.update permissions on the target dataset

To create an Amazon S3 data transfer, follow the steps given below:

Step 1: Open the BigQuery page on the cloud console. 

Step 2: Click “Transfers”. 


Step 3: Click “Create a Transfer”. 

Step 4: Move to the “Source type” section and select “Amazon S3” for “Source”. 


Step 5: In the “Transfer config name” section, enter a name for the transfer in the “Display name” field. 

Step 6: Select a schedule on which the transfer should run. 

Enter the date and time that you need the transfer to begin in the “Start date and run time” field. For the “Destination dataset” dropdown in the “Destination settings” section, select the dataset that you created to store your data. 

In the “Destination source details” section, enter the name of the table that you created to store the data in the “Destination table” field. Enter the URI in the “Amazon S3 URI” field. 

Follow the format given below:

s3://mybucket/myfolder/...

Enter your access key ID in the “Access Key ID” field. Enter your secret access key in the “Secret access key” field. Also, select your data format in the “File format” dropdown. 


Specify the maximum number of bad records that can be ignored in the “Number of errors allowed” field. 

Step 7: Click “Save”. 
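
If you prefer to create the same transfer in code, here is a minimal sketch with the google-cloud-bigquery-datatransfer Python client. The identifiers and credentials are placeholders, and the parameter keys mirror the console fields above; treat them as assumptions and verify them against the Amazon S3 transfer documentation:

```python
# A minimal sketch; all IDs and credentials are placeholders, and the param
# keys are assumptions to be checked against the Amazon S3 transfer docs.
from google.cloud import bigquery_datatransfer

client = bigquery_datatransfer.DataTransferServiceClient()
parent = "projects/my-project/locations/us"  # placeholder

transfer_config = bigquery_datatransfer.TransferConfig(
    display_name="S3 to BigQuery demo",
    data_source_id="amazon_s3",
    destination_dataset_id="my_dataset",   # the dataset you created earlier
    schedule="every 24 hours",
    params={
        "data_path": "s3://mybucket/myfolder/*",         # Amazon S3 URI
        "destination_table_name_template": "my_table",   # destination table
        "access_key_id": "YOUR_ACCESS_KEY_ID",
        "secret_access_key": "YOUR_SECRET_ACCESS_KEY",
        "file_format": "CSV",
    },
)

created = client.create_transfer_config(
    parent=parent, transfer_config=transfer_config
)
print("Created transfer:", created.name)
```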

How to Migrate Data from Redshift?

You can use the BigQuery Data Transfer Service to move data from the Amazon Redshift data warehouse to BigQuery. 

Ensure that you have the following permissions in BigQuery:

  • bigquery.transfers.update permissions to create the transfer
  • bigquery.datasets.update permissions on the target dataset

For a smooth migration, make sure you meet the following prerequisites on Google Cloud:

Step 1: Select or create a Google Cloud project in which to store your migration data. 

  • Open the project selector page in the Cloud Console. 
  • Create or select a Cloud project. 

Step 2: Enable BigQuery Data Transfer API. 

Just open the Google Cloud Console and click the “Enable” button on the BigQuery Data Transfer Service API page. 


Step 3: Create a BigQuery dataset to store the data. 

You don’t have to create any table.
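
If you’d rather create the dataset in code, here is a minimal sketch with the google-cloud-bigquery client; the dataset ID is a placeholder:

```python
# A minimal sketch; the dataset ID and location are placeholders.
from google.cloud import bigquery

client = bigquery.Client()
dataset = bigquery.Dataset(f"{client.project}.redshift_migration")
dataset.location = "US"  # should match the location you choose for the transfer
client.create_dataset(dataset, exists_ok=True)
```

With the prerequisites in place, you can then follow the steps given below to set up the Amazon Redshift transfer: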

Step 1: Open the BigQuery page on the Cloud console. 

Step 2: Click “Transfers”. 

Step 3: Click “Add Transfer”. 

Step 4: Do the following on the “New Transfer” page:

For Source, select “Migration: Amazon Redshift”. Enter a name for the transfer in the “Display name” field. Select the right dataset from the “Destination dataset” dropdown. 


Step 5: Under the “Data Source Details”, enter the specific details of your Amazon Redshift transfer. For “JDBC connection url for Amazon Redshift”, enter the JDBC URL used to access your Amazon Redshift cluster. 

For “Username of the database”, provide the username for the Amazon Redshift database that you would like to migrate. Enter the database password in the “Password of your database” field. 

Enter your access key pair in the “Access key ID” and “Secret access key” fields. Enter the URI of the S3 bucket that you will use as the staging area in the “Amazon S3 URI” field. For “Amazon Redshift Schema”, enter the Amazon Redshift Schema that you are migrating. For “Table name patterns”, enter a name or a pattern to match the table names in the schema. 


Click the toggle button to enable email notifications. Select your topic name from the “Select a Pub/Sub topic” dropdown or click “Create a topic”. 

Step 6: Click the “Save” button. 


The transfer setup details will be displayed on the Cloud Console. 
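
Once the transfer starts running, you can also check its recent runs programmatically. Here is a minimal sketch, again assuming the google-cloud-bigquery-datatransfer client and a placeholder config name:

```python
# A minimal sketch; the transfer config name is a placeholder.
from google.cloud import bigquery_datatransfer

client = bigquery_datatransfer.DataTransferServiceClient()
config_name = "projects/my-project/locations/us/transferConfigs/123abc"

# Each run reports its scheduled time and state (e.g. TransferState.SUCCEEDED).
for run in client.list_transfer_runs(parent=config_name):
    print(run.run_time, run.state)
```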

What are the Limitations of BigQuery Data Transfer Service?

Here are the limitations of the BigQuery Data Transfer Service:

  1. It doesn’t integrate with all data sources. 
  2. Currently, you cannot use the BigQuery Data Transfer Service to migrate data out of BigQuery into other data stores. 

Conclusion

This is what you’ve learned in this article:

  1. What BigQuery Data Transfer Service is. 
  2. How to use the BigQuery Data Transfer Service to migrate data from Amazon S3 into BigQuery. 
  3. How to use the BigQuery Data Transfer Service to migrate data from Amazon Redshift into BigQuery. 

Hevo Data, a No-code Data Pipeline, helps you transfer data from a source of your choice in a fully automated and secure manner, without having to write any code. Hevo, with its strong integration with 100+ sources & BI tools, allows you to not only export & load data but also transform & enrich your data and make it analysis-ready in a jiffy.

Visit our Website to Explore Hevo

Want to take Hevo for a spin? Sign Up and experience the feature-rich Hevo suite first hand. Hevo offers plans & pricing for different use cases and business needs; check them out!

Have any further queries? Get in touch with us in the comments section below.
