Are you struggling to load data into BigQuery? Are you unsure which method is best for loading data into BigQuery? If so, this blog will answer your questions. In this article, you will learn how to load data into BigQuery and explore several ways to upload different kinds of data to Google BigQuery, including CSV and JSON files.

You will also learn how to upload data through an API or add-on. If you need to analyze terabytes of data in seconds, Google BigQuery is one of the more affordable options.

What is Google BigQuery?

Google BigQuery is a serverless, highly scalable, and cost-effective multi-cloud data warehouse designed for business agility.

Here are a few features of Google BigQuery:

  • BigQuery lets you analyze petabytes of data quickly with zero operational overhead.
  • No cluster deployment, virtual machines, keys, indexes, or software installation is required.
  • You can stream millions of rows per second for real-time analysis.
  • Thousands of cores are used per query.
  • Storage and compute are separated.

To understand more about Google BigQuery, please refer to the following Hevo Data article. 

Types of Data Load in BigQuery

The following types of data loads are supported in Google BigQuery:

  • You can load data from Cloud Storage or a local file. Supported formats include Avro, CSV, JSON, ORC, and Parquet.
  • Data exports from Firestore and Datastore can be uploaded into Google BigQuery.
  • You can load data from other Google services such as Google Ad Manager and Google Analytics.
  • Data can be streamed into BigQuery with streaming inserts.
  • Data Manipulation Language (DML) statements can also be used for bulk inserts.

Loading data from Google Drive is not yet supported, but data in Drive can be queried using an external table.

Download the Cheatsheet on How to Set Up High-performance ETL to BigQuery
Learn the best practices and considerations for setting up high-performance ETL to BigQuery

Data Ingestion Format

Choosing the right data ingestion format is necessary for a successful upload. The following factors play an important role in deciding the data ingestion format:

  1. Schema Support: BigQuery can create a table schema automatically based on the source data. Avro, ORC, and Parquet are self-describing formats, so no separate schema is needed for them. For JSON and CSV, you can provide an explicit schema or use schema auto-detection.
  2. Flat Data/Nested and Repeated Fields: Nested and repeated fields help express hierarchical data. Avro, JSON, ORC, Parquet, and Firestore exports support nested and repeated fields; CSV supports only flat data.
  3. Embedded Newlines: When data is loaded from JSON files, the rows need to be newline delimited. BigQuery expects newline-delimited JSON files to contain a single record per line.
  4. Encoding: BigQuery supports UTF-8 encoding for nested, repeated, and flat data. For CSV files, BigQuery also supports ISO-8859-1 encoding for flat data.
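
The newline and encoding requirements above can be prepared for with a few lines of standard-library Python. This is a minimal sketch, not an official recipe: the helper names `to_ndjson` and `latin1_to_utf8` and the sample rows are made up for illustration, and re-encoding is optional because BigQuery also accepts ISO-8859-1 for flat CSV data.

```python
import json


def to_ndjson(records):
    """Serialize an iterable of dicts as newline-delimited JSON (one record per line)."""
    # json.dumps escapes line breaks inside string values as "\n",
    # so each record is guaranteed to stay on a single physical line.
    return "".join(json.dumps(record) + "\n" for record in records)


def latin1_to_utf8(raw: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8."""
    return raw.decode("iso-8859-1").encode("utf-8")


# Hypothetical sample data, including a nested list and an embedded newline.
rows = [
    {"id": 1, "name": "Ada", "tags": ["a", "b"]},
    {"id": 2, "name": "Line\nbreak", "tags": []},
]
ndjson = to_ndjson(rows)
```

Each line of `ndjson` can be parsed back with `json.loads`, which is a quick way to check a file before uploading it.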

Load Data into BigQuery

To load data into BigQuery, follow these steps:

  1. Before you upload any data, you need to create a dataset and a table in Google BigQuery. To do this, go to the BigQuery home page and select the resource in which you want to create a dataset.
Load data into BigQuery: Select the resource
  2. In the Create dataset window, give your dataset an ID, select a data location, and set the default table expiration period.
    Note: If you select “Never” for table expiration, tables in the dataset are not deleted automatically. For temporary tables, you can specify the number of days to store them.
Load Data into BigQuery: Create dataset
  3. Next, create a table in the dataset.
Load data into BigQuery: Create table
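
The same steps can also be scripted. The sketch below assumes the `google-cloud-bigquery` Python package is installed and that application default credentials are configured. The function name, the two-column schema, and the default location are illustrative choices, not part of the console walkthrough.

```python
def days_to_ms(days: int) -> int:
    """Convert a table expiration period in days to milliseconds."""
    return days * 24 * 60 * 60 * 1000


def create_dataset_and_table(project, dataset_name, table_name,
                             location="US", expiration_days=None):
    """Create a dataset and an empty table, mirroring the console steps."""
    # Imported here so the helper above can be used without the library installed.
    from google.cloud import bigquery

    client = bigquery.Client(project=project)

    dataset = bigquery.Dataset(f"{project}.{dataset_name}")
    dataset.location = location
    if expiration_days is not None:
        # Leaving this unset corresponds to choosing "Never" in the console.
        dataset.default_table_expiration_ms = days_to_ms(expiration_days)
    client.create_dataset(dataset, exists_ok=True)

    # Hypothetical schema, for illustration only.
    schema = [
        bigquery.SchemaField("id", "INTEGER", mode="REQUIRED"),
        bigquery.SchemaField("name", "STRING"),
    ]
    table = bigquery.Table(f"{project}.{dataset_name}.{table_name}", schema=schema)
    return client.create_table(table, exists_ok=True)
```

A call such as `create_dataset_and_table("my-project", "sales", "orders", expiration_days=7)` would use placeholder names; substitute your own project, dataset, and table.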

After creating the table, you can load data into BigQuery. Let’s explore the different ways to upload data to Google BigQuery:

1. Upload Data from CSV File

To upload data from a CSV file, go to the create table window, select a data source, and use the upload function.

Load Data into BigQuery: Upload from CSV File

Select the file and file format. 

In the next step, define the destination for the data: the name of the project and the dataset. There are two options available for the table type: native and external.

Google BigQuery will automatically determine the table structure, but if you want to add fields manually, you can use either the ‘Edit as text’ option or the ‘+ Add field’ button. Also, if you want to change how Google BigQuery parses data from CSV files, you can use the advanced options.
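
For repeatable uploads, the console steps above have a rough equivalent in the Python client library. This is a sketch under assumptions: it needs the `google-cloud-bigquery` package and configured credentials, and the function names and default option values are illustrative. The options dictionary plays the role of the advanced options in the console.

```python
def csv_load_options(header_rows=1, delimiter=",", autodetect=True):
    """Return CSV parsing options as keyword arguments for a load job config."""
    return {
        "skip_leading_rows": header_rows,  # number of header lines to skip
        "field_delimiter": delimiter,      # column separator
        "autodetect": autodetect,          # let BigQuery infer the schema
    }


def load_csv_file(path, table_id, **overrides):
    """Load a local CSV file into table_id ("project.dataset.table")."""
    # Imported here so csv_load_options can be used without the library installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    options = {**csv_load_options(), **overrides}
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV, **options
    )
    with open(path, "rb") as source_file:
        job = client.load_table_from_file(source_file, table_id, job_config=job_config)
    job.result()  # wait for the load job to finish; raises on failure
    return job.output_rows
```

Passing `field_delimiter=";"` as an override, for example, would handle semicolon-separated files.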

2. Upload Data from JSON Files

To upload data from JSON files, repeat the steps to create or select the dataset and table you are working with, and then select JSON as the file format. You can upload a JSON file from your computer or Google Cloud Storage. Files stored in Google Drive can be queried through an external table.

Further information about the JSON format is available in the Google Cloud documentation.
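
Because BigQuery expects newline-delimited JSON, it can help to check a file before uploading it. The sketch below is one possible approach, assuming the `google-cloud-bigquery` package and configured credentials; `is_newline_delimited_json` and `load_json_file` are hypothetical helper names, and the check reads the whole file into memory, so it suits small files only.

```python
import json


def is_newline_delimited_json(text: str) -> bool:
    """Return True if every non-empty line parses as a single JSON object."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return False
    try:
        return all(isinstance(json.loads(line), dict) for line in lines)
    except json.JSONDecodeError:
        return False


def load_json_file(path, table_id):
    """Validate a local JSON file, then load it into table_id."""
    # Imported here so the validator can be used without the library installed.
    from google.cloud import bigquery

    with open(path, encoding="utf-8") as f:
        if not is_newline_delimited_json(f.read()):
            raise ValueError(f"{path} is not newline-delimited JSON")

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        autodetect=True,
    )
    with open(path, "rb") as source_file:
        job = client.load_table_from_file(source_file, table_id, job_config=job_config)
    job.result()  # wait for completion; raises on failure
    return job.output_rows
```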

3. Upload Data from Google Cloud Storage

Google Cloud Storage allows you to securely store and transfer data online. The following file formats can be uploaded from Google Cloud Storage to Google BigQuery:

  • CSV
  • JSON
  • Avro
  • Parquet 
  • ORC 
  • Cloud Datastore

You can read more about using Cloud Storage with big data in the following documentation. 
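
As an illustration of loading from Cloud Storage, the sketch below picks a source format from the file extension and starts a load job from a `gs://` URI. It assumes the `google-cloud-bigquery` package and configured credentials; the extension mapping and function names are this example's own convention, and Datastore exports are left out because they are not identified by a simple extension.

```python
from pathlib import PurePosixPath

# Extension-to-format mapping assumed for this example.
SOURCE_FORMATS = {
    ".csv": "CSV",
    ".json": "NEWLINE_DELIMITED_JSON",
    ".avro": "AVRO",
    ".parquet": "PARQUET",
    ".orc": "ORC",
}


def source_format_for(uri: str) -> str:
    """Map a Cloud Storage URI to a BigQuery source format name."""
    suffix = PurePosixPath(uri).suffix.lower()
    if suffix not in SOURCE_FORMATS:
        raise ValueError(f"Unsupported file extension: {suffix!r}")
    return SOURCE_FORMATS[suffix]


def load_from_gcs(uri, table_id):
    """Load a file from Cloud Storage into table_id ("project.dataset.table")."""
    # Imported here so source_format_for can be used without the library installed.
    from google.cloud import bigquery

    source_format = source_format_for(uri)
    job_config = bigquery.LoadJobConfig(source_format=source_format)
    # Self-describing formats carry their own schema; CSV and JSON do not.
    if source_format in ("CSV", "NEWLINE_DELIMITED_JSON"):
        job_config.autodetect = True

    client = bigquery.Client()
    job = client.load_table_from_uri(uri, table_id, job_config=job_config)
    job.result()  # wait for completion; raises on failure
    return job.output_rows
```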

4. Upload Data from Other Google Services

To upload data from various Google services, you first need to configure the BigQuery Data Transfer Service. Before you can use it, you must select or create a project and, in most cases, enable billing for it. For example, billing is mandatory for these services:

  • Campaign Manager
  • Google Ad Manager
  • Google Ads
  • YouTube – Channel Reports
  • YouTube – Content Owner Reports

To start the BigQuery Data Transfer Service, on the BigQuery Home Page, select Transfers from the left-hand menu. Admin Access is needed to create a Transfer. In the next window, all you have to do is select the data source you desire. The BigQuery Data Transfer Service can be accessed not only from the platform console but also from the following: 

  • Classic BigQuery web UI
  • bq command-line tool 
  • BigQuery Data Transfer Service API

Once configured, the service will automatically and regularly upload data to BigQuery. However, you cannot use it to download data from BigQuery.
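
To see which data sources the transfer service offers for a project, the Data Transfer Service API can be queried from Python. This is a small sketch that assumes the separate `google-cloud-bigquery-datatransfer` package is installed, the API is enabled, and credentials are configured; the function names are illustrative.

```python
def project_parent(project_id: str) -> str:
    """Build the resource name the Data Transfer Service API expects for a project."""
    return f"projects/{project_id}"


def list_transfer_sources(project_id):
    """Return (data_source_id, display_name) pairs available to the project."""
    # Imported here so project_parent can be used without the library installed.
    from google.cloud import bigquery_datatransfer

    client = bigquery_datatransfer.DataTransferServiceClient()
    sources = client.list_data_sources(parent=project_parent(project_id))
    return [(source.data_source_id, source.display_name) for source in sources]
```

The returned identifiers are the values a transfer configuration refers to when you create a transfer for a specific source.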

5. Upload Data with the API

With Cloud Client Libraries, you can use your favourite programming language to work with the Google BigQuery API.

Here you can read more about loading data using the API.

To start, you need to create or select the project with which you want to work. On the home page, go to the APIs section.
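
Once the API is enabled for the project, a client library can send rows directly to a table. The sketch below uses streaming inserts from the Python client, assuming the `google-cloud-bigquery` package and configured credentials. The batch size of 500 and the helper names are illustrative choices, not documented requirements.

```python
def batches(rows, size=500):
    """Yield consecutive slices of rows with at most `size` items each."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def stream_rows(table_id, rows, batch_size=500):
    """Stream a list of JSON-compatible dicts into table_id; return any row errors."""
    # Imported here so the batching helper can be used without the library installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    errors = []
    for batch in batches(rows, batch_size):
        # insert_rows_json returns an empty sequence when every row was accepted.
        errors.extend(client.insert_rows_json(table_id, batch))
    return errors
```

An empty return value means all rows were accepted; otherwise each entry describes a row that failed.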

Learn more about Google Analytics to BigQuery Integration.

Conclusion

In this blog, you learned about Google BigQuery and how to load data into it. You also explored several ways to upload data to Google BigQuery, including CSV and JSON files. If you want to automate your data flow, try Hevo.

Hevo is a No-code Data Pipeline. It supports pre-built integrations from 150+ data sources. You can load data into BigQuery from your desired data source in a few minutes.

Give Hevo a try by signing up for a 14-day free trial today.

Load data into BigQuery and share your experience with us in the comment section below.

Frequently Asked Questions

1. Is BigQuery a database or data warehouse?

Google BigQuery is a fully managed, serverless data warehouse offered by Google Cloud.

2. What is the fastest way to load files into BigQuery?

The fastest way to load files into BigQuery generally involves using Google Cloud Storage as an intermediary, leveraging the bq command-line tool or API, optimizing file formats, and using features like partitioning and clustering.

Muhammad Faraz
Technical Content Writer, Hevo Data

Muhammad Faraz is an AI/ML and MLOps expert with extensive experience in cloud platforms and new technologies. With a Master's degree in Data Science, he excels in data science, machine learning, DevOps, and tech management. As an AI/ML and tech project manager, he leads projects in machine learning and IoT, contributing extensively researched technical content to solve complex problems.