Most databases support CSV files as a matter of course. BigQuery, Google’s data warehouse as a service, combines data storage and analytics in one package and lets you run real-time SQL queries on billions of records. Stakeholders are always looking for faster and better ways to get all their data from all their sources into BigQuery.

CSV is one of the most popular and simplest data formats in use today. It can store data from databases, clickstreams, browsing trails, social media interactions, page views, and many other sources. In this article, you will see four ways to move data from CSV to BigQuery. Read along to select the method that works best for your business!

What is BigQuery?


BigQuery is a fully managed data warehouse that helps you organize and analyze your data with built-in features like machine learning, geospatial analysis, and business intelligence.


Why move data from CSV to BigQuery?

  • Enhances overall efficiency: Uploading CSV files to BigQuery simplifies data management and enhances the efficiency of your analytical workflows, making it easier to handle and analyze large datasets.
  • Performance: BigQuery is designed to handle massive volumes of data efficiently, offering quick query execution that reduces the time needed to gain insights from your data.
  • Advanced Analytics: BigQuery provides advanced analytics tools, such as ML and spatial data analysis, which deliver deeper insights to inform wise decision-making.
  • Cost-Effective: BigQuery’s pay-as-you-go approach ensures you only pay for the storage and queries you use, eliminating the need for expensive hardware or software.
  • Scalability: Features like real-time analytics, on-demand storage scaling, BigQuery ML, and optimization tools make it easier to manage and scale your data analysis processes as needed.

Methods to Load Data from CSV to BigQuery 

Method 1: CSV to BigQuery using Command Line Interface

Prerequisites:

Step 1: Google Cloud Setup

Step 1. a) Install the gcloud command line interface

Step 1. b) Authenticate yourself to Google Cloud by running the command:

gcloud auth login

Step 1. c) Sign in with your account and give all permissions to Google Cloud SDK.

After you complete all these steps, this window should appear:

Google CLI Authenticated

Step 2: Command Prompt Configurations

Step 2. a) Run command:

bq 

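This runs the bq command-line tool, which ships with the Google Cloud SDK. Run without arguments, it prints the list of available commands, which confirms that the tool is installed and ready to use.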

Step 2. b) To list the datasets in your project, run the command:

bq ls <project_id>:
BigQuery configuration terminal

Step 2. c) Next, create a new dataset by running the command:

bq mk <new_dataset_name>
Creation of new dataset

Step 3: Load the data into the dataset

Step 3. a) First, go to your Cloud SDK directory and add the CSV file that you want to upload. In this example, the directory looks like this:

Cloud SDK Directory

Step 3. b) Now, to load the file, go back to your command line and run the following command:

bq load --source_format=CSV example_cli.bank demo.csv ID:integer,NAME:string,AGE:integer

The output of this command will be Upload Complete.
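
The positional arguments of the command above are the destination table, the source file, and an inline schema written as comma-separated name:type pairs with no spaces. As a minimal sketch, the helper below assembles the same command line from Python. The name build_bq_load_command is hypothetical and used only for illustration; the dataset, table, and column values are the ones from this walkthrough.

```python
import shlex


def build_bq_load_command(table, csv_path, columns, skip_header=False):
    """Return a `bq load` command line for a CSV file.

    `columns` is a list of (name, type) pairs. They are joined into the
    inline schema format NAME:type,NAME:type, with no spaces.
    """
    schema = ",".join(f"{name}:{col_type}" for name, col_type in columns)
    args = ["bq", "load", "--source_format=CSV"]
    if skip_header:
        # Skip the first row when the CSV file starts with a header line.
        args.append("--skip_leading_rows=1")
    args += [table, csv_path, schema]
    # shlex.join quotes any argument that contains spaces or shell characters.
    return shlex.join(args)


command = build_bq_load_command(
    "example_cli.bank",
    "demo.csv",
    [("ID", "integer"), ("NAME", "string"), ("AGE", "integer")],
)
print(command)
```

If your CSV file begins with a header row, passing skip_header=True adds the --skip_leading_rows=1 flag so the header is not loaded as data.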

Step 4: Preview of the data

Step 4. a) To check whether the table has been created, you can use the following command:

bq ls example_cli
New Table Created

Step 4. b) If you want to see the schema of the table, you can do so by using the command:

bq show <dataset_name>.<table_name>
Schema of the Table

Step 4. c) To preview your table, go back to BigQuery, go to tables, and click on preview.

Preview of Table

Method 2: CSV to BigQuery using BigQuery Web UI

Prerequisites:

Step 1: Create a new Dataset

To create a new dataset in BigQuery, open BigQuery Studio, click on the three dots beside your project ID, and click Create Dataset.

Creating a Dataset using BigQuery UI

Step 2: Create a Table

To create a new table, click on the three dots next to your dataset name and click on Create Table.

Create a new table

Note:

  • Keep the file format as CSV.
  • Turn on auto-detect. This will automatically detect the incoming table schema and generate a schema accordingly.
  • Set the "Create table from" option to Upload, then select the file that you want to load into BigQuery.
Create Table configuration
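
Auto-detect works by sampling the file and inferring a type for each column. The sketch below is only a rough illustration of that idea in plain Python, not BigQuery's actual algorithm. The function names, the type names, and the sample rows are all assumptions made for this example.

```python
import csv
import io


def infer_type(values):
    """Pick the narrowest type that fits every non-empty sample value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        # Nothing to go on, so fall back to the most permissive type.
        return "STRING"
    for type_name, parser in (("INTEGER", int), ("FLOAT", float)):
        try:
            for value in non_empty:
                parser(value)
            return type_name
        except ValueError:
            continue
    return "STRING"


def infer_schema(csv_text, sample_rows=100):
    """Return [(column_name, type_name), ...] for CSV text with a header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = [[] for _ in header]
    for row_number, row in enumerate(reader):
        if row_number >= sample_rows:
            break
        for index, value in enumerate(row[: len(header)]):
            columns[index].append(value)
    return [(name, infer_type(values)) for name, values in zip(header, columns)]


# Made-up sample rows using the column names from this article.
sample = "ID,NAME,AGE\n1,Asha,34\n2,Ben,41\n"
print(infer_schema(sample))
```

Running a check like this locally can help you spot columns that auto-detect is likely to read as strings, for example a numeric column that contains a stray text value.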

Step 3: Preview Table 

Step 3. a) You can edit the schema you just created. To do so, click on the table name, open the Schema tab, and click Edit schema. You can then make changes to the schema as needed.

Edit Schema Option

Step 3. b) To preview the table, click on the preview tab beside the details tab.

Preview using Web UI

Method 3: CSV to BigQuery using Web API

Prerequisites:

Step 1: Configure BigQuery API

Go to Google Cloud Console and look for APIs and services. Search for BigQuery and click on BigQuery API.

BigQuery API Search

Note: Make sure that the service is enabled.

BigQuery API Enabled

Step 2: Configuring the Python script

Open your code editor and enter the following Python script.

import os

from google.cloud import bigquery

# Path to your service account's JSON key file.
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = ""

client = bigquery.Client()

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,  # skip the header row
    autodetect=True,  # let BigQuery infer the schema
)

# Name of the new table to create.
table_id = ""

with open(r'demo.csv', "rb") as source_file:
    job = client.load_table_from_file(source_file, table_id, job_config=job_config)

# Wait for the load job to finish; this raises an exception if the job failed.
job.result()
print(job.state)

table = client.get_table(table_id)
print(
    "Loaded {} rows and {} columns to {}".format(
        table.num_rows, len(table.schema), table_id
    )
)

Note: 

  • Set GOOGLE_APPLICATION_CREDENTIALS to the path of your service account’s JSON key file.
  • Set table_id to the name of the new table that you want to create.
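
The script's final message reports how many rows and columns BigQuery loaded. One way to sanity-check those numbers is to count the rows and columns of the local file before uploading. This is a minimal sketch, assuming the file has a single header row (matching skip_leading_rows=1 in the script). The helper name csv_shape is hypothetical, and the sample rows are made up for illustration.

```python
import csv
import io


def csv_shape(handle, has_header=True):
    """Return (data_row_count, column_count) for an open CSV text file."""
    reader = csv.reader(handle)
    first_row = next(reader, None)
    if first_row is None:
        # Empty file: no rows and no columns.
        return 0, 0
    row_count = sum(1 for _ in reader)
    if not has_header:
        # Without a header, the first row is data too.
        row_count += 1
    return row_count, len(first_row)


# In-memory stand-in for a file; with a real file you would pass
# open("demo.csv", newline="", encoding="utf-8") instead.
sample_file = io.StringIO("ID,NAME,AGE\n1,Asha,34\n2,Ben,41\n")
rows, columns = csv_shape(sample_file)
print("Expecting {} rows and {} columns".format(rows, columns))
```

If the counts printed by the load script differ from the local counts, the header setting or the delimiter is a likely place to look first.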

Step 3: Preview the table

You can now preview the table you have created by following the same steps as mentioned earlier in this blog.

Table created using API preview

Method 4: CSV to BigQuery using Hevo Data

Prerequisites:

Step 1: Create a new Pipeline

Log in to your Hevo account, go to Pipelines, and click +Create.

HEVO Create a Pipeline

Step 2: Configure your Source details

Note: Import your CSV file into Google Sheets so that you can configure it as a source in Hevo’s platform.

Step 2. a) Search for Google Sheets as a source

Configure Source as Google Sheets

Step 2. b) Provide your source details and click on test+continue

Configure Source

Step 3: Configure your Destination details

Step 3. a) Select your destination type

Configure Destination as Google BigQuery

Step 3. b) Provide your BigQuery credentials.

Configure Destination

Note: Enable Auto Mapping. This Hevo feature automatically maps your incoming schema to your destination.

Configure Destination

Step 3. c) Your pipeline will be created and reflected on your dashboard.

Pipeline Overview

Step 4: Preview your Table

After the pipeline has finished ingesting the data, open BigQuery and preview the loaded table.

Preview your Loaded Table

Limitations of Moving Data from CSV to BigQuery

  • Nested and repeated data are not supported in CSV files.
  • BOM (byte order mark) characters should be removed, because they can cause unexpected issues during loading.
  • BigQuery will not be able to read the data in parallel if you use gzip compression. It takes longer to import compressed CSV data into BigQuery than it does to load uncompressed data. See Loading compressed and uncompressed data for further information.
  • You can’t use the same load job to load compressed and uncompressed files.
  • A gzip file can be up to 4 GB in size.
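
Several of these limitations can be checked before a load job is started. The sketch below shows one possible set of pre-flight helpers in plain Python. All function names are hypothetical, the 4 GB limit is interpreted here as 4 × 1024³ bytes (an assumption), and compressed files are identified only by a .gz extension.

```python
import codecs

# Assumption: the 4 GB gzip limit is read as 4 * 1024**3 bytes.
MAX_GZIP_BYTES = 4 * 1024**3


def strip_utf8_bom(data):
    """Return the bytes without a leading UTF-8 byte order mark."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def split_by_compression(paths):
    """Group paths so compressed and uncompressed files use separate load jobs."""
    compressed = [p for p in paths if p.lower().endswith(".gz")]
    uncompressed = [p for p in paths if not p.lower().endswith(".gz")]
    return compressed, uncompressed


def gzip_within_limit(size_bytes, limit=MAX_GZIP_BYTES):
    """Return True if a gzip file of this size is within the limit."""
    return size_bytes <= limit


# Made-up sample content that starts with a BOM.
raw = codecs.BOM_UTF8 + b"ID,NAME,AGE\n1,Asha,34\n"
clean = strip_utf8_bom(raw)

compressed, uncompressed = split_by_compression(["a.csv", "b.csv.gz", "c.CSV"])
print(compressed, uncompressed)
```

For files on disk, os.path.getsize(path) gives the size to pass to gzip_within_limit, and opening a text file with encoding="utf-8-sig" is another way to drop the BOM while reading.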

Conclusion

Loading CSV data into BigQuery is a powerful way to enhance your data management and analytical capabilities. Whether you are dealing with large datasets, requiring real-time analytics, or leveraging advanced analytics tools like BigQuery ML, importing your CSV data into BigQuery can significantly streamline your workflows. This article provides a step-by-step guide for setting up a CSV to BigQuery connection using four different methods.

The first three methods are manual and can become time-consuming. Writing custom scripts also requires coding expertise that not everyone has. To skip the manual work and automate the entire process, you can try Hevo Data and sign up for our free trial.

FAQ on CSV to BigQuery

How do I append a CSV file to a BigQuery table?

You can append additional data to an existing table by running a load job with the write disposition set to WRITE_APPEND, which adds the new rows to the table instead of overwriting it.
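
As a minimal sketch of what an append looks like at the API level, the helper below builds the load section of a BigQuery load job request body with writeDisposition set to WRITE_APPEND. The function name and the project ID "my-project" are placeholders assumed for illustration; the dataset and table names are the ones used earlier in this article.

```python
import json


def build_load_configuration(project_id, dataset_id, table_id, append=True):
    """Return the `configuration` part of a load job request body for a CSV file."""
    return {
        "load": {
            "sourceFormat": "CSV",
            "skipLeadingRows": 1,  # skip the header row
            "destinationTable": {
                "projectId": project_id,
                "datasetId": dataset_id,
                "tableId": table_id,
            },
            # WRITE_APPEND adds rows to an existing table;
            # WRITE_TRUNCATE replaces the table contents instead.
            "writeDisposition": "WRITE_APPEND" if append else "WRITE_TRUNCATE",
        }
    }


# "my-project" is a placeholder project ID.
config = build_load_configuration("my-project", "example_cli", "bank")
print(json.dumps(config, indent=2))
```

With the Python client library, the same choice is expressed by passing write_disposition=bigquery.WriteDisposition.WRITE_APPEND to bigquery.LoadJobConfig.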

What is the fastest way to load data into BigQuery?

Bulk inserts, where many records are loaded in a single load job rather than row by row, are the fastest way to load data into BigQuery.

How to connect data to BigQuery?

On your computer, open a spreadsheet in Google Sheets.
In the menu at the top, click Data > Data connectors > Connect to BigQuery.
Choose a project.
Click Connect.

How do I export a BigQuery table as CSV to Cloud Storage?

Open the BigQuery page in the Google Cloud console. In the Explorer panel, expand your project and dataset, then select the table. In the details panel, click Export and select Export to Cloud Storage.

Share your thoughts on loading data from CSV to BigQuery in the comments!

Pratik Dwivedi
Technical Content Writer, Hevo Data

Pratik Dwivedi is a seasoned expert in data analytics, machine learning, AI, big data, and business intelligence. With over 18 years of experience in system analysis, design, and implementation, including 8 years in a Techno-Managerial role, he has successfully managed international clients and led teams on various projects. Pratik is passionate about creating engaging content that educates and inspires, leveraging his extensive technical and managerial expertise.
