Most databases can import and export CSV files. BigQuery, Google’s data warehouse as a service, combines data storage and analytics in one package.
CSV is a popular and simple data format. It can store data from databases, clickstreams, browsing trails, social media interactions, page views, and a multitude of other sources. In this article, you will see four ways to move data from CSV to BigQuery. Read along to select the method that works best for your business!
What is BigQuery?
BigQuery is a fully managed data warehouse that helps you organize and analyze your data with built-in features like machine learning, geospatial analysis, and business intelligence.
BigQuery allows you to conduct real-time SQL queries on billions of records. Stakeholders are always looking for faster and better ways to get all their data from all their sources into BigQuery.
Why move data from CSV to BigQuery?
- Enhances overall efficiency: Uploading CSV files to BigQuery simplifies data management and enhances the efficiency of your analytical workflows, making it easier to handle and analyze large datasets.
- Performance: BigQuery is designed to handle massive volumes of data efficiently, offering quick query execution that reduces the time needed to gain insights from your data.
- Advanced Analytics: BigQuery provides advanced analytics tools, such as ML and spatial data analysis, which deliver deeper insights to inform wise decision-making.
- Cost-Effective: BigQuery’s pay-as-you-go approach ensures you only pay for the storage and queries you use, eliminating the need for expensive hardware or software.
- Scalability: Features like real-time analytics, on-demand storage scaling, BigQuery ML, and optimization tools make it easier to manage and scale your data analysis processes as needed.
Methods to Load Data from CSV to BigQuery
Method 1: CSV to BigQuery using Command Line Interface
Prerequisites:
Step 1: Google Cloud Setup
Step 1. a) Install the gcloud command line interface
Step 1. b) Authenticate yourself to Google Cloud. For that, run a command:
gcloud auth login
Step 1. c) Sign in with your account and give all permissions to Google Cloud SDK.
After you complete all these steps, this window should appear:
Step 2: Command Prompt Configurations
Step 2. a) Run command:
bq
This starts Google’s bq command-line tool for BigQuery.
Step 2. b) To list the datasets in your project, run the command:
bq ls <project_id>:
Step 2. c) Now, we need to create a new dataset. In order to do so, you can run the command:
bq mk <new_dataset_name>
Step 3: Load the data into the dataset
Step 3. a) First, go to your cloud SDK directory and add the CSV file that you want to upload. In my case, this is the directory.
Step 3. b) Now, to load the file, go back to your command line and run the following command:
bq load --source_format=CSV example_cli.bank demo.csv ID:integer,NAME:string,AGE:integer
The output of this command will be Upload Complete.
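The inline schema passed to bq load is a comma-separated list of name:type pairs. As a rough sketch, the Python below checks that a CSV header lines up with such a schema before you run the load. The helper names parse_inline_schema and header_matches_schema are hypothetical and not part of the bq tool, and the sample rows are made up for illustration.

```python
import csv
import io


def parse_inline_schema(schema: str) -> list[tuple[str, str]]:
    """Split 'ID:integer,NAME:string' into [('ID', 'INTEGER'), ('NAME', 'STRING')]."""
    fields = []
    for part in schema.split(","):
        name, sep, type_name = part.partition(":")
        name, type_name = name.strip(), type_name.strip()
        if not name:
            raise ValueError(f"empty column name in {part!r}")
        # Assumption for this sketch: a column without an explicit type is treated as STRING.
        fields.append((name, type_name.upper() if sep and type_name else "STRING"))
    return fields


def header_matches_schema(csv_text: str, schema: str) -> bool:
    """Return True when the first CSV row names the same columns, in the same order."""
    header = next(csv.reader(io.StringIO(csv_text)))
    expected = [name for name, _ in parse_inline_schema(schema)]
    return [h.strip() for h in header] == expected


# Made-up sample data that starts with a header row.
sample = "ID,NAME,AGE\n1,Asha,34\n2,Ben,41\n"
print(parse_inline_schema("ID:integer,NAME:string,AGE:integer"))
print(header_matches_schema(sample, "ID:integer,NAME:string,AGE:integer"))
```

If your file does start with a header row like this sample, adding the --skip_leading_rows=1 flag to bq load keeps that row out of the table.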
Step 4: Preview of the data
Step 4. a) To check whether the table has been created, you can use the following command:
bq ls example_cli
Step 4. b) If you want to see the schema of the table, you can do so by using the command:
bq show <dataset_name>.<table_name>
Step 4. c) To preview your table, go back to BigQuery, go to tables, and click on preview.
Method 2: CSV to BigQuery using BigQuery Web UI
Prerequisites:
Step 1: Create a new Dataset
To create a new dataset in BigQuery, go to your BigQuery studio, click on the three dots beside your project ID, and click Create Dataset.
Step 2: Create a Table
To create a new table, click on the three dots next to your dataset name and click on Create Table.
Note:
- Keep the file format as CSV.
- Turn on auto-detect. This will automatically detect the incoming table schema and generate a schema accordingly.
- Set the Create table from option to Upload, then upload the file that you want to load into BigQuery.
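Auto-detect works by looking at a sample of the data and guessing a type for each column. BigQuery's actual inference rules are not reproduced here. The following is only a simplified sketch of the idea, using hypothetical guess_type and infer_column_types helpers that distinguish INTEGER, FLOAT, BOOLEAN, and STRING on made-up sample data.

```python
import csv
import io


def guess_type(value: str) -> str:
    """Guess a coarse type name for a single CSV cell."""
    text = value.strip()
    if text.lower() in ("true", "false"):
        return "BOOLEAN"
    try:
        int(text)
        return "INTEGER"
    except ValueError:
        pass
    try:
        float(text)
        return "FLOAT"
    except ValueError:
        return "STRING"


def infer_column_types(csv_text: str, sample_rows: int = 100) -> dict[str, str]:
    """Infer one type per column from the header plus a sample of rows."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    seen: dict[str, set[str]] = {name: set() for name in header}
    for index, row in enumerate(reader):
        if index >= sample_rows:
            break
        for name, cell in zip(header, row):
            if cell != "":  # empty cells carry no type information
                seen[name].add(guess_type(cell))
    result = {}
    for name, types in seen.items():
        if len(types) == 1:
            result[name] = next(iter(types))
        elif types == {"INTEGER", "FLOAT"}:
            result[name] = "FLOAT"  # widen mixed numeric columns
        else:
            result[name] = "STRING"  # fall back when values disagree or are absent
    return result


sample = "ID,NAME,AGE,BALANCE\n1,Asha,34,1200.50\n2,Ben,41,300\n"
print(infer_column_types(sample))
```

A column that mixes whole numbers and decimals is widened to FLOAT in this sketch, which is one reason it is worth reviewing the detected schema after the table is created.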
Step 3: Preview Table
Step 3. a) You can edit the schema you just created. To do so, click on the table name and click on the schema tab. You can make changes to the schema as you like from here, and finally, click on edit schema.
Step 3. b) To preview the table, click on the preview tab beside the details tab.
Method 3: CSV to BigQuery using Web API
Prerequisites:
Step 1: Configure BigQuery API
Go to Google Cloud Console and look for APIs and services. Search for BigQuery and click on BigQuery API.
Note: Make sure that the service is enabled.
Step 2: Configuring the Python script
Open your code editor and type the given Python script.
import os
import time

from google.cloud import bigquery

# Path to your service account's JSON key file.
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = ""

client = bigquery.Client()

# Treat the first row as a header and let BigQuery infer the schema.
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,
)

# Name of the new table to create.
table_id = ""

with open(r'demo.csv', "rb") as source_file:
    job = client.load_table_from_file(source_file, table_id, job_config=job_config)

# Poll until the load job finishes.
while job.state != 'DONE':
    time.sleep(2)
    job.reload()
    print(job.state)

# result() raises an exception if the job failed.
print(job.result())

table = client.get_table(table_id)
print(
    "Loaded {} rows and {} columns to {}".format(
        table.num_rows, len(table.schema), table_id
    )
)
Note:
- For GOOGLE_APPLICATION_CREDENTIALS, provide the path to your service account’s JSON key.
- For table_id, give the name of the new table that you want to create.
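If auto-detect guesses a column type you don't want, the client library also accepts an explicit schema. The sketch below assumes the google-cloud-bigquery package is installed and credentials are configured. The column names mirror the ID, NAME, AGE schema from the bq load command in Method 1, and load_csv_with_schema is a hypothetical helper name.

```python
# Column names and types for the explicit schema (assumed from the bq load example).
COLUMNS = [("ID", "INTEGER"), ("NAME", "STRING"), ("AGE", "INTEGER")]


def load_csv_with_schema(csv_path: str, table_id: str, columns=COLUMNS) -> int:
    """Load a local CSV into table_id using an explicit schema instead of autodetect."""
    # Imported here so this file can be read and imported without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,  # assumes the file starts with a header row
        schema=[bigquery.SchemaField(name, type_name) for name, type_name in columns],
    )
    with open(csv_path, "rb") as source_file:
        job = client.load_table_from_file(source_file, table_id, job_config=job_config)
    job.result()  # blocks until the load job finishes and raises if it failed
    return client.get_table(table_id).num_rows
```

You would call it with your own file and table, for example load_csv_with_schema("demo.csv", table_id) where table_id is the same value used in the script above.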
Step 3: Preview the table
You can now preview the table you have created by following the same steps as mentioned earlier in this blog.
Method 4: CSV to BigQuery using Hevo Data
Prerequisites:
Step 1: Configure Google Sheets as your Source.
Step 2: Configure BigQuery as your Destination.
Step 3: Monitor the Pipeline.
After the pipeline has completed ingesting, you can preview your loaded table by opening BigQuery and previewing it.
Limitations of Moving Data from CSV to BigQuery
- Nested and repeated data are not supported in CSV files.
- BOM (byte order mark) characters should be removed, as they may cause unexpected issues.
- BigQuery will not be able to read the data in parallel if you use gzip compression. It takes longer to import compressed CSV data into BigQuery than it does to load uncompressed data. See Loading compressed and uncompressed data for further information.
- You can’t use the same load job to load compressed and uncompressed files.
- A gzip file can be up to 4 GB in size.
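Two of the limitations above, BOM characters and the gzip size cap, are easy to check before a load. The following is a minimal sketch using only the Python standard library. The helper names strip_utf8_bom and gzip_within_limit are hypothetical, the 4 GB limit is interpreted here as 4 GiB, and only the UTF-8 BOM is handled.

```python
import codecs
import os
import tempfile

GZIP_LIMIT_BYTES = 4 * 1024**3  # the 4 GB gzip limit, interpreted here as 4 GiB


def strip_utf8_bom(path: str) -> bool:
    """Rewrite the file without a leading UTF-8 BOM; return True if one was removed."""
    # Reads the whole file into memory, which is fine for a sketch but not for huge files.
    with open(path, "rb") as handle:
        data = handle.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    with open(path, "wb") as handle:
        handle.write(data[len(codecs.BOM_UTF8):])
    return True


def gzip_within_limit(path: str, limit: int = GZIP_LIMIT_BYTES) -> bool:
    """Check a .gz file's size on disk against the limit."""
    return os.path.getsize(path) <= limit


# Small demonstration on a temporary file that starts with a BOM.
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(codecs.BOM_UTF8 + b"ID,NAME,AGE\n1,Asha,34\n")
    demo_path = tmp.name

removed = strip_utf8_bom(demo_path)
with open(demo_path, "rb") as handle:
    first_line = handle.readline()
print(removed, first_line)
os.remove(demo_path)
```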
Overcome these limitations by choosing Hevo, a no-code automated platform. Get started with Hevo by signing up for a 14-day free trial.
Conclusion
Loading CSV data into BigQuery is a powerful way to enhance your data management and analytical capabilities. Whether you are dealing with large datasets, requiring real-time analytics, or leveraging advanced analytics tools like BigQuery ML, importing your CSV data into BigQuery can significantly streamline your workflows. This article provides a step-by-step guide for loading CSV data into BigQuery using four different methods.
The first three methods are manual and can become time-consuming. Also, writing custom scripts requires coding knowledge that not everyone has. To avoid the manual part and automate the entire process, you can try Hevo Data and sign up for our free trial.
FAQ on CSV to BigQuery
How do I append a CSV file to a BigQuery table?
You can append additional data to an existing table by performing a load-append operation.
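With the Python client, a load-append is a regular load job whose write disposition is set to WRITE_APPEND. A minimal sketch, assuming the google-cloud-bigquery package is installed and credentials are configured. The helper name append_csv is hypothetical, and the destination table is assumed to exist already with a schema that matches the CSV.

```python
def append_csv(csv_path: str, table_id: str) -> int:
    """Append rows from a local CSV to an existing table; return the table's row count."""
    # Imported lazily so this file can be imported without the client library installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,  # assumes the CSV starts with a header row
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )
    with open(csv_path, "rb") as source_file:
        job = client.load_table_from_file(source_file, table_id, job_config=job_config)
    job.result()  # wait for completion; raises if the job failed
    return client.get_table(table_id).num_rows
```

Swapping WRITE_APPEND for WRITE_TRUNCATE would replace the table's contents instead of adding to them, so it is worth double-checking the disposition before running a load against production data.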
What is the fastest way to load data into BigQuery?
Bulk Insert into BigQuery is the fastest way to load data.
How to connect data to BigQuery?
On your computer, open a spreadsheet in Google Sheets.
In the menu at the top, click Data > Data connectors > Connect to BigQuery.
Choose a project.
Click Connect.
How do I export CSV to storage in BigQuery?
Open the BigQuery page in the Google Cloud console. In the Explorer panel, expand your project and dataset, then select the table. In the details panel, click Export and select Export to Cloud Storage.
Share your thoughts on loading data from CSV to BigQuery in the comments!
Pratik Dwivedi is a seasoned expert in data analytics, machine learning, AI, big data, and business intelligence. With over 18 years of experience in system analysis, design, and implementation, including 8 years in a Techno-Managerial role, he has successfully managed international clients and led teams on various projects. Pratik is passionate about creating engaging content that educates and inspires, leveraging his extensive technical and managerial expertise.