Efficient Data Management with BigQuery INSERT and UPDATE Commands

Sarad Mohanan • Last Modified: August 22nd, 2023


Recent years have brought many new platforms and tools to the field of data management. As the volume of information grows day by day, most organizations are turning to Cloud-based solutions to collect, store, and work with their data. BigQuery is one such popular Cloud-based Data Warehouse that provides high scalability, simplicity, and abstraction to its users. Unlike many of its peers, BigQuery is serverless and allocates resources dynamically, which means that no provisioning or hardware management is required from the user.

This blog will introduce you to BigQuery and explain the BigQuery INSERT and UPDATE commands in detail. These are SQL statements that help you manage your data in Google BigQuery. Read along to learn more about BigQuery and understand how the BigQuery INSERT and UPDATE commands work!


Prerequisites

  • A Google Cloud account with permission to create projects.
  • Basic understanding of databases and SQL. 

Introduction to Google BigQuery


Google BigQuery is a fully managed data warehouse service. It has a comprehensive querying layer with state-of-the-art processing ability and response times. BigQuery lets customers use a powerful data warehouse without having to spend money on developing and maintaining one. BigQuery is offered on a pay-as-you-go model. Beyond the typical data warehouse features, BigQuery also offers many supporting features. BigQuery ML helps users run models on BigQuery data using SQL queries. Connected Sheets helps users analyze the data in BigQuery using Google Sheets. In addition to the data residing in its own storage, BigQuery can also access data from various other locations, including Google Drive and Cloud SQL.

To learn more about Google BigQuery, visit here.

Introduction to BigQuery SQL


Since BigQuery is a Data Warehouse service, its querying layer plays a big role in how well it fits a given use case. By default, each Data Manipulation Language (DML) statement in BigQuery runs as an implicit transaction, that is, it is auto-committed. BigQuery also supports multi-statement transactions, which group several statements between BEGIN TRANSACTION and COMMIT TRANSACTION. Another notable behavior is that DML statements can execute concurrently. As a result, concurrently running statements that conflict with each other can raise errors and fail.

Now that you have learned the basics of BigQuery DML, take a look at the following schemas, which are used in the rest of the tutorial:

The Outlet table

[
  {"name": "outlet_name", "type": "string"},
  {"name": "location", "type": "string"}
]

The Stock table

[
  {"name": "product", "type": "string"},
  {"name": "quantity", "type": "integer"},
  {"name": "outlet", "type": "string"}
]

The StockDetails table

[
  {"name": "product", "type": "string"},
  {"name": "color", "type": "string"},
  {"name": "price", "type": "string"},
  {"name": "expiry_months", "type": "integer"}
]
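If you prefer to create these tables with SQL rather than by uploading the JSON schemas, the following is a minimal sketch of equivalent DDL. It assumes a dataset named dataset already exists in your project (the same placeholder name used in the queries later in this tutorial); STRING and INT64 are the GoogleSQL names for the string and integer types listed above.

```sql
-- Sketch only: assumes a dataset called `dataset` already exists.
CREATE TABLE IF NOT EXISTS dataset.Outlet (
  outlet_name STRING,
  location    STRING
);

CREATE TABLE IF NOT EXISTS dataset.Stock (
  product  STRING,
  quantity INT64,
  outlet   STRING
);

-- price is kept as STRING to match the schema above.
CREATE TABLE IF NOT EXISTS dataset.StockDetails (
  product       STRING,
  color         STRING,
  price         STRING,
  expiry_months INT64
);
```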

To learn more about SQL in BigQuery, visit here.

Simplify your Data Analytics Using Hevo’s No-code Data Pipeline

Hevo Data, a No-code Data Pipeline helps to integrate data from 100+ other data sources (Including 40+ Free Data Sources) and load it in a Data Warehouse of your choice to visualize it in your desired BI tool. Hevo is fully managed and completely automates the process of not only loading data from your desired source but also enriching the data and transforming it into an analysis-ready form without having to write a single line of code. Its fault-tolerant architecture ensures that the data is handled in a secure, consistent manner with zero data loss.


Check out some of the cool features of Hevo:

  • Completely Automated: The Hevo platform can be set up in just a few minutes and requires minimal maintenance.
  • Real-time Data Transfer: Hevo provides real-time data migration, so you can have analysis-ready data always.
  • 100% Complete & Accurate Data Transfer: Hevo’s robust infrastructure ensures reliable data transfer with zero data loss.
  • Scalable Infrastructure: Hevo has in-built integrations for 100+ sources that can help you scale your data infrastructure as required.
  • 24/7 Live Support: The Hevo team is available round the clock to extend exceptional support to you through chat, email, and support calls.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  • Live Monitoring: Hevo allows you to monitor the data flow so you can check where your data is at a particular point in time.

BigQuery INSERT and UPDATE

BigQuery INSERT and UPDATE are the most used commands when it comes to Data Management in BigQuery. The syntax, importance, and use of both these commands are as follows:


1) BigQuery INSERT and UPDATE: INSERT Command

Of the two commands, start with the basic INSERT statement constructs, which you will use to interact with the table definitions above.

The INSERT statement follows standard SQL syntax. The values being inserted must be listed in the same order as the columns. The image below shows an example of the INSERT command.

[Image: INSERT Command in BigQuery]

You can execute a basic INSERT query with columns specified as below.

INSERT INTO dataset.StockDetails (product, price) VALUES ('Samsung Galaxy Note 10', '999')

An INSERT query without specifying columns can be executed as below.

INSERT INTO dataset.StockDetails VALUES ('SG Note 10', 'Mystic Black', '999', 24)

An INSERT query using a subquery can be executed as follows.

INSERT INTO dataset.StockDetails (product, color) VALUES ('SG Note 10', (SELECT color FROM dataset.StockDetails WHERE product = 'Samsung Galaxy Note 20'))

BigQuery also supports the WITH keyword in the query that feeds an INSERT statement, so the rows to insert can be built from common table expressions.
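As a rough illustration of INSERT combined with WITH, the sketch below copies the color, price, and expiry of an existing product into a new row. The CTE name note20 and the product names are illustrative assumptions; note that the statement inserts one row for every row the CTE returns, so it inserts nothing if no matching product exists.

```sql
-- Sketch: build the rows to insert from a common table expression.
INSERT INTO dataset.StockDetails (product, color, price, expiry_months)
WITH note20 AS (
  -- Hypothetical source row(s) to copy attributes from.
  SELECT color, price, expiry_months
  FROM dataset.StockDetails
  WHERE product = 'Samsung Galaxy Note 20'
)
SELECT 'SG Note 10', color, price, expiry_months
FROM note20;
```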

A point to note here is the implicit data type conversion that BigQuery performs for INSERT statements. When an inserted value can be coerced to the target column's data type, BigQuery converts it without warning, so a careless INSERT can store values in a form you did not intend. BigQuery raises a data type mismatch error only when the incoming value cannot be converted to the target column's data type.
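One way to avoid relying on implicit conversion is to make every conversion explicit. The sketch below assumes the expiry value arrives as text and converts it with SAFE_CAST, which yields NULL rather than failing when the text is not a valid integer; use CAST instead if you would rather have the statement fail. The literal values are illustrative.

```sql
-- Sketch: convert explicitly instead of depending on implicit coercion.
INSERT INTO dataset.StockDetails (product, color, price, expiry_months)
SELECT
  'SG Note 10',
  'Mystic Black',
  '999',                      -- price is a STRING column in this schema
  SAFE_CAST('24' AS INT64);   -- NULL if the text is not a valid integer
```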

2) BigQuery INSERT and UPDATE: UPDATE Command

In BigQuery, every UPDATE statement must include a WHERE clause. This protects against erroneous bulk updates when a developer accidentally forgets the WHERE clause. To update every row in the table, you have to explicitly specify WHERE TRUE.
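As a minimal sketch of this rule, the statements below first count the rows a filter would match and then update the whole table by writing WHERE TRUE explicitly. The color value and the habit of previewing with SELECT COUNT(*) are illustrative assumptions, not something BigQuery requires.

```sql
-- Preview: how many rows would a filtered UPDATE touch?
SELECT COUNT(*) AS rows_to_update
FROM dataset.StockDetails
WHERE product LIKE '%Note%';

-- Update every row in the table; the WHERE clause is still mandatory.
UPDATE dataset.StockDetails
SET color = 'Mystic Black'
WHERE TRUE;
```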

The most basic UPDATE example looks like this.

UPDATE dataset.StockDetails SET color = 'Mystic Green' WHERE product = 'SG Note 20 Ultra'

To update records by using a string match operation, the query will be as follows.

UPDATE dataset.StockDetails SET color = 'Mystic Green' WHERE product LIKE '%Note%'

The above statement will update the color of all records where the product contains ‘Note’ in its name.

An UPDATE statement that joins with another table can be executed as below.

UPDATE dataset.StockDetails a SET color = 'black' FROM dataset.Stock b WHERE a.product = b.product AND b.outlet = 'central_park'

The above query updates the color of all the records that belong to a particular store.
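One caveat with joined updates: BigQuery raises a runtime error if a row in the table being updated matches more than one row from the FROM clause. A hedged sketch of one way to guard against that is to deduplicate the joined rows in a subquery first; the alias names and the outlet value follow the example above.

```sql
-- Sketch: make sure each StockDetails row matches at most one source row.
UPDATE dataset.StockDetails AS a
SET color = 'black'
FROM (
  -- One row per product, even if the outlet lists a product several times.
  SELECT DISTINCT product
  FROM dataset.Stock
  WHERE outlet = 'central_park'
) AS b
WHERE a.product = b.product;
```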

An explicit JOIN keyword can be used when the join is between tables that are not being updated; the join with the updated table itself is still expressed in the WHERE clause. An example is as follows.

UPDATE dataset.StockDetails SET price = '899' FROM dataset.Stock INNER JOIN dataset.Outlet ON Stock.outlet = Outlet.outlet_name WHERE StockDetails.product = Stock.product AND Outlet.location = 'New York'

The above command sets the price to '899' for every product stocked in an outlet located in 'New York'.

That covers the basics of INSERT and UPDATE statements in BigQuery. When implementing an ETL pipeline to synchronize data between your transactional databases and your data warehouse, you can use an automated, scheduled script built from the above constructs. Implementing such a script involves additional complexity because of factors such as duplicate rows and data type translations.
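One way such a script might handle duplicate rows is BigQuery's MERGE statement, which combines INSERT and UPDATE in a single statement. The sketch below assumes a hypothetical staging table, dataset.StockDetailsStaging, with the same columns as StockDetails and at most one row per product; neither the table nor that uniqueness guarantee comes from this tutorial.

```sql
-- Sketch: upsert from a hypothetical staging table into StockDetails.
MERGE dataset.StockDetails AS target
USING dataset.StockDetailsStaging AS source  -- assumed table, not defined above
ON target.product = source.product
WHEN MATCHED THEN
  -- The product already exists: refresh its attributes.
  UPDATE SET
    color = source.color,
    price = source.price,
    expiry_months = source.expiry_months
WHEN NOT MATCHED THEN
  -- New product: insert it.
  INSERT (product, color, price, expiry_months)
  VALUES (source.product, source.color, source.price, source.expiry_months);
```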

But there is an alternative.

It is to use a completely managed ETL tool like Hevo that can accomplish data transfer between various sources and destinations easily.

That’s it! You can now try and implement the BigQuery INSERT and UPDATE commands by yourself.

Conclusion

This article teaches you how to work with Google BigQuery INSERT and UPDATE statements with ease. It also provides in-depth knowledge about the concepts behind every step to help you understand and implement them efficiently. Integrating complex data from a diverse set of data sources can be a challenging task and this is where Hevo saves the day!

Hevo Data offers a faster way to move data from 100+ data sources such as SaaS applications or Databases into your Data Warehouse such as Google BigQuery to be visualized in a BI tool. Hevo is fully automated and hence does not require you to code.

Want to take Hevo for a spin? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand. You can also have a look at the unbeatable pricing that will help you choose the right plan for your business needs.

Share your thoughts on BigQuery INSERT and UPDATE queries. 
