Modifying table schemas | BigQuery

Samuel Salimon • Last Modified: August 22nd, 2023

BigQuery Alter Table_Fi

Data Warehousing architectures have rapidly changed over the years and most of the notable service providers are now Cloud-based. Therefore, companies are increasingly on the move to align with such offerings on the Cloud as it provides them with a lower upfront cost, enhances scalability, and performance as opposed to traditional On-premise Data Warehousing systems. Google BigQuery is among one of the well-known and widely accepted Cloud-based Data Warehouse Applications.

You can use BigQuery to compile all your data into one system and run SQL queries to analyze it. Data should be appropriately structured so that it can be easily examined. BigQuery resources can be created and modified via data definition language (DDL) statements based on standard SQL query syntax. BigQuery currently supports DDL commands for creating, altering, and deleting tables, views, and user-defined functions (UDFs).

In this article, you will gain information about Google BigQuery Alter Table Commands. You will also gain a holistic understanding of Google BigQuery, its key features, and the types of BigQuery Alter Table Commands. Read along to find out in-depth information about BigQuery ALTER TABLE Commands.

Table of Contents

Introduction to Google BigQuery

Google BigQuery Alter Table - Google BigQuery
Image Source

Google BigQuery is a Cloud-based Data Warehouse that provides a Big Data Analytic Web Service for processing petabytes of data. It is intended for analyzing data on a large scale. It consists of two distinct components: Storage and Query Processing. It employs the Dremel Query Engine to process queries and is built on the Colossus File System for storage. These two components are decoupled and can be scaled independently and on-demand.

Google BigQuery is fully managed by Cloud service providers. We don’t need to deploy any resources, such as discs or virtual machines. It is designed to process read-only data. Dremel and Google BigQuery use Columnar Storage for quick data scanning, as well as a tree architecture for executing queries using ANSI SQL and aggregating results across massive computer clusters. Furthermore, owing to its short deployment cycle and on-demand pricing, Google BigQuery is serverless and designed to be extremely scalable.

For further information about Google Bigquery, follow the Official Documentation.

Key Features of Google BigQuery

Google BigQuery Alter Table - Features of BigQuery
Image Source

Some of the key features of Google BigQuery are as follows:

1) Performance

Partitioning is supported by BigQuery, which improves Query performance. The data may be readily queried using SQL or Open Database Connectivity (ODBC)

2) Scalability

Being quite elastic, BigQuery separates computation and storage, allowing customers to scale processing and memory resources according to their needs. The tool has significant vertical and horizontal scalability and runs real-time queries on petabytes of data in a very short period.

3) Security

When a third-party authorization exists, users can utilize OAuth as a standard approach to get the cluster. By default, all data is encrypted and in transit. Cloud Identity and Access Management (IAM) allows for fine-tuning administration.

4) Usability

Google BigQuery is a highly user-friendly platform that requires a basic understanding of SQL commands, ETL tools, etc.

5) Data Types

 It supports JSON and XML file formats.

6) Data Loading

It employs the conventional ELT/ETL Batch Data Loading techniques by employing standard SQL dialect, as well as Data Streaming to load data row by row using Streaming APIs.

7) Integrations

In addition to operational databases, the system supports integration with a wide range of data integration tools, business intelligence (BI), and artificial intelligence (AI) solutions. It also works with Google Workspace and Cloud Platform.

8) Data Recovery

Data backup and disaster recovery are among the services provided by Google BigQuery. Users can query point-in-time snapshots of data changes from the last seven days.

9) Pricing Models

The Google BigQuery platform is available in both on-demand and flat-rate subscription models. Although data storage and querying will be chargedexporting, loading, and copying data is free. It has separated computational resources from storage resources. You are only charged when you run queries. The quantity of data processed during searches is billed.

Simplify BigQuery ETL and Analysis with Hevo’s No-code Data Pipeline

A fully managed No-code Data Pipeline platform like Hevo Data helps you integrate and load data from 100+ different sources (including 40+ free sources) to a Data Warehouse such as Google BigQuery or Destination of your choice in real-time in an effortless manner. Hevo with its minimal learning curve can be set up in just a few minutes allowing the users to load data without having to compromise performance. Its strong integration with umpteenth sources allows users to bring in data of different kinds in a smooth fashion without having to code a single line. 

Get Started with Hevo for free

Check out some of the cool features of Hevo:

  • Completely Automated: The Hevo platform can be set up in just a few minutes and requires minimal maintenance.
  • Transformations: Hevo provides preload transformations through Python code. It also allows you to run transformation code for each event in the Data Pipelines you set up. You need to edit the event object’s properties received in the transform method as a parameter to carry out the transformation. Hevo also offers drag and drop transformations like Date and Control Functions, JSON, and Event Manipulation to name a few. These can be configured and tested before putting them to use.
  • Connectors: Hevo supports 100+ integrations to SaaS platforms, files, Databases, analytics, and BI tools. It supports various destinations including Google BigQuery, Amazon Redshift, Firebolt, Snowflake Data Warehouses; Amazon S3 Data Lakes; and MySQL, SQL Server, TokuDB, DynamoDB, PostgreSQL Databases to name a few.  
  • Real-Time Data Transfer: Hevo provides real-time data migration, so you can have analysis-ready data always.
  • 100% Complete & Accurate Data Transfer: Hevo’s robust infrastructure ensures reliable data transfer with zero data loss.
  • Scalable Infrastructure: Hevo has in-built integrations for 100+ sources (including 40+ free sources) that can help you scale your data infrastructure as required.
  • 24/7 Live Support: The Hevo team is available round the clock to extend exceptional support to you through chat, email, and support calls.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  • Live Monitoring: Hevo allows you to monitor the data flow so you can check where your data is at a particular point in time.
Sign up here for a 14-day Free Trial!

Understanding Google BigQuery Alter Table Command

Your database requirements will also change as the times change and your needs change. Occasionally, you may have to alter an existing table’s definition. In some cases, you may need to rename a column, add a new column, or change the column’s datatype column. You can make these changes using DDL statements. 

A BigQuery ALTER TABLE is a DDL statement. The following Google BigQuery ALTER TABLE commands are:

Download the Cheatsheet on How to Set Up High-performance ETL to BigQuery
Download the Cheatsheet on How to Set Up High-performance ETL to BigQuery
Download the Cheatsheet on How to Set Up High-performance ETL to BigQuery
Learn the best practices and considerations for setting up high-performance ETL to BigQuery

1) ALTER TABLE SET OPTIONS Statement

Comma-separated lists can be used to include multiple options. You can set the options for a table using the ALTER TABLE SET OPTIONS statement. Among the choices you can set are a label and an expiration date for each table.

A) Syntax

ALTER TABLE [IF EXISTS] table_name
SET OPTIONS(table_set_options_list)

IF EXISTS checks if the table exists. If no such table exists, the statement doesn’t have any effect. There are two parts to this query: table_name and table_set_options_list. Table_name specifies the name and table_set_options_list specifies the options to set.

The following format can be used to specify a table option list:

NAME=VALUE, …

2) ALTER TABLE ADD COLUMN Statement

You may have to add an entirely new column to your database in certain situations. For example, your product manager may decide that all users record the last time logged in. You can accomplish this using the BigQuery ALTER TABLE ADD COLUMN command. Using this command, one or more columns can be added to an existing table.

A) Syntax

ALTER TABLE table_name
ADD COLUMN [IF NOT EXISTS] column_name column_schema [, ...]

Adding a new column to an existing table requires the column_name, the name of the new column, and column_schema, which is its schema.

It is important to note that this statement cannot create a partition, clustered, or nested columns inside existing RECORD fields. The same is true for REQUIRED columns in existing schemas. You can, however, create nested REQUIRED columns as part of a new RECORD field. When the IF NOT EXISTS clause is used, the statement returns an error if that column already exists in the table. However, if that column already exists, no error is returned, and no action is taken.

B) Example

In the following example, the following columns are added to the table all_users_table:

  • Column Name of type STRING.
  • Column Location of type GEOGRAPHY.
  • Column Salary of type NUMERIC with REPEATED mode.
  • Column DOB of type DATE with a description.

Code Snippet 

ALTER TABLE mydataset.all_users_table
  ADD COLUMN Name STRING,
  ADD COLUMN IF NOT EXISTS Location GEOGRAPHY,
  ADD COLUMN Salary ARRAY<NUMERIC>,
  ADD COLUMN DOB DATE OPTIONS(description="my description")

This statement fails if any of the columns named Name, Salary, or DOB exist. Due to the IF NOT EXISTS clause.

3) ALTER TABLE RENAME TO Statement

A table can be renamed using this statement. However, you should note that table snapshots cannot be changed by using the BigQuery ALTER TABLE RENAME statement. In addition, you should note that renaming a table will delete all tags associated with it or its columns in Data Catalog.

A) Syntax

ALTER TABLE [IF EXISTS] table_name
RENAME TO new_table_name

The name of the new table is new_table_name. It cannot be the same name as an existing table. In addition, if you rename a table and change its policies or row-level access policies, the changes may not take effect. You will need to wait for BigQuery to indicate that streaming is not in use before renaming a table that is presently streaming data into it.

B) Example

The following example renames the table mydataset.all_users_table to mydataset.new_allusers_table:

ALTER TABLE mydataset.all_users_table 
RENAME TO new_allusers_table

4) ALTER TABLE DROP COLUMN statement

This statement drops one or more columns from an existing table schema. The dropped column doesn’t automatically free up the associated storage when the statement runs. Columns are stored in the background for seven days from the day they are dropped. You can immediately regain space by deleting the column from a table schema. 

A) Syntax

ALTER TABLE table_name
DROP COLUMN [IF EXISTS] column_name [, ...]

Columns can only be dropped from an already existing table and schema. It is impossible to drop partitioned, clustered, or nested columns within existing RECORD columns with the ALTER TABLE DROP COLUMN statement. In the absence of the IF EXISTS clause, the statement fails if no column with the specified name exists in the table.  

B) Example

We will drop the following columns from a table called all_users_table in this example:

  • Column Name
  • Column Location
ALTER TABLE mydataset.all_users_table
  DROP COLUMN Name,
  DROP COLUMN IF EXISTS Location

The statement fails if the column name does not exist. The statement succeeds even if Location doesn’t exist since the IF EXISTS clause applies.

5) ALTER COLUMN SET OPTIONS Statement

You can set options on columns in a table with this statement, including the column description.

A) Syntax

ALTER TABLE [IF EXISTS] table_name
ALTER COLUMN [IF EXISTS] column_name SET OPTIONS(column_set_options_list)

When ALTER COLUMN [IF EXISTS] is used, it implies no effect if the specified column does not exist. A column_name refers to the top-level column you want to make changes to. However, you cannot modify nested columns in a STRUCT. In column_set_options_list, you specify the options you want to set for the column.

The column_set_options_list must be specified in the following format:

NAME=VALUE, …

VALUE and NAME must match any of the following combinations:

NAMEVALUEDetails
descriptionSTRINGExample: description=”a table that expires in 2025″

Value contains only literals, query parameters, and scalar functions. NAME is removed if the constant expression evaluates to null.

It is important to note that the constant expression cannot refer to a table, subqueries, SQL statements (SELECT, CREATE, UPDATE), user-defined functions, aggregate functions, analytic functions, or scalar functions such as ARRAY_TO_STRING, RAND, and SESSION_USER. 

If the column already had a value, setting VALUE will replace it. By selecting the VALUE to NULL, the column will be cleared.

B) Example

In the example below, a new description is added to a column called price:

ALTER TABLE mydataset.all_users_table
ALTER COLUMN price
SET OPTIONS (
  description="Price per unit"
)

6) ALTER COLUMN DROP NOT NULL Statement

Using this statement, a NOT NULL constraint is removed from a column in a table in BigQuery. The NOT NULL constraint is regarded as the column constraint.  An error is returned if a column does not have the NOT NULL constraint.

A) Syntax

ALTER TABLE [IF EXISTS] table_name
ALTER COLUMN [IF EXISTS] column DROP NOT NULL

B) Example

The following example removes the NOT NULL constraint from a column called user_column:

ALTER TABLE mydataset.all_users_table
ALTER COLUMN user_column
DROP NOT NULL

7) ALTER COLUMN SET DATA TYPE Statement

You can modify the data type in a table with the ALTER COLUMN SET DATA TYPE statement in BigQuery. For example, an INT64 data type can be changed into a FLOAT64 type, but not the other way around.

A) Syntax

ALTER TABLE [IF EXISTS] table_name
ALTER COLUMN [IF EXISTS] column_name SET DATA TYPE data_type

B) Example

The following example changes the data type of column d1 from an INT64 to FLOAT64:

CREATE TABLE dataset.table(d1 INT64);
ALTER TABLE dataset.all_users_table ALTER COLUMN c1 SET DATA TYPE FLOAT64;

8) ALTER VIEW SET OPTIONS Statement

This statement lets you set the options for a view. You can customize the options in the options list, such as the label and expiration date. A comma-separated list is an excellent way to include multiple options.

A) Syntax

ALTER VIEW [IF EXISTS] view_name
SET OPTIONS(view_set_options_list)

The format of the view option list is as follows:

NAME=VALUE, …

There are only literals, query parameters, and scalar functions in VALUE. In cases where NAME evaluates to null, the constant expression is ignored.

It should be noted that the constant expression cannot contain references to tables, subqueries, or SQL commands such as SELECT, CREATE, and UPDATE, user-defined functions, aggregate functions, and analytic functions such as REGEXP_REPLACE, GENERATE_ARRAY, REPEAT, etc. 

If the view already had a value for that option, setting the VALUE will replace it. The view’s value for the option is cleared when the VALUE is set to NULL.

B) Example

In the following example, we set the views expiration time to seven days, and we also set the description:

ALTER VIEW mydataset.users_view
SET OPTIONS (
  expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 7 DAY),
  description="View that expires seven days from now."
)

9) ALTER MATERIALIZED VIEW SET OPTIONS Statement

This statement allows you to set the options on a materialized view.

A) Syntax

ALTER MATERIALIZED VIEW [IF EXISTS] materialized_view_name
SET OPTIONS(materialized_view_set_options_list)

A materialized_view’s_set_options_list lets you set options such as whether to enable refresh or the refresh interval, a label, and expiration date. Multiple options can be included by using commas at the beginning and end of the list.

The following is the format for specifying a materialized view option list:

NAME=VALUE, …

If the materialized view had an existing value, changing VALUE replaces that value. When set to NULL, the materialized view’s value is cleared.

B) Example

The following example enables refresh and sets the refresh interval to 10 minutes on a materialized view:

ALTER MATERIALIZED VIEW mydataset.users_mv
SET OPTIONS (
  enable_refresh=true,
  refresh_interval_minutes=10
)

Conclusion

In this article, you have learned about the types of BigQuery Alter Table Commands. This article also provided information on Google BigQuery, its key features, and Google BigQuery Alter Table Commands.

Hevo Data, a No-code Data Pipeline provides you with a consistent and reliable solution to manage data transfer between a variety of sources and a wide variety of Desired Destinations with a few clicks.

Visit our Website to Explore Hevo

Hevo Data with its strong integration with 100+ data sources (including 40+ free sources) allows you to not only export data from your desired data sources & load it to the destination of your choice such as Google BigQuery, but also transform & enrich your data to make it analysis-ready so that you can focus on your key business needs and perform insightful analysis using BI tools. 

Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand. You may also have a look at the amazing price, which will assist you in selecting the best plan for your requirements.

Share your experience of understanding Google BigQuery Alter Table Commands in the comment section below! We would love to hear your thoughts.

No-code Data Pipeline for Google BigQuery