Google BigQuery, a serverless and highly scalable data warehouse-as-a-service, is a strong fit for businesses that work with petabyte-scale data. Backed by the processing power of Google’s infrastructure, BigQuery is known for fast query execution and cost efficiency.
BigQuery is also a data analytics platform covering the complete analytics value chain: ingesting, processing, and storing data, followed by advanced analytics and cross-functional collaboration. Efficient schema management is therefore a priority for businesses that depend heavily on analytics. This tutorial focuses on just one aspect of it: how to rename a column in BigQuery.
In general, you cannot rename a column by updating the table schema directly, because this is not supported by the Cloud Console, the command-line tool, or the API. If you try to update a table schema using a renamed column, the following error is returned:
BigQuery error in update operation: Provided Schema does not match Table project_id:dataset.table.
Even so, this tutorial demonstrates how to rename a column in BigQuery. We will use two methods to get the result we want:
- Renaming a column in BigQuery using queries
- Renaming a column in BigQuery by exporting and loading data into a new table
To make this tutorial accessible to everyone, we will also give a brief overview of BigQuery and its architecture.
Table of Contents
- What is Google BigQuery?
- BigQuery Architecture
- How to Rename a Column in BigQuery?
What is Google BigQuery?
BigQuery is a fully managed, serverless, enterprise-wide data warehouse-as-a-service for analytics. It lets data teams analyze information efficiently by building a logical data warehouse over columnar storage as well as data from object storage and spreadsheets.
Google BigQuery democratizes access to business insights by enabling businesses to make data-driven decisions, run analytics, and execute SQL queries over petabytes of data. Some key features of Google BigQuery are BigQuery ML, BigQuery GIS, BigQuery BI Engine, and Connected Sheets.
Built on top of the Dremel technology, Google BigQuery has a serverless architecture. It decouples storage from compute, using distinct clusters for storing and processing data. BigQuery differs from node-based cloud data warehousing solutions because it leverages technologies like Borg, Colossus, Jupiter, and Dremel to deliver optimum performance.
- Dremel: BigQuery executes user queries with the help of the Dremel query engine. It breaks queries into pieces and reassembles the results. Google Search, Google Ads, YouTube, and Gmail all use Dremel widely. To learn more about Dremel, read the Dremel paper published in 2010.
- Colossus: Colossus is Google’s latest-generation distributed file system. It manages replication, recovery, and distributed management. BigQuery uses the ColumnIO columnar storage format and compression algorithm, which lets it store and process large amounts of data efficiently.
- Borg: Borg is Google’s large-scale cluster management system. It is the brains behind operations and runs across tens of thousands of machines and hundreds of thousands of cores.
- Jupiter: Jupiter networking infrastructure is a powerful differentiator which sets apart Google’s cloud platform from the rest. It can deliver one petabit/sec of total bisection bandwidth.
Simplify Data Analysis With Hevo’s No-code Data Pipeline
Hevo Data is fully managed and completely automates the process of not only loading data from your desired source but also enriching the data and transforming it into an analysis-ready format without having to write a single line of code. Its fault-tolerant architecture ensures that the data is handled in a secured, consistent manner with zero data loss.
Check out why Hevo Data is the Best:
- Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
- Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
- Minimal Learning: Hevo, with its simple and interactive UI, is extremely simple for new customers to work on and perform operations.
- Hevo Is Built To Scale: As the number of sources and the volume of your data grows, Hevo scales horizontally, handling millions of records per minute with very little latency.
- Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
- Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
- Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular point in time.
How to Rename a Column in BigQuery?
There are two methods to rename a column in BigQuery:
- Method 1: Rename a column in BigQuery using queries
- Method 2: Rename a column in BigQuery by exporting and loading data into a new table
Let’s discuss them in detail.
Method 1: Rename a column in BigQuery using queries
This method requires selecting all the columns in the table and assigning an alias to each column you want to rename. The alias must follow BigQuery’s rules for column names. Listed below are some advantages and disadvantages of Method 1.
- Advantage: Writing the query results to a new destination table preserves the original data.
- Advantage: Overwriting the original table saves storage costs because you keep one table instead of two. However, this also means you lose the original data.
- Disadvantage: The query scan charges can be substantial if the table is large.
- Disadvantage: Writing the query results from the old table to a new destination table incurs storage costs for two tables.
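Because the alias has to follow BigQuery’s column naming rules, it can help to check a proposed name before running the query. The sketch below is only an approximation: it assumes the classic rules (letters, digits, and underscores only, starting with a letter or underscore, at most 300 characters, and no reserved prefix). The helper name `is_valid_alias` and the prefix list are illustrative and probably incomplete, so check the current BigQuery documentation before relying on it.

```python
import re

# Assumed "classic" BigQuery column-name rules; verify against the current docs.
_NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_MAX_LENGTH = 300
# Reserved prefixes known to the author of this sketch; the real list may be longer.
_RESERVED_PREFIXES = ("_TABLE_", "_FILE_", "_PARTITION")


def is_valid_alias(name: str) -> bool:
    """Return True if `name` passes the assumed column-name checks."""
    if len(name) > _MAX_LENGTH:
        return False
    # fullmatch rejects empty names, spaces, hyphens, and a leading digit.
    if not _NAME_PATTERN.fullmatch(name):
        return False
    # Prefixes are compared case-insensitively here (an assumption).
    return not name.upper().startswith(_RESERVED_PREFIXES)


for candidate in ("newcolumn_one", "1column", "new column", "_PARTITIONTIME"):
    print(candidate, is_valid_alias(candidate))
```

Of the four candidates, only `newcolumn_one` passes these checks.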
The example below shows a standard SQL query that selects all the data in mytable except the two columns to be renamed, and then uses aliases (as discussed above) to select those two columns again under their new names. We will rename column_one to newcolumn_one and column_two to newcolumn_two.
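When several columns need renaming, the same query can be generated from a mapping of old names to new names. The following is a minimal sketch, not an official API: `build_rename_query` is a hypothetical helper, and it inserts the names into the SQL text as-is, so it assumes they are trusted, valid identifiers.

```python
def build_rename_query(table: str, renames: dict[str, str]) -> str:
    """Build a SELECT * EXCEPT(...) query that aliases each renamed column.

    `renames` maps old column names to new ones. Names are interpolated
    without escaping, so only pass trusted identifiers.
    """
    if not renames:
        raise ValueError("renames must contain at least one column")
    excluded = ", ".join(renames)  # old names, dropped from SELECT *
    aliased = ", ".join(f"{old} AS {new}" for old, new in renames.items())
    return f"SELECT * EXCEPT({excluded}), {aliased} FROM {table}"


query = build_rename_query(
    "mydataset.mytable",
    {"column_one": "newcolumn_one", "column_two": "newcolumn_two"},
)
print(query)
```

Note that with this query shape the renamed columns appear after all the untouched columns in the result, so the column order of the table can change.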
Using the Cloud Console, the steps to rename a column in BigQuery look like this:
Step 1: Select Query editor in the Cloud Console.
Step 2: Enter the query below to select all the data in mydataset.mytable except the two columns being renamed, which are selected separately under their aliases. In short, the query renames column_one and column_two to newcolumn_one and newcolumn_two, respectively.
SELECT * EXCEPT(column_one, column_two), column_one AS newcolumn_one, column_two AS newcolumn_two FROM mydataset.mytable
Step 3: Now, click More and select Query settings.
Step 4: Check the “Set a destination table for query results” checkbox, present under the Destination section.
Step 5: Continue with the processes shown below:
- For Project name, leave the value set to the default project, which is the project that contains mydataset.mytable.
- For Dataset name, choose mydataset.
- In the Table name field, enter mytable, and click OK.
Step 6: In the “Destination table write preference” section, select “Overwrite table”.
Step 7: Click Save, then click Run in the Query editor. The new column names now appear in mytable.
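The Console steps above have a rough command-line equivalent: run the same query with a destination table and the overwrite option. The sketch below only assembles the `bq query` command as an argument list. It assumes the `bq` tool is installed and authenticated, and `build_bq_overwrite_command` is a hypothetical helper name. Running the command overwrites mydataset.mytable, so try it on a copy first.

```python
import shlex


def build_bq_overwrite_command(query: str, destination: str) -> list[str]:
    """Assemble a `bq query` call that writes results over `destination`."""
    return [
        "bq", "query",
        "--use_legacy_sql=false",               # use standard SQL
        f"--destination_table={destination}",   # same role as Steps 4 and 5
        "--replace",                            # same role as "Overwrite table"
        query,
    ]


QUERY = (
    "SELECT * EXCEPT(column_one, column_two), "
    "column_one AS newcolumn_one, column_two AS newcolumn_two "
    "FROM mydataset.mytable"
)

command = build_bq_overwrite_command(QUERY, "mydataset.mytable")
# Print a shell-safe version for review; pass `command` to subprocess.run to execute it.
print(shlex.join(command))
```

Building the command as a list rather than a single string avoids shell-quoting problems if it is later passed to `subprocess.run`.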
Method 2: Rename a column in BigQuery by exporting and loading data into a new table
This method begins by exporting the table to Cloud Storage, then loading the data into a new table with a new schema definition that contains the new column name. It is also possible to use the load job to overwrite the existing table.
- Advantage: BigQuery does not charge you for the export job or the load job.
- Advantage: Overwriting the original table saves storage costs because you keep one table instead of two. However, this also means you lose the original data.
- Disadvantage: Loading data into a new table incurs storage costs for two tables unless you delete the old table, in which case you lose the original data.
- Disadvantage: You are charged for storing the exported data in Cloud Storage.
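The export-and-load flow described above could be sketched as follows. Everything specific here is an assumption for illustration: the schema (including the `id` column and the types), the bucket `gs://my-bucket`, the file name `new_schema.json`, and the new table name `mytable_renamed` are hypothetical. The sketch also assumes a CSV export, where columns are matched by position, so the new schema can carry different names. CSV cannot represent nested or repeated fields, so this would not suit tables that contain them.

```python
import copy
import json


def rename_schema_fields(schema: list[dict], renames: dict[str, str]) -> list[dict]:
    """Return a copy of a JSON-style schema with top-level field names replaced."""
    new_schema = copy.deepcopy(schema)  # leave the caller's schema untouched
    for field in new_schema:
        field["name"] = renames.get(field["name"], field["name"])
    return new_schema


# Hypothetical schema, in the JSON shape used for bq schema files.
old_schema = [
    {"name": "id", "type": "INTEGER", "mode": "REQUIRED"},
    {"name": "column_one", "type": "STRING", "mode": "NULLABLE"},
    {"name": "column_two", "type": "STRING", "mode": "NULLABLE"},
]
new_schema = rename_schema_fields(
    old_schema, {"column_one": "newcolumn_one", "column_two": "newcolumn_two"}
)
print(json.dumps(new_schema, indent=2))  # save this output as new_schema.json

GCS_URI = "gs://my-bucket/mytable-*.csv"  # hypothetical bucket and file pattern

# Step 1: export the table to Cloud Storage as CSV (a header row is written by default).
extract_cmd = ["bq", "extract", "--destination_format=CSV",
               "mydataset.mytable", GCS_URI]

# Step 2: load into a new table, skipping the old header row and applying the new schema.
load_cmd = ["bq", "load", "--source_format=CSV", "--skip_leading_rows=1",
            "mydataset.mytable_renamed", GCS_URI, "new_schema.json"]
```

To overwrite the existing table instead of creating a new one, the load command would target mydataset.mytable and add the `--replace` flag.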
In this tutorial, we discussed how to rename a column in BigQuery using two standard methods. If you want to learn more and explore some extended use cases, such as how to change a column’s data type or column mode, the articles cited below can help you.
Nevertheless, in today’s data-driven business culture, it’s imperative to move past lackluster and tedious warehouse management routines. To help with that, Hevo offers a fully managed solution, Hevo Managed Google BigQuery, that enables you to manage Google BigQuery hassle-free.