The Common table expressions, commonly known as CTEs in SQL Server, is a tool that allows users to design and arrange queries. It has faster development, troubleshooting, and performance improvement. Common table expressions are a functional feature in SQL that will enable you to perform multi-step and complex transformations in a single easy-to-read query. Because of their readability and flexibility, CTEs are a useful tool for beginners and experts alike.

CTE in SQL Server provides users with a convenient way to query data faster and make it more readable. It is similar to using SQL commands like temp tables and sub-queries. With CTEs (also called SQL WITH clauses), you have quick troubleshooting options, as you can create multiple statements and run select statements inside the CTE. Read on to find more about BigQuery WITH statements, their syntax, benefits, and tips to write them. 

Prerequisites

  • Understanding of SQL

Introduction to BigQuery

BigQuery WITH: BigQuery Logo
Image Source

Google’s BigQuery is a cloud-native, enterprise-grade data warehouse that was first launched in 2012. Since its inception, BigQuery has evolved into a more economical and fully managed data warehouse that enables users to run blazing fast, interactive, ad hoc queries on petabyte-scale datasets. In addition, BigQuery now integrates with various Google Cloud Platform (GCP) services and third-party tools, making it more valuable.

Since BigQuery is a serverless platform, there are no servers to manage or database software to install. BigQuery service contains underlying software and infrastructure, including high availability, flexible pricing, and scalability. 

BigQuery exposes a simple client interface where users can run interactive queries if they know little about its underlying architecture. To get started with BigQuery, you must import your data into BigQuery and then write your queries using SQL dialects offered by BigQuery. A good understanding of BigQuery architecture is helpful when implementing various queries, including controlling costs, optimizing query performance, and optimizing storage. 

Introduction to Common Table Expressions (CTEs)

A common table expression or CTE was introduced in the fourth SQL revision, SQL:1999. It is a relatively new SQL feature. SQL Server first introduced them in 2005, and then PostgreSQL and MySQL made them available starting with Version 8.4 in 2009 and Version 8.0. in 2018, respectively. 

CTE is a temporary dataset returned by a query, which is then used by another query. It’s temporary because the system does not store the results anywhere; it exists only when you run the query. The Common Table Expressions (CTE) is used in standard SQL to simplify various SQL Query classes for which a derived table was unsuitable. You can reference a CTE within the INSERT, SELECT, UPDATE, CREATE or DELETE statement in SQL. As of SQL Server 2008, you can now add a CTE to the MERGE statement.

Here is a standard CTE syntax:

BigQuery WITH: CTE Syntax
Image Source

There are two types of CTEs:

1. Non-recursive CTEs

Non-Recursive CTEs are when you don’t use any recursion or repeated processing in a sub-routine. The following example creates a simple Non-Recursive CTE to display the row number from 1 to 10:

BigQuery WITH: Non-recursive CTEs
Image Source

2. Recursive CTEs

Recursive CTEs use repeated procedural loops, and the recursive query calls itself until the query satisfies the condition. In a recursive CTE, there must be a ‘where’ condition to terminate the recursion:

BigQuery WITH: Recursive CTEs
Image Source
Simplify your Data Analysis with Hevo’s No-code Data Pipeline

A fully managed No-code Data Pipeline platform like Hevo Data helps you integrate data from 150+ data sources (including 40+ Free Data Sources) to a destination of your choice like Google BigQuery in real-time in an effortless manner. Hevo with its minimal learning curve can be set up in just a few minutes allowing the users to load data without having to compromise performance. Its strong integration with umpteenth sources allows users to bring in data of different kinds in a smooth fashion without having to code a single line. 

Get Started with Hevo for Free

Check out some of the cool features of Hevo:

  • Completely Automated: The Hevo platform can be set up in just a few minutes and requires minimal maintenance.
  • Transformations: Hevo provides preload transformations through Python code. It also allows you to run transformation code for each event in the Data Pipelines you set up. You need to edit the event object’s properties received in the transform method as a parameter to carry out the transformation. Hevo also offers drag and drop transformations like Date and Control Functions, JSON, and Event Manipulation to name a few. These can be configured and tested before putting them to use.
  • Connectors: Hevo supports 100+ integrations to SaaS platforms, files, databases, analytics, and BI tools. It supports various destinations including Salesforce CRM, Google BigQuery, Amazon Redshift, Firebolt, Snowflake Data Warehouses; Amazon S3 Data Lakes; and MySQL, MongoDB, TokuDB, DynamoDB, PostgreSQL databases to name a few.  
  • Real-Time Data Transfer: Hevo provides real-time data migration, so you can have analysis-ready data always.
  • 100% Complete & Accurate Data Transfer: Hevo’s robust infrastructure ensures reliable data transfer with zero data loss.
  • Scalable Infrastructure: Hevo has in-built integrations that can help you scale your data infrastructure as required.
  • 24/7 Live Support: The Hevo team is available round the clock to extend exceptional support to you through chat, email, and support calls.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  • Live Monitoring: Hevo allows you to monitor the data flow so you can check where your data is at a particular point in time.
Sign up here for a 14-Day Free Trial!

Understanding the Importance of CTEs

Here is why CTEs can play a vital role in your business workflow:

  • Increase readability: Using a CTE can increase the readability of the SQL query. This is especially useful in a large query and those who use complicated logic in the column. You can move the complex logic to the temporary CTE to make the overall query simple.
  • Use it like SQL View: Even though CTEs is temporary, it is named result set. It also has columns from the original table. Unlike SQL Views that are saved in the database, you can use CTEs as Views to avoid unnecessary metadata. 
  • Use a recursion or hierarchical query: SQL With Clause allows you to write a recursive query or a hierarchical query, which is a question that refers to the previous rows from the same query. 

WITH clause

The WITH clause consists of a single or multiple common table expressions (CTEs). The CTE connects the results of a subquery to your table name. This will be useful wherever there is the same query expression.

The syntax is given below:

WITH cte[, ...]

BigQuery data warehouse does not materialize the outcome of non-recursive CTEs in the WITH. When a non-recursive CTE is referenced in more than one time in a query, the CTE is executed once for each reference. Therefore, a WITH clause helps to break down complex queries.

Understanding How to Write a BigQuery WITH Statement

CTEs are also called SQL WITH clauses because CTEs necessarily use the WITH keyword to define sub-queries. It’s sometimes called sub-query factoring because you are improving the design of an SQL query.

Common table expressions (CTEs) help you break up complex queries in BigQuery. Here’s a simple query to illustrate how to write a BigQuery WITH statement:

BigQuery WITH: Query Example
Image Source

Tips for writing BigQuery WITH Statements

Here are a few tips you can use to write BigQuery WITH statements:

  • A single SELECT, UPDATE, DELETE, or INSERT statement that references some or all the BigQuery WITH columns must follow the BigQuery WITH statement. You can also specify a BigQuery WITH statement in a CREATE VIEW statement as part of the defining SELECT statement of the view.
  • You can define multiple BigQuery WITH query definitions in a non-recursive CTE. These set operators: UNION ALL, UNION, INTERSECT, or EXCEPT must combine the definitions in a BigQuery WITH clause.
  • A BigQuery WITH statement can reference itself and previously defined CTEs in the same BigQuery WITH clause. However, forward referencing is not allowed for a BigQuery WITH statement.
  • You can’t specify more than one BigQuery WITH clause in a statement. For example, if a CTE_query_definition contains a sub-query, that subquery cannot contain a nested BigQuery WITH clause that defines another CTE.
  • The statement of recursive CTE must contain at least two BigQuery WITH statement definitions: a recursive member and an anchor member. Defining a BigQuery WITH clause simply means writing a SELECT query, giving a result you want to use within another query. Some members of the anchors and recursive members can be defined. However, all definitions of queries of anchor members must be placed before the definition of the first recursive member. All BigQuery WITH definitions is an anchor member unless they refer to the BigQuery WITH statement itself.
  • One of the set operators (UNION ALL, UNION, INTERSECT, or EXCEPT) must combine anchor members. UNION ALL is the only operator set permitted between the last anchors and first recursive member and when connecting several recursive members in a BigQuery WITH statement.

Considerations

WITH clause helps to resolve your problem by computing an intermediate outcome used in the queries. On the other hand, you can have a table for saving the query result. 

You can use the queries containing WITH clause in another query having FROM clause. This is similar to how it is done in Oracle to define a table function.Still,the query created is complex to understand, even if the end outcome is the same. You can also add more subquery that can be combined as a chain. With this, you can have intermediate results required to answer the query.

You can either keep the queries before processing in WITH clause without moving data to store in a table. Or, you can store the en result of the preprocessing query in a table. This will help you query that quickly in future. Because, preprocessing and joins in the query would be already done. And, you can share the data table with others as well. 

A limitation of proceeding this way is that you might require a subset of the data by consistently adding the clause WHERE which can impact performance. So, you can use WITH clause faster. Thanks to the push-down feature in BigQuery,  WITH clause is more efficient than a result saved in an indexed table. When your query having WITH clause doesn’t have a WHERE filter, BigQuery adds the WHERE clause filter of the following query along with the WITH statement for optimizing the process.

Also, the problem with the permanent table is that, you lose the calculations done in the various steps. This indicatres that you have to delete it and make it again whenever there is any changing the query in the beginning.

Conclusion

CTEs in SQL allow users to create a temporarily named query that helps increase their query readability. CTEs in SQL is implemented using the WITH clauses. As you can see, whether recursive or non-recursive, CTE can be a useful tool when you need to produce a temporary set of results that can be accessed in the SELECT, UPDATE, DELETE, JOIN, or INSERT statement. CTE is like a derived table in a certain sense: it is not stored as an object and is only valid while implementing the main statement. However, unlike derivative tables, CTEs can be referenced several times in the query.

In addition, unlike a sub-query, CTEs can refer themselves. And best of all, CTEs are relatively easy to implement.

Extracting complex data from a diverse set of data sources to carry out an insightful analysis can be challenging, and this is where Hevo saves the day! Hevo offers a faster way to move data from Databases or SaaS applications to be visualized in a BI tool for free. Hevo is fully automated and hence does not require you to code.

Visit our Website to Explore Hevo

Want to take Hevo for a spin? Sign Up for the 14-day free trial and experience the feature-rich Hevo suite first hand. You can also have a look at the unbeatable pricing that will help you choose the right plan for your business needs.

Osheen Jain
Freelance Technical Content Writer, Hevo Data

Osheen has extensive experience in freelance writing within the data industry and is passionate about simplifying the complexities of data integration and data analysis through informative content for those delving deeper into these subjects.

No-code Data Pipeline for BigQuery