The Common table expressions, commonly known as CTEs in SQL Server, is a tool that allows users to design and arrange queries. It has faster development, troubleshooting, and performance improvement.

Common table expressions are a functional feature in SQL that will enable you to perform multi-step and complex transformations in a single easy-to-read query. Because of their readability and flexibility, CTEs are a useful tool for beginners and experts alike.

CTE in SQL Server provides users with a convenient way to query data faster and make it more readable. It is similar to using SQL commands like temp tables and sub-queries.

With CTEs (also called SQL WITH clauses), you have quick troubleshooting options, as you can create multiple statements and run select statements inside the CTE.

Prerequisites

  • Understanding of SQL

Introduction to BigQuery

A good understanding of BigQuery architecture is helpful when implementing various queries, including controlling costs, optimizing query performance, and optimizing storage. 

Introduction to Common Table Expressions (CTEs)

  • CTE is a temporary dataset returned by a query, which is then used by another query. It’s temporary because the system does not store the results anywhere; it exists only when you run the query.
  • The Common Table Expressions (CTE) is used in standard SQL to simplify various SQL Query classes for which a derived table was unsuitable.
  • You can reference a CTE within the INSERT, SELECT, UPDATE, CREATE or DELETE statement in SQL. As of SQL Server 2008, you can now add a CTE to the MERGE statement.

Here is a standard CTE syntax:

BigQuery WITH: CTE Syntax
Image Source

There are two types of CTEs:

1. Non-recursive CTEs

Non-Recursive CTEs are when you don’t use any recursion or repeated processing in a sub-routine. The following example creates a simple Non-Recursive CTE to display the row number from 1 to 10:

Pro Tip:- Derive Meaningful Insights from Your Unified Data on BigQuery

Integrate Oracle to BigQuery
Integrate PostgreSQL to BigQuery
Integrate MongoDB to BigQuery
Integrate Salesforce to BigQuery

2. Recursive CTEs

Recursive CTEs use repeated procedural loops, and the recursive query calls itself until the query satisfies the condition. In a recursive CTE, there must be a ‘where’ condition to terminate the recursion:

BigQuery WITH: Recursive CTEs
Image Source

Understanding the Importance of CTEs

Here is why CTEs can play a vital role in your business workflow:

  • Increase readability: Using a CTE can increase the readability of the SQL query. This is especially useful in a large query and those who use complicated logic in the column. You can move the complex logic to the temporary CTE to make the overall query simple.
  • Use it like SQL View: Even though CTEs is temporary, it is named result set. It also has columns from the original table. Unlike SQL Views that are saved in the database, you can use CTEs as Views to avoid unnecessary metadata. 
  • Use a recursion or hierarchical query: SQL With Clause allows you to write a recursive query or a hierarchical query, which is a question that refers to the previous rows from the same query. 

WITH clause

The WITH clause consists of a single or multiple common table expressions (CTEs). The CTE connects the results of a subquery to your table name. This will be useful wherever there is the same query expression.

The syntax is given below:

WITH cte[, ...]

BigQuery data warehouse does not materialize the outcome of non-recursive CTEs in the WITH. When a non-recursive CTE is referenced in more than one time in a query, the CTE is executed once for each reference. Therefore, a WITH clause helps to break down complex queries.

Understanding How to Write a BigQuery WITH Statement

CTEs are also called SQL WITH clauses because CTEs necessarily use the WITH keyword to define sub-queries. It’s sometimes called sub-query factoring because you are improving the design of an SQL query.

Common table expressions (CTEs) help you break up complex queries in BigQuery. Here’s a simple query to illustrate how to write a BigQuery WITH statement:

BigQuery WITH: Query Example
Image Source

Tips for writing BigQuery WITH Statements

  • A single SELECT, UPDATE, DELETE, or INSERT statement that references some or all the BigQuery WITH columns must follow the BigQuery WITH statement. You can also specify a BigQuery WITH statement in a CREATE VIEW statement as part of the defining SELECT statement of the view.
  • You can define multiple BigQuery WITH query definitions in a non-recursive CTE. These set operators: UNION ALL, UNION, INTERSECT, or EXCEPT must combine the definitions in a BigQuery WITH clause.
  • A BigQuery WITH statement can reference itself and previously defined CTEs in the same BigQuery WITH clause. However, forward referencing is not allowed for a BigQuery WITH statement.
  • You can’t specify more than one BigQuery WITH clause in a statement. For example, if a CTE_query_definition contains a sub-query, that subquery cannot contain a nested BigQuery WITH clause that defines another CTE.
  • The statement of recursive CTE must contain at least two BigQuery WITH statement definitions: a recursive member and an anchor member. Defining a BigQuery WITH clause simply means writing a SELECT query, giving a result you want to use within another query. Some members of the anchors and recursive members can be defined. However, all definitions of queries of anchor members must be placed before the definition of the first recursive member. All BigQuery WITH definitions is an anchor member unless they refer to the BigQuery WITH statement itself.
  • One of the set operators (UNION ALL, UNION, INTERSECT, or EXCEPT) must combine anchor members. UNION ALL is the only operator set permitted between the last anchors and first recursive member and when connecting several recursive members in a BigQuery WITH statement.

Considerations

  1. WITH clause helps to resolve your problem by computing an intermediate outcome used in the queries. On the other hand, you can have a table for saving the query result. 
  2. You can use the queries containing WITH clause in another query having FROM clause. This is similar to how it is done in Oracle to define a table function.Still,the query created is complex to understand, even if the end outcome is the same. Y
  3. ou can also add more subquery that can be combined as a chain. With this, you can have intermediate results required to answer the query.
  4. You can either keep the queries before processing in WITH clause without moving data to store in a table. Or, you can store the en result of the preprocessing query in a table.
  5. This will help you query that quickly in future. Because, preprocessing and joins in the query would be already done. And, you can share the data table with others as well. 
  6. A limitation of proceeding this way is that you might require a subset of the data by consistently adding the clause WHERE which can impact performance.
  7. So, you can use WITH clause faster. Thanks to the push-down feature in BigQuery,  WITH clause is more efficient than a result saved in an indexed table.
  8. When your query having WITH clause doesn’t have a WHERE filter, BigQuery adds the WHERE clause filter of the following query along with the WITH statement for optimizing the process.
  9. Also, the problem with the permanent table is that, you lose the calculations done in the various steps.
  10. This indicatres that you have to delete it and make it again whenever there is any changing the query in the beginning.

Conclusion

  1. CTEs in SQL allow users to create a temporarily named query that helps increase their query readability. CTEs in SQL is implemented using the WITH clauses.
  2. As you can see, whether recursive or non-recursive, CTE can be a useful tool when you need to produce a temporary set of results that can be accessed in the SELECT, UPDATE, DELETE, JOIN, or INSERT statement. CTE is like a derived table in a certain sense:
  3. it is not stored as an object and is only valid while implementing the main statement. However, unlike derivative tables, CTEs can be referenced several times in the query.
Osheen Jain
Technical Content Writer, Hevo Data

Osheen is a seasoned technical writer with over a decade of experience in the data industry. She specializes in writing about B2B, technology, finance, and SaaS domains. Her passion for simplifying intricate technical concepts has established her as a respected expert in the field, making her an invaluable resource for those looking to deepen their understanding of data science.

No-code Data Pipeline for BigQuery