One of the most common things in data analytics is running the same analytics queries over and over again by different end users over various times and snapshots of the data. This action, in particular, makes running warehousing solutions expensive and time-consuming at the same time.

How about, we store the results of such expensive queries and make them readily available for us, without having to run the query over the raw data whenever it is needed? This is exactly what Materialized Views in BigQuery does for us. BigQuery serves the pre-computed results, leading to faster performance and reduced query costs. You must generally use it for recurring queries with minimal changes to the underlying data.

In this blog, let’s dive deep into the complexities of Materialized Views in BigQuery to better understand them and determine when and when not to use them.

What is BigQuery Materialized View?

Every database has a feature known as Views, which are saved queries. A query that is run every time it is called. But this gets expensive if the saved query or say the definition of the view is complex and needs a lot of time and database resources to run. Thus, as the name suggests, a Materialized View in BigQuery materializes the data behind its definition. Unlike regular views, which execute their underlying SQL logic every time you query them, materialized views store the results in BigQuery’s storage layer. Thus, this approach reduces computational overhead and improves performance.

Key Characteristics of Materialized Views

  1. Incremental Updates: Only the data that has changed since the last computation is refreshed, making updates efficient.
  2. Performance Boost: Queries on materialized views are significantly faster as they read precomputed results.
  3. Cost-Effective: Since materialized views reduce the need for repetitive computations, they lower query costs.
  4. Read-Optimized: Ideal for dashboards or applications requiring low-latency results.
Simplify BigQuery Migration with Hevo!

Facing challenges migrating your data to BigQuery? Migrating your data can become seamless with Hevo’s no-code intuitive platform. With Hevo, you can:

  1. Automate Data Extraction: Effortlessly pull data from 150+ connectors(and other 60+ free sources).
  2. Transform Data effortlessly: Use Hevo’s drag-and-drop feature to transform data with just a few clicks.
  3. Seamless Data Loading: Quickly load your transformed data into your desired destinations, such as BigQuery.

Try Hevo and join a growing community of 2000+ data professionals who rely on us for seamless and efficient migrations. 

Get Started with Hevo for Free

Examples of Materialized Views in BigQuery

Let us take a look at where we can use Materialised views to enhance our analytics workload, with industry-based examples and use cases.

Scenario 1: E-Commerce Sales Insights

Base Table definition:

  1. Sales Table
#Column NameColumn Type
1.product_idSTRING (UUID)
2.categorySTRING
3.sales_amountFLOAT/DOUBLE
4.sale_dateDATE

Usual Query without Materialized Views:

SELECT  
  category,  
  SUM(sales_amount) AS total_sales,  
  COUNT(product_id) AS items_sold  
FROM  
  sales_data  
GROUP BY  
  category;  

Let us change it into a Materialized view:

CREATE MATERIALIZED VIEW mv_sales_by_category AS  
SELECT  
  category,  
  SUM(sales_amount) AS total_sales,  
  COUNT(product_id) AS items_sold  
FROM  
  sales_data  
GROUP BY  
  category;  

Query using Materialized View:

SELECT category, total_sales  
FROM mv_sales_by_category  
WHERE category = 'Electronics';  

Scenario 2: Tracking Marketing Campaign Performance

Base tables definition:

  1. Campaigns Table
#Column NameColumn Type
1campaign_idSTRING
2campaign_nameSTRING
3start_dateDATE
4end_dateDATE
5budgetFLOAT/DOUBLE
  1. Impressions Table
#Column NameColumn Type
1campaign_idSTRING
2.impression_dateDATE
3.impressions_countINT
  1. Clicks Table
#Column NameColumn Type
1.campaign_idSTRING
2. click_dateDATE
3.click_countINT
4.revenue_generatedFLOAT/DOUBLE

Usual Query without Materialized Views:

SELECT  
  c.campaign_id,  
  c.campaign_name,  
  SUM(i.impressions_count) AS total_impressions,  
  SUM(cl.clicks_count) AS total_clicks,  
  SUM(cl.revenue_generated) AS total_revenue,  
  SUM(cl.revenue_generated) / c.budget AS roi,  
  SUM(cl.clicks_count) / NULLIF(SUM(i.impressions_count), 0) AS ctr  
FROM  
  campaigns c  
LEFT JOIN  
  impressions i  
ON  
  c.campaign_id = i.campaign_id  
LEFT JOIN  
  clicks cl  
ON  
  c.campaign_id = cl.campaign_id  
WHERE  
  i.impression_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)  
  AND cl.click_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)  
GROUP BY  
  c.campaign_id, c.campaign_name, c.budget;  

Let us now create the Materialized view to precompute and store the results in BigQuery.

CREATE MATERIALIZED VIEW mv_campaign_performance AS  
SELECT  
  c.campaign_id,  
  c.campaign_name,  
  SUM(i.impressions_count) AS total_impressions,  
  SUM(cl.clicks_count) AS total_clicks,  
  SUM(cl.revenue_generated) AS total_revenue,  
  SUM(cl.revenue_generated) / c.budget AS roi,  
  SUM(cl.clicks_count) / NULLIF(SUM(i.impressions_count), 0) AS ctr,  
  MAX(i.impression_date) AS last_impression_date,  
  MAX(cl.click_date) AS last_click_date  
FROM  
  campaigns c  
LEFT JOIN  
  impressions i  
ON  
  c.campaign_id = i.campaign_id  
LEFT JOIN  
  clicks cl  
ON  
  c.campaign_id = cl.campaign_id  
GROUP BY  
  c.campaign_id, c.campaign_name, c.budget;  

Optimized Query using Materialized Views:

SELECT  
  campaign_name, total_impressions, total_clicks, total_revenue, roi, ctr  
FROM  
  mv_campaign_performance  
WHERE  
  roi > 1.5  
  AND total_impressions > 100000  
  AND last_impression_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY);  

Prerequisites for Using Materialized View in BigQuery

Before you can use this powerful feature, you need to make sure you fulfill the following pre-requisites:

  1. Google Cloud Account
  2. IAM Permissions
  • bigquery.tables.create
  • bigquery.dataEditor
  • bigquery.dataOwner
  • bigquery.admin
  1. Basic SQL Knowledge like  DDL, DML, and aggregations queries.
  2. Understand the nuances of Materialized View Query Structure and working mechanism

You can take a look at how you can get started with the BigQuery Create View command to get a better understanding of working with views in BigQuery.

How to Create BigQuery Materialized Views?

Now that you are aware of the prerequisites and the powerful features of BigQuery Materialized views. Lets us dive into step by step guide on how you can create and use a Materialized views in Google Cloud.

Step 1: Access the Google Cloud Console

Go to the Google Cloud Console and sign in with your Google account. Do create a new account if you don’t have already.

Step 2: Select Your Project

Once you are logged in, click on the project dropdown in the top navigation bar and select the project where you want to create the materialized view.

Select your project in BigQuery

Step 3: Navigate to BigQuery

Once you’re in the project you want to work on, in the left sidebar, scroll down and click on BigQuery. This will open the BigQuery console.

Open Google BigQuery Console

You should be able to see, the BigQuery console like this:

Google BigQuery Console

Step 4: Locate Your Dataset

In the Explorer panel, find the dataset where you would like to create the materialized view.

Or you can click on the +ADD button to create/add a new dataset to work on.

Step 5: Define Your Base Query

Let us now define a base query, the SQL query that will serve as the basis for your materialized view. This query can include various transformations, filters, and aggregations.

Example Base Query:

SELECT 
    c.customer_id,
    c.customer_name,
    SUM(oi.quantity * oi.price) AS total_sales,
    COUNT(DISTINCT o.order_id) AS total_orders
FROM 
    `bigquery_dataset.customers` AS c
JOIN 
    `bigquery_dataset.orders` AS o ON c.customer_id = o.customer_id
JOIN 
    `bigquery_dataset.order_items` AS oi ON o.order_id = oi.order_id
GROUP BY 
    c.customer_id, c.customer_name;

Step 6: Use the Base Query To Create a Materialized View

Click on the SQL Query button from the Create New Section of the console.

BigQuery Studio

Run the Materialized view query to create. 

Example:

CREATE MATERIALIZED VIEW `bigqueryproject.bigquery_dataset.sales_summary_mv` AS
SELECT 
    c.customer_id,
    c.customer_name,
    SUM(oi.quantity * oi.price) AS total_sales,
    COUNT(DISTINCT o.order_id) AS total_orders
FROM 
    `bigquery_dataset.customers` AS c
JOIN 
    `bigquery_dataset.orders` AS o ON c.customer_id = o.customer_id
JOIN 
    `bigquery_dataset.order_items` AS oi ON o.order_id = oi.order_id
GROUP BY 
    c.customer_id, c.customer_name;

Step 7: Query Materialized view 

Once you run the above query you can now use the created materialized view to query the underlying data by using the query below:

SELECT 
    customer_name,
    total_sales,
    total_orders
FROM 
    `bigqueryproject.bigquery_dataset.sales_summary_mv`
WHERE 
    total_sales > 1000;
Load your Data from MySQL to BigQuery
Connect your Data from HubSpot to BigQuery
Replicate your Data from BigQuery to Snowflake

Use Case of Materialized Views

There are various use cases in which engineers use the materialized view. Let us discuss a few use cases where it is used the most.

  1. Pre-Aggregating Data: Instead of aggregating data every time it is required, materialized views cache aggregated data for future use.
    Scenario: Calculate daily sales totals from a streaming sales dataset.
  2. Pre-Filtering Data: Materialized views are generally used to store a frequently requested subset of a table’s data based on filtering techniques.
    Scenario: Filter transactions for a specific product category.
  3. Pre-Joining Data: SQL joins are very expensive as they require a shuffle of data, thus materialized views are used to store pre-joined datasets to optimize performance.
    Scenario: Joining customer information and transaction data for quick access to the complete picture.
  4. Pre-Clustering Data: Materialized views can reorganize data using a clustering strategy better suited for specific query patterns.
    Scenario: Speed up time-series analysis of web server logs.

Limitations of Materialized Views

Materialized Views is a powerful tool, however, it does come with some limitations. Let us now look at some limitations that we need to consider while working with it.

  1. Storage Costs: Materialized Views store the cached data, and thus incur cost based on the size of cached data.
  2. Limited Flexibility: You will have to re-create and run the entire view if the underlying query logic changes at some point.
  3. Update Overhead: Frequent changes to source tables can lead to higher refresh costs.
  4. Query Constraints: Materialized views cannot reference other views or temporary tables.

You can also take a look how you can easily work with Databricks Materialized Views and Snowflake Materialized Views if you are looking for alternatives of BigQuery.

Conclusion

To summarize, Materialized Views in BigQuery is a very powerful tool for optimizing query performance and reducing costs in scenarios that involve repeated queries. From precomputing results to leveraging incremental updates, they provide a seamless experience for data analysts and engineers. 

However, they come with their own set of limitations and best-use cases, making it essential to evaluate your specific needs before implementation.

If you want to integrate your data from various sources and load it in Google BigQuery, then try Hevo.

Sign up for a 14-day free trial and simplify your data integration process. Check out the pricing details to understand which plan fulfills all your business needs.

Frequently Asked Questions (FAQs)

1. What is the difference between a view and a materialized view in BigQuery?

A view executes its query every time it’s accessed, fetching live data. However, a materialized view stores pre-computed results, making repeated queries faster and more cost-effective.

2. Why use a materialized view instead of a table?

Materialized views are updated incrementally, reducing the overhead of managing a table manually. Thus, making it ideal for precomputing frequently queried data.

3. What is the difference between a scheduled query and a materialized view?

A scheduled query runs periodically, storing results in a table—however, a materialized view updates automatically when the underlying data changes and serves pre-computed results.

4. What is the difference between a materialized view and a normal view?

A Materialized view stores results physically in storage. However, a Normal view only stores the query logic and computes results on demand.

Raju Mandal
Senior Data Engineer

Raju is a Certified Data Engineer and Data Science & Analytics Specialist with over 8 years of experience in the technical field and 5 years in the data industry. He excels in providing end-to-end data solutions, from extraction and modeling to deploying dynamic data pipelines and dashboards. His enthusiasm for data architecture and visualization motivates him to create informative technical content that simplifies complicated concepts for data practitioners and business leaders.