Google BigQuery FIRST_VALUE and LAST_VALUE Functions Simplified 101

on BigQuery Functions, Data Analytics, Data Warehouse, Google BigQuery, Google BigQuery ML • January 19th, 2022 • Write for Hevo

BigQuery First_Value

Business relies on Data Analysis for insightful decisions. If your company has been existing for a while, you probably have huge buckets of data. Analyzing these functions can prove difficult if you still use basic spreadsheets. 

Google BigQuery is a Cloud Data Warehouse solution by Google to help businesses and developers query huge amounts of data within short periods. Nevertheless, beginners can find Google BigQuery difficult to maneuver, especially with advanced analytic functions. Two of these functions are Google BigQuery First_Value and Last_Value. 

In this article, you will read about Google BigQuery, its features, and understand the usage of Google BigQuery First_Value and Last_Value functions.

Table of Contents

Introduction to BigQuery

Google BigQuery Logo
Image Source

Google BigQuery is Google’s Data Warehouse for Big Data. This service enables users to store, analyze, and manage vast amounts of data at affordable costs. Google BigQuery is also filled with in-built features like business intelligence, federated queries, and machine learning that facilitate easy data processing. 

In addition, this Data Warehouse supports integration with clients’ libraries like Java, Python, and JavaScript. As such, Google BigQuery users who build their applications with any of these client libraries can easily transfer the data on their Google BigQuery account to their programs without any manual movement. 

Google BigQuery is highly scalable. You can start with the minimum memory size that the data warehouse offers, and upgrade your memory size as your business improves. Fortunately, Google BigQuery only charges you according to the number of petabytes you use per query. 

Since Google BigQuery is serverless, you don’t have to worry about owning or maintaining a server to use the service. All Google BigQuery users manage their data on the Google Cloud. 

Furthermore, Google BigQuery offers a variety of functions that its subscribers employ to query their data. Here are some Google Google BigQuery terms and functions that you should get familiar with in that regard:

  • WITH:  WITH is used to emulate a temporary table from certain values.
  • SELECT: SELECT indicates the columns that a query will return. 
  • FROM: The FROM statement shows the tables from which the query will return rows. 
  • ORDER BY: is used to indicate the order in which the query will return the results.
  • ASC | DESC: Directs the query to sort the result in ascending or descending order of the values
  • NULL: NULL points to empty values or ‘0’. When ASC is added in the query, null values appear first. In contrast, NULL values usually appear last in DESC. If you don’t want NULL values to appear in your query results, you can use the IGNORE NULL function.
  • RESPECT: RESPECT directs the query to include a value in the result. It is usually combined with the NULL statement.
  • WINDOW: A window refers to a group of rows where you can apply analytic functions.
  • OVER: This statement introduces the window for a particular function.
  • VALUE_Expression: This is the expression whose value the query is expected to return.
  • ANALYTIC FUNCTION: Analytic functions return one value for each in a table. Examples of analytic functions are first_value and last_value.
  • PARTITION BY: Based on an ASC or DESC command, PARTITION BY divides the rows into groups and places rows that share similar values close to each other. 

For instance, in the table: 

FruitNumber
Mango15
Banana20
Orange11
Pear20
Cashew11
Strawberry15
Blueberry20

If you indicate PARTITION BY number in your SELECT statement, the query will return the results as

Fruit Number
Orange11
Cashew11
Mango15
Strawberry15
Banan20
Pear20
Blueberry20

Key Features of Google BigQuery

A few features of Google BigQuery are listed below:

  • BI Engine: Google BigQuery comes with an in-memory analysis service that enables users to analyze large datasets in the Data Warehouse itself which makes the sub-query response time faster.
  • Fault-tolerant Structure: Google BigQuery ensures that your data remains protected in every condition. It prevents data loss and provides real-time logs for any error in an ETL process.
  • Encrypted Models: Google BigQuery ML allows users to encrypt Machine Learning models with the Customer-Managed Encryption Keys (CMEK).

To learn more about Google BigQuery, click here.

Simplify Data Analysis with Hevo’s No-code Data Pipeline

Hevo Data, a No-code Data Pipeline helps to load data from any data source such as Databases, SaaS applications, Cloud Storage, SDK,s, and Streaming Services and simplifies the ETL process. It supports 100+ data sources and is a 3-step process by just selecting the data source, providing valid credentials, and choosing the destination. Hevo not only loads the data onto the desired Data Warehouse/destination but also enriches the data and transforms it into an analysis-ready form without having to write a single line of code.

Get Started with Hevo for Free

Its completely automated pipeline offers data to be delivered in real-time without any loss from source to destination. Its fault-tolerant and scalable architecture ensures that the data is handled in a secure, consistent manner with zero data loss and supports different forms of data. The solutions provided are consistent and work with different BI tools as well.

Check out why Hevo is the Best:

  1. Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
  2. Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  3. Minimal Learning: Hevo, with its simple and interactive UI, is extremely simple for new customers to work on and perform operations.
  4. Hevo Is Built To Scale: As the number of sources and the volume of your data grows, Hevo scales horizontally, handling millions of records per minute with very little latency.
  5. Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
  6. Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, E-Mail, and support calls.
  7. Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular point in time.
Sign up here for a 14-Day Free Trial!

Understanding Google BigQuery First_Value and Last_Value

Google BigQuery First_Value and Last_VALUE cover image
Image Source

The Google BigQuery First_Value function is used to query the first row in a table. When you input this function in your query, it will return only one value. On the other hand, Google BigQuery Last_Value function returns a single value for the last row in your target table. 

Usually, Google BigQuery users apply Google BigQuery First_Value and Last_Value functions to arrange values in a table in ascending or descending order. For instance, you may use these functions to order the scores of a group of students to discover the most outstanding students.

The typical syntax for a Google BigQuery FIRST_VALUE statement is as follows:

FIRST_VALUE (expression [ IGNORE NULLS | RESPECT NULLS ] ) 
OVER [PARTITION BY expression_list ]
[ ORDER BY order_list frame_clause ])

The syntax for a Google BigQuery LAST_VALUE statement is as follows:

LAST_VALUE (expression [IGNORE NULLS | RESPECT NULLS]  )
OVER [PARTITION BY expression_list]

Now, let’s apply the Google BigQuery First_Value and Last_Value syntax in practice.

Using Google BigQuery First_Value and Last_VALUE

The following example of Google BigQuery First_Value and Last_Value outlines the highest scores across a list of student grades in a school.

FIRST_VALUE

If you are running a query for a Google BigQuery FIRST_VALUE function, it should look like this:

WITH students AS
(SELECT ‘Jane Kennedy’ as Name,
‘1’ as Class,
‘70’ as Score 
UNION ALL SELECT ‘Matthew Smith’ , ‘2’ , ‘81’
UNION ALL SELECT ‘Mary Lane’, ‘1’, ‘50’
UNION ALL SELECT ‘Harry Song’,  ‘3’ , ‘64’
UNION ALL SELECT ‘Isiah Matthews’,’1’ , ‘78’
UNION ALL SELECT ‘Joseph Meyers’ , ‘3’, ‘55’
UNION ALL SELECT ‘Andrew Crawford’ , ‘1’ , ‘67’
UNION ALL SELECT ‘Jamie Sylvester’ , ‘2’ , ‘ 57’
UNION ALL SELECT ‘Catherine Adams’ , ‘3’ , ‘48’
UNION ALL SELECT ‘Patrick Cain’ , ‘ 2’ , ‘49’
FROM (
SELECT Name, 
Class,
Score,
FIRST_VALUE  (Score)
OVER ( 
PARTITION BY Class ORDER BY Score DESC 
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) );

The above query of Google BigQuery First_Value will return the following results :

Name ClassScore First_Value
Harry Song36464
Joseph Meyers35564
Catherine Adams34864
Matthew Smith28181
Jamie Sylvester25781
Patrick Cain4981
Isiah Matthews17878
Jane Kennedy 17078
Andrew Crawford 16778
Mary Lane15078

LAST_VALUE

Here’s how to query your data for a Google BigQuery LAST_VALUE function:

WITH students AS
(SELECT ‘Jane Kennedy’ as Name,
‘1’ as Class,
‘70’ as Score 
UNION ALL SELECT ‘Matthew Smith’ , ‘2’ , ‘81’
UNION ALL SELECT ‘Mary Lane’, ‘1’, ‘50’
UNION ALL SELECT ‘Harry Song’,  ‘3’ , ‘64’
UNION ALL SELECT ‘Isiah Matthews’,’1’ , ‘78’
UNION ALL SELECT ‘Joseph Meyers’ , ‘3’, ‘55’
UNION ALL SELECT ‘Andrew Crawford’ , ‘1’ , ‘67’
UNION ALL SELECT ‘Jamie Sylvester’ , ‘2’ , ‘ 57’
UNION ALL SELECT ‘Catherine Adams’ , ‘3’ , ‘48’
UNION ALL SELECT ‘Patrick Cain’ , ‘ 2’ , ‘49’
FROM (
SELECT Name, 
Class,
Score,
LAST_VALUE (Score)
OVER ( 
PARTITION BY Class ORDER BY Score DESC 
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) );

The query above for Google BigQuery Last_Value will return the following results:

Name ClassScore LAST_VALUE
Harry Song36448
Joseph Meyers35548
Catherine Adams34848
Matthew Smith28149
Jamie Sylvester25749
Patrick Cain4949
Isiah Matthews17850
Jane Kennedy 17050
Andrew Crawford 16750
Mary Lane15050

Conclusion

In this article, you learnt about Google BigQuery, Google BigQuery First_Value, and Last_Value. You also read about the usage and how Google BigQuery First_Value and Last_Value functions work, you can easily spot important values amidst a large table. Google BigQuery First_Value and Last_Value make it easier for Data Scientists and Data Analysts to query data in Google BigQuery Data Warehouse.

Visit our Website to Explore Hevo

Companies store valuable data from multiple data sources into Google BigQuery. The manual process to transfer data from source to destination is a tedious task. Hevo Data is a No-code Data Pipeline that can help you transfer data from any data source to desired Google BigQuery. It fully automates the process to load and transform data from 100+ sources to a destination of your choice without writing a single line of code. 

Want to take Hevo for a spin? Sign Up here for a 14-day free trial and experience the feature-rich Hevo suite first hand.

Share your experience of learning about Google BigQuery First_Value and Last_Value in the comments section below!

No-code Data Pipeline For your Google BigQuery