Working with PostgreSQL Subquery | A 101 Guide

• July 15th, 2022

PostgreSQL Subquery_FI

While writing queries in PostgreSQL, there would be some situations where you would be tired of selecting all the records and then filtering them. And also, filtering them would require writing numerous lines of unnecessary and redundant code. 

This is where PostgreSQL Subquery comes to your rescue. PostgreSQL Subquery can reduce the lines of your code by multifold and you can easily filter out your data based on any condition you desire. 

This article will help you comprehensively understand the overall usage and need of the PostgreSQL Subquery. You will also come across the different kinds of use cases and query examples of PostgreSQL Subquery.

Table of Contents

What is a PostgreSQL Subquery?

A PostgreSQL Subquery is basically a query nested inside a larger query.

The rules and guidelines that should be kept in mind while using PostgreSQL Subqueries are as follows:

  • PostgreSQL Subqueries should be enclosed within parentheses.
  • If you’re using a PostgreSQL Subquery with the SELECT clause, then you can have only one column along with it unless multiple columns are in the main query for the subquery to compare its selected columns.
  • An ORDER BY clause cannot be used in a PostgreSQL Subquery, although the main query can use an ORDER BY. The GROUP BY can be used to perform the same function as the ORDER BY in a subquery.
  • PostgreSQL Subqueries that return more than one row can only be used with multiple value operators, such as the IN, EXISTS, ALL, NOT IN, ANY, SOME, etc.
  • The BETWEEN operator cannot be used with a subquery. Rather, it can be used within the subquery.
  • You can only use single-row operators with single-row PostgreSQL Subqueries, and multiple-row operators with multiple-row PostgreSQL Subqueries.
  • When using certain comparison operators in a WHERE clause, if a subquery (inner query) returns a null value to the parent query (outer query), then the outer query won’t return any rows.

Why do you use PostgreSQL Subqueries?

A PostgreSQL Subquery is used to return data that will be utilized as a condition in the main query to further limit the data to be returned. You may also use it when the desired result demands more than one query and each subquery returns a subset of the table involved in the query.

Scale your data integration effortlessly with Hevo’s Fault-Tolerant No Code Data Pipeline

As the ability of businesses to collect data explodes, data teams have a crucial role to play in fueling data-driven decisions. Yet, they struggle to consolidate the scattered data in their warehouse to build a single source of truth. Broken pipelines, data quality issues, bugs and errors, and lack of control and visibility over the data flow make data integration a nightmare.

1000+ data teams rely on Hevo’s Data Pipeline Platform to integrate data from over 150+ sources in a matter of minutes. Billions of data events from sources as varied as SaaS apps, Databases, File Storage and Streaming sources can be replicated in near real-time with Hevo’s fault-tolerant architecture. What’s more – Hevo puts complete control in the hands of data teams with intuitive dashboards for pipeline monitoring, auto-schema management, and custom ingestion/loading schedules. 

All of this combined with transparent pricing and 24×7 support makes us the most loved data pipeline software on review sites.

Take our 14-day free trial to experience a better way to manage data pipelines.

Get started for Free with Hevo!

PostgreSQL Subquery: Overview

PostgreSQL Subquery can accommodate a variety of use cases. These are as follows:

I) With different Statements

You can use a PostgreSQL Subquery along with different statements based on whether you want to query, insert, update or delete data from your table(s).

The statements along which you can use PostgreSQL Subquery are as follows:

1) SELECT statement

While retrieving information if you want to restrict your data set then you can fulfill this by creating a PostgreSQL Subquery inside the main query.

A) Syntax
SELECT column_name [, column_name ]
FROM   table1 [, table2 ]
WHERE  column_name OPERATOR
      (SELECT column_name [, column_name ]
      FROM table1 [, table2 ]
      [WHERE])

In this case, the query written inside parentheses is the Subquery or the nested query.

B) Example
SELECT * FROM EMPLOYEE
   WHERE ID IN (SELECT ID
      FROM EMPLOYEE
      WHERE SALARY > 50000);

This query will display all the information about the employees from the Employee table whose salary would be greater than 50,000.

The output would be similar to

PostgreSQL Subquery output when used with SELECT statement
Image Source: Self

2) INSERT statement

You can create a PostgreSQL Subquery to help you insert data in a table (or view) on the basis of the data returned by the subquery. You can also modify the data queried by the subquery by using any of the character, date, or number functions.

A) Syntax
INSERT INTO table_name [ (column_1 [, column_2 ]) ]
   SELECT [ *|col_1 [, col_2 ] ]
   FROM table_1 [, table_2 ]
   [ WHERE VALUE OPERATOR ]

In this case, you are inserting values inside the table_name table on the basis of the data retrieved by using a PostgreSQL Subquery.

B) Example 
INSERT INTO EMPLOYEE_HANDBOOK
   SELECT * FROM EMPLOYEE
   WHERE ID IN (SELECT ID
      FROM EMPLOYEE);

This query will insert all the information about the employees into the EMPLOYEE_HANDBOOK table from the EMPLOYEE table on the basis of their ID.

3) UPDATE statement

You can update one or more columns in a table on the basis of data returned by a PostgreSQL Subquery. 

A) Syntax
UPDATE table_n1
SET col_name = new_value
[ WHERE OPERATOR [ VALUE ]
(SELECT COL_NAME
FROM TAB_NAME)
[ WHERE) ]

In this case, you are updating the values of the records in the column col_name of the tab_n1 table on the basis of some condition written in the PostgreSQL Subquery. 

2) Example
UPDATE ORG
   SET SALARY = SALARY * 0.75
   WHERE AGE IN (SELECT AGE FROM EMPLOYEE
      WHERE AGE >= 29 );

This query will update the salary of the employees in the ORG table by 3/4th on the basis of whose age is greater than or equal to 29 from the EMPLOYEE table.

4) DELETE statement

You can delete one or more columns in a table on the basis of data returned by a PostgreSQL Subquery. 

A) Syntax
DELETE FROM TAB_NAME
[ WHERE OPERATOR [ VALUE ]
(SELECT COL_NAME
FROM TABLE_NAME)
[ WHERE) ]

In this case, you are deleting all the records in the TAB_NAME table on the basis of some condition written in the PostgreSQL Subquery. 

B) Example 
DELETE FROM EMPLOYEE_HANDBOOK
   WHERE AGE IN (SELECT AGE FROM EMPLOYEE
      WHERE AGE > 27 );

This query will delete all the records in the EMPLOYEE_HANDBOOK table whose age is greater than 27.

II) With different Clauses

You can use a PostgreSQL Subquery along with different statements based on whether you want to query, insert, update or delete data from your table(s).

The clause along which you can use PostgreSQL Subquery is as follows:

1) SELECT clause

While performing any CRUD operation in PostgreSQL, you can use a PostgreSQL Subquery in a SELECT statement.

A) Syntax

The overall syntax would be:

SELECT…., (SELECT...) FROM table_name…;
B) Example

To have a detailed idea, you can follow the given query:

SELECT b1.bike_name, b1.bike_model,  
    (SELECT MIN (bike_id)  
     FROM bike b2  
     WHERE b1.bike_id = b2.bike_id) //Subquery 
FROM bike b1; 

The query written inside the parentheses is the PostgreSQL Subquery which is used along with the SELECT clause to find the minimum bike id. The MIN() is an aggregate function that is used in the Subquery to filter out the records.

2) FROM clause

While writing queries in PostgreSQL, you can use a PostgreSQL Subquery in a SELECT statement’s FROM clause.

A) Syntax

The overall syntax would be:

SELECT...FROM (Subquery) [AS] name…;

Since every table in a FROM clause must have a name, the [AS] name clause is required. Column names in the subquery select list must be unique.

B) Example

To have a detailed idea, you can follow the given query:

select A.id, A.name, A.dept, A.salary FROM
(select avg(budget) as avg_budget from DEPARTMENT) as TARGET T, INSTRUCTOR as A
where A.salary > T.avg_budget;

The query written inside the parentheses is the PostgreSQL Subquery which is treated as a virtual table and the data it returns is fed to a buffer table called TARGET.

3) WHERE clause

While performing any CRUD operation in PostgreSQL, you can use a PostgreSQL Subquery along with a SELECT clause.

A) Syntax

The overall syntax would be:

SELECT … FROM table_name WHERE col_name IN (Subquery);

Here, Subquery is a selected range used in the WHERE clause for setting a specific condition.

B) Example

To have a detailed idea, you can follow the given query:

SELECT p.prod_id, p.prod_name
FROM product p
WHERE p.cat_id IN
   (SELECT c.cat_id
    FROM category c
    WHERE c.cat_id > 30
    AND c.cat_name like 'M%');

The query written inside the parentheses is the PostgreSQL Subquery which is used along with the WHERE IN clause to set the condition for retrieving the records.

III) With different Conditions

You can use a PostgreSQL Subquery along with different condition statements.

The different conditions used with PostgreSQL Subquery are as follows:

1) EXISTS condition

You can use a PostgreSQL Subquery along with the EXISTS clause. The EXISTS clause only keeps the number of rows returned by the subquery.

A) Syntax

The basic syntax is as follows:

……
EXISTS   
(SELECT col_name, [column_name|expressions]   
FROM table_name   
WHERE condition);

The PostgreSQL Subquery is an input of the EXISTS condition. The EXISTS condition will return TRUE if the Subquery returns any row. And if the subquery returns no results, the EXISTS condition will return FALSE.

B) Example
SELECT *
FROM customer
WHERE EXISTS 
  (SELECT *
   FROM order
   WHERE customer.cus_id = order.cus_id);

This query will return all the information of those customers who have made any orders.

2) IN condition

You can use a PostgreSQL Subquery along with the IN condition. In this case, you are creating the subquery with a WHERE clause and IN condition.

A) Syntax

The basic syntax is as follows:

Here, the PostgreSQL Subquery is used to set a specific condition by using it with the WHERE clause and IN condition.

B) Example 
SELECT emp_name
FROM Employee
WHERE Department IN
       (SELECT dept_no
       FROM Department
       WHERE floor = 5);

This query will return all the names of the employees whose departments are on the 5th floor.

IV) For doing Comparisons

You can use a PostgreSQL Subquery before or after any of the comparison operators.  The subquery will return at most one value. The value will be the result of an arithmetic expression or a column function. Your query then compares the subquery result to the value on the other side of the comparison operator. 

The different comparison operators that can be used are as follows:

PostgreSQL Subquery supports these comparison operators
Image Source: Self

Example

SELECT emp_id,f_name,l_name,salary
FROM Employee
WHERE salary > 
(SELECT AVG(salary) FROM Employee); 

In this example, the records are displayed on the basis of comparing the salary of all the employees.

V) PostgreSQL Subquery with ALL, ANY, and SOME Operators

You can use PostgreSQL Subquery with ALL, ANY, and SOME Operators.

1) ALL Operator

When the ALL operator is used along with a PostgreSQL Subquery, it will help you compare the value in the main query to every value returned by the subquery.

A) Syntax

The basic syntax is as follows:

The parenthesized subquery on the right will return exactly one column. Using the given operator, the left-hand expression is evaluated and compared to each row of the subquery result, which must return a Boolean result.

  • The subquery will return true if all rows return true (including the case where the subquery returns no rows).
  • If any false results are detected, the result will be false.
  • If the comparison does not return false for any row and returns NULL for at least one row, the result is NULL.

B) Example 

SELECT dept_id, AVG(salary) 
FROM EMPLOYEE GROUP BY dept_id 
HAVING AVG(salary)>=ALL 
(SELECT AVG(Salary) FROM EMPLOYEE 
GROUP BY dept_id);

In this query, ALL will help with comparing all the average salaries of all the employees department-wise.

2) ANY/SOME Operator

The ANY operator compares the value to each of the subquery’s results. As a result, the ANY keyword returns TRUE if the comparison is TRUE for ANY of the values in the column returned by the subquery. And SOME Operator is synonymous with the ANY operator.

A) Syntax

The basic syntax is as follows:

expression operator SOME (subquery);

B) Example 
SELECT first_name, last_name,department_id 
FROM employees 
WHERE department_id= ANY
(SELECT DEPARTMENT_ID 
FROM departments WHERE location_id=1700);

In this query, ANY will help with comparing the average salaries of all the employees department-wise. The result of the query will be based on comparing with any of the values returned by the PostgreSQL Subquery.

For further information on PostgreSQL Documentation, you can visit here.

Conclusion

This article illustrated the concept of PostgreSQL Subquery. You had an in-depth understanding of why there is a need to create a PostgreSQL Subquery, its different use cases, and all the parameters in it. 

Now, you can move forward and write your Subqueries in PostgreSQL according to your use cases.

You can also deep-dive extensively into PostgreSQL by going through the following:

If you use PostgreSQL, ingesting and moving data into your selected warehouse using ETL for Data Analysis might be a time-consuming process. When you realize how much money and resources are required to engage data engineers and analysts to make sense of this data, the situation becomes even more daunting.

However, with Hevo at your fingertips, you won’t have to worry about your PostgreSQL Data Replication demands. Loading your data into your selected data warehouse will only take a few minutes.

Hevo Data‘s excellent integration with 150+ Data Sources (including 40+ Free Sources) such as PostgreSQL allows you to export data from your chosen data sources and load it to your desired destination. It also assists you in transforming and enriching your data so that it is suitable for analysis. You can readily save time and focus on getting insights and doing in-depth research of your data using BI tools.

Visit our Website to Explore Hevo

Now, you can replicate data from PostgreSQL to any data warehouse of your choice such as Amazon Redshift, Snowflake, Google BigQuery, or Firebolt.

Why don’t you give Hevo a try? Sign Up here for a 14-day full feature access trial and experience the feature-rich Hevo suite first hand. You can also check our unbeatable pricing and make a decision on your best-suited plan. 

Share your thoughts on learning about PostgreSQL Subquery in the comments section below. If you have any questions, do let us know. We’d be happy to help.

No-code Data Pipeline for PostgreSQL