Redshift PIVOT & UNPIVOT: The Ultimate How-to Guide

Q: 1. What is PIVOT in Redshift?

Redshift does not have a built-in PIVOT function. However, you can achieve pivot-like functionality by using a combination of CASE statements and GROUP BY to rotate rows into columns manually.

Q: 2. What is the use of PIVOT function?

The PIVOT function in SQL transforms rows of data into columns. It's typically used to summarize data, converting unique values from one column into multiple columns, and applying aggregate functions (like SUM, COUNT) across these columns.

Q: 3. Why do we use PIVOT in SQL?

PIVOT is used to make data easier to read and analyze by transforming rows into columns. It is commonly used for reporting, aggregating, and restructuring data for better insight into datasets.

Redshift PIVOT and UNPIVOT are two separate manifestations of the same relationship/relational operator. In order to achieve the desired dataset the two operators — Redshift PIVOT and UNPIVOT — convert rows into columns and columns into rows.

In general, Redshift PIVOT and UNPIVOT operators are useful to manage multiple tables of datasets and transform the information present in an easy-to-understand format.

In this tutorial article, we will discuss these two concepts in detail, and with some examples show you how these operators work.

Table of Contents

What is Amazon Redshift?

Redshift is a petabyte-scale data warehouse solution built and designed for data scientists, data analysts, data administrators, and software developers. Its parallel processing and compression algorithm allow users to perform operations on billions of rows, reducing command execution time significantly. Redshift is perfect for analyzing large quantities of data with today’s business intelligence tools in multiple data warehouses.

Amazon Redshift is the best for round-the-clock computational needs — like NASDAQ daily reporting, automated ad-bidding, and live dashboards. Redshift’s users pay for an hourly rate — depending on the instance type and the number of nodes deployed.

Amazon Redshift architecture is based on an extensive communication channel between the client application and the data warehouse cluster. These two communicate with each other using the industry-standard JDBC and ODBC drivers for PostgreSQL. The Data warehouse cluster has leader node, compute node, and node slices — these are the core infrastructure components of Redshift’s architecture.

Leader Node: The leader node communicates with the client application and the compute nodes. It resolves, develops, and executes database operations necessary to obtain results for multiple complex queries.
Compute Node: A compute node is made up of multiple node slices. Each compute node has its CPU, memory, and attached disk storage. It performs computational work in a cluster. The user can select from options available for various node types, which can provide an optimum experience based on computation and storage needs.
Node Slice: A group of slices makes a node slice. A slice is a logical partition of a disk. Each slice has its allocated share of the node’s memory and disk space. The number of slices per node depends on the node’s size.

Redshift PIVOT & UNPIVOT: Redshift Architecture

Say goodbye to the hassle of manually connecting Redshift. Embrace Hevo’s user-friendly, no-code platform to streamline your data migration effortlessly.

Choose Hevo to:

Access 150+(60 free sources) connectors, including Redshift.
Ensure data accuracy with built-in data validation and error handling.
Eliminate the need for manual schema mapping with the auto-mapping feature.

Don’t just take our word for it—try Hevo and discover how Hevo has helped industry leaders like Whatfix connect Redshift seamlessly and why they say,” We’re extremely happy to have Hevo on our side.”

Get Started with Hevo for Free

What is Redshift PIVOT operator in Redshift?

Redshift PIVOT operator converts rows into columns. Let’s take the example of a table that looks like this:

	Student Name	Course	Score
1	Simran	Physics	98
2	Alexander	Computer Science	89
3	Simran	Computer science	86
4	Alexander	Physics	78

Table 1

And when we PIVOT it by the second column, the new table will look as follows:

	Student Name	Physics	Computer Science
1	Simran	98	86
2	Alexander	78	89

Table 2

In the first table, two unique values for the course column are given — Physics and Computer Science. But after using Redshift PIVOT operator the unique values have been transformed into columns and the Course column into the row.

Now, let’s check out some PIVOT examples in Redshift.

First, set up a sample table (as shown below) and use it to run the subsequent queries.

CREATE TABLE part (
    partname varchar, 
    manufacturer varchar, 
    quality int, 
    price decimal(12, 2)
);

INSERT INTO part VALUES ('P1', 'M1', 2, 10.00);
INSERT INTO part VALUES ('P1', 'M2', NULL, 9.00);
INSERT INTO part VALUES ('P1', 'M3', 1, 12.00);

INSERT INTO part VALUES ('P2', 'M1', 1, 2.50);
INSERT INTO part VALUES ('P2', 'M2', 2, 3.75);
INSERT INTO part VALUES ('P2', 'M3', NULL, 1.90);

INSERT INTO part VALUES ('P3', 'M1', NULL, 7.50);
INSERT INTO part VALUES ('P3', 'M2', 1, 15.20);
INSERT INTO part VALUES ('P3', 'M3', NULL, 11.80);

Table Creation:
The part table has four columns:
partname (string): Name of the part.
manufacturer (string): Manufacturer of the part.
quality (integer): Quality rating (nullable).
price (decimal): Price of the part (up to 12 digits, with 2 decimal places).

Data Insertion:
Multiple INSERT statements add records for three parts (P1, P2, P3), each with various manufacturers:

Now, PIVOT on partname with an AVG aggregation on price.

SELECT *
FROM (SELECT partname, price FROM part) PIVOT (
    AVG(price) FOR partname IN ('P1', 'P2', 'P3')
);

The result for the above-mentioned query is as follows

  p1   |  p2  |  p3   
-------+------+-------
 10.33 | 2.71 | 11.50

Note: The result following the previous PIVOT query is similar to the following GROUP BY query. But, after running the query, the rows are transformed into column names and columns.

SELECT partname, avg(price)
FROM (SELECT partname, price FROM part)
WHERE partname IN ('P1', 'P2', 'P3')
GROUP BY partname;

The following query results in output as mentioned below:

 partname |  avg  
----------+-------
 P1       | 10.33
 P2       |  2.71
 P3       | 11.50

Now, an example with manufacturer as a column.

SELECT *
FROM (SELECT quality, manufacturer FROM part) PIVOT (
    count(*) FOR quality IN (1, 2, NULL)
);

The following output for the above-mentioned query is as follows:

 manufacturer | 1 | 2 | null 
--------------+---+---+------
 M1           | 1 | 1 |    1
 M2           | 1 | 1 |    1
 M3           | 1 | 0 |    2

Note: Redshift PIVOT operator returns similar data (with the query given below), including GROP BY. The only difference is, the PIVOT operator returns 0 for column 2 and the manufacturer M3. The Group BY operator does not have a corresponding row — in most cases PIVOT inserts NULL if a row does not have input data for a given column.

SELECT manufacturer, quality, count(*)
FROM (SELECT quality, manufacturer FROM part)
WHERE quality IN (1, 2) OR quality IS NULL
GROUP BY manufacturer, quality;

The query gives out the following output:

 manufacturer | quality | count 
--------------+---------+-------
 M1           |       1 |     1
 M1           |       2 |     1
 M1           |         |     1
 M2           |       1 |     1
 M2           |       2 |     1
 M2           |         |     1
 M3           |       1 |     1
 M3           |         |     2

Redshift PIVOT operator accepts optional aliases on the aggregate expression, and for each value for the IN operator, too. It is recommended to use aliases and then customize column identity. If there is no aggregate alias, then only the IN list aliases are used. Otherwise, the aggregate alias is appended to the column name with an underscore to separate the names.

SELECT *
FROM (SELECT quality, manufacturer FROM part) PIVOT (
    count(*) AS count FOR quality IN (1 AS high, 2 AS low, NULL AS na)
);

The query gives out the following output:

 manufacturer | high_count | low_count | na_count 
--------------+------------+-----------+----------
 M1           |          1 |         1 |        1
 M2           |          1 |         1 |        1
 M3           |          1 |         0 |        2

Integrate Amazon DocumentDB to Redshift

Get a Demo Try it

Integrate Amazon RDS to Redshift

Get a Demo Try it

Integrate Criteo to Redshift

Get a Demo Try it

What is Redshift UNPIVOT operator in Redshift?

Redshift UNPIVOT operator converts columns into rows. Let’s take the example of a table that looks like this:

ID	Student Name	Math	English	History	Science
1	Son	54	79	65	98
2	Marie	67	90	76	86

Table 1

Now when we UNPIVOT, the new table will look as follows:

	Student Name	Course	Score
1	Son	Math	54
2	Son	English	79
3	Son	History	65
4	Son	Science	98
5	Marie	Math	67
6	Marie	English	90
7	Marie	History	76
8	Marie	Science	86

Table 2

The columns of the first table have been converted into rows. The second table is how the unpivoted table looks like

Now, let’s check out Redshift UNPIVOT examples in Redshift.

Setup and use the sample data (given below) then execute the subsequent examples.

CREATE TABLE count_by_color (quality varchar, red int, green int, blue int);

INSERT INTO count_by_color VALUES ('high', 15, 20, 7);
INSERT INTO count_by_color VALUES ('normal', 35, NULL, 40);
INSERT INTO count_by_color VALUES ('low', 10, 23, NULL);

We will continue to Redshift UNPIVOT on input columns green, blue, and red.

SELECT * 
FROM (SELECT red, green, blue FROM count_by_color) UNPIVOT (
    cnt FOR color IN (red, green, blue)
);

The following output is shown.

 color | cnt 
-------+-----
 red   |  15
 red   |  35
 red   |  10
 green |  20
 green |  23
 blue  |   7
 blue  |  40

Note: By default, the NULL values in the input column remain skipped, not yielding any result row.

Now let’s UNPIVOT with INCLUDE NULLS

SELECT * 
FROM (
    SELECT red, green, blue 
    FROM count_by_color
) UNPIVOT INCLUDE NULLS (
    cnt FOR color IN (red, green, blue)
);

Inner Query: Selects the red, green, and blue columns from count_by_color.
UNPIVOT Operation:
Converts the selected columns into rows.
Creates two new columns:
color: Indicates the original column names (red, green, blue).
cnt: Contains the corresponding count values.


Result: The output shows each color with its associated count, including rows where counts may be NULL.

The following output is shown below.

When the INCLUDING NULLS parameter is set, results generated for NULL input values are as follow. The query below shows UNPIVOT with quality as an implicit column.

SELECT * 
FROM count_by_color UNPIVOT (
    cnt FOR color IN (red, green, blue)
);

The output for the above-mentioned query is shown below:

 quality | color | cnt 
---------+-------+-----
 high    | red   |  15
 normal  | red   |  35
 low     | red   |  10
 high    | green |  20
 low     | green |  23
 high    | blue  |   7
 normal  | blue  |  40

The next query shows the UNPIVOT operator with aliases for IN list’s values.

SELECT * 
FROM count_by_color UNPIVOT (
    cnt FOR color IN (red AS r, green AS g, blue AS b)
);

The output for the above-mentioned query is shown below:

 quality | color | cnt 
---------+-------+-----
 high    | r     |  15
 normal  | r     |  35
 low     | r     |  10
 high    | g     |  20
 low     | g     |  23
 high    | b     |   7
 normal  | b     |  40

Usage Guidelines for PIVOT and UNPIVOT

PIVOT

PIVOT can be applied to tables, sub-queries, and common table expressions (CTEs).
PIVOT cannot be applied to any JOIN expressions, recursive CTEs, PIVOT, or UNPIVOT expressions. Also not supported are SUPER unnested expressions and Redshift Spectrum nested tables.
PIVOT supports the COUNT, SUM, MIN, MAX, and AVG aggregate functions.
The PIVOT IN list values cannot be column references or sub-queries. Each value must be type compatible with the FOR column reference.

To know more about PIVOT , check out the official documentation of amazon.

UNPIVOT

If an IN list value does not have an alias, UNPIVOT uses the column name as a default value.
UNPIVOT can be applied to tables, sub-queries, and common table expressions (CTEs).
UNPIVOT cannot be applied to any JOIN expressions, recursive CTEs, PIVOT, or UNPIVOT expressions. Also not supported are SUPER unnested expressions and Redshift Spectrum nested tables.
The UNPIVOT IN list must contain only input table column references. The IN list columns must have a common type that they are all compatible with. The UNPIVOT value column has this common type.
The UNPIVOT name column is of type VARCHAR.

Conclusion

Through this tutorial article, we shed a light on Amazon Redshift PIVOT and UNPIVOT operators. In theory, their use case might seem basic, but in reality, their usage leaves cascading effects on the project management landscape as a whole.

For better and informed judgment, you can also refer to either of these three articles cited below:

Furthermore, if you are a data folk who needs a one-stop solution for data management and ETL-related tasks — Hevo has got your back! Hevo Data is a no-code data pipeline platform that helps new-age businesses integrate their data from multiple sources systems to a data warehouse and plug this unified data into any BI tool or Business Application. The platform provides 150+ ready-to-use integrations with a range of data sources and is trusted by hundreds of data-driven organizations from 30+ countries.

Sign up for a 14-day free trial and experience the feature-rich Hevo suite first hand. You can also have a look at the unbeatable pricing that will help you choose the right plan for your business needs.

Frequently Asked Questions

1. What is PIVOT in Redshift?

Redshift does not have a built-in PIVOT function. However, you can achieve pivot-like functionality by using a combination of CASE statements and GROUP BY to rotate rows into columns manually.

2. What is the use of PIVOT function?

The PIVOT function in SQL transforms rows of data into columns. It’s typically used to summarize data, converting unique values from one column into multiple columns, and applying aggregate functions (like SUM, COUNT) across these columns.

3. Why do we use PIVOT in SQL?

PIVOT is used to make data easier to read and analyze by transforming rows into columns. It is commonly used for reporting, aggregating, and restructuring data for better insight into datasets.

Yash Arora Content Manager, Hevo Data

Yash is a Content Marketing professional with over three years of experience in data-driven marketing campaigns. He has expertise in strategic thinking, integrated marketing, and customer acquisition. Through comprehensive marketing communications and innovative digital strategies, he has driven growth for startups and established brands.

Redshift PIVOT & UNPIVOT: The Ultimate How-to Guide [Practical Examples]

What is Amazon Redshift?

What is Redshift PIVOT operator in Redshift?

What is Redshift UNPIVOT operator in Redshift?

Usage Guidelines for PIVOT and UNPIVOT

PIVOT

UNPIVOT

Conclusion

Frequently Asked Questions

1. What is PIVOT in Redshift?

2. What is the use of PIVOT function?

3. Why do we use PIVOT in SQL?

Related articles

Redshift PIVOT & UNPIVOT: The Ultimate How-to Guide [Practical Examples]

What is Amazon Redshift?

What is Redshift PIVOT operator in Redshift?

What is Redshift UNPIVOT operator in Redshift?

Usage Guidelines for PIVOT and UNPIVOT

PIVOT

UNPIVOT

Conclusion

Frequently Asked Questions

1. What is PIVOT in Redshift?

2. What is the use of PIVOT function?

3. Why do we use PIVOT in SQL?

Related Articles

Optimize your data integration with Hevo!

Related articles