Snowflake PIVOT & UNPIVOT Command: Syntax, Usage & Practical Examples 101

on Snowflake, Snowflake Commands • January 18th, 2022 • Write for Hevo

Snowflake Pivot FI

To empower your business decisions with data, you need Real-Time High-Quality data from all of your data sources in a central repository. Traditional On-Premise Data Warehouse solutions provide limited Scalability and Performance, and they require constant maintenance. Snowflake is a more Cost-Effective and Instantly Scalable solution with industry-leading Query Performance. It’s a one-stop-shop for Cloud Data Warehousing and Analytics, with full SQL support for Data Analysis and Transformations. One of the highlighting features of Snowflake is Snowflake Pivot Tables.

You are all aware of the value of Pivot Tables, but creating them in SQL can be quite tricky. Snowflake, on the other hand, has several unique qualities that make this a little easier. You could also want to think about what real-world circumstances you might wish to use the Pivot function in. You can also use Aggregate functions like SUM, AVG, MAX, MIN, and others in conjunction with the Pivot function in Snowflake. 

In this blog post, you will get to know everything about the Snowflake PIVOT and UNPIVOT functions.

Table of Contents

What is Snowflake?

Snowflake PIVOT: logo
Image Source

Snowflake is the world’s first Cloud Data Warehouse solution, built on the customer’s preferred Cloud Provider’s infrastructure (AWS, Azure, or GCP). Snowflake (SnowSQL) adheres to the ANSI Standard and includes typical Analytics and Windowing Capabilities. There are some differences in Snowflake’s syntax, but there are also some parallels. 

Snowflake’s Integrated Development Environment (IDE) is totally Web-based. Visit XXXXXXXX.us-east-1.snowflakecomputing.com. You’ll be sent to the primary Online GUI, which works as an IDE, where you can begin interacting with your Data Assets after logging in. Each query tab in the Snowflake interface is referred to as a “Worksheet” for simplicity. These “Worksheets,” like the Tab History function, are automatically saved and can be viewed at any time.

Key Features of Snowflake

  • SQL and Standard Support: Snowflake offers both standard and extended SQL support, as well as Advanced SQL features such as Merge, Lateral View, Statistical Functions, and many others.
  • Fault Resistant: Snowflake provides exceptional fault-tolerant capabilities to recover the Snowflake Object in the event of a failure (tables, views, database, schema, and so on).
  • Query Optimization: By using Clustering and Partitioning, Snowflake may optimize a query on its own. With Snowflake, Query Optimization isn’t something to be concerned about.
  • Secure Data Sharing: Data can be exchanged securely from one account to another using Snowflake Database Tables, Views, and UDFs.
  • Support for File Formats: JSON, Avro, ORC, Parquet, and XML are all Semi-Structured data formats that Snowflake can import. It has a VARIANT column type that lets you store Semi-Structured data.
  • Caching: Snowflake follows a caching strategy that allows the results of the same query to be quickly returned from the cache when the query is repeated. Snowflake uses permanent (during the session) query results to avoid regenerating the report when nothing has changed.

To get further information check out the official website here

What do you Mean by Pivoting & Unpivoting Tables?

Image Source
  • PIVOT is a relational operator that converts data from row to column. PIVOT rotates a table-valued expression by splitting the output into numerous columns based on the unique values from one column in the expression. We can do aggregate operations where we need them by using the PIVOT operator. This process is known as Pivoting
  • The reverse of the PIVOT relational operator is the UNPIVOT relational operator. The relational operator UNPIVOT converts data from a column to a row-level and this process is known as Unpivoting

Simplify Snowflake Data Transfer with Hevo’s No-code Pipeline

Hevo Data is a No-code Data Pipeline that helps you transfer data from 100+ sources (including 40+ Free Data Sources) to Snowflake in real-time in an effortless manner. After using Hevo you can easily carry out Snowflake Create Users Tasks.

Get Started with Hevo for Free

Key Features of Hevo Data:

  • Fully Managed: Hevo Data is a fully managed service and is straightforward to set up.
  • Transformations: Hevo provides preload transformations through Python code. It also allows you to run transformation code for each event in the Data Pipelines you set up. You need to edit the event object’s properties received in the transform method as a parameter to carry out the transformation. Hevo also offers drag and drop transformations like Date and Control Functions, JSON, and Event Manipulation to name a few. These can be configured and tested before putting them to use.
  • Connectors: Hevo supports 100+ integrations to SaaS platforms, files, databases, analytics, and BI tools. It supports various destinations including Amazon Redshift, Snowflake Data Warehouses; Amazon S3 Data Lakes; and MySQL, SQL Server, TokuDB, DynamoDB, PostgreSQL databases to name a few.  
  • Ensure Unique Records: Hevo Data helps you ensure that only unique records are present in the tables if Primary Keys are defined.
  • Multiple Sources: Hevo Data has various connectors incorporated with it, which can connect to multiple sources with ease. 
  • Automatic Mapping: Hevo Data automatically maps the source schema to perform analysis without worrying about the changes in the schema.
  • Real-time Data Transfer: Hevo Data works on both batch as well as real-time data transfer. 
  • Resume from Point of Failure: Hevo Data can resume the ingestion from the point of failure if it occurs. 
  • Advanced Monitoring: Advanced monitoring gives you a one-stop view to watch all the activity that occurs within pipelines.
  • 24/7 Support: With 24/7 Support, Hevo provides customer-centric solutions to the business use case.

Steps to load Snowflake data using Hevo Data:

  • Sign up on the Hevo Data, and select Snowflake as the destination.
  • Provide the user credentials and connect to the server.
  • Select the database, and schema to load the data.
Sign up here for a 14-Day Free Trial!

What is the Need for Snowflake Pivot Tables?

Image Source

A Pivot Table is a dynamic approach to rapidly summarise vast volumes of data. A Pivot Table can be used to study numerical data in depth and to answer unexpected queries about your data. Snowflake Pivot Tables can be used for the following purposes:

  • Numeric Data Subtotaling and Aggregation, Data Summarization by Categories and Subcategories, and Custom Computations and Formulas are all possible.
  • Expanding and collapsing data levels to focus your results, as well as diving down to details from the summary data for regions of interest.
  • To see different summaries of the Original Data, move rows to columns or columns to rows (or “pivot“).
  • You may focus on precisely the Information you want by filtering, sorting, grouping, and conditionally formatting the most useful and intriguing subset of data.
  • Creating Online or Printed Reports that are concise, appealing, and annotated.

How to use the Snowflake PIVOT Command?

Snowflake Pivot command Rotates a table by dividing the unique values from one column in the input expression into numerous columns and aggregates the results on any leftover column values when necessary. While altering the Table Name or subquery in a query, it is supplied in the FROM Clause.

The operator supports the built-in aggregate functions AVG, COUNT, MAX, MIN, and SUM.

Snowflake PIVOT can be used to turn a narrow table (for example, empid, month, sales) into a broader table (for example, empid, jan_sales, feb_sales, mar_sales).

A) Syntax

 SELECT ...
FROM ...
   PIVOT ( <aggregate_function> ( <pivot_column> )
            FOR <value_column> IN ( <pivot_value_1> [ , <pivot_value_2> ... ] ) )

[ ... ]

B) Usage Parameters 

The parameters that are associated with Snowflake Pivot commands are:

  • <aggregate_function>: The Aggregate Function is used to combine the PIVOT column’s grouped data.
  • <pivot_column>: The Source Table or Subquery Column that will be aggregated.
  • <value_column>: The Column in the Source Table or Subquery that holds the values that will be used to produce Column Names.
  • <pivot_value_N>: A set of values for the pivot column that will be used to pivot the query results into headings.

C) Example Queries 

Pivot around the amount column in the monthly_sales Database with the following structure to sum the total sales per employee for the given months:

create or replace table monthly_sales(empid int, amount int, month text)
    as select * from values
    (1, 10000, 'JAN'),
    (1, 400, 'JAN'),
    (2, 4500, 'JAN'),
    (2, 35000, 'JAN'),
    (1, 5000, 'FEB'),
    (1, 3000, 'FEB'),
    (2, 200, 'FEB'),
    (2, 90500, 'FEB'),
    (1, 6000, 'MAR'),
    (1, 5000, 'MAR'),
    (2, 2500, 'MAR'),
    (2, 9500, 'MAR'),
    (1, 8000, 'APR'),
    (1, 10000, 'APR'),
    (2, 800, 'APR'),
    (2, 4500, 'APR');

D) Outputs 

select * 
  from monthly_sales
    pivot(sum(amount) for month in ('JAN', 'FEB', 'MAR', 'APR'))
      as p
  order by empid;
+-------+-------+-------+-------+-------+
| EMPID | 'JAN' | 'FEB' | 'MAR' | 'APR' |
|-------+-------+-------+-------+-------|
|     1 | 10400 |  8000 | 11000 | 18000 |
|     2 | 39500 | 90700 | 12000 |  5300 |
+-------+-------+-------+-------+-------+

You can add the Column Names in the AS Clause if you want the Column Names to be without quotes or if you want the output to have different Column Names than the input, as seen below:

select * 
  from monthly_sales
    pivot(sum(amount) for month in ('JAN', 'FEB', 'MAR', 'APR'))
      as p (emp_id_renamed, jan, feb, mar, apr)
  order by emp_id_renamed;
+----------------+-------+-------+-------+-------+
| EMP_ID_RENAMED |   JAN |   FEB |   MAR |   APR |
|----------------+-------+-------+-------+-------|
|              1 | 10400 |  8000 | 11000 | 18000 |
|              2 | 39500 | 90700 | 12000 |  5300 |
+----------------+-------+-------+-------+-------+

OR:

select empid as emp_id, "'JAN'" as january, "'FEB'" as february, "'MAR'" as march,
    "'APR'" as april
  from monthly_sales
    pivot(sum(amount) for month in ('JAN', 'FEB', 'MAR', 'APR')) 
      as p
  order by empid;
+--------+---------+----------+-------+-------+
| EMP_ID | JANUARY | FEBRUARY | MARCH | APRIL |
|--------+---------+----------+-------+-------|
|      1 |   10400 |     8000 | 11000 | 18000 |
|      2 |   39500 |    90700 | 12000 |  5300 |
+--------+---------+----------+-------+-------+

How to use the Snowflake UNPIVOT Command?

Transforms columns into rows to rotate a table. UNPIVOT is a relational operator that takes two columns (from a Table or a Subquery) and a list of columns and creates a row for each of the columns in the list. It is supplied after the Table Name or Subquery within the FROM clause of a query. UNPIVOT isn’t exactly the inverse of Snowflake PIVOT because it can’t undo PIVOT’s Aggregations.

This operator can be used to convert a wide table (e.g. empid, jan_sales, feb_sales, mar_sales) into a narrower table (e.g. empid, month, sales).

A) Syntax 

SELECT ...
FROM ...
   UNPIVOT ( <value_column>
             FOR <name_column> IN ( <column_list> ) )

[ ... ]

B) Usage Parameters 

  • <value_column>: The name of the created column, which will be filled with data from the columns in the Column List.
  • <name_column>: The name of the created column, which will be filled with the names of the columns from the Column List.
  • <column_list>: The names of the Source Table or Subquery Columns that will be narrowed down to a Single Pivot Column. Name column will be filled with column names, whereas value column will be filled with Column Values. Only actual column names should be included in the Column List,  subqueries should not be included. 

C) Example Queries 

UNPIVOT the separate month columns to provide a single sales value per month for each employee in a given table, monthly_sales, with the following structure and data:

-- example setup
create or replace table monthly_sales(empid int, dept text, jan int, feb int, mar int, april int);

insert into monthly_sales values
    (1, 'electronics', 100, 200, 300, 100),
    (2, 'clothes', 100, 300, 150, 200),
    (3, 'cars', 200, 400, 100, 50);

D) Outputs 

-- UNPIVOT example
select * from monthly_sales
    unpivot(sales for month in (jan, feb, mar, april))
    order by empid;

+-------+-------------+-------+-------+
| EMPID | DEPT        | MONTH | SALES |
|-------+-------------+-------+-------|
|     1 | electronics | JAN   |   100 |
|     1 | electronics | FEB   |   200 |
|     1 | electronics | MAR   |   300 |
|     1 | electronics | APRIL |   100 |
|     2 | clothes     | JAN   |   100 |
|     2 | clothes     | FEB   |   300 |
|     2 | clothes     | MAR   |   150 |
|     2 | clothes     | APRIL |   200 |
|     3 | cars        | JAN   |   200 |
|     3 | cars        | FEB   |   400 |
|     3 | cars        | MAR   |   100 |
|     3 | cars        | APRIL |    50 |
+-------+-------------+-------+-------+

How to use the CASE Statement for Creating Pivots? 

To create a Snowflake Pivot table in most SQL dialects, you’d have to utilize a series of CASE statements:

SELECT 
  <INDEX COLS>, 
  AGGREGATE_FUNCTION(CASE WHEN <CONDITION> THEN <VALUE> END) <COL_1>, 
  AGGREGATE_FUNCTION(CASE WHEN <CONDITION> THEN <VALUE> END) <COL_2>
FROM
  <TABLE>

Let’s look at another example for multiple Snowflake Pivot: 

SELECT TITLE, 
  SUM(CASE WHEN YEAR(START_TIME) = 2018 THEN DURATION END ) / (60 * 60) "2018",
  SUM(CASE WHEN YEAR(START_TIME) = 2019 THEN DURATION END ) / (60 * 60) "2019",
  SUM(CASE WHEN YEAR(START_TIME) = 2020 THEN DURATION END ) / (60 * 60) "2020"
FROM PUBLIC.NETFLIX
WHERE TITLE IN ('Friends', 'GLOW', 'Arrested Development', 'The Good Place')
GROUP BY TITLE

Output of above Query: 

TITLE
GLOW
Friends
The Good Place
Arrested Development
2018
NULL
192.049166667
9.428888889
1.868611111
2019
NULL
78.595555556
2.958888889
2.521111111
2020
15.726388889
58.134166667
9.827777778
0.9927777778

PIVOT is the other function that can be used to create Snowflake Pivot tables. PIVOT is a custom function in Snowflake that simplifies this behavior. Both of these ways have the drawback of requiring you to specify the resulting pivoted columns before you construct the table, which means you can’t have a Dynamic Pivot Table.

What are the Aggregations Supported by Snowflake PIVOT Command?

SyntaxDescription
Aggregate_functionAggregate function would be used to aggregate the values for the pivot column. The possible set of function for the aggregate function will be MAX, MIN and, AVG, COUNT, SUM
Pivot_columnIt is the column for which we are doing the aggregation. The aggregate function that will be applied on this column values.
value_columnIt is the column from which we are generating multiple columns so basically, this column get pivotised
pivot_value_NThis contains the list of the column values which will be converted as a column header in the result query
Table Source

How to Pivot Multiple Snowflake Columns?

You can only use one Aggregate function within the Pivot because there is no direct way to obtain many columns, but there is a workaround for pivoting multiple Columns in Snowflake.

To establish a pivot for many columns, you can use the Union procedure. Below is an example of the same.                                            

SELECT 'SUM' AGG_TYPE, * FROM (SELECT * FROM Employee_Bonus
 pivot(
        SUM(bonus)
        for department_name in ('CS', 'IT')))
UNION ALL
 
SELECT 'Average' AGG_TYPE, * FROM (SELECT * FROM Employee_Bonus
 pivot(
        AVG(bonus)
        for department_name in ('CS', 'IT')))
 
 
+----------+-------+--------------+--------------+
| AGG_TYPE | EMPID |     	'CS' |     	'IT' |
|----------+-------+--------------+--------------|
| SUM  	| 	1 | 10000.000000 |   400.000000 |
| SUM  	| 	2 |  4500.000000 | 35000.000000 |
| Average  | 	1 | 10000.000000 |   400.000000 |
| Average  | 	2 |  4500.000000 | 35000.000000 |
+----------+-------+--------------+--------------+

Conclusion

This article has exposed you to the various Snowflake Pivot Tables to help you improve your overall decision-making and experience when trying to make the most out of your data. In case you want to export data from a source of your choice into your desired Database/destination like Snowflake, then Hevo Data is the right choice for you! 

Visit our Website to Explore Hevo

Hevo Data provides its users with a simpler platform for integrating data from 100+ sources for Analysis. It is a No-code Data Pipeline that can help you combine data from multiple sources. You can use it to transfer data from multiple data sources into your Data Warehouses such as Snowflake, Database, or a destination of your choice. It provides you with a consistent and reliable solution to managing data in real-time, ensuring that you always have Analysis-ready data in your desired destination.

Want to take Hevo for a spin? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand. You can also have a look at our unbeatable pricing that will help you choose the right plan for your business needs!

Share your experience of learning about Snowflake Pivot Tables! Let us know in the comments section below!

No-code Data Pipeline for Snowflake