PostgreSQL STRING_AGG() Function: Syntax, Usage & Practical Examples

on PostgreSQL • July 31st, 2022 • Write for Hevo

POSTGRESQL STRING_AGG | Feature Image | Hevo Data

In today’s digital era, massive data sets are available to be consumed by organizationsfor analysis. The vast amount of data available seems challenging to handle, which is why data aggregation has become more critical than ever. With reduced dimensionality, aggregate data can offer a feature-rich data collection.

There are numerous built-in aggregate functions in PostgreSQL which includes sum(), min(), max(),count(), etc. Employing aggregate functions enables us to generate output from a set of values. 

The PostgreSQL STRING_AGG() aggregate function takes input data and creates strings by concatenating the values. In this article, we’ll deep dive into learning about the working & usability of PostgreSQL STRING_AGG(). Let us begin!

Table of Contents

PostgreSQL STRING_AGG()

PostgreSQL STRING_AGG meme
Image Source

The PostgreSQL STRING_AGG() function is one of the aggregate functions that PostgreSQL supports. It is used to concatenate a list of strings and adds a place for a delimiter symbol or a separator between all of the strings. The output string won’t have a separator or a delimiter symbol at the end of it. The PostgreSQL 9.0 version supports STRING_AGG() function. To concatenate the strings, we can employ a variety of separators or delimiter symbols.

The workings of the STRING_AGG() and ARRAY_AGG() functions are identical; however, STRING_AGG() is an aggregate function that returns results in string type, whereas ARRAY_AGG() delivers results in array type.

Syntax

The syntax for the STRING_AGG() function is as follows:

STRING_AGG ( expression, separator|delimiter [order_by] )

Explanation:

  • Expression: This character string can be any legitimate expression.
  • Delimiter/separator: This specifies the delimiter/separator used when concatenating strings.
  • The ORDER BY clause: specifies the order of the concatenated string results and is optional.

The syntax for the ORDER BY is as follows:

ORDER BY expression1 {ASC | DESC}, [...]

Note: PostgreSQL does not presumptively use a default delimiter. It must be mentioned as the second argument. Also, remember that string_agg() requires the usage of the group by clause, much like other aggregate functions.

Scale your data integration effortlessly with Hevo’s Fault-Tolerant No Code Data Pipeline

As the ability of businesses to collect data explodes, data teams have a crucial role in fueling data-driven decisions. Yet, they struggle to consolidate the data scattered across sources into their warehouse to build a single source of truth. Broken pipelines, data quality issues, bugs and errors, and lack of control and visibility over the data flow make data integration a nightmare.

1000+ data teams rely on Hevo’s Data Pipeline Platform to integrate data from over 150+ sources in a matter of minutes. Billions of data events from sources as varied as SaaS apps, Databases, File Storage and Streaming sources can be replicated in near real-time with Hevo’s fault-tolerant architecture. What’s more – Hevo puts complete control in the hands of data teams with intuitive dashboards for pipeline monitoring, auto-schema management, custom ingestion/loading schedules. 

All of this combined with transparent pricing and 24×7 support makes us the most loved data pipeline software on review sites.

Take our 14-day free trial to experience a better way to manage data pipelines.

Get started for Free with Hevo!

PostgreSQL STRING_AGG(): Use Cases

To elaborate on the use cases of the STRING_AGG ( ) function, we will create a sample table on which we will run queries. The PostgreSQL code for creating a sample table is mentioned below:

Create Table:

CREATE TABLE players (
player_name TEXT ,
team_name TEXT ,
player_positon TEXT
) ;

With the above syntax, a table called “players” will be created, with the columns as player_name, team_name, and player_position.

Note: Run the following SELECT query to verify that the table is created with the desired columns. SELECT * FROM “players” ;

Insert Values into the Table:

Let’s use the INSERT INTO command to add some values to the “players” table now:

INSERT INTO "players"
VALUES ( 'Virat', 'India', 'Batsman' ), ( 'Rohit', 'India', 'Batsman' ) , ( 'Jasprit', 'India', 'Bowler' );
INSERT INTO "players"
VALUES ( 'Chris', 'West Indies', 'Batsman' ), ( 'Shannon', 'West Indies', 'Bowler'), ('Bravo', 'West Indies', 'Batsman');
INSERT INTO "players"
VALUES ( 'James', 'New Zealand', 'All rounder' );
SELECT * FROM "players" ;
—OUTPUT—
player_name | team_name | player_positon 
-------------+-------------+----------------
 Virat | India | Batsman
 Rohit | India | Batsman
 Jasprit | India | Bowler
 Chris | West Indies | Batsman
 Shannon | West Indies | Bowler
 Bravo | West Indies | Batsman
 James | New Zealand | All rounder
(7 rows)

The output clearly shows that the values entered using the INSERT INTO command have been successfully inserted into the sample table.

Use Case#1: Using PostgreSQL STRING_AGG() create Comma-Separated Values

We will use the STRING_AGG() function to produce a list of values separated by commas. The syntax to create comma-separated values is as follows:

SELECT "team_name",string_agg("player_name", ',' )
FROM "players" GROUP BY "team_name" ;
—OUTPUT—
 team_name | string_agg      
-------------+---------------------
 West Indies | Chris,Shannon,Bravo
 India | Virat,Rohit,Jasprit
 New Zealand | James
(3 rows)

The “player_name” column in the SELECT query is separated by commas and displayed alongside the “team_name” as seen in the output obtained. The rows are divided according to the field “team_name” using the GROUP BY command. The expression that needs to be separated is defined in the first parameter of the STRING_AGG() function, and the values are separated in the second parameter by the comma character “,”.

Use Case#2: Using PostgreSQL STRING_AGG() to separate multiple columns with commas

Multiple STRING_AGG() functions can be used in a single SELECT statement. The demonstration of that is provided in the example below.

SELECT "team_name",string_agg("player_name",', '), string_agg("player_positon",',')
FROM "players" GROUP BY "team_name";
—OUTPUT—
 team_name | string_agg | string_agg       
-------------+-----------------------+------------------------
 West Indies | Chris, Shannon, Bravo | Batsman,Bowler,Batsman
 India | Virat, Rohit, Jasprit | Batsman,Batsman,Bowler
 New Zealand | James | All rounder
(3 rows)

Here, we’ve combined the values of columns separated by commas using two STRING_AGG() functions in two separate columns. The output obtained makes it quite evident that the list was produced with two-column values now separated by commas. 

But as seen in the output, the column names are not defined and are set as “string_agg.” To specify the column names, we can use AS keyword in the following manner:

SELECT "team_name",string_agg ("player_name", ',') AS players_name,
string_agg ("player_positon", ',') AS players_positions
FROM "players" GROUP BY "team_name";
—OUTPUT—
 team_name | players_name | players_positions    
-------------+---------------------+------------------------
 West Indies | Chris,Shannon,Bravo | Batsman,Bowler,Batsman
 India | Virat,Rohit,Jasprit | Batsman,Batsman,Bowler
 New Zealand | James | All rounder
(3 rows)

Use Case#3: Removing duplicates in our output string

In our sample table, there are multiple players with player_position as “batsman”. We have to use the DISTINCT keyword to omit duplicate values in the first parameter. 

SELECT "team_name",string_agg(DISTINCT "player_name",', '), string_agg( DISTINCT "player_positon",',')
FROM "players" GROUP BY "team_name";
—OUTPUT—
 team_name | string_agg | string_agg   
-------------+-----------------------+----------------
 India | Jasprit, Rohit, Virat | Batsman,Bowler
 New Zealand | James | All rounder
 West Indies | Bravo, Chris, Shannon | Batsman,Bowler
(3 rows)

Use Case#4: Ordering the contents within the output string

To obtain the output of the PostgreSQL STRING_AGG() function in an ordered manner(alphabetically), we can use the following command: 

SELECT "team_name",string_agg ("player_name", ',' ORDER BY "player_name" asc) AS players_name,
string_agg ("player_positon", ',' ORDER BY "player_positon" asc) AS players_positions
FROM "players" GROUP BY "team_name";
—OUTPUT—
 team_name | players_name | players_positions    
-------------+---------------------+------------------------
 India | Jasprit,Rohit,Virat | Batsman,Batsman,Bowler
 New Zealand | James | All rounder
 West Indies | Bravo,Chris,Shannon | Batsman,Batsman,Bowler
(3 rows)

How does PostgreSQL STRING_AGG() function works?

  • Character string data types should be used as the input expression. Other data types are likewise acceptable if we explicitly convert them to the character string data type before using them.
  • We receive the outcome in string type via PostgreSQL STRING_AGG().
  • Similar to how we use other PostgreSQL aggregate methods like MIN(), MAX(), AVG(), SUM(), and COUNT(), the STRING_AGG() is typically used with the GROUP BY clause.

Practical Example to Implement PostgreSQL STRING_AGG() function

The CREATE TABLE statement will be used to create the tables “student” and “course” as follows:

CREATE A STUDENT TABLE:

create table student
(
stud_id serial PRIMARY KEY,
stud_name VARCHAR(80) NOT NULL,
stud_grade CHAR(1) NOT NULL,
stud_country VARCHAR(80) NOT NULL,
course_id int NOT NULL
);

CREATE A COURSE TABLE:

create table course
(
course_id serial PRIMARY KEY,
course_name VARCHAR(80) NOT NULL
);

INSERT VALUES INTO THE COURSE TABLE:

INSERT INTO course(course_name)
VALUES
('MCA'),
('BCA'),
('MBA');
select * from course;
—OUTPUT—
course_id | course_name 
-----------+-------------
         1 | MCA
         2 | BCA
         3 | MBA
   (3 rows)

INSERT VALUES INTO THE STUDENT TABLE:

INSERT INTO student(stud_name,stud_grade,stud_country,course_id)
VALUES
('Radhika','A','India',1),
('Dev','B','India',2),
('Divanshi','C','India',3),
('Bhavya','A','India',1),
('Bella','B','Canada',2),
('Hannah','A','Canada',3),
('Elena','A','USA',1),
('Damon','B','USA',2),
('Edward','C','USA',3);
select * from student;
—OUTPUT—
stud_id | stud_name | stud_grade | stud_country | course_id 
---------+-----------+------------+--------------+-----------
       1 | Radhika | A | India | 1
       2 | Dev | B | India | 2
       3 | Divanshi | C | India | 3
       4 | Bhavya | A | India | 1
       5 | Bella | B | Canada | 2
       6 | Hannah | A | Canada | 3
       7 | Elena | A | USA | 1
       8 | Damon | B | USA | 2
       9 | Edward | C | USA | 3
(9 rows)
  • Now let us perform a JOIN and in the output obtained, we can see that each row has a separate entry that matches the course with the student’s course.
SELECT c.course_name AS "course name", s.stud_name AS "student name"
FROM course c RIGHT JOIN student s ON c.course_id = s.course_id
ORDER BY 1;
—OUTPUT—
course name | student name 
-------------+--------------
 BCA | Damon
 BCA | Bella
 BCA | Dev
 MBA | Edward
 MBA | Divanshi
 MBA | Hannah
 MCA | Elena
 MCA | Bhavya
 MCA | Radhika
(9 rows)
  • By utilizing PostgreSQL STRING_AGG() function, we can concatenate the student names along with the course_name as follows:
SELECT
crs.course_name AS "course name",
string_agg(stud.stud_name, ', ') AS "student list"
FROM course crs
JOIN student stud ON crs.course_id = stud.course_id
GROUP BY 1
ORDER BY 1;
—OUTPUT—
course name | student list       
-------------+--------------------------
 BCA | Damon, Bella, Dev
 MBA | Edward, Divanshi, Hannah
 MCA | Elena, Bhavya, Radhika
(3 rows)
  • Here, we will employ the PostgreSQL STRING_AGG() function to concatenate the student names and the stud_grade. As seen in the output, a list of students with the same grade concatenated by commas is generated.
SELECT stud_grade, STRING_AGG(stud_name,', ') AS StudentsPerGrade
FROM student
GROUP BY stud_grade
ORDER BY 1 ;
—OUTPUT—
stud_grade | studentspergrade        
------------+--------------------------------
 A | Radhika, Hannah, Elena, Bhavya
 B | Bella, Dev, Damon
 C | Edward, Divanshi
(3 rows)
  • In the example below, a comma separator is used to group and concatenate all students from the same country.
SELECT STRING_AGG(stud_name, ', ') AS "student_names", stud_country
FROM student
GROUP BY stud_country;
—OUTPUT—
        student_names | stud_country 
--------------------------------+--------------
 Elena, Damon, Edward | USA
 Bella, Hannah | Canada
 Radhika, Dev, Divanshi, Bhavya | India
(3 rows)

Advantages

  • Using the ORDER BY clause, we may govern the order of the result.
  • The PostgreSQL STRING_AGG() function can concatenate all strings and insert a separator or delimiter between them.
  • The PostgreSQL STRING_AGG() function does not add delimiter symbols or separators to the end of the string and supports a variety of delimiter symbols and separators.
  • This approach is appropriate when we must aggregate all data labels into an array format to obtain fewer rows from a denormalized database (or query result).
  • Because of the enhanced query performance, massive data from object-based schemas and other pre-joined tables may be analyzed more quickly and effectively.

Conclusion

After reading this article, we trust that you now understand the working & usage of the PostgreSQL STRING_AGG() function. We have also explained it through several examples to fully comprehend the PostgreSQL STRING_AGG() function.

This article has taught us to create comma-separated values with one or more columns using the STRING_AGG() function. The names of chosen columns can also be specified using the AS keyword. You will be prepared to employ PostgreSQL STRING_AGG() function in your own PostgreSQL database with the help of this tutorial.

To continue your learning journey, you may check out these amazing articles at Hevo:–

Copying data into a warehouse using ETL for Data Analysis may be time-consuming if you frequently utilize PostgreSQL.

Hevo, No-Code Data Pipeline, is at your disposal to rescue you. You can save your engineering bandwidth by establishing a Data Pipeline and start to replicate your data from PostgreSQL to the desired warehouse using Hevo within minutes.

VISIT OUR WEBSITE TO EXPLORE HEVO

Want to give Hevo a try? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand. You may also look at the amazing price, which will assist you while selecting the best plan for your requirements.

Share your experience understanding the PostgreSQL STRING_AGG() function in the comments below! We would love to hear your thoughts.

Try Hevo’s No-Code Automated Data Pipeline For PostgreSQL