In today’s digital era, massive data sets are available to be consumed by organizations for analysis. The vast amount of data available seems challenging to handle, which is why data aggregation has become more critical than ever. With reduced dimensionality, aggregate data can offer a feature-rich data collection.
There are numerous built-in aggregate functions in PostgreSQL which includes sum(), min(), max(),count(), etc. Employing aggregate functions enables us to generate output from a set of values.
The PostgreSQL STRING_AGG() aggregate function takes input data and creates strings by concatenating the values. In this article, we’ll deep dive into learning about the working & usability of PostgreSQL STRING_AGG(). Let us begin!
PostgreSQL STRING_AGG()
The PostgreSQL STRING_AGG() function is one of the aggregate functions that PostgreSQL supports. It is used to concatenate a list of strings and adds a place for a delimiter symbol or a separator between all of the strings. The output string won’t have a separator or a delimiter symbol at the end of it. The PostgreSQL 9.0 version supports STRING_AGG() function. To concatenate the strings, we can employ a variety of separators or delimiter symbols.
The workings of the STRING_AGG() and ARRAY_AGG() functions are identical; however, STRING_AGG() is an aggregate function that returns results in string type, whereas ARRAY_AGG() delivers results in array type.
Syntax
The syntax for the STRING_AGG() function is as follows:
STRING_AGG ( expression, separator|delimiter [order_by] )
Explanation:
- Expression: This character string can be any legitimate expression.
- Delimiter/separator: This specifies the delimiter/separator used when concatenating strings.
- The ORDER BY clause: specifies the order of the concatenated string results and is optional.
The syntax for the ORDER BY is as follows:
ORDER BY expression1 {ASC | DESC}, [...]
Note: PostgreSQL does not presumptively use a default delimiter. It must be mentioned as the second argument. Also, remember that string_agg() requires the usage of the group by clause, much like other aggregate functions.
PostgreSQL STRING_AGG(): Use Cases
To elaborate on the use cases of the STRING_AGG ( ) function, we will create a sample table on which we will run queries. The PostgreSQL code for creating a sample table is mentioned below:
Create Table:
CREATE TABLE players (
player_name TEXT ,
team_name TEXT ,
player_positon TEXT
) ;
With the above syntax, a table called “players” will be created, with the columns as player_name, team_name, and player_position.
Note: Run the following SELECT query to verify that the table is created with the desired columns. SELECT * FROM “players” ;
Insert Values into the Table:
Let’s use the INSERT INTO command to add some values to the “players” table now:
INSERT INTO "players"
VALUES ( 'Virat', 'India', 'Batsman' ), ( 'Rohit', 'India', 'Batsman' ) , ( 'Jasprit', 'India', 'Bowler' );
INSERT INTO "players"
VALUES ( 'Chris', 'West Indies', 'Batsman' ), ( 'Shannon', 'West Indies', 'Bowler'), ('Bravo', 'West Indies', 'Batsman');
INSERT INTO "players"
VALUES ( 'James', 'New Zealand', 'All rounder' );
SELECT * FROM "players" ;
—OUTPUT—
player_name | team_name | player_positon
-------------+-------------+----------------
Virat | India | Batsman
Rohit | India | Batsman
Jasprit | India | Bowler
Chris | West Indies | Batsman
Shannon | West Indies | Bowler
Bravo | West Indies | Batsman
James | New Zealand | All rounder
(7 rows)
The output clearly shows that the values entered using the INSERT INTO command have been successfully inserted into the sample table.
Use Case#1: Using PostgreSQL STRING_AGG() create Comma-Separated Values
We will use the STRING_AGG() function to produce a list of values separated by commas. The syntax to create comma-separated values is as follows:
SELECT "team_name",string_agg("player_name", ',' )
FROM "players" GROUP BY "team_name" ;
—OUTPUT—
team_name | string_agg
-------------+---------------------
West Indies | Chris,Shannon,Bravo
India | Virat,Rohit,Jasprit
New Zealand | James
(3 rows)
The “player_name” column in the SELECT query is separated by commas and displayed alongside the “team_name” as seen in the output obtained. The rows are divided according to the field “team_name” using the GROUP BY command. The expression that needs to be separated is defined in the first parameter of the STRING_AGG() function, and the values are separated in the second parameter by the comma character “,”.
Use Case#2: Using PostgreSQL STRING_AGG() to separate multiple columns with commas
Multiple STRING_AGG() functions can be used in a single SELECT statement. The demonstration of that is provided in the example below.
SELECT "team_name",string_agg("player_name",', '), string_agg("player_positon",',')
FROM "players" GROUP BY "team_name";
—OUTPUT—
team_name | string_agg | string_agg
-------------+-----------------------+------------------------
West Indies | Chris, Shannon, Bravo | Batsman,Bowler,Batsman
India | Virat, Rohit, Jasprit | Batsman,Batsman,Bowler
New Zealand | James | All rounder
(3 rows)
Here, we’ve combined the values of columns separated by commas using two STRING_AGG() functions in two separate columns. The output obtained makes it quite evident that the list was produced with two-column values now separated by commas.
But as seen in the output, the column names are not defined and are set as “string_agg.” To specify the column names, we can use AS keyword in the following manner:
SELECT "team_name",string_agg ("player_name", ',') AS players_name,
string_agg ("player_positon", ',') AS players_positions
FROM "players" GROUP BY "team_name";
—OUTPUT—
team_name | players_name | players_positions
-------------+---------------------+------------------------
West Indies | Chris,Shannon,Bravo | Batsman,Bowler,Batsman
India | Virat,Rohit,Jasprit | Batsman,Batsman,Bowler
New Zealand | James | All rounder
(3 rows)
Use Case#3: Removing duplicates in our output string
In our sample table, there are multiple players with player_position as “batsman”. We have to use the DISTINCT keyword to omit duplicate values in the first parameter.
SELECT "team_name",string_agg(DISTINCT "player_name",', '), string_agg( DISTINCT "player_positon",',')
FROM "players" GROUP BY "team_name";
—OUTPUT—
team_name | string_agg | string_agg
-------------+-----------------------+----------------
India | Jasprit, Rohit, Virat | Batsman,Bowler
New Zealand | James | All rounder
West Indies | Bravo, Chris, Shannon | Batsman,Bowler
(3 rows)
Use Case#4: Ordering the contents within the output string
To obtain the output of the PostgreSQL STRING_AGG() function in an ordered manner(alphabetically), we can use the following command:
SELECT "team_name",string_agg ("player_name", ',' ORDER BY "player_name" asc) AS players_name,
string_agg ("player_positon", ',' ORDER BY "player_positon" asc) AS players_positions
FROM "players" GROUP BY "team_name";
—OUTPUT—
team_name | players_name | players_positions
-------------+---------------------+------------------------
India | Jasprit,Rohit,Virat | Batsman,Batsman,Bowler
New Zealand | James | All rounder
West Indies | Bravo,Chris,Shannon | Batsman,Batsman,Bowler
(3 rows)
How does PostgreSQL STRING_AGG() function works?
- Character string data types should be used as the input expression. Other data types are likewise acceptable if we explicitly convert them to the character string data type before using them.
- We receive the outcome in string type via PostgreSQL STRING_AGG().
- Similar to how we use other PostgreSQL aggregate methods like MIN(), MAX(), AVG(), SUM(), and COUNT(), the STRING_AGG() is typically used with the GROUP BY clause.
Practical Example to Implement PostgreSQL STRING_AGG() function
The CREATE TABLE statement will be used to create the tables “student” and “course” as follows:
CREATE A STUDENT TABLE:
create table student
(
stud_id serial PRIMARY KEY,
stud_name VARCHAR(80) NOT NULL,
stud_grade CHAR(1) NOT NULL,
stud_country VARCHAR(80) NOT NULL,
course_id int NOT NULL
);
CREATE A COURSE TABLE:
create table course
(
course_id serial PRIMARY KEY,
course_name VARCHAR(80) NOT NULL
);
INSERT VALUES INTO THE COURSE TABLE:
INSERT INTO course(course_name)
VALUES
('MCA'),
('BCA'),
('MBA');
select * from course;
—OUTPUT—
course_id | course_name
-----------+-------------
1 | MCA
2 | BCA
3 | MBA
(3 rows)
INSERT VALUES INTO THE STUDENT TABLE:
INSERT INTO student(stud_name,stud_grade,stud_country,course_id)
VALUES
('Radhika','A','India',1),
('Dev','B','India',2),
('Divanshi','C','India',3),
('Bhavya','A','India',1),
('Bella','B','Canada',2),
('Hannah','A','Canada',3),
('Elena','A','USA',1),
('Damon','B','USA',2),
('Edward','C','USA',3);
select * from student;
—OUTPUT—
stud_id | stud_name | stud_grade | stud_country | course_id
---------+-----------+------------+--------------+-----------
1 | Radhika | A | India | 1
2 | Dev | B | India | 2
3 | Divanshi | C | India | 3
4 | Bhavya | A | India | 1
5 | Bella | B | Canada | 2
6 | Hannah | A | Canada | 3
7 | Elena | A | USA | 1
8 | Damon | B | USA | 2
9 | Edward | C | USA | 3
(9 rows)
- Now let us perform a JOIN and in the output obtained, we can see that each row has a separate entry that matches the course with the student’s course.
SELECT c.course_name AS "course name", s.stud_name AS "student name"
FROM course c RIGHT JOIN student s ON c.course_id = s.course_id
ORDER BY 1;
—OUTPUT—
course name | student name
-------------+--------------
BCA | Damon
BCA | Bella
BCA | Dev
MBA | Edward
MBA | Divanshi
MBA | Hannah
MCA | Elena
MCA | Bhavya
MCA | Radhika
(9 rows)
- By utilizing PostgreSQL STRING_AGG() function, we can concatenate the student names along with the course_name as follows:
SELECT
crs.course_name AS "course name",
string_agg(stud.stud_name, ', ') AS "student list"
FROM course crs
JOIN student stud ON crs.course_id = stud.course_id
GROUP BY 1
ORDER BY 1;
—OUTPUT—
course name | student list
-------------+--------------------------
BCA | Damon, Bella, Dev
MBA | Edward, Divanshi, Hannah
MCA | Elena, Bhavya, Radhika
(3 rows)
- Here, we will employ the PostgreSQL STRING_AGG() function to concatenate the student names and the stud_grade. As seen in the output, a list of students with the same grade concatenated by commas is generated.
SELECT stud_grade, STRING_AGG(stud_name,', ') AS StudentsPerGrade
FROM student
GROUP BY stud_grade
ORDER BY 1 ;
—OUTPUT—
stud_grade | studentspergrade
------------+--------------------------------
A | Radhika, Hannah, Elena, Bhavya
B | Bella, Dev, Damon
C | Edward, Divanshi
(3 rows)
- In the example below, a comma separator is used to group and concatenate all students from the same country.
SELECT STRING_AGG(stud_name, ', ') AS "student_names", stud_country
FROM student
GROUP BY stud_country;
—OUTPUT—
student_names | stud_country
--------------------------------+--------------
Elena, Damon, Edward | USA
Bella, Hannah | Canada
Radhika, Dev, Divanshi, Bhavya | India
(3 rows)
Advantages
- Using the ORDER BY clause, we may govern the order of the result.
- The STRING_AGG PostgreSQL function can concatenate all strings and insert a separator or delimiter between them.
- The PostgreSQL STRING_AGG() function does not add delimiter symbols or separators to the end of the string and supports a variety of delimiter symbols and separators.
- This approach is appropriate when we must aggregate all data labels into an array format to obtain fewer rows from a denormalized database (or query result).
- Because of the enhanced query performance, massive data from object-based schemas and other pre-joined tables may be analyzed more quickly and effectively.
Conclusion
After reading this article, we trust that you now understand the working & usage of the PostgreSQL STRING_AGG() function. We have also explained it through several examples to fully comprehend the PostgreSQL STRING_AGG() function.
This article has taught us to create comma-separated values with one or more columns using the STRING_AGG() function. The names of chosen columns can also be specified using the AS keyword. You will be prepared to employ PostgreSQL STRING_AGG() function in your own PostgreSQL database with the help of this tutorial.
To continue your learning journey, you may check out these amazing articles at Hevo:–
Copying data into a warehouse using ETL for Data Analysis may be time-consuming if you frequently utilize PostgreSQL.
Hevo, No-Code Data Pipeline, is at your disposal to rescue you. You can save your engineering bandwidth by establishing a Data Pipeline and start to replicate your data from PostgreSQL to the desired warehouse using Hevo within minutes.
VISIT OUR WEBSITE TO EXPLORE HEVO
Want to give Hevo a try? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand. You may also look at the amazing price, which will assist you while selecting the best plan for your requirements.
Share your experience understanding the PostgreSQL STRING_AGG() function in the comments below! We would love to hear your thoughts.
Pratibha is a seasoned Marketing Analyst with a strong background in marketing research and a passion for data science. She excels in crafting in-depth articles within the data industry, leveraging her expertise to produce insightful and valuable content. Pratibha has curated technical content on various topics, including data integration and infrastructure, showcasing her ability to distill complex concepts into accessible, engaging narratives.