If you have been scouting for the right article to learn about Regex Postgres, you have come to the right place. When there is no requirement for an exact match in the query, but you still want to see all the entries that fit the criteria, pattern matching can be helpful. The LIKE operator works flawlessly for standard SQL operations, but it appears to have some performance concerns when filtering a vast database.
Additionally, the LIKE operator’s filtering condition is restricted to using only wildcards (percent) to discover patterns. PostgreSQL employs Regular Expressions to get around pattern matching. In this article, we will learn about PostgreSQL Regex. PostgreSQL uses POSIX regular expressions, which are better than LIKE and SIMILAR TO operators used for pattern matching.
PostgreSQL Regex Match Operators
Operator | Description |
~ | This is used to match Regular Expression and it is CASE Sensitive. |
~* | This is also used to match Regular Expression and it is CASE Insensitive. |
!~ | This is used to filter the unmatched Regular Expression and it is CASE Sensitive. |
!~* | This is also used to filter the unmatched Regular Expression and it is CASE Insensitive. |
How does Regex PostgreSQL work?
Sample Table & Use Cases
We will cover the following use cases to understand the working of PostgreSQL Regex.
Ditch the manual process of writing long commands to connect your PostgreSQL and choose Hevo’s no-code platform to streamline your data migration.
With Hevo:
- Easily migrate different data types like CSV, JSON etc.
- 150+ connectors like PostgreSQL and Google Sheets(including 60+ free sources).
- Eliminate the need of manual schema mapping with the auto-mapping feature.
Experience Hevo and see why 2000+ data professionals including customers, such as Thoughtspot, Postman, and many more, have rated us 4.3/5 on G2.
Get Started with Hevo for Free
Before we deep dive to learn about the working of Regex, let us create our sample table using the following lines of code.
CREATE TABLE
CREATE TABLE Email(
ID SERIAL PRIMARY KEY,
Name TEXT NOT NULL,
Email TEXT);
CREATE TABLE Email (...)
: Creates a new table named Email
.
ID SERIAL PRIMARY KEY
: Defines an auto-incrementing ID
column as the primary key, ensuring each row has a unique identifier.
Name TEXT NOT NULL
: Defines a Name
column to store text data, which cannot be NULL
.
Email TEXT
: Defines an Email
column to store text data (email addresses), with no specific constraint (can be NULL
).
INSERT VALUES INTO THE TABLE
INSERT INTO Email(Name,Email) VALUES('Pratibha','pratibha67@sqlguide.edu');
INSERT INTO Email(Name,Email) VALUES('Bhavya','bhavyaa@pythonguide.com');
INSERT INTO Email(Name,Email) VALUES('Disha','disha@sqlguide.edu');
INSERT INTO Email(Name,Email) VALUES('Divanshi','divanshi23@spguide.com');
INSERT INTO Email(Name,Email) VALUES('Srishti','srishti@sqlguide.edu');
INSERT INTO Email(Name,Email) VALUES('Kartik','kartik@pythonguide.com');
INSERT INTO Email(Name,Email) VALUES('Rytham','rytham6@sqlguide.edu');
INSERT INTO Email(Name,Email) VALUES('Madhav','madhav@tsinfo.edu');
INSERT INTO Email(Name,Email) VALUES('Tanisha','tanisha@pythonguide.com');
INSERT INTO Email(Name,Email) VALUES('Radhika','radhika41@spguide.com');
- Each
INSERT INTO Email(Name, Email)
adds a new record into the Email
table.
Name
: The name of the person (e.g., Pratibha
, Bhavya
, etc.).
Email
: The corresponding email address for each person.
- Each of these statements inserts a row with a name and email address, like:
('Pratibha', 'pratibha67@sqlguide.edu')
('Bhavya', 'bhavyaa@pythonguide.com')
- … and so on for the other rows.
OUTPUT
SELECT *
FROM Email;
—--OUTPUT—--
id | name | email
----+----------+-------------------------
1 | Pratibha | pratibha67@sqlguide.edu
2 | Bhavya | bhavyaa@pythonguide.com
3 | Disha | disha@sqlguide.edu
4 | Divanshi | divanshi23@spguide.com
5 | Srishti | srishti@sqlguide.edu
6 | Kartik | kartik@pythonguide.com
7 | Rytham | rytham6@sqlguide.edu
8 | Madhav | madhav@tsinfo.edu
9 | Tanisha | tanisha@pythonguide.com
10 | Radhika | radhika41@spguide.com
(10 rows)
Let us understand the syntax & basic working of PostgreSQL Regex by the following example:
SELECT *
FROM Email
WHERE Email ~ '^.*$';
—--OUTPUT—--
id | name | email
----+----------+-------------------------
1 | Pratibha | pratibha67@sqlguide.edu
2 | Bhavya | bhavyaa@pythonguide.com
3 | Disha | disha@sqlguide.edu
4 | Divanshi | divanshi23@spguide.com
5 | Srishti | srishti@sqlguide.edu
6 | Kartik | kartik@pythonguide.com
7 | Rytham | rytham6@sqlguide.edu
8 | Madhav | madhav@tsinfo.edu
9 | Tanisha | tanisha@pythonguide.com
10 | Radhika | radhika41@spguide.com
(10 rows)
As shown above, we have used PostgreSQL Regex using the TILDE (~) operator and the wildcard ‘.*’. This query will select all the records from the Email table with a valid email. Since the pattern condition is only the wildcard, it will fetch all the records from the table. Also, another essential point to note while writing PostgreSQL Regular expressions is that the pattern matching statement always starts with a ‘^’ operator and ends with a ‘$’ sign. These two operators mark the regular expression statement’s beginning and end. Summing up the figure, the expression can be implemented as “^.*$” while using wildcards to filter all records.
Integrate PostgreSQL to Redshift
Integrate PostgreSQL to BigQuery
Integrate Amazon Ads to PostgreSQL
Match Regular Expression (Case Sensitive)
SELECT *
FROM Email
WHERE Email ~ 'sqlguide';
—--OUTPUT—--
id | name | email
----+----------+-------------------------
1 | Pratibha | pratibha67@sqlguide.edu
3 | Disha | disha@sqlguide.edu
5 | Srishti | srishti@sqlguide.edu
7 | Rytham | rytham6@sqlguide.edu
(4 rows)
In this example, we matched the Email field with the regular expression “sqlguide” using the match operator “~ ” and obtained the results with all the rows matching the pattern.
Match Regular Expression (Case Insensitive)
SELECT *
FROM Email
WHERE Email ~* 'SQLGUIDE';
—--OUTPUT—--
id | name | email
----+----------+-------------------------
1 | Pratibha | pratibha67@sqlguide.edu
3 | Disha | disha@sqlguide.edu
5 | Srishti | srishti@sqlguide.edu
7 | Rytham | rytham6@sqlguide.edu
(4 rows)
In this example, we used the match operator ” ~* ” to match the regular expression “SQLGUIDE” with the Email column. As a result, all the rows containing the pattern “SQLGUIDE” with CASE Insensitivity are returned.
Does not Match Regular Expression (Case Sensitive)
SELECT *
FROM Email
WHERE Email !~ 'sqlguide';
—--OUTPUT—--
id | name | email
----+----------+-------------------------
2 | Bhavya | bhavyaa@pythonguide.com
4 | Divanshi | divanshi23@spguide.com
6 | Kartik | kartik@pythonguide.com
8 | Madhav | madhav@tsinfo.edu
9 | Tanisha | tanisha@pythonguide.com
10 | Radhika | radhika41@spguide.com
(6 rows)
The regular expression “sqlguide” was matched with the Email field in this example using the match operator ”!~ ” The results showed that all the rows lacked the pattern.
Does not Match Regular Expression (Case Insensitive)
SELECT *
FROM Email
WHERE Email !~* 'sqlguide';
—--OUTPUT—--
id | name | email
----+----------+-------------------------
2 | Bhavya | bhavyaa@pythonguide.com
4 | Divanshi | divanshi23@spguide.com
6 | Kartik | kartik@pythonguide.com
8 | Madhav | madhav@tsinfo.edu
9 | Tanisha | tanisha@pythonguide.com
10 | Radhika | radhika41@spguide.com
(6 rows)
In this illustration, we used the match operator ” ~* ” to match the regular expression “sqlguide” with the Email column. As a result, all the rows that did not contain the pattern “sqlguide” with CASE Insensitivity were obtained.
Matches the beginning of the string
SELECT *
FROM Email
WHERE Email ~ '^d';
—--OUTPUT—--
id | name | email
----+----------+------------------------
3 | Disha | disha@sqlguide.edu
4 | Divanshi | divanshi23@spguide.com
(2 rows)
For example, if we wanted to search for names that begin with the letter “d,” we would use the ” ^ ” symbol, which indicates the beginning/start of the regex pattern.
Matches the End of the String
SELECT *
FROM Email
WHERE Email ~ 'com$';
—--OUTPUT—--
id | name | email
----+----------+-------------------------
2 | Bhavya | bhavyaa@pythonguide.com
4 | Divanshi | divanshi23@spguide.com
6 | Kartik | kartik@pythonguide.com
9 | Tanisha | tanisha@pythonguide.com
10 | Radhika | radhika41@spguide.com
(5 rows)
In this example, the “$” sign, which indicates the end of the regex pattern, was used in this case to indicate that we were searching for email addresses that ended in “com.”
Matches Numeric-type Data
SELECT *
FROM Email
WHERE Email ~ '[0-9]';
—--OUTPUT—--
id | name | email
----+----------+-------------------------
1 | Pratibha | pratibha67@sqlguide.edu
4 | Divanshi | divanshi23@spguide.com
7 | Rytham | rytham6@sqlguide.edu
10 | Radhika | radhika41@spguide.com
(4 rows)
Using the “[0-9]” regular expression, we have shown the email addresses in this example with at least one number.
Migrate Data seamlessly Within Minutes!
No credit card required
Matches Numeric-type Data (Double Digit)
SELECT *
FROM Email
WHERE Email ~ '[0-9][0-9]';
—--OUTPUT—--
id | name | email
----+----------+-------------------------
1 | Pratibha | pratibha67@sqlguide.edu
4 | Divanshi | divanshi23@spguide.com
10 | Radhika | radhika41@spguide.com
(3 rows)
Using the regular expression “[0-9][0-9],” we have shown the results for those email addresses that include at least two numeric characters in this example.
PostgreSQL Regex Functions & Usage
We will discuss the following functions along with PostgreSQL regex example in the following sections. Let’s begin!
Postgres regexp_replace
To replace an occurrence of a numeric data type with a symbol, let’s utilize the REGEXP_REPLACE() function:
SELECT REGEXP_REPLACE(Email,'[0-9]','*')
From Email;
—--OUTPUT—--
regexp_replace
-------------------------
pratibha*7@sqlguide.edu
bhavyaa@pythonguide.com
disha@sqlguide.edu
divanshi*3@spguide.com
srishti@sqlguide.edu
kartik@pythonguide.com
rytham*@sqlguide.edu
madhav@tsinfo.edu
tanisha@pythonguide.com
radhika*1@spguide.com
(10 rows)
The output makes it clear that the REGEXP_REPLACE() function inserts a ” * ” symbol in place of any numeric data type that appears in the ” Email ” field. This function replaces the first instance of the numeric data type, as seen.
Postgres regexp_replace All Occurrences
SELECT REGEXP_REPLACE(Email,'[0-9]','*','g')
From Email;
—--OUTPUT—--
regexp_replace
-------------------------
pratibha**@sqlguide.edu
bhavyaa@pythonguide.com
disha@sqlguide.edu
divanshi**@spguide.com
srishti@sqlguide.edu
kartik@pythonguide.com
rytham*@sqlguide.edu
madhav@tsinfo.edu
tanisha@pythonguide.com
radhika**@spguide.com
(10 rows)
The output clearly shows that the REGEXP_REPLACE() function is used to replace every instance of the ” Numeric ” type data with the symbol ” * “. We provided a flag “g” that searches for every instance of the specified pattern.
Postgres regexp_matches in WHERE Clause
When a Regex is run against a string, the REGEXP_MATCHES() function compares the two and returns the string that matches the pattern as a set. We’ll look at REGEXP_MATCHES() function example for a better understanding.
SELECT REGEXP_MATCHES(Email,'.+@(.*)$')
FROM Email;
—--OUTPUT—--
regexp_matches
-------------------
{sqlguide.edu}
{pythonguide.com}
{sqlguide.edu}
{spguide.com}
{sqlguide.edu}
{pythonguide.com}
{sqlguide.edu}
{tsinfo.edu}
{pythonguide.com}
{spguide.com}
(10 rows)
In this case, we searched for the email address domain name using the REGEXP_MATCHES() function. The REGEXP_MATCHES() function returned the outcomes as a set.
Postgres Regex Substring
Let us understand this better in the following example. It extracts substrings from a given field using the PostgreSQL Regex. For instance, let us extract some substring from the Email table using the SUBSTRING() function to match it against the Regular expression.
SELECT DISTINCT SUBSTRING(Email FROM '.+@(.*)$')
FROM Email;
—--OUTPUT—--
substring
-----------------
sqlguide.edu
spguide.com
pythonguide.com
tsinfo.edu
(4 rows)
Here we’ve used the SUBSTRING() function to extract the Domain name of the Email address from the table Email using the DISTINCT keyword, which only displays the distinct names.
Postgres Regex Numbers Only
Here we will use the REGEXP_REPLACE() function to extract only the numbers from a string in PostgreSQL.
Let us look at an example to extract numeric data types’ occurrences.
SELECT REGEXP_REPLACE(Email,'\D','','g')
FROM Email;
—--OUTPUT—--
regexp_replace
----------------
67
23
6
41
(10 rows)
From the output obtained, it can be seen that we have extracted only the numeric type data from the field Email in PostgreSQL using the REGEXP_REPLACE() function. Here “\D” is class shorthand for “Not a Digit.”
Postgres Regex Split
In PostgreSQL, the SPLIT_PART() function can split a string into many parts. To divide a string into several pieces, we must pass the String, the Delimiter, and the Filed Number. The PostgreSQL SPLIT_PART() function’s syntax is as follows:
SPLIT_PART(STRING,DELIMITER,FIELD_NUMBER)
Case#1
SELECT SPLIT_PART(Email,'@',1)
FROM Email;
—--OUTPUT—--
split_part
------------
pratibha67
bhavyaa
disha
divanshi23
srishti
kartik
rytham6
madhav
tanisha
radhika41
(10 rows)
Case#2
SELECT SPLIT_PART(Email,'@',2)
FROM Email;
—--OUTPUT—--
split_part
-----------------
sqlguide.edu
pythonguide.com
sqlguide.edu
spguide.com
sqlguide.edu
pythonguide.com
sqlguide.edu
tsinfo.edu
pythonguide.com
spguide.com
(10 rows)
The Email table’s Email field was divided into two pieces in this example using the ” @ ” delimiter and the SPLIT_PART() function. It returns the text before the delimiter when we supply the FIELD NUMBER as “1,” in Case1, and the text after the delimiter when we pass the FIELD NUMBER as “2”, in Case2.
Postgres Regex Remove Special Characters
Remove Special Characters utilizing PostgreSQL Regex. Using the REGEXP_REPLACE() function, all Special Characters from a supplied text can be eliminated. We’ll examine an example where we utilize PostgreSQL’s REGEXP_REPLACE() function to eliminate every instance of a Special Character from a table’s column.
SELECT REGEXP_REPLACE(Email, '[^\w]+','','g')
FROM Email;
—--OUTPUT—--
regexp_replace
-----------------------
pratibha67sqlguideedu
bhavyaapythonguidecom
dishasqlguideedu
divanshi23spguidecom
srishtisqlguideedu
kartikpythonguidecom
rytham6sqlguideedu
madhavtsinfoedu
tanishapythonguidecom
radhika41spguidecom
(10 rows)
This example shows that the REGEXP_REPLACE() has been used to eliminate any instances of special characters.
Advantages of PostgreSQL Regex
The advantages of working with PostgreSQL Regex are listed below:
- PostgreSQL Regex functions support numerous flags. For instance: REGEXP_MATCHES() supports
- flag ‘i’: match case-insensitively.
- Flag ‘g’: search globally for each occurrence.
- The PostgreSQL Regex function returns no row, one row, or multiple rows per defined pattern.
- The PostgreSQL Regex function may return zero, one, or several rows depending on the pattern specified.
- The PostgreSQL Regex functions allow us to search for any word in the needed string at any position, or we can search for simply the first occurrence.
- We may also extract a table’s column values using PostgreSQL Regex.
- The PostgreSQL Regex functions can be used for validity purposes.
- It allows us to perform challenging operations like string searches using random characters.
- Compared to the complex mix of case statements and substrings, PostgreSQL Regex is more versatile and performs better.
Before we wrap up, let’s learn about the operators in Pattern Matching.
Functions and Operators in Pattern Matching
LIKE, SIMILAR TO Regular Expressions, and POSIX Regular Expressions are the main three approaches in pattern matching.
1. LIKE
string LIKE pattern [ESCAPE escape-character]
string NOT LIKE pattern [ESCAPE escape-character]
- The LIKE expression returns true if the string matches the given pattern. False will be returned by the NOT LIKE expression when LIKE returns true. This is true reversely as well. NOT (string LIKE pattern) is an equivalent expression.
- If the pattern does not contain percent signs or underscores, then the pattern stands for the string itself; in that case, LIKE would be the equals operator. An underscore (_) in pattern represents any single character; % matches any pattern of zero or more characters.
Some examples:
'abc' LIKE 'abc' true
'abc' LIKE 'a%' true
'abc' LIKE '_b_' true
'abc' LIKE 'c' false
2. SIMILAR TO Regular Expressions
string SIMILAR TO pattern [ESCAPE escape-character]
string NOT SIMILAR TO pattern [ESCAPE escape-character]
- True or false returned by the SIMILAR TO operator is based on whether its pattern matches the provided string. It differs from LIKE in such a way that it reads the pattern using the SQL standard’s definition of a regular expression. SQL regular expressions are a cross between LIKE and common (POSIX) regular expression notation.
- SIMILAR TO operator would be helpful only if its pattern is matching with the full string. In common regular expression behavior, the pattern need to match only a part of the string.
- _ and % as wildcard characters are used by SIMILAR TO that notes any single character and any string, in the respective order. These are similar to . and .* in POSIX regular expressions.
'abc' SIMILAR TO 'abc' true
'abc' SIMILAR TO 'a' false
'abc' SIMILAR TO '%(b|d)%' true
'abc' SIMILAR TO '(b|c)%' false
'-abc-' SIMILAR TO '%\mabc\M%' true
'xabcy' SIMILAR TO '%\mabc\M%' false
3. POSIX Regular Expressions
- These type of operators are powerful compared to the ones above. The abbreviated definition of a regular set of strings is called a regular expression. If a string belongs to the regular set defined by the regular expression, that matches with a regular expression.
- Regular expressions use various special characters compared to LIKE. A regular expression can match anywhere within a string, if the regular expression is not precisely anchored to the starting or end of the string.
Some examples:
'abcd' ~ 'bc' true
'abcd' ~ 'a.c' true — dot matches any character
'abcd' ~ 'a.*d' true — * repeats the preceding pattern item
'abcd' ~ '(b|x)' true — | means OR, parentheses group
'abcd' ~ '^a' true — ^ anchors to start of string
'abcd' ~ '^(b|c)' false — would match except for anchoring
Conclusion
In this tutorial, we went in-depth on how to use regular expressions to write SQL statements in PostgreSQL. We can create dynamic SQL queries that use regular expressions to match patterns from a database column. The SQL LIKE operator in PostgreSQL can also match patterns, but the searches are more restricted. Greater flexibility and the ability to dynamically alter the size of the pattern to be matched are provided by regular expressions.
Want to know more? Check out these fantastic Hevo articles on BigQuery Regex and Pattern Matching: 6 Major Things and Understanding Amazon Redshift Regex: Simplified 101.
Copying data into a warehouse using ETL for Data Analysis may be time-consuming if you frequently utilize PostgreSQL. Hevo is the only real-time ELT No-code Data Pipeline platform that cost-effectively automates data pipelines that are flexible to your needs. With integration with 150+ Data Sources (40+ free sources), we help you not only export data from sources & load data to the destinations but also transform & enrich your data, & make it analysis-ready.
Want to give Hevo a try? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand. You may also look at the amazing price, which will assist you while selecting the best plan for your requirements.
FAQ on PostgreSQL Regex and Pattern Matching
1. Can I use regex in PostgreSQL?
Yes, PostgreSQL supports regular expressions for pattern matching using operators like ~ for case-sensitive matching and ~* for case-insensitive matching.
2. How to match patterns in PostgreSQL?
Match patterns in PostgreSQL using the ~ (case-sensitive) or ~* (case-insensitive) operators. Example: SELECT * FROM table WHERE column ~ ‘pattern’;.
3. What is the equivalent of regexp_like in PostgreSQL?
The equivalent in PostgreSQL is using the ~ operator. Example: SELECT * FROM table WHERE column ~ ‘pattern’;.
4. Can SQL use regex?
Many SQL databases, including PostgreSQL and MySQL, support regex for pattern matching in queries, allowing complex string matching and manipulation.
5. Which is faster, Postgres or MySQL?
Performance varies by use case. PostgreSQL excels in complex queries, data integrity, and concurrency, while MySQL is often faster for read-heavy operations and simpler queries.
Pratibha is a seasoned Marketing Analyst with a strong background in marketing research and a passion for data science. She excels in crafting in-depth articles within the data industry, leveraging her expertise to produce insightful and valuable content. Pratibha has curated technical content on various topics, including data integration and infrastructure, showcasing her ability to distill complex concepts into accessible, engaging narratives.