The snowflake substring function is a must-have in your arsenal. In this tutorial, you will learn how to use the Snowflake substring or substr function, the different arguments that can be used with it, and also some examples which will help you understand it better. First, let’s take a look at the Snowflake data warehouse.
Table of Contents
Snowflake – An Overview
Snowflake is a cloud-hosted data warehouse platform that allows you to store, share and analyse your data. It has a multi-cluster architecture that provides support for different clouds and also lets you mix and match between these different cloud platforms. It is available on a variety of platforms including AWS, Azure as well as Google Cloud Platform. Its in-built performance eliminates the need to manage any infrastructure. It is also economical and feasible due to its pay-for-what-you-use strategy.
Snowflake enables easy integration with customized and packaged tools and applications. Snowflake string is one of the most important and extensively used data types in Snowflake. You can perform different operations on this data type. These include contains, replace, position and many more.
In this blog post, you will be looking at the Snowflake substring function.
What is Snowflake Substring Function?
This function can be used with either a VARCHAR or BINARY data type. It will return the part of the string or binary value specified by base_expr, starting from the character/byte determined by start_expr. The length of the substring can also be limited by using another argument.
The substr and substring functions are synonymous and can be used interchangeably in Snowflake.
Syntax of Snowflake Substring
Given below is the syntax for using Snowflake substring function:
SUBSTR( <base_expr>, <start_expr> [ , <length_expr> ] )
SUBSTRING( <base_expr>, <start_expr> [ , <length_expr> ] )
Where,
base_expr: This is the value from which you want to derive your substring. It must be either a VARCHAR or a BINARY value.
start_expr: This argument determines the starting point for the substring. You can either directly use an integer or an expression that evaluates to an integer. The start_expr help you specify the offset from which the substring starts. This offset is measured as follows:
- In terms of the number of UTF-8 characters if the base_expr is of VARCHAR type.
- In terms of the number of bytes if the base_expr is of BINARY type.
NOTE: The start position for the substr function starts from 1, not 0. SUBSTR(‘xyz’, 1, 1) returns ‘x’, not ‘y’.
length_expr: This is an optional argument. It is used to determine the number of characters/bytes after the start_expr you want in the substring. It can either be an integer value or an expression that evaluates to an integer. It specifies the following:
- The number of UTF-8 characters to return if the input is of VARCHAR type.
- The number of bytes to return if the input is of BINARY type.
The length must be greater than or equal to zero. If the length is negative, the function returns an empty string.
return: The data type of the return value is the same as the input data type.
How to use Snowflake Substring Function?
The SUBSTRING () function returns the position of a string or binary value from the complete string, starting with the character specified by substring_start_index. If any input is null, null is returned.
Examples for Snowflake Substring
Example 1: Get the substring from a specific string in Snowflake
select substr('testing 1 2 3 4', 9, 7);
+---------------------------------+
| SUBSTR('TESTING 1 2 3 4', 9, 7) |
|---------------------------------|
| 1 2 3 4 |
+---------------------------------+
Example 2: Get the substring from a specific string by using table data
select ID,name,substring(name,0,2) from emp;
+----+-------------+---------------------+
| ID | NAME | SUBSTRING(NAME,0,2) |
|----+-------------+---------------------|
| 1 | Ram | Ra |
| 2 | RAJU | RA |
| 3 | Sameer | Sa |
| 4 | Sai Kumar | Sa |
| 5 | Naveen Sai | Na |
| 6 | Seshi Kumar | Se |
| 7 | Rajiv Ram | Ra |
+----+-------------+---------------------+
Example 3: Full Sample of Snowflake Substring function
-- Create a Table
create or replace TABLE EMP (
ID NUMBER(38,0),
NAME VARCHAR(30) COMMENT 'NAME OF THE EMPLOYEE',
INFO VARCHAR(20) DEFAULT 'NOT-NULL'
);
-- Insert Data
insert into emp(id,name) values
(1,'Ram'),
(2,'RAJU'),
(3,'Sameer'),
(4,'Sai Kumar'),
(5,'Naveen Sai'),
(6,'Seshi Kumar'),
(7,'Rajiv Ram'),
(8,NULL)
-- Apply substring function
select id,name,substring(name,0,4) from emp;
-- Output
+----+-------------+---------------------+
| ID | NAME | SUBSTRING(NAME,0,4) |
|----+-------------+---------------------|
| 1 | Ram | Ram |
| 2 | RAJU | RAJU |
| 3 | Sameer | Same |
| 4 | Sai Kumar | Sai |
| 5 | Naveen Sai | Nave |
| 6 | Seshi Kumar | Sesh |
| 7 | Rajiv Ram | Raji |
| 8 | NULL | NULL |
+----+-------------+---------------------+
When you should use the SUBSTRING Function in Snowflake?
There are certain usage scenarios where it is recommended to use the Substring() function within Snowflake Cloud Data Warehouse which are:
- To get a specific substring from an expression or string. You can also use the substring function if you want to get the substrings in reverse order from the strings. For example:
select id,name,substring(name,-2,2) from emp;
+----+-------------+----------------------+
| ID | NAME | SUBSTRING(NAME,-2,2) |
|----+-------------+----------------------|
| 1 | Ram | am |
| 2 | RAJU | JU |
| 3 | Sameer | er |
| 4 | Sai Kumar | ar |
| 5 | Naveen Sai | ai |
| 6 | Seshi Kumar | ar |
| 7 | Rajiv Ram | am |
| 8 | NULL | NULL |
+----+-------------+----------------------+
If you use the substrings in reverse order, use the starting index as a negative value.
Collation Details
- Matching is applied to VARCHAR inputs. No collation is applied if the input data type of the first parameter is BINARY.
- No effect. Matches are syntactically acceptable, but matches do not affect the process. For example, a language that contains two or three digits (such as “dzs” in Hungarian or “ch” in Czech) counts these as two or three characters (not one character) as length arguments.
- Sorting the results corresponds to sorting the inputs. This is useful when you want to pass the return value to another function as part of a nested function call.
Conclusion
In this blog post, you have seen the different benefits that Snowflake provides. You have also learned how to use the Snowflake substring function.
An easier solution for all your data handling needs can be to use an official Snowflake partner like Hevo. Hevo is a fully-managed cloud platform that brings data from multiple sources to Snowflake easily in real-time.
Hevo Data, a No-Code Data Pipeline helps you transfer data from multiple sources to Snowflake. Hevo is fully-managed and completely automates the process of not only exporting data from your desired source but also enriching the data and transforming it into an analysis-ready form without having to write a single line of code. Its fault-tolerant architecture ensures that the data is handled in a secure, consistent manner with zero data loss.
Check out some amazing features of Hevo (Official Snowflake ETL Partner):
- Completely Managed Platform: Hevo is fully managed. You need not invest time and effort to maintain or monitor the infrastructure involved in executing codes.
- Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular point in time.
- Data Transformation: It provides a simple interface to perfect, modify, and enrich the data you want to export.
- Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
- Schema Management: Hevo takes away the tedious task of Schema management & automatically detects the Schema of incoming data and maps it to the destination Schema.
- Minimal Learning: Hevo with its simple and interactive UI, is extremely simple for new customers to work on and perform operations.
Get started with Hevo today! Sign up here for a 14-day free trial!
Try Hevo today by signing up for a 14-day free trial.
Share your experience of learning about Snowflake Substring! Let us know in the comments section below!