Redshift is a completely managed data warehouse service offered by Amazon Web Services. It is known for its ability to scale seamlessly, support petabytes of data storage, and super-fast querying ability. Redshift is available based on a subscription payment model. All these features make it a great choice for everyone from small and medium businesses to large enterprises. Redshift’s querying layer is based on the PostgreSQL query standard. It supports most functions supported by PostgreSQL and organizations who are already using Redshift for their transactional or warehousing loads generally find it a natural extension of their architecture.
In this blog, you will understand various Redshift date functions in detail with examples code snippets, and their expected outputs.
Table of contents
Introduction to Redshift
Under its hood, Redshift has a massively parallel processing architecture. Redshift clusters can be configured for different kinds of workloads as per the requirement. For example, customers with storage heavy workload and wants to keep the costs down can use Redshift’s dense storage nodes that are made of HDDs. On the other hand, customers with heavy processing requirements can configure it with dense compute instances that are made of SSDs.
Redshift is designed as a collection of instances that are designated as a leader or worker node. Leader nodes handle client communication, query strategy, and assign tasks to the worker nodes. Secondary nodes or worker nodes are responsible for actually executing the tasks. Tasks are usually query processes that deal with the chunk of the data that is handled by the particular worker node.
Beyond the superfast querying abilities and virtually unlimited storage through horizontal scaling, Redshift offers many other features. Redshift spectrum helps one to exploit Redshift querying abilities on data that is not stored Redshift table. It can query data stored in S3 or other storage services. Redshift’s concurrency scaling allows one to scale at times where a higher load is anticipated. An hour of concurrency scaling comes free for every 24 hours a Redshift cluster stays operational.
Redshift Pricing is broadly affected by four factors:
- The node type that the customer chooses to build his cluster.
- The region where the cluster is deployed.
- Billing strategy – on-demand billing or a reserved pricing strategy.
- Use of Redshift Spectrum.
Redshift Date functions
Redshift’s date and time functions help the customers to transform or manipulate data that is stored in the Redshift table with the data type as DATE, TIME, TIMETZ, TIMESTAMP, and TIMESTAMPTZ. Before going to date and time functions, let us learn what each of these data types represents.
DATE: Used for storing date data.
TIME: Used for storing time with hours, minutes, and seconds without explicitly mentioning time zone.
TIMETZ: Used to store time with timezone.
TIMESTAMP: Used to store the date and time of the day without a time zone.
TIMESTAMPTZ: Used to store the date and time of the day with time zone information.
Now, let us move to the date and time functions that will help you manipulate the data stored in these data types. To help us understand the operators better, let’s first create a table with the date data type. This table will help us use the operators and watch the results. Use the Reds
CREATE TABLE SUBSCRIPTION( product_id integer, stard_date date, end_date date);
You can then insert a few rows using the below statements.
INSERT INTO SUBSCRIPTION VALUES ( 1, ‘2020-01-12’ , ‘2023-02-12’);
INSERT INTO SUBSCRIPTION VALUES ( 2, ‘2008-01-12’ , ‘2011-02-12’);
A fully managed No-code Data Pipeline platform like Hevo Data helps you integrate data from 100+ data sources (including 30+ Free Data Sources) to a destination of your choice like Amazon Redshift in real-time in an effortless manner. Hevo with its minimal learning curve can be set up in just a few minutes allowing the users to load data without having to compromise performance. Its strong integration with umpteenth sources allows users to bring in data of different kinds in a smooth fashion without having to code a single line.
Get Started with Hevo for Free
Check out some of the cool features of Hevo:
Sign up here for a 14-Day Free Trial!
- Completely Automated: The Hevo platform can be set up in just a few minutes and requires minimal maintenance.
- Connectors: Hevo supports 100+ integrations to SaaS platforms, files, databases, analytics, and BI tools. It supports various destinations including Google BigQuery, Amazon Redshift, Firebolt, Snowflake Data Warehouses; Amazon S3 Data Lakes; and MySQL, MongoDB, TokuDB, DynamoDB, PostgreSQL databases to name a few.
- Real-Time Data Transfer: Hevo provides real-time data migration, so you can have analysis-ready data always.
- 100% Complete & Accurate Data Transfer: Hevo’s robust infrastructure ensures reliable data transfer with zero data loss.
- Scalable Infrastructure: Hevo has in-built integrations for 100+ sources, that can help you scale your data infrastructure as required.
- 24/7 Live Support: The Hevo team is available round the clock to extend exceptional support to you through chat, email, and support calls.
- Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
- Live Monitoring: Hevo allows you to monitor the data flow so you can check where your data is at a particular point in time.
Redshift Date functions: Concatenation (+) Operator
This operator helps one to combine date and time to return a timestamp. For example, we know that our SUBSCRIPTION table only has date information in the start_date field. To add default time information to each date, you can use the below query.
SELECT start_date + time ‘20:00:00’ as start_timestamp from SUBSCRIPTION;
The output will be as follows.
The operator works in the same manner for the TIMETZ data type too.
Download the Cheatsheet on How to Set Up High-performance ETL to Redshift
Learn the best practices and considerations for setting up high-performance ETL to Redshift
Redshift Date functions: ADD_MONTHS function
This function takes a date and an integer as arguments. It returns the date after adding the number of specified months. For example, if you want to find the date after 12 months from the start_date from our SUBSCRIPTION table, you can use the below query.
SELECT ADD_MONTHS(start_date,12) as next_date FROM SUBSCRIPTION;
The output will be as follows.
Redshift Date functions: AT_TIMEZONE function
This function converts a timestamp value from one time zone to another time zone. For example,
SELECT TIMESTAMP WITH TIME ZONE '2021-03-18 21:39:42-05' AT TIME ZONE 'MST';
In the above statement, the timezone of the selected value is UTC – 5, To convert that to MST ( Which is UTC -7 ) you can use the AT_TIMEZONE function.
The output of the statement will be as follows.
Redshift Date functions: CONVERT_TIMEZONE function
As the name suggests, this function converts a timezone to another timezone. For example, let’s say we had a table called CUSTOMER with a last_activity timestamp field. The default timezone of Redshift is UTC. To select the last_activity value in MST timezone, you can use the CONVERT_TIMEZONE as below.
SELECT last_activity,CONVERT_TIMEZONE(last_activity,’MST’) as last_activity_mst FROM CUSTOMER;
The output will be as below.
last_activity | last_activity_mst
—————————- + ———————–
2021-09-24 10:36:13 2021-09-24 02:36:13
Redshift Date functions: Comparison functions
The DATE_CMP function takes two dates as arguments and compares them. It returns a 0 if the dates are the same, -1 if the first date is earlier and 1 if the first date is later.
Using our SUBSCRIPTION table let’s execute a query to check the subscriptions that are earlier than 2010-01-01.
SELECT product_id, start_date, DATE_CMP(start_date,’2010-01-01’) as compare_status;
The output will be as follows.
1, 2020-01-12, -1
2, 2008-01-12, 1
Similar to DATE_CMP, there is DATE_CMP_TIMESTAMP and DATE_CMP_TIMESTAMPTZ that can compare a date with a timestamp or a timestamp with timezone respectively.
Similarly, there is the TIMESTAMP_CMP function that takes two timestamp values as arguments and compares them. To compare a timestamp with date or timestamp with timezones, Redshift provides TIMESTAMP_CMP_DATE and TIMSTAMP_CMP_TIMESTAMPTZ functions.
To compare two timestamps with timezones, there is the TIMESTAMPTZ_CMP function. If you want to compare a timestamp with a timezone to a timestamp or a date, you can use TIMETSTAMPTZ_CMP_TIMESTAMP and TIMESTAMPTZ_CMP_DATE functions.
Finally, there is an INTERVAL_CMP function that can compare two intervals and return a 0, 1, or -1 based on the comparison. For example, consider the below statement.
SELECT INTERVAL_CMP(‘5 days’,’2 year’);
The above statement will return a -1 as the output.
Redshift Date functions: PART Functions
Redshift provides two functions to extract part of the information from date values. The DATE_PART_YEAR function returns the year part of the date value. DATE_PART function is a generic function that allows you to get a specific part of the date value based on the first argument. For example, to get the year information from a date value, you can use the below statement.
SELECT DATE_PART(year,start_date) as year, start_date from SUBSCRIPTION
The output will be as follows.
The DATE_TRUNC function operates on timestamp data type and provides the first day of the specified year, the first day of the specified month, or Monday of the specified week depending on the literal that was specified as the first argument. For example, to get the Monday of the specified week, you can use the argument as ‘week’
There is another function called EXTRACT that can be used similarly to DATE_PART except that it works for all kinds of data types like TIMESTAMP, TIMESTAMPTZ, TIME, etc.
Redshift Date functions: DATE_ADD function
This function takes three arguments. The first argument is a literal value that specifies whether to add weeks, months, years, days, or any other unit of time. The other two arguments are the quantity to be added and the actual date or timestamp value to which addition has to be performed. For example, if you want to add 18 months to the start_date column in our SUBSCRIPTION table, the statement will look like below.
SELECT start_date, DATE_ADD(months,12,start_date) as next_date FROM SUBSCRIPTIONS;
The output of the above statement will be as follows.
Redshift Date functions: DATE_DIFF function
This function takes two date values and returns the difference between them in the unit specified by the ‘date part’ argument. For example, to find the difference between two dates in weeks, use the below statement.
SELECT DATEDIFF(week,'2019-01-01','2019-12-31') as numweeks;
The output will be the number of weeks between the two dates.
There is also a specific MONTHS_BETWEEN function that returns the months between the two date values.
Redshift Date functions: NEXT_DAY function
This function can be used to get the date of the next week of the day after a specific date. For example, if you want to get the date of next Monday after a specific date, you can use the statement below.
Redshift Date functions: LAST_DATE function
This function returns the last day of the month on a date. An example could be a statement like the below one.
The output will be 2020-08-31 which is the last day of the 8th month.
Redshift Date functions: Current date functions
Redshift provides two functions to get the current date. The SYSDATE function returns the current date and time in the current session timestamp. If placed inside a transaction, it returns the start of the transaction and not the actual statement execution time.
The GETDATE function also returns the current date and time. When placed inside a transaction, it returns the execution time of the statement rather than the start of the transaction.
While scalability and performance are a must for a data warehouse, it is the richness of the querying engine that ultimately decides its worth for the analysts. Redshift comes with a rich set of functions and operators that make it easy for analysts to handle all kinds of queries supported by traditional databases. The functions detailed above make handling date and time in Redshift an easy task. Another critical element of getting success with Redshift as a data warehouse is to be able to move data from various sources to Redshift for analysis.
Extracting complex data from a diverse set of data sources to carry out an insightful analysis can be challenging, and this is where Hevo Data saves the day! Hevo offers a faster way to move data from Databases, SaaS Applications, Files, and more to your Data Warehouse such as Amazon Redshift to be visualized in a BI tool, Hevo Data is fully automated and hence does not require you to code!
Visit our Website to Explore Hevo
Want to take Hevo for a spin? Sign Up for the 14-day free trial and experience the feature-rich Hevo suite first hand. You can also have a look at the unbeatable pricing that will help you choose the right plan for your business needs.
Share your experience of learning about the Amazon Redshift Date functions in the comments section below!