Working with Redshift BETWEEN Command: Syntax, Queries & Easy Examples 101

on Amazon Redshift, Data Integration, Data Warehouses, ETL Tutorials • February 4th, 2022


Amazon Redshift is a cloud data warehouse platform used for storing and analyzing data. It scales massively, allowing users to store huge volumes of data, and because it runs on the AWS infrastructure, its users get excellent performance. It also provides a number of tools that users can apply to analyze their data and extract meaningful insights for decision-making purposes. 

When using Amazon Redshift, you will come across the Redshift BETWEEN command, which lets you run queries that test whether a value falls within a range. Amazon Redshift supports SQL (Structured Query Language), which is made up of commands that you can use to create new data objects, manipulate them, load tables, run queries, and change the data stored in tables. In this article, we will discuss the Amazon Redshift BETWEEN command in detail. 


Prerequisites

  • An AWS Account.

Introduction to Amazon Redshift


Amazon Redshift is an Amazon Web Services-powered petabyte-scale data warehouse solution. Because it simplifies data management, it is also used for large database migrations. The annual cost per TB is around $1000, which is much less than the cost of establishing and maintaining On-Site solutions.

The architecture of Amazon Redshift is based on Massively Parallel Processing (MPP). Amazon Redshift Databases are Column-Oriented Databases that can be connected to SQL-based clients and BI tools. This provides users with continuous access to data (structured and unstructured) and facilitates the execution of Complex Analytic queries. Standard ODBC and JDBC connections are also supported by Amazon Redshift.

Because Amazon Redshift is a fully-managed Data Warehouse, users can automate administrative tasks, allowing them to focus on Data Optimization and Data-driven Business decisions rather than repetitive tasks. The Client Application and the Data Warehouse Cluster must be able to communicate with one another on a consistent basis.

Each Cluster in an Amazon Redshift Data Warehouse has its own set of computing resources, as well as its own Amazon Redshift Engine with at least one Database.

Key Features of Amazon Redshift


Here are a few features of Amazon Redshift:

  • Parallel Processing: It is used in conjunction with a distributed design method that employs multiple CPUs to process large datasets.
  • Error Tolerance: When performing mission-critical operations in the Cloud, organizations can rely on the Data Warehouse’s Fault and Error Tolerance to ensure continuous operation.
  • End-to-End Encryption: To protect users’ privacy and security, all data handled on the Cloud is encrypted. There are several methods for sharing keys for encrypted data.
  • Network Isolation: With only IPsec VPN access, parts of the deployment can be isolated from the rest of the network and the Internet.
  • Speed: MPP technology allows a large number of queries over large datasets to be processed and executed in a short period of time. AWS also prices this performance competitively compared with other cloud service providers.
  • Encryption of Data: For your Redshift operation, the Amazon server encrypts data. The user can specify which processes should and should not be encrypted. Data encryption adds another layer of security.
  • Familiarity: Amazon Redshift is built on PostgreSQL, which is widely used, so standard SQL queries work largely as expected. You can also keep using the SQL, ETL (Extract, Transform, Load), and Business Intelligence (BI) tools that you are already familiar with.
  • Maximum Performance with Machine Learning (ML):  Amazon Redshift offers powerful Machine Learning (ML) capabilities with high throughput and speed.

Simplify Amazon Redshift ETL using Hevo’s No-code Data Pipeline

Hevo Data, a No-code Data Pipeline helps to Load Data from any data source such as Databases, SaaS applications, Cloud Storage, SDKs, and Streaming Services and simplifies the ETL process. It supports 100+ Data Sources (including 40+ free data sources) and is a 3-step process by just selecting the data source, providing valid credentials, and choosing the destination. Hevo loads the data onto the desired Data Warehouse such as Amazon Redshift, enriches the data, and transforms it into an analysis-ready form without writing a single line of code.

Its completely automated pipeline delivers data in real-time without any loss from source to destination. Its fault-tolerant and scalable architecture ensures that the data is handled in a secure, consistent manner with zero data loss and supports different forms of data. The solutions provided are consistent and work with different Business Intelligence (BI) tools as well.

Get Started with Hevo for free

Check out why Hevo is the Best:

  • Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  • Minimal Learning: Hevo, with its simple and interactive UI, is extremely simple for new customers to work on and perform operations.
  • Hevo Is Built To Scale: As the number of sources and the volume of your data grows, Hevo scales horizontally, handling millions of records per minute with very little latency.
  • Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
  • Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
  • Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular point in time.
Sign up here for a 14-day Free Trial!

Amazon Redshift BETWEEN Command

The Amazon Redshift BETWEEN condition tests whether an expression falls within a range of values, using the BETWEEN and AND keywords. The values in this case can be characters, numbers, or dates; however, they must be of compatible data types. The condition can be used in SQL statements like SELECT, INSERT, UPDATE, and DELETE. 

Syntax

The Redshift BETWEEN command takes the following syntax:

expression [NOT] BETWEEN expression AND expression

Note that the expressions in the above syntax must be of compatible data types. 
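BETWEEN is shorthand for a pair of comparisons: `x BETWEEN a AND b` is equivalent to `x >= a AND x <= b`, so both boundaries are included in the range. As a sketch, using the sales table from the examples below, the following two queries return the same count:

```sql
-- These two predicates are equivalent; both boundaries are inclusive.
select count(*) from sales where qtysold between 1 and 3;
select count(*) from sales where qtysold >= 1 and qtysold <= 3;
```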

Amazon Redshift BETWEEN Command Example


Let us give examples that demonstrate how to use the Redshift BETWEEN command. 

We will use the tables provided in the sample TICKIT database for this. 

Let us first determine the number of transactions in which 1 to 3 tickets were sold. We will query the SALES table for this as shown below:

select count (*) from sales 
where qtysold between 1 and 3;

The above query should return the number of rows in which the number of tickets sold is between 1 and 3, that is, either 1, 2, or 3. We have used the Redshift BETWEEN command to specify this range, that is, the begin and end values. 

Consider the next example given below:

select min (dateid), max (dateid) 
from sales 
where dateid between 1900 and 1910;

The query will return the minimum and the maximum value within the range specified using the Redshift BETWEEN command. 

When using the Amazon Redshift BETWEEN command, the first expression in the range condition must be the lesser value and the second expression the greater value. If this rule is not followed, the condition is never true, so the query matches no rows (and a COUNT(*) query therefore returns 0). Consider the example given below:

select count (*) 
from sales 
where qtysold between 3 and 1;

The above query will return 0 because the first expression in the range (3) is greater than the second expression (1), so no row can satisfy the condition. 

Suppose we need to see the list of venues with a seating capacity of between 40000 and 50000. We can use the following query:

select venuename, venueseats 
from venue 
where venueseats between 40000 and 50000;

We can also order the results using the ORDER BY clause as shown in the following query:

select venuename, venueseats 
from venue 
where venueseats between 40000 and 50000
order by venueseats desc;

We can also negate the result of the Amazon Redshift BETWEEN command by combining it with the NOT operator. This is demonstrated in the following query:

select venuename, venueseats 
from venue 
where venueseats not between 40000 and 50000 
order by venueseats desc;

The above query will return the list of venues where the seating capacity is not between 40000 and 50000, which is exactly the opposite of the previous query. Here is another example:

select count (*) 
from sales 
where qtysold not between 1 and 3;

The above query will return the number of cases in which the number of tickets sold is not between 1 and 3, that is, not 1, 2, or 3. This is because we have placed the NOT operator before the Redshift BETWEEN command. 
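As noted earlier, BETWEEN is not limited to SELECT statements; the same range condition works in the WHERE clause of UPDATE and DELETE as well. Here is a hypothetical sketch (the sales_copy and venue_copy tables are assumed copies of the sample tables, used so the sample data itself is not modified):

```sql
-- Hypothetical: delete small transactions from a copy of the sales table
delete from sales_copy
where qtysold between 1 and 3;

-- Hypothetical: label mid-sized venues in a copy of the venue table
update venue_copy
set venuename = venuename || ' (mid-sized)'
where venueseats between 40000 and 50000;
```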

Let’s create another Redshift table and use it to demonstrate how the Amazon Redshift BETWEEN command works:

create table orders (order_date date, amount int);

The above command will create a new table named orders. The orders table will have two columns namely order_date and amount. The first column is of a date data type while the second one is of an integer data type. 

Let’s now insert some rows into the table and use them to run sample queries. The following query will insert the rows into the table:

insert into orders values ('2022-01-01',20),  ('2022-01-02', 26), ('2022-01-03', 16), ('2022-01-04', 34), ('2022-01-05', 18), ('2022-01-10', 22), ('2022-01-06', 26),  ('2022-01-07', 12), ('2022-01-08', 32), ('2022-01-09', 20);

Suppose we need to see the days in which the amount ordered was between 30 and 40. We can use the following query:

select order_date, amount 
from orders 
where amount between 30 and 40;

We have combined the SQL select statement with the Amazon Redshift BETWEEN command to query the table data. The query will only return the rows in which the value for the amount column is between 30 and 40 inclusive. The rest of the rows will not be returned. 

We can also add a NOT operator to reverse the way the command works. This is shown below:

select order_date, amount 
from orders 
where amount not between 30 and 40;

Note that we have added the NOT operator before the Amazon Redshift BETWEEN command. The query will return all rows in the table except where the value of the amount column is between 30 and 40. 
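Since the order_date column is of the DATE data type, the same range test also works with date literals. For example, to see the orders placed during the first five days of January:

```sql
select order_date, amount
from orders
where order_date between '2022-01-01' and '2022-01-05';
```

As with numeric ranges, both boundary dates are included in the result.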

That is how the Amazon Redshift BETWEEN command works. 

Conclusion

In this article, you learned that Amazon Redshift is a Cloud Data Warehouse platform provided by Amazon. It runs on the AWS infrastructure, which gives its users excellent performance. The Amazon Redshift BETWEEN command helps Redshift users run queries that consider a range of values, and you can reverse how it works by preceding it with the NOT operator. 

To become more efficient in handling your Databases, it is preferable to integrate them with a solution that can carry out Data Integration and Management procedures for you without much ado and that is where Hevo Data, a Cloud-based ETL Tool, comes in.

Hevo Data supports 100+ Data Sources and helps you transfer your data from these sources to Data Warehouses like Amazon Redshift in a matter of minutes, all without writing any code!

Visit our Website to Explore Hevo

Want to take Hevo for a spin? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand. Hevo offers plans & pricing for different use cases and business needs, check them out!

Share your experience of the Amazon Redshift BETWEEN command in the comments section below!
