Querying Data in Redshift: 2 Easy Methods

Last Modified: December 29th, 2022

With most companies adopting the cloud as their primary choice for storing data, the need for a powerful and robust cloud data warehouse is on the rise. One of the most popular cloud-based data warehouses that meets these requirements is Amazon’s Redshift. It allows users to store petabytes of data with ease and provides fast query processing along with robust integration support for a diverse set of data analytics tools, business intelligence tools, and SQL clients. You can then use these tools to start querying and analyzing your data as per your business needs.

This article aims at providing you with a step-by-step solution to help you get started with Querying Data in Redshift. Upon a complete walkthrough of the content, you will be able to query your Amazon Redshift data in multiple ways, such as with the AWS Query Editor or an external SQL client, helping you analyze your business data with ease.

Introduction to Amazon Redshift

Amazon Redshift is a fully managed, petabyte-scale, cloud-based data warehouse designed and developed for large-scale data set storage and analysis. It performs exceptionally well, especially for large-scale database migrations, and provides robust integration support with numerous SQL-based clients and business intelligence tools.

As a fully managed Amazon service, it takes care of all administrative tasks such as creating backups, security, and configuration settings for you. It houses a multi-layered architecture that stores the data in clusters of nodes, which users and applications can access independently.

For further information on Amazon Redshift, you can check the official website here.

Simplify your data analysis with Hevo’s No-code Data Pipelines

Hevo Data, a No-code Data Pipeline, helps you transfer data from 100+ sources to Amazon Redshift to visualize it in your desired BI tool. Hevo is fully-managed and completely automates the process of not only loading data from your desired source but also enriching the data and transforming it into an analysis-ready form without having to write a single line of code. Its fault-tolerant architecture ensures that the data is handled in a secure, consistent manner with zero data loss.

It provides a consistent & reliable solution to manage data in real-time and always have analysis-ready data in your desired destination. It allows you to focus on key business needs and perform insightful analysis using BI tools such as Tableau and many more.

Get Started with Hevo for Free

Check out what makes Hevo amazing:

  • Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects schema of incoming data and maps it to the destination schema.
  • Minimal Learning: Hevo, with its simple and interactive UI, is extremely easy for new customers to work with and perform operations on.
  • Hevo Is Built To Scale: As the number of sources and the volume of your data grows, Hevo scales horizontally, handling millions of records per minute with very little latency.
  • Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
  • Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
  • Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular point in time.
Sign up here for a 14-Day Free Trial!

Prerequisites

  • Working knowledge of Amazon Redshift.
  • An Amazon Redshift account.
  • JDBC or ODBC drivers for Amazon Redshift (required for Method 2).

Methods to Query Data in Amazon Redshift

Amazon Redshift allows users to query data either by connecting with their desired cluster and then executing queries using the AWS Query Editor, or by using an external SQL-based client such as SQL Workbench.

Method 1: Querying Data in Redshift using AWS Query Editor

Using the AWS Query Editor is one of the easiest ways to start Querying Data in Redshift. Once you’ve set up your database in your desired cluster, you can immediately begin using the Query Editor to extract data. You can implement this using the following steps:

Step 1: Enabling access to the AWS Query Editor

To start using the Query Editor, you will have to grant the necessary permissions to the IAM user you are using. To do this, you will have to attach the AmazonRedshiftQueryEditor and AmazonRedshiftReadOnlyAccess policies to your desired IAM user.

To do this, go to the official AWS website and log in to the AWS Management Console with your credentials.

Logging into AWS Management Console.

Once you’ve logged in, the AWS Management Console will open up on your screen. Navigate to the IAM service, click on the “Users” option, select the desired user, click on the “Add permissions” option, and then select the “Attach existing policies directly” option.

Now select the AmazonRedshiftQueryEditor and AmazonRedshiftReadOnlyAccess policies, click on Next, and then click on the Add permissions option.
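
If you prefer to script this step, the same policies can be attached programmatically. The following is a minimal sketch using Python’s boto3 library; the user name “my-redshift-user” is a placeholder for your own IAM user:

# Minimal sketch: attach the Query Editor policies to an IAM user with boto3.
# "my-redshift-user" is a placeholder; replace it with your own IAM user name.
import boto3

iam = boto3.client("iam")

for policy_arn in (
    "arn:aws:iam::aws:policy/AmazonRedshiftQueryEditor",
    "arn:aws:iam::aws:policy/AmazonRedshiftReadOnlyAccess",
):
    iam.attach_user_policy(UserName="my-redshift-user", PolicyArn=policy_arn)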

Step 2: Using the AWS Query Editor to extract data

Once you’ve enabled the AWS Query Editor, you can use it to start extracting data with the help of SQL-based queries. You can also download the results of your query execution.

To do this, go to the AWS Management Console, log in to your account with your credentials, and launch the Amazon Redshift console.

Click on the editor option found in the navigation bar and then select the Query Editor option. You now need to choose the desired schema and cluster and then use the Query Editor window to type in your SQL query to extract data.

Using Query Editor.

For example, if you want to extract the data from the shoes table, you can use the following query:

select * from shoes;

Once you’ve written the SQL query, click on the run option to execute it. You will now be able to see the query results on your screen, which you can download with the help of the export option.

Running SQL queries to start Querying Data in Redshift.

This is how you can use the AWS Query Editor to start Querying Data in Redshift.
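
If you’d rather run such queries programmatically, the Amazon Redshift Data API offers an equivalent of the Query Editor. Below is a minimal sketch using Python’s boto3 library; the cluster identifier, database, and database user are placeholders for your own values:

# Minimal sketch: run the same query through the Amazon Redshift Data API.
import time
import boto3

client = boto3.client("redshift-data")

resp = client.execute_statement(
    ClusterIdentifier="examplecluster",  # placeholder cluster name
    Database="dev",                      # placeholder database
    DbUser="awsuser",                    # placeholder database user
    Sql="select * from shoes;",
)

# The Data API is asynchronous, so poll until the statement finishes.
while client.describe_statement(Id=resp["Id"])["Status"] not in ("FINISHED", "FAILED", "ABORTED"):
    time.sleep(1)

for record in client.get_statement_result(Id=resp["Id"])["Records"]:
    print(record)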

Method 2: Querying Data in Redshift using SQL Client Tools

Amazon Redshift allows connecting with numerous SQL-based clients such as SQL Workbench, psql, etc., with the help of Java Database Connectivity (JDBC) and Open Database Connectivity (ODBC) drivers. To do this, you will have to install the drivers either on your Amazon EC2 instance or on the client computer.

You can implement this using the following steps:

Step 1: Obtaining the Cluster Connection String

To establish a connection with an SQL client, you will need the connection string for your cluster. To obtain it, go to the AWS Management Console, log in with your credentials, and launch your Amazon Redshift console.

Logging into AWS Console.

Click on the clusters option found in the navigation bar and then choose your desired cluster. Once you’ve selected it, click on the properties tab to view the connection details and then select the view all connections option. Here you’ll find the JDBC and ODBC connection URLs.

Obtaining the Connection String.

Use the copy option to copy the required connection string.

Step 2: Configuring the JDBC Connection

Amazon Redshift supports the PostgreSQL JDBC driver, as it is a variant of PostgreSQL. Amazon also provides and recommends its own Amazon Redshift JDBC driver, especially when you want to use AWS features with your connections, such as logging in with AWS security credentials or using temporary security tokens. You can download the Amazon Redshift JDBC driver from here.

Once you’ve downloaded it, unzip the file. When connecting with JDBC, you will have to choose the class name of the driver, which depends upon the version of the JDBC driver you are using. For example, if you’re using version 4.2, the class name would be “com.amazon.redshift.jdbc42.Driver”.

To set up the JDBC connection, you will have to perform the following operations:

Obtaining the JDBC URL

You will have to create a JDBC URL to establish a connection with the Amazon Redshift database. The JDBC URL for Amazon Redshift uses the following format:

jdbc:redshift://endpoint:port/database

Here, the endpoint parameter denotes the endpoint of your Amazon Redshift cluster, the port parameter denotes the port number that you specified while launching the cluster, and the database parameter is the name of the database that you created for the cluster. An example of a valid JDBC URL is as follows:

jdbc:redshift://acluster.xyz103abc624.us-west-2.redshift.amazonaws.com:5439/dev
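
To see the class name and JDBC URL working together, here is a minimal sketch that opens the connection from Python using the jaydebeapi package (an assumption; any JDBC-capable language works the same way). The jar path and credentials are placeholders:

# Minimal sketch: open a JDBC connection to Redshift via jaydebeapi.
import jaydebeapi

conn = jaydebeapi.connect(
    "com.amazon.redshift.jdbc42.Driver",  # class name for the 4.2 driver
    "jdbc:redshift://acluster.xyz103abc624.us-west-2.redshift.amazonaws.com:5439/dev",
    ["awsuser", "my_password"],           # placeholder [username, password]
    "RedshiftJDBC42.jar",                 # placeholder path to the driver jar
)

cursor = conn.cursor()
cursor.execute("select * from shoes;")
print(cursor.fetchall())
cursor.close()
conn.close()
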
Configuring SSL and Authentication for JDBC Connection

Whenever you use the JDBC connector, you should configure it to authenticate the connection depending upon the security requirements of your Amazon Redshift server. Configuring the JDBC connector for authentication requires you to provide your Amazon Redshift username and password. The following URL shows how to connect to an Amazon Redshift database named “company” using the standard login and password:

jdbc:redshift://redshift-test2.abcdefg.us-east-1.redshift.amazonaws.com:5439/company?ssl=true&UID=your_username&PWD=your_password
Configuring TCP Keepalives for JDBC Connection

Amazon Redshift makes use of TCP keepalives to prevent a connection from timing out. With TCP keepalives, you can specify when the JDBC driver should send the keepalive packets or disable them entirely. The following URL shows how to connect to an Amazon Redshift database named “company” with TCP keepalive turned on and SSL encryption enabled:

jdbc:redshift://redshift-test2.abcdefg.us-east-1.redshift.amazonaws.com:5439/company?ssl=true&tcpKeepAlive=true

This is how you can configure the JDBC connection to start Querying Data in Redshift.
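
Since Redshift speaks the PostgreSQL wire protocol, the same SSL and keepalive behaviour can also be reproduced from Python with the psycopg2 package (an assumption, shown only to illustrate the equivalent settings; host and credentials are placeholders):

# Minimal sketch: equivalent SSL + keepalive settings over the PostgreSQL
# protocol, using psycopg2.
import psycopg2

conn = psycopg2.connect(
    host="redshift-test2.abcdefg.us-east-1.redshift.amazonaws.com",  # placeholder
    port=5439,
    dbname="company",
    user="awsuser",          # placeholder username
    password="my_password",  # placeholder password
    sslmode="require",       # equivalent of ssl=true
    keepalives=1,            # equivalent of tcpKeepAlive=true
)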

Step 3: Configuring the ODBC Connection

Amazon Redshift also allows users to connect their database with SQL clients that don’t support JDBC, using an ODBC connection. It provides users with ODBC connectors for the Windows, Linux, and macOS operating systems. To set up the ODBC connection, you will have to perform the following operations:

Obtaining the ODBC URL

You can obtain the ODBC URL for Amazon Redshift directly from the console. The ODBC URL contains the information needed to connect your client computer to the Amazon Redshift database. It uses the following format:

Driver={driver};Server=endpoint;Database=database_name;UID=user_name;PWD=password;Port=port_number
Here, the driver parameter is the name of the installed Amazon Redshift ODBC driver, the endpoint parameter denotes the endpoint of your Amazon Redshift cluster, database_name is the database you created for the cluster, user_name and password are the credentials of your database user, and port_number is the port you specified while launching the cluster.

An example of a valid ODBC URL:

Driver={Amazon Redshift (x64)}; Server=mycluster.xyz835abc345.us-west-2.redshift.amazonaws.com; Database=dev; UID=master; PWD=masterpassword; Port=5439
Installing the ODBC Driver

The Amazon Redshift ODBC driver is available for Windows, Linux, and macOS. If you’re a Windows user, you can directly download the MSI package that matches the architecture of your SQL client. Once you’ve downloaded it, click on the MSI package to start the installation process.

If you’re using a Linux-based system, you will have to download the ODBC driver and install it with the correct tool (such as an RPM or DEB package manager), depending upon your Linux distribution.

macOS users will have to download the macOS version of the ODBC driver and run the .pkg file to install it.

This is how you can configure the ODBC connection to start Querying Data in Redshift.
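
With the driver installed, the ODBC connection string shown earlier can be used from any ODBC-capable client. The following is a minimal sketch using Python’s pyodbc package (an assumption; the server, credentials, and driver name are placeholders):

# Minimal sketch: connect through the installed Amazon Redshift ODBC driver.
import pyodbc

conn = pyodbc.connect(
    "Driver={Amazon Redshift (x64)};"
    "Server=mycluster.xyz835abc345.us-west-2.redshift.amazonaws.com;"
    "Database=dev;UID=master;PWD=masterpassword;Port=5439"
)

cursor = conn.cursor()
cursor.execute("select * from shoes;")
print(cursor.fetchall())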

Step 4: Using SQL Workbench to Query Amazon Redshift Data

Once you’ve set up the JDBC connection, you can start using SQL Workbench to query data from Amazon Redshift. To do this, launch SQL Workbench on your system, open the File menu, and then click on the Connect window option.

Once you’ve selected it, click on the create a new connection profile option and then provide a name for the profile in the new profile dialogue box.

Click on the manage drivers option; a new dialogue box will now open up on your screen. Enter the name of your driver in the name box, click on the folder icon found just below it, and navigate to the folder where you’ve stored the driver.

Selecting the Amazon Redshift JDBC Driver.

If a “Please select one driver” dialogue box opens, select com.amazon.redshift.jdbc4.Driver or com.amazon.redshift.jdbc41.Driver and then click OK. Once you’ve selected it, choose the driver from the driver box and paste your Amazon Redshift URL in the URL box. You now need to set the credentials for your user by providing them in the username and password boxes, respectively.

Saving the profile list.

Now, select the autocommit option, click on the save profile list option, and then click on OK. You can now start running SQL queries to extract data from your Amazon Redshift instance.

This is how you can use SQL Clients to start Querying Data in Redshift.

Conclusion

This article teaches you various methods to start Querying Data in Redshift with ease. It provides in-depth knowledge about the concepts behind every step to help you understand and implement them efficiently. These methods, however, can be challenging, especially for a beginner, & this is where Hevo saves the day.

Visit our Website to Explore Hevo

Hevo Data, a No-code Data Pipeline, helps you transfer data from a source of your choice in a fully automated and secure manner without having to write any code. Hevo, with its strong integration with 100+ sources & BI tools, allows you to not only export & load data but also transform & enrich your data & make it analysis-ready in a jiffy.

Want to take Hevo for a spin? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand.

Tell us about your experience of Querying Data in Redshift! Share your thoughts with us in the comments section below.

Nicholas Samuel
Freelance Technical Content Writer, Hevo Data

Skilled in freelance writing within the data industry, Nicholas is passionate about unraveling the complexities of data integration and data analysis through informative content for those delving deeper into these subjects.
