Amazon Redshift Lambda Function: 4 Easy Steps to Load Data

You’ve paid for your first Amazon Redshift Subscription. Now, you want to load data into your Amazon Redshift account. While there are many ways to do this, using Amazon Lambda is a highly convenient method for ingesting data into Amazon Redshift.

This is because the AWS Lambda service allows you to upload data without owning or managing a server. Besides, when your users trigger events, such as requesting data from your platform, Amazon Lambda receives those events automatically. If you have linked Amazon Lambda to Amazon Redshift, the system will also process the user’s request without any input from you.

This tutorial focuses on the easiest method of loading data from Amazon Lambda into Amazon Redshift. Read on to discover how to work on Amazon Redshift Lambda!

What is Amazon Redshift?

Amazon Redshift is a scalable Data Warehouse that allows businesses to store and analyze their data. This service holds data in Clusters. As such, Amazon Redshift users have to find a way to load their databases into the Clusters.

Some Amazon Redshift Subscribers use FTP (File Transfer Protocol) to load data from their company’s server into Amazon Redshift Clusters. Others first transfer their data into Amazon S3 Environments, Amazon EMR, or Amazon DynamoDB, before loading the data into Amazon Redshift.

However, most of these Amazon Services require users to actively run and manage the servers that perform the Data Ingestion. They also have to deal with any complications that arise while monitoring the servers. Only Amazon Lambda allows you to load data into Amazon Redshift without worrying about owning or managing a server.
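
For context, the Amazon S3 route mentioned above usually ends with a Redshift COPY statement that pulls the staged file into a table. The following is a minimal sketch in Python, assuming the file is a CSV; the table name, bucket path, and IAM role ARN are hypothetical placeholders, not values from this tutorial.

```python
def build_copy_statement(table: str, s3_uri: str, iam_role_arn: str) -> str:
    """Return a Redshift COPY statement that loads CSV data from Amazon S3.

    The inputs are interpolated as-is, so only pass trusted values.
    """
    if not s3_uri.startswith("s3://"):
        raise ValueError("s3_uri must start with 's3://'")
    return (
        f"COPY {table} "
        f"FROM '{s3_uri}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "FORMAT AS CSV;"
    )


# Hypothetical table, bucket, and role, used only for illustration.
statement = build_copy_statement(
    "public.orders",
    "s3://example-bucket/orders/2024-01-01.csv",
    "arn:aws:iam::123456789012:role/ExampleRedshiftCopyRole",
)
print(statement)
```

You would run the resulting statement against the Cluster with whichever SQL client you already use.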

What is the Amazon Lambda Function?

Amazon Lambda Function is a service that allows businesses and developers to Compute data and run code without owning or maintaining a server. With AWS Lambda, programmers can also build event-driven apps which the service automatically hosts for them.

Apart from Data Ingestion and Coding, Amazon Lambda also processes data requests from users. For instance, the AWS Lambda function is triggered when a user tries to add a product to their cart. It, then, collects the information from the Data Warehouse, processes it, and delivers it to the user.
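
To make the event-driven model concrete, here is a minimal sketch of a Python Lambda handler for the cart example. The `product_id` field and the response shape are assumptions made for illustration, and the lookup against the Data Warehouse is left out.

```python
import json


def lambda_handler(event, context):
    """Entry point that AWS Lambda calls once for each triggering event."""
    # "product_id" is a hypothetical field in the incoming event.
    product_id = event.get("product_id")
    if product_id is None:
        return {
            "statusCode": 400,
            "body": json.dumps({"error": "product_id is required"}),
        }
    # A real function would fetch product details here before responding.
    return {"statusCode": 200, "body": json.dumps({"added": product_id})}


# Local invocation with a sample event; Lambda supplies a real context object.
print(lambda_handler({"product_id": "sku-123"}, None))
```

Calling the handler locally like this is a quick way to check the logic before uploading it.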

AWS Lambda holds all data computed into its system as Lists. This ensures that only one Data File is uploaded to a Redshift Cluster at a particular time.

Once the system has uploaded a file to a Cluster, the List displays the time that the file was loaded and the location of the file. Any Lambda account holder can, therefore, easily track data that was transferred to Amazon Redshift in their absence.

Key Benefits of Amazon Lambda

No Server Required: AWS Lambda lets you load data and run code without the need to manage a server or other infrastructure. This gives you more time that you can allocate to other tasks.
Easy to Use: AWS Lambda is structured in a way that enables you to write code easily. If you have already stored the code on your computer, you can upload it to the system as a .zip file.
Free Request Processing: Once you start using Amazon Lambda, the system will process your first 1 million requests each month for free.
Automatic Response to Events: When your users place a request, AWS Lambda responds to them automatically. The service has the capacity to process over 100,000 requests per day.
Affordable: Unlike most event-driven services, Amazon Lambda does not charge you upfront for your usage. Instead, the service calculates your bill based on the amount of time you spend computing data or running code.
Scalable: AWS Lambda allows you to increase your memory size, and improve data delivery as demand for your application increases.
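
The pay-per-use model can be sketched as a small calculation. This is a rough estimate under stated assumptions, not an official pricing formula: it assumes the bill combines a per-request charge with a charge for compute time measured in GB-seconds, with a free monthly allowance of 1 million requests and 400,000 GB-seconds. The two rates passed in the example are illustrative inputs only, so check the current AWS pricing page for real values.

```python
def estimate_monthly_cost(
    requests: int,
    avg_duration_ms: float,
    memory_mb: int,
    price_per_million_requests: float,
    price_per_gb_second: float,
    free_requests: int = 1_000_000,
    free_gb_seconds: float = 400_000.0,
) -> float:
    """Estimate a monthly Lambda bill from request count and compute time."""
    # Compute time is billed as memory (GB) multiplied by duration (seconds).
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)
    billable_requests = max(0, requests - free_requests)
    billable_gb_seconds = max(0.0, gb_seconds - free_gb_seconds)
    request_cost = billable_requests / 1_000_000 * price_per_million_requests
    compute_cost = billable_gb_seconds * price_per_gb_second
    return request_cost + compute_cost


# Illustrative rates and workload; neither comes from this tutorial.
cost = estimate_monthly_cost(
    requests=3_000_000,
    avg_duration_ms=200,
    memory_mb=512,
    price_per_million_requests=0.20,
    price_per_gb_second=0.0000166667,
)
print(f"Estimated monthly cost: ${cost:.2f}")
```

In this example the compute time stays inside the assumed free allowance, so only the requests beyond the first million contribute to the estimate.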

How to Send Data from AWS Lambda to Amazon Redshift?

Below are prerequisites for working with Amazon Redshift Lambda Function:

You must have installed the AWS Lambda Amazon Redshift Database Loader (version 1.1).
You must have an Active Amazon Redshift Cluster.

Follow these steps to ingest data into your Amazon Redshift from AWS Lambda:

Step 1: Download the AWS Lambda Amazon Redshift Database Loader
Step 2: Configure Amazon Redshift Cluster to Permit Access from External Sources
Step 3: Enable the Amazon Lambda Function
Step 4: Configure an Event Source to Deliver Requests from S3 Buckets to Amazon Lambda

Step 1: Download the AWS Lambda Amazon Redshift Database Loader

The AWS Lambda Amazon Redshift Database Loader eases the process of loading data from AWS Lambda into your Amazon Redshift Cluster. To install the AWS Lambda Amazon Redshift Database Loader, click this link.

Step 2: Configure Amazon Redshift Cluster to Permit Access from External Sources

Before you connect your Amazon Redshift Cluster to Amazon Lambda, you must configure your Cluster to permit access from sources outside Amazon Redshift. The following guide will help you:

Sign in to the Amazon Redshift Console.
Open the navigation panel on the right side of your screen, and click on ‘Security’. You’ll see a list of Cluster Security groups.
Next, look for the Cluster Security group that contains your Target Cluster and select it.
A list of connection types will appear. Select the ‘CIDR/IP’ connection type, and add it to the Cluster Security group. Then, input the value 0.0.0.0/0, which permits connections from any IP address.
Click on ‘Authorize’ to save the updates.
You can now connect to AWS Lambda.
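
The same console steps can be approximated in code. The sketch below assumes the Cluster uses a Cluster Security group, as in the steps above, and that boto3 is installed with AWS credentials configured; the group name is a hypothetical placeholder. The request is built and validated with the standard library first, and the AWS call itself sits in a separate function that is not executed here.

```python
import ipaddress


def build_ingress_request(security_group_name: str, cidr: str) -> dict:
    """Validate the CIDR range and build the ingress request parameters."""
    network = ipaddress.ip_network(cidr, strict=True)  # ValueError if malformed
    return {"ClusterSecurityGroupName": security_group_name, "CIDRIP": str(network)}


def is_open_to_all(cidr: str) -> bool:
    """Return True when the range covers every address, as 0.0.0.0/0 does."""
    return ipaddress.ip_network(cidr).prefixlen == 0


def authorize(request: dict) -> None:
    """Send the request to Amazon Redshift (needs boto3 and credentials)."""
    import boto3  # imported here so the helpers above work without it

    boto3.client("redshift").authorize_cluster_security_group_ingress(**request)


# "example-cluster-sg" is a hypothetical security group name.
request = build_ingress_request("example-cluster-sg", "0.0.0.0/0")
print(request, "open to all:", is_open_to_all(request["CIDRIP"]))
```

Keeping the validation separate from the call makes it easy to review how wide the range is before anything changes in your account.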

Step 3: Enable the Amazon Lambda Function

After configuring your Amazon Redshift Cluster for external access, you also need to enable the Lambda function to load data to Amazon Redshift.

Here’s how to do it:

Click the AWS Navigation Panel on the same tab where you have opened the Amazon Redshift console. Then, select ‘Featured Services’, scroll to the AWS Lambda tab, and select it to open the AWS Lambda console.
Click on ‘Create a Lambda Function’, choose a name for the function, and enter it.
Next, click on the ‘Code Entry Type’ tab, and choose ‘Upload a zip file’ to upload the AWS Lambda Amazon Redshift Database Loader file you downloaded earlier.
Before you complete your Amazon Lambda Execution, the service will require you to provide specific values. Some of them are the Filename, Max Timeout, and the Handler. In this section, stick to the default values. For instance, if your default filename is index.js, don’t change it.
Once you have filled in the default values for each section, deploy the AWS Lambda function.
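
For readers who prefer scripting the deployment, the steps above can be sketched with boto3. Several values here are assumptions: the function name and role ARN are hypothetical, the `index.handler` handler is inferred from the index.js filename mentioned above, and the runtime identifier and 60-second timeout are placeholders you should replace with the values the loader expects. A tiny in-memory archive stands in for the downloaded loader file, and the AWS call is defined but not executed.

```python
import io
import zipfile


def build_create_function_request(name, zip_bytes, role_arn, runtime,
                                  handler="index.handler", timeout_seconds=60):
    """Assemble keyword arguments for the Lambda CreateFunction call."""
    if not 1 <= timeout_seconds <= 900:
        raise ValueError("timeout_seconds must be between 1 and 900")
    return {
        "FunctionName": name,
        "Runtime": runtime,
        "Role": role_arn,
        "Handler": handler,
        "Code": {"ZipFile": zip_bytes},
        "Timeout": timeout_seconds,
    }


def deploy(request: dict) -> dict:
    """Create the function in AWS (needs boto3 and credentials)."""
    import boto3  # imported here so the builder above works without it

    return boto3.client("lambda").create_function(**request)


def make_placeholder_zip() -> bytes:
    """Build a tiny in-memory archive standing in for the loader zip file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("index.js", "exports.handler = async () => 'ok';")
    return buffer.getvalue()


# Hypothetical name, role, and runtime, used only for illustration.
request = build_create_function_request(
    name="example-redshift-loader",
    zip_bytes=make_placeholder_zip(),
    role_arn="arn:aws:iam::123456789012:role/ExampleLambdaRole",
    runtime="nodejs20.x",
)
print(request["FunctionName"], request["Handler"], request["Timeout"])
```

In practice you would read the bytes of the real loader archive from disk and pass them in place of the placeholder.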

Step 4: Configure an Event Source to Deliver Requests from S3 Buckets to Amazon Lambda

As stated earlier, one purpose of Amazon Lambda is to process requests from users. All these requests come through Amazon S3 Buckets. AWS Lambda can therefore only receive those requests if you link it to Amazon S3 Buckets.
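
To show what such a request looks like when it reaches the function, here is a small Python sketch that pulls the bucket name and object key out of an S3 event. It follows the usual shape of S3 event notifications, where object keys arrive URL-encoded; the sample bucket and key are hypothetical.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event: dict) -> list:
    """Return (bucket, key) pairs for every record in an S3 event."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys are URL-encoded in the event, so decode them before use.
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


# Hypothetical event with a single uploaded file.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-input-bucket"},
                "object": {"key": "incoming/daily+report.csv"},
            }
        }
    ]
}
print(extract_s3_objects(sample_event))
```

Each pair identifies one file that landed in the bucket and is ready to be loaded.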

Follow these steps to connect AWS Lambda to Amazon S3 Buckets with an event source:

Select your deployed Amazon Lambda function, and choose ‘Configure Event Source’.
Then, pick the Amazon S3 Bucket you want the AWS Lambda Function to receive data from. You can select the Bucket by clicking on the ‘Create/Select’ function.
Select ‘Submit’ to save the new changes.

Now, you can use your AWS Lambda function to ingest data into Amazon Redshift and process Data Requests from users with information in Amazon Redshift.
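
The event source set up in this step can also be expressed as a bucket notification configuration. The sketch below assumes boto3 with credentials configured; the function ARN, bucket name, and prefix are hypothetical. The configuration is built as plain data, and the call that applies it is defined but not executed.

```python
def build_notification_config(function_arn: str, prefix: str = "") -> dict:
    """Build an S3 notification that invokes a Lambda function on new objects."""
    config = {
        "LambdaFunctionArn": function_arn,
        "Events": ["s3:ObjectCreated:*"],
    }
    if prefix:
        # Only objects whose keys start with this prefix trigger the function.
        config["Filter"] = {
            "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
        }
    return {"LambdaFunctionConfigurations": [config]}


def attach(bucket: str, notification: dict) -> None:
    """Apply the configuration (needs boto3 and credentials).

    This replaces the bucket's existing notification configuration, and S3
    must separately be granted permission to invoke the function.
    """
    import boto3  # imported here so the builder above works without it

    boto3.client("s3").put_bucket_notification_configuration(
        Bucket=bucket, NotificationConfiguration=notification
    )


# Hypothetical ARN and prefix, used only for illustration.
notification = build_notification_config(
    "arn:aws:lambda:us-east-1:123456789012:function:example-redshift-loader",
    prefix="incoming/",
)
print(notification)
```

Filtering by prefix is optional; leaving it empty sends every new object in the bucket to the function.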

Conclusion

Congratulations! You just loaded data into your Amazon Redshift Data Warehouse from AWS Lambda. Note that while integrating AWS Lambda with your Amazon Redshift Clusters, you may encounter certain difficulties. If this happens, or if you have further questions, contact AWS Support for assistance.

Share your experience of configuring and working with Amazon Redshift Lambda in the comments section below!

Veeresh Biradar Senior Customer Experience Engineer

Veeresh is a skilled professional specializing in JDBC, REST API, Linux, and Shell Scripting. With a knack for resolving complex issues and implementing Python transformations, he plays a crucial role in enhancing Hevo's data integration solutions.