Amazon Redshift Lambda Function: 4 Easy Steps to Load Data

• January 11th, 2022

You’ve paid for your first Amazon Redshift subscription, and now you want to load data into your Amazon Redshift account. There are many ways to do this, and AWS Lambda is one of the most convenient for ingesting data into Amazon Redshift.

This is because AWS Lambda lets you run the loading code without owning or managing a server. Besides, when events occur on your platform, such as a user requesting data, AWS Lambda is notified automatically. And if you have linked AWS Lambda to Amazon Redshift, the system processes the user’s request without any input from you.

This tutorial focuses on the easiest method of loading data from Amazon Lambda into Amazon Redshift. Read on to discover how to work on Amazon Redshift Lambda!


What is Amazon Redshift?


Amazon Redshift is a scalable Data Warehouse that allows businesses to store and analyze their data. This service holds data in Clusters. As such, Amazon Redshift users have to find a way to load their databases into the Clusters. 

Some Amazon Redshift Subscribers use FTP (File Transfer Protocol) to load data from their company’s server into Amazon Redshift Clusters. Others first transfer their data into Amazon S3 Environments, Amazon EMR, or Amazon DynamoDB, before loading the data into Amazon Redshift.  

However, most of these approaches require users to actively run and manage the servers that perform the Data Ingestion, and to deal with any complications that arise while monitoring those servers. AWS Lambda, by contrast, allows you to load data into Amazon Redshift without worrying about owning or managing a server.
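To make the S3 route mentioned above more concrete: data staged in Amazon S3 is typically loaded with a Redshift COPY statement. The sketch below only builds the SQL text. The table name, bucket path, and IAM role ARN in the usage note are hypothetical placeholders, and the function name is an illustration, not part of any AWS library.

```python
def build_copy_statement(table: str, s3_path: str, iam_role_arn: str,
                         delimiter: str = ",") -> str:
    """Return a Redshift COPY statement that loads delimited files from S3.

    All argument values are supplied by the caller; nothing here talks to AWS.
    """
    if not s3_path.startswith("s3://"):
        raise ValueError("s3_path must start with s3://")
    return (
        f"COPY {table}\n"
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"DELIMITER '{delimiter}';"
    )
```

For example, `build_copy_statement("sales", "s3://my-bucket/data/", "arn:aws:iam::123456789012:role/MyRole")` yields a statement you could run against the Cluster with any SQL client.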

What is the Amazon Lambda Function?


AWS Lambda is a service that allows businesses and developers to process data and run code without owning or maintaining a server. With AWS Lambda, programmers can also build event-driven apps, which the service automatically hosts for them. 

Apart from Data Ingestion and running code, AWS Lambda also processes data requests from users. For instance, an AWS Lambda function can be triggered when a user tries to add a product to their cart. It then collects the information from the Data Warehouse, processes it, and delivers it to the user.
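As a rough illustration of the event-driven model described above, a Lambda function is essentially a handler that receives an event and returns a response. This is a minimal sketch: the `product_id` field and the response shape are assumptions made for the example, and the warehouse lookup is left as a comment.

```python
import json


def handler(event, context):
    """Entry point that Lambda invokes once per event.

    `event` carries the request data; `product_id` is a hypothetical field.
    """
    product_id = event.get("product_id")
    if product_id is None:
        return {"statusCode": 400,
                "body": json.dumps({"error": "product_id missing"})}
    # A real function would fetch product details from the Data Warehouse here.
    return {"statusCode": 200, "body": json.dumps({"added": product_id})}
```

Calling `handler({"product_id": "sku-1"}, None)` locally is an easy way to try the logic before deploying it.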

The Lambda function keeps a List of all the files it loads into Amazon Redshift. This List ensures that each Data File is loaded into a Redshift Cluster only once. 

Once the system has uploaded a file to a Cluster, the List displays the time that the file was loaded and the location of the file. Any Lambda account holder can, therefore, easily track data that was transferred to Amazon Redshift in their absence.
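The idea of a List that records each file's location and load time can be sketched with a small in-memory class. This is only an illustration of the bookkeeping, not how the service actually stores its List; the class and method names are hypothetical.

```python
from datetime import datetime, timezone


class ProcessedFiles:
    """In-memory stand-in for a list of files already loaded."""

    def __init__(self):
        self._entries = {}  # file location -> time it was recorded

    def record(self, location: str) -> bool:
        """Record a loaded file; return False if it was already recorded."""
        if location in self._entries:
            return False
        self._entries[location] = datetime.now(timezone.utc)
        return True

    def loaded_at(self, location: str):
        """Return the recorded load time, or None if the file is unknown."""
        return self._entries.get(location)
```

A loader built this way would call `record()` before each load and skip the file whenever it returns `False`.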

Key Benefits of Amazon Lambda

  • No Server Required: AWS Lambda lets you load data and run code without the need to manage a server or other infrastructure. This gives you more time that you can allocate to other tasks.
  • Easy to Use: AWS Lambda is structured in a way that enables you to write code easily. If you have already stored the code on your computer, you can upload it to the system as a .zip file. 
  • Free Request Processing: Once you start using AWS Lambda, the system will process 1 million requests per month for free. 
  • Automatic Response to Events: When your users place a request, AWS Lambda responds to them automatically. The service has the capacity to process over 100,000 requests per day. 
  • Affordable: Unlike most event-driven services, AWS Lambda does not charge you upfront for your usage. Instead, the service calculates your bill based on the number of requests and the amount of time your code spends running. 
  • Scalable: AWS Lambda allows you to increase your memory size, and improve data delivery as demand for your application increases. 
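To see how the free requests and time-based billing above combine, here is a back-of-the-envelope estimator. It is a sketch under stated assumptions: both prices are parameters you must supply from the current AWS price list, compute is billed as duration multiplied by memory, and any free compute allowance is ignored.

```python
def estimate_monthly_cost(requests: int, avg_duration_s: float, memory_gb: float,
                          price_per_million_requests: float,
                          price_per_gb_second: float,
                          free_requests: int = 1_000_000) -> float:
    """Estimate a monthly bill from request count and compute time.

    Prices are caller-supplied placeholders, not real AWS rates.
    """
    billable_requests = max(0, requests - free_requests)
    request_cost = billable_requests / 1_000_000 * price_per_million_requests
    # Compute usage is measured in GB-seconds: duration times allocated memory.
    compute_cost = requests * avg_duration_s * memory_gb * price_per_gb_second
    return request_cost + compute_cost
```

With 500,000 requests in a month, the request portion of the estimate is zero because it falls under the free allowance.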

Simplify Amazon Redshift ETL using Hevo’s No-code Data Pipeline

Hevo Data, a No-code Data Pipeline, helps you load data from any data source, such as Databases, SaaS applications, Cloud Storage, SDKs, and Streaming Services, and simplifies the ETL process. It supports 100+ Data Sources (including 40+ free data sources), and setup is a 3-step process: select the data source, provide valid credentials, and choose the destination. Hevo loads the data onto the desired Data Warehouse such as Amazon Redshift, enriches the data, and transforms it into an analysis-ready form without you writing a single line of code.

Get Started with Hevo for free

Its completely automated pipeline delivers data in real-time without any loss from source to destination. Its fault-tolerant and scalable architecture ensures that the data is handled in a secure, consistent manner with zero data loss and supports different forms of data. The solutions provided are consistent and work with different Business Intelligence (BI) tools as well.

Check out why Hevo is the Best:

  • Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  • Minimal Learning: Hevo, with its simple and interactive UI, is extremely simple for new customers to work on and perform operations.
  • Hevo Is Built To Scale: As the number of sources and the volume of your data grows, Hevo scales horizontally, handling millions of records per minute with very little latency.
  • Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
  • Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
  • Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular point in time.
Sign up here for a 14-day Free Trial!

How to Send Data from AWS Lambda to Amazon Redshift?

Below are the prerequisites for working with the Amazon Redshift Lambda Function:

  • You need the AWS Lambda Amazon Redshift Database Loader (version 1.1), which you download in Step 1.
  • You must have an Active Amazon Redshift Cluster.

Follow these steps to ingest data into your Amazon Redshift from AWS Lambda:

Redshift Lambda Step 1: Download the AWS Lambda Amazon Redshift Database Loader


The AWS Lambda Amazon Redshift Database Loader eases the process of loading data from AWS Lambda into your Amazon Redshift Cluster. To install the AWS Lambda Amazon Redshift Database Loader, click this link.

Redshift Lambda Step 2: Configure Amazon Redshift Cluster to Permit Access from External Sources

Before you connect your Amazon Redshift Cluster to Amazon Lambda, you must configure your Cluster to permit access from sources outside Amazon Redshift. The following guide will help you:

  • Sign in to the Amazon Redshift Console.
  • Open the navigation panel on the right side of your screen, and click on ‘Security’. You’ll see a list of Cluster Security groups.
  • Next, look for the Cluster Security group that contains your Target Cluster and Select it.
  • A list of connection types will appear. Select the ‘CIDR/IP’ connection type and enter the value 0.0.0.0/0, which permits connections from any IP address, to add it to the Cluster Security group.
  • Click on ‘Authorize’ to save the updates. 
  • You can now connect to AWS Lambda.
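The console steps above can also be scripted. The following is a sketch, assuming you pass in a boto3 Redshift client and that your Cluster uses a Cluster Security group as described in this step; the helper name `authorize_cidr` is hypothetical, while `authorize_cluster_security_group_ingress` is the boto3 Redshift client method it wraps.

```python
import ipaddress


def authorize_cidr(redshift_client, security_group_name: str, cidr: str):
    """Add a CIDR/IP ingress rule to a Redshift Cluster Security group.

    `redshift_client` is expected to behave like boto3.client("redshift").
    """
    # Validate the CIDR block locally before sending it to AWS;
    # ip_network raises ValueError for malformed input.
    network = ipaddress.ip_network(cidr, strict=True)
    return redshift_client.authorize_cluster_security_group_ingress(
        ClusterSecurityGroupName=security_group_name,
        CIDRIP=str(network),
    )
```

In practice you would call it as `authorize_cidr(boto3.client("redshift"), "my-group", "0.0.0.0/0")`, where `my-group` stands in for your own security group name. A narrower CIDR range works the same way if you know which addresses need access.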

Redshift Lambda Step 3: Enable the Amazon Lambda Function

After configuring your Amazon Redshift Cluster for external access, you also need to enable the Lambda function to load data to Amazon Redshift. 

Here’s how to do it:

  • On the same tab where you opened the Amazon Redshift console, click the AWS Navigation Panel. Select ‘Featured Services’, then scroll to AWS Lambda and select it to open the AWS Lambda console.
  • Click on ‘Create a Lambda Function’, choose a name for the function, and enter it.
  • Next, click on the ‘Code Entry Type’ tab, and choose ‘Upload a zip file’ to upload the AWS Lambda Amazon Redshift Database Loader file you downloaded earlier.
  • Before you complete your Amazon Lambda Execution, the service will require you to provide specific values. Some of them are the Filename, Max Timeout, and the Handler. In this section, stick to the default values. For instance, if your default filename is index.js, don’t change it.
  • Once you have filled in the default values for each section, deploy the AWS Lambda function. 
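If you prefer to script the deployment instead of using the console, the values from the steps above map onto the arguments of boto3's Lambda `create_function` call. The helper below only assembles those arguments. Its name, the `index.handler` default (inferred from the `index.js` filename mentioned above), the runtime string, and the timeout are all assumptions that you should replace with the values your loader version expects.

```python
def build_create_function_args(name: str, role_arn: str, zip_bytes: bytes,
                               handler: str = "index.handler",
                               runtime: str = "nodejs18.x",
                               timeout_s: int = 60) -> dict:
    """Assemble keyword arguments for a Lambda create_function call.

    Defaults are illustrative placeholders, not values confirmed by the loader.
    """
    if not zip_bytes:
        raise ValueError("zip_bytes must contain the uploaded .zip archive")
    return {
        "FunctionName": name,
        "Runtime": runtime,
        "Role": role_arn,              # IAM role the function runs as
        "Handler": handler,            # file name, then exported function name
        "Code": {"ZipFile": zip_bytes},
        "Timeout": timeout_s,          # seconds
    }
```

You would then pass the result along with something like `boto3.client("lambda").create_function(**args)`, after reading the loader archive with `open(path, "rb").read()`.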

Redshift Lambda Step 4: Configure an Event Source to Deliver Requests from S3 Buckets to Amazon Lambda

As stated earlier, one of the purposes of AWS Lambda is to process requests from users. In this setup, those requests arrive through Amazon S3 Buckets, so AWS Lambda can only receive them if you link it to the relevant Amazon S3 Bucket.

Follow these steps to connect AWS Lambda to Amazon S3 Buckets with an event source:

  • Select your deployed Amazon Lambda function, and choose ‘Configure Event Source’.
  • Then, pick the Amazon S3 Bucket you want the AWS Lambda Function to receive data from. You can select the Bucket by clicking on the ‘Create/Select’ function.
  • Select ‘Submit’ to save the new changes.
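The same event source can be expressed as an S3 bucket notification configuration. This sketch builds the dictionary that boto3's S3 `put_bucket_notification_configuration` call accepts; the helper name and the optional key prefix are hypothetical additions for the example.

```python
def build_s3_notification(function_arn: str, prefix: str = "") -> dict:
    """Build a notification configuration that invokes a Lambda function
    whenever an object is created in the bucket."""
    config = {
        "LambdaFunctionArn": function_arn,
        "Events": ["s3:ObjectCreated:*"],
    }
    if prefix:
        # Optionally limit the trigger to keys under a given prefix.
        config["Filter"] = {
            "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
        }
    return {"LambdaFunctionConfigurations": [config]}
```

The result would be passed as `NotificationConfiguration` together with the `Bucket` name. Note that S3 also needs permission to invoke the function, which the console normally sets up for you.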

Now, you can use your AWS Lambda function to ingest data into Amazon Redshift and process Data Requests from users with information in Amazon Redshift. 
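Once the event source is in place, each invocation receives an S3 event describing the new objects. As a minimal sketch, assuming the standard S3 event layout with a `Records` list, a function could pull out the Bucket and key like this (the helper name is hypothetical):

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event: dict) -> list:
    """Return (bucket, key) pairs for every S3 record in a Lambda event."""
    objects = []
    for record in event.get("Records", []):
        s3 = record.get("s3")
        if not s3:
            continue  # skip records that did not come from S3
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded, with spaces encoded as '+'.
        key = unquote_plus(s3["object"]["key"])
        objects.append((bucket, key))
    return objects
```

Each returned pair identifies one file that is ready to be loaded into Amazon Redshift.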

This concludes the process of setting up Amazon Redshift Lambda!

Conclusion

Congratulations! You just loaded data into your Amazon Redshift Data Warehouse from AWS Lambda. Note that while integrating Amazon Redshift Lambda with your Clusters, you may encounter certain difficulties. If this happens, contact AWS Support for assistance. 

To become more efficient in handling your Databases, it is preferable to integrate them with a solution that can carry out Data Integration and Management procedures for you. That is where Hevo Data, a Cloud-based ETL Tool, comes in.

Visit our Website to Explore Hevo

Hevo Data supports 100+ Data Sources and helps you transfer your data from these sources to Data Warehouses like Amazon Redshift in a matter of minutes, all without writing any code!

Want to take Hevo for a spin? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand.

Share your experience of configuring and working with Amazon Redshift Lambda in the comments section below!
