Using AWS Lambda S3 Made Easy

Amazon S3, AWS Lambda • April 26th, 2022

In a world where volumes of information keep growing and fast, frequent transactions are in demand, it’s imperative that systems process information as soon as it becomes available. As a result, event-driven design patterns have become increasingly popular over the years.

Developing and deploying an event-driven architecture is often a challenge and requires custom development. Fortunately, AWS Lambda, a serverless compute service from Amazon, makes it much easier to build event-driven systems.

As part of AWS, the Lambda service sits between the data and compute layers. It enables developers to create dynamic event-driven applications by uploading code written for a supported runtime, such as Node.js or Python, and invoking it through AWS triggers. You can set triggers on services such as Amazon S3, Kinesis, and DynamoDB. In addition, Lambda scales automatically with the number of incoming events, and you pay only for what you use, so it is both cost-efficient and reliable.

This article explains how to configure an event-driven system using AWS Lambda S3.

What is Amazon S3?

Amazon S3 (Simple Storage Service) is a low-latency, high-throughput object storage service that allows developers to store massive volumes of data. It was launched by Amazon in 2006. To put it another way, Amazon S3 is a virtual and limitless object storage space where you may store any form of data file, including documents, mp3s, mp4s, apps, photos, and more. To work with Amazon S3, you can configure Amazon S3 buckets to store, organise, and manage various data files using an easy-to-use online web interface.

Amazon S3 is exceptionally fault-tolerant because it stores redundant copies of data objects across multiple devices and servers. With versioning enabled on a bucket, it also allows you to save, retrieve, and restore previous versions of every object in that bucket, so you can quickly recover when data is accidentally deleted by users or an application fails.

Key Features of Amazon S3

  • Storage Management: Storage management capabilities in Amazon S3 are used to reduce latency, manage expenses, and save multiple copies of your data.
    • S3 Replication: This feature replicates objects and their metadata to one or more destination buckets in the same or a different AWS Region for a variety of purposes, including latency reduction, security, and compliance.
    • S3 Batch Operations: With a single Amazon S3 API request, you can manage billions of objects at scale. Batch operations can also run actions such as Invoke AWS Lambda function, Restore, and Copy across those objects.
  • Access Control: You can control who has access to your objects and buckets with Amazon S3. S3 buckets and objects are private by default, so users can access only the S3 resources that they created. You can grant permissions on specific resources for specific use cases by using the access features below.
    • Bucket Policies: Bucket policies use the IAM policy language to configure resource-based permissions for S3 buckets and objects.
    • Amazon S3 Access Points: These are used to set up network endpoints with dedicated access policies so that shared datasets in Amazon S3 can be accessed at scale.
    • S3 Access Analyzer: This tool examines and monitors S3 bucket access policies to verify that they grant only the intended access to your S3 resources.

What is AWS Lambda?

AWS Lambda is an event-driven, serverless compute service that lets you run code for almost any application or backend service without provisioning or managing servers. More than 200 AWS services and SaaS applications can trigger Lambda, and you only pay for what you use.

Users of AWS Lambda create functions, which are self-contained applications written in one of the supported languages and runtimes, and upload them to AWS Lambda, which executes them quickly and flexibly.

Key Features of AWS Lambda

Here are a few key features:

  • Concurrency and Scaling Controls: Concurrency and scaling controls, such as concurrency limits and provisioned concurrency, let you fine-tune the scalability and responsiveness of your production applications.
  • Functions Defined as Container Images: You can use your preferred container image tooling, workflows, and dependencies to build, test, and deploy your Lambda functions.
  • Code Signing: Code signing for Lambda provides trust and integrity controls, allowing you to ensure that your Lambda functions only run unmodified code published by authorised developers.
  • Lambda Extensions: You can use Lambda extensions to improve your Lambda functions. For example, you can utilise extensions to make it easier to integrate Lambda with your preferred monitoring, observability, security, and governance tools.

Perform Amazon S3’s ETL in Minutes Using Hevo’s No-Code Data Pipeline

Hevo Data, a Fully-managed Automated Data Pipeline solution, can help you automate, simplify & enrich your data flow from various AWS sources such as AWS S3 and AWS Elasticsearch in a matter of minutes. With Hevo’s out-of-the-box connectors and blazing-fast Data Pipelines, you can extract & aggregate data from 100+ Data Sources (including 40+ free sources) like Amazon S3 straight into your Data Warehouse, Database, or any destination. To further streamline and prepare your data for analysis, you can process and enrich Raw Granular Data using Hevo’s robust & built-in Transformation Layer without writing a single line of code!

Hevo is the fastest, easiest, and most reliable data replication platform that will save your engineering bandwidth and time multifold. Try our 14-day full access free trial today to experience an entirely automated hassle-free Data Replication!

How to Use AWS Lambda S3?

You can use Lambda to process event notifications from Amazon Simple Storage Service (Amazon S3). Amazon S3 can send an event to a Lambda function whenever an object is created or deleted. You configure notification settings on the bucket, and you grant Amazon S3 permission to invoke the function in the function’s resource-based permissions policy.
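For readers who prefer scripting over the console, the notification settings described above can be sketched with the AWS SDK for Python (boto3). This is a minimal sketch, not the article's method: the helper names, the bucket name, and the function ARN are hypothetical, and the S3 client is passed in so the request shape can be inspected without AWS access.

```python
def build_notification_config(function_arn,
                              events=("s3:ObjectCreated:*", "s3:ObjectRemoved:*")):
    """Return a NotificationConfiguration that routes S3 events to one Lambda function."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                # Event types that should invoke the function.
                "Events": list(events),
            }
        ]
    }


def configure_bucket_notifications(s3_client, bucket, function_arn):
    """Apply the configuration; this call replaces the bucket's existing notification settings."""
    s3_client.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_notification_config(function_arn),
    )


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   configure_bucket_notifications(
#       boto3.client("s3"),
#       "my-s3-bucket",  # hypothetical bucket name
#       "arn:aws:lambda:us-east-2:123456789012:function:my-s3-function",  # hypothetical ARN
#   )
```

Amazon S3 must already be allowed to invoke the function in the function's resource-based policy for this request to succeed.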

Amazon S3 invokes your function asynchronously with an event that contains details about the object. The following example shows an event that Amazon S3 sent when a deployment package was uploaded to Amazon S3.

Event notification example for Amazon S3:

{
  "Records": [
    {
      "eventVersion": "2.1",
      "eventSource": "aws:s3",
      "awsRegion": "us-east-2",
      "eventTime": "2019-09-03T19:37:27.192Z",
      "eventName": "ObjectCreated:Put",
      "userIdentity": {
        "principalId": "AWS:AIDAINPONIXQXHT3IKHL2"
      },
      "requestParameters": {
        "sourceIPAddress": "205.255.255.255"
      },
      "responseElements": {
        "x-amz-request-id": "D82B88E5F771F645",
        "x-amz-id-2": "vlR7PnpV2Ce81l0PRw6jlUpck7Jo5ZsQjryTjKlc5aLWGVHPZLj5NeC6qMa0emYBDXOo6QBU0Wo="
      },
      "s3": {
        "s3SchemaVersion": "1.0",
        "configurationId": "828aa6fc-f7b5-4305-8584-487c791949c1",
        "bucket": {
          "name": "DOC-EXAMPLE-BUCKET",
          "ownerIdentity": {
            "principalId": "A3I5XTEXAMAI3E"
          },
          "arn": "arn:aws:s3:::DOC-EXAMPLE-BUCKET"
        },
        "object": {
          "key": "b21b84d653bb07b05b1e6b33684dc11b",
          "size": 1305107,
          "eTag": "b21b84d653bb07b05b1e6b33684dc11b",
          "sequencer": "0C0F6F405D6ED209E1"
        }
      }
    }
  ]
}
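To make the structure of this event concrete, here is a small Python sketch that walks the Records list and pulls out the fields a function typically needs. The helper name summarize_s3_event is hypothetical. The sketch assumes that object keys arrive URL-encoded in the event, which is why the key is decoded, and that size may be missing on some event types.

```python
import urllib.parse


def summarize_s3_event(event):
    """Return one summary dict per record in an S3 event notification."""
    summaries = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        summaries.append({
            "event_name": record["eventName"],
            "bucket": s3_info["bucket"]["name"],
            # Decode URL-encoded keys, e.g. "+" back to a space.
            "key": urllib.parse.unquote_plus(s3_info["object"]["key"]),
            # Use .get() because size may be absent (assumption: e.g. on delete events).
            "size": s3_info["object"].get("size"),
        })
    return summaries
```

Passing the sample event above to summarize_s3_event would yield a single summary for the ObjectCreated:Put record.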

To invoke your function, Amazon S3 needs permission from the function’s resource-based policy. When you configure an Amazon S3 trigger in the Lambda console, the console modifies the resource-based policy to allow Amazon S3 to invoke the function for the matching bucket name and account ID. If you configure the notification in Amazon S3 instead, you update the policy yourself using the Lambda API. You can also use the Lambda API to grant permission to another account or to restrict the permission to a particular alias.
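As an illustration of updating the policy through the Lambda API, the following sketch builds the arguments for boto3's add_permission call. The helper names, the statement ID, and the example account ID are hypothetical. The optional qualifier argument shows how the permission could be limited to one alias.

```python
def build_s3_invoke_permission(function_name, bucket, account_id,
                               statement_id="s3-invoke", qualifier=None):
    """Build add_permission arguments that let one S3 bucket invoke the function."""
    args = {
        "FunctionName": function_name,
        "StatementId": statement_id,          # any identifier unique within the policy
        "Action": "lambda:InvokeFunction",
        "Principal": "s3.amazonaws.com",
        "SourceArn": f"arn:aws:s3:::{bucket}",
        "SourceAccount": account_id,          # account that owns the bucket
    }
    if qualifier is not None:
        # Restrict the permission to a specific alias or version.
        args["Qualifier"] = qualifier
    return args


def allow_s3_to_invoke(lambda_client, function_name, bucket, account_id, qualifier=None):
    """Add the statement to the function's resource-based policy."""
    return lambda_client.add_permission(
        **build_s3_invoke_permission(function_name, bucket, account_id,
                                     qualifier=qualifier)
    )


# Usage (assumes boto3 and configured credentials; values are placeholders):
#   import boto3
#   allow_s3_to_invoke(boto3.client("lambda"), "my-s3-function",
#                      "my-s3-bucket", "123456789012")
```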

AWS Lambda S3 Steps: Invoking a Lambda Function from an Amazon S3 Trigger

The first step in using AWS Lambda S3 is invoking a Lambda function from an Amazon S3 trigger. We will discuss how to use the console to create a Lambda function and configure a trigger for Amazon Simple Storage Service (Amazon S3). Then, whenever an object is added to your Amazon S3 bucket, the trigger invokes the function you specified.

Please Note: You need an AWS account to use Lambda and other services on AWS. Visit aws.amazon.com to create an account if you don’t already have one. 

It’s assumed that you are familiar with the Lambda console and basic Lambda operations.

AWS Lambda S3 Steps: Creating a Bucket and Uploading a Sample Object

The next step in using AWS Lambda S3 is to create a bucket. You will need to create an Amazon S3 bucket and upload a test file to it. When you later test the function from the console, your Lambda function retrieves information about this file.

You can use the console to create an Amazon S3 bucket by following the steps below:

  • Step 1: Log in to the Amazon S3 console.
  • Step 2: Select the Create bucket option.
  • Step 3: Do the following in General configuration:
    • Name your bucket with a unique name.
    • Select an AWS Region. Note that you must create your Lambda function in the same Region.
  • Step 4: Select the Create bucket option.

Once the bucket has been created, Amazon S3 opens the Buckets page, which displays a list of all buckets for the current region in your account.
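The console steps above have a scripted counterpart. The sketch below, written against boto3's create_bucket call, is one possible way to do it. The helper names and the example bucket name are hypothetical. It accounts for one known quirk: buckets in us-east-1 are created without a LocationConstraint.

```python
def build_create_bucket_args(bucket, region):
    """Build create_bucket arguments for the given Region."""
    args = {"Bucket": bucket}
    if region != "us-east-1":
        # Every Region other than us-east-1 needs an explicit LocationConstraint.
        args["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return args


def create_bucket(s3_client, bucket, region):
    """Create the bucket; the client should be configured for the same Region."""
    return s3_client.create_bucket(**build_create_bucket_args(bucket, region))


# Usage (assumes boto3 and configured credentials; the bucket name is a placeholder
# and must be globally unique):
#   import boto3
#   create_bucket(boto3.client("s3", region_name="us-east-2"),
#                 "my-s3-bucket", "us-east-2")
```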

The Amazon S3 console can be used to upload a test object:

  • Step 1: Using the buckets page of the Amazon S3 console, select the bucket that you created from the list of bucket names.
  • Step 2: Select the Upload option under the Objects tab.
  • Step 3: Drag and drop a sample file from your local computer to the Upload page.
  • Step 4: Select the Upload option.
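The same upload can be scripted. This is a minimal sketch using boto3's upload_file method. The helper name and the file and bucket names are hypothetical, and the object key defaults to the local file name.

```python
from pathlib import Path


def upload_sample_object(s3_client, local_path, bucket, key=None):
    """Upload a local file to the bucket and return the object key that was used."""
    path = Path(local_path)
    object_key = key or path.name  # default: reuse the file name as the key
    s3_client.upload_file(str(path), bucket, object_key)
    return object_key


# Usage (assumes boto3, configured credentials, and an existing local file):
#   import boto3
#   upload_sample_object(boto3.client("s3"), "HappyFace.jpg", "my-s3-bucket")
```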

AWS Lambda S3 Steps: Creating the Lambda Function

The next step in using AWS Lambda S3 is to create the Lambda function from a function blueprint. A blueprint provides a sample function that demonstrates how Lambda can be used with other AWS services. In addition, blueprints include sample code and function configuration presets for a certain runtime. For this tutorial, you can choose the blueprint for either Node.js or Python.

Creating Lambdas from blueprints on the console

To create a Lambda function from a blueprint in the console, follow the steps below:

  • Step 1: Go to the Lambda console’s Functions page.
  • Step 2: Select Create function.
  • Step 3: Select Use a blueprint under Create function.
  • Step 4: Search for S3 under Blueprints.
  • Step 5: Select one of these options from the search results:
    • Choose s3-get-object for a Node.js function.
    • Choose s3-get-object-python if you want a Python function.
  • Step 6: Select Configure.
  • Step 7: Do the following under Basic information:
    • For Function name, enter my-s3-function.
    • For Execution role, choose Create a new role from AWS policy templates.
    • For Role name, enter my-s3-function-role.
  • Step 8: Under S3 trigger, select the S3 bucket that you created previously. When you configure an S3 trigger in the Lambda console, the console modifies your function’s resource-based policy to allow Amazon S3 to invoke the function.
  • Step 9: Select Create function.

AWS Lambda S3 Steps: Review the Function Code

The next step in using AWS Lambda S3 is to review the function code. The Lambda function retrieves the source S3 bucket name and the key name of the uploaded object from the event parameter that it receives. The function then uses the Amazon S3 getObject API to retrieve the content type of the object.
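The behaviour described above can be approximated in Python as follows. This is a sketch written for illustration, not the blueprint's actual source. The optional s3_client parameter is an addition that lets the handler run locally with a stand-in client, and boto3 is assumed to be available when the code runs in Lambda.

```python
import urllib.parse


def extract_bucket_and_key(event):
    """Read the bucket name and (URL-decoded) object key from the first record."""
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = urllib.parse.unquote_plus(record["s3"]["object"]["key"], encoding="utf-8")
    return bucket, key


def lambda_handler(event, context, s3_client=None):
    """Return the content type of the object that triggered the event."""
    if s3_client is None:
        # Imported lazily so the module also loads where boto3 is not installed.
        import boto3
        s3_client = boto3.client("s3")
    bucket, key = extract_bucket_and_key(event)
    # get_object is the boto3 counterpart of the getObject API.
    response = s3_client.get_object(Bucket=bucket, Key=key)
    content_type = response["ContentType"]
    print("CONTENT TYPE: " + content_type)
    return content_type
```

The execution role needs permission to read objects from the bucket, otherwise the get_object call fails.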

AWS Lambda S3 Steps: Test in the Console

The next step in using AWS Lambda S3 is to run the Lambda function manually with sample data from Amazon S3.

The console can be used to test the Lambda function by following the steps below:

  • Step 1: On the Code tab, click the arrow next to Test and select Configure test events from the dropdown list.
  • Step 2: Configure the test event as follows:
    • Select Create a new test event.
    • Select Amazon S3 Put (s3-put) as the template for your event.
    • Give the test event a name. For instance, mys3testevent.
    • In the test event JSON, replace the example bucket name (example-bucket) and the object key with your bucket name and test file name. You should get something like this:
{
  "Records": [
    {
      "eventVersion": "2.0",
      "eventSource": "aws:s3",
      "awsRegion": "us-west-2",
      "eventTime": "1970-01-01T00:00:00.000Z",
      "eventName": "ObjectCreated:Put",
      "userIdentity": {
        "principalId": "EXAMPLE"
      },
      "requestParameters": {
        "sourceIPAddress": "127.0.0.1"
      },
      "responseElements": {
        "x-amz-request-id": "EXAMPLE123456789",
        "x-amz-id-2": "EXAMPLE123/5678abcdefghijklambdaisawesome/mnopqrstuvwxyzABCDEFGH"
      },
      "s3": {
        "s3SchemaVersion": "1.0",
        "configurationId": "testConfigRule",
        "bucket": {
          "name": "my-s3-bucket",
          "ownerIdentity": {
            "principalId": "EXAMPLE"
          },
          "arn": "arn:aws:s3:::my-s3-bucket"
        },
        "object": {
          "key": "HappyFace.jpg",
          "size": 1024,
          "eTag": "0123456789abcdef0123456789abcdef",
          "sequencer": "0A1B2C3D4E5F678901"
        }
      }
    }
  ]
}
  • Step 3: Select the Create option.
  • Step 4: To invoke the function with your test event, choose Test under Code source.
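The console test above can also be approximated from a script. The sketch below builds an event shaped like the s3-put sample and sends it through boto3's invoke call. The helper names are hypothetical and the event contains only a subset of the template's fields. The key is inserted as given, so keys with special characters would need URL-encoding first.

```python
import json


def build_s3_put_test_event(bucket, key, size=1024, region="us-west-2"):
    """Build a minimal S3 Put test event for the given bucket and object key."""
    return {
        "Records": [
            {
                "eventVersion": "2.0",
                "eventSource": "aws:s3",
                "awsRegion": region,
                "eventTime": "1970-01-01T00:00:00.000Z",
                "eventName": "ObjectCreated:Put",
                "s3": {
                    "s3SchemaVersion": "1.0",
                    "bucket": {"name": bucket, "arn": f"arn:aws:s3:::{bucket}"},
                    "object": {"key": key, "size": size},
                },
            }
        ]
    }


def invoke_with_test_event(lambda_client, function_name, event):
    """Invoke the function synchronously and return the HTTP status code."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=json.dumps(event).encode("utf-8"),
    )
    return response["StatusCode"]


# Usage (assumes boto3 and configured credentials; names are placeholders):
#   import boto3
#   event = build_s3_put_test_event("my-s3-bucket", "HappyFace.jpg")
#   invoke_with_test_event(boto3.client("lambda"), "my-s3-function", event)
```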

What Makes Hevo’s Data Ingestion Process from Amazon S3 Unique

Transforming data can be a mammoth task without the right set of tools. Hevo’s automated platform empowers you with everything you need to have a smooth Data Collection, Processing and Aggregation experience. Our platform has the following in store for you!

  • Exceptional Security: A Fault-tolerant Architecture that ensures Zero Data Loss.
  • Built to Scale: Exceptional Horizontal Scalability with Minimal Latency for Modern-data Needs.
  • Built-in Connectors: Support for 100+ Data Sources, including Databases, SaaS Platforms, Files & More. Native Webhooks & REST API Connector available for Custom Sources.
  • Transformations: Hevo provides preload transformations to make your incoming data from AWS S3 and AWS Elasticsearch fit for the chosen destination. You can also use drag and drop transformations like Date and Control Functions, JSON, and Event Manipulation to name a few.
  • Blazing-fast Setup: Straightforward interface for new customers to work on, with minimal setup time.
  • Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
  • Auto Schema Mapping: Hevo takes away the tedious task of schema management & automatically detects the format of incoming data and replicates it to the destination schema. You can also choose between Full & Incremental Mappings to suit your Data Replication requirements. 

Simplify your Data Analysis with Hevo today! SIGN UP HERE FOR A 14-DAY FREE TRIAL!

AWS Lambda S3 Steps: Use the S3 Trigger to Test

The next step in using AWS Lambda S3 is to test with the S3 trigger. Your function is invoked as soon as you upload a file to the Amazon S3 source bucket.

For testing Lambda functions using S3 triggers, follow the steps below:

  • Step 1: On the Buckets page of the Amazon S3 console, select the name of the bucket that you created earlier.
  • Step 2: On the Upload page, upload a few .jpg or .png image files to the bucket.
  • Step 3: On the Lambda console, open the Functions page.
  • Step 4: Select the name of your function (my-s3-function).
  • Step 5: Check the Monitor tab to see if each file you uploaded was processed once. Lambda sends metrics to CloudWatch, which are presented on this page. The count in the Invocations graph should match the number of files that you uploaded to the Amazon S3 bucket.
  • Step 6: The logs can be viewed in the CloudWatch console by selecting View logs in CloudWatch on the menu. Choose a log stream to display the logs for one of the function invocations.
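The invocation count from the Monitor tab can also be read programmatically. The sketch below sums the Invocations metric through boto3's CloudWatch get_metric_statistics call. The helper name and the one-hour default window are arbitrary choices for illustration, and metrics can take a short while to appear after the uploads.

```python
from datetime import datetime, timedelta, timezone


def count_invocations(cloudwatch_client, function_name, minutes=60, now=None):
    """Sum the Lambda Invocations metric for a function over the last `minutes`."""
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(minutes=minutes)
    response = cloudwatch_client.get_metric_statistics(
        Namespace="AWS/Lambda",
        MetricName="Invocations",
        Dimensions=[{"Name": "FunctionName", "Value": function_name}],
        StartTime=start,
        EndTime=end,
        Period=60,              # one datapoint per minute
        Statistics=["Sum"],
    )
    return int(sum(point["Sum"] for point in response["Datapoints"]))


# Usage (assumes boto3 and configured credentials):
#   import boto3
#   print(count_invocations(boto3.client("cloudwatch"), "my-s3-function"))
```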

AWS Lambda S3 Steps: Clean Up your Resources

The next step in using AWS Lambda S3 is to clean up your resources. If you don’t want to keep the resources that you created for this tutorial, you can delete them now. Deleting AWS resources that you no longer use prevents unnecessary charges to your AWS account.

In order to delete a Lambda function, follow the steps below:

  • Step 1: Go to the Lambda console’s Functions page.
  • Step 2: Choose the function you created.
  • Step 3: Select Delete from the Actions menu.
  • Step 4: Click Delete.

Deleting the IAM policy

To delete the IAM policy that Lambda created, follow the steps below:

  • Step 1: In the AWS Identity and Access Management (IAM) console, open the Policies page.
  • Step 2: Select the policy that Lambda created for you. The name of the policy starts with AWSLambdaS3ExecutionRole-.
  • Step 3: Select Delete from the Policy actions menu.
  • Step 4: Click Delete.

Deleting an execution role

To delete the execution role, follow the steps below:

  • Step 1: Navigate to the Roles page of the IAM console.
  • Step 2: Select the execution role that you created.
  • Step 3: Select Delete role.
  • Step 4: Select Yes, delete.

Deleting the S3 bucket

To delete the S3 bucket, follow the steps below:

  • Step 1: Launch the Amazon S3 console.
  • Step 2: Choose the bucket you created.
  • Step 3: Select Delete.
  • Step 4: Enter the name of the bucket in the text box to confirm.
  • Step 5: Select Confirm.
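Bucket cleanup can be scripted as well. One thing to keep in mind is that Amazon S3 only deletes empty buckets, so the sketch below removes the objects first. The helper name is hypothetical, and the sketch assumes a bucket without versioning. A versioned bucket would also need its object versions removed.

```python
def empty_and_delete_bucket(s3_client, bucket):
    """Delete every object in the bucket, then the bucket; return the object count."""
    deleted = 0
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        # "Contents" is missing from the page when the bucket is empty.
        objects = [{"Key": item["Key"]} for item in page.get("Contents", [])]
        if objects:
            # Each page holds at most 1,000 keys, which fits one delete_objects call.
            s3_client.delete_objects(Bucket=bucket, Delete={"Objects": objects})
            deleted += len(objects)
    s3_client.delete_bucket(Bucket=bucket)
    return deleted


# Usage (assumes boto3 and configured credentials; this is irreversible):
#   import boto3
#   empty_and_delete_bucket(boto3.client("s3"), "my-s3-bucket")
```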

Conclusion

You have successfully learned how to use AWS Lambda S3. In this tutorial, you used the console to create a Lambda function from a blueprint and configured an S3 trigger that invokes the function whenever an object is uploaded to the bucket. As a next step, you can try a more advanced exercise in which an S3 trigger invokes a function that creates a thumbnail image for each image uploaded to an S3 bucket. That exercise calls for some familiarity with the AWS and Lambda domains, uses the AWS Command Line Interface (AWS CLI) to create resources, and packages the function and its dependencies as a .zip file deployment package.

However, as a Developer, extracting complex data from a diverse set of data sources like Databases, CRMs, Project Management Tools, Streaming Services, and Marketing Platforms into your Database can be quite challenging. If you come from a non-technical background or are new to data warehousing and analytics, Hevo Data can help!

Hevo Data will automate your data transfer process, allowing you to focus on other aspects of your business like Analytics, Customer Management, etc. This platform allows you to transfer data from 100+ sources like Amazon S3 to Cloud-based Data Warehouses like Snowflake, Google BigQuery, Amazon Redshift, etc. It will provide you with a hassle-free experience and make your work life much easier.

Want to take Hevo for a spin? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand.

You can also have a look at our unbeatable pricing that will help you choose the right plan for your business needs!
