Working with Boto3 Lambda (AWS): 4 Easy Steps


Lambda is a compute service that allows you to run code without having to provision or manage servers. Lambda executes your code on a High-Availability Compute Infrastructure and manages all Compute Resources, including Server and Operating System Maintenance, Capacity Provisioning and Automatic Scaling, Code Monitoring, and Logging.

AWS Lambda is a highly scalable and cost-effective serverless compute service from Amazon Web Services (AWS) that lets you execute code as self-contained applications in an efficient and flexible manner. It can do anything from serving web pages and processing data streams to calling APIs and connecting with other AWS and non-AWS services.

Our aim is to work with Boto3 and AWS Lambda: doing some data wrangling and saving the metrics and charts as report files in an S3 bucket. We will use Python 3, boto3, and a few more libraries loaded in Lambda Layers. Although using the AWS console is not the best-practice approach to cloud computing, we will demonstrate each step using the console because it is more convenient for novices learning the basic structure of AWS Lambda.

This article describes how to use the Boto3 Lambda for AWS to deploy and update Lambda functions.


What is AWS Lambda?


AWS Lambda is a serverless, event-driven computing platform offered by Amazon as part of Amazon Web Services. It is a computing service that runs code in response to events and manages the computing resources required by that code automatically.

As of 2018, Node.js, Python, Java, Go, Ruby, and C# (via .NET) are all officially supported. AWS Lambda gained custom runtime support in late 2018. AWS Lambda also supports the execution of native Linux executables by calling out from a supported runtime, such as Node.js. Haskell code, for example, can be run on Lambda this way.

AWS Lambda was created for use cases such as uploading images or objects to Amazon S3, updating DynamoDB tables, responding to website clicks, or responding to sensor readings from an IoT-connected device. AWS Lambda can also be used to automatically provision back-end services triggered by custom HTTP requests, as well as “spin down” such services when they are not in use to save resources. These custom HTTP requests are set up in AWS API Gateway, which can also handle authentication and authorization when used in conjunction with AWS Cognito.

Key Features of AWS Lambda

Here are some features of AWS Lambda:

  • Integrated Fault Tolerance: To help protect your code against individual machine or data center facility failures, AWS Lambda maintains compute capacity across multiple Availability Zones (AZs) in each AWS Region.
  • Scaling on the fly: AWS Lambda executes your code only when it is required and scales automatically to meet the volume of incoming requests without any manual configuration. There is no fixed limit on the number of requests your code can handle, although account-level concurrency quotas still apply.
  • Access Shared File Systems: You can securely read, write, and persist large volumes of data at any scale using the Amazon Elastic File System (EFS) for AWS Lambda. To process data, you do not need to write code or download it to temporary storage. This saves you time, allowing you to concentrate on your business logic.
  • Custom Logic can be used to extend other AWS services: AWS Lambda enables you to add custom logic to AWS resources like Amazon S3 buckets and Amazon DynamoDB tables, allowing you to easily compute data as it enters or moves through the cloud.
  • Create your own Backend Services: AWS Lambda can be used to build new backend application services that are triggered on-demand via the Lambda application programming interface (API) or custom API endpoints built with Amazon API Gateway.
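The last point, triggering a backend function on demand through the Lambda API, can be sketched with boto3 as follows. This is a minimal sketch, not a definitive implementation: the helper name `invoke_function` and the function name in the usage comment are hypothetical, and the client is passed in so the helper can be exercised without AWS credentials.

```python
import json


def invoke_function(lambda_client, function_name, payload):
    """Synchronously invoke a Lambda function and return its decoded JSON result."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the function to finish
        Payload=json.dumps(payload).encode("utf-8"),
    )
    # The response payload is a streaming body, so read it before decoding.
    return json.loads(response["Payload"].read())


# Typical use (needs AWS credentials and an existing function; names are hypothetical):
#   import boto3
#   result = invoke_function(boto3.client("lambda"), "my-function", {"ping": 1})
```

Passing `InvocationType="Event"` instead would queue the call asynchronously, in which case there is no result payload to decode.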

What is Boto3?


Boto is a Software Development Kit (SDK) that aims to improve the use of Python programming in Amazon Web Services. The Boto project began as a user-contributed library to assist developers in building Python-based cloud applications by converting AWS application programming interface (API) responses into Python classes.

Boto has been designated as the official AWS Python SDK. Boto comes in three flavors: Boto, Boto3, and Botocore. Boto3 is the most recent version of the SDK. Its early releases supported Python 2.6.5, 2.7, and 3.3 and later, while current releases run only on Python 3. Boto3 includes a number of service-specific features to make development easier, and it is compatible with all current AWS cloud services, such as Elastic Compute Cloud, DynamoDB, AWS Config, CloudWatch, and Simple Storage Service.

Boto3 replaced Boto version 2, which is incompatible with the most recent versions of Python but remains appealing to software developers who use older versions of the programming language. Botocore provides more basic access to AWS tools, allowing users to make low-level client requests and receive results from APIs.

Key Features of Boto3

Features of Boto3 are listed below:

  • APIs for Resources: Boto3’s APIs are divided into two levels. Client (or “low-level”) APIs map to the underlying HTTP API operations one-to-one. Resource APIs hide explicit network calls and instead provide resource objects and collections for accessing attributes and performing actions.
  • Consistent and up-to-date Interface: Boto3’s ‘client’ and ‘resource’ interfaces are driven by JSON models that describe AWS APIs and have dynamically generated classes. This allows AWS to deliver updates quickly while maintaining high consistency across all supported services.
  • Waiters: Boto3 includes ‘waiters,’ which poll for pre-defined status changes in AWS resources. You can, for example, start an Amazon EC2 instance and use a waiter to wait until it reaches the ‘running’ state, or you can create a new Amazon DynamoDB table and wait until it is ready to use.
  • High-level Service-Specific Features: Boto3 includes many service-specific features, such as automatic multi-part transfers for Amazon S3 and simplified query conditions for Amazon DynamoDB.
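To make the waiter idea concrete, here is a small sketch that blocks until a DynamoDB table is ready. The helper name `wait_until_table_ready` and its default polling values are assumptions for illustration; `get_waiter("table_exists")` and the `WaiterConfig` argument are part of the boto3 DynamoDB client.

```python
def wait_until_table_ready(dynamodb_client, table_name, delay=5, max_attempts=12):
    """Poll DynamoDB until the table exists, checking every `delay` seconds."""
    waiter = dynamodb_client.get_waiter("table_exists")
    waiter.wait(
        TableName=table_name,
        # Delay and MaxAttempts control how often and how long the waiter polls.
        WaiterConfig={"Delay": delay, "MaxAttempts": max_attempts},
    )


# Typical use (needs AWS credentials):
#   import boto3
#   wait_until_table_ready(boto3.client("dynamodb"), "customer")
```

If the table is still not ready after the last attempt, the waiter raises an error instead of returning.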

Creating Boto3 Lambda Application


We now understand what Boto3 is and what features it offers. Let’s create a basic Python serverless application using Lambda and Boto3.

In this example, when a file is uploaded to an S3 bucket, a Lambda function is triggered to read the file and store its content in a DynamoDB table.

Step 1: Create a JSON file

First, let’s make a small JSON file with some sample customer data. This is the file we’ll put in the S3 bucket. Let us call it data.json.

#data.json

{
"customerId": "xy100",
"firstName": "Tom",
"lastName": "Alter",
"status": "active",
"isPremium": true
}
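Before uploading, it can help to check locally that the file parses and carries the field the table will be keyed on. The sketch below assumes the key is named `customerId`, as in the sample file; the helper name `load_customer` is hypothetical.

```python
import json

REQUIRED_KEY = "customerId"  # assumed partition key, matching the sample file


def load_customer(text):
    """Parse JSON text and check that the partition key is a non-empty string."""
    record = json.loads(text)
    if not isinstance(record, dict):
        raise ValueError("expected a JSON object at the top level")
    value = record.get(REQUIRED_KEY)
    if not isinstance(value, str) or not value:
        raise ValueError(f"missing or empty {REQUIRED_KEY!r}")
    return record
```

Calling `load_customer(open("data.json").read())` raises a `ValueError` for a file without the key, which is cheaper to discover locally than after the upload.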

Step 2: Make an S3 bucket

Let’s now create an S3 bucket to which the JSON file will be uploaded. Let’s call it boto3customer for now (bucket names must be lowercase and cannot contain spaces). For this example, we’ve created a bucket with all of the default settings.
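The same bucket could also be created from Python rather than the console. A minimal sketch, assuming the bucket name boto3customer and a region of your choice: the helper `bucket_params` is hypothetical and only builds the arguments for the S3 client's `create_bucket` call.

```python
def bucket_params(bucket_name, region):
    """Build keyword arguments for the S3 client's create_bucket call."""
    params = {"Bucket": bucket_name}
    # us-east-1 is the default location and must not be sent as a constraint.
    if region != "us-east-1":
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return params


# Typical use (needs AWS credentials; region is an example value):
#   import boto3
#   s3_client = boto3.client("s3", region_name="eu-west-1")
#   s3_client.create_bucket(**bucket_params("boto3customer", "eu-west-1"))
```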


Step 3: Make a DynamoDB Table

Let’s make a DynamoDB table (customer) to store the content of the JSON file. The customerId attribute should be marked as the partition key. This field must be present in the JSON file before the record can be inserted into the table; otherwise, DynamoDB will reject the item because of the missing key.
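For readers who prefer code to the console, the table definition might look like the sketch below. It assumes a string partition key named `customerId` (the field used in data.json) and on-demand billing; both are illustrative choices, and `customer_table_params` is a hypothetical helper.

```python
def customer_table_params(table_name="customer", key_name="customerId"):
    """Build keyword arguments for the DynamoDB client's create_table call."""
    return {
        "TableName": table_name,
        # HASH marks the partition key; no sort key is defined here.
        "KeySchema": [{"AttributeName": key_name, "KeyType": "HASH"}],
        # Only key attributes are declared up front; "S" means string.
        "AttributeDefinitions": [{"AttributeName": key_name, "AttributeType": "S"}],
        "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
    }


# Typical use (needs AWS credentials):
#   import boto3
#   boto3.client("dynamodb").create_table(**customer_table_params())
```

Other attributes such as firstName or isPremium need no declaration, because DynamoDB only enforces the key schema.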


Step 4: Construct a Lambda Function

To interact with these services, we must first create an IAM role with access to CloudWatch Logs, S3, and DynamoDB. Then, we’ll write code using boto3 to download, parse, and save the data to the customer DynamoDB table. Finally, we’ll create a trigger that connects the S3 bucket to the Lambda function so that when we push a file into the bucket, it is picked up by the function.
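The trigger mentioned above is normally added from the console, but once the function exists it can also be wired up with boto3. The sketch below is one possible way to do it: the helper names, the statement id, and the `.json` suffix filter are assumptions, while `add_permission` and `put_bucket_notification_configuration` are existing client calls.

```python
def notification_config(function_arn, suffix=".json"):
    """Describe an S3 notification that calls the function for new objects."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                "Events": ["s3:ObjectCreated:*"],
                # Only objects whose key ends with the suffix fire the trigger.
                "Filter": {"Key": {"FilterRules": [{"Name": "suffix", "Value": suffix}]}},
            }
        ]
    }


def connect_bucket(s3_client, lambda_client, bucket, function_arn):
    """Let S3 invoke the function, then register the bucket notification."""
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId="s3-invoke",  # hypothetical statement id
        Action="lambda:InvokeFunction",
        Principal="s3.amazonaws.com",
        SourceArn=f"arn:aws:s3:::{bucket}",
    )
    s3_client.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=notification_config(function_arn),
    )
```

The permission is granted first because S3 validates that it may invoke the function when the notification is registered.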

Let’s start by creating an IAM role. The IAM role must have at least read access to S3, write access to DynamoDB, and full access to the CloudWatch Logs service so that every invocation can be logged:
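As a rough illustration of what such a role contains, the sketch below builds the two policy documents as Python dictionaries. It is deliberately narrower than full CloudWatch Logs access: the action lists are an assumption about the minimum this example needs, and `execution_policy` and `TRUST_POLICY` are hypothetical names.

```python
import json

# Trust policy: allows the Lambda service to assume the role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def execution_policy(bucket_name, table_arn):
    """Permissions for reading the bucket, writing the table, and logging."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["s3:GetObject"],
             "Resource": f"arn:aws:s3:::{bucket_name}/*"},
            {"Effect": "Allow", "Action": ["dynamodb:PutItem"],
             "Resource": table_arn},
            {"Effect": "Allow",
             "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
             "Resource": "*"},
        ],
    }


# Typical use (needs AWS credentials; role and policy names are hypothetical):
#   import boto3
#   iam = boto3.client("iam")
#   iam.create_role(RoleName="demo-role", AssumeRolePolicyDocument=json.dumps(TRUST_POLICY))
#   iam.put_role_policy(RoleName="demo-role", PolicyName="demo-access",
#                       PolicyDocument=json.dumps(execution_policy(bucket, table_arn)))
```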


Create a function now. Give it a unique name and choose Python 3.7 (or a newer supported Python 3 runtime) as the runtime language.

Now, select the role Boto3 Lambda S3 DynamoDB that we created earlier and press the Create function button.
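The console steps above have a boto3 equivalent, which is also how a function can be deployed and later updated from a script. The following is a sketch under stated assumptions: the helper names, the `lambda_function.py` filename, and the default runtime string are illustrative, while `create_function` and `update_function_code` are existing Lambda client calls.

```python
import io
import zipfile


def build_deployment_zip(source_code, filename="lambda_function.py"):
    """Package one Python source string into an in-memory zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        info = zipfile.ZipInfo(filename)
        info.compress_type = zipfile.ZIP_DEFLATED
        info.external_attr = 0o644 << 16  # make the file world-readable for Lambda
        archive.writestr(info, source_code)
    return buffer.getvalue()


def create_function(lambda_client, name, role_arn, zip_bytes, runtime="python3.12"):
    """Create the function; the handler string assumes lambda_function.py above."""
    return lambda_client.create_function(
        FunctionName=name,
        Runtime=runtime,
        Role=role_arn,
        Handler="lambda_function.lambda_handler",
        Code={"ZipFile": zip_bytes},
    )


def update_function(lambda_client, name, zip_bytes):
    """Upload new code for a function that already exists."""
    return lambda_client.update_function_code(FunctionName=name, ZipFile=zip_bytes)
```

Building the archive in memory avoids temporary files, which is convenient for small single-file functions like the one in this article.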

Now, for the Boto3 Lambda Function, follow the steps below:

  • Use the boto3 resource API to load the DynamoDB service object.
  • The event object passes the file’s metadata (S3 bucket, filename) to the handler.
  • Then, using S3 client action methods, load the S3 file data into a JSON object.
  • After parsing the JSON data, save it into the DynamoDB table (customer).

import json
from decimal import Decimal
from urllib.parse import unquote_plus

import boto3

# Create the service objects once, outside the handler, so warm invocations reuse them
dynamodb = boto3.resource('dynamodb')
s3_client = boto3.client('s3')

table = dynamodb.Table('customer')

def lambda_handler(event, context):
    # Retrieve file information from the S3 event record
    record = event['Records'][0]['s3']
    bucket_name = record['bucket']['name']
    # Object keys arrive URL-encoded in S3 events, so decode them first
    s3_file_name = unquote_plus(record['object']['key'])

    # Load the file content and parse it as JSON
    json_object = s3_client.get_object(Bucket=bucket_name, Key=s3_file_name)
    json_file_content = json_object['Body'].read()
    # DynamoDB does not accept Python floats, so parse decimal numbers as Decimal
    json_dict = json.loads(json_file_content, parse_float=Decimal)

    # Save data in the DynamoDB table
    table.put_item(Item=json_dict)
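To see what the handler receives, it helps to build a minimal S3 event by hand. The sketch below is a simplified stand-in for a real notification (real events carry many more fields), and both helper names are hypothetical. It also shows that object keys arrive URL-encoded, so a key containing a space has to be decoded before it is passed to S3.

```python
from urllib.parse import quote_plus, unquote_plus


def make_s3_event(bucket, key):
    """Build the smallest event that has the fields read by the handler."""
    encoded_key = quote_plus(key, safe="/")  # spaces become '+', as in S3 events
    return {"Records": [{"s3": {"bucket": {"name": bucket}, "object": {"key": encoded_key}}}]}


def extract_location(event):
    """Return (bucket, decoded key) from the first record of an S3 event."""
    record = event["Records"][0]["s3"]
    return record["bucket"]["name"], unquote_plus(record["object"]["key"])


event = make_s3_event("boto3customer", "reports/2024 q1/data.json")
print(event["Records"][0]["s3"]["object"]["key"])  # the encoded form
print(extract_location(event))                      # the decoded form
```

An event built this way can also be pasted into the Lambda console's test feature after converting it to JSON.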

Evaluating the Boto3 Lambda Function

Let’s put this Lambda function to the test.

  • To begin, upload a JSON file to the S3 bucket boto3customer.
  • The Lambda function is triggered as soon as the file is uploaded to the S3 bucket.
  • It runs the code that receives the file’s metadata via the event object and loads the file’s content using boto3 APIs.
  • The content is then saved to the customer table in DynamoDB.
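The same check can be scripted instead of done by hand. Because the function runs asynchronously after the upload, the sketch below polls the table for a short while rather than reading it once. The helper names and the polling defaults are assumptions; `put_object` and `get_item` are existing boto3 calls.

```python
import json
import time


def upload_record(s3_client, bucket, key, record):
    """Upload a dictionary to S3 as a JSON object."""
    s3_client.put_object(Bucket=bucket, Key=key, Body=json.dumps(record).encode("utf-8"))


def wait_for_item(table, key, attempts=10, delay=1.0):
    """Poll a DynamoDB Table resource until the item appears, or return None."""
    for _ in range(attempts):
        item = table.get_item(Key=key).get("Item")
        if item is not None:
            return item
        time.sleep(delay)
    return None


# Typical use (needs AWS credentials and the resources created above):
#   import boto3
#   upload_record(boto3.client("s3"), "boto3customer", "data.json", {"customerId": "xy100"})
#   table = boto3.resource("dynamodb").Table("customer")
#   print(wait_for_item(table, {"customerId": "xy100"}))
```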

The file content was saved in the DynamoDB table. This concludes our implementation of a serverless application with the Python runtime using the boto3 library.


Limitations of Boto3 Lambda Application

Here are some limitations you need to watch out for before using Boto3 Lambda:

  • Extensive Unit Testing: Given the distributed nature of serverless functions, having verification in place is important for peace of mind when deploying functions that run infrequently. While unit and functional testing cannot fully protect against all issues, it can provide you with the assurance you need to deploy your changes.
  • Time Limits: AWS Lambda functions have built-in time limits for execution. The timeout can be set to as many as 900 seconds, but the default is 3 seconds. If your function is likely to require long run times, make sure this value is properly configured.
  • Third-party Resources: Having complete control over every dependency and resource is powerful, but managing them yourself can also be distracting and time-consuming.
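The time limit discussed above can be raised from code as well as from the console. A minimal sketch, assuming a hypothetical helper name and function name: it checks the value against the 900-second ceiling before calling `update_function_configuration`.

```python
MAX_TIMEOUT_SECONDS = 900  # upper bound for a Lambda function timeout


def set_timeout(lambda_client, function_name, seconds):
    """Change a function's timeout after checking it is within the allowed range."""
    if not 1 <= seconds <= MAX_TIMEOUT_SECONDS:
        raise ValueError(f"timeout must be between 1 and {MAX_TIMEOUT_SECONDS} seconds")
    return lambda_client.update_function_configuration(
        FunctionName=function_name,
        Timeout=seconds,
    )


# Typical use (needs AWS credentials; the function name is hypothetical):
#   import boto3
#   set_timeout(boto3.client("lambda"), "customer-update", 30)
```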

Conclusion

AWS Lambda is an excellent resource for event-driven and intermittent workloads. Because Lambda functions are dynamic, you can create and iterate on your functionality right away, rather than having to spend cycles getting the basic infrastructure up and running and scaling properly.

Python brings a powerful language with a robust environment into the serverless realm and can be a powerful tool when best practices are followed carefully.

To become more efficient in handling your Databases, it is preferable to integrate them with a solution that can carry out Data Integration and Management procedures for you without much ado and that is where Hevo Data, a Cloud-based ETL Tool, comes in. Hevo Data supports 100+ Data Sources and helps you transfer your data from these sources to Data Warehouses like Amazon Redshift in a matter of minutes, all without writing any code!

Share your experience of Working with Boto3 Lambda (AWS) in the comments section below!

Davor DSouza
Research Analyst, Hevo Data

Davor DSouza is a data analyst with a passion for using data to solve real-world problems. His experience with data integration and infrastructure, combined with his Master's in Machine Learning, equips him to bridge the gap between theory and practical application. He enjoys diving deep into data and emerging with clear and actionable insights.
