Amazon S3 to RDS: 5 Easy Steps

• May 27th, 2021

Amazon’s Simple Storage Service (S3) and Relational Database Service (RDS) have both gained popularity in recent times owing to their simple, cost-effective, and efficient features. Many business owners and professionals who deal with large amounts of data have moved or linked their existing data warehouses to Amazon S3 and RDS. Some of these users need a direct integration between the two so that they can load data from S3 to RDS through their AWS accounts.

This article will help you explore the basic functionality and key features of Amazon S3 and Amazon RDS. You will also read about the steps you can follow to integrate Amazon S3 to RDS as well as the challenges faced in this process. 


Understanding Amazon S3

AWS S3
Image Source: Free Code Camp

Amazon S3, i.e. Simple Storage Service, is a highly scalable and versatile object storage service. It is designed for data availability, performance, and security, and it simplifies web-scale computing with easy storage and retrieval of data. Data is stored as objects inside S3 buckets, and each object consists of its data and its metadata. S3 also supports Batch Operations for acting on many objects at once, as illustrated below:

Amazon S3 Batch Operations
Image Source: AWS

Understanding the Key Features of Amazon S3

Alongside robust data storage and availability for easy computing, Amazon S3 offers a range of key services and features through its platform:

  • Metadata tags can be appended to objects to organize stored data and to manage its migration across S3 storage classes.
  • You can configure and implement a variety of data access controls to secure data and protect it against unauthorized users.
  • Built-in analytics features make it easier to monitor data and objects and to run Big Data analytics on them.
  • Storage in the form of objects enables the storage, analysis, and retrieval of large amounts of data from various platforms and multiple devices.
  • Multiple storage classes offer varying levels of availability, performance, and durability.

Understanding Amazon RDS

AWS RDS
Image Source: Whizlabs

Amazon RDS is a Relational Database Service that makes it easy to set up, operate, and scale relational databases in the cloud. The merits of Amazon RDS are illustrated below:

Benefits of Amazon RDS
Image Source: System Admins Pro

Understanding the Key Features of Amazon RDS

Amazon RDS offers several integral features for streamlined administration and easy data management. These include:

  • It offers resizable capacity and automates mundane tasks such as database setup, hardware provisioning, backups, and patching.
  • It allows push-button compute scaling, which lowers the administrative burden as performance needs grow.
  • The availability and durability of the system are improved with automated backups and robust processes.
  • RDS provides security through encryption at rest and in transit.
  • It monitors various metrics for easier manageability and follows a pay-for-use pricing model, which makes RDS cost-effective and efficient.

Simplify your Data Analysis with Hevo’s No-code Data Pipelines

Hevo Data, a No-code Data Pipeline, helps you transfer data from multiple sources (among 100+ sources) to the Data Warehouse/Destination of your choice and visualize it in your desired BI tool. Hevo is fully managed and completely automates the process of not only loading data from your desired source but also transforming it into an analysis-ready form, without you having to write a single line of code. Its fault-tolerant architecture ensures that the data is handled in a secure, consistent manner with zero data loss.

It provides a consistent & reliable solution to manage data in real-time and you always have analysis-ready data in your desired destination. It allows you to focus on key business needs and perform insightful analysis using a BI tool of your choice.


Check out Some of the Cool Features of Hevo:

  • Completely Automated: The Hevo platform can be set up in just a few minutes and requires minimal maintenance.
  • Real-Time Data Transfer: Hevo provides real-time data migration, so you can have analysis-ready data always.
  • 100% Complete & Accurate Data Transfer: Hevo’s robust infrastructure ensures reliable data transfer with zero data loss.
  • Scalable Infrastructure: Hevo has in-built integrations for 100+ sources that can help you scale your data infrastructure as required.
  • 24/7 Live Support: The Hevo team is available round the clock to extend exceptional support to you through chat, email, and support calls.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  • Live Monitoring: Hevo allows you to monitor the data flow so you can check where your data is at a particular point in time.

Steps to Integrate Amazon S3 to RDS

To integrate Amazon S3 to RDS and load data from an S3 bucket into an RDS instance, you can copy a CSV file from an S3 bucket in one AWS account to an RDS instance in another. The steps below use an Amazon Aurora (MySQL-compatible) cluster as the RDS target. Follow these steps:


S3 to RDS Step 1: Create and attach IAM role to RDS Cluster

Start by creating an IAM role for your RDS cluster in the second account. To do this, head over to the IAM page and click the Roles tab. Here, create a new role and select “service: RDS” along with “use case: RDS – CloudHSM and Directory Service”.

Image Source: AWS

Next, attach the “AmazonS3FullAccess” policy to the role, and then attach the role to the RDS cluster. To do this, select your Aurora cluster, click “Actions”, and choose “Manage IAM roles”. Add the new role to your cluster.
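
If you prefer to script this step instead of using the console, the sketch below shows one possible approach. It assumes the boto3 SDK: the two client arguments are expected to behave like boto3.client("iam") and boto3.client("rds"), and the role name and cluster identifier in the usage note are hypothetical placeholders. The trust policy is an assumption about what the console wizard sets up for an RDS role, so compare it with the role the console creates before relying on it.

```python
import json

# Trust policy that lets the RDS service assume the role (assumed equivalent
# of choosing "service: RDS" in the console wizard).
RDS_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "rds.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# AWS managed policy named in this step.
S3_FULL_ACCESS_ARN = "arn:aws:iam::aws:policy/AmazonS3FullAccess"


def create_and_attach_role(iam_client, rds_client, role_name, cluster_id):
    """Create the role, attach the S3 policy, and add the role to the cluster.

    Returns the ARN of the new role, which later steps need.
    """
    response = iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(RDS_TRUST_POLICY),
    )
    role_arn = response["Role"]["Arn"]
    iam_client.attach_role_policy(RoleName=role_name, PolicyArn=S3_FULL_ACCESS_ARN)
    rds_client.add_role_to_db_cluster(DBClusterIdentifier=cluster_id, RoleArn=role_arn)
    return role_arn
```

With real clients, a call such as create_and_attach_role(boto3.client("iam"), boto3.client("rds"), "rds-s3-role", "my-aurora-cluster") would run against the account your credentials point to. Passing the clients in keeps the function easy to dry-run with stand-in objects first.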

S3 to RDS Step 2: Create and Attach Parameter Group to RDS Cluster

The next step is to create and attach a custom parameter group to your cluster. For this, go to the RDS page > Parameter Groups and create a new parameter group. Set the type to “DB cluster parameter group” and add a name and a description.

Now, assign the ARN of the previously created IAM role to the aurora_load_from_s3_role and aws_default_s3_role parameters.

S3 to RDS: Creating parameter group
Image Source: Medium

You also need to attach this group to the RDS cluster. Go to the Databases tab, select your Aurora cluster, and modify it. Change its DB cluster parameter group to the one you just created.

Choosing database option
Image Source: Medium
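
The parameter group setup can be scripted along the same lines. This is a minimal sketch, assuming a client that behaves like boto3.client("rds"); the group name, parameter group family, and cluster identifier are hypothetical values you would replace with your own. It sets ApplyMethod to "pending-reboot" on the assumption that the change should take effect at the reboot in the next step.

```python
def build_role_parameters(role_arn):
    """Return the Parameters list that points both S3 role parameters at role_arn."""
    return [
        {
            "ParameterName": name,
            "ParameterValue": role_arn,
            # Assumption: apply at the next reboot rather than immediately.
            "ApplyMethod": "pending-reboot",
        }
        for name in ("aurora_load_from_s3_role", "aws_default_s3_role")
    ]


def configure_parameter_group(rds_client, group_name, family, cluster_id, role_arn):
    """Create the cluster parameter group, set the role ARN, attach it to the cluster."""
    rds_client.create_db_cluster_parameter_group(
        DBClusterParameterGroupName=group_name,
        DBParameterGroupFamily=family,
        Description="Parameter group for loading data from S3",
    )
    rds_client.modify_db_cluster_parameter_group(
        DBClusterParameterGroupName=group_name,
        Parameters=build_role_parameters(role_arn),
    )
    rds_client.modify_db_cluster(
        DBClusterIdentifier=cluster_id,
        DBClusterParameterGroupName=group_name,
    )
```

The family argument must match your cluster's engine version (for example, a value such as "aurora-mysql5.7"), so check which family your cluster uses before running this.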

S3 to RDS Step 3: Reboot your RDS Instances 

Reboot your RDS Instances
Image Source: GeeksforGeeks

Reboot your RDS instances by selecting each instance and clicking “Actions” and then “Reboot”. This brings all the previous changes into effect.

S3 to RDS Step 4: Alter S3 Bucket Policy

Now, you need to open the Bucket Policy editor to update your S3 bucket permissions. Substitute the ARN of the IAM role created earlier and the ARN of the S3 bucket into the policy shown below:

{
   "Version": "2012-10-17",
   "Statement": [
   {
       "Effect": "Allow",
       "Principal": {
           "AWS": "arn:aws:iam::<account-id>:role/<rds-role>"
       },
       "Action": [
           "s3:*"
       ],
       "Resource": [
           "arn:aws:s3:::<s3-bucket>/*"
       ]
   }]
}
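
To avoid hand-editing the placeholders, you could generate the bucket policy from its three inputs. The sketch below is one way to do that, not the only one. The function names are illustrative, and s3_client is assumed to behave like boto3.client("s3"). Unlike the policy above, it also lists the bucket ARN itself as a resource, because bucket-level actions such as s3:ListBucket apply to the bucket rather than to the objects inside it.

```python
import json


def build_bucket_policy(account_id, role_name, bucket, actions=("s3:*",)):
    """Return a bucket policy dict that grants the given role access to the bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:role/{role_name}"},
                "Action": list(actions),
                "Resource": [
                    f"arn:aws:s3:::{bucket}",    # bucket-level actions (e.g. s3:ListBucket)
                    f"arn:aws:s3:::{bucket}/*",  # object-level actions (e.g. s3:GetObject)
                ],
            }
        ],
    }


def apply_bucket_policy(s3_client, account_id, role_name, bucket, actions=("s3:*",)):
    """Serialize the policy and attach it to the bucket."""
    policy = build_bucket_policy(account_id, role_name, bucket, actions)
    s3_client.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
    return policy
```

The actions argument defaults to the "s3:*" grant used in this step. If you want a narrower grant, read-only actions such as s3:GetObject and s3:ListBucket may be enough for loading data, but verify the exact list against the AWS documentation for your engine version.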

S3 to RDS Step 5: Establish a VPC Endpoint 

Finally, you need to set up a VPC endpoint in the second account. Go to the VPC page and select the Endpoints tab. Here, select the desired “Service Name”, for instance com.amazonaws.us-east-1.s3. Select the VPC, the subnets associated with the cluster, and the policy as per your requirements.

For a full access integration, the following can be used:

{
   "Statement": [
   {
       "Action": "*",
       "Effect": "Allow",
       "Resource": "*",
       "Principal": "*"
   }]
}

Custom access to only specific S3 buckets can be established as follows:

{
   "Statement": [
   {
       "Principal": "*",
       "Action": [
           "s3:*"
       ],
       "Effect": "Allow",
       "Resource": ["arn:aws:s3:::<s3-bucket>/*"]
   }]
}
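
The endpoint and its bucket-scoped policy can also be created programmatically. The following is a rough sketch under a few assumptions: ec2_client behaves like boto3.client("ec2"), the endpoint is a gateway-type S3 endpoint (which is associated with route tables rather than directly with subnets), and the VPC ID, route table IDs, and bucket names are placeholders of your own. As a deliberate addition to the policy above, each bucket's own ARN is listed next to its objects so that bucket-level actions are covered too.

```python
import json


def build_endpoint_policy(buckets):
    """Return an endpoint policy that allows S3 access only to the listed buckets."""
    resources = []
    for bucket in buckets:
        resources.append(f"arn:aws:s3:::{bucket}")    # the bucket itself
        resources.append(f"arn:aws:s3:::{bucket}/*")  # the objects in it
    return {
        "Statement": [
            {
                "Principal": "*",
                "Action": ["s3:*"],
                "Effect": "Allow",
                "Resource": resources,
            }
        ]
    }


def create_s3_endpoint(ec2_client, vpc_id, route_table_ids, region, buckets):
    """Create a gateway endpoint for S3 in the given VPC with a bucket-scoped policy."""
    return ec2_client.create_vpc_endpoint(
        VpcEndpointType="Gateway",
        VpcId=vpc_id,
        ServiceName=f"com.amazonaws.{region}.s3",
        RouteTableIds=list(route_table_ids),
        PolicyDocument=json.dumps(build_endpoint_policy(buckets)),
    )
```

Use the route tables that serve the subnets your cluster runs in, so that traffic from the cluster to S3 is routed through the endpoint.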

With all of this set up, you can now load files from the S3 bucket in your first account into RDS in the other account using this simple command:

LOAD DATA FROM S3 's3://sample_bucket/sample_file.csv'
INTO TABLE dbname.dbtable
FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
(field1, field2, ...);
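
If you load many files, you may want to generate this statement rather than type it by hand. The helper below is a small sketch for that. Its name and checks are illustrative and not part of any AWS library, and it assumes a comma-separated file with newline-terminated rows. It only builds the SQL text; you would run the result through whichever MySQL client you already use to connect to the cluster.

```python
import re

# Conservative pattern for unquoted identifiers (letters, digits, underscores).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check_identifier(name):
    """Reject names that could break out of the generated statement."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsafe SQL identifier: {name!r}")
    return name


def build_load_statement(bucket, key, database, table, columns):
    """Return a LOAD DATA FROM S3 statement for a comma-separated file."""
    for part in (bucket, key):
        if "'" in part or "\\" in part:
            raise ValueError(f"unsafe character in S3 location: {part!r}")
    column_list = ", ".join(_check_identifier(c) for c in columns)
    target = f"{_check_identifier(database)}.{_check_identifier(table)}"
    return (
        f"LOAD DATA FROM S3 's3://{bucket}/{key}'\n"
        f"INTO TABLE {target}\n"
        "FIELDS TERMINATED BY ','\n"
        "LINES TERMINATED BY '\\n'\n"
        f"({column_list});"
    )
```

For example, build_load_statement("sample_bucket", "sample_file.csv", "dbname", "dbtable", ["field1", "field2"]) produces a statement equivalent to the command above with two columns. The identifier check is intentionally strict, so names that contain other characters are rejected rather than quoted.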

Thus, a successful integration between Amazon S3 and Amazon RDS has been made. 

Understanding the Challenges of Migrating Data from Amazon S3 to RDS

The process of migrating data from Amazon S3 to Amazon RDS is straightforward to follow. However, because the migration is manual, users may face some typical challenges. Resiliency is one of them: networking issues can make it difficult to keep RDS instances available. Migration can also be challenging if the application state of the machine isn’t preserved.

Security gaps and risks are also a major limitation when migrating from an on-premise location to the cloud. Managing and storing this data manually calls for centralized monitoring and other operations that require manpower and a hefty budget. Some of these challenges can be tackled by integrating with an automation platform such as Hevo Data.

Conclusion 

Thus, the combined benefits of Amazon S3 and Amazon RDS can be realized with a swift integration between your S3 bucket and RDS instance in their respective accounts. For fewer bandwidth errors and a more automated system, Hevo can be a great option for complete management of data transfers, integrations, replication, and analytics.

Integrating and analyzing your data from a huge set of diverse sources can be challenging, and this is where Hevo comes into the picture. Hevo is a No-code Data Pipeline with 100+ pre-built integrations that you can choose from. Hevo can help you integrate your data from numerous sources and load it into a destination to analyze real-time data with a BI tool and create your Dashboards. It will make your life easier and make data migration hassle-free. It is user-friendly, reliable, and secure.


Want to take Hevo for a spin?

SIGN UP and experience the feature-rich Hevo suite first hand. You can also have a look at the unbeatable pricing that will help you choose the right plan for your business needs.
