Most cloud services support near real-time replication of data, a capability that many organizations look for. Amazon Web Services (AWS) is Amazon's cloud platform, and it provides users and businesses with end-to-end cloud-based solutions and APIs. One of its most popular services is the Simple Storage Service, better known as S3. It allows users to store, retrieve and replicate their data on demand across a diverse set of regions.
This article focuses on Cross Region Replication in S3 and provides a step-by-step guide to help you set up replication in S3, so that you can replicate your data across buckets located in different regions. After working through it, you will understand how data replication in S3 works and be able to set up Cross Region Replication yourself.
Introduction to Amazon S3
Amazon S3 (Simple Storage Service) is a highly scalable cloud-based object storage service provided by Amazon. It allows users to create online backups of their data from numerous data sources, with a single object being up to 5 TB in size and no fixed limit on the total amount of data stored. Data is stored in S3 buckets, and the S3 Standard storage class is designed for 99.999999999% data durability and 99.99% object availability.
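To get a feel for what these two percentages mean, the short sketch below turns them into everyday numbers. It assumes the durability figure can be read as an average annual per-object loss rate and the availability figure as a yearly uptime target; both are design targets rather than guarantees, and the object count of 10,000,000 is only an illustrative choice.

```python
from fractions import Fraction

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year

# The two figures quoted above, kept as exact fractions to avoid float rounding.
DURABILITY = Fraction("0.99999999999")  # 99.999999999%
AVAILABILITY = Fraction("0.9999")       # 99.99%


def years_per_lost_object(object_count: int, durability: Fraction) -> Fraction:
    """Average number of years between single-object losses (assumed reading)."""
    expected_losses_per_year = object_count * (1 - durability)
    return 1 / expected_losses_per_year


def downtime_minutes_per_year(availability: Fraction) -> Fraction:
    """Minutes per year during which requests may fail at this availability."""
    return MINUTES_PER_YEAR * (1 - availability)


print(years_per_lost_object(10_000_000, DURABILITY))   # 10000
print(float(downtime_minutes_per_year(AVAILABILITY)))  # 52.56
```

Under these assumptions, storing ten million objects would on average cost you one object every 10,000 years, while 99.99% availability leaves room for a little under an hour of unavailability per year.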
It stores data in the form of objects, each consisting of a file along with its metadata. It lets users select the storage class they want to use, such as S3 Standard, Infrequent Access or Glacier. Amazon S3 is easy to use and supports numerous programming languages such as Java, Python, Scala, etc., and lets users transfer data to S3 buckets by leveraging the S3 APIs and various ETL tools and connectors.
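As an illustration of choosing a storage class through the API, here is a minimal Python sketch using the boto3 SDK. The helper names, the bucket name and the object key are hypothetical; the mapping covers only the three classes mentioned above, and the upload itself assumes boto3 is installed and AWS credentials are configured.

```python
# Friendly names used in this article mapped to the values the S3 API expects.
STORAGE_CLASSES = {
    "S3 Standard": "STANDARD",
    "Infrequent Access": "STANDARD_IA",
    "Glacier": "GLACIER",
}


def put_object_kwargs(bucket: str, key: str, body: bytes, storage_class: str) -> dict:
    """Build the arguments for an upload into the chosen storage class."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "StorageClass": STORAGE_CLASSES[storage_class],
    }


def upload(bucket: str, key: str, body: bytes, storage_class: str) -> dict:
    """Upload one object; requires boto3 and valid AWS credentials."""
    import boto3  # third-party SDK, imported lazily so the helper above runs without it

    s3 = boto3.client("s3")
    return s3.put_object(**put_object_kwargs(bucket, key, body, storage_class))


# Hypothetical bucket and key, shown without contacting AWS.
print(put_object_kwargs("myrepbucket101", "reports/q1.csv", b"a,b\n1,2\n", "Infrequent Access"))
```

Calling upload("myrepbucket101", "reports/q1.csv", b"...", "Glacier") would store the same object in the Glacier class instead.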
For further information on Amazon S3, you can check the official AWS website.
Understanding Replication in S3
Data replication in S3 refers to the process of automatically copying data from an S3 bucket of your choice to another bucket, without affecting any other operation. With S3 replication in place, you can replicate data across buckets either in the same region or in a different region; the latter is known as Cross Region Replication. Amazon S3 also maintains object metadata, which lets users keep information such as the origin and modifications of the data and monitor any changes.
For further information on Amazon S3 replication, you can check the official AWS documentation.
Prerequisites
- Working knowledge of Amazon S3.
- An active AWS account with IAM permissions to manage S3 buckets, roles and policies.
- A general idea about data replication.
Steps to Set Up Cross Region Replication in S3
You can implement Cross Region Replication in S3 using the following steps:
Step 1: Creating Buckets in S3
To start replicating data from your desired S3 bucket, you first need to log into the AWS management console for S3. To do this, go to the AWS management console and enter your credentials, such as your username and password.
Once you’ve logged in, the S3 homepage will open up on your screen, where you need to click on the create a bucket option, found in the top right corner of your screen.
The “create a bucket” window will now open up on your screen, where you need to configure your new S3 bucket by providing details such as a unique name for your bucket and its region.
You will now be able to see the newly created S3 bucket in the bucket details section.
Cross Region Replication needs a source and a destination, so creating just one S3 bucket isn’t enough. You now need to set up a second bucket in a different region.
This is how you can create buckets in S3 to start setting up Cross Region Replication.
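The same two buckets can also be created programmatically. The sketch below uses Python with boto3 and is only an illustration: the bucket names are the ones that appear in the policies later in this article, while the two regions are assumptions picked for the example. It also handles one detail of the API that is easy to trip over, namely that us-east-1 must not be passed as a location constraint.

```python
def create_bucket_kwargs(name: str, region: str) -> dict:
    """Build the arguments for creating a bucket in the given region."""
    kwargs = {"Bucket": name}
    # us-east-1 is the API default and is rejected as an explicit constraint.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_bucket(name: str, region: str) -> dict:
    """Create the bucket; requires boto3 and valid AWS credentials."""
    import boto3  # third-party SDK, imported lazily

    s3 = boto3.client("s3", region_name=region)
    return s3.create_bucket(**create_bucket_kwargs(name, region))


# Assumed regions for the source and destination buckets.
BUCKETS = {"myrepbucket101": "us-east-1", "myrepbucket102": "eu-west-1"}

for bucket_name, bucket_region in BUCKETS.items():
    print(create_bucket_kwargs(bucket_name, bucket_region))
```

Running create_bucket(name, region) for each entry in BUCKETS would create both buckets in one go; bucket names are globally unique, so you will need names of your own.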
Step 2: Creating an IAM Role
With your S3 buckets ready, you now need to create an IAM role. To do this, click on the IAM option, found in the main menu.
The IAM page will now open up on your screen, where you need to click on the roles option from the panel on the left and then click on the create role option.
Select S3 as your desired service and then choose “S3: Allow S3 to call AWS services on your behalf” as your use case.
Once you’ve selected the right use case and service, you need to choose the role policy. To do this, use the search bar to search for “AmazonS3FullAccess” and select it.
With your IAM role configured, the “review” window will open up on your screen, where you’ll be able to find all necessary information about your role. To complete the setup, click on the create role option, present in the bottom right corner of your screen.
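Behind the console wizard, the "Allow S3 to call AWS services on your behalf" use case corresponds to a trust policy that lets the S3 service assume the role. The following Python sketch shows roughly what the equivalent boto3 calls could look like; the function name is hypothetical, and the role name used in the comment is taken from the policies shown later in this article.

```python
import json

# Trust policy that allows the S3 service to assume the role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "s3.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# AWS-managed policy selected in the console step above.
FULL_ACCESS_POLICY_ARN = "arn:aws:iam::aws:policy/AmazonS3FullAccess"


def create_replication_role(role_name: str) -> str:
    """Create the role, attach the policy and return the role ARN.

    Requires boto3 and valid AWS credentials.
    """
    import boto3  # third-party SDK, imported lazily

    iam = boto3.client("iam")
    role = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(TRUST_POLICY),
    )
    iam.attach_role_policy(RoleName=role_name, PolicyArn=FULL_ACCESS_POLICY_ARN)
    return role["Role"]["Arn"]


# For example: create_replication_role("cross_region_demo1")
print(json.dumps(TRUST_POLICY, indent=2))
```

AmazonS3FullAccess mirrors the console walkthrough; a narrower custom policy limited to the two buckets would also work for replication.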
Step 3: Configuring the Bucket Policy in S3
With your IAM role set up, you now need to define the bucket policy, which determines the actions a given user or role can perform on the bucket. To configure the bucket policy, select the desired S3 bucket and click on the permissions option.
Locate the bucket policy section in the permissions tab and then click on the edit option as follows:
The bucket policy page will now open up on your screen, where you need to click on the policy generator option. In case you want to learn more about the AWS policy generator, you can check out the official documentation.
Once you’ve clicked on the policy generator option, the AWS policy generator window will open up, where you need to choose the policy type. To do this, click on the policy drop-down list and select the “S3 Bucket Policy” option, and then move on to the add statement section.
The principal of the statement is your IAM user, whose ARN value you can find in the summary section as follows:
Once you’ve configured the user ARN, you need to set up the bucket ARN value. Select three actions, namely GetObject, PutObject and DeleteObject, enter the bucket ARN and then click on the add statement option. Because these three actions operate on objects, the ARN should end in /* (for example, arn:aws:s3:::myrepbucket101/*) so that it covers the objects inside the bucket.
The bucket policy statement will now appear on your screen as follows:
With your bucket statement ready, click on the generate policy button. The newly created policy will now appear on your screen as follows:
{
  "Id": "Policy1608168001400",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1608167858466",
      "Action": [
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::myrepbucket101/*",
      "Principal": {
        "AWS": [
          "arn:aws:iam::098671074698:user/demoCrossregion"
        ]
      }
    }
  ]
}
Copy the generated policy, paste it into the bucket policy editor of your bucket and save the changes.
You now need to repeat the same process for your second bucket:
{
  "Id": "Policy1608169434646",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1608169432435",
      "Action": [
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::myrepbucket102/*",
      "Principal": {
        "AWS": [
          "arn:aws:iam::098671074698:role/cross_region_demo1"
        ]
      }
    }
  ]
}
This is how you can configure the bucket policy in S3 for Cross Region Replication.
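If you would rather generate and attach such a policy from code than through the policy generator, a minimal Python sketch could look like this. The helper names are hypothetical, and the sketch assumes the resource should be the bucket ARN followed by /*, since GetObject, PutObject and DeleteObject act on objects rather than on the bucket itself.

```python
import json

# The three object-level actions granted in this step.
OBJECT_ACTIONS = ["s3:DeleteObject", "s3:GetObject", "s3:PutObject"]


def build_bucket_policy(bucket: str, principal_arn: str) -> dict:
    """Build a policy granting the object actions on every object in the bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": [principal_arn]},
                "Action": OBJECT_ACTIONS,
                # Object-level actions need an object ARN, hence the trailing /*.
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


def apply_bucket_policy(bucket: str, principal_arn: str) -> None:
    """Attach the policy to the bucket; requires boto3 and AWS credentials."""
    import boto3  # third-party SDK, imported lazily

    policy = build_bucket_policy(bucket, principal_arn)
    boto3.client("s3").put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))


print(json.dumps(
    build_bucket_policy(
        "myrepbucket101",
        "arn:aws:iam::098671074698:user/demoCrossregion",
    ),
    indent=2,
))
```

Calling apply_bucket_policy once per bucket, with the matching principal ARN, replaces the copy-and-paste step in the console.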
Step 4: Initializing Cross Region Replication in S3
Once you’ve created your S3 buckets and configured their policies, you can set up Cross Region Replication for your data in S3. To do this, you first have to create an IAM role. To set up the IAM role, go to the roles page and click on the create role option at the bottom of your screen:
Once you’ve clicked on it, select the use case for your IAM role as follows:
With your use case set up, select AmazonS3FullAccess as the role’s permission policy.
Once you’ve made all the necessary configurations, the IAM role review page will open up on your screen, where you need to provide a unique name for your IAM role and then click on create. With your new IAM role in place, the bucket policies for buckets 1 and 2 should read as follows:
Bucket 1 Policy:
{
  "Id": "Policy1608168001400",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1608167858466",
      "Action": [
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::myrepbucket101/*",
      "Principal": {
        "AWS": [
          "arn:aws:iam::098671074698:user/demoCrossregion"
        ]
      }
    }
  ]
}
Bucket 2 Policy:
{
  "Id": "Policy1608169434646",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1608169432435",
      "Action": [
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::myrepbucket102/*",
      "Principal": {
        "AWS": [
          "arn:aws:iam::098671074698:role/cross_region_demo1"
        ]
      }
    }
  ]
}
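To initialize Cross Region Replication, click on the management option in the bucket details section to add a replication rule, and enable bucket versioning for both buckets. Replication only works when versioning is enabled on both the source and the destination bucket.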
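Versioning can also be switched on from code. The short Python sketch below does so with boto3 for both buckets; the helper names are hypothetical and the bucket names are the ones used in this article.

```python
def versioning_kwargs(bucket: str) -> dict:
    """Build the arguments that switch versioning on for one bucket."""
    return {"Bucket": bucket, "VersioningConfiguration": {"Status": "Enabled"}}


def enable_versioning(buckets: list) -> None:
    """Enable versioning on every bucket; requires boto3 and AWS credentials."""
    import boto3  # third-party SDK, imported lazily

    s3 = boto3.client("s3")
    for bucket in buckets:
        s3.put_bucket_versioning(**versioning_kwargs(bucket))


# For example: enable_versioning(["myrepbucket101", "myrepbucket102"])
print(versioning_kwargs("myrepbucket101"))
```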
With bucket versioning enabled, provide the name of your destination bucket in the replication rule as follows:
Now, click on the IAM role drop-down list and select the IAM role you created.
Once you’ve selected the IAM role, click on the save option to bring the changes into effect. To replicate in both directions, perform the same operation for your second bucket.
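The rule saved in the console corresponds to a replication configuration on the source bucket. As a rough sketch, and not necessarily what the console generates, the Python code below builds such a configuration and applies it with boto3. The rule ID, the empty prefix (meaning all objects) and the decision not to replicate delete markers are assumptions made for the example; the role ARN is the one that appears in the bucket policy above.

```python
def replication_config(role_arn: str, destination_bucket: str) -> dict:
    """Build a single-rule configuration replicating all objects to the destination."""
    return {
        "Role": role_arn,
        "Rules": [
            {
                "ID": "replicate-everything",   # hypothetical rule name
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},       # empty prefix matches every key
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": f"arn:aws:s3:::{destination_bucket}"},
            }
        ],
    }


def enable_replication(source_bucket: str, destination_bucket: str, role_arn: str) -> None:
    """Attach the rule to the source bucket; requires boto3 and AWS credentials.

    Both buckets must already have versioning enabled.
    """
    import boto3  # third-party SDK, imported lazily

    boto3.client("s3").put_bucket_replication(
        Bucket=source_bucket,
        ReplicationConfiguration=replication_config(role_arn, destination_bucket),
    )


ROLE_ARN = "arn:aws:iam::098671074698:role/cross_region_demo1"
print(replication_config(ROLE_ARN, "myrepbucket102"))
```

Calling enable_replication a second time with the two bucket names swapped would set up replication in the opposite direction as well.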
You can now verify the success of the replication process by checking the replication status of an object in each bucket. Once it has been replicated, the object in the original bucket will have the status value “Completed” as follows:
Its copy in the replica bucket will have the status value “Replica” as follows:
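The same check can be scripted. The Python sketch below reads the replication status of a single object through boto3 and translates it into a short explanation; the helper names and the wording of the explanations are my own, and the status is simply absent for objects that no replication rule applies to.

```python
from typing import Optional

# Status values reported by S3, with an informal explanation of each.
STATUS_MEANING = {
    "PENDING": "replication has been scheduled but has not finished yet",
    "COMPLETED": "source object was replicated successfully",
    "FAILED": "replication of the source object failed",
    "REPLICA": "this object is a copy created by replication",
}


def describe_status(status: Optional[str]) -> str:
    """Translate a replication status into a human-readable explanation."""
    if status is None:
        return "object is not covered by a replication rule"
    return STATUS_MEANING.get(status.upper(), f"unrecognised status {status!r}")


def replication_status(bucket: str, key: str) -> Optional[str]:
    """Fetch the status of one object; requires boto3 and AWS credentials."""
    import boto3  # third-party SDK, imported lazily

    response = boto3.client("s3").head_object(Bucket=bucket, Key=key)
    return response.get("ReplicationStatus")


print(describe_status("Completed"))
print(describe_status("REPLICA"))
```

For example, describe_status(replication_status("myrepbucket101", "reports/q1.csv")) would report on a hypothetical object in the source bucket.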
This is how you can set up Cross Region Replication in S3.
Conclusion
This article teaches you how to set up Cross Region Replication in S3 and walks through each step involved. It provides a brief introduction to the related concepts and helps you understand them well enough to perform data replication and recovery efficiently. These methods, however, can be challenging, especially for a beginner, and this is where Hevo saves the day.
Hevo Data, a No-code Data Pipeline, can help you replicate data in real-time without writing any code. Being a fully-managed system, Hevo provides a highly secure automated solution to help perform replication in just a few clicks using its interactive UI.
Want to take Hevo for a spin? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand. You can also have a look at the unbeatable pricing that will help you choose the right plan for your business needs.
Why don’t you share your experience of setting up S3 Cross Region Replication in the comments? We would love to hear from you!
Muhammad Faraz is an AI/ML and MLOps expert with extensive experience in cloud platforms and new technologies. With a Master's degree in Data Science, he excels in data science, machine learning, DevOps, and tech management. As an AI/ML and tech project manager, he leads projects in machine learning and IoT, contributing extensively researched technical content to solve complex problems.