Most cloud platforms offer robust support for seamless, real-time data replication, a capability that many organizations look for. Amazon Web Services (AWS) is Amazon's cloud platform, providing users and businesses with end-to-end cloud-based solutions and APIs. One of its most popular services is the Simple Storage Service, better known as S3. It lets users store, retrieve, and replicate their data on demand across a diverse set of regions.

This article focuses on Cross Region Replication in S3 and provides a step-by-step guide to setting up replication between buckets located in different regions. By the end of the walkthrough, you will understand how data replication works in S3 and be able to set up Cross Region Replication yourself.

Introduction to Amazon S3

Cross Region Replication- Amazon S3 Logo.
Image Source: eescorporation.com

Amazon S3 (Simple Storage Service) is a highly scalable cloud-based storage service provided by Amazon. It allows users to create online backups of their data from numerous data sources, with individual objects of up to 5 TB in size. Amazon S3 provides object-based storage and lets users store data in S3 buckets, and it is designed for 99.999999999% data durability and 99.99% object availability.

It stores data as objects, each consisting of a file along with its metadata. It lets users choose the storage class they want to use, such as S3 Standard, Infrequent Access, or Glacier. Amazon S3 is easy to use, supports numerous programming languages such as Java, Python, and Scala, and lets users transfer data to S3 buckets through the S3 APIs as well as various ETL tools and connectors.

For further information on Amazon S3, you can check the official website here.

Understanding Replication in S3

Data replication in S3 refers to automatically copying data from an S3 bucket of your choice to another bucket, without affecting any other operation. With S3 replication in place, you can replicate data across buckets in the same region or in different regions; the latter is known as S3 Cross Region Replication. Amazon S3 also maintains metadata, allowing users to store information such as the origin and modifications of the data and to monitor any changes.

For further information on Amazon S3 replication, you can check the official documentation here.

Download the Ultimate Guide on Database Replication
Learn the 3 ways to replicate databases & which one you should prefer.
Simplify Data Replication using Hevo’s No-code Data Pipelines

Hevo Data, a No-code Data Pipeline, can help you replicate data from Amazon S3 (among 100+ sources) swiftly to a database/data warehouse of your choice. Hevo is fully-managed and completely automates the process of monitoring and replicating the changes on the secondary database rather than making the user write the code repeatedly. Its fault-tolerant architecture ensures that the data is handled in a secure, consistent manner with zero data loss.

Hevo provides you with a truly efficient and fully-automated solution to replicate and manage data in real-time and always have analysis-ready data in your desired destination. It allows you to focus on key business needs and perform insightful analysis using BI tools. 

Get Started with Hevo for Free

Have a look at the amazing features of Hevo:

  • Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
  • Minimal Learning: Hevo, with its simple and interactive UI, is extremely simple for new customers to work on and perform operations.
  • Live Monitoring: Hevo allows you to monitor the data flow so you can check where your data is at a particular point in time.
  • Data Transformation: It provides a simple interface to perfect, modify, and enrich the data you want to export. 
  • Schema Management: Hevo takes away the tedious task of schema management by automatically detecting the schema of incoming data and mapping it to the destination schema.
  • Completely Managed Platform: Hevo is fully managed. You need not invest time and effort to maintain or monitor the infrastructure involved in executing code.
Sign up here for a 14-Day Free Trial!

Prerequisites

  • Working knowledge of Amazon S3.
  • An active AWS account with Amazon S3 access and IAM permissions.
  • A general idea about data replication.

Steps to Set Up Cross Region Replication in S3

You can implement Cross Region Replication in S3 using the following steps:

Step 1: Creating Buckets in S3

To start replicating data from your desired S3 bucket, you first need to log into the AWS management console for S3. To do this, go to the official website of AWS S3’s management console and enter your credentials such as your username and password.

Cross Region Replication- AWS Login Page.
Image Source: geekylane.com

Once you’ve logged in, the S3 homepage will open up on your screen, where you need to click on the create a bucket option, found in the top right corner of your screen:

Cross Region Replication-Amazon S3 Homepage.
Image Source: geekylane.com

The “create a bucket” window will now open up on your screen, where you need to configure your new S3 bucket by providing details such as a unique name for your bucket and its region.

Cross Region Replication- Creating a Bucket in S3.
Image Source: geekylane.com

You will now be able to see the newly created S3 bucket in the bucket details section as follows:

Cross Region Replication- Newly Created Bucket in S3.
Image Source: geekylane.com

To set up Cross Region Replication, a single S3 bucket isn’t enough, so you now need to create a second bucket in a different region.

Cross Region Replication- Creating a bucket in a different region.
Image Source: geekylane.com

This is how you can create buckets in S3 to start setting up Cross Region Replication.
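If you would rather script the bucket creation than click through the console, the request parameters can be sketched roughly as below. This is only an illustration: the helper name `create_bucket_kwargs` is hypothetical, and the two regions are assumptions, since the walkthrough does not say which regions were used. The one detail worth noting is that `us-east-1` is the default location and is not passed as a location constraint.

```python
def create_bucket_kwargs(bucket_name: str, region: str) -> dict:
    """Build keyword arguments for an S3 CreateBucket request (hypothetical helper)."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default location, so no LocationConstraint is sent for it.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


# Bucket names come from the policies in this article; the regions are assumed.
source_request = create_bucket_kwargs("myrepbucket101", "us-east-1")
replica_request = create_bucket_kwargs("myrepbucket102", "eu-west-1")

# With boto3 installed and credentials configured, a request could be sent as:
#   boto3.client("s3", region_name="eu-west-1").create_bucket(**replica_request)
print(source_request)
print(replica_request)
```

Bucket names are globally unique, so you would need to substitute your own names before sending these requests.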

Step 2: Creating an IAM User

With your S3 buckets now ready, you now need to create an IAM user. To do this, click on the IAM option, found in the main menu.

Cross Region Replication- Selecting the IAM option from the main menu.
Image Source: geekylane.com

The IAM page will now open up on your screen, where you need to click on the roles option from the panel on the left and then click on the create role option.

Cross Region Replication- Selecting the Create Role button.
Image Source: geekylane.com

You now need to select S3 as your desired service and then choose “S3: Allow S3 to call AWS services on your behalf” as your use case.

Cross Region Replication- Selecting the S3 Use Case.
Image Source: geekylane.com

Once you’ve selected the right use case and service, you now need to choose the role policy. To do this, use the search bar to search for “AmazonS3FullAccess” and select it:

Cross Region Replication- Selecting the Role Policy in S3.
Image Source: geekylane.com

With your IAM role now ready and configured, the “review” window will now open up on your screen, where you’ll be able to find all necessary information about your role. To complete the setup, click on the create role option, present in the bottom right corner of your screen.

Cross Region Replication- Reviewing the IAM Role.
Image Source: geekylane.com
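Choosing S3 as the trusted service in the console amounts to attaching a trust policy that lets S3 assume the role. A minimal sketch of that document is shown below, assuming the standard `s3.amazonaws.com` service principal; the function name `replication_trust_policy` is hypothetical, and the role name in the comment is taken from the ARN used later in this article.

```python
import json


def replication_trust_policy() -> dict:
    """Trust policy that allows the S3 service to assume the role (sketch)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "s3.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


trust_policy_json = json.dumps(replication_trust_policy(), indent=2)
print(trust_policy_json)

# With boto3 installed and credentials configured, the role could be created as:
#   boto3.client("iam").create_role(
#       RoleName="cross_region_demo1",
#       AssumeRolePolicyDocument=trust_policy_json,
#   )
```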

Step 3: Configuring the Bucket Policy in S3

With your IAM role now set up, you need to define the bucket policy, which determines the actions a user can perform on the bucket. To configure the bucket policy, select the desired S3 bucket and click on the permissions option.

Cross Region Replication- Selecting the permissions option.
Image Source: geekylane.com

Locate the bucket policy section in the permissions tab and then click on the edit option as follows:

Cross Region Replication- Editing the Bucket Policy.
Image Source: geekylane.com

The bucket policy page will now open up on your screen, where you need to click on the policy generator option. In case you want to learn more about the AWS policy generator, you can click here to check out the official documentation.

Cross Region Replication- Selecting the Policy generator.
Image Source: geekylane.com

Once you’ve clicked on the policy generator option, the AWS policy generator window will open up, where you need to choose the policy type. To do this, click on the policy type drop-down list, select the “S3 Bucket Policy” option, and then click on the add statement option.

Cross Region Replication- Selecting the Policy type in AWS Policy Generator.
Image Source: geekylane.com

You will now be able to find the IAM user ARN value in the summary section as follows:

Cross Region Replication-User ARN.
Image Source: geekylane.com

Once you’ve configured the user ARN, you now need to set up the bucket ARN value. To configure the bucket policy and ARN, add three actions, namely GetObject, PutObject, and DeleteObject, and then click on the add statement option.

Cross Region Replication- Selecting Bucket Policy.
Image Source: geekylane.com

The bucket policy statement will now appear on your screen as follows:

Cross Region Replication- Generated Bucket Policy.
Image Source: geekylane.com

With your bucket statement now ready, click on the generate policy button. The newly created policy will now appear on your screen as follows:

Cross Region Replication- Generate Policy Button.
Image Source: geekylane.com
{
  "Id": "Policy1608168001400",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1608167858466",
      "Action": [
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::myrepbucket101/*",
      "Principal": {
        "AWS": [
          "arn:aws:iam::098671074698:user/demoCrossregion"
        ]
      }
    }
  ]
}

Copy the bucket policy and add it to your bucket policy list as follows:

Cross Region Replication- Adding Generated Statement to Policy list.
Image Source: geekylane.com

You now need to repeat the same process for your second bucket.

{
  "Id": "Policy1608169434646",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1608169432435",
      "Action": [
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::myrepbucket102/*",
      "Principal": {
        "AWS": [
          "arn:aws:iam::098671074698:role/cross_region_demo1"
        ]
      }
    }
  ]
}

This is how you can set up the bucket policy in S3 to set up Cross Region Replication.
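If you prefer to build these policies in code rather than through the policy generator, a rough sketch could look like the following. The helper name `bucket_policy` is hypothetical; the bucket names and principal ARNs are the ones used in this article. The sketch assumes the resource is written as an object ARN ending in `/*`, because `GetObject`, `PutObject`, and `DeleteObject` act on objects rather than on the bucket itself.

```python
import json


def bucket_policy(bucket_name: str, principal_arn: str) -> dict:
    """Bucket policy granting object read, write, and delete to one principal (sketch)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": [principal_arn]},
                "Action": ["s3:DeleteObject", "s3:GetObject", "s3:PutObject"],
                # Object-level actions apply to objects, hence the trailing /*.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


first_policy = bucket_policy(
    "myrepbucket101", "arn:aws:iam::098671074698:user/demoCrossregion"
)
second_policy = bucket_policy(
    "myrepbucket102", "arn:aws:iam::098671074698:role/cross_region_demo1"
)
print(json.dumps(first_policy, indent=2))

# With boto3 installed and credentials configured, a policy could be attached as:
#   boto3.client("s3").put_bucket_policy(
#       Bucket="myrepbucket101", Policy=json.dumps(first_policy)
#   )
```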

Step 4: Initializing Cross Region Replication in S3

Once you’ve created your S3 buckets and configured their policies, you can set up Cross Region Replication for your data in S3. To do this, you’ll first have to create an IAM role for the user. To set up the IAM role, go to the roles page and click on the create role option at the bottom of your screen:

Cross Region Replication- Creating the IAM role.
Image Source: geekylane.com

Once you’ve clicked on it, you now need to select the use case for your IAM role as follows:

Cross Region Replication- Selecting use case for your IAM role.
Image Source: geekylane.com

With your use case now set up, select the role policy permission as AmazonS3FullAccess.

Cross Region Replication- Selecting the Policy Permissions.
Image Source: geekylane.com
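AmazonS3FullAccess works for a demo, but it grants far more than replication needs. As an alternative, the sketch below builds a narrower permissions policy limited to the replication-related actions on the two buckets. Treat it as an assumption-laden illustration rather than the setup used in this walkthrough: the helper name `replication_permissions` and the policy name in the comment are hypothetical.

```python
import json


def replication_permissions(source_bucket: str, destination_bucket: str) -> dict:
    """Narrower role permissions for replicating source -> destination (sketch)."""
    source_arn = f"arn:aws:s3:::{source_bucket}"
    destination_arn = f"arn:aws:s3:::{destination_bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read the replication configuration and list the source bucket.
                "Effect": "Allow",
                "Action": ["s3:GetReplicationConfiguration", "s3:ListBucket"],
                "Resource": source_arn,
            },
            {
                # Read object versions (with ACLs and tags) from the source.
                "Effect": "Allow",
                "Action": [
                    "s3:GetObjectVersionForReplication",
                    "s3:GetObjectVersionAcl",
                    "s3:GetObjectVersionTagging",
                ],
                "Resource": f"{source_arn}/*",
            },
            {
                # Write replicas, delete markers, and tags to the destination.
                "Effect": "Allow",
                "Action": [
                    "s3:ReplicateObject",
                    "s3:ReplicateDelete",
                    "s3:ReplicateTags",
                ],
                "Resource": f"{destination_arn}/*",
            },
        ],
    }


permissions = replication_permissions("myrepbucket101", "myrepbucket102")
print(json.dumps(permissions, indent=2))

# With boto3 installed and credentials configured, this could be attached as:
#   boto3.client("iam").put_role_policy(
#       RoleName="cross_region_demo1",
#       PolicyName="replication-permissions",  # hypothetical name
#       PolicyDocument=json.dumps(permissions),
#   )
```

Because the walkthrough replicates in both directions, you would generate a second policy with the bucket names swapped.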

Once you’ve made all the necessary configurations, the IAM role review page will open up on your screen, where you need to provide a unique name for your IAM role and then click on create. With your new IAM role in place, the bucket policies for bucket 1 and bucket 2 will be modified as follows:

Bucket 1 Policy:

{
  "Id": "Policy1608168001400",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1608167858466",
      "Action": [
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::myrepbucket101/*",
      "Principal": {
        "AWS": [
          "arn:aws:iam::098671074698:user/demoCrossregion"
        ]
      }
    }
  ]
}

Bucket 2 Policy:

{
  "Id": "Policy1608169434646",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1608169432435",
      "Action": [
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::myrepbucket102/*",
      "Principal": {
        "AWS": [
          "arn:aws:iam::098671074698:role/cross_region_demo1"
        ]
      }
    }
  ]
}

To initialize Cross Region Replication, click on the management option in the bucket details section and enable bucket versioning for both buckets. Replication requires versioning to be enabled on both the source and the destination bucket.

Cross Region Replication- Enabling Bucket Versioning.
Image Source: geekylane.com
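Enabling versioning can also be scripted. The sketch below only builds the request parameters for both buckets; the helper name `versioning_request` is hypothetical, and the commented call assumes boto3 is installed and credentials are configured.

```python
def versioning_request(bucket_name: str) -> dict:
    """Build the parameters that turn on versioning for one bucket (sketch)."""
    return {
        "Bucket": bucket_name,
        "VersioningConfiguration": {"Status": "Enabled"},
    }


# Versioning has to be enabled on the source and on the destination bucket.
versioning_requests = [
    versioning_request(name) for name in ("myrepbucket101", "myrepbucket102")
]

# With boto3, each request could be sent as:
#   boto3.client("s3").put_bucket_versioning(**request)
for request in versioning_requests:
    print(request)
```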

With bucket versioning now enabled, you now need to provide the name of your destination bucket as follows:

Cross Region Replication- Adding the name of the destination bucket.
Image Source: geekylane.com

Now, click on the IAM role drop-down list and select the IAM role you created.

Cross Region Replication- Selecting the IAM Role.
Image Source: geekylane.com

Once you’ve selected the IAM role, click on the save option to bring the changes into effect. You now need to perform the same operation for your second bucket.

Cross Region Replication- Cross Region Replication Enabled.
Image Source: geekylane.com
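The console settings above correspond to a replication configuration attached to the source bucket. A minimal sketch of that configuration is shown below, assuming a single rule that replicates every object and does not replicate delete markers. The helper name `replication_request` and the rule IDs are hypothetical; the bucket names and role ARN come from this article.

```python
def replication_request(
    source_bucket: str,
    destination_bucket: str,
    role_arn: str,
    rule_id: str,
) -> dict:
    """Build parameters for a one-rule replication configuration (sketch)."""
    return {
        "Bucket": source_bucket,
        "ReplicationConfiguration": {
            "Role": role_arn,
            "Rules": [
                {
                    "ID": rule_id,
                    "Priority": 1,
                    "Status": "Enabled",
                    # An empty prefix matches every object in the bucket.
                    "Filter": {"Prefix": ""},
                    "DeleteMarkerReplication": {"Status": "Disabled"},
                    "Destination": {
                        "Bucket": f"arn:aws:s3:::{destination_bucket}"
                    },
                }
            ],
        },
    }


ROLE_ARN = "arn:aws:iam::098671074698:role/cross_region_demo1"

# One rule per direction, mirroring the "same operation for your second bucket" step.
forward = replication_request("myrepbucket101", "myrepbucket102", ROLE_ARN, "to-102")
backward = replication_request("myrepbucket102", "myrepbucket101", ROLE_ARN, "to-101")

# With boto3 installed and credentials configured, each could be applied as:
#   boto3.client("s3").put_bucket_replication(**forward)
print(forward)
print(backward)
```

Note that a replication rule applies to objects written after it is enabled; objects already in the bucket are not copied by this configuration alone.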

You can now verify the success of the replication process by checking the replication status of an object in each bucket. The object in the original bucket will show the status value “Completed” as follows:

Cross Region Replication- Original Bucket Status.
Image Source: geekylane.com

The object in the replica bucket will show the status value “Replica” as follows:

Cross Region Replication- Replica Bucket Status.
Image Source: geekylane.com
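The same status can be read programmatically from an object's metadata. The sketch below interprets the `ReplicationStatus` field of a HeadObject-style response; the helper name `describe_replication_status` and the sample dictionaries are hypothetical, and the status strings are the ones S3 is expected to report, so treat the mapping as an assumption to check against your own responses.

```python
STATUS_MEANINGS = {
    "PENDING": "replication to the destination has not finished yet",
    "COMPLETED": "the source object was replicated successfully",
    # Some SDK listings also show COMPLETE; it is treated the same way here.
    "COMPLETE": "the source object was replicated successfully",
    "FAILED": "replication of this object failed",
    "REPLICA": "this object is a replica created by replication",
}


def describe_replication_status(head_object_response: dict) -> str:
    """Explain the ReplicationStatus field of a HeadObject-style response (sketch)."""
    status = head_object_response.get("ReplicationStatus")
    if status is None:
        return "no replication status reported for this object"
    return STATUS_MEANINGS.get(status, f"unrecognized status: {status}")


# Stand-ins for real responses; with boto3 a response could come from:
#   boto3.client("s3").head_object(Bucket="myrepbucket101", Key="some-key")
print(describe_replication_status({"ReplicationStatus": "COMPLETED"}))
print(describe_replication_status({"ReplicationStatus": "REPLICA"}))
print(describe_replication_status({}))
```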

This is how you can set up Cross Region Replication in S3.

Conclusion

This article walks you through setting up Cross Region Replication in S3. It briefly introduces the related concepts so that you can understand them better and use them to perform data replication and recovery efficiently. These methods, however, can be challenging, especially for a beginner, and this is where Hevo saves the day.

Visit our Website to Explore Hevo

Hevo Data, a No-code Data Pipeline, can help you replicate data in real-time without writing any code. As a fully-managed system, Hevo provides a highly secure, automated solution that lets you perform replication in just a few clicks using its interactive UI.

Want to take Hevo for a spin? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand. You can also have a look at the unbeatable pricing that will help you choose the right plan for your business needs.

Why don’t you share your experience of setting up S3 Cross Region Replication in the comments? We would love to hear from you!

Muhammad Faraz
Freelance Technical Content Writer, Hevo Data

In his role as a freelance writer, Muhammad loves to use his analytical mindset and a problem-solving ability to help businesses solve problems by offering extensively researched content.
