Setting up S3 Cross Region Replication: 4 Easy Steps

on Data Integration, ETL, Tutorials • January 1st, 2021

Most cloud services offer robust support for seamless, real-time replication of data, a capability that most organisations look for. Amazon Web Services (AWS) is one such cloud platform by Amazon, providing users and businesses with end-to-end cloud-based solutions & APIs. One of its most popular services is the Simple Storage Service, popularly known as S3. It allows users to store, retrieve and replicate their data on demand across a diverse set of regions.

This article focuses on Cross Region Replication in S3 & provides a step-by-step guide to help you set up replication in S3 and replicate your data across buckets located in different regions. By the end of this walkthrough, you will understand how data replication works in S3 and be able to set up Cross Region Replication with ease.

Introduction to Amazon S3

Amazon S3 Logo.

Amazon S3 (Simple Storage Service) is a highly scalable cloud-based storage service provided by Amazon. It allows users to create online backups of their data from numerous data sources, with individual objects of up to 5 TB in size. Amazon S3 provides users with object-based data storage functionality and lets them store data in S3 buckets, and it is designed for 99.999999999% data durability and 99.99% object availability.
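To get a feel for what these percentages mean in practice, here is a small back-of-the-envelope sketch in Python. It assumes that both figures are annual design targets, and the object count of 10 million is just an illustrative number.

```python
from fractions import Fraction

# Figures quoted in this article; treating them as annual design targets
# is an assumption made for this illustration.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)  # 99.999999999%
AVAILABILITY = Fraction(9_999, 10_000)                   # 99.99%


def expected_objects_lost_per_year(object_count: int) -> Fraction:
    """Expected number of objects lost in one year for a given object count."""
    return object_count * (1 - DURABILITY)


def expected_downtime_minutes_per_year() -> Fraction:
    """Minutes per 365-day year in which an object may be unavailable."""
    minutes_per_year = 365 * 24 * 60
    return minutes_per_year * (1 - AVAILABILITY)


# Hypothetical workload: 10 million stored objects.
losses = expected_objects_lost_per_year(10_000_000)
print(f"Expected objects lost per year: {float(losses)}")
print(f"That is roughly one object every {1 / losses} years")
print(f"Expected unavailable minutes per year: {float(expected_downtime_minutes_per_year())}")
```

Exact fractions are used instead of floats so that the tiny loss probability does not get distorted by rounding.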

It stores data in the form of objects, with each object consisting of a file along with its metadata. It lets users select the kind of storage class they want to use, with options such as S3 Standard, Infrequent Access and Glacier. Amazon S3 houses an easy-to-use platform and provides support for numerous programming languages such as Java, Python, Scala, etc., and lets users transfer data to S3 buckets by leveraging the S3 APIs and various other ETL tools, connectors, etc.

For further information on Amazon S3, you can check the official website here.

Understanding Replication in S3

Data replication in S3 refers to the process of automatically copying data from an S3 bucket of your choice to another bucket, without affecting any other operation. With S3 replication in place, you can replicate data across buckets in the same region, known as Same Region Replication, or across buckets in different regions, known as Cross Region Replication. Amazon S3 further maintains metadata and allows users to store information such as origin, modifications, etc. of the data source and monitor any changes.

For further information on Amazon S3 replication, you can check the official documentation here.

Simplify Data Replication using Hevo’s No-code Data Pipelines

Hevo Data, a No-code Data Pipeline, can help you replicate data from Amazon S3 (among 100+ sources) swiftly to a database/data warehouse of your choice. Hevo is fully-managed and completely automates the process of monitoring and replicating the changes on the secondary database rather than making the user write the code repeatedly. Its fault-tolerant architecture ensures that the data is handled in a secure, consistent manner with zero data loss.

Hevo provides you with a truly efficient and fully-automated solution to replicate and manage data in real-time and always have analysis-ready data in your desired destination. It allows you to focus on key business needs and perform insightful analysis using BI tools. 

Have a look at the amazing features of Hevo:

  • Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
  • Minimal Learning: Hevo, with its simple and interactive UI, is extremely simple for new customers to work on and perform operations.
  • Live Monitoring: Hevo allows you to monitor the data flow so you can check where your data is at a particular point in time.
  • Data Transformation: It provides a simple interface to perfect, modify, and enrich the data you want to export. 
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects schema of incoming data and maps it to the destination schema.
  • Completely Managed Platform: Hevo is fully managed. You need not invest time and effort to maintain or monitor the infrastructure involved in executing codes.

Get started with Hevo today! Sign up here for a 14-day free trial!

Prerequisites

  • Working knowledge of Amazon S3.
  • An active AWS account with IAM permissions.
  • A general idea about data replication.

Steps to Set Up Cross Region Replication in S3

You can implement Cross Region Replication in S3 using the following steps:

Step 1: Creating Buckets in S3

To start replicating data from your desired S3 bucket, you first need to log into the AWS management console for S3. To do this, go to the official website of AWS S3’s management console and enter your credentials such as your username and password.

AWS Login Page.

Once you’ve logged in, the S3 homepage will open up on your screen, where you need to click on the create bucket option, found in the top right corner of your screen:

Amazon S3 Homepage.

The “create a bucket” window will now open up on your screen, where you need to configure your new S3 bucket by providing details such as a unique name for your bucket and its region.

Creating a Bucket in S3.

You will now be able to see the newly created S3 bucket in the bucket details section as follows:

Newly Created Bucket in S3.

To set up Cross Region Replication successfully, creating just one S3 bucket isn’t enough and hence, you now need to set up another bucket in a different region.

Creating a bucket in a different region.

This is how you can create buckets in S3 to start setting up Cross Region Replication.
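If you would rather script this step than click through the console, the following Python sketch builds the request parameters for both buckets. The bucket names come from the policies used later in this article, while the two regions are hypothetical picks; the commented-out call assumes the boto3 library is installed and your credentials are configured.

```python
from typing import Any, Dict


def create_bucket_params(bucket_name: str, region: str) -> Dict[str, Any]:
    """Build keyword arguments for an S3 CreateBucket request.

    us-east-1 is the default region and must not be sent as a
    LocationConstraint; every other region is passed explicitly.
    """
    params: Dict[str, Any] = {"Bucket": bucket_name}
    if region != "us-east-1":
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return params


# Regions below are assumptions; they only need to differ from each other.
source_params = create_bucket_params("myrepbucket101", "us-east-1")
replica_params = create_bucket_params("myrepbucket102", "eu-west-1")

# With boto3 available, the parameters could be used like this:
#   boto3.client("s3", region_name="eu-west-1").create_bucket(**replica_params)
print(source_params)
print(replica_params)
```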

Step 2: Creating an IAM Role

With your S3 buckets now ready, you need to create an IAM role. To do this, click on the IAM option, found in the main menu.

Selecting the IAM option from the main menu.

The IAM page will now open up on your screen, where you need to click on the roles option from the panel on the left and then click on the create role option.

Selecting the Create Role button.

You now need to select S3 as your desired service and then choose “S3: Allow S3 to call AWS services on your behalf” as your use case.

Selecting the S3 Use Case.

Once you’ve selected the right use case and service, you now need to choose the role policy. To do this, use the search bar and search for “AmazonS3FullAccess” and select it:

Selecting the Role Policy in S3.

With your IAM role now ready and configured, the “review” window will now open up on your screen, where you’ll be able to find all necessary information about your role. To complete the setup, click on the create role option, present in the bottom right corner of your screen.

Reviewing the IAM Role.
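For reference, the role created above boils down to two JSON documents: a trust policy that lets S3 assume the role, and a permissions policy. The Python sketch below builds both. Note that the permissions policy shown here is a narrower alternative to AmazonS3FullAccess, scoped to the two buckets; it is an assumption about what you may want in practice, not what the console steps above attach.

```python
import json
from typing import Any, Dict


def replication_trust_policy() -> Dict[str, Any]:
    """Trust policy that lets the S3 service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "s3.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def replication_permissions_policy(source_bucket: str, destination_bucket: str) -> Dict[str, Any]:
    """Narrower alternative to AmazonS3FullAccess, scoped to the two buckets."""
    source_arn = f"arn:aws:s3:::{source_bucket}"
    destination_arn = f"arn:aws:s3:::{destination_bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Read the replication configuration and list the source bucket.
                "Effect": "Allow",
                "Action": ["s3:GetReplicationConfiguration", "s3:ListBucket"],
                "Resource": source_arn,
            },
            {   # Read the object versions that are to be replicated.
                "Effect": "Allow",
                "Action": [
                    "s3:GetObjectVersionForReplication",
                    "s3:GetObjectVersionAcl",
                    "s3:GetObjectVersionTagging",
                ],
                "Resource": f"{source_arn}/*",
            },
            {   # Write replicas into the destination bucket.
                "Effect": "Allow",
                "Action": ["s3:ReplicateObject", "s3:ReplicateDelete", "s3:ReplicateTags"],
                "Resource": f"{destination_arn}/*",
            },
        ],
    }


trust = replication_trust_policy()
permissions = replication_permissions_policy("myrepbucket101", "myrepbucket102")
print(json.dumps(trust, indent=2))
print(json.dumps(permissions, indent=2))
```

Since this article replicates in both directions, you would need a second set of statements with the two bucket names swapped.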

Step 3: Configuring the Bucket Policy in S3

With your IAM role now set up, you now need to define the bucket policy that will help outline and decide the actions a user can perform. To configure the bucket policy, select the desired S3 bucket and click on the permissions option.

Selecting the permissions option.

Locate the bucket policy section in the permissions tab and then click on the edit option as follows:

Editing the Bucket Policy.

The bucket policy page will now open up on your screen, where you need to click on the policy generator option. In case you want to learn more about the AWS policy generator, you can click here to check out the official documentation.

Selecting the Policy generator.

Once you’ve clicked on the policy generator option, the AWS policy generator window will now open up, where you need to choose the bucket policy. To do this, click on the policy drop-down list & select the “S3 Bucket Policy” option, and then click on the add statement option.

Selecting the Policy type in AWS Policy Generator.

You will now be able to find the IAM user ARN value in the summary section as follows:

User ARN.

Once you’ve configured the user ARN, you now need to set up the bucket ARN value. To configure the bucket policy and ARN, add three actions, namely GetObject, PutObject & DeleteObject, and then click on the add statement option.

Selecting Bucket Policy.

The bucket policy statement will now appear on your screen as follows:

Generated Bucket Policy.

With your bucket statement now ready, click on the generate policy button. The newly created policy will now appear on your screen as follows:

Generate Policy Button.

{
  "Id": "Policy1608168001400",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1608167858466",
      "Action": [
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::myrepbucket101/*",
      "Principal": {
        "AWS": [
          "arn:aws:iam::098671074698:user/demoCrossregion"
        ]
      }
    }
  ]
}

Copy the bucket policy and add it to your bucket policy list as follows:

Adding Generated Statement to Policy list.

You now need to repeat the same process for your second bucket.

{
  "Id": "Policy1608169434646",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1608169432435",
      "Action": [
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::myrepbucket102/*",
      "Principal": {
        "AWS": [
          "arn:aws:iam::098671074698:role/cross_region_demo1"
        ]
      }
    }
  ]
}

This is how you can set up the bucket policy in S3 to set up Cross Region Replication.
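Instead of running the policy generator once per bucket, you could produce the same kind of document with a few lines of Python. The sketch below is one way to do it; the function name is hypothetical, and the bucket names and principal ARNs are the ones used in this article. Because GetObject, PutObject and DeleteObject are object-level actions, the sketch points the Resource at the objects in the bucket (the /* suffix) rather than at the bucket itself.

```python
import json
from typing import Any, Dict


def object_access_bucket_policy(bucket_name: str, principal_arn: str) -> Dict[str, Any]:
    """Bucket policy granting one principal get/put/delete on a bucket's objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": [principal_arn]},
                "Action": ["s3:DeleteObject", "s3:GetObject", "s3:PutObject"],
                # Object-level actions need an object ARN, hence the "/*".
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


first_policy = object_access_bucket_policy(
    "myrepbucket101", "arn:aws:iam::098671074698:user/demoCrossregion"
)
second_policy = object_access_bucket_policy(
    "myrepbucket102", "arn:aws:iam::098671074698:role/cross_region_demo1"
)

# The JSON text is what you would paste into the bucket policy editor.
print(json.dumps(first_policy, indent=2))
print(json.dumps(second_policy, indent=2))
```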

Step 4: Initializing Cross Region Replication in S3

Once you’ve created your S3 buckets and have configured their policies, you can now perform a Cross Region Replication for your data in S3. To do this, you’ll first have to create an IAM role for the user. To set up the IAM role, go to the roles page and click on the create role option at the bottom of your screen:

Creating the IAM role.

Once you’ve clicked on it, you now need to select the use case for your IAM role as follows:

Selecting use case for your IAM role.

With your use case now set up, select the role policy permission as AmazonS3FullAccess.

Selecting the Policy Permissions.

Once you’ve made all the necessary configurations, the IAM role review page will open up on your screen, where you need to provide a unique name for your IAM role and then click on create. With your new IAM role in place, the bucket policies for both bucket 1 & 2 will get modified as follows:

Bucket 1 Policy:

{
  "Id": "Policy1608168001400",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1608167858466",
      "Action": [
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::myrepbucket101/*",
      "Principal": {
        "AWS": [
          "arn:aws:iam::098671074698:user/demoCrossregion"
        ]
      }
    }
  ]
}

{
  "Id": "Policy1608169434646",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1608169432435",
      "Action": [
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::myrepbucket102/*",
      "Principal": {
        "AWS": [
          "arn:aws:iam::098671074698:role/cross_region_demo1"
        ]
      }
    }
  ]
}

Bucket 2 Policy:

{
  "Id": "Policy1608168001400",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1608167858466",
      "Action": [
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::myrepbucket101/*",
      "Principal": {
        "AWS": [
          "arn:aws:iam::098671074698:user/demoCrossregion"
        ]
      }
    }
  ]
}

{
  "Id": "Policy1608169434646",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1608169432435",
      "Action": [
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::myrepbucket102/*",
      "Principal": {
        "AWS": [
          "arn:aws:iam::098671074698:role/cross_region_demo1"
        ]
      }
    }
  ]
}

To initialise Cross Region Replication, click on the management option in the bucket details section and enable bucket versioning for both buckets. Replication requires versioning to be enabled on both the source and the destination bucket.

Enabling Bucket Versioning.
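The same versioning switch can be expressed as a request payload. The Python sketch below builds that payload and adds a small helper for reading back the result; the helper name is hypothetical, and the commented-out calls assume boto3 is installed.

```python
from typing import Any, Dict, Mapping


def enable_versioning_params(bucket_name: str) -> Dict[str, Any]:
    """Keyword arguments for an S3 PutBucketVersioning request."""
    return {
        "Bucket": bucket_name,
        "VersioningConfiguration": {"Status": "Enabled"},
    }


def is_versioning_enabled(get_versioning_response: Mapping[str, Any]) -> bool:
    """Interpret a GetBucketVersioning response.

    The "Status" key is absent when versioning was never turned on,
    and "Suspended" when it was turned on and later paused.
    """
    return get_versioning_response.get("Status") == "Enabled"


# Both buckets from this article need versioning before replication works.
versioning_requests = [
    enable_versioning_params(name) for name in ("myrepbucket101", "myrepbucket102")
]

# With boto3 available:
#   s3 = boto3.client("s3")
#   for request in versioning_requests:
#       s3.put_bucket_versioning(**request)
#   is_versioning_enabled(s3.get_bucket_versioning(Bucket="myrepbucket101"))
print(versioning_requests)
```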

With bucket versioning now enabled, you now need to provide the name of your destination bucket as follows:

Adding the name of the destination bucket.

Now, click on the IAM role drop-down list and select the IAM role you created.

Selecting the IAM Role.

Once you’ve selected the IAM role, click on the save option to bring the changes into effect. You now need to perform the same operation for your second bucket.

Cross Region Replication Enabled.
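Behind the console form, a replication rule is a small configuration document attached to the source bucket. The Python sketch below builds one rule per direction, mirroring the two-way setup described above. The rule ID, the empty prefix filter (meaning all objects) and the disabled delete-marker replication are assumptions chosen for illustration, and the commented-out call assumes boto3 is installed.

```python
from typing import Any, Dict


def replication_params(
    source_bucket: str,
    destination_bucket: str,
    role_arn: str,
    rule_id: str = "replicate-all-objects",  # hypothetical rule name
) -> Dict[str, Any]:
    """Keyword arguments for an S3 PutBucketReplication request."""
    return {
        "Bucket": source_bucket,
        "ReplicationConfiguration": {
            "Role": role_arn,  # the IAM role that S3 assumes to copy objects
            "Rules": [
                {
                    "ID": rule_id,
                    "Priority": 1,
                    "Status": "Enabled",
                    "Filter": {"Prefix": ""},  # empty prefix matches every object
                    "DeleteMarkerReplication": {"Status": "Disabled"},
                    "Destination": {"Bucket": f"arn:aws:s3:::{destination_bucket}"},
                }
            ],
        },
    }


ROLE_ARN = "arn:aws:iam::098671074698:role/cross_region_demo1"

# One rule in each direction, as in the walkthrough.
forward = replication_params("myrepbucket101", "myrepbucket102", ROLE_ARN)
reverse = replication_params("myrepbucket102", "myrepbucket101", ROLE_ARN)

# With boto3 available:
#   boto3.client("s3").put_bucket_replication(**forward)
print(forward)
print(reverse)
```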

You can now verify the success of the replication process by checking the replication status of an object in each bucket. An object in the original bucket will have the status value “Completed” as follows:

Original Bucket Status.

The copy of the object in the replica bucket will have the status value “Replica” as follows:

Replica Bucket Status.
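If you want to run this check from a script, the status shows up in the object's metadata. The Python sketch below interprets the ReplicationStatus field of a HeadObject response. The function name and the wording of the descriptions are made up for this example, and the sample dictionaries stand in for real responses.

```python
from typing import Any, Mapping

# Status values reported for objects in buckets that use replication.
_STATUS_MEANINGS = {
    "PENDING": "source object, replication has not finished yet",
    "COMPLETED": "source object, replicated successfully",
    "FAILED": "source object, replication failed",
    "REPLICA": "this object is a copy created by replication",
}


def describe_replication_status(head_object_response: Mapping[str, Any]) -> str:
    """Turn the ReplicationStatus of a HeadObject response into plain words."""
    status = head_object_response.get("ReplicationStatus")
    if status is None:
        # The field is missing when no replication rule applies to the object.
        return "no replication rule applies to this object"
    return _STATUS_MEANINGS.get(status.upper(), f"unrecognised status: {status}")


# With boto3 available, a real response could be fetched like this
# (the object key is hypothetical):
#   response = boto3.client("s3").head_object(Bucket="myrepbucket101", Key="example.txt")
print(describe_replication_status({"ReplicationStatus": "COMPLETED"}))
print(describe_replication_status({"ReplicationStatus": "REPLICA"}))
print(describe_replication_status({}))
```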

This is how you can set up Cross Region Replication in S3.

Conclusion

This article teaches you how to set up Cross Region Replication in S3 with ease, and answers all your queries regarding it. It provides a brief introduction of various concepts related to it & helps the users understand them better and use them to perform data replication & recovery in the most efficient way possible. These methods, however, can be challenging especially for a beginner & this is where Hevo saves the day. Hevo Data, a No-code Data Pipeline, can help you replicate data in real-time without writing any code. Hevo being a fully-managed system provides a highly secure automated solution to help perform replication in just a few clicks using its interactive UI.

Want to take Hevo for a spin? Get started by signing up for a 14-day free trial and experience the feature-rich Hevo suite first hand. You can also have a look at our unbeatable pricing that will help you choose the right plan for you!

Why don’t you share your experience of setting up S3 Cross Region Replication in the comments? We would love to hear from you!
