Exporting data from an Amazon RDS PostgreSQL instance to Amazon S3 can be useful for archiving, analytics, sharing, and more. AWS provides multiple ways to perform AWS RDS Postgres Export to S3, allowing you to extract your RDS Postgres data and save it to S3 storage easily.

This post outlines two straightforward approaches to exporting your AWS RDS PostgreSQL database contents to files in an S3 bucket. Whether for one-time data transfers or setting up regular exports, these simple methods make it easy to copy your RDS Postgres data over to AWS S3.

By the end of this guide, you’ll know how to quickly get your RDS PostgreSQL data exported and safely stored within Amazon S3 using your preferred export technique.

What is Amazon RDS PostgreSQL?


Amazon RDS for PostgreSQL is AWS's managed relational database service for PostgreSQL. It makes it easy to set up, run, and scale PostgreSQL in the cloud by handling automated backups, patching, and hardware provisioning for you. This leaves users with less infrastructure to manage, so they can focus more on their applications. Amazon RDS also provides high availability, security, and performance optimizations for running PostgreSQL workloads.

What is Amazon S3?


Amazon S3 is a web service for storing and retrieving large amounts of data of any type, such as text, images, audio files, and video. It delivers fast, secure, and cost-effective storage for use cases such as over-the-top media content, data archiving, and application development. Amazon S3 lets you store any amount of data from anywhere, with support for encryption, lifecycle management, and access controls that keep stored data safe.

Methods To Connect Amazon RDS PostgreSQL to Amazon S3

Method 1: Automating the Data Replication Process using a No-Code Tool
Using an automated, no-code tool like Hevo, you can start replicating your data in minutes. The setup involves only two steps, configuring the source and the destination, which makes it the more convenient and efficient option.
Method 2: Exporting Data Using the aws_s3 Extension
This manual method is time-consuming and has various limitations, such as limited file-format support and no built-in scheduling. It also involves several steps across PostgreSQL and AWS IAM to set up the export.


Method 1: Using Hevo for AWS RDS Postgres Export to S3

You can effortlessly replicate data from AWS RDS Postgres to Amazon S3 using Hevo by following the simple steps given below:

Step 1: Connect AWS RDS Postgres as Source

  • On the Configure your Amazon RDS PostgreSQL Source page, enter details such as the Pipeline Name, Database Host, and Database Port.

Step 2: Connect S3 as Destination

  • On the Configure your S3 Destination page, enter details such as the Destination Name, External ID, and Bucket Name.

Method 2: Exporting Data from AWS RDS for PostgreSQL to S3 Using the aws_s3 Extension

AWS RDS Postgres Export to S3: Flow Diagram

The following sections cover the 6 steps involved in AWS RDS Postgres Export to S3 in great detail.

Step 1: Exporting Data to Amazon S3

  • First, ensure that your RDS for PostgreSQL instance runs a version that supports Amazon S3 exports (see Step 3).
  • Next, you need to install the required PostgreSQL extensions, namely aws_s3 and aws_commons. Start psql and type in the following command:
CREATE EXTENSION IF NOT EXISTS aws_s3 CASCADE;
  • The aws_s3 extension provides the aws_s3.query_export_to_s3 function that can be used to export data to Amazon S3. The aws_commons extension provides additional helper functions to the user for data export.
  • Identify a database query to obtain the data, and export the query data by calling the aws_s3.query_export_to_s3 function, as previewed in the sketch below.
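
For orientation, here is a minimal sketch of such a call, assuming a table named sample_table and a bucket named sample-bucket in us-west-2 (all placeholders); Steps 4 through 6 below walk through each piece in detail:
psql=> SELECT * FROM aws_s3.query_export_to_s3(
   'SELECT * FROM sample_table',
   aws_commons.create_s3_uri('sample-bucket', 'sample-filepath', 'us-west-2')
);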

Step 2: Data Conversion

When you export data using the RDS snapshot export feature, Amazon RDS converts the data and stores it in the Parquet format. (Exports made with the aws_s3.query_export_to_s3 function instead use PostgreSQL COPY formats such as text or CSV.) Parquet stores all data as one of the following primitive types; an illustrative mapping follows the list:

  • BOOLEAN
  • INT32
  • INT64
  • INT96
  • FLOAT
  • DOUBLE
  • BYTE_ARRAY – A variable-length byte array, also known as binary
  • FIXED_LEN_BYTE_ARRAY – A fixed-length byte array used when the values have a constant size 
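
For illustration, here is a hypothetical table definition (not taken from the AWS documentation) with comments indicating the Parquet primitive type that each common PostgreSQL column type typically maps to in a snapshot export; verify the exact mapping for your engine version against the AWS documentation:
-- Hypothetical table used only to illustrate the PostgreSQL-to-Parquet type mapping.
CREATE TABLE sample_table (
    is_active    boolean,           -- exported as BOOLEAN
    item_count   integer,           -- exported as INT32
    total_views  bigint,            -- exported as INT64
    ratio        real,              -- exported as FLOAT
    score        double precision,  -- exported as DOUBLE
    title        text               -- exported as BYTE_ARRAY (variable-length byte array)
);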

Step 3: PostgreSQL Version Verification

  • Amazon S3 exports are currently supported for PostgreSQL 10.14, 11.9, 12.4, and later. You can verify support for your engine version with the AWS CLI describe-db-engine-versions command, for example (adjust the region and engine version to match your instance):
aws rds describe-db-engine-versions  --region us-west-2  --engine postgres  --engine-version 10.14 | grep s3Export
  • If the output includes the string “s3Export”, it means that the RDS engine supports Amazon S3 exports.

Step 4: Amazon S3 File Path Specification

  • You need to specify the bucket name and the file path to identify the location in Amazon S3 where you want to export your data.
  • To hold the information about where the Amazon S3 file export is to be stored, use the aws_commons.create_s3_uri function to build an aws_commons._s3_uri_1 composite structure, as follows:
psql=> SELECT aws_commons.create_s3_uri(
   'sample-bucket',
   'sample-filepath',
   'us-west-2'
) AS s3_uri_1 \gset
  • This s3_uri_1 value can later be used as a parameter in the call to the aws_s3.query_export_to_s3 function.

Step 5: Configuring Access to Amazon S3 Bucket

  • Identify the Amazon S3 path you wish to use for exporting data and provide the necessary permissions to access the Amazon S3 bucket.
  • This involves creating an IAM policy that provides access to the desired Amazon S3 bucket, creating an IAM role, attaching the policy to the role created, and finally adding the role to your DB instance.
  • Once you have created the IAM policy, you need to note down the ARN (Amazon Resource Name) of the policy that will be needed when you attach the policy to an IAM role. Here’s the code snippet to simplify the creation of an IAM policy:
aws iam create-policy  --policy-name rds-s3-export-policy  --policy-document '{
     "Version": "2012-10-17",
     "Statement": [
       {
         "Sid": "s3export",
         "Action": [
           "S3:PutObject"
         ],
         "Effect": "Allow",
         "Resource": [
           "arn:aws:s3:::your-s3-bucket/*"
         ] 
       }
     ] 
   }'
  • You can use the following AWS CLI command to create a role named rds-s3-export-role:
aws iam create-role  --role-name rds-s3-export-role  --assume-role-policy-document '{
     "Version": "2012-10-17",
     "Statement": [
       {
         "Effect": "Allow",
         "Principal": {
            "Service": "rds.amazonaws.com"
          },
         "Action": "sts:AssumeRole"
       }
     ] 
   }'
  • The following AWS CLI command can be used to attach the policy to the IAM role created earlier:
aws iam attach-role-policy  --policy-arn your-policy-arn  --role-name rds-s3-export-role
  • Finally, you need to add the IAM role to the DB instance. This can be done through the AWS Management Console or the AWS CLI. If you use the AWS Management Console, sign in to the console and choose the PostgreSQL DB instance name to display its details.
  • Go to the Connectivity & security tab. In the Manage IAM roles section, choose the role you wish to add under Add IAM roles to this instance, choose s3Export under Feature, and then choose Add role to finish the step. Alternatively, you can attach the role with the AWS CLI, as sketched below.
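
The following is a minimal sketch of the equivalent AWS CLI call; my-db-instance and your-role-arn are placeholders for your DB instance identifier and the ARN of the role created earlier:
aws rds add-role-to-db-instance  --db-instance-identifier my-db-instance  --feature-name s3Export  --role-arn your-role-arn  --region us-west-2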

Step 6: Exporting Query Data

  • The last step in the process of AWS RDS Postgres Export to S3 is calling the aws_s3.query_export_to_s3 function.
  • This function requires two parameters, namely query and s3_info. The first defines the query to be exported, while the second identifies the Amazon S3 bucket and file path to export to.
  • You can also provide an optional parameter called options for defining various export parameters; it accepts arguments of the PostgreSQL COPY command, such as the format and delimiter. To call it, you can use the following syntax:
aws_s3.query_export_to_s3(
    query text,    
    s3_info aws_commons._s3_uri_1,    
    options text
)

The aws_commons._s3_uri_1 composite type in the above syntax contains the following details about your S3 object:

  • bucket: The name of the Amazon S3 bucket to contain the file.
  • file_path: The Amazon S3 file name and path.
  • region: The AWS Region that the bucket is in.

To build the aws_commons._s3_uri_1 structure for holding Amazon S3 file information, you can use the following syntax:

aws_commons.create_s3_uri(
   bucket text,
   file_path text,
   region text
)
  • For this example, you can use a psql variable named s3_uri_1 to hold the structure that identifies the Amazon S3 file. Create it with the aws_commons.create_s3_uri function as follows, and then pass it to the export call shown after this snippet:
psql=> SELECT aws_commons.create_s3_uri(
   'sample-bucket',
   'sample-filepath',
   'us-west-2'
) AS s3_uri_1 \gset
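
With the s3_uri_1 variable set, a minimal sketch of the export call itself looks like the following; sample_table is a placeholder for your own table, and the optional options string passes PostgreSQL COPY arguments:
psql=> SELECT * FROM aws_s3.query_export_to_s3(
   'SELECT * FROM sample_table',
   :'s3_uri_1',
   options := 'format text, delimiter $$,$$'
);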

Limitations of Manual Method

Supported PostgreSQL Versions

This export option is only available for PostgreSQL versions 10.14, 11.9, 12.4, and later. Earlier PostgreSQL versions hosted on RDS do not support the aws_s3.query_export_to_s3 function for exporting data to S3.

Limited File Formats

The aws_s3.query_export_to_s3 function writes data only in the formats supported by the PostgreSQL COPY command (such as text and CSV), while the snapshot export feature produces only Parquet. For other formats, such as JSON, you have to convert the exported files in an additional post-processing step.

Large Data Exports

When large data sets are exported, the data is split into multiple files once it exceeds roughly 6 GB. Very large exports therefore add complexity: if the data needs to be combined or further processed, you have the extra hassle of handling multiple files.

No Direct Scheduling

AWS RDS Postgres does not offer automated exports that run on a recurring schedule. If you require regular exports, you will need to automate the process with another mechanism, such as AWS Lambda, AWS Step Functions, or a cron job.

Conclusion

This article discusses the steps involved in setting up AWS RDS Postgres Export to S3 in great detail. Extracting complex data from a diverse set of data sources can be challenging, and this is where Hevo saves the day!

Hevo offers a faster way to move data from Amazon S3, AWS RDS Postgres, and other databases or SaaS applications into your desired destination for analysis and visualization. Hevo is fully automated and does not require you to write any code.

Want to take Hevo for a spin? Sign Up for a 14-day free trial. You can also look at the unbeatable Hevo pricing to help you choose the right plan for your business needs.

Tell us about your experience with AWS RDS Postgres Export to S3 in the comments.

FAQs

1. How do we back up RDS Postgres to S3?

You can back up RDS Postgres to S3 by using the AWS RDS snapshot export feature or by exporting specific query results with the aws_s3.query_export_to_s3 function. For snapshots, create a snapshot and export it to S3 in Parquet format.

2. How much does RDS snapshot export to S3 cost?

The cost of exporting RDS snapshots to S3 includes a fee based on the amount of data exported (per GB) and the S3 storage cost for the exported data. AWS pricing may vary depending on the region and the total size of your export.

3. How do I save data to my S3 bucket?

You can save data to your S3 bucket by using the aws_s3.query_export_to_s3 function in RDS PostgreSQL or by uploading files directly to the bucket via the AWS Management Console, AWS CLI, or an SDK such as Boto3.
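
For instance, a direct upload with the AWS CLI is a one-liner; the local file name and bucket path below are placeholders:
aws s3 cp ./export.csv s3://sample-bucket/sample-filepath/export.csv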

Content Marketing Manager, Hevo Data

Amit is a Content Marketing Manager at Hevo Data. He is passionate about writing for SaaS products and modern data platforms. His portfolio of more than 200 articles shows his extraordinary talent for crafting engaging content that clearly conveys the advantages and complexity of cutting-edge data technologies. Amit’s extensive knowledge of the SaaS market and modern data solutions enables him to write insightful and informative pieces that engage and educate audiences, making him a thought leader in the sector.