Exporting data from an Amazon RDS PostgreSQL instance to Amazon S3 can be useful for archiving, analytics, sharing, and more. AWS provides multiple ways to extract your RDS Postgres data and save it to S3 storage.

This post outlines two straightforward approaches to export your AWS RDS PostgreSQL database contents to files in an S3 bucket. Whether for one-time data transfers or setting up regular exports, these simple methods make it easy to copy your RDS Postgres data over to AWS S3.

By the end of this guide, you’ll know how to quickly get your RDS PostgreSQL data exported and safely stored within Amazon S3 using your preferred export technique.

Methods for AWS RDS Postgres Export to S3

Method 1: Using Hevo for AWS RDS Postgres Export to S3

Hevo is a No-code Data Pipeline solution that can help you seamlessly replicate data in real time from 150+ data sources (including 40+ free sources) like AWS RDS Postgres to destinations such as Amazon S3, Data Warehouses, or a destination of your choice in a completely hassle-free and automated manner.


You can effortlessly replicate data from AWS RDS Postgres to Amazon S3 using Hevo by following the simple steps given below:

Step 1: Connect AWS RDS Postgres as Source

On the Configure your Amazon RDS PostgreSQL Source page, enter the details such as the Pipeline Name, Database Host, Database Port, etc.


Step 2: Connect S3 as Destination

To configure Amazon S3 as a Destination in Hevo, follow the below steps:

  • Click DESTINATIONS in the Navigation Bar.
  • In the Destinations List View, click + CREATE.
  • On the Add Destination page, select S3.
  • On the Configure your S3 Destination page, enter details like Destination Name, External ID, Bucket Name etc.

For more information on the integration process, refer to the Hevo documentation.

Method 2: Exporting Data from AWS RDS for PostgreSQL to S3

Let us first understand the steps in AWS RDS Postgres Export to S3 before going into the details.

Understanding the Steps in AWS RDS Postgres Export to S3

Amazon RDS supports DB instances running multiple versions of PostgreSQL. Amazon RDS for PostgreSQL is compliant with various industry standards as well. Amazon RDS also provides access to databases on a DB instance using any standard SQL client. 


The following sections cover the 6 steps involved in AWS RDS Postgres Export to S3 in great detail.

AWS RDS Postgres Export to S3: Exporting Data to Amazon S3

  • To start the process of AWS RDS Postgres Export to S3, you need to export the data stored in your RDS for PostgreSQL instance to an Amazon S3 bucket. First, ensure that your RDS instance runs a PostgreSQL version that supports Amazon S3 exports. Next, install the required PostgreSQL extensions, namely aws_s3 and aws_commons. Start psql and run the following command:
CREATE EXTENSION IF NOT EXISTS aws_s3 CASCADE;
  • The aws_s3 extension provides the aws_s3.query_export_to_s3 function that can be used to export data to Amazon S3. The aws_commons extension provides additional helper functions to the user for data export.
  • Identify a database query that selects the data you want to export. Once you finish the preceding preparation tasks, call the aws_s3.query_export_to_s3 function to export the query results to Amazon S3, as in the sketch below.
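
For illustration, here is a minimal sketch of such an export call, assuming a hypothetical table named sample_table and a bucket named sample-bucket in us-west-2 (building the S3 URI and configuring bucket access are covered step by step in the sections that follow):

psql=> SELECT * FROM aws_s3.query_export_to_s3(
   'SELECT * FROM sample_table',                                              -- query whose results are exported
   aws_commons.create_s3_uri('sample-bucket', 'sample-filepath', 'us-west-2') -- target bucket, file path, and Region
);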

AWS RDS Postgres Export to S3: Data conversion

When exporting data from RDS to S3, Amazon RDS converts the data and stores it in the Parquet format.

Parquet stores all data as one of the following primitive types:

  • BOOLEAN
  • INT32
  • INT64
  • INT96
  • FLOAT
  • DOUBLE
  • BYTE_ARRAY – A variable-length byte array, also known as binary
  • FIXED_LEN_BYTE_ARRAY – A fixed-length byte array used when the values have a constant size

To reduce the complexity of reading and writing the format, Parquet provides only a few primitive data types and extends them with logical types. A logical type is applied as an annotation on the data in a LogicalType metadata field, and this annotation explains how the primitive type should be interpreted.

When a BYTE_ARRAY type is annotated with the STRING logical type, it indicates that the byte array should be interpreted as a UTF-8 encoded character string. Amazon RDS notifies you if any string conversion took place after an export task is completed. The underlying data exported is always the same as the data from the source. However, some characters might appear different from the source when read in tools such as Athena, due to encoding differences with UTF-8.

AWS RDS Postgres Export to S3: PostgreSQL Version Verification

  • Amazon S3 exports are currently supported for PostgreSQL 10.14, 11.9, 12.4, and later. You can verify the support with the AWS CLI describe-db-engine-versions command, for example:
aws rds describe-db-engine-versions --engine postgres --engine-version 10.14 | grep s3Export
  • If the output includes the string “s3Export”, it means that the RDS engine supports Amazon S3 exports.  

AWS RDS Postgres Export to S3: Amazon S3 File Path Specification

  • You need to specify the bucket name and the file path to identify the location in Amazon S3 where you want to export your data.
  • The file path consists of an optional path prefix that identifies a virtual folder path and a file prefix that identifies the files to be stored. Large exports are stored in multiple files, each with a maximum size of approximately 6 GB.
  • To hold the information about where the Amazon S3 file exports are to be stored, you can use the aws_commons.create_s3_uri function to create an aws_commons._s3_uri_1 composite structure as follows:
psql=> SELECT aws_commons.create_s3_uri(
   'sample-bucket',
   'sample-filepath',
   'us-west-2'
) AS s3_uri_1 \gset
  • This s3_uri_1 value can later be used as a parameter in the call to the aws_s3.query_export_to_s3 function.

AWS RDS Postgres Export to S3: Configuring Access to Amazon S3 Bucket

  • To begin the data export to Amazon S3, you need to give your PostgreSQL DB instance the relevant permissions to access the Amazon S3 bucket that will contain the files.
  • Identify the Amazon S3 path you wish to use for exporting data, and provide the necessary permissions to access the Amazon S3 bucket. This involves creating an IAM policy that provides access to the desired Amazon S3 bucket, creating an IAM role, attaching the policy to the role created, and finally adding the role to your DB instance.
  • Once you have created the IAM policy, you need to note down the ARN (Amazon Resource Name) of the policy that will be needed when you attach the policy to an IAM role. Here’s the code snippet to simplify the creation of an IAM policy:
aws iam create-policy  --policy-name rds-s3-export-policy  --policy-document '{
     "Version": "2012-10-17",
     "Statement": [
       {
         "Sid": "s3export",
         "Action": [
           "s3:PutObject"
         ],
         "Effect": "Allow",
         "Resource": [
           "arn:aws:s3:::your-s3-bucket/*"
         ]
       }
     ]
   }'
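  • If you did not note the ARN from the create-policy output, you can look it up afterward. The following is a small sketch that assumes the policy was named rds-s3-export-policy as above:
aws iam list-policies --query 'Policies[?PolicyName==`rds-s3-export-policy`].Arn' --output text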
  • You can use the following AWS CLI command to create a role named rds-s3-export-role:
aws iam create-role  --role-name rds-s3-export-role  --assume-role-policy-document '{
     "Version": "2012-10-17",
     "Statement": [
       {
         "Effect": "Allow",
         "Principal": {
            "Service": "rds.amazonaws.com"
          },
         "Action": "sts:AssumeRole"
       }
     ] 
   }'
  • The following AWS CLI command can be used to attach the policy to the IAM role created earlier:
aws iam attach-role-policy  --policy-arn your-policy-arn  --role-name rds-s3-export-role
  • Finally, you need to add the IAM role to the DB instance. This can be carried out through the AWS Management Console or the AWS CLI (a CLI sketch follows these steps). If you decide to use the AWS Management Console, sign in to the Console and choose the PostgreSQL DB instance name to display its details.
  • Go to the Connectivity & security tab. In the Manage IAM roles section, choose the role you wish to add under Add IAM roles to this instance, choose s3Export under Feature, and then choose Add role to finish the step.
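  • If you prefer the AWS CLI for this last step, the following sketch adds the role to the DB instance; the DB instance identifier, role ARN, and Region below are placeholder values you need to replace with your own:
aws rds add-role-to-db-instance  --db-instance-identifier my-db-instance  --feature-name s3Export  --role-arn your-role-arn  --region us-west-2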

AWS RDS Postgres Export to S3: Exporting Query Data

  • The last step in the process of AWS RDS Postgres Export to S3 is calling the aws_s3.query_export_to_s3 function.

This function requires two parameters, namely query and s3_info. The first defines the query to be exported, and the second identifies the Amazon S3 bucket to export to. You can also provide an optional parameter called options for defining various export parameters. To call it, you can use the following syntax:

aws_s3.query_export_to_s3(
    query text,    
    s3_info aws_commons._s3_uri_1,    
    options text
)

The aws_commons._s3_uri_1 composite type in the above syntax contains the following details about your S3 object:

  • bucket: The name of the Amazon S3 bucket to contain the file.
  • file_path: The Amazon S3 file name and path.
  • region: The AWS Region that the bucket is in.

To build the aws_commons._s3_uri_1 structure for holding Amazon S3 file information, you can use the following syntax:

aws_commons.create_s3_uri(
   bucket text,
   file_path text,
   region text
)
  • For this example, you can use the variable s3_uri_1 to identify a structure that has the information to identify the Amazon S3 file. You can use the aws_commons.create_s3_uri function to create the desired structure as follows:
psql=> SELECT aws_commons.create_s3_uri(
   'sample-bucket',
   'sample-filepath',
   'us-west-2'
) AS s3_uri_1 \gset
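
With the s3_uri_1 psql variable set, you can run the export itself. The following is a minimal sketch that assumes a hypothetical table named sample_table; the optional options argument passes COPY-style arguments such as the output format:

psql=> SELECT * FROM aws_s3.query_export_to_s3(
   'SELECT * FROM sample_table',   -- query whose results are exported
   :'s3_uri_1',                    -- the aws_commons._s3_uri_1 structure created above
   options := 'format text'        -- optional COPY-style arguments
);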

AWS RDS Postgres Export to S3: Troubleshooting access to Amazon S3

You may face a few connectivity issues when trying to perform the AWS RDS Postgres export to S3. First, make sure that the outbound access rules of the VPC security group associated with your DB instance allow network connectivity. In particular, you must allow access to port 443 for SSL connections.
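
For example, if the default allow-all outbound rule has been removed from the security group, you can restore outbound HTTPS access with the AWS CLI; the security group ID below is a placeholder:

aws ec2 authorize-security-group-egress  --group-id sg-0123456789abcdef0  --protocol tcp  --port 443  --cidr 0.0.0.0/0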

Conclusion

This article discusses the steps involved in setting up AWS RDS Postgres Export to S3 in great detail. Extracting complex data from a diverse set of data sources can be challenging, and this is where Hevo saves the day!

Visit our Website to Explore Hevo

Hevo offers a faster way to move data from Amazon S3, AWS RDS Postgres, and other Databases or SaaS applications into your desired destination to be visualized in a BI tool. Hevo is fully automated and hence does not require you to code.

Want to take Hevo for a spin? Sign Up for a 14-day free trial. You can also look at the unbeatable Hevo pricing to help you choose the right plan for your business needs.

Tell us about your experience with AWS RDS Postgres Export to S3 in the comments.

Content Marketing Manager, Hevo Data

Amit is a Content Marketing Manager at Hevo Data. He enjoys writing about SaaS products and modern data platforms, having authored over 200 articles on these subjects.
