Data migration from one instance of a data warehouse to another is essential to consider if you want to optimize cost, improve performance, and consolidate operations in a single place.

Amazon Redshift is a cloud data warehousing service that allows you to deploy your application while securely storing your data. Migrating data from one Redshift cluster to another can enable you to perform collaborative analysis between different databases.

This article will explore the most prominent methods of migrating data from Amazon Redshift to Redshift.

Why Integrate Data from Redshift to Redshift?

  • Redshift to Redshift data integration can be beneficial if you assign separate databases for development, testing, and production purposes. 
  • Moving data to a Redshift cluster in an AWS region with lower pricing can reduce overall operating costs.
  • This migration can improve query performance by moving the data to a cluster located closer to the user base.

An Overview of Amazon Redshift

Amazon Redshift is a cloud data warehousing service that enables you to analyze and perform queries on your data. AWS claims it can give you up to six times better price performance than other data warehouses. Its massively parallel processing (MPP) architecture spreads query work across nodes, which helps you improve performance as your data scales.

It allows you to run SQL queries, visualize data, and create dashboards catering to your business needs. You can quickly and securely integrate data within and across AWS regions and third-party applications.

Methods to Load Data from Redshift to Redshift

This section covers the most prominent ways to load data from one Redshift cluster into a table in another.

Method 1: Migrate Redshift Data to Redshift Using Hevo

Hevo Data is a no-code, real-time ELT platform that automates data pipelines cost-effectively to meet your specific needs. It provides 150+ data source connectors to access and integrate data to a destination. With Hevo’s extremely easy-to-use user interface, you can quickly move data from a source to a destination.

Here are some of the most popular features that Hevo provides:

  • Data Transformation: Hevo provides Python-based and drag-and-drop data transformation techniques that enable you to clean and transform data to make it analysis-ready.
  • Incremental Data Load: Hevo transfers only the data that has been modified, in real time, ensuring efficient bandwidth utilization at the source and destination.
  • Automated Schema Mapping: Hevo automates the schema management process by detecting incoming data and replicating it to the destination schema. It lets you choose between Full & Incremental Mappings according to your data replication requirements.

Configure Amazon Redshift as a Source

This section will highlight the steps to configure Amazon Redshift as a source connector in Hevo Data. But before getting started, you must ensure you satisfy the prerequisites.

Prerequisites:

Set up Amazon Redshift as a Source

After satisfying all the prerequisites, follow the steps below:

  • Select PIPELINES from the Navigation Bar and click + CREATE in the Pipelines List View.
  • Select Amazon Redshift on the Select Source Type page.
  • Specify the necessary fields on the Configure your Amazon Redshift Source page.
Redshift to Redshift: Configure your Amazon Redshift Source

For detailed information about the steps involved, follow Hevo Data Amazon Redshift Source Documentation.

Configure Amazon Redshift as a Destination

This section will guide you through the steps to configure Amazon Redshift as a destination in Hevo. But before getting into the steps involved, ensure you satisfy the prerequisites.

Prerequisites:

Set up Amazon Redshift as a Destination

After meeting all the prerequisite conditions, you must follow the steps below:

  • Select DESTINATIONS from the Navigation Bar and click + CREATE in the Destinations List View.
  • Select Amazon Redshift on the Add Destination page.
  • Specify the mandatory details on the Configure your Amazon Redshift Destination page.
Redshift to Redshift: Configure your Amazon Redshift Destination page

To learn more about the steps involved in configuring Amazon Redshift as a destination, refer to Hevo Amazon Redshift Destination Documentation.

You can also use Hevo Data to load Redshift to Databricks, Redshift to BigQuery, and connect Redshift to many other data warehouses.


Method 2: Moving Data from Redshift to Redshift Using COPY and UNLOAD Commands

This section shows how to use the UNLOAD and COPY commands to move a table from one Redshift cluster to another through Amazon S3. Follow the steps given below to perform the Redshift to Redshift integration.

Step 1: Create a Folder in Amazon S3 to Unload the Data

In this step, you must create an Amazon S3 folder and associate an IAM role that lets your Redshift clusters access it. The S3 bucket will act as the intermediary between the two Redshift database instances.

Step 2: Select Tables to Unload Data

Before unloading, run queries on the source cluster to confirm that the data you plan to move is correct. Query the fields you need and check that they return the expected values, because the same columns must line up with the destination table when you run the COPY command. Because the UNLOAD command accepts a SELECT statement, you can export the result of a query rather than an entire table.
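As a rough illustration of this check, the queries below preview the rows you plan to export. The table `public.sales` and its columns (`sale_id`, `customer_id`, `sale_date`, `amount`) are hypothetical names used only for this sketch, so substitute your own.

```sql
-- Hypothetical source table: public.sales (all names are assumptions for illustration).

-- 1. Count the rows and check the date range you expect to export.
SELECT COUNT(*)       AS row_count,
       MIN(sale_date) AS first_sale,
       MAX(sale_date) AS last_sale
FROM public.sales
WHERE sale_date >= '2024-01-01';

-- 2. Spot-check a few rows to confirm the columns hold the values you expect.
SELECT sale_id, customer_id, sale_date, amount
FROM public.sales
WHERE sale_date >= '2024-01-01'
ORDER BY sale_date
LIMIT 10;
```

Noting the row count at this point gives you a number to compare against once the data has been loaded into the destination.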

Step 3: Use UNLOAD Command

Using the UNLOAD command, you can export data from the Amazon Redshift database to Amazon S3. To unload data to Amazon S3, follow the syntax given below.

    UNLOAD ('select-statement')
    TO 's3://object-path/name-prefix'
    authorization
    [ option, ...]

    where authorization is
    IAM_ROLE { default | 'arn:aws:iam::<AWS account-id-1>:role/<role-name>[,arn:aws:iam::<AWS account-id-2>:role/<role-name>][,...]' }

    where option is
    | [ FORMAT [ AS ] ] CSV | PARQUET | JSON
    | PARTITION BY ( column_name [, ... ] ) [ INCLUDE ]
    | MANIFEST [ VERBOSE ]
    | HEADER
    | DELIMITER [ AS ] 'delimiter-char'
    | FIXEDWIDTH [ AS ] 'fixedwidth-spec'
    | ENCRYPTED [ AUTO ]
    | BZIP2
    | GZIP
    | ZSTD
    | ADDQUOTES
    | NULL [ AS ] 'null-string'
    | ESCAPE
    | ALLOWOVERWRITE
    | CLEANPATH
    | PARALLEL [ { ON | TRUE } | { OFF | FALSE } ]
    | MAXFILESIZE [AS] max-size [ MB | GB ]
    | ROWGROUPSIZE [AS] size [ MB | GB ]
    | REGION [AS] 'aws-region' }
    | EXTENSION 'extension-name'

The UNLOAD command takes an SQL query, the Amazon S3 path where you want to write the data, and the IAM role with permissions to access S3.

After running the above command, you can check for the files in the Amazon S3 path.
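To make the syntax more concrete, here is a minimal sketch of an UNLOAD call. The table `public.sales`, the bucket `example-bucket`, and the role ARN are placeholders invented for this example, and the choice of gzip-compressed CSV with a header and manifest is only one reasonable combination of options.

```sql
-- All names below (table, bucket, account ID, role) are hypothetical placeholders.
-- Single quotes inside the query text must be doubled, as in ''2024-01-01''.
UNLOAD ('SELECT sale_id, customer_id, sale_date, amount
         FROM public.sales
         WHERE sale_date >= ''2024-01-01''')
TO 's3://example-bucket/unload/sales_'   -- prefix for the generated file names
IAM_ROLE 'arn:aws:iam::123456789012:role/ExampleRedshiftRole'
FORMAT AS CSV
HEADER            -- write column names as the first line of each file
GZIP              -- compress each output file
MANIFEST          -- also write s3://example-bucket/unload/sales_manifest
ALLOWOVERWRITE;   -- replace files left over from an earlier run
```

By default UNLOAD writes several files in parallel, so the manifest is a convenient way to tell a later COPY exactly which files belong to this export.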

Redshift to Redshift: UNLOAD Command

You can also use Hevo Data to load data from Amazon Redshift to S3. For more information, refer to Amazon S3 to Redshift data transfer.

Step 4: Create a Destination Table

In this step, you create a table in the destination Redshift database. You must ensure that the schema and table you create in the destination database match the structure of the source Redshift database.
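One way to check that the structures match is to list the column definitions on both clusters and compare them. The sketch below assumes a table called `public.sales`, which is a placeholder name; `SVV_COLUMNS` and `SHOW TABLE` are standard Redshift features.

```sql
-- Run this on BOTH clusters and compare the output line by line.
-- public.sales is a hypothetical table name.
SELECT ordinal_position,
       column_name,
       data_type,
       character_maximum_length,
       numeric_precision,
       numeric_scale,
       is_nullable
FROM svv_columns
WHERE table_schema = 'public'
  AND table_name   = 'sales'
ORDER BY ordinal_position;

-- On the source cluster, SHOW TABLE prints the table definition,
-- which can serve as a starting point for the destination table.
SHOW TABLE public.sales;
```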

You can use the CREATE TABLE command to create new tables in the Redshift database. You can follow this syntax to create new tables:

    CREATE [ [LOCAL ] { TEMPORARY | TEMP } ] TABLE
    [ IF NOT EXISTS ] table_name
    ( { column_name data_type [column_attributes] [ column_constraints ]
      | table_constraints
      | LIKE parent_table [ { INCLUDING | EXCLUDING } DEFAULTS ] }
      [, ... ]  )
    [ BACKUP { YES | NO } ]
    [table_attributes]

    where column_attributes are:
      [ DEFAULT default_expr ]
      [ IDENTITY ( seed, step ) ]
      [ GENERATED BY DEFAULT AS IDENTITY ( seed, step ) ]
      [ ENCODE encoding ]
      [ DISTKEY ]
      [ SORTKEY ]
      [ COLLATE CASE_SENSITIVE | COLLATE CASE_INSENSITIVE  ]

    and column_constraints are:
      [ { NOT NULL | NULL } ]
      [ { UNIQUE  |  PRIMARY KEY } ]
      [ REFERENCES reftable [ ( refcolumn ) ] ]

    and table_constraints  are:
      [ UNIQUE ( column_name [, ... ] ) ]
      [ PRIMARY KEY ( column_name [, ... ] )  ]
      [ FOREIGN KEY (column_name [, ... ] ) REFERENCES reftable [ ( refcolumn ) ] ]

    and table_attributes are:
      [ DISTSTYLE { AUTO | EVEN | KEY | ALL } ]
      [ DISTKEY ( column_name ) ]
      [ [COMPOUND | INTERLEAVED ] SORTKEY ( column_name [,...]) |  [ SORTKEY AUTO ] ]
      [ ENCODE AUTO ]
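As an illustration of the syntax above, the following sketch creates a small destination table. The table name, columns, and the distribution and sort key choices are assumptions made for this example; in practice they should mirror the definition of your source table.

```sql
-- Hypothetical destination table; names and key choices are illustrative only.
CREATE TABLE IF NOT EXISTS public.sales (
    sale_id     BIGINT        NOT NULL,
    customer_id INTEGER       NOT NULL,
    sale_date   DATE          NOT NULL,
    amount      DECIMAL(12,2),
    PRIMARY KEY (sale_id)     -- informational in Redshift, not enforced
)
DISTSTYLE KEY
DISTKEY (customer_id)         -- co-locate rows for the same customer
SORTKEY (sale_date);          -- speeds up range filters on sale_date
```

Redshift does not enforce primary key constraints, so duplicate `sale_id` values are not rejected at load time.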

Step 5: Use COPY Command

The COPY command moves data from the Amazon S3 bucket to the new Amazon Redshift table. It loads files in parallel, which makes it an efficient way to move large amounts of data. To execute the COPY command, you need:

  • The Amazon Redshift table’s name.
  • The columns you want to copy.
  • An IAM role with permission to read from S3.
  • The location in S3 where the data is present.

You can follow the syntax given below to execute the COPY command.

    COPY table-name
    [ column-list ]
    FROM data_source
    authorization
    [ [ FORMAT ] [ AS ] data_format ]
    [ parameter [ argument ] [, ... ] ]

After running the command with your own values in place of the arguments, your data moves from the Amazon S3 bucket into the new Redshift database table. You must cross-check the result, for example by comparing row counts, to confirm that the new data matches the source table.
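A minimal sketch of the load and the follow-up check might look like this. It assumes the files were unloaded as gzip-compressed CSV with a header row and a manifest, and it reuses placeholder names (`public.sales`, `example-bucket`, the role ARN) that are not from the original walkthrough.

```sql
-- All object names and the role ARN are hypothetical placeholders.
COPY public.sales (sale_id, customer_id, sale_date, amount)
FROM 's3://example-bucket/unload/sales_manifest'   -- manifest written by UNLOAD
IAM_ROLE 'arn:aws:iam::123456789012:role/ExampleRedshiftRole'
FORMAT AS CSV
GZIP              -- the data files are gzip-compressed
IGNOREHEADER 1    -- skip the header line in each file
MANIFEST;         -- treat the FROM path as a manifest, not a key prefix

-- Compare this count with the one taken on the source cluster.
SELECT COUNT(*) AS row_count
FROM public.sales;

-- If the COPY failed, the most recent load errors usually explain why.
SELECT starttime, filename, line_number, colname, err_reason
FROM stl_load_errors
ORDER BY starttime DESC
LIMIT 10;
```

`STL_LOAD_ERRORS` is available on provisioned clusters; on Redshift Serverless the equivalent information is exposed through the `SYS_LOAD_ERROR_DETAIL` view.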

Besides COPY and UNLOAD, you can use Redshift data sharing to read data that lives in another cluster, Amazon Redshift Spectrum to query external data in Amazon S3, or cluster snapshots to restore a copy of the source cluster. These are alternative ways of making data from one Redshift cluster available in another.
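For the data sharing route, a rough sketch of the statements involved is shown below. The datashare name, table name, and namespace IDs are placeholders, and the example assumes both clusters are in the same AWS account and run on node types that support data sharing (RA3 or Redshift Serverless).

```sql
-- On the source (producer) cluster. All names are hypothetical.
CREATE DATASHARE sales_share;
ALTER DATASHARE sales_share ADD SCHEMA public;
ALTER DATASHARE sales_share ADD TABLE public.sales;

-- Each cluster can look up its own namespace ID with: SELECT current_namespace;
GRANT USAGE ON DATASHARE sales_share
TO NAMESPACE '<consumer-namespace-id>';

-- On the destination (consumer) cluster.
CREATE DATABASE sales_shared
FROM DATASHARE sales_share OF NAMESPACE '<producer-namespace-id>';

-- Query the shared table with three-part notation; no data is copied.
SELECT COUNT(*) AS row_count
FROM sales_shared.public.sales;
```

Unlike UNLOAD and COPY, this approach gives the consumer read access to the producer's live data instead of creating a second copy.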

Limitations of Using COPY and UNLOAD Commands

There are certain limitations to using the second method, which you must know before using it to perform a Redshift to Redshift migration. Here are some of the specific limitations that you must keep in mind:

  • Lack of Real-Time Data Integration: This method requires you to move data manually from one Amazon Redshift cluster to another, so changes made in the source after the unload are not reflected in the destination until you repeat the process.
  • Error-Prone Process: Manual data transfer is prone to mistakes, so you must be cautious when moving data from source to destination and monitor each run to catch failed or partial loads.
  • External Schemas Compatibility: If you consider data sharing as an alternative, note that Amazon Redshift does not support adding external schemas or tables to data shares.
  • Schema Management: The source and destination schemas must match in structure for the load to succeed, and you must keep them in sync yourself.
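One common way to soften the first limitation is to rerun the export on a schedule and move only the rows that changed. The sketch below is an assumption-laden outline, not a replacement for real-time replication: it presumes that `public.sales` exists with identical columns on both clusters, that it has an `updated_at` timestamp column and a unique `sale_id`, and that you track the cutoff timestamp yourself. All names are placeholders.

```sql
-- Source cluster: export only rows changed since the last run.
-- The cutoff timestamp is an example value you would track yourself.
UNLOAD ('SELECT * FROM public.sales
         WHERE updated_at > ''2024-06-01 00:00:00''')
TO 's3://example-bucket/unload/sales_delta_'
IAM_ROLE 'arn:aws:iam::123456789012:role/ExampleRedshiftRole'
FORMAT AS CSV
GZIP
MANIFEST
ALLOWOVERWRITE;

-- Destination cluster: load into a staging table, then merge.
BEGIN;

CREATE TEMP TABLE sales_stage (LIKE public.sales);

COPY sales_stage
FROM 's3://example-bucket/unload/sales_delta_manifest'
IAM_ROLE 'arn:aws:iam::123456789012:role/ExampleRedshiftRole'
FORMAT AS CSV
GZIP
MANIFEST;

-- Remove the old versions of the changed rows, then insert the new ones.
DELETE FROM public.sales
USING sales_stage
WHERE public.sales.sale_id = sales_stage.sale_id;

INSERT INTO public.sales
SELECT * FROM sales_stage;

DROP TABLE sales_stage;

END;
```

Running the merge inside one transaction means readers of the destination table never see the state where old rows are deleted but the new ones are not yet inserted. Rows deleted on the source are not propagated by this pattern.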

Use Cases of Redshift to Redshift Integration

  • Cost Optimization: This migration allows you to deploy clusters in an AWS region with lower pricing, reducing costs.
  • Latency Optimization: Moving the cluster closer to the user base in a different AWS region can reduce latency and improve query performance.
  • Consolidation: Migrating the cluster to a region where the rest of the company’s deployments are present can help streamline management and reduce complexity. It can also optimize costs using a centralized resource management system.
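When the source and destination clusters are in different AWS regions, the S3 bucket will be in a different region from at least one of them. Both UNLOAD and COPY accept a REGION parameter for this case. The sketch below shows the COPY side, assuming a destination cluster outside `us-east-1` and a bucket located in `us-east-1`; the bucket, table, and role names are placeholders.

```sql
-- Destination cluster in one region, S3 bucket in another (us-east-1 here).
-- All object names and the role ARN are hypothetical.
COPY public.sales
FROM 's3://example-bucket/unload/sales_manifest'
IAM_ROLE 'arn:aws:iam::123456789012:role/ExampleRedshiftRole'
FORMAT AS CSV
GZIP
IGNOREHEADER 1
MANIFEST
REGION 'us-east-1';   -- region of the bucket, not of the cluster
```

Cross-region transfers can incur data transfer charges, which is worth weighing against the savings from the cheaper region.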

Conclusion

This article explains two of the most widely used methods for moving data from one Redshift database to another. Although both methods are efficient, the second method has some limitations.

To overcome these limitations, you can use Hevo Data to transfer data between different Redshift clusters. Hevo provides over 150 data source connectors from which you can integrate data. Its user interface lets you combine data between various sources and destinations without requiring technical knowledge.

Want to take Hevo for a spin? Sign up for a 14-day free trial and experience the feature-rich Hevo suite firsthand. Also, check out our pricing to choose the best plan for your organization.


Frequently Asked Questions (FAQs)

Q. What are the ways to copy data from one Redshift cluster to another?

There are multiple ways to transfer your data from one Redshift cluster to another. Here are the prominent ones:

  1. Using SaaS-based tools such as Hevo.
  2. Using COPY and UNLOAD commands by transferring data to an Amazon S3 bucket.
  3. Using a cluster snapshot of the source and restoring it into the destination cluster.

Suraj Kumar Joshi
Technical Content Writer, Hevo Data

Suraj is a skilled technical content writer with a bachelor’s degree in Electronics Engineering. As a highly motivated data enthusiast, he specializes in journaling and writing about the latest trends in the data industry. Suraj has authored numerous articles on topics such as data science, engineering, and analysis, demonstrating his expertise and deep understanding of these fields. In addition to his writing, he is passionate about developing and training machine learning models to generate impactful insights.