In today’s data-rich world, businesses must select the right data storage and analysis platform. For many, Heroku PostgreSQL has long been a trusted solution, offering a reliable relational database service in the cloud. However, as organizations grow, Heroku PostgreSQL becomes more expensive and has slower performance when handling large data volumes compared to other cloud-based platforms. Since the Heroku platform for Postgres databases is standardized, you have limited customization options for your application. The solution to this problem is migrating to a cloud-based data warehousing platform like Snowflake.
Snowflake’s architecture offers unparalleled scalability, elasticity, and performance for data storage and analytics workloads of any size or complexity.
Let’s look into migrating data from Heroku PostgreSQL to Snowflake.
Why Migrate Data from Heroku PostgreSQL to Snowflake?
When you migrate from Heroku PostgreSQL to Snowflake, your organization can drive innovation and insights from its data. Here are a few reasons to migrate to Snowflake:
- Enhanced Cloud Security: Snowflake offers its highest level of security through Virtual Private Snowflake (VPS), which is aimed at financial institutions and other organizations that handle sensitive data. A VPS is a separate Snowflake environment that does not share data with others outside the VPS.
- Compatibility with Existing Infrastructure: Snowflake’s native support for structured files, such as CSV, and semi-structured formats, such as JSON, ensures smooth integration into existing data workflows. This reduces the need to reconfigure the existing infrastructure before adopting Snowflake.
Heroku PostgreSQL: A Brief Overview
Heroku is a cloud platform that allows the direct deployment of database applications and email services. Heroku PostgreSQL is Heroku’s managed database service based on PostgreSQL. Because the service is fully managed, you can focus more on building applications based on your data than on database management tasks.
Heroku PostgreSQL is known for its reliability, extensibility, and robustness. It provides features including continuous protection, auto-scaling, and automated health checks.
You can access PostgreSQL instances on Heroku using any of its officially supported languages with a PostgreSQL driver and manage them efficiently with Heroku CLI commands.
Snowflake: A Brief Overview
Snowflake is a software-as-a-service (SaaS) that offers a unified data analytics platform for data warehousing, data lakes, and application development. All the hardware and software components can execute on a cloud infrastructure such as AWS, Azure, or GCP.
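Snowflake employs a hybrid of shared-disk and shared-nothing architectures. Like a shared-disk system, it maintains a central data repository accessible from all virtual compute instances, while each virtual warehouse processes queries with its own compute resources. This allows your organization to scale the storage and compute resources independently based on your needs.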
To connect with Snowflake, you can choose a web-based interface, ODBC or JDBC drivers, the SnowSQL CLI, or native connectors such as Python or Spark.
Methods for Data Migration from Heroku PostgreSQL to Snowflake
To migrate data from Heroku PostgreSQL to Snowflake, you can utilize Hevo Data, SQL utilities, or Python modules.
Method 1: Migrate Data from Heroku PostgreSQL to Snowflake Using Hevo Data
Hevo Data is a no-code, real-time ELT platform that helps you cost-effectively automate flexible data pipelines. With its 150+ pre-built connectors, you can smoothly extract data from different sources, load it to destinations, and transform it for in-depth analysis.
Some of the key features of Hevo Data are given below:
- Data Transformation: Analyst-friendly data transformation approaches, such as Python-based transformation scripts or drag-and-drop transformation blocks, allow you to clean, prepare, and transform data before importing it to the desired destination.
- Incremental Data Load: Hevo Data transfers only the data that has been modified, in real time, optimizing bandwidth utilization at both ends of the data pipeline.
- Auto Schema Mapping: Hevo’s Auto Mapping feature eliminates the tedious task of managing schema manually. It identifies and transfers the incoming data format to the desired destination schema. You can choose either full or incremental mappings based on your data replication needs.
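To make the idea of an incremental load concrete, here is a minimal, generic sketch. It is not Hevo's implementation; the function name `select_incremental` and the `updated_at` field are assumptions made for illustration. It keeps a watermark (the latest modification time already synced) and selects only rows changed after it.

```python
from datetime import datetime


def select_incremental(rows, last_synced_at):
    """Return the rows modified after the watermark, plus the new watermark."""
    # Hypothetical convention: every row carries an 'updated_at' timestamp.
    changed = [r for r in rows if r["updated_at"] > last_synced_at]
    new_watermark = max((r["updated_at"] for r in changed), default=last_synced_at)
    return changed, new_watermark


rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 5)},
    {"id": 3, "updated_at": datetime(2024, 1, 9)},
]
changed, watermark = select_incremental(rows, datetime(2024, 1, 3))
print([r["id"] for r in changed], watermark.date())
```

Only the rows changed after the previous sync are transferred, which is why an incremental load uses less bandwidth than copying the full table each time.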
Let’s see how to integrate Heroku PostgreSQL with Snowflake using Hevo Data.
Step 1: Configure Heroku PostgreSQL as Your Source Connector
Before proceeding with the configuration, check that the following prerequisites are in place:
- An active Heroku account with a PostgreSQL database instance.
- The credentials of your Heroku PostgreSQL database.
- You must be assigned a Pipeline Administrator, Team Collaborator, or Team Administrator role in Hevo.
Here are the steps to set up Heroku PostgreSQL as the source in Hevo:
- Log in to your Hevo account and select the PIPELINES option from the Navigation Bar.
- In the Pipelines List View, click the + CREATE button.
- Navigate to the Select Source Type page and choose Heroku PostgreSQL as your source type.
- Provide the required information in the Configure your Heroku PostgreSQL Source page.
Heroku PostgreSQL to Snowflake: Configuring your Heroku PostgreSQL Source Page
- Click TEST CONNECTION, and then click TEST & CONTINUE to complete the source configuration.
To learn more about source configuration, read Heroku PostgreSQL documentation in Hevo.
Step 2: Configure Snowflake as Your Destination Connector
Before you begin, verify the following prerequisites are met:
Here are the steps to set up Snowflake as your destination in Hevo:
- Select the option DESTINATIONS from the Navigation Bar.
- Go to the Destinations List View page and click the + CREATE button.
- In the Add Destination page, select Snowflake as your destination type.
- Specify the mandatory fields on the Configure your Snowflake Destination page.
Heroku PostgreSQL to Snowflake: Configuring your Snowflake Destination Page
- Click TEST CONNECTION, and then click SAVE & CONTINUE to complete the destination configuration.
For more information about destination configuration, read Snowflake documentation in Hevo.
Method 2: Transfer Heroku PostgreSQL Data to Snowflake Using SQL Utilities
For this migration, you must export the data from Heroku PostgreSQL to a CSV file using the psql utility. Then, using the SnowSQL CLI commands, load the staged CSV file into the Snowflake table.
Perform the following steps to transfer Heroku PostgreSQL to Snowflake using SQL utilities:
- Connect to your Heroku PostgreSQL database with psql using the following command:
heroku pg:psql -a <app-name>
In the psql session, run the following command to copy the data to a local CSV file:
\copy sample_data TO 'sample_db.csv' WITH (FORMAT CSV)
Log in to SnowSQL CLI using the following command:
snowsql -a <account_name> -u <user_name> -d <snowflake_db>
Enter your Snowflake password and create a new database in your Snowflake account using the following command:
CREATE DATABASE IF NOT EXISTS UTILS_DATABASE;
Use the following command to make it the current database:
USE DATABASE UTILS_DATABASE;
Create a table that matches the source table using the following syntax. Replace the placeholders with the column names and data types of your source table.
CREATE OR REPLACE TABLE sample_data (column_name1 datatype, column_name2 datatype, column_name3 datatype, ..., column_nameN datatype);
Run the following command to upload the CSV file to the table stage:
PUT file://sample_db.csv @%sample_data;
Load the staged file into the main table using the COPY INTO command:
COPY INTO sample_data FROM @%sample_data FILE_FORMAT = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '"');
With the Heroku and SnowSQL CLI commands, you can transfer the Heroku PostgreSQL data to Snowflake tables.
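The Snowflake side of these steps follows a fixed pattern: create the table, upload the file to the table stage, then copy it in. As a rough sketch, the statements can be generated from a list of source columns. The `PG_TO_SNOWFLAKE` mapping and the `build_load_statements` helper are hypothetical names introduced here, and the mapping covers only a few common types, with anything else falling back to STRING.

```python
# Hypothetical, deliberately small type mapping; extend it for your own tables.
PG_TO_SNOWFLAKE = {
    "integer": "INTEGER",
    "bigint": "BIGINT",
    "character varying": "STRING",
    "text": "STRING",
    "boolean": "BOOLEAN",
    "double precision": "FLOAT",
    "date": "DATE",
    "timestamp without time zone": "TIMESTAMP_NTZ",
}


def build_load_statements(table, columns, csv_path):
    """Return the CREATE TABLE, PUT, and COPY INTO statements for one table.

    columns is a list of (column_name, postgres_type) pairs.
    Unmapped PostgreSQL types fall back to STRING (an assumption of this sketch).
    """
    column_defs = ", ".join(
        f"{name} {PG_TO_SNOWFLAKE.get(pg_type, 'STRING')}" for name, pg_type in columns
    )
    return [
        f"CREATE OR REPLACE TABLE {table} ({column_defs});",
        f"PUT file://{csv_path} @%{table};",
        f"COPY INTO {table} FROM @%{table} "
        "FILE_FORMAT = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '\"');",
    ]


statements = build_load_statements(
    "sample_data",
    [("id", "integer"), ("name", "character varying")],
    "sample_db.csv",
)
for statement in statements:
    print(statement)
```

The printed statements can be pasted into a SnowSQL session. Review the generated column types before running them, since the fallback to STRING is a convenience, not a faithful type conversion.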
Limitations of Heroku PostgreSQL to Snowflake Migration Using SQL Utilities
- High Complexity: When migrating to Snowflake with CLI commands, you may encounter potential complexity in mastering the command-line interfaces of both platforms. This may require extra resources and time for training, especially for individuals who are not proficient with command-line tools.
- Data Leakage: The plain text format allows CSV files to be easily read during data migration from Heroku PostgreSQL to Snowflake. This vulnerability could result in data leakage if the data is sensitive.
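One way to reduce the local exposure of the exported CSV, sketched under assumptions: write it to a temporary file that only the current user can read, and delete it as soon as the upload finishes. The `export_to_private_csv` helper and its `upload` callback are hypothetical; in practice the callback would run the PUT command. This limits who can read the file on the local machine and does not encrypt its contents.

```python
import csv
import os
import tempfile


def export_to_private_csv(rows, upload):
    """Write rows to an owner-only temporary CSV, pass its path to upload, then delete it."""
    # mkstemp creates the file readable and writable only by the creating user.
    fd, path = tempfile.mkstemp(suffix=".csv")
    try:
        with os.fdopen(fd, "w", newline="") as f:
            csv.writer(f).writerows(rows)
        upload(path)  # hypothetical callback, e.g. one that runs PUT
    finally:
        os.remove(path)  # the plain-text file never outlives the upload
    return path


seen = {}


def fake_upload(path):
    # Stand-in for the real upload: just read the file back.
    with open(path, newline="") as f:
        seen["rows"] = list(csv.reader(f))


path = export_to_private_csv([(1, "alice"), (2, "bob")], fake_upload)
print(seen["rows"], os.path.exists(path))
```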
Method 3: Load Heroku PostgreSQL Data to Snowflake Using Python Modules
If you are a Python expert, you can alternatively utilize psycopg, a popular PostgreSQL adapter for Python, to extract data from Heroku PostgreSQL. Then, import the extracted data into Snowflake using its own Python connector. Let’s look at the steps for using Python utilities:
- Open your command line interface and install psycopg2 and snowflake-connector-python modules using the pip package manager.
- Write the following code in a new Python file:
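import psycopg2
import snowflake.connector
import csv

# Connect to Heroku PostgreSQL. Replace the placeholders with your database credentials.
postgresConnection = psycopg2.connect(
    database="<database_name>",
    user="<user_name>",
    password="<password>",
    host="<host_name>",
    port="<port>"  # use the port listed in your Heroku credentials
)
postgresCursor = postgresConnection.cursor()

# Take the first table in the public schema (usersdemo in this example).
postgresCursor.execute("""SELECT table_name FROM information_schema.tables
    WHERE table_schema = 'public'""")
tableName = postgresCursor.fetchall()[0][0]

# Read the column metadata of that table, in column order.
postgresCursor.execute("""SELECT *
    FROM information_schema.columns
    WHERE table_schema = 'public' AND table_name = %s
    ORDER BY ordinal_position""", (tableName,))

# Build the Snowflake column list: row[3] is the column name, row[7] the data type.
schema = "("
for row in postgresCursor:
    if row[7] == 'character varying':
        schema += f"{row[3]} string, "
    elif row[7] == 'integer':
        schema += f"{row[3]} {row[7]}, "
    else:
        # Assumption: any other type is loaded as a string so that no column is skipped.
        schema += f"{row[3]} string, "
schema = f"{schema[:-2]})"

# Extract all rows and write them to a local CSV file.
postgresCursor.execute("SELECT * FROM usersdemo")
extract_rows = postgresCursor.fetchall()
with open('usersdemo.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(extract_rows)

postgresCursor.close()
postgresConnection.close()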
Once you run this code, you will obtain a CSV file containing all the data that is ready to import into Snowflake.
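The extraction code loads the whole table into memory with fetchall(), which can be a problem for large tables. A possible alternative, shown as a sketch, writes the rows in batches with the DB-API fetchmany() method. The helper name `write_cursor_to_csv` is hypothetical, and sqlite3 stands in for psycopg2 here only so that the example runs without a database server. With psycopg2 you would likely also want a named (server-side) cursor, because a default cursor transfers the full result set to the client when the query executes.

```python
import csv
import os
import sqlite3  # stand-in for psycopg2 so the sketch runs anywhere
import tempfile


def write_cursor_to_csv(cursor, path, batch_size=1000):
    """Write the rows of an already-executed DB-API cursor to a CSV file in batches."""
    total = 0
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        while True:
            batch = cursor.fetchmany(batch_size)
            if not batch:
                break
            writer.writerows(batch)
            total += len(batch)
    return total


# Demo data in an in-memory database (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usersdemo (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO usersdemo VALUES (?, ?)", [(i, f"user{i}") for i in range(5)]
)
cur = conn.cursor()
cur.execute("SELECT * FROM usersdemo")

csv_path = os.path.join(tempfile.gettempdir(), "usersdemo_batched.csv")
count = write_cursor_to_csv(cur, csv_path, batch_size=2)
print(count)
conn.close()
```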
Update the existing Python file to load the data in CSV format to Snowflake using the following code:
# Connect to Snowflake. Replace the placeholders with your account details.
snowflakeConnection = snowflake.connector.connect(
    user="<user_name>",
    password="<password>",
    account="<account_name>"
)
snowflakeCursor = snowflakeConnection.cursor()

# Create the target database if needed and switch to it.
databaseName = 'python_database'
snowflakeCursor.execute(f"CREATE DATABASE IF NOT EXISTS {databaseName}")
snowflakeCursor.execute(f"USE DATABASE {databaseName}")

# Create the target table from the schema string built during extraction.
tableName = 'usersdemo'
snowflakeCursor.execute(
    "CREATE OR REPLACE TABLE "
    f"{tableName}{schema}")

# Upload the CSV file to the table stage, then load it into the table.
snowflakeCursor.execute(
    f"PUT file://usersdemo.csv @%{tableName}")
snowflakeCursor.execute(
    f"COPY INTO {tableName} FROM @%{tableName} FILE_FORMAT = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '\"')")

snowflakeCursor.close()
snowflakeConnection.close()
To learn about the Python code in detail, refer to the steps in Python Utilities to transfer from Heroku Postgres to Snowflake.
Limitations of Heroku PostgreSQL to Snowflake Migration Using Python Modules
- Lack of Real-time Integration: When working with Python, any updates to the Heroku PostgreSQL databases will not be reflected in Snowflake until the next migration is performed. This delay affects the decision-making process that relies on up-to-date data.
- Security Concerns: Storing database credentials directly in the Python code poses security risks, especially when the code is shared in a version control system like Git.
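A common way to address the credentials concern is to read them from the environment instead of the source file. Heroku exposes the connection string of an attached Postgres database in the DATABASE_URL config var, and the sketch below parses such a URL into keyword arguments for psycopg2.connect. The helper name `pg_params_from_url` and the fallback demo URL are assumptions made for illustration.

```python
import os
from urllib.parse import unquote, urlparse


def pg_params_from_url(url):
    """Split a postgres:// URL into keyword arguments for psycopg2.connect."""
    parsed = urlparse(url)
    return {
        "database": parsed.path.lstrip("/"),
        "user": parsed.username,
        # Passwords in URLs are percent-encoded, so decode them.
        "password": unquote(parsed.password) if parsed.password else None,
        "host": parsed.hostname,
        "port": parsed.port or 5432,
    }


# The fallback URL is a made-up placeholder so the sketch runs without Heroku.
url = os.environ.get(
    "DATABASE_URL", "postgres://demo_user:demo_pass@localhost:5432/demo_db"
)
params = pg_params_from_url(url)
print(params["host"], params["database"])  # avoid printing the password
```

The resulting dictionary can be passed as psycopg2.connect(**params), which keeps the credentials out of the Python file and out of version control.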
Use Cases of Heroku PostgreSQL to Snowflake Migration
- Ad-hoc Analysis: When migrating to Snowflake, your organization can implement data warehouses more quickly to perform ad hoc analysis by executing SQL queries.
- Micro-Partitioned Data Storage: Unlike traditional static partitioning, Snowflake automatically divides table data into micro-partitions, which are contiguous units of storage that each hold between 50 MB and 500 MB of uncompressed data.
Conclusion
Heroku PostgreSQL to Snowflake migration offers organizations a pathway to improved data management and analytics capabilities. Whether you are performing ad-hoc analysis or optimizing data storage, migrating to Snowflake allows you to unlock new levels of scalability, performance, and data integrity in your operations.
In this article, you have learned how to utilize Hevo Data, SQL utilities, or Python modules to integrate Heroku PostgreSQL to Snowflake. The SQL method is only suitable for smaller datasets and is accompanied by security considerations. If you use Python utilities, you must be highly proficient in Python coding to import data into Snowflake from Heroku Postgres.
Utilizing pre-built connectors in Hevo is the ideal solution for quickly migrating large data volumes with real-time capabilities.
To learn more about SnowSQL, read Snowflake SQL Command-Line Client.
Frequently Asked Questions (FAQs)
Q. I am planning to migrate the data from Heroku PostgreSQL to Snowflake. Can anyone recommend the best solution for real-time synchronization between them, including deletes, updates, and insert operations?
A. You can utilize any one of the following approaches:
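- Manually export the data from the Heroku PostgreSQL database to a CSV file, for example with the psql \copy command. Then, utilize Snowflake commands like PUT and COPY INTO to load the file into Snowflake. This approach moves data in batches, so it does not provide real-time synchronization on its own.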
- Utilize Hevo Data to migrate large volumes of data from Heroku PostgreSQL to Snowflake efficiently.
Sony is a technical writer with over six years of experience, including three years as a writer and three years as a teacher. She leverages her Master’s degree in Computer Science to craft engaging and informative articles that span a broad spectrum of topics within data science, machine learning, and AI. Her dedication to excellence and passion for education are evident in her numerous published works, enlightening and empowering data professionals.