PostgreSQL to Snowflake ETL: 4 Easy Steps to Migrate Data

on Tutorial, Data Integration, Data Warehouse, Database, ETL, PostgreSQL, snowflake • November 17th, 2021 • Write for Hevo

Before diving deep into PostgreSQL to Snowflake data migration steps, let me walk give you a brief introduction about Snowflake. As you might have heard, Snowflake is an Analytic Data Warehouse on the Cloud. It is offered as a true Software-as-a-Service (SaaS) – no hardware to configure, no software to install, no maintenance. Snowflake’s architecture is defined newly from scratch, not an extension of the existing Big Data framework like Hadoop.

Snowflake has a hybrid of the traditional shared-disk database and modern shared-nothing database architectures. It uses a central repository for persisted data that is accessible from all compute nodes in the data warehouse and processes queries using MPP (Massively Parallel Processing) compute clusters where each node in the cluster stores a portion of the data set. Snowflake processes using “Virtual Warehouses” which is an MPP compute cluster composed of multiple compute nodes. All components of Snowflake’s service run in a public Cloud-like AWS.

Snowflake Data Warehouse is considered a cost-effective high performing analytical solution and is used by many organizations for critical workloads. In this post, we will discuss how to move real-time data from PostgreSQL to Snowflake. So, read along and understand the steps to migrate data from PostgreSQL to Snowflake.

Table Of Contents

Methods to Replicate PostgreSQL to Snowflake

There are two popular methods to perform PostgreSQL to Snowflake ETL.

Method 1: A ready to use Hevo, Official Snowflake ETL Partner (14 Days Free Trial) to move data from PostgreSQL to Snowflake.


Method 2: Write a Custom Code to move data from PostgreSQL to Snowflake.

postgresql to snowflake

As in the above-shown figure, steps to replicate PostgreSQL to Snowflake using Custom code (Method 2) are as follows:

1. Extract Data from PostgreSQL

COPY TO command is the most popular and efficient method to extract data from the PostgreSQL table to a file. We can also use the pg_dump utility for the first time full data extraction. We will have a look at both methods.

a. Extract Data Using the COPY Command

As mentioned above, COPY TO is the command used to move data between PostgreSQL tables and standard file-system files. It copies an entire table or the results of a SELECT query to a file:

COPY table or sql_query TO out_file_name WITH options.


COPY employees TO 'C:tmpemployees_db.csv'  WITH DELIMITER ',' CSV HEADER;
COPY (select * from contacts where age < 45) TO 'C:tmpyoung_contacts_db.csv'  WITH DELIMITER ',' CSV HEADER;

Some frequently used options are:

  • FORMAT: The format of the data to be written are text, CSV, or binary (default is text).
  • ESCAPE: The character that should appear before a data character that matches the QUOTE value.
  • NULL: Represents the string that is a null value. The default is N (backslash-N) in text and an unquoted empty string in CSV.
  • ENCODING: Encoding of the output file. The default value is the current client encoding.
  • HEADER: If it is set, on the output file, the first line contains the column names from the table.
  • QUOTE: The quoting character to be used when data is quoted. The default is double-quote(“).
  • DELIMITER: The character that separates columns within each line of the file.

Next, we can have a look at how the COPY command can be used to extract data from multiple tables using a PL/PgSQL procedure. Here, the table named tables_to_extract contains details of tables to be exported.

    tables RECORD;
    statement TEXT;
 FOR tables IN 
    SELECT (schema || '.' || table_name) AS table_with_schema
    FROM tables_to_extract

    statement := 'COPY ' || tables.table_with_schema || ' TO ''' || path || '/' || tables.table_with_schema || '.csv' ||''' DELIMITER '';'' CSV HEADER';
    EXECUTE statement;
 $ LANGUAGE plpgsql;

SELECT db_to_csv('/home/user/dir'/dump); -- This will create one csv file per table, in /home/user/dir/dump/

Sometimes you want to extract data incrementally. To do that, add more metadata like the timestamp of last data extraction to the table tables_to_extract and use that information while creating the COPY command to extract data changed after that timestamp.

Consider you are using a column name last_pull_time corresponding to each table in the table tables_to_extract which stores last successful data pull time. Each time data in the table which are modified after that timestamp has to be pulled. The body of the loop in procedure will change like this:

Here a dynamic SQL is created with predicate comparing last_modified_time_stamp from the table to be extracted and last_pull_time from table list_of_tables.

 FOR tables IN 
    SELECT (schema || '.' || table_name) AS table_with_schema, last_pull_time AS lt
    FROM tables_to_extract
    statement := 'COPY (SELECT * FROM ' || tables.table_with_schema ||   ' WHERE last_modified_time_stamp > ' || last_pull_time ') TO ' '' || path || '/' || tables.table_with_schema || '.csv' ||''' DELIMITER '';'' CSV HEADER';
    EXECUTE statement;

b. Extract Data Using the pg_dump

As mentioned above, pg_dump is the utility for backing up a PostgreSQL database or tables. It can be used to extract data from the tables also.

Example syntax:

pg_dump --column-inserts --data-only --table=<table> <database> > table_name.sql

Here output file table_name.sql will be in the form of INSERT statements like

INSERT INTO my_table (column1, column2, column3, ...)
VALUES (value1, value2, value3, ...);

This output has to be converted into a CSV file with the help of a small script in your favorites like Bash or Python.

2. Data Type Conversion and Formatting

There will be domain-specific logic to be applied while transferring data. Apart from that following things are to be noted while migrating data to avoid surprises.

  • Snowflake out of box support number of character sets including UTF-8. Check out the full list of encodings.
  • Unlike many other cloud analytical solutions, Snowflake supports SQL constraints like UNIQUE, PRIMARY KEY, FOREIGN KEY, NOT NULL constraints.
  • Snowflake by default has a rich set of data types. Below is the list of Snowflake data types and corresponding PostgreSQL types.
PostgreSQL Data TypeSnowflake Data Type
ENUMNo type for ENUM. Must use any type which can represent values in ENUM
  • Snowflake allows almost all of the date/time format. The format can be explicitly specified while loading data to the table using the File Format Option which we will discuss in detail later. The complete list of supported date/time formats can be found.

3. Stage Data Files

Before inserting data from Postgresql to Snowflake table it needs to be uploaded to a temporary location which is called staging. There are two types of stages – internal and external.

a. Internal Stage

Each user and table is automatically allocated an internal stage for data files. It is also possible to create named internal stages.

  • The user named and accessed as ‘@~’.
  • The name of a table stage will be the same as that of the table.
  • The user or table stages can’t be altered or dropped.
  • The user or table stages do not support setting file format options.

As mentioned above, Internal Named Stages can be created by the user using the respective SQL statements. It provides a lot of flexibility while loading data by giving options to you to assign file format and other options to named stages.

While running DDL and commands like load data, SnowSQL is quite a handy CLI client which can be used to run those commands and is available in Linux/Mac/Windows. Read more about the tool and options.

Below are some example commands to create a stage:

Create a names stage:

create or replace stage my_postgres_stage
copy_options = (on_error='skip_file')
file_format = (type = 'CSV' field_delimiter = '|' skip_header = 1);

PUT command is used to stage data files to an internal stage. The syntax of the command is as given below :

PUT file://path_to_file/filename internal_stage_name


Upload a file named cnt_data.csv in the /tmp/postgres_data/data/ directory to an internal stage named postgres_stage.

put file:////tmp/postgres_data/data/cnt_data.csv @postgres_stage;

There are many useful options that can be helpful to improve performance like set parallelism while uploading the file, automatic compression of data files, etc. More information about those options is listed here.

b. External Stage

Amazon S3 and Microsoft Azure are external staging locations currently supported by Snowflake. We can create an external stage with any of those locations and load data to a Snowflake table.

To create an external stage on S3, IAM credentials have to be given. If the data is encrypted, then encryption keys should also be given.

create or replace stage postgre_ext_stage url='s3://snowflake/data/load/files/'
credentials=(aws_key_id='111a233b3c' aws_secret_key='abcd4kx5y6z');
encryption=(master_key = 'eSxX0jzYfIjkahsdkjamtnBKONDwOaO8=');

Data to the external stage can be uploaded using AWS or Azure web interfaces. For S3 you can upload using AWS web console or any AWS SDK or third-party tools.

4. Copy Staged Files from PostgreSQL to Snowflake Table

COPY INTO is the command used to load the contents of the staged file(s) from Postgresql to Snowflake table. To execute the command compute resources in the form of virtual warehouses are needed. You know more about it this command in the Snowflake ETL best practices.

To load from a named internal stage

COPY INTO postgres_table
FROM @postgres_stage;

Loading from the external stage. Only one file is specified.

COPY INTO my_external_stage_table
FROM @postgres_ext_stage/tutorials/dataloading/contacts_ext.csv;

You can even copy directly from an external location:

COPY INTO postgres_table
FROM s3://mybucket/snow/data/files
credentials = (aws_key_id='$AWS_ACCESS_KEY_ID' aws_secret_key='$AWS_SECRET_ACCESS_KEY')
encryption = (master_key = 'eSxX0jzYfIdsdsdsamtnBKOSgPH5r4BDDwOaO8=')
file_format = (format_name = csv_format);

Files can be specified using patterns.

COPY INTO pattern_table
FROM @postgre_stage
file_format = (type = 'CSV')

Some common format options for CSV format supported in the COPY command:

  • COMPRESSION: Compression algorithm used for the input data files.
  • RECORD_DELIMITER: Records or lines separator character in an input CSV file.
  • FIELD_DELIMITER: Character separating fields in the input file.
  • SKIP_HEADER: How many header lines to be skipped.
  • DATE_FORMAT: To specify the date format.
  • TIME_FORMAT: To specify the time format.

Check out the full list of options. So, now you have finally loaded data from PostgreSQL to Snowflake.

Update Snowflake Table

We have discussed how to extract data incrementally from PostgreSQL. Now we will look at how to migrate data from PostgreSQL to Snowflake effectively.

As we discussed in the introduction, Snowflake is not based on any big data framework and does not have any limitations for row-level updates like in systems like Hive. It supports row-level updates makes delta data migration much easier. Basic idea is to load incrementally extracted data into an intermediate table and modify records in the final table as per data in the intermediate table.

There are three popular methods to modify the final table once data is loaded into the intermediate table.

  • Update the rows in the final table and insert new rows from the intermediate table which are not in the final table.
UPDATE  final_target_table t
SET t.value = s.value
FROM  intermed_delta_table in
INSERT INTO final_target_table (id, value)
SELECT id, value
FROM  intermed_delta_table
WHERE NOT id IN (SELECT id FROM final_target_table);
  • Delete all records from the final table which are in the intermediate table. Then insert all rows from the intermediate table to the final table.
DELETE .final_target_table f
WHERE IN (SELECT id from intermed_delta_table);
INSERT final_target_table (id, value)
SELECT id, value
FROM  intermed_table;
  • MERGE statement: Inserts and updates can be done with a single MERGE statement and it can be used to apply changes in the intermediate table to the final table with one SQL statement.
MERGE into final_target_table t1 using intermed_delta_table t2 on =
WHEN matched then update set value = t2.value
WHEN not matched then INSERT (id, value) values (, t2.value);

Easier Way to move data from PostgreSQL to Snowflake

Using Hevo, official Snowflake ETL partner you can easily load data from PostgreSQL to Snowflake with just 3 simple steps. This does not need you to write any code and will provide you with an error-free, fully managed setup to move data in minutes.

Check out some of the cool features of Hevo:

  • Completely Automated: The Hevo platform can be set up in just a few minutes and requires minimal maintenance.
  • Real-time Data Transfer: Hevo provides real-time data migration, so you can have analysis-ready data always.
  • 100% Complete & Accurate Data Transfer: Hevo’s robust infrastructure ensures reliable data transfer with zero data loss.
  • Scalable Infrastructure: Hevo has in-built integrations for 100+ sources that can help you scale your data infrastructure as required.
  • 24/7 Live Support: The Hevo team is available round the clock to extend exceptional support to you through chat, email, and support call.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  • Live Monitoring: Hevo allows you to monitor the data flow so you can check where your data is at a particular point in time.


High-performing Data Warehouse solutions like Snowflake are getting more adoption and are becoming an integral part of a modern analytics pipeline. Migrating data from various data sources to this kind of cloud-native solution i.e from Postgresql to Snowflake, requires expertise in the cloud, data security, and many other things like metadata management.

If you are looking for an ETL tool that facilitates the automatic transformation of data, then Hevo is the right choice for you. Hevo is a No-code Data Pipeline. It supports pre-built integration from 100+ data sources at a reasonable price. With Hevo, you can perfect, modify and enrich your data conveniently.

visit our website to explore hevo

SIGN UP for a 14-day free trial and see the difference!

Have any further queries about PostgreSQL to Snowflake? Get in touch with us in the comments section below.

No-code Data Pipeline for Snowflake