Press "Enter" to skip to content

Oracle to Snowflake ETL: Steps to Move Data Easily


Oracle databases do not need an introduction. They have ruled the data landscape for decades, and many large organizations heavily depend on Oracle to run mission-critical workloads. While Oracle has many advantages as a database, there are strong reasons to move data from Oracle to the Snowflake Data Warehouse:

    • Snowflake offers high computational power. If many concurrent users are running complex queries, the compute capacity of a Snowflake warehouse can be scaled dynamically, so there is less waiting time for complex query executions.
    • The agility and elasticity offered by the Snowflake Cloud Data Warehouse are unmatched. You can scale only when needed and pay only for what you use.
    • Snowflake is a completely managed service. This means you can get your analytics projects running with minimal engineering resources.
    • Snowflake lets you work seamlessly with semi-structured data, which is much harder to analyse in Oracle.

Methods to move data from Oracle to Snowflake:

Here are two ways that can be used to approach Oracle to Snowflake ETL:

Method 1: Implement a hassle-free, no-code Data Integration Platform like Hevo Data – 14 Day Free Trial (Official Snowflake ETL Partner)

Method 2: Write custom ETL scripts to move data from Oracle to Snowflake

Through this article, you will get a step-by-step understanding of how to write custom code to move data from Oracle to Snowflake. Towards the end of the blog, the limitations of this approach and ways to work around them are highlighted.

Move data from Oracle to Snowflake Using Custom Code: 

Steps to move data from Oracle to Snowflake can be categorised as follows:

  1. Extract data from Oracle to CSV using SQL*Plus 
  2. Data type conversion and other transformations
  3. Staging files to S3
  4. Finally, copy staged files to the Snowflake table

Let us go through these steps in detail.

Step 1: Extract data from Oracle to a CSV file

SQL*Plus is a query tool installed with every Oracle Database Server or Client installation. It can be used to run a SQL query and redirect its result to a CSV file. The command used for this is SPOOL.

Eg:

-- Turn on the spool

    spool spool_file.txt

 -- Run your Query

    select  * from dba_table;

 -- Turn off spooling

    spool off;

Once spooling is turned on, all query results are written to the file specified in the command (in this case spool_file.txt) until spooling is turned off.

  • The spool file will not be visible until spooling is turned off.
  • If the spool file doesn’t already exist, a new file will be created. If it exists, it is overwritten by default. From Oracle 10g onwards there is an APPEND option that adds the output to an existing file, as shown below.
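For example, to append to an existing spool file instead of overwriting it (Oracle 10g or later):

    spool spool_file.txt append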

Most of the time, the data extraction logic will be executed in a shell script. Here is a very basic example script that extracts the full data from an Oracle table:

#!/usr/bin/bash

FILE="students.csv"

sqlplus -s user_name/password@oracle_db <<EOF

SET PAGESIZE 50000
SET COLSEP ","
SET LINESIZE 200
SET FEEDBACK OFF

SPOOL $FILE

SELECT * FROM STUDENTS;

SPOOL OFF
EXIT
EOF
  • SET PAGESIZE – The number of lines per page. The header line is printed on every page.
  • SET COLSEP – Sets the column separator.
  • SET LINESIZE – The number of characters per line. The default is 80. Set this high enough that each record fits on a single line.
  • SET FEEDBACK OFF – Suppresses feedback messages so that they do not appear in the CSV file.
  • SPOOL $FILE – The file to which the results of the query are written.
  • SELECT * FROM STUDENTS – The query executed to extract data from the table.
  • SPOOL OFF – Stops writing the contents of the SQL session to the file.

Incremental Data Extract

As discussed in the section above, once spooling is on, any SQL can be run and the result is redirected to the specified file. To extract data incrementally, you need to generate SQL with conditions that select only the records modified after the last data pull.

Eg:

select * from students where last_modified_time > last_pull_time and last_modified_time <= sys_time;

Now the result set will contain only the records that changed after the last pull.
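Here is a minimal sketch of how such an incremental extract could be scripted, assuming the STUDENTS table has a LAST_MODIFIED_TIME column and the timestamp of the previous pull is kept in a local file named last_pull_time.txt (both the column and the file are illustrative assumptions):

#!/usr/bin/bash

# Illustrative sketch: LAST_MODIFIED_TIME and last_pull_time.txt are assumptions
FILE="students_delta.csv"
LAST_PULL=$(cat last_pull_time.txt)   # e.g. 2020-01-01 00:00:00

sqlplus -s user_name/password@oracle_db <<EOF
SET PAGESIZE 50000
SET COLSEP ","
SET LINESIZE 200
SET FEEDBACK OFF
SPOOL $FILE
SELECT * FROM STUDENTS
WHERE LAST_MODIFIED_TIME > TO_DATE('$LAST_PULL', 'YYYY-MM-DD HH24:MI:SS')
  AND LAST_MODIFIED_TIME <= SYSDATE;
SPOOL OFF
EXIT
EOF

# Record the new high-water mark for the next run; in practice you would
# capture this value from the database rather than the local clock
date +"%Y-%m-%d %H:%M:%S" > last_pull_time.txt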

Step 2: Data type conversion and formatting

While transferring data from Oracle to Snowflake, data might have to be transformed as per business needs. Apart from such use case specific changes, there are certain important things to be noted for smooth data movement.

  • Many errors can be caused by a mismatch between the character sets of the source and target. Note that Snowflake supports all major character sets including UTF-8 and UTF-16. The full list can be found here.
  • While moving data from Oracle to Big Data systems, data integrity is often compromised due to the lack of support for SQL constraints. Snowflake supports all common SQL constraints like UNIQUE, PRIMARY KEY, FOREIGN KEY and NOT NULL, which is a great help for making sure data has moved as expected (note that, except for NOT NULL, these constraints are not enforced by Snowflake).
  • Snowflake’s type system covers most primitive and advanced data types, including nested data structures like struct and array. Below is a table mapping Oracle data types to their Snowflake counterparts.
Oracle Data Type                Snowflake Data Type
NUMBER                          BYTEINT
NUMBER                          SMALLINT
NUMBER                          INTEGER
NUMBER                          BIGINT
NUMBER                          DECIMAL
FLOAT                           FLOAT
NUMBER                          NUMERIC
CHAR                            CHAR (up to 16 MB)
VARCHAR2/NVARCHAR2              VARCHAR (up to 16 MB)
CHAR(n)                         CHAR VARYING(n)
FLOAT                           REAL
DATE                            DATE
TIMESTAMP (only HH:MI:SS)       TIME
TIMESTAMP WITH TIME ZONE        TIMESTAMP_LTZ (aliases: TIMESTAMPLTZ, TIMESTAMP WITH LOCAL TIME ZONE)
BINARY_FLOAT/BINARY_DOUBLE      BINARY (up to 8 MB)
BINARY_FLOAT/BINARY_DOUBLE      VARBINARY
CLOB / VARCHAR2                 VARIANT
CLOB / VARCHAR2                 ARRAY
  • Often, date and time formats require a lot of attention while creating data pipelines. Snowflake is quite flexible here as well. If a custom format is used for dates or times in the file to be loaded into the table, it can be specified explicitly using the File Format Option, as in the example below. The complete list of date and time formats can be found here.
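For instance, if dates in the extracted file are written as DD-MON-YYYY, a named file format along these lines could be created and later referenced from a stage or a COPY command (the format name is illustrative):

create or replace file format my_oracle_csv_format
  type = 'CSV'
  field_delimiter = ','
  skip_header = 1
  date_format = 'DD-MON-YYYY'
  timestamp_format = 'DD-MON-YYYY HH24:MI:SS';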

Step 3: Stage Data Files

To load data into Snowflake, it has to be uploaded to a cloud staging area first. If your Snowflake instance runs on AWS, the data has to be uploaded to an S3 location that Snowflake has access to. This process is called staging. A Snowflake stage can be either internal or external.

Internal Stage

If you choose this option, each user and table is automatically assigned an internal stage that can be used to stage data related to that user or table. Internal stages can also be created explicitly with a name.

  • For a user, the default internal stage is named ‘@~’.
  • For a table, the default internal stage has the same name as the table.
  • The default internal stage associated with a user or table cannot be altered or dropped.
  • Unlike named stages, file format options cannot be set on default user or table stages.
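As a quick illustration of how these default stages are addressed (the table name is only an example, and PUT has to be run from SnowSQL or another client that supports it):

-- List files in your default user stage
list @~;

-- Stage a file into the default stage of a table named STUDENTS
put file:///tmp/oracle_data/students.csv @%students;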

If an internal stage is created explicitly with a name using SQL statements, many data loading options can be assigned to the stage, such as file format, date format, etc. When data is loaded into a table through this stage, those options are applied automatically.

Note: The rest of this document discusses many Snowflake commands. Snowflake comes with a very intuitive and stable web-based interface to run SQL and commands. However, if you prefer to work with a light-weight command line utility to interact with the database you might like SnowSQL – a CLI client available in Linux/Mac/Windows to run Snowflake commands. Read more about the tool and options here.
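For example, assuming SnowSQL is installed, a connection is typically opened with placeholders for your account and user:

snowsql -a <account_identifier> -u <user_name>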

Now let’s have a look at the command to create a stage:

Create a named internal stage my_oracle_stage  and assign some default options:

create or replace stage my_oracle_stage

copy_options= (on_error='skip_file')

file_format= (type = 'CSV' field_delimiter = ',' skip_header = 1);

PUT is the command used to stage files to an internal Snowflake stage. The syntax of the PUT command is:

PUT file://<path_to_your_file>/<your_filename> @<internal_stage_name>

Eg:

Upload a file items_data.csv from the /tmp/oracle_data/data/ directory to an internal stage named oracle_stage:

put file:///tmp/oracle_data/data/items_data.csv @oracle_stage;

While uploading the file you can set many configurations to enhance the data load performance, such as the degree of parallelism and automatic compression. Complete information can be found here.
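For example, a PUT with a couple of these options set might look like this (the values shown are illustrative):

put file:///tmp/oracle_data/data/items_data.csv @oracle_stage
  parallel = 8
  auto_compress = true;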

External Stage

Let us now look at the external staging option and understand how it differs from the internal stage. Snowflake supports any accessible Amazon S3 bucket or Microsoft Azure container as an external staging location. You can create a stage pointing to that location, and data can be loaded directly into a Snowflake table through it. There is no need to move the data to an internal stage.

If you want to create an external stage pointing to an S3 location, IAM credentials with proper access permissions are required. If the data needs to be decrypted before loading into Snowflake, the proper keys have to be provided. Here is an example of creating an external stage:

create or replace stage oracle_ext_stage url='s3://snowflake_oracle/data/load/files/'

credentials=(aws_key_id='1d318jnsonmb5#dgd4rrb3c' aws_secret_key='aii998nnrcd4kx5y6z')

encryption=(master_key = 'eSxX0jzskjl22bNaaaDuOaO8=');

Once data is extracted from Oracle, it can be uploaded to S3 using the direct upload option or the AWS SDK in your favourite programming language. Python’s boto3 is a popular choice in such circumstances. Once the data is in S3, an external stage can be created pointing to that location.
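For instance, if the AWS CLI is configured with credentials that can write to the bucket, a simple upload to the location used in the stage example above could look like this (bucket and paths are illustrative):

# Copy the extracted file to the S3 path the external stage points to
aws s3 cp /tmp/oracle_data/data/items_data.csv s3://snowflake_oracle/data/load/files/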

Step 4: Copy staged files to Snowflake table

So far, you have extracted data from Oracle, uploaded it to an S3 location and created an external Snowflake stage pointing to that location. The next step is to copy the data to the table. The command used to do this is COPY INTO. Note: to execute the COPY INTO command, compute resources in Snowflake virtual warehouses are required, and your Snowflake credits will be utilised.

Eg:

To load from a named internal stage:

copy into oracle_table

from @oracle_stage;

To load from the external stage (only one file is specified):

copy into my_ext_stage_table

from @oracle_ext_stage/tutorials/dataloading/items_ext.csv;

You can even copy directly from an external location without creating a stage:

copy into oracle_table

from 's3://mybucket/oracle_snow/data/files'

credentials=(aws_key_id='$AWS_ACCESS_KEY_ID' aws_secret_key='$AWS_SECRET_ACCESS_KEY')

encryption=(master_key = 'eSxX009jhh76jkIuLPH5r4BD09wOaO8=')

file_format = (format_name = csv_format);

Files can be specified using patterns:

copy into oracle_pattern_table

from @oracle_stage

file_format = (type = 'TSV')

pattern='.*/.*/.*[.]csv[.]gz';

Some commonly used options for CSV file loading using the COPY command:

  • DATE_FORMAT – Specify any custom date format used in the file so that Snowflake can parse it properly.
  • TIME_FORMAT – Specify any custom time format used in the file.
  • COMPRESSION – If your data is compressed, specify the algorithm used to compress it.
  • RECORD_DELIMITER – The character that separates records (lines) in the file.
  • FIELD_DELIMITER – The character separating fields in the file.
  • SKIP_HEADER – The number of header lines to skip while loading data into the table.

There are many other useful options. For the full list, click here.
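Putting a few of these options together, an example of loading a gzip-compressed CSV with a custom date format (reusing the stage and table names from the earlier illustrations) could look like this:

copy into oracle_table
from @oracle_stage
file_format = (type = 'CSV'
               field_delimiter = ','
               skip_header = 1
               date_format = 'DD-MON-YYYY'
               compression = 'GZIP');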

Update Snowflake Table

We have discussed how to extract data incrementally from an Oracle table. Data extracted this way cannot be inserted into the target table directly, because it will contain both new and updated records that have to be treated accordingly.

Earlier in this document, we mentioned that Snowflake supports SQL constraints. Another pleasant surprise is Snowflake’s support for row-level data manipulation, which makes it easier to handle delta data loads. The basic idea is to load the incrementally extracted data into an intermediate or temporary table and then modify the records in the final table using the data in the intermediate table. The three approaches mentioned below are generally used for this.

  1. Update the rows in the target table with new data (with the same keys), then insert the rows from the intermediate or landing table that are not yet in the final table.
    UPDATE oracle_target_table t
    SET value = s.value
    FROM landing_delta_table s
    WHERE t.id = s.id;
    INSERT INTO oracle_target_table (id, value)
    SELECT id, value
    FROM landing_delta_table
    WHERE id NOT IN (SELECT id FROM oracle_target_table);
  2. Delete the rows from the target table that are also present in the landing table, then insert all rows from the landing table into the final table. The final table will then have the latest data without duplicates.
    DELETE FROM oracle_target_table
    WHERE id IN (SELECT id FROM landing_delta_table);
    INSERT INTO oracle_target_table (id, value)
    SELECT id, value FROM landing_delta_table;
  3. MERGE statement – The standard SQL MERGE statement combines inserts and updates. It is used to apply changes from the landing table to the target table with one SQL statement.
    MERGE into oracle_target_table t1 using landing_delta_table t2 on t1.id = t2.id
    
    WHEN matched then update set value = t2.value
    
    WHEN not matched then INSERT (id, value) values (t2.id, t2.value);

This method works when you have a comfortable project timeline and a pool of experienced engineers who can build and maintain the pipeline. However, it comes with significant coding and maintenance overhead.

There are much easier and simpler ways to achieve this. 

 EASY WAY TO MOVE DATA FROM ORACLE TO SNOWFLAKE

Using a fully managed Data Pipeline Platform such as Hevo (also an official Snowflake ETL partner) can help you move your data from Oracle DB to Snowflake in real time without writing any code. Hevo automates the entire data migration in a secure and reliable manner. Here are the simple steps to move data from Oracle DB to Snowflake using Hevo:

  • Authenticate and Connect to your Oracle Database
  • Select Replication Mode:
    • Load data through a custom SQL query
    • Select the Oracle DB tables to be loaded via UI
    • Load data using Redo logs
  • Configure the Snowflake data warehouse for data load

With Hevo, you can load data consistently and reliably from Oracle to Snowflake. Hevo’s fault-tolerant architecture ensures zero data loss in the migration process. With Hevo as part of your infrastructure, you can shift your focus to the right projects instead of worrying about the availability of data.

Beyond Oracle, Hevo has 100s of additional data source integrations that work out of the box. This gives you the flexibility to scale your data infrastructure up and down as your business needs evolve.

Sign up for a 14-Day Free Trial with Hevo and experience code-free, hassle-free data loading from Oracle to Snowflake.
