Migrating from Oracle to Snowflake can be a game-changer for businesses looking to modernize their data infrastructure. While Oracle has long been a reliable choice for on-premise databases, Snowflake offers a cloud-native solution that’s designed for scalability, flexibility, and cost-efficiency.

In this blog, I’ll walk you through the need for Oracle and Snowflake and then explore two methods for migration: manual migration and using automated tools. By the end, you’ll have a clear understanding of how to seamlessly transition from Oracle to Snowflake, no matter your expertise level. Let’s dive in!

Getting to Know Oracle

Oracle Logo

Oracle Database is a robust relational database management system (RDBMS) known for its scalability, reliability, and advanced features like high availability and security. Oracle offers an integrated portfolio of cloud services featuring IaaS, PaaS, and SaaS, posing competition to big cloud providers. The company also designs and markets enterprise software solutions in the areas of ERP, CRM, SCM, and HCM, addressing a wide range of industries such as finance, health, and telecommunication institutions.

Key Features of Oracle

  1. Multimodel database: It has the facility to facilitate multiple data models, including relational, document, graph, and key-value, hence giving flexible facilities in managing diversified kinds of data.
  2. ACID Compliance: Guarantees transaction integrity through Atomicity, Consistency, Isolation, and Durability; hence, it is reliable for critical business operations.
  3. High Availability: Advanced features such as Real Application Clusters (RAC), Data Guard, and Flashback Technology ensure continuous uptime, disaster recovery, and data protection.
Solve your data replication problems with Hevo’s reliable, no-code, automated pipelines with 150+ connectors.
Get your free trial right away!

Getting to Know Snowflake

Snowflake Logo

Snowflake is a cloud-based data warehousing platform designed for modern data analytics and processing. Snowflake separates compute, storage, and services. Therefore, they may scale independently with a SQL data warehouse for querying and analyzing structured and semi-structured data stored in Amazon S3 or Azure Blob Storage.

Advantages of Snowflake

  • Scalability: Using Snowflake, you can automatically scale the compute and storage resources to manage varying workloads without human intervention.
  • Supports Concurrency: Snowflake delivers high performance when dealing with multiple users supporting mixed workloads without performance degradation.
  • Efficient Performance: Snowflake’s unique architecture allows for optimized query performance, with particular techniques applied in columnar storage, query optimization, and caching.

Why Snowflake Over Oracle?

  • Cloud-native architecture: Snowflake is designed specifically for the cloud, offering superior scalability, flexibility, and performance.
  • Separation of storage and compute: Snowflake allows independent scaling of storage and compute resources, enabling cost-effective usage and significant savings.
  • Cost efficiency: You only pay for the compute and storage you actually use, avoiding unnecessary expenses.
  • Simplicity and ease of use: Snowflake requires minimal maintenance, making it ideal for large-scale data operations.
  • Support for diverse data types: Snowflake can handle a wide variety of data types, enhancing its versatility for modern data needs.
  • Data sharing and collaboration: Snowflake’s robust data-sharing capabilities enable secure and seamless collaboration across teams.
  • Strong security features: Snowflake provides enterprise-grade security to ensure safe data management and compliance.

Choosing between data platforms is crucial, especially when integrating Oracle with databases such as Snowflake or Databricks to enhance your data architecture.

Integrate Oracle with Snowflake in a hassle-free manner.

Method 1: Using Hevo Data to Set up Oracle to Snowflake Integration

Using Hevo Data, a No-code Data Pipeline, you can directly transfer data from Oracle to Snowflake and other Data Warehouses, BI tools, or a destination of your choice in a completely hassle-free & automated manner.

Method 2: Manual ETL Process to Set up Oracle to Snowflake Integration

In this method, you can convert your Oracle data to a CSV file using SQL plus and then transform it according to the compatibility. You then can stage the files in S3 and ultimately load them into Snowflake using the COPY command. This method can be time taking and can lead to data inconsistency.

Get Started with Hevo for Free

What are the Methods to Set up Oracle to Snowflake Integration?

There are many ways of loading data from Oracle to Snowflake. In this blog, you will be going to look into two popular ways. Also you can read our article on Snowflake Excel integration.

In the end, you will have a good understanding of each of these two methods. This will help you to make the right decision based on your use case:

Method 1: Using Hevo Data to Set up Oracle to Snowflake Integration

Step 1: Configure Oracle as your Source

Configuring Oracle as Source in Hevo Data

Step 2: Choose Snowflake as your Destination

Configuring Snowflake as Destination in Hevo Data

With this, you have successfully set up Oracle to Snowflake Integration using Hevo Data.

For more details on Oracle to Snowflake integration, refer the Hevo documentation:

Here’s what the data scientist at Hornblower, a global leader in experiences and transportation, has to say about Hevo Data.

Data engineering is like an orchestra where you need the right people to play each instrument of their own, but Hevo Data is like a band on its own. So, you don’t need all the players.

– Karan Singh Khanuja, Data Scientist, Hornblower

Using Hevo as a solution to their data movement needs, they could easily migrate data to the warehouse without spending much on engineering resources. You can read the full story here.

Integrate Oracle to Snowflake
Integrate Oracle to BigQuery
Integrate Oracle to PostgreSQL

Method 2: Manual ETL Process to Set up Oracle to Snowflake Integration

Oracle and Snowflake are two distinct data storage options since their structures are very dissimilar. Although there is no direct way to load data from Oracle to Snowflake, using a mediator that connects to both Oracle and Snowflake can ease the process. Steps to move data from Oracle to Snowflake can be categorized as follows:

Let us go through these steps to connect Oracle to Snowflake in detail.

Step 1: Extract data from Oracle to CSV using SQL*Plus

SQL*Plus is a query tool installed with every Oracle Database Server or Client installation. It can be used to query and redirect the result of an SQL query to a CSV file. The command used for this is: Spool

Eg : 

-- Turn on the spool

    spool spool_file.txt

 -- Run your Query

    select  * from dba_table;

 -- Turn of spooling

    spool off;
  • The spool file will not be visible until the command is turned off
  • If the Spool file doesn’t exist already, a new file will be created. If it exists, it will be overwritten by default. There is an append option from Oracle 10g which can be used to append to an existing file.

Most of the time the data extraction logic will be executed in a Shell script. Here is a very basic example script to extract full data from an Oracle table:

#!/usr/bin/bash

FILE="students.csv"


sqlplus -s user_name/password@oracle_db  <<EOF


SET PAGESIZE 35000

SET COLSEP "|"

SET LINESIZE 230

SET FEEDBACK OFF


SPOOL $FILE

SELECT * FROM EMP;


SPOOL OFF

EXIT

EOF#!/usr/bin/bash


FILE="emp.csv"


sqlplus -s scott/tiger@XE  <<EOF


SET PAGESIZE 50000

SET COLSEP ","

SET LINESIZE 200

SET FEEDBACK OFF


SPOOL $FILE


SELECT * FROM STUDENTS;


SPOOL OFF

EXIT

EOF
  • SET PAGESIZE –  The number of lines per page. The header line will be there on every page. 
  • SET COLSEP  – Setting the column separator.
  • SET LINESIZE  – The number of characters per line. The default is 80. You can set this to a value in a way that the entire record comes within a single line.
  • SET FEEDBACK OFF – In order to prevent logs from appearing in the CSV file, the feedback is put off.
  • SPOOL $FILE – The filename where you want to write the results of the query.
  • SELECT * FROM STUDENTS – The query to be executed to extract data from the table.
  • SPOOL OFF – To stop writing the contents of the SQL session to the file.

Incremental Data Extract

As discussed in the above section, once Spool is on, any SQL can be run and the result will be redirected to the specified file. To extract data incrementally, you need to generate SQL with proper conditions to select only records that are modified after the last data pull.

Eg:

select * from students where last_modified_time > last_pull_time and last_modified_time <= sys_time.

Now the result set will have only changed records after the last pull. 

Step 2: Data type conversion and formatting

While transferring data from Oracle to Snowflake, data might have to be transformed as per business needs. Apart from such use case-specific changes, there are certain important things to be noted for smooth data movement. Also, check out Oracle to MySQL Integration.

  • Many errors can be caused by character sets mismatch in source and target. Note that Snowflake supports all major character sets including UTF-8 and UTF-16. The full list can be found here.
  • While moving data from Oracle to Big Data systems most of the time data integrity might be compromised due to lack of support for SQL constraints. Fortunately, Snowflake supports all SQL constraints like UNIQUE, PRIMARY KEY, FOREIGN KEY, NOT NULL constraints which is a great help for making sure data has moved as expected.
  • Snowflake’s type system covers most primitive and advanced data types which include nested data structures like struct and array. Below is the table with information on Oracle data types and the corresponding Snowflake counterparts.
Oracle Data TypeSnowflake Data Type
NUMBERBYTEINT
NUMBERSMALLINT
NUMBERINTEGER
NUMBERBIGINT
NUMBERDECIMAL
FLOATFLOAT
NUMBERNUMERIC
FLOATCHAR(Up to 16MB)
VARCHAR2/NVARCHAR2VARCHAR(Up to 16MB)
CHAR(n)CHAR VARYING(n)
FLOATREAL
DATEDATE
TIMESTAMP(Only HH:MI:SS)TIME
TIMESTAMP with TIMEZONETIMESTAMP_LTZ
Aliases:
TIMESTAMPLTZ,
TIMESTAMP WITH LOCAL TIME ZONE
BINARY_FLOAT/BINARY_DOUBLEBINARY(Up to 8M)
BINARY_FLOAT/BINARY_DOUBLEVARBINARY
CLOB / VARCHAR2VARIANT
CLOB / VARCHAR2ARRAY
  • Often, date and time formats require a lot of attention while creating data pipelines. Snowflake is quite flexible here as well. If a custom format is used for dates or times in the file to be inserted into the table, this can be explicitly specified using “File Format Option”. The complete list of date and time formats can be found here.

Step 3: Stage Files to S3

To load data from Oracle to Snowflake, it has to be uploaded to a cloud staging area first. If you have your Snowflake instance running on AWS, then the data has to be uploaded to an S3 location that Snowflake has access to. This process is called staging. The snowflake stage can be either internal or external.

A. Internal Stage

If you chose to go with this option, each user and table will be automatically assigned to an internal stage which can be used to stage data related to that user or table. Internal stages can be even created explicitly with a name.

  • For a user, the default internal stage will be named as ‘@~’.
  • For a table, the default internal stage will have the same name as the table.
  • There is no option to alter or drop an internal default stage associated with a user or table.
  • Unlike named stages file format options cannot be set to default user or table stages.

If an internal stage is created explicitly by the user using  SQL statements with a name, many data loading options can be assigned to the stage like file format, date format, etc. When data is loaded to a table through this stage those options are automatically applied.

Note: The rest of this document discusses many Snowflake commands. Snowflake comes with a very intuitive and stable web-based interface to run SQL and commands. However, if you prefer to work with a lightweight command-line utility to interact with the database you might like SnowSQL – a CLI client available in Linux/Mac/Windows to run Snowflake commands. Read more about the tool and options here.

Now let’s have a look at commands to create a stage:

Create a named internal stage my_oracle_stage  and assign some default options:

create or replace stage my_oracle_stage

copy_options= (on_error='skip_file')

file_format= (type = 'CSV' field_delimiter = ',' skip_header = 1);

PUT is the command used to stage files to an internal Snowflake stage. The syntax of the PUT command is:

PUT file://path_to_your_file/your_filename internal_stage_name

Eg:

Upload a file items_data.csv in the /tmp/oracle_data/data/ directory to an internal stage named oracle_stage.

put file:////tmp/oracle_data/data/items_data.csv @oracle_stage;

While uploading the file you can set many configurations to enhance the data load performance like the number of parallelisms, automatic compression, etc. Complete information can be found here

B. External Stage

Let us now look at the external staging option and understand how it differs from the internal stage. Snowflake supports any accessible Amazon S3 or Microsoft Azure as an external staging location. You can create a stage to pointing to the location data that can be loaded directly to the Snowflake table through that stage. No need to move the data to an internal stage.

If you want to create an external stage pointing to an S3 location, IAM credentials with proper access permissions are required. If data needs to be decrypted before loading to Snowflake, proper keys are to be provided. Here is  an example to create an external stage:

create or replace stage oracle_ext_stage url='s3://snowflake_oracle/data/load/files/'

credentials=(aws_key_id='1d318jnsonmb5#dgd4rrb3c' aws_secret_key='aii998nnrcd4kx5y6z');

encryption=(master_key = 'eSxX0jzskjl22bNaaaDuOaO8=');

Once data is extracted from Oracle it can be uploaded to S3 using the direct upload option or using AWS SDK in your favorite programming language. Python’s boto3 is a popular one used under such circumstances. Once data is in S3, an external stage can be created to point to that location. 

Step 4: Copy staged files to the Snowflake table

So far – you have extracted data from Oracle, uploaded it to an S3 location, and created an external Snowflake stage pointing to that location. The next step is to copy data to the table. The command used to do this is COPY INTO. Note: To execute the COPY INTO command, compute resources in Snowflake virtual warehouses are required and your Snowflake credits will be utilized.

Eg:

A. To load from a named internal stage

copy into oracle_table

from @oracle_stage;

B. Loading from the external stage. Only one file is specified.

copy into my_ext_stage_table

from @oracle_ext_stage/tutorials/dataloading/items_ext.csv;

C. You can even copy directly from an external location without creating a stage:

copy into oracle_table

from s3://mybucket/oracle_snow/data/files

credentials=(aws_key_id='$AWS_ACCESS_KEY_ID' aws_secret_key='$AWS_SECRET_ACCESS_KEY')

encryption=(master_key = 'eSxX009jhh76jkIuLPH5r4BD09wOaO8=')

file_format = (format_name = csv_format);

D. Files can be specified using patterns

copy into oracle_pattern_table

from @oracle_stage

file_format = (type = 'TSV')

pattern='.*/.*/.*[.]csv[.]gz';

Some commonly used options for CSV file loading using the COPY command are:

  • DATE_FORMAT – Specify any custom date format you used in the file so that Snowflake can parse it properly.
  • TIME_FORMAT – Specify any custom date format you used in the file.
  • COMPRESSION –  If your data is compressed, specify algorithms used to compress. 
  • RECORD_DELIMITER –  To mention lines separator character.
  • FIELD_DELIMITER – To indicate the character separating fields in the file.
  • SKIP_HEADER – This is the number of header lines to skipped while inserting data into the table.

Step 5: Update Snowflake Table

We have discussed how to extract data incrementally from the Oracle table. Once data is extracted incrementally, it cannot be inserted into the target table directly. There will be new and updated records that have to be treated accordingly.

Earlier in this document, we mentioned that Snowflake supports SQL constraints. Adding to that, another surprising feature from Snowflake is support for row-level data manipulations which makes it easier to handle delta data load. The basic idea is to load incrementally extracted data into an intermediate or temporary table and modify records in the final table with data in the intermediate table. The three methods mentioned below are generally used for this.

1. Update the rows in the target table with new data (with the same keys). Then insert new rows from the intermediate or landing table which are not in the final table.

UPDATE oracle_target_table t  SET t.value = s.value  FROM  landing_delta_table in  WHERE t.id = in.id;
INSERT INTO oracle_target_table (id, value) 
SELECT id, value FROM  landing_delta_table WHERE NOT id IN (SELECT id FROM oracle_target_table);

2. Delete rows from the target table which are also in the landing table. Then insert all rows from the landing table to the final table. Now, the final table will have the latest data without duplicates

DELETE .oracle_target_table f  WHERE f.id IN (SELECT id from landing_table);  INSERT oracle_target_table (id, value)  
SELECT id, value FROM  landing_table;

3. MERGE Statement – Standard SQL merge statement which combines Inserts and updates. It is used to apply changes in the landing table to the target table with one SQL statement

MERGE into oracle_target_table t1 using landing_delta_table t2 on t1.id = t2.id 
WHEN matched then update set value = t2.value 
WHEN not matched then 
INSERT (id, value) values (t2.id, t2.value);

This method of connecting Oracle to Snowflake works when you have a comfortable project timeline and a pool of experienced engineering resources that can build and maintain the pipeline. However, the method mentioned above comes with a lot of coding and maintenance overhead.

Limitations of Manual ETL Process

Here are some of the challenges of migrating from Oracle to Snowflake.

  • Cost: The cost of hiring an ETL Developer to construct an oracle to Snowflake ETL pipeline might not be favorable in terms of expenses. Method 1 is not a cost-efficient option.
  • Maintenance: Maintenance is very important for the data processing system; hence your ETL codes need to be updated regularly due to the fact that development tools upgrade their dependencies and industry standards change. Also, maintenance consumes precious engineering bandwidth which might be utilized elsewhere.
  • Scalability: Indeed, scalability is paramount! ETL systems can fail over time if conditions for processing fails. For example, what if incoming data increases 10X, can your processes handle such a sudden increase in load? A question like this requires serious thinking while opting for the manual ETL Code approach.

If you are using tools like Oracle SQL Developer, then you can export data to CSV using Oracle SQL Developer and use it to migrate it to Snowflake.

Benefits of Replicating Data from Oracle to Snowflake

Many business applications are replicating data from Oracle to Snowflake, not only because of the superior scalability but also because of the other advantages that set Snowflake apart from traditional Oracle environments. Many businesses use an Oracle to Snowflake converter to help facilitate this data migration. 

Some of the benefits of data migration from Oracle to Snowflake include:

  • Snowflake promises high computational power. In case there are many concurrent users running complex queries, the computational power of the Snowflake instance can be changed dynamically. This ensures that there is less waiting time for complex query executions.
  • The agility and elasticity offered by the Snowflake Cloud Data warehouse solution are unmatched. This gives you the liberty to scale only when you needed and pay for what you use. 
  • Snowflake is a completely managed service. This means you can get your analytics projects running with minimal engineering resources. 
  • Snowflake gives you the liberty to work seamlessly with Semi-structured data. Analyzing this in Oracle is super hard.

Oracle to Snowflake Migration Best Practices

  1. Assessment and Planning: Begin by thoroughly assessing your Oracle database. Understand the data volume, complexity, and dependencies. This will help you plan the migration more effectively.
  2. Data Cleansing: Before migration, ensure your data is clean and consistent. It’s easier to handle data issues before they enter Snowflake.
  3. Incremental Migration: Instead of moving everything at once, consider an incremental approach. This allows you to validate the data at each stage and ensures a smoother transition.
  4. Data Transformation: Leverage Snowflake’s capabilities to transform your data during migration. This could involve normalizing data types, applying business logic, or re-architecting the data model for better performance in Snowflake.
  5. Testing and Validation: After migrating, testing and validating your data is crucial. This step ensures that the data has been accurately transferred and performs well in Snowflake.
  6. Optimize and Automate: Post-migration, look for opportunities to optimize your Snowflake setup. Automate regular tasks like data loading and transformation using Snowflake’s built-in features and third-party tools.
  7. Use CDC: Make sure Oracle CDC is enabled while replicating data to snowflake to ensure real-time data sync for faster insights.

By following these best practices, you can ensure a seamless migration from Oracle to Snowflake, setting your data up for success in its new environment.

Common Challenges of Migration from Oracle to Snowflake

Let us also discuss the common challenges you might face while migrating your data from Oracle to Snowflake.

  1. Performance Tuning: Optimizing Snowflake performance to Oracle’s performance levels requires knowledge of Snowflake’s capabilities and the tuning configurations it offers, among many other special features such as clustering keys and auto-scaling.
  2. Architectural Differences: Oracle has a traditional on-premises architecture, while Snowflake has a cloud-native architecture. This makes adapting existing applications and workflows developed for one environment into another quite challenging.
  3. Compatibility Issues: Oracle and Snowflake have differences in SQL dialects, data types, and procedural languages that will have to be changed in queries, scripts, and applications to be migrated for compatibility and optimal performance.

Conclusion

In this article, you have learned about two different approaches to set up Oracle to Snowflake Integration. The manual method involves the use of SQL*Plus and also staging the files to Amazon S3 before copying them into the Snowflake Data Warehouse. This method requires more effort and engineering bandwidth to connect Oracle to Snowflake. Whereas, if you require real-time data replication and looking for a fully automated real-time solution, then Hevo is the right choice for you. The many benefits of migrating from Oracle to Snowflake make it an attractive solution. 

Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand.

FAQs to connect Oracle to Snowflake

1. How do you migrate from Oracle to Snowflake?

To migrate from Oracle to Snowflake, export data from Oracle using tools like Oracle Data Pump or SQL Developer, transform it as necessary, then load it into Snowflake using Snowflake’s COPY command or bulk data loading tools like SnowSQL or third-party ETL tools like Hevo Data.

2. What is the most efficient way to load data into Snowflake?

The most efficient way to load data into Snowflake is through its bulk loading options like Snowflake’s COPY command, which supports loading data in parallel directly from cloud storage (e.g., AWS S3, Azure Blob Storage) into tables, ensuring fast and scalable data ingestion.

3. Why move from SQL Server to Snowflake?

Moving from SQL Server to Snowflake offers advantages such as scalable cloud architecture with separate compute and storage, eliminating infrastructure management, and enabling seamless integration with modern data pipelines and analytics tools for improved performance and cost-efficiency.

mm
Freelance Technical Content Writer, Hevo Data

Faisal loves data science and combines his problem-solving ability and passion for writing to help data teams in solving complex business problems.