Migrating from Oracle to Snowflake? This guide outlines two straightforward methods to move your data. Learn how to leverage Snowflake’s cloud architecture to access insights from your Oracle databases.
Ultimately, you can choose the best of both methods based on your business requirements. Read along to learn how to migrate data seamlessly from Oracle to Snowflake.
Overview of Oracle
Oracle Database is a robust relational database management system (RDBMS) known for its scalability, reliability, and advanced features like high availability and security. Oracle offers an integrated portfolio of cloud services featuring IaaS, PaaS, and SaaS, posing competition to big cloud providers. The company also designs and markets enterprise software solutions in the areas of ERP, CRM, SCM, and HCM, addressing a wide range of industries such as finance, health, and telecommunication institutions.
Key Features of Oracle
- Multimodel database: It has the facility to facilitate multiple data models, including relational, document, graph, and key-value, hence giving flexible facilities in managing diversified kinds of data.
- ACID Compliance: Guarantees transaction integrity through Atomicity, Consistency, Isolation, and Durability; hence, it is reliable for critical business operations.
- High Availability: Advanced features such as Real Application Clusters (RAC), Data Guard, and Flashback Technology ensure continuous uptime, disaster recovery, and data protection.
Overview of Snowflake
Snowflake is a cloud-based data warehousing platform designed for modern data analytics and processing. Snowflake separates compute, storage, and services. Therefore, they may scale independently with a SQL data warehouse for querying and analyzing structured and semi-structured data stored in Amazon S3 or Azure Blob Storage.
Advantages of Snowflake
- Scalability: Using Snowflake, you can automatically scale the compute and storage resources to manage varying workloads without any human intervention.
- Supports Concurrency: Snowflake delivers high performance when dealing with multiple users supporting mixed workloads without performance degradation.
- Efficient Performance: You can achieve optimized query performance through the unique architecture of Snowflake, with particular techniques applied in columnar storage, query optimization, and caching.
Why Snowflake Over Oracle?
When choosing a cloud data platform, you might find yourself weighing Snowflake against Oracle. So, why should you consider Snowflake over Oracle? Snowflake’s architecture is built for the cloud from the ground up, providing unmatched scalability, flexibility, and performance. Unlike Oracle, Snowflake separates storage and compute, allowing you to scale these resources independently. This means you only pay for what you use, leading to significant cost savings.
Moreover, Snowflake’s simplicity is a huge advantage. It’s easy to use with minimal maintenance, which can be a game-changer, especially if you’re dealing with large-scale data operations. It supports a variety of data types, making it a versatile choice for modern data needs. On top of that, Snowflake’s data-sharing capabilities and robust security features provide a secure and seamless way to collaborate and manage your data.
Common Challenges of Migration from Oracle to Snowflake
Let us also discuss the common challenges you might face while migrating your data from Oracle to Snowflake.
- Architectural Differences: Oracle has a traditional on-premises architecture, while Snowflake has a cloud-native architecture. This makes adapting existing applications and workflows developed for one environment into another quite challenging.
- Compatibility Issues: Oracle and Snowflake have differences in SQL dialects, data types, and procedural languages that will have to be changed in queries, scripts, and applications to be migrated for compatibility and optimal performance.
- Performance Tuning: Optimizing Snowflake performance to Oracle’s performance levels requires knowledge of Snowflake’s capabilities and the tuning configurations it offers, among many other special features such as clustering keys and auto-scaling.
Methods to Set up Oracle to Snowflake Integration
There are many ways of loading data from Oracle to Snowflake. In this blog, you will be going to look into two popular ways. Also you can read our article on Snowflake Excel integration.
In the end, you will have a good understanding of each of these two methods. This will help you to make the right decision based on your use case:
Method 1: Using Hevo Data to Set up Oracle to Snowflake Integration
Hevo Data, a No-code Data Pipeline, helps you directly transfer data from Oracle to Snowflake and other Data Warehouses, BI tools, or a destination of your choice in a completely hassle-free & automated manner.
The steps to load data from Oracle to Snowflake using Hevo Data are as follow:
Step 1: Configure Oracle as your Source
- Connect your Oracle account to Hevo’s platform. Hevo has an in-built Oracle Integration that connects to your account within minutes.
- Log in to your Hevo account, and in the Navigation Bar, click PIPELINES.
- Next, in the Pipelines List View, click + CREATE.
- On the Select Source Type page, select Oracle.
- Specify the required information in the Configure your Oracle Source page to complete the source setup.
Step 2: Choose Snowflake as your Destination
- Select Snowflake as your destination and start moving your data.
- If you don’t already have a Snowflake account, read the documentation to know how to create one.
- Log in to your Snowflake account and configure your Snowflake warehouse by running this script.
- Next, obtain your Snowflake URL from your Snowflake warehouse by clicking on Admin > Accounts > LOCATOR.
- On your Hevo dashboard, click DESTINATIONS > + CREATE.
- Select Snowflake as the destination in the Add Destination page.
- Specify the required details in the Configure your Snowflake Warehouse page.
- Click TEST CONNECTION > SAVE & CONTINUE.
With this, you have successfully set up Oracle to Snowflake Integration using Hevo Data.
For more details on Oracle to Snowflake integration, refer the Hevo documentation:
Here’s what the data scientist at Hornblower, a global leader in experiences and transportation, has to say about Hevo Data.
Data engineering is like an orchestra where you need the right people to play each instrument of their own, but Hevo Data is like a band on its own. So, you don’t need all the players.
– Karan Singh Khanuja, Data Scientist, Hornblower
Using Hevo as a solution to their data movement needs, they could easily migrate data to the warehouse without spending much on engineering resources. You can read the full story here.
Integrate Oracle to Snowflake
Integrate Oracle to BigQuery
Integrate Oracle to PostgreSQL
Method 2: Manual ETL Process to Set up Oracle to Snowflake Integration
Oracle and Snowflake are two distinct data storage options since their structures are very dissimilar. Although there is no direct way to load data from Oracle to Snowflake, using a mediator that connects to both Oracle and Snowflake can ease the process. Steps to move data from Oracle to Snowflake can be categorized as follows:
Let us go through these steps to connect Oracle to Snowflake in detail.
Step 1: Extract data from Oracle to CSV using SQL*Plus
SQL*Plus is a query tool installed with every Oracle Database Server or Client installation. It can be used to query and redirect the result of an SQL query to a CSV file. The command used for this is: Spool
Eg :
-- Turn on the spool
spool spool_file.txt
-- Run your Query
select * from dba_table;
-- Turn of spooling
spool off;
- The spool file will not be visible until the command is turned off
- If the Spool file doesn’t exist already, a new file will be created. If it exists, it will be overwritten by default. There is an append option from Oracle 10g which can be used to append to an existing file.
Most of the time the data extraction logic will be executed in a Shell script. Here is a very basic example script to extract full data from an Oracle table:
#!/usr/bin/bash
FILE="students.csv"
sqlplus -s user_name/password@oracle_db <<EOF
SET PAGESIZE 35000
SET COLSEP "|"
SET LINESIZE 230
SET FEEDBACK OFF
SPOOL $FILE
SELECT * FROM EMP;
SPOOL OFF
EXIT
EOF#!/usr/bin/bash
FILE="emp.csv"
sqlplus -s scott/tiger@XE <<EOF
SET PAGESIZE 50000
SET COLSEP ","
SET LINESIZE 200
SET FEEDBACK OFF
SPOOL $FILE
SELECT * FROM STUDENTS;
SPOOL OFF
EXIT
EOF
- SET PAGESIZE – The number of lines per page. The header line will be there on every page.
- SET COLSEP – Setting the column separator.
- SET LINESIZE – The number of characters per line. The default is 80. You can set this to a value in a way that the entire record comes within a single line.
- SET FEEDBACK OFF – In order to prevent logs from appearing in the CSV file, the feedback is put off.
- SPOOL $FILE – The filename where you want to write the results of the query.
- SELECT * FROM STUDENTS – The query to be executed to extract data from the table.
- SPOOL OFF – To stop writing the contents of the SQL session to the file.
Incremental Data Extract
As discussed in the above section, once Spool is on, any SQL can be run and the result will be redirected to the specified file. To extract data incrementally, you need to generate SQL with proper conditions to select only records that are modified after the last data pull.
Eg:
select * from students where last_modified_time > last_pull_time and last_modified_time <= sys_time.
Now the result set will have only changed records after the last pull.
Integrate your data seamlessly
No credit card required
Step 2: Data type conversion and formatting
While transferring data from Oracle to Snowflake, data might have to be transformed as per business needs. Apart from such use case-specific changes, there are certain important things to be noted for smooth data movement. Also, check out Oracle to MySQL Integration.
- Many errors can be caused by character sets mismatch in source and target. Note that Snowflake supports all major character sets including UTF-8 and UTF-16. The full list can be found here.
- While moving data from Oracle to Big Data systems most of the time data integrity might be compromised due to lack of support for SQL constraints. Fortunately, Snowflake supports all SQL constraints like UNIQUE, PRIMARY KEY, FOREIGN KEY, NOT NULL constraints which is a great help for making sure data has moved as expected.
- Snowflake’s type system covers most primitive and advanced data types which include nested data structures like struct and array. Below is the table with information on Oracle data types and the corresponding Snowflake counterparts.
Oracle Data Type | Snowflake Data Type |
NUMBER | BYTEINT |
NUMBER | SMALLINT |
NUMBER | INTEGER |
NUMBER | BIGINT |
NUMBER | DECIMAL |
FLOAT | FLOAT |
NUMBER | NUMERIC |
FLOAT | CHAR(Up to 16MB) |
VARCHAR2/NVARCHAR2 | VARCHAR(Up to 16MB) |
CHAR(n) | CHAR VARYING(n) |
FLOAT | REAL |
DATE | DATE |
TIMESTAMP(Only HH:MI:SS) | TIME |
TIMESTAMP with TIMEZONE | TIMESTAMP_LTZ Aliases: TIMESTAMPLTZ, TIMESTAMP WITH LOCAL TIME ZONE |
BINARY_FLOAT/BINARY_DOUBLE | BINARY(Up to 8M) |
BINARY_FLOAT/BINARY_DOUBLE | VARBINARY |
CLOB / VARCHAR2 | VARIANT |
CLOB / VARCHAR2 | ARRAY |
- Often, date and time formats require a lot of attention while creating data pipelines. Snowflake is quite flexible here as well. If a custom format is used for dates or times in the file to be inserted into the table, this can be explicitly specified using “File Format Option”. The complete list of date and time formats can be found here.
Step 3: Stage Files to S3
To load data from Oracle to Snowflake, it has to be uploaded to a cloud staging area first. If you have your Snowflake instance running on AWS, then the data has to be uploaded to an S3 location that Snowflake has access to. This process is called staging. The snowflake stage can be either internal or external.
Internal Stage
If you chose to go with this option, each user and table will be automatically assigned to an internal stage which can be used to stage data related to that user or table. Internal stages can be even created explicitly with a name.
- For a user, the default internal stage will be named as ‘@~’.
- For a table, the default internal stage will have the same name as the table.
- There is no option to alter or drop an internal default stage associated with a user or table.
- Unlike named stages file format options cannot be set to default user or table stages.
If an internal stage is created explicitly by the user using SQL statements with a name, many data loading options can be assigned to the stage like file format, date format, etc. When data is loaded to a table through this stage those options are automatically applied.
Note: The rest of this document discusses many Snowflake commands. Snowflake comes with a very intuitive and stable web-based interface to run SQL and commands. However, if you prefer to work with a lightweight command-line utility to interact with the database you might like SnowSQL – a CLI client available in Linux/Mac/Windows to run Snowflake commands. Read more about the tool and options here.
Now let’s have a look at commands to create a stage:
Create a named internal stage my_oracle_stage and assign some default options:
create or replace stage my_oracle_stage
copy_options= (on_error='skip_file')
file_format= (type = 'CSV' field_delimiter = ',' skip_header = 1);
PUT
is the command used to stage files to an internal Snowflake stage. The syntax of the PUT command is:
PUT file://path_to_your_file/your_filename internal_stage_name
Eg:
Upload a file items_data.csv in the /tmp/oracle_data/data/ directory to an internal stage named oracle_stage.
put file:////tmp/oracle_data/data/items_data.csv @oracle_stage;
While uploading the file you can set many configurations to enhance the data load performance like the number of parallelisms, automatic compression, etc. Complete information can be found here.
External Stage
Let us now look at the external staging option and understand how it differs from the internal stage. Snowflake supports any accessible Amazon S3 or Microsoft Azure as an external staging location. You can create a stage to pointing to the location data that can be loaded directly to the Snowflake table through that stage. No need to move the data to an internal stage.
If you want to create an external stage pointing to an S3 location, IAM credentials with proper access permissions are required. If data needs to be decrypted before loading to Snowflake, proper keys are to be provided. Here is an example to create an external stage:
create or replace stage oracle_ext_stage url='s3://snowflake_oracle/data/load/files/'
credentials=(aws_key_id='1d318jnsonmb5#dgd4rrb3c' aws_secret_key='aii998nnrcd4kx5y6z');
encryption=(master_key = 'eSxX0jzskjl22bNaaaDuOaO8=');
Once data is extracted from Oracle it can be uploaded to S3 using the direct upload option or using AWS SDK in your favorite programming language. Python’s boto3 is a popular one used under such circumstances. Once data is in S3, an external stage can be created to point to that location.
Step 4: Copy staged files to Snowflake table
So far – you have extracted data from Oracle, uploaded it to an S3 location, and created an external Snowflake stage pointing to that location. The next step is to copy data to the table. The command used to do this is COPY INTO.
Note: To execute the COPY INTO
command, compute resources in Snowflake virtual warehouses are required and your Snowflake credits will be utilized.
Eg:
To load from a named internal stage
copy into oracle_table
from @oracle_stage;
Loading from the external stage. Only one file is specified.
copy into my_ext_stage_table
from @oracle_ext_stage/tutorials/dataloading/items_ext.csv;
You can even copy directly from an external location without creating a stage:
copy into oracle_table
from s3://mybucket/oracle_snow/data/files
credentials=(aws_key_id='$AWS_ACCESS_KEY_ID' aws_secret_key='$AWS_SECRET_ACCESS_KEY')
encryption=(master_key = 'eSxX009jhh76jkIuLPH5r4BD09wOaO8=')
file_format = (format_name = csv_format);
Files can be specified using patterns
copy into oracle_pattern_table
from @oracle_stage
file_format = (type = 'TSV')
pattern='.*/.*/.*[.]csv[.]gz';
Some commonly used options for CSV file loading using the COPY command are:
- DATE_FORMAT – Specify any custom date format you used in the file so that Snowflake can parse it properly.
- TIME_FORMAT – Specify any custom date format you used in the file.
- COMPRESSION – If your data is compressed, specify algorithms used to compress.
- RECORD_DELIMITER – To mention lines separator character.
- FIELD_DELIMITER – To indicate the character separating fields in the file.
- SKIP_HEADER – This is the number of header lines to skipped while inserting data into the table.
Update Snowflake Table
We have discussed how to extract data incrementally from the Oracle table. Once data is extracted incrementally, it cannot be inserted into the target table directly. There will be new and updated records that have to be treated accordingly.
Earlier in this document, we mentioned that Snowflake supports SQL constraints. Adding to that, another surprising feature from Snowflake is support for row-level data manipulations which makes it easier to handle delta data load. The basic idea is to load incrementally extracted data into an intermediate or temporary table and modify records in the final table with data in the intermediate table. The three methods mentioned below are generally used for this.
1. Update the rows in the target table with new data (with the same keys). Then insert new rows from the intermediate or landing table which are not in the final table.
UPDATE oracle_target_table t SET t.value = s.value FROM landing_delta_table in WHERE t.id = in.id;
INSERT INTO oracle_target_table (id, value)
SELECT id, value FROM landing_delta_table WHERE NOT id IN (SELECT id FROM oracle_target_table);
2. Delete rows from the target table which are also in the landing table. Then insert all rows from the landing table to the final table. Now, the final table will have the latest data without duplicates
DELETE .oracle_target_table f WHERE f.id IN (SELECT id from landing_table); INSERT oracle_target_table (id, value)
SELECT id, value FROM landing_table;
3. MERGE Statement – Standard SQL merge statement which combines Inserts and updates. It is used to apply changes in the landing table to the target table with one SQL statement
MERGE into oracle_target_table t1 using landing_delta_table t2 on t1.id = t2.id
WHEN matched then update set value = t2.value
WHEN not matched then
INSERT (id, value) values (t2.id, t2.value);
This method of connecting Oracle to Snowflake works when you have a comfortable project timeline and a pool of experienced engineering resources that can build and maintain the pipeline. However, the method mentioned above comes with a lot of coding and maintenance overhead.
Limitations of Manual ETL Process
Here are some of the challenges of migrating from Oracle to Snowflake.
- Cost: The cost of hiring an ETL Developer to construct an oracle to Snowflake ETL pipeline might not be favorable in terms of expenses. Method 1 is not a cost-efficient option.
- Maintenance: Maintenance is very important for the data processing system; hence your ETL codes need to be updated regularly due to the fact that development tools upgrade their dependencies and industry standards change. Also, maintenance consumes precious engineering bandwidth which might be utilized elsewhere.
- Scalability: Indeed, scalability is paramount! ETL systems can fail over time if conditions for processing fails. For example, what if incoming data increases 10X, can your processes handle such a sudden increase in load? A question like this requires serious thinking while opting for the manual ETL Code approach.
Benefits of Replicating Data from Oracle to Snowflake
Many business applications are replicating data from Oracle to Snowflake, not only because of the superior scalability but also because of the other advantages that set Snowflake apart from traditional Oracle environments. Many businesses use an Oracle to Snowflake converter to help facilitate this data migration.
Some of the benefits of data migration from Oracle to Snowflake include:
- Snowflake promises high computational power. In case there are many concurrent users running complex queries, the computational power of the Snowflake instance can be changed dynamically. This ensures that there is less waiting time for complex query executions.
- The agility and elasticity offered by the Snowflake Cloud Data warehouse solution are unmatched. This gives you the liberty to scale only when you needed and pay for what you use.
- Snowflake is a completely managed service. This means you can get your analytics projects running with minimal engineering resources.
- Snowflake gives you the liberty to work seamlessly with Semi-structured data. Analyzing this in Oracle is super hard.
Oracle to Snowflake Migration Best Practices
- Assessment and Planning: Begin by thoroughly assessing your Oracle database. Understand the data volume, complexity, and dependencies. This will help you plan the migration more effectively.
- Data Cleansing: Before migration, ensure your data is clean and consistent. It’s easier to handle data issues before they enter Snowflake.
- Incremental Migration: Instead of moving everything at once, consider an incremental approach. This allows you to validate the data at each stage and ensures a smoother transition.
- Data Transformation: Leverage Snowflake’s capabilities to transform your data during migration. This could involve normalizing data types, applying business logic, or re-architecting the data model for better performance in Snowflake.
- Testing and Validation: After migrating, testing and validating your data is crucial. This step ensures that the data has been accurately transferred and performs well in Snowflake.
- Optimize and Automate: Post-migration, look for opportunities to optimize your Snowflake setup. Automate regular tasks like data loading and transformation using Snowflake’s built-in features and third-party tools.
By following these best practices, you can ensure a seamless migration from Oracle to Snowflake, setting your data up for success in its new environment.
Conclusion
In this article, you have learned about two different approaches to set up Oracle to Snowflake Integration. The manual method involves the use of SQL*Plus and also staging the files to Amazon S3 before copying them into the Snowflake Data Warehouse. This method requires more effort and engineering bandwidth to connect Oracle to Snowflake. Whereas, if you require real-time data replication and looking for a fully automated real-time solution, then Hevo is the right choice for you. The many benefits of migrating from Oracle to Snowflake make it an attractive solution.
Learn more about Hevo
Want to try Hevo?
Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand.
FAQs to connect Oracle to Snowflake
1. How do you migrate from Oracle to Snowflake?
To migrate from Oracle to Snowflake, export data from Oracle using tools like Oracle Data Pump or SQL Developer, transform it as necessary, then load it into Snowflake using Snowflake’s COPY command or bulk data loading tools like SnowSQL or third-party ETL tools like Hevo Data.
2. What is the most efficient way to load data into Snowflake?
The most efficient way to load data into Snowflake is through its bulk loading options like Snowflake’s COPY command, which supports loading data in parallel directly from cloud storage (e.g., AWS S3, Azure Blob Storage) into tables, ensuring fast and scalable data ingestion.
3. Why move from SQL Server to Snowflake?
Moving from SQL Server to Snowflake offers advantages such as scalable cloud architecture with separate compute and storage, eliminating infrastructure management, and enabling seamless integration with modern data pipelines and analytics tools for improved performance and cost-efficiency.
Faisal loves data science and combines his problem-solving ability and passion for writing to help data teams in solving complex business problems.