Snowflake is great if you have big data needs. It offers scalable computing and limitless size in a traditional SQL and Data Warehouse setting. If you have a relatively small dataset or low concurrency/load then you won’t see the benefits of Snowflake.

Simply put, Snowflake has a friendly UI, and unlimited storage capacity, along with the control, security, and performance you’d expect for a Data Warehouse, something SQL Server is not. Snowflake’s unique Cloud Architecture enables unlimited scale and concurrency without resource contention, the ‘Holy Grail’ of Data Warehousing.

One of the biggest challenges of migrating data from SQL server to Snowflake is choosing from all the different options available. This blog post covers the detailed steps of 4 methods that you need to follow for SQL Server to Snowflake migration. Read along and decide, which method suits you the best!

What is MS SQL Server?

SQL Server Logo

Microsoft SQL Server (MS SQL Server) is a relational database management system (RDBMS) developed by Microsoft. It is used to store and retrieve data as requested by other software applications, which may run either on the same computer or on another computer across a network. MS SQL Server is designed to handle a wide range of data management tasks and supports various transaction processing, business intelligence, and analytics applications.

What is Snowflake?

Snowflake Logo

Snowflake is a cloud-based data warehousing platform that is designed to handle large-scale data storage, processing, and analytics. It stands out due to its architecture, which separates compute, storage, and services, offering flexibility, scalability, and performance improvements over traditional data warehouses.

Load your data from MS SQL Server to Snowflake
Load your data from Salesforce to Snowflake
Load your data from MongoDB to Snowflake

Methods to Connect SQL Server to Snowflake

The following 4 methods can be used to transfer data from Microsoft SQL server to Snowflake easily:

Method 1: Using SnowSQL to Connect Microsoft SQL Server to Snowflake

To migrate data from Microsoft SQL Server to Snowflake, you must perform the following steps:

  • Step 1: Export data from SQL server using SQL Server Management Studio
  • Step 2: Upload the CSV file to an Amazon S3 Bucket using the web console
  • Step 3: Upload data to Snowflake From S3

Step 1: Export Data from SQL Server Using SQL Server Management Studio

SQL Server Management Studio is a data management and administration software application that launched with SQL Server.

You will use it to extract data from a SQL database and export it to CSV format. The steps to achieve this are:

  1. Install SQL Server Management Studio if you don’t have it on your local machine. 
  2. Launch the SQL Server Management Studio and connect to your SQL Server. 
  3. From the Object Explorer window, select the database you want to export and right-click on the context menu in the Tasks sub-menu and choose the Export data option to export table data in CSV.
  4. The SQL Server Import and Export Wizard welcome window will pop up. At this point, you need to select the Data source you want to copy from the drop-down menu.
  5. After that, you need to select SQL Server Native Client 11.0 as the data source. 
  6. Select an SQL Server instance from the drop-down input box. 
  7. Under Authentication, select “Use Windows Authentication”.
  8. Just below that, you get a Database drop-down box, and from here you select the database from which data will be copied. 
  9. Once you’re done filling out all the inputs, click on the Next button. 
  10. The next window is the Choose a Destination window. Under the destination drop-down box, select the Flat File Destination for copying data from SQL Server to CSV.
  11. Under File name, select the CSV file that you want to write to and click on the Next button.
  12. In the next screen, select Copy data from one or more tables or views and click Next to proceed.
  13. A “Configure Flat File Destination”  screen will appear, and here you are going to select the table from the Source table or view. This action will export the data to the CSV file. Click Next to continue.
  14. You don’t want to change anything on the Save and Run Package window so just click Next.
  15. The next window is the Complete Wizard window which shows a list of choices that you have selected during the exporting process. Counter-check everything and if everything checks out, click the Finish button to begin exporting your SQL database to CSV.
  16. The final window shows you whether the exporting process was successful or not. If the exporting process is finished successfully, you will see a similar output to what’s shown below.

Step 2: Upload the CSV File to an Amazon S3 Bucket Using the Web Console

After completing the exporting process to your local machine, the next step in the data transfer process from SQL Server to Snowflake is to transfer the CSV file to Amazon S3.

Steps to upload a CSV file to Amazon S3:

  1. Start by creating a storage bucket.
    1. Go to the AWS S3 Console
    2. Click the Create Bucket button and enter a unique name for your bucket on the form.
    3. Choose the AWS Region where you’d like to store your data.
    4. Create a new S3 bucket.
  2. Create the directory that will hold your CSV file.
    1. In the Buckets pane, click on the name of the bucket that you created.
    2. Click on the Actions button, and select the Create Folder option.
    3. Enter a unique name for your new folder and click Create.
  3. Upload the CSV file to your S3 bucket.
    1. Select the folder you’ve just created in the previous step.
    2. Select Files wizard and then click on the Add Files button in the upload section.
    3. Next, a file selection dialog box will open. Here you will select the CSV file you exported earlier and then click Open.
    4. Click on the Start Upload button and you are done!

Step 3: Upload Data to Snowflake From S3

Since you already have an Amazon Web Services (AWS) account and you are storing your data files in an S3 bucket, you can leverage your existing bucket and folder paths for bulk loading into Snowflake.

To allow Snowflake to read data from and write data to an Amazon S3 bucket, you first need to configure a storage integration object to delegate authentication responsibility for external cloud storage to a Snowflake identity and access management (IAM) entity.

Step 3.1: Define Read-Write Access Permissions for the AWS S3 Bucket

Allow the following actions:

  • “s3:PutObject”
  • “s3:GetObject”
  • “s3:GetObjectVersion”
  • “s3:DeleteObject”
  • “s3:DeleteObjectVersion”
  • “s3:ListBucket”

The following sample policy grants read-write access to objects in your S3 bucket.

{
     "Version": "2012-10-17",
     "Statement": [
         {
             "Sid": "AllowListingOfUserFolder",
             "Action": [
                 "s3:ListBucket",
                 "s3:GetBucketLocation"
             ],
             "Effect": "Allow",
             "Resource": [
                 "arn:aws:s3:::bucket_name"
             ]
         },
         {
             "Sid": "HomeDirObjectAccess",
             "Effect": "Allow",
             "Action": [
                 "s3:PutObject",
                 "s3:GetObject",
                 "s3:DeleteObjectVersion",
                 "s3:DeleteObject",
                 "s3:GetObjectVersion"
             ],
             "Resource": "arn:aws:s3:::bucket_name/*"
         }
     ]
 }


For a detailed explanation of how to grant access to your S3 bucket, check out this link.

Step 3.2: Create an AWS IAM Role and record your IAM Role ARN value located on the role summary page because we are going to need it later on.

Step 3.3: Create a cloud storage integration using the STORAGE INTEGRATION command. 

CREATE STORAGE INTEGRATION <integration_name>
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = S3
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = '<iam_role>'
  STORAGE_ALLOWED_LOCATIONS = ('s3://<bucket>/<path>/', 's3://<bucket>/<path>/')
  [ STORAGE_BLOCKED_LOCATIONS = ('s3://<bucket>/<path>/', 's3://<bucket>/<path>/') ]

Where:

  • <integration_name> is the name of the new integration.
  • <iam_role is> the Amazon Resource Name (ARN) of the role you just created.
  • <bucket> is the name of an S3 bucket that stores your data files.
  • <path> is an optional path that can be used to provide granular control over objects in the bucket.

Step 3.4: Recover the AWS IAM User for your Snowflake Account

  1. Execute the DESCRIBE INTEGRATION command to retrieve the ARN for the AWS IAM user that was created automatically for your Snowflake account:
    DESC INTEGRATION <integration_name>;
  2. Record the following values:
ValueDescription
STORAGE_AWS711/_IAM_USER_ARNThe AWS IAM user created for your Snowflake account.
STORAGE_AWS_EXTERNAL_IDThe external ID that is needed to establish a trust relationship.

Step 3.5: Grant the IAM User Permissions to Access Bucket Objects

  1. Log into the AWS Management Console and from the console dashboard, select IAM.
  2. Navigate to the left-hand navigation pane and select Roles and choose your IAM Role.
  3. Select Trust Relationships followed by Edit Trust Relationship.
  4. Modify the policy document with the IAM_USER_ARNand STORAGE_AWS_EXTERNAL_ID output values you recorded in the previous step.
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "",
          "Effect": "Allow",
          "Principal": {
            "AWS": "<IAM_USER_ARN>"
          },
          "Action": "sts:AssumeRole",
          "Condition": {
            "StringEquals": {
              "sts:ExternalId": "<STORAGE_AWS_EXTERNAL_ID>"
            }
          }
        }
      ]
    }
  5. Click the Update Trust Policy button to save the changes.

Step 3.6: Create an External Stage that references the storage integration you created

grant create stage on schema public to role <IAM_ROLE>;

grant usage on integration s3_int to role <IAM_ROLE>;

use schema mydb.public;

create stage my_s3_stage
  storage_integration = s3_int
  url = 's3://bucket1/path1'
  file_format = my_csv_format;

Step 3.7: Execute COPY INTO <table> SQL command to load data from your staged files into the target table using the Snowflake client, SnowSQL.

Seeing that we have already configured an AWS IAM role with the required policies and permissions to access your external S3 bucket, we have already created an S3 stage. Now that we have a stage built in Snowflake pulling this data into your tables will be extremely simple.

 copy into mytable
  from s3://mybucket credentials=(aws_key_id='$AWS_ACCESS_KEY_ID' aws_secret_key='$AWS_SECRET_ACCESS_KEY')
  file_format = (type = csv field_delimiter = '|' skip_header = 1);

This SQL command loads data from all files in the S3 bucket to your Snowflake Warehouse.

SQL Server to Snowflake: Limitations and Challenges of Using Custom Code Method

The above method of connecting SQL Server to Snowflake comes along with the following limitations:

  • This method is only intended for files that do not exceed 160GB. Anything above that will require you to use the Amazon S3 REST API. 
  • This method doesn’t support real-time data streaming from SQL Server into your Snowflake DW.
  • If your organization has a use case for Change Data Capture (CDC), then you could create a data pipeline using Snowpipe.
  • Also, although this is one of the most popular methods of connecting SQL Server to Snowflake, there are a lot of steps that you need to get right to achieve a seamless migration. Some of you might even go as far as to consider this approach to be cumbersome and error-prone.

Method 2: Using Custom ETL Scripts

Custom ETL scripts are programs that extract, transform, and load data from SQL Server to Snowflake. They require coding skills and knowledge of both databases.

To use custom ETL scripts, you need to:

1. Install the Snowflake ODBC driver or a client library for your language (e.g., Python, Java, etc.).

2. Get the connection details for Snowflake (e.g., account name, username, password, warehouse, database, schema, etc.).

3. Choose a language and set up the libraries to interact with SQL Server and Snowflake.

4. Write a SQL query to extract the data you want from SQL Server. Use this query in your script to pull the data. 

Drawbacks of Utilizing ETL Scripts

While employing custom ETL scripts to transfer data from SQL Server to Snowflake offers advantages, it also presents potential drawbacks:

  1. Complexity and Maintenance Burden: Custom scripts demand more resources for development, testing, and upkeep compared to user-friendly ETL tools, particularly as data sources or requirements evolve.
  2. Limited Scalability: Custom scripts may struggle to efficiently handle large data volumes or intricate transformations, potentially resulting in performance challenges unlike specialized ETL tools.
  3. Security Risks: Managing credentials and sensitive data within scripts requires meticulous attention to security. Storing passwords directly within scripts can pose significant security vulnerabilities if not adequately safeguarded.
  4. Minimal Monitoring and Logging Capabilities: Custom scripts may lack advanced monitoring and logging features, necessitating additional development effort to establish comprehensive tracking mechanisms.
  5. Extended Development Duration: Developing custom scripts often takes longer compared to configuring ETL processes within visual tools.

Method 3: Using Hevo Data to Connect SQL Server to Snowflake

Hevo is the only real-time ELT No-code Data Pipeline platform that cost-effectively automates flexible data pipelines to your needs. With integration with 150+ Data Sources (40+ free sources), we help you not only export data from sources & load data to the destinations but also transform & enrich your data, & make it analysis-ready.

The following steps are required to connect Microsoft SQL Server to Snowflake using Hevo’s Data Pipeline:

Step 1: Connect to your Microsoft SQL Server source.

  • Click PIPELINES in the Navigation Bar.
  • Click + CREATE in the Pipelines List View.
  • Select SQL Server as your source.
  • In the Configure your SQL Server Source page, specify the following:
SQL Server to Snowflake

You can read more about using SQL server as a source connector for Hevo here.

Step 2: Configure your Snowflake Data Warehouse as Destination

  • Click DESTINATIONS in the Navigation Bar.
  • Click + CREATE in the Destinations List View.
  • In the Add Destination page, select Snowflake as the Destination type.
  • In the Configure your Snowflake Warehouse page, specify the following:
Configuring Snowflake Warehouse

This is how simple it can be to load data from SQL Server to Snowflake using Hevo.

Method 4: SQL Server to Snowflake Using Snowpipe

Snowpipe is a feature of Snowflake that allows you to load data from external sources into Snowflake tables automatically and continuously. Here are the steps involved in this method:

1. Create an external stage in Snowflake that points to an S3 bucket where you will store the CSV file.

2. Create an external stage in Snowflake that points to an S3 bucket where you will store the CSV file.

3. Create a pipe in Snowflake that copies data from the external stage to the table. Enable auto-ingest and specify the file format as CSV.

4. Enable Snowpipe with the below command

ALTER ACCOUNT SET PIPE_EXECUTION_PAUSED = FALSE;

5. Install the Snowpipe JDBC driver on your local machine and create a batch file to export data from SQL Server to CSV File.

6. Schedule the batch file to run regularly using a tool like Windows Task Scheduler or Cron. Check out this documentation for more details. 

Drawbacks of Snowpipe Method

Here are some key limitations of using Snowpipe for data migration from SQL Server to Snowflake: 

  • File Size Restrictions: Snowflake imposes a per-file size limit for direct ingestion (around 160GB). Files exceeding this necessitate additional steps like splitting them or using the S3 REST API, adding complexity.
  • Real-Time/CDC Challenges: Snowpipe is ideal for micro-batches and near real-time ingestion. But, it isn’t built for true real-time continuous data capture (CDC) of every single change happening in your SQL Server.
  • Error Handling: Error handling for failed file loads through Snowpipe can become a bit nuanced. You need to configure options like ON_ERROR = CONTINUE in your COPY INTO statements to prevent individual file failures from stopping the entire load process.
  • Transformation Limitations: Snowpipe primarily handles loading data into Snowflake. For complex transformations during the migration process, you may need a separate ETL/ELT tool to work with the Snowpipe-loaded data within Snowflake.

Why migrate data from MS SQL Server to Snowflake?

  • Enhanced Scalability and Elasticity: MSSQL Server, while scalable, often requires manual infrastructure provisioning for scaling compute resources. Snowflake’s cloud-based architecture offers elastic scaling, allowing you to easily adjust compute power up or down based on workload demands. You only pay for the resources you use, leading to potentially significant cost savings.
  • Reduced Operational Burden: Managing and maintaining on-premises infrastructure associated with MSSQL Server can be resource-intensive. Snowflake handles all infrastructure management, freeing up your IT team to focus on core data initiatives.
  • Performance and Concurrency: Snowflake’s architecture is designed to handle high concurrency and provide fast query performance, making it suitable for demanding analytical workloads and large-scale data processing.

Additional Resources on SQL Server to Snowflake

Conclusion

  • The article introduced you to how to migrate data from SQL server to Snowflake. It also provided a step-by-step guide of 4 methods using which you can connect your Microsoft SQL Server to Snowflake easily.
  • The article also talked about the limitations and benefits associated with these methods. The manual method using SnowSQL works fine when it comes to transferring data from Microsoft SQL Server to Snowflake, but there are still numerous limitations to it.

What are your thoughts about the different approaches to moving data from Microsoft SQL Server to Snowflake? Let us know in the comments.

Talha
Software Developer, Hevo Data

Talha is a Software Developer with over eight years of experience in the field. He is currently driving advancements in data integration at Hevo Data, where he has been instrumental in shaping a cutting-edge data integration platform for the past four years. Prior to this, he spent 4 years at Flipkart, where he played a key role in projects related to their data integration capabilities. Talha loves to explain complex information related to data engineering to his peers through writing. He has written many blogs related to data integration, data management aspects, and key challenges data practitioners face.

All your customer data in one place.