Microsoft SQL Server to Snowflake: 2 Easy Methods

on Tutorial, Data Integration, Data Warehouse, Database, ETL, MySQL, snowflake • February 21st, 2020 • Write for Hevo

Microsoft SQL Server is kind of a swiss army knife for most SME needs and workloads. However there are a handful of things that SQL Server will be better at, and there’s a handful of things Snowflake will be better at.

Snowflake is great if you have big data needs. It offers scalable compute and limitless size in a traditional SQL and Data Warehouse setting. If you have a relatively small dataset or low concurrency/load then you won’t see the benefits of Snowflake.

Simply put, Snowflake has a friendly UI,  unlimited storage capacity, along with the control, security, and performance you’d expect for a Data Warehouse, something SQL Server is not. Snowflake’s unique Cloud Architecture enables unlimited scale and concurrency without resource contention ,the ‘Holy Grail’ of Data Warehousing.

One of the biggest challenges of migrating data from Microsoft SQL Server to Snowflake is choosing from all the different options available. This blog post covers the detailed steps of 2 methods which you need to follow to migrate data from Microsoft SQL Server to Snowflake. Read along and decide, which method suits you the best!

Table of Contents

Introduction to Microsoft SQL Server

Microsoft SQL Logo
Image Source

Microsoft’s SQL Server is a database server, and as such, it primarily stores and retrieves data. It is an implementation of both the Relational Database Management System (RDBMS) and the Structured Query Language (SQL). Today, many specialized versions of SQL Server exist catering to a wide spectrum of workloads and demands. For example, there is a data center version tailored to higher levels of application support and scale, and a scaled-down version available as freeware.

To know more about Microsoft SQL Server, visit here.

Introduction to Snowflake

Snowflake Logo
Image Source

Snowflake, on the other hand, is an analytics database built for the Cloud and delivered as a Data Warehouse-as-a-Service (DWaaS). Snowflake runs on AWS, the world’s most popular cloud provider. Similar to other Databases, you can load and query any structured relational data in Snowflake tables using standard SQL data types e.g. NUMBER, BOOLEAN, VARCHAR, TIMESTAMPS, etc. 

To learn more about, Snowflake visit here.

Methods to Connect Microsoft SQL Server to Snowflake

Method 1: Using SnowSQL to Connect Microsoft SQL Server to Snowflake

SnowSQL can be efficiently used to transfer data from Microsoft SQL Server to Snowflake. The Snowflake Data Warehouse platform on AWS is backed by EC2 instances for computing and S3 buckets for storage, so it makes sense that you will be staging your data in an AWS S3 bucket.

Method 2: Using Hevo Data to Connect Microsoft SQL Server to Snowflake

Hevo Data provides a hassle-free solution and helps you directly transfer data from Microsoft SQL Server to Snowflake and numerous other Databases/Data Warehouses or destinations of your choice without any intervention in an effortless manner. Hevo is fully managed and completely automates the process of not only loading data from your desired source but also enriching the data and transforming it into an analysis-ready form without having to write a single line of code. Hevo’s pre-built integration with Snowflake along with 100+ other data sources (including 30+ free data sources) will take full charge of the data transfer process, allowing you to focus on key business activities for free.

Get Started with Hevo for Free

Methods to Connect Microsoft SQL Server to Snowflake

The following 2 methods can be used to transfer data from Microsoft SQL Server to Snowflake easily:

Method 1: Using SnowSQL to Connect Microsoft SQL Server to Snowflake

To migrate data from Microsoft SQL Server to Snowflake, you must perform the following steps:

Step 1: Export Data from SQL Server Using SQL Server Management Studio

SQL Server Management Studio is a data management and administration software application that launched with SQL Server.

You will use it to extract data from a SQL database and export it to CSV format. The steps to achieve this are:

  1. Install SQL Server Management Studio if you don’t have it on your local machine. 
  2. Launch the SQL Server Management Studio and connect to your SQL Server. 
  3. From the Object Explorer window, select the database you want to export and right-click on the context menu in the Tasks sub-menu and choose the Export data option to export table data in CSV.

  4. The SQL Server Import and Export Wizard welcome window will pop up. At this point, you need to select the Data source you want to copy from the drop-down menu.
  5. After that, you need to select SQL Server Native Client 11.0 as the data source. 
  6. Select an SQL Server instance from the drop-down input box. 
  7. Under Authentication, select “Use Windows Authentication”.
  8. Just below that, you get a Database drop-down box, and from here you select the database from which data will be copied. 
  9. Once you’re done filling out all the inputs, click on the Next button. 
  10. The next window is the Choose a Destination window. Under the destination drop-down box, select the Flat File Destination for copying data from SQL Server to CSV.
  11. Under File name, select the CSV file that you want to write to and click on the Next button.
  12. In the next screen, select Copy data from one or more tables or views and click Next to proceed.
  13. A “Configure Flat File Destination”  screen will appear, and here you are going to select the table from the Source table or view. This action will export the data to the CSV file. Click Next to continue.
  14. You don’t want to change anything on the Save and Run Package window so just click Next.
  15. The next window is the Complete Wizard window which shows a list of choices that you have selected during the exporting process. Counter check everything and if everything checks out, click the Finish button to begin exporting your SQL database to CSV.
  16. The final window shows you whether the exporting process was successful or not. If the exporting process is finished successfully, you will see a similar output to what’s shown below.

Step 2: Upload the CSV File to an Amazon S3 Bucket Using the Web Console

After completing the exporting process to your local machine, the next step in the data transfer process from Microsoft SQL Server to Snowflake is to transfer the CSV file to Amazon S3.

Steps to upload a CSV file to Amazon S3:

  1. Start by creating a storage bucket.
    1. Go to the AWS S3 Console https://console.aws.amazon.com/s3/.
    2. Click the Create Bucket button and enter a unique name for your bucket on the form.
    3. Choose the AWS Region where you’d like to store your data.
    4. Create a new S3 bucket.
  2. Create the directory that will hold your CSV file.
    1. In the Buckets pane, click on the name of the bucket that you created.
    2. Click on the Actions button, and select the Create Folder option.
    3. Enter a unique name for your new folder and click Create.
  3. Upload the CSV file to your S3 bucket.
    1. Select the folder you’ve just created in the previous step.
    2. Select Files wizard and then click on the Add Files button in the upload section.
    3. Next, a file selection dialog box will open. Here you will select the CSV file you exported earlier and then click Open.
    4. Click on the Start Upload button and you are done!

Step 3: Upload Data to Snowflake From S3

Since you already have an Amazon Web Services (AWS) account and you are storing your data files in an S3 bucket, you can leverage your existing bucket and folder paths for bulk loading into Snowflake.

To allow Snowflake to read data from and write data to an Amazon S3 bucket, you first need to configure a storage integration object to delegate authentication responsibility for external cloud storage to a Snowflake identity and access management (IAM) entity.

Step 3.1: Define Read-Write Access Permissions for the AWS S3 Bucket

Allow the following Actions:

  • “s3:PutObject”
  • “s3:GetObject”
  • “s3:GetObjectVersion”
  • “s3:DeleteObject”
  • “s3:DeleteObjectVersion”
  • “s3:ListBucket”

The following sample policy grants read-write access to objects in your S3 bucket.

{
     "Version": "2012-10-17",
     "Statement": [
         {
             "Sid": "AllowListingOfUserFolder",
             "Action": [
                 "s3:ListBucket",
                 "s3:GetBucketLocation"
             ],
             "Effect": "Allow",
             "Resource": [
                 "arn:aws:s3:::bucket_name"
             ]
         },
         {
             "Sid": "HomeDirObjectAccess",
             "Effect": "Allow",
             "Action": [
                 "s3:PutObject",
                 "s3:GetObject",
                 "s3:DeleteObjectVersion",
                 "s3:DeleteObject",
                 "s3:GetObjectVersion"
             ],
             "Resource": "arn:aws:s3:::bucket_name/*"
         }
     ]
 }


For a detailed explanation on how to grant access to your S3 bucket, check out this link.

Step 3.2: Create an AWS IAM Role and record your IAM Role ARN value located on the role summary page because we are going to need it later on.

Step 3.3: Create a cloud storage integration using the STORAGE INTEGRATION command. 

CREATE STORAGE INTEGRATION <integration_name>
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = S3
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = '<iam_role>'
  STORAGE_ALLOWED_LOCATIONS = ('s3://<bucket>/<path>/', 's3://<bucket>/<path>/')
  [ STORAGE_BLOCKED_LOCATIONS = ('s3://<bucket>/<path>/', 's3://<bucket>/<path>/') ]

Where:

  • <integration_name> is the name of the new integration.
  • <iam_role is> the Amazon Resource Name (ARN) of the role you just created.
  • <bucket> is the name of an S3 bucket that stores your data files.
  • <path> is an optional path that can be used to provide granular control over objects in the bucket.

Step 3.4: Recover the AWS IAM User for your Snowflake Account

  1. Execute the DESCRIBE INTEGRATION command to retrieve the ARN for the AWS IAM user that was created automatically for your Snowflake account:

     

    DESC INTEGRATION <integration_name>;
  2. Record the following values:
ValueDescription
STORAGE_AWS711/_IAM_USER_ARNThe AWS IAM user created for your Snowflake account.
STORAGE_AWS_EXTERNAL_IDThe external ID that is needed to establish a trust relationship.

Step 3.5: Grant the IAM User Permissions to Access Bucket Objects

  1. Log into the AWS Management Console and from the console dashboard, select IAM.
  2. Navigate to the left-hand navigation pane and select Roles and choose your IAM Role.
  3. Select Trust Relationships followed by Edit Trust Relationship.
  4. Modify the policy document with the IAM_USER_ARNand STORAGE_AWS_EXTERNAL_ID output values you recorded in the previous step.
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "",
          "Effect": "Allow",
          "Principal": {
            "AWS": "<IAM_USER_ARN>"
          },
          "Action": "sts:AssumeRole",
          "Condition": {
            "StringEquals": {
              "sts:ExternalId": "<STORAGE_AWS_EXTERNAL_ID>"
            }
          }
        }
      ]
    }
  5. Click the Update Trust Policy button to save the changes.

Step 3.6: Create an External Stage that references the storage integration you created

grant create stage on schema public to role <IAM_ROLE>;

grant usage on integration s3_int to role <IAM_ROLE>;

use schema mydb.public;

create stage my_s3_stage
  storage_integration = s3_int
  url = 's3://bucket1/path1'
  file_format = my_csv_format;

Step 3.7: Execute COPY INTO <table> SQL command to load data from your staged files into the target table using the Snowflake client, SnowSQL.

Seeing that we have already configured an AWS IAM role with the required policies and permissions to access your external S3 bucket, we have already created an S3 stage. Now that we have a stage built in Snowflake pulling this data into your tables will be extremely simple.

 copy into mytable
  from s3://mybucket credentials=(aws_key_id='$AWS_ACCESS_KEY_ID' aws_secret_key='$AWS_SECRET_ACCESS_KEY')
  file_format = (type = csv field_delimiter = '|' skip_header = 1);

This SQL command loads data from all files in the S3 bucket to your Snowflake Warehouse.

SQL Server to Snowflake: Limitations and Challenges of Using Custom Code Method

The above method of connecting Microsoft SQL Server to Snowflake comes along with the following limitations:

  • This method is only intended for files that do not exceed 160GB. Anything above that will require you to use the Amazon S3 REST API. 
  • This method doesn’t support real-time data streaming from SQL Server into your Snowflake DW.
  • If your organization has a use case for Change Data Capture (CDC), then you could create a data pipeline using Snowpipe.
  • Also, although this is one of the most popular methods of connecting Microsoft SQL Server to Snowflake, there are a lot of steps that you need to get right to achieve a seamless migration. Some of you might even go as far as to consider this approach to be cumbersome and error-prone.

Method 2: Using Hevo Data to Connect Microsoft SQL Server to Snowflake

Microsoft SQl Server to Snowflake: Hevo Logo
Image Source

Hevo Data, a No-code Data Pipeline, helps you directly transfer data from Microsoft SQL Server and 100+ other data sources to Snowflake, Databases, BI tools, or a destination of your choice in a completely hassle-free & automated manner for free. Hevo is fully managed and completely automates the process of not only loading data from your desired source but also enriching the data and transforming it into an analysis-ready form without having to write a single line of code. Its fault-tolerant architecture ensures that the data is handled in a secure, consistent manner with zero data loss.

Hevo Data takes care of all your data preprocessing needs and lets you focus on key business activities and draw a much powerful insight on how to generate more leads, retain customers, and take your business to new heights of profitability. It provides a consistent & reliable solution to manage data in real-time and always have analysis-ready data in your desired destination.

Sign up here for a 14-Day Free Trial!

The following steps are required to connect Microsoft SQL Server to Snowflake using Hevo’s Data Pipeline:

  • Connect to your Microsoft SQL Server source.
  • Select the replication mode: (i) Full dump and load (ii) Incremental load for append-only data (iii) Incremental load for mutable data.
  • Configure your Snowflake Data Warehouse as shown in the below image.
Configuring Snowflake Warehouse
Image Source

Conclusion

The article introduced you to Microsoft SQl and Snowflake. It also provided a step-by-step guide of 2 methods using which you can connect your Microsoft SQL Server to Snowflake easily. The article also talked about the limitations and benefits associated with these methods. The manual method using SnowSQL, works fine when it comes to transferring data from Microsoft SQL Server to Snowflake, but there are still numerous limitations to it.

That’s where a proprietary Data Integration Platform like Hevo comes into play. Hevo is able to simplify this process and shorten it to just minutes. It is a faster, easier to use, data pipeline solution. The best thing is that you don’t even have to use SQL queries or complex JSON manifests.

Visit our Website to Explore Hevo

Hevo Data offers a faster way to move data from 100+ data sources such as SaaS applications or Databases such as Microsoft SQL Server to Snowflake Data Warehouse to be visualized in a BI tool. Hevo is fully automated and hence does not require you to code.

Want to take Hevo for a spin? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand.

What are your thoughts about the different approaches to moving data from Microsoft SQL Server to Snowflake? Let us know in the comments.

No-code Data Pipeline for Snowflake