Microsoft SQL Server to Snowflake: Steps to Move Data in Minutes

Tutorial • February 21st, 2020

Why move data from Microsoft SQL Server to Snowflake?

Microsoft SQL Server is something of a Swiss Army knife for most SME needs and workloads. However, there are some things SQL Server does better, and others where Snowflake excels.

Snowflake is great if you have big data needs. It offers scalable compute and virtually limitless storage in a traditional SQL and Data Warehouse setting. If you have a relatively small dataset or low concurrency/load, you won't see the benefits of Snowflake.

Simply put, Snowflake has a friendly UI and unlimited storage capacity, along with the control, security, and performance you'd expect from a data warehouse, which SQL Server was not designed to be. Snowflake's unique cloud architecture enables unlimited scale and concurrency without resource contention, the 'Holy Grail' of data warehousing.

One of the biggest challenges of migrating data to Snowflake is choosing from all the different options available. This blog post covers the detailed steps you need to follow to migrate data from MS SQL Server to Snowflake, but first, let's take a quick look at these two databases.

Understanding SQL Server

Microsoft’s SQL Server is a database server, and as such, it primarily stores and retrieves data. It is an implementation of both the Relational Database Management System (RDBMS) and the Structured Query Language (SQL). Today, many specialized versions of SQL Server exist catering to a wide spectrum of workloads and demands. For example, there is a data center version tailored to higher levels of application support and scale, and a scaled-down version available as freeware.

Understanding Snowflake

Snowflake, on the other hand, is an analytics database built for the cloud and delivered as a Data Warehouse-as-a-Service (DWaaS). Snowflake runs on all the major public clouds, including AWS. As with other databases, you can load and query any structured relational data in Snowflake tables using standard SQL data types, e.g. NUMBER, BOOLEAN, VARCHAR, TIMESTAMP, etc.

Methods to move data from Microsoft SQL Server to Snowflake

The four main ways to migrate data from SQL Server to Snowflake are:

  1. Loading a limited amount of data using the Snowflake Web User Interface.
  2. Continuously loading data using Snowpipe.
  3. Bulk loading a large amount of data using SnowSQL, the Snowflake CLI.
  4. Using Hevo Data, a third-party data integration platform.

Using the Snowflake Web User Interface is not ideal for large datasets, as there is a limit to the amount of data you can move using this method. Snowflake's Software-as-a-Service (SaaS) offering can be hosted on Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure.

This blog post will focus on how to bulk load a large amount of data from an on-premises SQL Server instance to a Snowflake Data Warehouse on AWS.

The Snowflake database platform on AWS is backed by EC2 instances for compute and S3 buckets for storage, so it makes sense to stage our data in an AWS S3 bucket. We will also look at some of the limitations of this method.

Microsoft SQL Server to Snowflake – Loading Data Using SnowSQL, the Snowflake CLI

To migrate data from Microsoft SQL Server to Snowflake, we will:

  1. Export the SQL Server data to a CSV file using the SQL Server Management Studio Export Wizard.
  2. Upload that CSV file from your local machine to the AWS Simple Storage Service (S3).
  3. Load the data from the Amazon S3 bucket into Snowflake.

Step 1: Export data from SQL Server using SQL Server Management Studio

SQL Server Management Studio (SSMS) is a data management and administration application that ships with SQL Server.

We are going to use it to extract data from a SQL database and export it to CSV format. The steps to achieve this are:

  1. Install SQL Server Management Studio if you don’t have it on your local machine. 
  2. Launch the SQL Server Management Studio and connect to your SQL Server. 
  3. From the Object Explorer window, right-click the database you want to export, point to Tasks in the context menu, and choose the Export Data option to export table data to CSV.

  4. The SQL Server Import and Export Wizard welcome window will pop up. Click Next to move to the Choose a Data Source screen.
  5. Select SQL Server Native Client 11.0 as the data source from the drop-down menu.
  6. Select an SQL Server instance from the drop-down input box. 
  7. Under Authentication, select “Use Windows Authentication”.
  8. Just below that, you get a Database drop-down box, and from here you select the database from which data will be copied. 
  9. Once you’re done filling out all the inputs, click on the Next button. 
  10. The next window is the Choose a Destination window. Under the destination drop-down box, select the Flat File Destination for copying data from SQL Server to CSV.
  11. Under File name, select the CSV file that you want to write to and click on the Next button.
  12. In the next screen, select Copy data from one or more tables or views and click Next to proceed.
  13. A “Configure Flat File Destination” screen will appear. Here, select the table to export from the Source table or view drop-down; its data will be written to the CSV file. Click Next to continue.
  14. You don’t need to change anything on the Save and Run Package window, so just click Next.
  15. The next window is the Complete the Wizard window, which lists the choices you made during the export process. Double-check everything and, if it all checks out, click the Finish button to begin exporting your SQL database to CSV.
  16. The final window reports whether each step of the export process completed successfully.
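If you prefer to script the export instead of clicking through the wizard, the same flat-file output can be produced with a few lines of Python. This is a minimal sketch: the rows here are stand-in values, and in practice you would fetch them from SQL Server with a driver such as pyodbc (an assumption, not part of the wizard flow).

```python
import csv

def export_to_csv(rows, headers, path):
    """Write query results to a CSV file with a header row,
    matching the flat-file output of the Export Wizard."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(headers)
        writer.writerows(rows)

# Stand-in rows; in practice you would run something like
# cursor.execute("SELECT order_id, customer, amount FROM dbo.Orders")
headers = ["order_id", "customer", "amount"]
rows = [(1, "Acme", 120.50), (2, "Globex", 89.99)]
export_to_csv(rows, headers, "orders.csv")
```

The `newline=""` argument matters: without it, the csv module can emit extra blank lines on Windows.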

Step 2: Upload the CSV File to an Amazon S3 Bucket Using the Web console

After completing the exporting process to your local machine, the next step is to transfer the CSV file to Amazon S3.

How to upload a CSV file to Amazon S3:

  1. First, we’ll start by creating a storage bucket.
    1. Go to the AWS S3 Console
    2. Click the Create Bucket button and enter a unique name for your bucket on the form.
    3. Choose the AWS Region where you’d like to store your data.
    4. Click Create to finish creating the bucket.
  2. Create the directory that will hold your CSV file.
    1. In the Buckets pane, click on the name of the bucket that you created.
    2. Click on the Actions button, and select the Create Folder option.
    3. Enter a unique name for your new folder and click Create.
  3. Upload the CSV file to your S3 bucket.
    1. Select the folder you’ve just created in the previous step.
    2. Click the Upload button, and then click Add Files in the upload section.
    3. Next, a file selection dialog box will open. Here you will select the CSV file you exported earlier and then click Open.
    4. Click on the Start Upload button and you are done!
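For repeatable uploads, the console steps above can also be scripted. Here is a hedged sketch using the boto3 SDK; it assumes boto3 is installed and AWS credentials are already configured (e.g. via `aws configure`), and the bucket and key names are illustrative only.

```python
# Assumption: boto3 is installed and AWS credentials are configured locally.
import boto3

s3 = boto3.client("s3")

# Create the bucket once; bucket names are globally unique. In regions
# other than us-east-1, also pass CreateBucketConfiguration.
s3.create_bucket(Bucket="my-sqlserver-export")

# Upload the CSV file exported in Step 1 into a csv/ folder.
s3.upload_file("orders.csv", "my-sqlserver-export", "csv/orders.csv")
```

The folder ("csv/" here) is just a key prefix in S3; no separate directory object is required when uploading programmatically.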

Step 3: Upload Data to Snowflake From S3

Since we already have an Amazon Web Services (AWS) account and we are storing our data files in an S3 bucket, we will leverage our existing bucket and folder paths for bulk loading into Snowflake.

To allow Snowflake to read data from and write data to an Amazon S3 bucket, we first need to configure a storage integration object to delegate authentication responsibility for external cloud storage to an AWS identity and access management (IAM) entity.

Step 3.1: Define Read-Write Access Permissions for the AWS S3 Bucket

Allow the following Actions:

  • “s3:PutObject”
  • “s3:GetObject”
  • “s3:GetObjectVersion”
  • “s3:DeleteObject”
  • “s3:DeleteObjectVersion”
  • “s3:ListBucket”

The following sample policy grants read-write access to objects in your S3 bucket.

     {
       "Version": "2012-10-17",
       "Statement": [
         {
           "Sid": "AllowListingOfUserFolder",
           "Effect": "Allow",
           "Action": ["s3:ListBucket"],
           "Resource": ["arn:aws:s3:::bucket_name"]
         },
         {
           "Sid": "HomeDirObjectAccess",
           "Effect": "Allow",
           "Action": [
             "s3:PutObject",
             "s3:GetObject",
             "s3:GetObjectVersion",
             "s3:DeleteObject",
             "s3:DeleteObjectVersion"
           ],
           "Resource": "arn:aws:s3:::bucket_name/*"
         }
       ]
     }

For a detailed explanation of how to grant access to your S3 bucket, see the Snowflake documentation on configuring access permissions for Amazon S3.

Step 3.2: Create an AWS IAM Role, and record the IAM Role ARN value shown on the role summary page, because we are going to need it later on.

Step 3.3: Create a cloud storage integration using the CREATE STORAGE INTEGRATION command.

  CREATE STORAGE INTEGRATION <integration_name>
    TYPE = EXTERNAL_STAGE
    STORAGE_PROVIDER = S3
    ENABLED = TRUE
    STORAGE_AWS_ROLE_ARN = '<iam_role>'
    STORAGE_ALLOWED_LOCATIONS = ('s3://<bucket>/<path>/', 's3://<bucket>/<path>/')
    [ STORAGE_BLOCKED_LOCATIONS = ('s3://<bucket>/<path>/', 's3://<bucket>/<path>/') ]


  • <integration_name> is the name of the new integration.
  • <iam_role> is the Amazon Resource Name (ARN) of the role you just created.
  • <bucket> is the name of an S3 bucket that stores your data files.
  • <path> is an optional path that can be used to provide granular control over objects in the bucket.
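Put together, a filled-in version of the command might look like the following. The integration name, account ID, role name, and bucket path are illustrative placeholders, not values from your account:

```sql
CREATE STORAGE INTEGRATION s3_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = S3
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::001234567890:role/snowflake_access_role'
  STORAGE_ALLOWED_LOCATIONS = ('s3://my-sqlserver-export/csv/');
```

Restricting STORAGE_ALLOWED_LOCATIONS to the folder holding your export keeps the integration from reading anything else in the bucket.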

Step 3.4: Retrieve the AWS IAM User for your Snowflake Account

  1. Execute the DESCRIBE INTEGRATION command to retrieve the ARN for the AWS IAM user that was created automatically for your Snowflake account:


    DESC INTEGRATION <integration_name>;
  2. Record the following values:
  • STORAGE_AWS_IAM_USER_ARN – the AWS IAM user created for your Snowflake account.
  • STORAGE_AWS_EXTERNAL_ID – the external ID that is needed to establish a trust relationship.

Step 3.5: Grant the IAM User Permissions to Access Bucket Objects

  1. Log into the AWS Management Console and from the console dashboard, select IAM.
  2. Navigate to the left-hand navigation pane and select Roles and choose your IAM Role.
  3. Select Trust Relationships followed by Edit Trust Relationship.
  4. Modify the policy document with the STORAGE_AWS_IAM_USER_ARN and STORAGE_AWS_EXTERNAL_ID output values you recorded in the previous step.
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Sid": "",
            "Effect": "Allow",
            "Principal": {
              "AWS": "<STORAGE_AWS_IAM_USER_ARN>"
            },
            "Action": "sts:AssumeRole",
            "Condition": {
              "StringEquals": {
                "sts:ExternalId": "<STORAGE_AWS_EXTERNAL_ID>"
              }
            }
          }
        ]
      }
  5. Click the Update Trust Policy button to save the changes.

Step 3.6: Create an External Stage that references the storage integration you created

grant create stage on schema public to role <snowflake_role>;

grant usage on integration s3_int to role <snowflake_role>;

use schema mydb.public;

create stage my_s3_stage
  storage_integration = s3_int
  url = 's3://bucket1/path1'
  file_format = my_csv_format;
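The stage above references a named file format, my_csv_format, which must exist before the stage can use it. A minimal definition matching a comma-delimited export with a header row might look like this (the options shown are illustrative, not the only valid choices):

```sql
create or replace file format my_csv_format
  type = csv
  field_delimiter = ','
  skip_header = 1
  null_if = ('NULL', 'null')
  empty_field_as_null = true;
```

If your export used a different delimiter, adjust field_delimiter to match.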

Step 3.7: Execute COPY INTO <table> SQL command to load data from your staged files into the target table using the Snowflake client, SnowSQL.

We have already configured an AWS IAM role with the required policies and permissions to access the external S3 bucket, and we have created an S3 stage that uses it. With the stage in place, pulling this data into your tables is simple.

 copy into mytable
  from @my_s3_stage
  file_format = (type = csv field_delimiter = ',' skip_header = 1);

This SQL command loads the data from all the staged files into your Snowflake table.
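If the staged folder contains files other than your export, COPY INTO can be narrowed with the PATTERN option, and a load can be dry-run first with VALIDATION_MODE. The table and stage names below follow the earlier examples:

```sql
-- Load only files ending in .csv from the stage
copy into mytable
  from @my_s3_stage
  pattern = '.*[.]csv';

-- Preview the first 10 rows without loading anything
copy into mytable
  from @my_s3_stage
  validation_mode = 'RETURN_10_ROWS';
```

Running the validation first is a cheap way to catch delimiter or header mismatches before committing a full load.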

SQL Server to Snowflake: Limitations and Challenges of Using Custom Code

  • This method is only suitable for files that do not exceed 160 GB, the maximum size for a single upload through the S3 web console. Anything above that will require the AWS CLI, an AWS SDK, or the Amazon S3 REST API.
  • This method doesn’t support real-time data streaming from SQL Server into your Snowflake DW.
  • If your organization has a use case for Change Data Capture (CDC), then you could create a data pipeline using Snowpipe.
  • Also, although this is one of the most popular methods, there are many steps you need to get right to achieve a seamless migration. Some might even consider this approach cumbersome and error-prone.

That’s where a proprietary Data Integration Platform like Hevo comes into play. Hevo simplifies this process and shortens it to just minutes. It is a faster, easier-to-use data pipeline solution. Best of all, you don’t have to write SQL queries or complex JSON manifests.


Using a fully managed, easy-to-use Data Pipeline platform like Hevo Data, you can load your data from Microsoft SQL Server (and many other data sources) to Snowflake in real-time without having to write or manage a single line of code. Hevo has an intuitive point-and-click interface that lets you replicate SQL Server to Snowflake in 3 simple steps:

  • Connect to your Microsoft SQL Server source.
  • Select the replication mode: (i) Full dump and load (ii) Incremental load for append-only data (iii) Incremental load for mutable data.
  • Configure your Snowflake Data Warehouse.

One thing is certain, the data analytics domain is not slowing down. As your business scales, you will find yourself needing to incorporate advanced data warehousing solutions in your workflows. It’s therefore essential that you get the right tools for the job. 

Hevo comes with a 14-day free trial. Sign up today to explore how Hevo can make SQL to Snowflake data migration a walk in the park for you!

What are your thoughts about the different approaches to moving data from SQL to Snowflake? Let us know in the comments.

No-code Data Pipeline for Snowflake