With growing volumes of data, is your SQL Server getting slow for analytical queries? Are you simply migrating data from MS SQL Server to Redshift? Whatever your use case, we appreciate your smart move to transfer data from MS SQL Server to Redshift.
This article covers, in detail, the various approaches you can use to replicate data from MS SQL Server to Redshift. It walks through the steps involved in writing custom code to load data from SQL Server to Amazon Redshift and, towards the end, covers the limitations of this approach.
Understanding the Methods to Connect SQL Server to Redshift
These are the two methods you can implement to set up a connection from SQL Server to Redshift in a seamless fashion:
Method 1: Using Custom ETL Scripts to Connect SQL Server to Redshift
As a pre-requisite to this process, you will need to have the Microsoft BCP command-line utility installed. If you have not installed it, you can download it from the Microsoft website.
For demonstration, let us assume that we need to move the ‘orders’ table from the ‘sales’ schema into Redshift. This table is populated with the customer orders that are placed daily.
There are two cases you might consider while transferring data:
- Move data into Redshift one time.
- Incrementally load data into Redshift (when the data volume is high).
Let us look at both scenarios:
One Time Load
You will need to generate a .txt file of the required SQL Server table using the BCP command as follows:
Open the command prompt and navigate to the following path to run the BCP command:
C:\Program Files (x86)\Microsoft SQL Server\Client SDK\ODBC\130\Tools\Binn
Run the BCP command to generate the output file of the SQL Server table sales.orders:
bcp "sales.orders" out D:\out\orders.txt -S "ServerName" -d Demo -U UserName -P Password -c
Note: There might be several transformations required before you load this data into Redshift. Achieving this using code can become extremely hard. A tool like Hevo, which provides an easy environment to write transformations, might be the right choice for you. Here are the steps to follow next:
Step 1: Upload Generated Text File to S3 Bucket
There are several ways to upload files from a local machine to AWS. One simple way is to use the file upload utility of the S3 console, which is the more intuitive alternative.
You can also achieve this using the AWS CLI, which provides easy commands to upload files to an S3 bucket from the local machine.
As a pre-requisite, you will need to install and configure the AWS CLI if you have not already done so. You can refer to the user guide to learn more about installing the AWS CLI.
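If the CLI is not configured yet, running the aws configure command will prompt you for your access key, secret access key, default region, and output format:
aws configure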
Run the following command to upload the file into S3 from the local machine
aws s3 cp D:\out\orders.txt s3://s3bucket011/orders.txt
Step 2: Create Table Schema
The columns below are illustrative; define them to match your actual 'orders' table (if the sales schema does not yet exist in Redshift, create it first with CREATE SCHEMA IF NOT EXISTS sales;):
CREATE TABLE sales.orders (
    order_id INT,
    customer_id INT,
    order_date DATE,
    order_amount DECIMAL(10,2)
);
After running the above query, a table structure will be created within Redshift with no records in it. To check this, run the following query:
SELECT * FROM sales.orders;
Step 3: Load the Data from S3 to Redshift Using the Copy Command
COPY dev.sales.orders FROM 's3://s3bucket011/orders.txt'
iam_role 'Role_ARN' delimiter '\t';
You will need to confirm whether the data has loaded successfully. You can do that by running the following query:
SELECT COUNT(*) FROM sales.orders;
This should return the total number of records inserted.
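If the count is zero or the COPY command reported an error, Redshift logs the details of rejected rows in the stl_load_errors system table; a query along these lines helps with troubleshooting:
SELECT * FROM stl_load_errors ORDER BY starttime DESC LIMIT 10;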
Hevo provides a hassle-free solution and helps you move data from SQL Server to Redshift in an effortless manner, without any manual intervention or a single line of code.
Get Started with Hevo for Free
Hevo’s pre-built integration with SQL Server (among 150+ Sources) will take full charge of the data transfer process, allowing you to focus on key business activities.
Limitations of using Custom ETL Scripts to Connect SQL Server to Redshift
- In cases where data needs to be moved once or in batches only, the custom ETL script method works well. This approach becomes extremely tedious if you have to copy data from MS SQL to Redshift in real-time.
- In case you are dealing with huge amounts of data, you will need to perform an incremental load. Incremental load (change data capture) becomes hard because there are additional steps you need to follow to achieve it (see the sketch after this list).
- Transforming data before you load it into Redshift will be extremely hard to achieve.
- When you write code to extract a subset of data, those scripts often break as the source schema keeps changing or evolving. This can result in data loss.
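To give a sense of the additional steps involved, a typical hand-rolled incremental load exports only the new or changed rows (for example, with the bcp queryout variant shown earlier), uploads them to S3, stages them in a temporary table, and then merges them into the target. The sketch below assumes that order_id uniquely identifies a row and that the delta file has already been uploaded; the table, file, and role names are placeholders:
-- Stage the incremental extract in a temporary table with the same structure as the target
CREATE TEMP TABLE orders_staging (LIKE sales.orders);
COPY orders_staging FROM 's3://s3bucket011/orders_delta.txt'
iam_role 'Role_ARN' delimiter '\t';
-- Replace changed rows and append new ones in a single transaction
BEGIN;
DELETE FROM sales.orders
USING orders_staging
WHERE sales.orders.order_id = orders_staging.order_id;
INSERT INTO sales.orders SELECT * FROM orders_staging;
COMMIT;
You would also need to track the export watermark, schedule these steps on a recurring basis, and handle failures and schema changes yourself, which is where this approach starts to strain.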
The process mentioned above is brittle, error-prone, and often hard to implement and maintain. This will impact the consistency and availability of your data in Amazon Redshift.
Method 2: Using Hevo Data to Connect SQL Server to Redshift
Hevo helps you directly transfer data from SQL Server and various other sources to a Data Warehouse, such as Redshift, or a destination of your choice in a completely hassle-free & automated manner. Hevo is fully managed and completely automates the process of not only loading data from your desired source but also enriching the data and transforming it into an analysis-ready form without having to write a single line of code. Its fault-tolerant architecture ensures that the data is handled securely and consistently with zero data loss.
Sign up here for a 14-Day Free Trial!
Hevo takes care of all your data preprocessing to set up SQL Server to Redshift migration and lets you focus on key business activities and draw much more powerful insights on how to generate more leads, retain customers, and take your business to new heights of profitability. It provides a consistent & reliable solution to manage data in real-time and always have analysis-ready data in your desired destination.
- Step 1: Connect to your MS SQL Server by providing the SQL Server hostname, Pipeline Name, SQL Server Port, SQL Server Username, password, database name, and schema name.
- Step 2: Select the replication mode: (a) Full Dump and Load (b) Incremental load for append-only data (c) Incremental load for mutable data.
- Step 3: For each table in MS SQL Server, choose a table name in Redshift where it should be copied. Complete the connection by providing your Redshift database credentials, such as Database Port, Username, Password, Name, Schema, and Cluster Identifier, along with the Destination Name.
As can be seen, you are simply required to enter the corresponding credentials to implement this fully automated data pipeline without using any code.
Check out what makes Hevo amazing:
- Real-Time Data Transfer: Hevo, with its strong integration with 100+ sources, allows you to transfer data quickly & efficiently. This ensures efficient utilization of bandwidth on both ends.
- Data Transformation: It provides a simple interface to perfect, modify, and enrich the data you want to transfer.
- Transformations: Hevo provides preload transformations through Python code. It also allows you to run transformation code for each event in the Data Pipelines you set up. You need to edit the event object’s properties received in the transform method as a parameter to carry out the transformation. Hevo also offers drag and drop transformations like Date and Control Functions, JSON, and Event Manipulation to name a few. These can be configured and tested before putting them to use.
- Connectors: Hevo supports 100+ integrations to SaaS platforms, files, databases, analytics, and BI tools. It supports various destinations including Amazon Redshift, Snowflake Data Warehouses; Amazon S3 Data Lakes; and MySQL, SQL Server, TokuDB, DynamoDB, PostgreSQL databases to name a few.
- Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled securely and consistently with zero data loss.
- Tremendous Connector Availability: Hevo houses a large variety of connectors and lets you bring in data from numerous Marketing & SaaS applications, databases, etc. such as Google Analytics 4, Google Firebase, Airflow, HubSpot, Marketo, MongoDB, Oracle, Salesforce, Redshift, etc. in an integrated and analysis-ready form.
- Simplicity: Using Hevo is easy and intuitive, ensuring that your data is exported in just a few clicks.
- Completely Managed Platform: Hevo is fully managed. You need not invest time and effort to maintain or monitor the infrastructure involved in executing codes.
- Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
That’s it! You are all set. Hevo will take care of fetching your data incrementally and will upload it seamlessly from MS SQL Server to Redshift via a real-time data pipeline.
Extracting complex data from a diverse set of data sources can be a challenging task and this is where Hevo saves the day!
Visit our Website to Explore Hevo
Hevo offers a faster way to move data from Databases or SaaS applications like SQL Server into your Data Warehouse like Redshift to be visualized in a BI tool. Hevo is fully automated and hence does not require you to code.
Sign Up for a 14-day free trial to try Hevo for free. You can also have a look at the unbeatable pricing that will help you choose the right plan for your business needs.