6 Best Snowflake ETL Tools For 2020

on Data Integration, Data Warehouse • May 18th, 2020 • Write for Hevo

Introduction

Are you having trouble setting up and maintaining a reliable ETL process for your organization? Are you trying to better understand the plethora of ETL tools available in the market so as to see if any of them fits your bill? Are you a Snowflake customer (or planning on becoming one) looking to extract and load data from a variety of sources? If any of the above questions apply to you, read on as this article compares some of the popular ETL service providers based on multiple features commonly used for evaluation. This article is geared specifically towards comparing and evaluating ETL service providers that have the ability to move data into the Snowflake data warehouse, among other destinations.

Here’s a snapshot of what this post will take you through:

Introduction to Snowflake

Snowflake is a fully managed, cloud data warehouse available to customers in the form of Software-as-a-Service (SaaS). Snowflake querying adheres to the standard ANSI SQL protocol supporting fully structured as well as semi-structured data like JSON, Parquet, XML, etc. It is highly scalable in terms of the number of users that can be supported as well as the computing power. It offers pre-purchasable packaged pricing plans as well as flexible, pay-as-you-go pricing charged at per second levels of resource usage.

What is Snowflake ETL?

ETL stands for Extract, Transform, and Load. It is the process by which data is extracted from one or more sources, transformed into compatible formats, and loaded into a target database or data warehouse. The sources may include flat files, third-party applications, databases, etc. Snowflake ETL means applying the process of ETL to load data into Snowflake data warehouse. This comprises the extraction of relevant data from data sources, making necessary transformations to make the data analysis-ready and then loading it into Snowflake.

In more recent times, a variant of the traditional ETL method, ELT, has started becoming increasingly popular. As the acronym suggests, the major difference with the ELT approach is that the data undergoes required transformations after it is loaded into the target database or data warehouse. Needless to say, ELT is also a popular approach to move data into Snowflake.

Why ETL your data into Snowflake?

In case you are pondering on investing in a new data warehouse, Snowflake is a proven solution that comes with a lot of handy features. Here are some of them:

  • Decoupled architecture: Snowflake architecture consists of three layers – storage, compute, and cloud services. Because they are decoupled, it allows for independent scaling up/down of these layers. As a result, it removes any requirement to pre-commit to a set of resources, as is the case with the traditional, unified architecture.
  • JSON using SQL: The ability to work with JSON data is a lot like querying traditional structured data using a set of types and functions like variant, parse_json, etc.
  • UNDROP and Fast Clone: Using the UNDROP SQL command, you are able to bring back a dropped table without having to wait for it to be restored from a backup. Fast Clone is a feature that lets you clone a table or an entire database, typically in a matter of seconds, at no additional service cost.
  • Encryption: Snowflake comes with many encryption mechanisms such as end-to-end encryption, client-side encryption, etc. ensuring a high level of data security at no additional cost.
  • Query optimization: There are query optimization engines that run in the background to understand and automatically improve the query performances. This lets the SQL scripters to not worry about the optimization practices such as indexing, partitioning, etc. 

Factors to consider while evaluating Snowflake ETL tools

There are several plug-and-play as well as heavily customizable ETL products to move data from a variety of data sources into Snowflake. Every business needs to prioritize certain things over others in making a decision to invest in the right ETL product for its operations. Here are some factors that need to be considered for evaluating such products:

  • Paid or open-source: Cost is always a concern – the choice here would be between in-house custom development or utilizing the expertise of a reputed ETL service provider.
  • Ease of use: This can vary from simple drag and drop GUIs to writing SQL or Python scripts to enable complex transformations in the ETL process.
  • Ability to move data from a wide array of data sources: Ideally, you would want one service provider to service all your data engineering and ETL needs. Hence, in terms of the number of data sources, the more the merrier.
  • Option for adding/modifying data sources: Most ETL service providers support a fixed set of data sources. In case you need to leave room for custom additions of new sources, you need to make sure that this is an option.
  • Ability to transform the data: There are some tools that focus on extracting and loading data and may have zero to very little transformation options. Hence, it is important to understand the level of data transformation supported by the ETL product.
  • Pricing: Price depends on a range of factors and use cases. It is important to clearly understand your ETL requirements while evaluating different service providers to maximize the bang for your buck.
  • Product documentation: Even when reliable customer support is available, it can be useful to have access to detailed documentation for in-house engineers to tweak or troubleshoot something quickly.
  • Customer support: It goes without saying that timely, efficient, and multi-channel customer support is quite important in this whole process.

6 Best Snowflake ETL Tools

Following are 6 best ETL tools for Snowflake:

Blendo

Key features

  • Blendo is easy to set up with no coding required.
  • It supports more than 45 data sources that include many SaaS platforms, cloud storages and databases. Here’s the complete list: https://www.blendo.co/integrations/.
  • It offers data monitoring and notification features to get alerted for data pipeline breakdowns.
  • Customer support is available via Intercom online chat and email. 
  • Product documentation is available as a knowledge base on the company website.
Snowflake ETL Tools: Blendo

Limitations

  • The product focuses heavily on the extraction and loading components of the ETL process, hence this tool may not be the ideal choice for use cases involving data transformations.
  • Users do not have the ability to add a new data source (or tweak an existing one) on their own.

Pricing

  • Blendo offers a 14-day fully-featured free trial.
  • The basic plan starts at $150 per month. More details on pricing are available at https://www.blendo.co/pricing/.

Best suited use case

Blendo is a good option for people looking for a relatively simple and straightforward ETL service that can be quickly set up to run data loads from a bunch of different data sources. It has one of the cheapest pricing plans available.

Matillion

Key features

  • It comes with two product offerings: Data Loader and Matillion ETL. Data Loader is an easy to use, GUI based cloud solution to load data into data warehouses. Matillion ETL includes data transformation options for the source data before loading into the data warehouse.
  • The data transformations can be accomplished via custom SQL, or by creating transformation components using the GUI.
  • It supports more than 70 data sources that include databases, CRM platforms, social networks, etc. Here’s the complete list: https://www.matillion.com/technology/integrations/#matillion-etl.
  • Customer support is available through an online ticketing system as well as over the phone.
  • Documentation is available as articles tailored towards specific data warehouses as well as for the Data Loader product.
Snowflake ETL Tools: Matillion

Limitations

  • Live chat support is not available.
  • Users do not have the ability to add a new data source (or tweak an existing one) on their own.

Pricing

  • Data Loader is free of charge, and Matillion ETL comes with a 14-day free trial.
  • The basic plan for Matillion ETL is priced at an approximate annual cost of $12000. Here’s the detailed information on pricing: https://www.matillion.com/pricing/#matillion-etl.

Best suited use case

Matillion offers the flexibility of two versions of its product, one being free of cost to use. Matillion ETL is relatively expensive however it supports an extensive list of input sources covering all major databases, popular social media platforms and an array of SaaS products.

Hevo Data

Key features

  • Hevo, a No-code Data Pipeline, is easy to set up with no coding required; there is practically zero maintenance required from the end-user as this is a fully managed cloud offering.
  • It provides robust, plug-and-play data transformation functionality.
  • It supports seamless near real-time data replication.
  • It has a fault-tolerant architecture that handles any data disruptions gracefully while providing complete transparency into the operational status of real-time loads.
  • It supports 100+ data sources including SaaS platforms, cloud storage, databases, BI tools, etc. Here’s the complete list of integrations that Hevo supports: https://hevodata.com/integrations/.
  • It offers unparalleled customer support through online Intercom chat, email, and over the phone.
  • Comprehensive documentation, including video resources on the product, is available on the company website.
  • It provides reliable data replication with zero data loss.

Best suited use case

Hevo finds itself in a unique position of offering a long list of integrations while providing the most value for your money. It also provides a quick set up and an intuitive interface to customize your ETL process. 

Pricing

  • It comes with a 14-day free trial.
  • The basic plan starts at $149 per month. Detailed pricing information can be found at https://hevodata.com/pricing/.

StreamSets

Key features

  • Like Matillion, StreamSets is also available in two versions: Data Collector (focused on moving data from source to destination) and Transformer (to perform comprehensive ETL, powered by Apache Spark clusters).
  • It provides a drag and drop GUI to perform transformations such as lookup, add, remove, typecast, etc. before loading data into the destination.
  • It allows customers to add new data sources on their own. Custom data processors can be written in JavaScript, Groovy, Scala, etc.
  • It supports more than 50 data sources including databases and streaming sources such as Kafka and MapR. The full list is available at https://streamsets.com/support/connectors/.
  • Customer support is available through an online ticketing system as well as over call.
  • Extensive product and operational documentation is available on the company website.
Snowflake ETL Tools: StreamSets

Limitations

  • Live customer chat support is not available.
  • It lacks extensive coverage of SaaS input sources.

Pricing

  • It offers a 30-day free trial.
  • Basic pricing options are not directly available on the company website. You can get in touch with their team to know more about pricing.

Best suited use case

StreamSets is particularly well suited for users with a lot of event and file streaming sources. It also provides options for users to make changes to the input sources unlike other completely off-the-shelf products, so this aligns well with teams that can work to technically customize their ETL process.

Etleap

Key features

  • It provides an intuitive GUI to create data pipelines for extract, transform and load as separate steps.
  • Data transformation can be done via GUI as well as custom SQL.
  • It supports more than 50 data sources that include databases, SaaS, file and event streams, and BI tools.
  • In-app and online customer chat support are available.
Snowflake ETL Tools: Etleap

Limitations

  • Users do not have the ability to add a new data source (or tweak an existing one) on their own.
  • The company website does not have a separate documentation section.

Pricing

  • A 30-day free trial is available after a demo with the sales team.
  • Pricing options are not directly available on the company website. You can request a demo or get in touch with their team to know more.

Best suited use case

Etleap is a nice blend of setting up your ETL using an intuitive GUI as well as providing an option to add your custom logic for data transformations. The company also focuses on communicating its value proposition via product demonstrations.

Apache Airflow

Key features

  • Apache Airflow is an open-source product available to download and use for free.
  • It lets people build data workflows as Directed Acyclic Graphs (DAGs) to facilitate ETL.
  • Python code is utilized to add functionality to Airflow workflows.
  • Technically, it has the ability to source from and load data into any system through custom code, or a pre-built module/plugin.
Snowflake ETL Tools: Apache Airflow

Limitations

  • Unlike other off-the-shelf ETL products, this is quite user intensive involving a lot of scripting and Python code for setup and operations.

Pricing

  • Open-source, licensed under Apache License Version 2.0.
  • Detailed online documentation is available for setup and troubleshooting.
  • Support is available through an Airflow Slack community as well.

Best suited use case

Apache Airflow is a typical open-source product, the use of which involves complex coding for set up. For companies looking to develop and manage a custom Snowflake ETL tool in-house using a fairly mature open-source product, Airflow is definitely worth checking out. 

Conclusion

Apart from the ones discussed above, there are even more Snowflake ETL tools available in the market. This is a clear indicator of a huge market for ETL and that many companies are comfortable in outsourcing their ETL needs to these providers. Companies want to invest more time and resources in running analytics and generating insights from their data and less on moving data from one place to another. Hevo, a No-code Data Pipeline, is a proven player in this space with multiple years of service and many happy clients.

Hevo, a No-code Data Pipeline

Hevo helps you load data from any source to Snowflake in real-time without writing a single line of code. Hevo’s intuitive and user-friendly interface allows you to create data pipelines from a variety of sources, set up the required transformations, and start moving data, all within a matter of few clicks. No wonder, it is the solution of choice for anyone looking to make their Snowflake ETL process hassle-free.

What is your preferred ETL tool to move data to Snowflake? What was your experience of moving data to Snowflake? Please let us know in the comments section.

No-code Data Pipeline for Snowflake