Snowflake Pricing: A Comprehensive Guide for 2022

Data Warehouse, ETL, Snowflake, SQL • November 17th, 2021


Snowflake is a popular Cloud Data Warehousing solution that has been implemented by scores of well-known firms, including Fortune 500 companies, as their Data Warehouse provider and manager. However, the process of understanding Snowflake Pricing is not straightforward. 

This article describes the aspects of Snowflake Pricing that you should be aware of before going ahead with an implementation. Specifically, it covers how usage-related costs accrue at the data storage level and at the compute level. It also explains the Pricing plans offered by Snowflake and how to decide on the right plan.


What is Snowflake?


Snowflake is a leading Cloud-based Data Warehouse that has grown steadily in popularity over the past few years. It provides a scalable Cloud-based platform for enterprises and developers and supports advanced Data Analytics. Many data stores are available, but Snowflake’s architecture and data sharing capabilities set it apart. Its architecture lets storage and compute scale independently, so customers can use each separately and pay for each separately.

The key property of Snowflake is that it separates data storage from compute. Snowflake is designed so that users need little effort or interaction to carry out performance tuning or maintenance activities. Scaling between the configured minimum and maximum cluster size happens automatically and very quickly.


Simplify ETL with Hevo’s No-code Pipeline

Hevo Data, a No-Code Data Pipeline, helps you transfer data from 100+ sources to Snowflake. Hevo is fully managed and completely automates the process of not only exporting data from your desired source but also enriching the data and transforming it into an analysis-ready form without having to write a single line of code. Its fault-tolerant architecture ensures that data is handled in a secure, consistent manner with zero data loss.


Check out some amazing features of Hevo (Official Snowflake ETL Partner):

  • Completely Managed Platform: Hevo is fully managed. You need not invest time and effort to maintain or monitor the infrastructure involved in executing code.
  • Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular point in time.
  • Data Transformation: It provides a simple interface to perfect, modify, and enrich the data you want to export. 
  • Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
  • Schema Management: Hevo takes away the tedious task of Schema management & automatically detects the Schema of incoming data and maps it to the destination Schema.
  • Minimal Learning: With its simple and interactive UI, Hevo is easy for new customers to work on and perform operations.

Snowflake Pricing Factors

The pricing philosophy adopted by Snowflake is not very different from that of other popular Cloud Data Warehouse services such as Amazon Redshift: the emphasis is on giving the user maximum flexibility in how the different services are used. Specifically, Snowflake pricing depends on how the user utilizes the following services:

Snowflake Pricing Factor 1: Virtual Warehouses


A virtual warehouse is a set of servers, together called a compute cluster, that carries out operations like query execution and data loading. Depending on the size of your data and the number of users working on data warehousing and data management operations, Snowflake offers the following compute clusters, categorized by size (the number of servers in the cluster):

Type        # of servers
X-Small     1
Small       2
Medium      4
Large       8
X-Large     16
2X-Large    32
3X-Large    64
4X-Large    128

The usage activity for these servers is tracked and converted into what are known as Snowflake credits. To use any of these warehouse-related services, you therefore have to purchase credits, which are then used to keep the servers operational as well as to pay for the services described in the upcoming sections: data storage and cloud services. There are two different ways to purchase credits; these are covered in a later section.

In terms of virtual warehouses, the cluster size is directly related to the usage credits. For example, a Small warehouse (2 servers) requires about 0.0006 credits per second (2 credits per hour), and a 2X-Large warehouse (32 servers) requires about 0.0089 credits per second (32 credits per hour). Billing is done at the second level, so a warehouse that was operational for 37 minutes and 12 seconds is billed only for those 37.2 minutes (2,232 seconds).
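The per-second arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name credits_used is made up for this example, the rate of one credit per server per hour is inferred from the figures quoted above, and the sketch does not model any minimum charge that may apply each time a warehouse starts.

```python
# Servers per warehouse size, taken from the table above.
# Assumption: one server consumes one credit per hour.
CREDITS_PER_HOUR = {
    "X-Small": 1, "Small": 2, "Medium": 4, "Large": 8,
    "X-Large": 16, "2X-Large": 32, "3X-Large": 64, "4X-Large": 128,
}


def credits_used(size: str, minutes: int, seconds: int = 0) -> float:
    """Credits consumed by one warehouse of the given size (hypothetical helper)."""
    total_seconds = minutes * 60 + seconds
    return CREDITS_PER_HOUR[size] * total_seconds / 3600


# A warehouse that ran for 37 minutes and 12 seconds (2,232 seconds).
print(round(credits_used("Small", 37, 12), 2))     # 1.24
print(round(credits_used("2X-Large", 37, 12), 2))  # 19.84
```

The same run costs sixteen times as many credits on a 2X-Large warehouse as on a Small one, which is why the warehouse size is worth matching to the workload.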

Another thing to be kept in mind is that Snowflake provides for the option of ‘suspending’ warehouses when they are not in operation. ‘Suspended’ warehouses are not billed or, in other words, they do not accrue usage credits.

The warehouse activity can be monitored in a couple of ways:

  • Using the web interface: Account -> Billing & Usage
  • Using the SQL table function: WAREHOUSE_METERING_HISTORY

Snowflake Pricing Factor 2: Data Storage


Data is stored and managed in Snowflake in the following three forms:

  • Internal stages, which hold files for data loading. This is typically part of the ETL process, where data from an external source is first uploaded to a Snowflake stage and later copied into a Snowflake table using bulk data loading.
  • Snowflake tables. A point to note here is that Snowflake automatically compresses all table data, so the physical space occupied by these tables is less than their combined raw sizes.
  • Historical data retained for fail-safe purposes.

Similar to usage monitoring for compute clusters, the storage usage information for the account is available through either the web interface (same location) or the following table functions:

  • DATABASE_STORAGE_USAGE_HISTORY
  • STAGE_STORAGE_USAGE_HISTORY

In addition to data usage at the account level, admins can look into the data usage of specific tables via the following:

  • Using the web interface: Databases -> select db_name -> Tables
  • Using the SQL view: TABLE_STORAGE_METRICS

Snowflake Pricing Factor 3: Cloud Services


Cloud services are a set of administrative services that ensure the smooth handling and coordination of Snowflake tasks. These tasks include:

  • Authentication
  • Infrastructure management
  • Metadata management
  • Query parsing and optimization
  • Access control

Cloud services require a certain amount of compute and hence consume some credits for their operations. However, an allowance equal to 10% of the day's actual compute (compute from the warehouse operations) is deducted from the compute credits used up by the cloud services, and this adjustment is calculated daily. So for instance, if the compute from the operational clusters = 100 credits and cloud services compute = 15 credits, then the final billed compute of cloud services for that day = 15 - (10% of 100) = 5 credits. If cloud services usage stays at or below the 10% allowance, no cloud services credits are billed for that day.

Cloud services usage is generally not monitored and optimized as closely as data storage and virtual warehouses. However, Snowflake provides a couple of methods to do so:
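As a rough sketch of the daily adjustment described above, the calculation can be written as a small Python function. The function name billed_cloud_services is hypothetical, and the floor at zero reflects the reading that the 10% allowance can cancel the cloud services charge but never produce a negative one.

```python
def billed_cloud_services(warehouse_credits: float, cloud_credits: float) -> float:
    """Cloud services credits billed for one day after the 10% allowance."""
    # Allowance: 10% of the day's warehouse compute credits.
    allowance = warehouse_credits / 10
    # The allowance can reduce the charge to zero, but not below it.
    return max(0.0, cloud_credits - allowance)


# The example from the text: 100 warehouse credits, 15 cloud services credits.
print(billed_cloud_services(100, 15))  # 5.0
# Usage below the allowance is not billed at all.
print(billed_cloud_services(100, 8))   # 0.0
```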

1) Query History


To understand the specific queries (by their type) that are consuming cloud service credits, the following SQL can be used – 

select
  query_type,
  sum(credits_used_cloud_services) as cloud_services_credits
from snowflake.account_usage.query_history
where
  start_time >= '2020-01-01 00:00:01'
group by 1;

2) Warehouse History


To find out the virtual warehouses that use up cloud service credits, the following query can be used – 

select
  warehouse_name,
  sum(credits_used_cloud_services) credits_used_cloud_services,
  sum(credits_used_compute) credits_used_compute,
  sum(credits_used) credits_used
from snowflake.account_usage.warehouse_metering_history
where
  start_time >= '2020-01-01 00:00:01'
group by 1;

Snowflake Pricing Purchase Plans

Now that you have an idea of how costs are incurred based on the credits accrued through usage of the different Snowflake services, this section covers the options for choosing a pricing plan:

  • On-Demand
    This is similar to the pay-as-you-go pricing plans of other cloud providers such as Amazon Web Services, where you only pay for what you consume. At the end of the month, a bill is generated with the details of usage for that month. There is a $25 monthly minimum, and data storage is typically charged at $40 per TB per month.
  • Pre-Purchased Capacity
    With this option, a customer purchases a set amount or capacity of Snowflake resources in advance. The major advantage of this plan is that the packaged pre-purchase rates are lower than the corresponding On-Demand rates.

A popular pricing strategy, especially when you are new and unsure about your needs, is to first opt for On-Demand and then switch to Pre-Purchased Capacity. Once the On-Demand cycle starts, monitor the resource usage for a month or two. When you have a good idea of your monthly data warehousing requirements, switch to a pre-purchased plan to reduce the recurring monthly charges.
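While monitoring On-Demand usage, a rough monthly estimate can help decide when pre-purchasing becomes worthwhile. The Python sketch below is illustrative only: the function name and the $3.00 price per credit are assumptions (the article does not state a credit price), and it assumes the $25 minimum applies to the combined compute and storage bill.

```python
def on_demand_monthly_bill(
    credits: float,
    price_per_credit: float,            # assumption: not stated in the article
    storage_tb: float,
    storage_rate_per_tb: float = 40.0,  # On-Demand storage rate quoted above
    monthly_minimum: float = 25.0,      # On-Demand monthly minimum quoted above
) -> float:
    """Estimated On-Demand bill in dollars for one month (hypothetical helper)."""
    compute_cost = credits * price_per_credit
    storage_cost = storage_tb * storage_rate_per_tb
    # Assumption: the minimum applies to the combined bill.
    return max(monthly_minimum, compute_cost + storage_cost)


# Hypothetical month: 150 credits at an assumed $3.00 per credit, 2 TB stored.
print(on_demand_monthly_bill(150, 3.00, 2))  # 530.0
# Very light usage falls back to the monthly minimum.
print(on_demand_monthly_bill(2, 3.00, 0.1))  # 25.0
```

Running this against a month or two of observed usage gives a baseline to compare with a pre-purchased capacity quote.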

Optimizing Snowflake Pricing

As pointed out in the sections before, there are many things to be dealt with in terms of understanding the usage of different Snowflake resources and how that translates into costs.

Here are a few things to be kept in mind that will help optimize these incurred costs:

  • Choose the cloud region (such as US East or US West, depending on the cloud provider) wisely, based on your location, to minimize latency and to have access to the required set of features. If you move your data to a different region later on, data transfer costs are charged per Terabyte, so the larger your data store, the higher the costs you incur.
  • Optimally managing the operational status of your compute clusters can make quite a difference to the costs incurred. Features such as ‘auto suspension’ and ‘auto resume’ should be used unless there is a better strategy to address this.
  • The workload/data usage monitoring at an account level, warehouse level, database, or table level is necessary to make sure there aren’t unnecessary query operations or data storage contributing to the overall monthly costs.
  • Make sure to have the data compressed before storage as much as possible. In some cases, such as storing database tables, Snowflake compresses data automatically. However, this is not always the case, so it is something to be mindful of and to monitor regularly.
  • Snowflake works better with the date or timestamp columns stored as such rather than them being stored as type varchar.
  • Try to make more use of transient tables where appropriate, as they are not retained for fail-safe purposes, which in turn reduces the storage costs for historical data.
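To get a feel for how much the ‘auto suspension’ point in the list above can matter, the credits avoided by suspending an idle warehouse can be estimated with simple arithmetic. This Python sketch uses a hypothetical helper name and assumes one credit per server per hour, in line with the warehouse figures quoted earlier.

```python
def idle_credits_avoided(servers: int, idle_hours_per_day: float, days: int = 30) -> float:
    """Credits not consumed because the warehouse is suspended while idle."""
    # Assumption: one credit per server per hour; suspended time accrues nothing.
    return servers * idle_hours_per_day * days


# A Medium warehouse (4 servers) that is idle 16 hours a day for 30 days.
print(idle_credits_avoided(4, 16))  # 1920
```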

Conclusion

The article introduced you to Snowflake and explained in detail the factors on which Snowflake Pricing depends. It also discussed the Snowflake Pricing plans and the ways in which you can optimize your costs. To build a good working knowledge of Snowflake, a useful next step is understanding Snowflake Create Table.


An easier solution for all your data handling needs can be to use an official Snowflake partner like Hevo Data. Hevo is a fully managed cloud platform that brings data from multiple sources to Snowflake easily in real-time.

Want to take Hevo for a spin? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand.

What are your thoughts on Snowflake Pricing? Let us know in the comments.
