Decoding Google BigQuery Pricing

on Data Warehouse • July 29th, 2019 • Write for Hevo

Google BigQuery is a fully managed data warehousing tool that abstracts you from any form of physical infrastructure so you can focus on tasks that matter to you. Hence, understanding Google BigQuery Pricing is pertinent if your business is to take full advantage of data warehousing tool’s offering.

However, the process of understanding Google BigQuery Pricing is not as simple as it may seem. The focus of this blog post will be to help you understand Google BigQuery’s Pricing set up in great detail. This would, in turn, help you tailor your data budget to fit your business needs. 

Table of Contents

Introduction to Google BigQuery

Google BigQuery Logo.

It is Google Cloud Platform’s enterprise data warehouse for analytics. Google BigQuery performs exceptionally even while analyzing huge amounts of data & quickly meets your Big Data processing requirements with offerings such as exabyte-scale storage and petabyte-scale SQL queries. It is a serverless Software as a Service (SaaS) application that supports querying using ANSI SQL & houses machine learning capabilities.

Some key features of Google BigQuery:

  • Scalability: Google BigQuery offers true scalability and consistent performance using its massively parallel computing and secure storage engine.
  • Data Ingestions Formats: Google BigQuery allows users to load data in various formats such as AVRO, CSV, JSON etc.
  • Built-in AI & ML: It supports predictive analysis using its auto ML tables feature, a codeless interface that helps develop models having best in class accuracy. Google BigQuery ML is another feature that supports algorithms such as K means, Logistic Regression etc.
  • Parallel Processing: It uses a cloud-based parallel query processing engine that reads data from thousands of disks at the same time.

For further information on Google BigQuery, you can check the official site here.

Hevo Data: A Smart Alternative for BigQuery ETL

Hevo Data, a No-code Data Pipeline helps to transfer data from multiple sources to BigQuery. Hevo is fully-managed and completely automates the process of not only exporting data from your desired source but also enriching the data and transforming it into an analysis-ready form without having to write a single line of code. Its fault-tolerant architecture ensures that the data is handled in a secure, consistent manner with zero data loss.

It provides a consistent & reliable solution to manage data in real-time and always have analysis-ready data in your desired destination. It allows you to focus on key business needs and perform insightful analysis using BI tools such as Tableau and many more. 

Check out some amazing features of Hevo:

  • Completely Managed Platform: Hevo is fully managed. You need not invest time and effort to maintain or monitor the infrastructure involved in executing codes.
  • Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular point in time.
  • Data Transformation: It provides a simple interface to perfect, modify, and enrich the data you want to export. 
  • Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects schema of incoming data and maps it to the destination schema.
  • Minimal Learning: Hevo with its simple and interactive UI, is extremely simple for new customers to work on and perform operations.

Simplify your data analysis with Hevo today! Sign up here for a 14-day free trial!

Factors that affect Google BigQuery Pricing

Google BigQuery uses a pay as you go pricing model, and thereby charges only for the resources they use. There are mainly two factors that affect the cost incurred on the user, the data that they store and amount of queries, users execute.

You can learn about these factors in depth, in the following sections:

Effect of Storage Cost on Google BigQuery Pricing

Storage costs are based on the amount of data you store in BigQuery. Storage costs are usually incurred based on:

  • Active Storage Usage: Charges that are incurred monthly for data stored in BigQuery tables or partitions that have some changes effected in the last 90 days.
  • Long Time Storage Usage: A considerably lower charge incurred if you have not effected any changes on your BigQuery tables or partitions in the last 90 days. 
  • BigQuery Storage API: Charges incur while suing the BigQuery storage APIs based on the size of the incoming data. Costs are calculated during the ReadRows streaming operations.
  • Streaming Usage: Google BigQuery charges users for every 200MB of streaming data they have ingested.

Data Size Calculation

Once your data is loaded into BigQuery you start incurring charges, the charge you incur is usually based on the amount of uncompressed data you stored in your BigQuery tables. The data size is calculated based on the data type of each individual columns of your tables.  Data size is calculated in Gigabytes(GB) where 1GB is 230 bytes or Terabytes(TB) where 1TB is 240 bytes(1024 GBs). The table shows the various data sizes for each data type supported by BigQuery.

DatatypeSize
Int 64/ Integer8 bytes
Float64/ Float8 bytes
Numeric16 bytes
Boolean1 byte
String 2 bytes + the UTF-8 encoded string size
Bytes 2 bytes + the number of bytes in the value
Date/ DateTime8 bytes
Timestamp 8 bytes
Struct/ Record0 + the bytes of the contained fields
Null value 0 bytes

For example, let’s say you have a table called New_table saved on BigQuery. The table contains 2 columns with 100 rows, Column A and B. Say column A contains integers and column B contains DateTime data type. The total size of our table will be (100 rows x 8 bytes) for column A + (100 rows x 8 bytes) for column B which will give us 1600 bytes.

BigQuery Storage Pricing

Google BigQuery pricing for both storage use cases is explained below. 

  1. Active Storage Pricing: Google BigQuery pricing for active storage usage is as follows:
    Region (U.S Multi-region)    
    Storage Type  Pricing Details
    Active Storage $0.020 per GB BigQuery offers free tier storage for the first 10 GB of data stored each month

    So if we store a table of 100GB for 1 month the cost will be (100 x 0.020) = $2 and the cost for half a month will $1.

    Be sure to pay close attention to your regions. Storage costs vary from region to region. For example, the storage cost for using Mumbai (South East Asia) is $0.023 per GB, while the cost of using the EU(multi-region) is $0.020 per GB.

  2. Long-term Storage Pricing:  Google BigQuery pricing for long-term storage usage is as follows:
    Region (U.S Multi-Region)    
    Storage Type Pricing Details
    Long-term storage $0.010 per GB BigQuery offers free tier storage for the first 10 GB of data stored each month

    The price for long term storage is considerably lower than that of the active storage and also varies from location to location. If you modify the data in your table, it 90 days timer reverts back to zero and starts all over again. Be sure to always keep that in mind.

  3. BigQuery Storage API: Storage API charge is incurred during ReadRows streaming operations where the cost accrued is based on incoming data sizes, not on the bytes of the transmitted data. BigQuery Storage API has two tiers for pricing they are:
    • On-demand pricing: These charges are incurred per usage. The charges are:
      Pricing  Details 
      $1.10 per TB data read BigQuery Storage API is not included in the free tier
    • Flat rate pricing: This pricing is available only to customers on flat-rate pricing. Customers on flat-rate pricing can read up to 300TB of data monthly at no cost. After exhausting 300TB free storage, the pricing reverts to on-demand.

  4. Streaming Usage: Pricing for streaming data into BigQuery is as follows:
    Operation Pricing  Details
    Ingesting streamed data $0.010 per 200MB Rows that successfully ingested are what you are charged for

Effect of Query Cost on Google BigQuery Pricing

This involves costs incurred for running SQL commands, user-defined functions, Data Manipulation Language (DML) and Data Definition Language (DDL) statements. DML are SQL statements that allow you to update, insert, delete data from your BigQuery tables. DDL statements, on the other hand, allows you to create, modify BigQuery resources using standard SQL syntax. 

BigQuery offers it’s customers two tiers of pricing from which they can choose from when running queries. The pricing tiers are:

  1. On-demand Pricing:  In this pricing model you are charged for the number of bytes processed by your query, the charges are not affected by your data source be it on BigQuery or an external data source. You are not charged for queries that return an error and queries loaded from cache. On-demand pricing information is given below:
    Operation Pricing Details
    Queries (on demand) $5 per TB 1st 1TB per month is not billed

    Prices also vary from location to location.

  2. Flat-rate Pricing: This pricing model is for customers who prefer a stable monthly cost to fit their budget. Flat-rate pricing requires its users to purchase BigQuery Slots. All queries executed are charged to your monthly flat rate price. Flat rate pricing is only available for query costs and not storage costs. Flat rate pricing has two tiers available for selection 
    • Monthly Flat-rate Pricing:  The monthly flat-rate pricing is given below:
      Monthly Costs Number of Slots
      $ 10,000 500
    • Annual Flat-rate Pricing: in this pricing model you buy slots for the whole year but you are billed monthly. Annual Flat-rate costs quite lower than the monthly flat-rate pricing system. An illustration is given below:
      Monthly Costs Number of Slots
      $8,500 500

Estimating BigQuery Storage and Query Costs

Now that you have a good idea of what different activities will cost you on BigQuery, the next step would be to estimate your costs. For that operation, Google Cloud Platform(GCP) has a tool called the GCP Price Calculator.

GCP Price Calculator.

In the next sections, let us look at how to estimate both Query and Storage Costs using the GCP Price Calculator:

Using the GCP Price Calculator to Estimate Query Cost

On-demand Pricing:

For customers on the on-demand pricing model, the steps to estimate your query costs using the GCP Price calculator are given below:

  • Login to your BigQuery console home page.
  • Enter the query you want to run, the query validator(the green tick)  will verify your query and give an estimate of the number of bytes processed. This estimate is what you will use to calculate your query cost in the GCP Price Calculator.
Query Validator.

From the image above, we can see that our Query validator will process 3.1 GB of data when the query is run. This value would be used to calculate the query cost on GCP Price calculator.

  • The next action is to open the GCP Price calculator
  • Select BigQuery as your product and choose on-demand as your mode of pricing.
  • Populate the on-screen form with all the required information, the image below gives an illustration.
On-demand Price Estimation.

From the image above the costs for running our query of 3.1GB is $0, this is because we have not exhausted our 1TB free tier for the month, once it is exhausted we will be charged accordingly.

Flat-rate Pricing:

The process for on-demand and flat-rate pricing is very similar to the above steps. The only difference is – when you are on the GCP Price Calculator page, you have to select the Flat-rate option and populate the form to view your charges. 

BigQuery Pricing - FlatRate

Using the GCP Price Calculator to Estimate Storage Cost

The steps to estimating your storage cost with the GCP price calculator are as follows:

  • Access the GCP Price Calculator home page.
  • Select BigQuery as your product.
  • Click on the on-demand tab (BigQuery does not have storage option for Flat rate pricing).
  • Populate the on-screen form with your table details and size of the data you want to store either in MB, GB or TB. (Remember the first 10GB of storage on BigQuery is free)
  • Click add to estimate to view your final cost estimate.

Optimizing BigQuery Cost

The following are some best practices that will prevent you from incurring unnecessary costs when using BigQuery:

  • Avoid using  SELECT * when running your queries, only query data that you need. 
  • Sample your data using the preview function on BigQuery, running a query just to sample your data is an unnecessary cost.
  • Always check the prices of your query and storage activities on GCP Price Calculator before executing them. 
  • Only use Streaming when you require your data readily available. Loading data in BigQuery is free.
  • If you are querying a large multi-stage data set, break your query into smaller bits this helps in reducing the amount of data that is read which in turn lowers cost.
  • Partition your data by date, this allows you to carry out queries on relevant sub-set of your data and in turn reduce your query cost.

Want to give Hevo a spin? Signing up for a 14-day free trial and experience the feature-rich Hevo suite first hand. Have a look at our unbeatable pricing that will help you choose the right plan for you.

We are all ears to hear about any other questions you may have on Google BigQuery Pricing. Let us know in comments?

No-code Data Pipeline for BigQuery