What are the 9 Important BigQuery Data Types?

Q: 2. What are the different types of tables in BigQuery?

BigQuery has several types of tables: Native Tables (standard tables for storing data), External Tables (linked to data stored outside BigQuery, like Google Cloud Storage), and Temporary Tables (used for session-specific queries that don’t need to be saved).

Q: 3. What is the preferred data format for BigQuery?

The preferred data formats for BigQuery are Avro and Parquet because they support efficient storage and compression, making them ideal for big data analytics. Additionally, CSV and JSON are also supported but may not be as efficient for large datasets.

Understanding the multitude of data types provided by BigQuery and effectively mapping data from source data types to BigQuery data types can be a daunting task. These data types help you store, structure, and process your data. To fully harness the potential capabilities and on-demand offerings of this warehousing tool, it is imperative to have a solid grasp of Google BigQuery Data types.

In this blog post, you will explore various BigQuery data types with a detailed explanation of their unique set of functions, operations, and specific procedures. As you move ahead through this detailed description, you will also gain valuable insights into how to leverage BigQuery’s data types effectively for your analytical journey.

Table of Contents

Understanding the Different Google BigQuery Data Types

Here is a list of the different Data Types you can find in Google BigQuery:

BigQuery Data Types: NUMERIC
BigQuery Data Types: BOOLEAN
BigQuery Data Types: STRINGS
BigQuery Data Types: BYTES
BigQuery Data Types: TIME AND DATE
BigQuery Data Types: TIMEZONE
BigQuery Data Types: GEOGRAPHY (GIS)
BigQuery Data Types: ARRAY
BigQuery Data Types: STRUCT

A fully managed No-code Data Pipeline platform like Hevo helps you integrate and load data from 150+ different sources to a destination of your choice in real-time like Google BigQuery effortlessly.

Check out some of the cool features of Hevo:

100% Complete & Accurate Data Transfer: Hevo’s robust infrastructure ensures reliable data transfer with zero data loss.
Scalable Infrastructure: Hevo has in-built integrations for 150+ Data sources, that can help you scale your data infrastructure as required.

BigQuery Data Types: NUMERIC

BigQuery supports the following numeric data types:

Integer(INT 64): Represents numbers within the range of -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 and numbers that do not have fractional components
Float (Float 64): Numbers with approximate numeric values and fractional components
Numeric: There is a data type called ‘NUMERIC’ which is similar to ‘Decimal’ which can store values with 38 decimal digits of precision and nine decimal digits of scale. Suitable for exact calculations

Functions supported by numeric BigQuery data types include:

Arithmetic/ Mathematical Functions

Numeric BigQuery data types support basic arithmetic operations such as addition, subtraction, multiplication, etc. The columns and rows that contain numeric type data can always be operated with mathematical functions. An SQL query that involves a mathematical operation is shown below:

With new_data AS (
SELECT 'Male' as Gender, 181  AS Height, 80.2 As Weight
UNION ALL SELECT ‘Female', 168, 74.2 
)
SELECT *, (height/weight) AS height_to_weight_ratio 
FROM new_data

In our query above, we performed an arithmetic operation, which is a division on an integer and a float type. The query yields:

Row	Gender	Height	Weight	Height_to_weight_ratio
1	Male	181	80.2	2.256857855
2	Female	168	74.2	2.264150943

Standard Compliant Floating Point Division

Applying the division operator when the denominator is zero will return a divide by zero error. Also, infinite inputs that would yield a non-finite output will return an overflow error. To avoid these errors from crashing your SQL queries, so you won’t have to be checking for zero values or infinite values beforehand, it is advised to use specific functions to execute the operation.

Specific functions like the IEEE_divide returns NaN (not-a-number) when a division by zero is attempted, this will enable your query to run without returning any errors. The example below gives an illustration of how you can use the IEEE_divide function:

With new_data AS (
SELECT 'Male' as Gender, 181  AS height, 80.2 As weight
UNION ALL SELECT 'Female', 168, 74.2 
UNION ALL SELECT 'Female', 170, 0)
SELECT *,(IEEE_Divide (height,weight)) AS height_to_weight_ratio 
FROM new_data

The output of the above query will be:

Row	Gender	Height	Weight	Height_to_weight_ratio
1	Male	181	80.2	2.256857855
2	Female	168	74.2	2.264150943
3	Female	170	0.0	Infinity

Safe Functions

Safe functions enable you to return NULL instead of returning an error for mathematical operations that are undefined. You can apply the Safe Function by prefixing your operation with the SAFE command. In the example below, we try the logarithm to a negative number which would have otherwise returned an error because the Log of a negative number is undefined, but with the SAFE command, it returns NULL.

SELECT SAFE.LOG(10, -5) AS Log_1, 

SAFE.LOG(10, 5) AS Log_2

The query yields:

Row 1	Log_1	Log_2
1	null	1.431

Precise Decimal Calculations with Numeric

Financial and accounting applications usually require exact values for calculations; this is why the numeric data type is an ideal type to utilize. Numeric data types in BigQuery use 16 Bytes for storage and can represent decimal figures accurately; this is the reason why it is an ideal type for financial calculations.

For illustration, let’s assume we want to carry out some financial calculations, and we need the results to be exact. The query below shows how vital Numeric data types are in our financial applications:

WITH new_data AS ( 
SELECT NUMERIC '5.42' AS Cost
UNION ALL SELECT NUMERIC '7.43'
UNION ALL SELECT NUMERIC '8.94'
)
SELECT 
SUM(Cost) AS Total_cost
FROM new_data

The query yields:

Row	Total_cost
1	21.79

From our query above if we had used the default Numerical datatype which is Float 64 to carry out our calculation, the tiny differences between how each number is represented in memory and the way we entered it will add up and made our final calculations inexact. Using the Numeric data type saves us from that problem.

BigQuery Data Types: BOOLEAN

Boolean data types are variables that are represented by the keyword True or Force. Boolean data type supports the following operations:

Logical Operations

Logical operators are commands that help to filter a result based on some set of conditions. Filtering is done using a WHERE clause and the logical operators set the conditions. Logical operators on BigQuery include AND, OR, and NOT. The result returned when using a WHERE clause and a logical Operator is based on if the WHERE condition is (TRUE or FALSE).

Conditional Operations

Booleans are also very useful in Conditional clauses. Conditional clauses simplify queries by providing an option and also providing an alternative if the clause is not satisfied. In the Query below, we want to compute a new column, which is the sum of the height and weight, but some instances have Null values. We can use a conditional clause like IF to enter a new value for all cases that have NULL values.

WITH new_data AS(
SELECT 181 AS Height, 80 AS Weight
UNION ALL SELECT NULL, 74
UNION ALL SELECT 174, NULL)
SELECT *, (IF(Height is NULL, 170,Height) +
IF(Weight is NULL, 80,Weight)) AS sum_height_weight
FROM new_data

The query yields:

Row	Height	Weight	sum_height_weight
1	181	80	261
2	null	74	244
3	174	null	254

The IF clause works in such a way that if the first parameter in the condition is TRUE, it returns the second parameter, and if FALSE, it returns the third parameter.

BigQuery Data Types: JSON

Google BigQuery supports the JSON data types for handling JSON-formatted or semi-structured data, which represents data in key:value pairs. JSON supports varying and flexible data structures, allowing you to store and manipulate data with different fields and nested structures without a fixed schema within your BigQuery tables.

Here’s an example of how you can create a column in BigQuery with the JSON data type:

CREATE TABLE table_name (
Id INT64;
json-data JSON;
);

BigQuery JSON also provides a set of functions and operators such as JSON_VALUE, JSON_QUERY, and JSON_ARRAY. This enables you to query and extract specific values from JSON data in your SQL queries.

BigQuery Data Types: STRINGS

Strings are Unicode characters with variable lengths. String manipulation is an essential requisite for working with data. BigQuery provides an arsenal of built-in functions that helps enhance your string manipulation tasks. Strings must be quoted either with single, double, or triple quotation marks. The functions supported by strings on BigQuery include:

Casting and Coercion

You can CAST a string as an INT 64 or FLOAT 64 to parse it; you can also cast an integer or float back a string. The CAST Command helps us convert from one data type to another.

SELECT 
CAST (42 AS STRING),
CAST('42' AS INT64)

In some instances, the CAST operation fails if the data conversion is unsuccessful, SAFE_CAST will return NULL, and the query will run successfully. The type of conversion done using SAFE_CAST is called coercion. In the query above, a numeric type is converted to a string and vice versa.

SELECT CAST('true' AS bool),
SAFE_CAST ('new' AS bool)

The query above will fail if we don’t use the SAFE_CAST command as it’s not possible to convert a string like ‘new’ to a boolean data type. With the SAFE_CAST command, NULL is returned instead.

The format of the data we are converting to can also be customized using the FORMAT command. The query below gives an illustration.

SELECT 
FORMAT ('%3d', 500)AS A, 
FORMAT ('%5.3f', 55.5555) AS B,
FORMAT ('%03d', 78) AS C

The query yields:

Row	A	B	C
1	500	55.556	078

With the FORMAT command, we have customized how the String data converted will appear.

Concatenation

The CONCAT command is a convenient tool used to manipulate strings. It is used to concatenate separate parts of different strings into one.

WITH example AS (
SELECT 'John@abc.com' AS email, 'India' as country
UNION ALL SELECT 'Sam@bca.com', 'Canada'
UNION ALL SELECT 'Lucy@cab.com', 'Germany'
)
SELECT email, country,
CONCAT(
SUBSTR(email, 1, STRPOS(email, '@')-1),
' from ', country) AS callers
FROM example

The query yields:

Row	email	country	callers
1	John@abc.com	India	John from India
2	Sam@bac.com	Canada	Sam from Canada
3	Lucy@cab.com	Germany	Lucy from Germany

From the query above, we concatenated the email and country columns and also added conjunction between the strings from the two columns. We used the SUBSTR command to extract the substrings and STRPOS to find the ‘@’ separator character before concatenating them.

BigQuery Data Types: BYTES

Bytes are distinct from strings and should not be interchanged, though most functions supported on strings are also supported on bytes. Bytes are also represented using either single, double, or triple quotation marks, but for bytes, you must ensure it begins with the prefix, letter B(b). The query below gives an illustration on working with Byte data type:

SELECT * FROM new_data 
WHERE  Height = b'16D6M7PN3w7Cn8mJyrmrUSZY9ummMf5QCGEMuiSmSlw

BigQuery Data Types: TIME AND DATE

BigQuery supports various data types for Time and Date instances they include:

DATE: refers to a valid calendar date independent of a time zone. So a particular DATE will represent a different period when recorded in different time zones
TIME: refers to a time, independent of a specific date.
DATETIME: This refers to a point in time; it is time you see when you look at your watch. It is usually when an event happens, any other precision is irrelevant
TIMESTAMP refers to an exact point in time with a microsecond precision regardless of location. Datetime is part of Timestamp with just a lower level of precision.

These data types are all quite interchangeable, just with different levels of precision. So we can extract DateTime from a timestamp and cast a timestamp to a DateTime. The query below gives an illustration:

SELECT
EXTRACT(TIME FROM CURRENT_TIMESTAMP()) as ct
EXTRACT(DATETIME FROM CURRENT_TIMESTAMP()) as dt, CAST(CURRENT_DATETIME() AS TIMESTAMP) as ts

After executing the above query, it’ll extract the Time and Datetime from the current Timestamp and CAST the current Datetime as Timestamp. This illustration gives an overview of how you can apply BigQuery’s Datetime and Timestamp modules in diverse scenarios.

BigQuery Data Types: TIMEZONE

It is also pertinent that we understand BigQuery’s timezone format. Timezones are used when parsing timestamps or formatting timestamps for display. Timezones are represented using these formats:

Offset from Coordinated Universal Time(UTC): When using this format, no spaces are allowed between the time zone and the rest of the Timestamp. An example is shown below:
```
2019-08-21 12:30:00.45+1:00 #where +1 is the time zone
```
Time zone name from tz database: When using this format, space is required between the name of the timezone and the rest of the time stamp. An example is shown below:
```
 2019-08-21 12:30:00.45 India/New_Delhi 
```

BigQuery Data Types: GEOGRAPHY (GIS)

The Geography data type is used to represent points, lines, and polygons on the surface of the earth. In BigQuery, all the geographical positions are represented in WGS84 reference ellipsoid which is the same ellipsoid used in the Global Positioning System (GPS), hence the longitude and latitude can be used directly in BigQuery to find the location of a place. Geography Functions on BigQuery are peculiar and are prefixed with ST_. signature. Functions supported by Geography data type on BigQuery are classified based on their use-cases. The classifications include:

Constructors: These functions build new Geography values from coordinates or existing Geography data. An example is:
```
 ST_GEOGPOINT(longitude, latitude) 
```
Parsers: These functions create Geography data types from external formats like WKT and GeoJSON. An example is:
```
ST_GEOGFROMTEXT(wkt_string)
```
Formatters: These are the opposite of Parsers, they extract Geography data types to an external format such as WKT and GeoJSON
```
ST_ASTEXT(geography_expression)
```
Transformations: These functions are used to generate a new Geography data type from other Geography data types. Examples include:
```
ST_INTERSECTION(geography_1, geography_2) ST_BOUNDARY(geography_expression)
```

Predicates: are used to filter clauses in Geography data types, they return TRUE or FALSE for relationships between Geographies or properties of Geography data types. For example, the ST_DWITHIN command is a predicate.

ST_DWITHIN(geography_1, geography_2),rangeThe query above returns TRUE if the distance between at least one point in geography_1 and one point in geography_2 is less than or equal to the distance given by the range argument.

Accessors: These functions provide unrestricted access to the property of geography. An example is ST_NUMPOINTS
```
ST_NUMPOINTS(geography_expression)
```
The query above returns the number of points, the number of line string vertices, and the number of polygon vertices from the geography_expression argument.
Measures: These functions compute the measurement of one or more Geography data types. For example the ST_Distance command:
```
ST_DISTANCE(geography_1, geography_2) 
```
The query above returns the shortest distance in meters between geograpphy_1 and geography_2 so far they are non-empty.

Aggregate Functions: These functions carry out Geography- specific aggregate functions. For example, the ST_UNION_AGG command:

ST_UNION_AGG(geography)The query above returns a Geography the represents the point set Union of all input Geography data types.

Integrate Adroll to BigQuery

Get a Demo Try it

Integrate Amazon DocumentDB to BigQuery

Get a Demo Try it

Integrate Amazon Ads to BigQuery

Get a Demo Try it

BigQuery Data Types: ARRAY

ARRAY data types are ordered lists of zero or more elements of any none-ARRAY type. ARRAYs are represented using the angle brackets (< and >). The table below shows declarations that are supported and unsupported by ARRAYs on BigQuery:

ARRAY Declaration	Meaning
ARRAY<Float 64>, ARRAY<Int 64>	ARRAYs support majorly all data types.
ARRAY<NULL>	BigQuery raises an error if a query is supposed to return an ARRAY that contains NULL, but you can use such ARRAYs inside your query.
ARRAY<ARRAY<INT 64>>	ARRAYs of ARRAYs are not supported on BigQuery
ARRAY<Struct<ARRAY<INT 64>>>	The way to declare an ARRAY of An ARRAY is by juxtaposing a Struct in between the ARRAY.

BigQuery Data Types: STRUCT

STRUCT types are containers of ordered fields with a data type that is required and an optional field name. Structs are also declared using the (< and >) angle brackets. The table below gives illustrations of possible declarations that are possible with Structs

STRUCT Declaration	Meaning
STRUCT<INT 64>	STRUCT support majorly all data types
STRUCT <x STRUCT <’y’ String, z INT 64>>	A STRUCT nested inside another STRUCT named x. The STRUCT x has two fields y which is a string and z which is an integer.
STRUCT <ARRAY<INT 64>>	A STRUCT that contains an ARRAY and the ARRAY holds an integer
STRUCT<NULL>	STRUCT supports NULL values or fields that have NULL values.

Conclusion

Hope this blog gives a detailed overview of Google BigQuery data types, variants, and usage. Understanding the intricacies of data types is crucial for optimizing queries, improving performance, and ensuring accurate analysis within the BigQuery ecosystem.

With integration with 150+ Data Sources (60+ free sources), Hevo helps you not only export data from sources & load data to the destinations but also transform & enrich your data, & make it analysis-ready.

You can have a look at the unbeatable Hevo pricing that will help you choose the right plan for your business needs. Sign Up for a 14-day free trial. Do you have any questions? Let us know in the comments.

FAQs

1. What type of data can be stored in BigQuery?

BigQuery can store various types of data, including structured data (like tables with rows and columns), semi-structured data (such as JSON and Avro), and unstructured data (like text files). It’s designed to handle large datasets efficiently for analysis and reporting.

2. What are the different types of tables in BigQuery?

BigQuery has several types of tables: Native Tables (standard tables for storing data), External Tables (linked to data stored outside BigQuery, like Google Cloud Storage), and Temporary Tables (used for session-specific queries that don’t need to be saved).

3. What is the preferred data format for BigQuery?

The preferred data formats for BigQuery are Avro and Parquet because they support efficient storage and compression, making them ideal for big data analytics. Additionally, CSV and JSON are also supported but may not be as efficient for large datasets.

Bukunmi I Freelance Technical Content Writer, Hevo Data

Bukunmi is curious about learning on complex concepts and latest trends in data science and combines his flair for writing to curate content for data teams to help them solve business challenges.

9 Google BigQuery Data Types: A Comprehensive Guide

Understanding the Different Google BigQuery Data Types

BigQuery Data Types: NUMERIC

Arithmetic/ Mathematical Functions

Standard Compliant Floating Point Division

Safe Functions

Precise Decimal Calculations with Numeric

BigQuery Data Types: BOOLEAN

Logical Operations

Conditional Operations

BigQuery Data Types: JSON

BigQuery Data Types: STRINGS

Casting and Coercion

Concatenation

BigQuery Data Types: BYTES

BigQuery Data Types: TIME AND DATE

BigQuery Data Types: TIMEZONE

BigQuery Data Types: GEOGRAPHY (GIS)

BigQuery Data Types: ARRAY

BigQuery Data Types: STRUCT

Conclusion

FAQs

1. What type of data can be stored in BigQuery?

2. What are the different types of tables in BigQuery?

3. What is the preferred data format for BigQuery?

Related articles

9 Google BigQuery Data Types: A Comprehensive Guide

Understanding the Different Google BigQuery Data Types

BigQuery Data Types: NUMERIC

Arithmetic/ Mathematical Functions

Standard Compliant Floating Point Division

Safe Functions

Precise Decimal Calculations with Numeric

BigQuery Data Types: BOOLEAN

Logical Operations

Conditional Operations

BigQuery Data Types: JSON

BigQuery Data Types: STRINGS

Casting and Coercion

Concatenation

BigQuery Data Types: BYTES

BigQuery Data Types: TIME AND DATE

BigQuery Data Types: TIMEZONE

BigQuery Data Types: GEOGRAPHY (GIS)

BigQuery Data Types: ARRAY

BigQuery Data Types: STRUCT

Conclusion

FAQs

1. What type of data can be stored in BigQuery?

2. What are the different types of tables in BigQuery?

3. What is the preferred data format for BigQuery?

Related Articles

Optimize your data integration with Hevo!

Related articles