Firestore Data Model: An Easy Guide

April 12th, 2022 • Tags: Data Modelling, Firebase, Firestore, Firestore Data Model


Firestore is a NoSQL, document-oriented database, which means that, unlike SQL databases, it has no tables or rows. Instead, you store data in documents, which are organized into collections. A well-designed Firestore data model helps you optimize queries for performance, cost, and complexity, and lets you build responsive applications that keep working under poor internet connectivity or high network latency.

In this blog post, we will be talking about Firestore Data Model and techniques which you can employ to build better, faster data models. Let’s begin.

Table of Contents

  1. What is Cloud Firestore
  2. Firestore Data Model
  3. Techniques for Reading & Querying data 
  4. Normalization & Denormalization in Firestore Data Model
  5. Security in Firestore
  6. Conclusion

What is Cloud Firestore?


Firestore is part of Google Cloud Platform and is a flexible, scalable database for mobile, web, and server development.

Like the Firebase Realtime Database, Firestore keeps data in sync across client apps through real-time listeners. It also offers offline support for mobile and web, so you can build responsive apps that work regardless of network latency or Internet connectivity.

Cloud Firestore also integrates with other Firebase and Google Cloud products.

Cloud Firestore is a NoSQL, document-oriented database, which means data is stored in documents rather than rows and columns. Documents are organized into collections.

Key Features of Firestore:

Cloud Firestore provides several benefits over the Firebase Realtime Database. Some of them are discussed below –

  • Optimized for App development – Cloud Firestore is optimized for app development and helps developers build apps faster.
  • Synchronizes data between devices in real-time – Cloud Firestore syncs data between Android, iOS, and web applications in real-time, enabling users to collaborate easily.
  • Offline Access – Cloud Firestore allows users to access data offline through an on-device cache. When the device comes back online, it synchronizes local changes with Cloud Firestore. This protects users from data loss when the network connection drops.
  • Fully Managed – Cloud Firestore is a fully managed product from Google. It is built from the ground up to scale automatically with demand.
  • High Availability – Cloud Firestore can replicate data across multiple regions to keep it available in the event of unexpected disasters.
  • Server SDKs – Cloud Firestore also provides server SDKs for Java, Go, Python, and Node.js, so backend developers can work with the same database.


Firestore Data Model

Cloud Firestore is a cloud-hosted NoSQL database that is directly accessible via SDKs from iOS, Android, and web applications.

In Cloud Firestore, data is stored in documents, which are organized into collections. Collections act as containers for documents that you can use to organize the data and build queries.

The Firestore Data Model consists of three building blocks:
  1. Document
  2. Collection
  3. Sub-Collection

Let’s discuss each of them in detail:


Document

As Cloud Firestore is a NoSQL database, each entry (a row in SQL terms) is called a document. A document is a record that stores information as key-value pairs (fields).

A document supports different data types, from simple strings and numbers to complex, nested objects. Below is a simple example of a document representing a user –

//Document 1

first : "Mac"
last : "Anthony"
born : 1992

A complex or nested document groups related fields into a map. It can look like the following:

// Document 2

first : "Mac"
last : "Anthony"
born : 1992
address : {
    House_no : "2/1",
    Street : "10 Belford street",
    ZIP : 110012
}

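A document looks a lot like JSON. It supports JSON-style values plus some additional types, such as timestamps and geographical points, and a single document is limited to 1 MiB in size.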
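
As a rough illustration, a Firestore document maps naturally onto a Python dictionary, with nested maps becoming nested dictionaries. The sketch below is plain Python with no Firestore connection. The `address` grouping and the `get_field` helper are assumptions made for illustration and are not part of any SDK.

```python
# A document modelled as a plain Python dict (illustrative only).
user_document = {
    "first": "Mac",
    "last": "Anthony",
    "born": 1992,
    # Nested map grouping the address fields (hypothetical layout).
    "address": {
        "House_no": "2/1",
        "Street": "10 Belford street",
        "ZIP": 110012,
    },
}


def get_field(document, field_path):
    """Look up a value using a dot-separated path such as 'address.ZIP'."""
    value = document
    for key in field_path.split("."):
        value = value[key]
    return value


print(get_field(user_document, "address.Street"))
```

Dot-separated paths of this kind are also how nested fields are usually addressed when querying or updating a document.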


Collection

A collection is simply a container for documents. For example, you can have a users collection that holds one document per user, each containing that user's information.


//Document 1
first : "Ada"
last : "Lovelace"
born : 1815

//Document 2
first : "Alan"
last : "Turing"
born : 1912

Firestore is a NoSQL database, which means the collections and documents are schemaless. You have complete freedom over what fields to put in a document and what documents you store in a particular collection. However, it’s a good idea to use the same fields and data types across multiple documents to query the documents more easily.

Now that we have understood what a collection is, there are some rules that you need to keep in mind while creating a collection and attaching a document with it.

  • A collection can only contain documents. It cannot directly contain strings, binaries, or anything else.
  • Each document in a collection must have a unique name (ID) within that collection.
  • A document cannot directly contain another document, although it can point to sub-collections.
  • You do not need to create a collection before adding a document. The collection is created implicitly when its first document is written.
  • A collection no longer exists when all documents in it are deleted.
//Document 1  
first : "Mac"
last : "Anthony"
born : 1990

//Document 2  
first : "Abdul"
last : "Ahamed"
born : 1987
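
Document names (IDs) only need to be unique within their collection, and you can either choose one yourself or let Firestore generate one. The sketch below assumes `users_ref` is a collection reference from the Python client that is set up later in this post. `save_user` is a hypothetical helper name. It relies on `document()` returning a reference with an auto-generated ID when called without an argument.

```python
def save_user(users_ref, data, doc_id=None):
    """Write `data` to the collection and return the document ID used.

    `users_ref` is assumed to be a Firestore collection reference,
    for example db.collection("users").
    """
    if doc_id is None:
        # No ID supplied: the client generates a unique one.
        doc_ref = users_ref.document()
    else:
        # Explicit ID: an existing document with this ID is overwritten.
        doc_ref = users_ref.document(doc_id)
    doc_ref.set(data)
    return doc_ref.id
```

Note that no step creates the collection itself: writing the first document is enough.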


Sub-Collection

A sub-collection is the way to store hierarchical data. Consider a chat app that contains chat rooms and chat messages. Each room is a document in a rooms collection, and the messages sent in a room belong to that room alone. The correct way to store the messages, in this scenario, is to use a messages sub-collection under each room document.

// Collection: rooms

// Document: roomA
name : "my chat room"
    // Sub-collection: messages
    // Document: message1
    from : "Shubham"
    msg : "Hello Shubham"
    // Document: message2
    from : "Mac"
    msg : "Hello Mac"

// Document: roomB
name : "my chat room two"
    // Sub-collection: messages
    // Document: message1
    from : "Sam"
    msg : "Hello Sam"
    // Document: message2
    from : "Mac"
    msg : "Hello Mac"

A sub-collection is a collection associated with a specific document. However, there are certain rules that you need to keep in mind while creating a sub-collection.

  • Collections and documents must always alternate: you cannot place a collection directly inside a collection, or a document directly inside a document. Storing hierarchical data in sub-collections keeps related data together and makes it easier to access.
  • Documents in sub-collections can have sub-collections of their own, so you can nest data further (Firestore allows up to 100 levels of nesting).
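
The alternation between collections and documents shows up directly in Firestore paths: a path with an odd number of segments points at a collection, and a path with an even number of segments points at a document. The helper below is a plain-Python sketch of that rule. `describe_path` is a hypothetical name, not an SDK function.

```python
def describe_path(path):
    """Classify a slash-separated Firestore path by its segment count."""
    segments = [part for part in path.strip("/").split("/") if part]
    if not segments:
        raise ValueError("path must contain at least one segment")
    # Segments alternate: collection, document, collection, document, ...
    kind = "collection" if len(segments) % 2 == 1 else "document"
    return kind, segments


for example in ("rooms", "rooms/roomA", "rooms/roomA/messages/message1"):
    print(example, "->", describe_path(example)[0])
```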

Techniques For Reading & Querying Data in Firestore Data Model

Now that we have a basic understanding of the Firestore Data Model, let us see how we can read and query data from collections. For ease of explanation, we will be using Python to read and query the data. However, Firestore supports several programming languages, such as C++, Java, Kotlin, Node.js, Go, and PHP.

To read and query data from Firestore, you first need to install dependencies and authenticate your application via credentials. Follow the below steps to set up the framework in Python –

Please note that, at the time of writing, the author ran into problems with the Firestore client on Python 3.7 and later, so the examples in this post were run on Python 3.6. This may no longer apply, so check the Python versions supported by the current firebase-admin release before you start.

  • Install the dependency.
pip install firebase-admin
  • Authenticate using credentials.
    • Navigate to your Firebase console.
    • Click on Project Settings from the Project Overview.
    • Open the Service accounts tab and click on Generate new private key. Download the file and store it on your local machine. Name the file accountKey.json (or your preferred name).
  • Add the below code to your Python script.
import firebase_admin
from firebase_admin import credentials, firestore

# Load the service account key and initialize the Admin SDK once per process.
cred = credentials.Certificate("path/to/accountKey.json")
firebase_admin.initialize_app(cred)
  • Now that the Firebase Admin SDK is initialized, create a Firestore client.
db = firestore.client()

Now we will use the Firestore client created above to interact with documents and collections.


To add documents in Cloud Firestore: 

cities_collection = db.collection(u'cities')

# Create (or overwrite) the document with ID "BJ".
res = cities_collection.document(u'BJ').set({
    'name': 'Beijing',
    'country': 'China',
    'population': 21500000
})


To read the data back from the document:

doc = db.collection(u'cities').document(u'BJ').get()
print(doc.to_dict())

Output (field order may differ) -

{'name': 'Beijing', 'country': 'China', 'population': 21500000}
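
A document reference can point at a document that does not exist, so it is usually worth checking the snapshot before using it. This is a minimal sketch, assuming `db` is the Firestore client created above. `get_document_or_none` is a hypothetical helper name, while `get()`, `exists`, and `to_dict()` come from the Python client.

```python
def get_document_or_none(db, collection_name, doc_id):
    """Return the document's fields as a dict, or None if it is missing."""
    snapshot = db.collection(collection_name).document(doc_id).get()
    if not snapshot.exists:
        return None
    return snapshot.to_dict()
```

For example, `get_document_or_none(db, "cities", "BJ")` would return the Beijing fields if that document has been written, and `None` otherwise.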


To see all the data from the collection, execute the below Python command – 

users_ref = db.collection(u'users')
docs = users_ref.stream()  # yields one snapshot per document in the collection
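
A collection reference on its own does not fetch anything; documents are only read when you iterate over `stream()`. The sketch below collects the results into a dictionary keyed by document ID. `collection_to_dict` is a hypothetical helper name, and `collection_ref` is assumed to be a reference such as `db.collection("users")`.

```python
def collection_to_dict(collection_ref):
    """Read every document in a collection into {document_id: fields}."""
    return {
        snapshot.id: snapshot.to_dict()
        for snapshot in collection_ref.stream()
    }
```

Because every document returned counts as a read, this pattern is best kept for small collections.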


To get all the data from a sub-collection, execute the below Python commands:

room_a_ref = db.collection(u'rooms').document(u'roomA')
messages = room_a_ref.collection(u'messages').stream()
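
Reading a whole sub-collection is often more than you need, so a filter can narrow the result. The sketch below assumes the `rooms`/`messages` layout from the chat example and a `from` field on each message. `messages_from` is a hypothetical helper name. It uses the positional form `where(field, operator, value)`; newer releases of the Python client also accept a `filter=` keyword argument and may emit a warning for the positional form.

```python
def messages_from(db, room_id, sender):
    """Return the messages in one room that were sent by `sender`."""
    messages_ref = (
        db.collection("rooms").document(room_id).collection("messages")
    )
    # Positional where(field, operator, value); see the note above about
    # the filter= keyword form in newer client releases.
    query = messages_ref.where("from", "==", sender)
    return [snapshot.to_dict() for snapshot in query.stream()]
```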

Normalization & Denormalization

Normalization is a technique used to reduce data redundancy. Data redundancy means that the same information is repeated in several places, and normalization removes it by storing each piece of information once and referring to it by key.

The below example shows how students and their attendance are normalized. The student's details live under students, and the attendance records refer to the student by the key students1 instead of repeating the name:

{
    "students": {
        "students1": {
            "name": "john thomas"
        }
    },
    "attendance": {
        "students1": {
            "attendance1": {
                "total": "20",
                "absents": {
                    "leave1": "medical emergency",
                    "leave2": "not verified"
                }
            },
            "attendance2": {
                "total": "18",
                "absents": {
                    "leave1": "sports game",
                    "leave2": "verified"
                }
            }
        }
    }
}

Denormalization is the opposite technique: it deliberately adds redundancy by repeating information across documents. Denormalization is applied when you need to maintain history, improve query performance (a denormalized document returns all the required fields in one read), or speed up reporting. The cost is that every copy must be updated when the underlying value changes.
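
The trade-off can be sketched in plain Python. In the normalized layout, a reader needs two lookups to show a student's name next to an attendance record. In the denormalized layout, the name is copied into each attendance record so that a single read suffices. The field names and the `denormalize` helper below are assumptions for illustration only.

```python
# Normalized: each fact is stored once, in its own collection.
students = {"students1": {"name": "john thomas"}}
attendance = {
    "students1": {
        "attendance1": {"total": 20},
        "attendance2": {"total": 18},
    }
}


def denormalize(students, attendance):
    """Copy each student's name into every one of their attendance records."""
    result = {}
    for student_id, records in attendance.items():
        name = students[student_id]["name"]
        for record_id, record in records.items():
            # The name is duplicated so that one read returns everything.
            result[f"{student_id}_{record_id}"] = {**record, "student_name": name}
    return result


for record_id, record in denormalize(students, attendance).items():
    print(record_id, record)
```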


Security In Firestore

In any cloud system, security is of utmost importance. Firestore provides security rules that allow you to control access to documents and collections.

Firestore checks every incoming request from mobile and web clients against these rules, and any request that does not satisfy them is rejected. Requests made through server SDKs, such as the Python Admin SDK used above, bypass security rules.

Security rules provide access control and data validation in a simple and expressive format. Rules are written in the format below:

service cloud.firestore {
  match /databases/{database}/documents {
    match /{document=**} {
      allow read, write: if true;
    }
  }
}

Let’s understand what these statements mean – 

  • service cloud.firestore: Declares the service the rules apply to, which in this case is Cloud Firestore.
  • match /databases/{database}/documents: Scopes the rules to the documents of any Firestore database in the project.
  • match /{document=**}: Creates a rule block that applies to every document in the database, including documents in sub-collections, through the recursive wildcard.
  • allow read: Grants read access when the condition holds. With if true, anyone can read.
  • allow write: Grants write access when the condition holds. With if true, anyone can write, so this rule set is only suitable for testing.


Conclusion

In this blog post, we have discussed Cloud Firestore and the Firestore Data Model in detail. We have also discussed Normalization & Denormalization along with Security in Firestore.

On that note, with the complexity involved in manual and cumbersome processes, businesses today are leaning toward automated data pipelines. If that fits your needs, you can also explore Hevo Data for ETL use cases. Hevo Data supports 100+ data sources.


Share your experience of working with the Firestore Data Model in the comments section below!
