Firestore is a NoSQL, document-oriented database, which means it has no tables or rows, unlike SQL databases. Instead, you store data in documents, which are organized into collections. A well-designed data model helps you optimize queries for performance, cost, and complexity. With Firestore, you can build responsive applications that keep working even with poor internet connectivity or high network latency.

In this blog post, we will cover the Firestore data model and the techniques you can use to build better, faster data models. We’ll also touch on Firestore database design. Let’s begin.

Firestore Data Model

Cloud Firestore is a cloud-hosted NoSQL database that iOS, Android, and web applications can access directly through native SDKs.

In Cloud Firestore, data is stored in documents, which are organized into collections. Collections act as containers for documents, and you use them to organize your data and build queries.


The Firestore data model is built around three concepts:

  1. Document
  2. Collection
  3. Sub-Collection

Let’s discuss each of them in detail:

Document

As Cloud Firestore is a NoSQL database, each entry (a row, in SQL terms) is called a document. A document is a record that stores information as key-value pairs, known as fields.

A document supports many data types, from simple strings and numbers to complex, nested objects. Below is a simple example of a document representing a user – 

//Document 1

first : "Mac"
last : "Anthony"
born : 1992

A complex or nested document can look like the following:

// Document 2

name:
    first : "Mac"
    last : "Anthony"
born : 1992
address:
    House_no : "2/1"
    Street : "10 Belford street"
    ZIP : 110012

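A document looks a lot like JSON, and in general you can treat it as a lightweight JSON record. There are some differences: for example, documents support additional data types and are limited to 1 MiB in size.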
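As a rough sketch, the nested document above can be written as a plain Python dict, and a nested field can be addressed with a dot-separated path such as name.first. The dict is a local stand-in rather than a Firestore object, and get_field is a hypothetical helper written for this illustration. It assumes born sits at the top level of the document, next to name and address.

```python
from functools import reduce

# Document 2 from above as a plain Python dict (a local stand-in,
# not an object returned by Firestore).
user_document = {
    "name": {"first": "Mac", "last": "Anthony"},
    "born": 1992,
    "address": {"House_no": "2/1", "Street": "10 Belford street", "ZIP": 110012},
}


def get_field(document, field_path):
    """Follow a dot-separated path such as 'name.first' into nested maps."""
    return reduce(lambda node, key: node[key], field_path.split("."), document)


print(get_field(user_document, "name.first"))   # prints: Mac
print(get_field(user_document, "address.ZIP"))  # prints: 110012
```

Grouping related fields under one map (name, address) keeps them together in a single document, so one read returns all of them.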

Collection

A collection is simply a container for documents. For example, you can have a users collection that holds documents containing user information, one document per user.

users
    alovelace
        first : "Ada"
        last : "Lovelace"
        born : 1815
    aturing
        first : "Alan"
        last : "Turing"
        born : 1912

Firestore is a NoSQL database, which means the collections and documents are schemaless. You have complete freedom over what fields to put in a document and what documents you store in a particular collection. However, it’s a good idea to use the same fields and data types across multiple documents to query the documents more easily.
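Because nothing enforces a schema, it can help to check your own data for fields whose type drifts between documents. The sketch below works on plain Python dicts standing in for a collection; mixed_type_fields is a hypothetical helper, and the string-valued born field is a deliberately introduced mistake for the example.

```python
from collections import defaultdict


def mixed_type_fields(documents):
    """Map each field stored with more than one Python type to its sorted type names."""
    types_by_field = defaultdict(set)
    for fields in documents.values():
        for key, value in fields.items():
            types_by_field[key].add(type(value).__name__)
    return {
        key: sorted(names)
        for key, names in types_by_field.items()
        if len(names) > 1
    }


# Local stand-in for the users collection: document ID -> fields.
users = {
    "alovelace": {"first": "Ada", "last": "Lovelace", "born": 1815},
    "aturing": {"first": "Alan", "last": "Turing", "born": "1912"},  # year stored as a string by mistake
}

print(mixed_type_fields(users))  # prints: {'born': ['int', 'str']}
```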

Now that we have understood what a collection is, there are some rules to keep in mind when working with collections and the documents in them.

  • A collection can contain only documents. It cannot directly contain raw fields with values (strings, binaries, and so on), and it cannot contain other collections.
  • The names (IDs) of the documents within a collection must be unique. You can provide your own IDs or let Cloud Firestore generate them for you.
  • A document cannot directly contain other documents, but it can point to sub-collections.
  • You do not need to create a collection explicitly. The collection exists as soon as you create its first document.
  • A collection no longer exists once all of its documents are deleted.
//Collection
users
    //Document 1
    user1
        first : "Mac"
        last : "Anthony"
        born : 1990

    //Document 2
    user2
        first : "Abdul"
        last : "Ahamed"
        born : 1987
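The uniqueness and deletion rules above can be mimicked with a small in-memory model. This is only a toy sketch to make the behavior concrete: InMemoryCollections and its methods are hypothetical names, not part of any Firestore SDK.

```python
class InMemoryCollections:
    """Toy stand-in for a database: collection name -> {document ID: fields}."""

    def __init__(self):
        self._collections = {}

    def set_document(self, collection, doc_id, fields):
        # Documents are keyed by ID, so writing the same ID again
        # replaces the document instead of creating a duplicate.
        self._collections.setdefault(collection, {})[doc_id] = dict(fields)

    def delete_document(self, collection, doc_id):
        documents = self._collections.get(collection, {})
        documents.pop(doc_id, None)
        if not documents:
            # A collection with no documents left no longer exists.
            self._collections.pop(collection, None)

    def document_ids(self, collection):
        return sorted(self._collections.get(collection, {}))

    def collection_exists(self, collection):
        return collection in self._collections


store = InMemoryCollections()
store.set_document("users", "user1", {"first": "Mac", "last": "Anthony", "born": 1990})
store.set_document("users", "user2", {"first": "Abdul", "last": "Ahamed", "born": 1987})
store.set_document("users", "user1", {"first": "Mac", "last": "Anthony", "born": 1991})

print(store.document_ids("users"))       # prints: ['user1', 'user2']

store.delete_document("users", "user1")
store.delete_document("users", "user2")
print(store.collection_exists("users"))  # prints: False
```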

Sub-Collection

Sub-collections are the way to store hierarchical data. Consider a chat app that has chat rooms and chat messages. The rooms can be stored in a collection called rooms, but each message belongs to one particular room. The best way to store the messages in this scenario is in a sub-collection under each room’s document.

//Collection
rooms
    //Document 1
    roomA
        name : "my chat room"
        //Sub-collection
        messages
            //Document 1
            message1
                from : "Shubham"
                msg : "Hello Shubham"
            //Document 2
            message2
                from : "Mac"
                msg : "Hello Mac"
    //Document 2
    roomB
        name : "my chat room two"
        //Sub-collection
        messages
            //Document 1
            message1
                from : "Sam"
                msg : "Hello Sam"
            //Document 2
            message2
                from : "Mac"
                msg : "Hello Mac"

A sub-collection is a collection associated with a specific document. There are a few rules to keep in mind when creating a sub-collection.

  • Collections and documents must always alternate in a path. You cannot reference a collection in a collection or a document in a document.
  • Documents in sub-collections can have sub-collections of their own, which lets you nest data further. Data can be nested up to 100 levels deep.

References

Every document in Cloud Firestore is uniquely identified by its location within the database. The earlier example showed a document alovelace within the collection users. To refer to this location in your code, you can create a reference to it.

A reference is a lightweight object that just points to a location in your database. You can create a reference whether or not data exists at that location, and creating a reference does not perform any network operations.

import { doc } from "firebase/firestore";
const alovelaceDocumentRef = doc(db, 'users', 'alovelace');

You can also create references to collections:

import { collection } from "firebase/firestore";
const usersCollectionRef = collection(db, 'users');

References can also be created by specifying the path to a document or collection as a string, with the path components separated by a forward slash (/). For example, to create a reference to the alovelace document:

import { doc } from "firebase/firestore"; 
const alovelaceDocumentRef = doc(db, 'users/alovelace');
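Since collections and documents alternate along a path, the number of path segments tells you what a path points at: an odd count is a collection, an even count is a document. Here is a minimal Python sketch of that idea; describe_path is a hypothetical helper for illustration, not an SDK function.

```python
def describe_path(path):
    """Return ('collection' | 'document', segments) for a slash-separated path."""
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        raise ValueError("path must contain at least one segment")
    # Paths alternate collection/document/collection/..., so an even
    # number of segments ends on a document.
    kind = "document" if len(segments) % 2 == 0 else "collection"
    return kind, segments


print(describe_path("users"))            # prints: ('collection', ['users'])
print(describe_path("users/alovelace"))  # prints: ('document', ['users', 'alovelace'])
print(describe_path("rooms/roomA/messages/message1")[0])  # prints: document
```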

Hierarchical Data

To understand how hierarchical data structures work in Cloud Firestore, consider an example chat app with messages and chat rooms.

To store the different chat rooms, you can create a collection called rooms:

rooms
    roomA
        name : "my chat room"
    roomB
        ...

Once you have chat rooms, you have to decide how to store your messages. Storing them in the chat room’s document is not a good fit: documents in Cloud Firestore should be lightweight, and a chat room may contain a large number of messages. Instead, you can create additional collections within the chat room’s document, as subcollections.

Techniques For Reading & Querying Data in Firestore Data Model

Now that we have a basic understanding of the Firestore data model, let us see how to read and query data from collections. For ease of explanation, we will use Python. Cloud Firestore also provides client libraries for several other languages, such as C++, Java, Kotlin, Node.js, Go, and PHP.

To read and query data from Firestore, you first need to install the dependencies and authenticate your application with credentials. Follow the below steps to set up the environment in Python – 

Please note that at the time of writing we found the Firestore client library broken on Python 3.7 and later, so we used Python 3.6 for this blog post. That limitation is specific to the library versions available then; check the firebase-admin documentation for the Python versions it currently supports.

  • Install the dependency.
pip install firebase_admin
  • Authenticate using credentials.
    • Navigate to your Firebase console
    • Click on Project Settings from the Project Overview.
  • Open the Service accounts tab and click Generate new private key. Download the file and store it on your local machine. Name the file accountKey.json (or any name you prefer).
  • Add the below code to your Python script.
import firebase_admin
from firebase_admin import credentials, firestore

cred = credentials.Certificate("path/to/accountKey.json")
firebase_admin.initialize_app(cred)
  • Now that the Firebase Admin SDK is initialized, create a Firestore client.
db = firestore.client()

Now we will use the Firestore client created above to interact with documents and collections.

Document 

To add a document in Cloud Firestore:

cities_collection = db.collection('cities')

# set() creates the document, or overwrites it if it already exists
res = cities_collection.document('BJ').set({
    'name': 'Beijing',
    'country': 'China',
    'population': 21500000
})

print(res)

To read the document back:

doc_ref = db.collection('cities').document('BJ')
print(doc_ref.get().to_dict())

Output - 

{
    'name': 'Beijing',
    'country': 'China',
    'population': 21500000
}
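The read above assumes the document exists. A slightly more defensive sketch checks the snapshot’s exists flag before using the data. The read_document helper is a hypothetical name; it expects a document reference such as db.collection('cities').document('BJ'). So that the example runs without credentials, it uses two small stand-in classes (FakeSnapshot and FakeDocumentRef, both invented here) that mimic only the get(), exists, and to_dict() members used.

```python
def read_document(doc_ref):
    """Return the document's fields as a dict, or None if the document is missing."""
    snapshot = doc_ref.get()
    if not snapshot.exists:
        return None
    return snapshot.to_dict()


# Offline stand-ins for this sketch only; with a real client you would pass
# db.collection('cities').document('BJ') to read_document instead.
class FakeSnapshot:
    def __init__(self, data):
        self._data = data

    @property
    def exists(self):
        return self._data is not None

    def to_dict(self):
        return self._data


class FakeDocumentRef:
    def __init__(self, data=None):
        self._data = data

    def get(self):
        return FakeSnapshot(self._data)


beijing = FakeDocumentRef({"name": "Beijing", "country": "China"})
missing = FakeDocumentRef()

print(read_document(beijing))  # prints: {'name': 'Beijing', 'country': 'China'}
print(read_document(missing))  # prints: None
```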

Collection

To see all the data in a collection, stream its documents with the below Python code – 

users_ref = db.collection('users')
print({doc.id: doc.to_dict() for doc in users_ref.stream()})

Sub-Collection 

To get all the data from a sub-collection, stream the documents under the parent document:

room_a_ref = db.collection('rooms').document('roomA')
for message in room_a_ref.collection('messages').stream():
    print(message.id, message.to_dict())

Normalization & Denormalization

Normalization is a technique for reducing data redundancy. Data redundancy means the same information is repeated in several places, and normalization removes that repetition by storing each piece of information once and referring to it by key.

The below example shows how students and their attendance are normalized: attendance records refer to a student by key (students1) instead of repeating the student’s details:

{
    "students": {
        "students1": {
            "name": "john thomas"
        }
    },
    "attendance": {
        "students1": {
            "attendance1": {
                "total": "20",
                "absents": {
                    "leave1": "medical emergency",
                    "leave2": "not verified"
                }
            },
            "attendance2": {
                "total": "18",
                "absents": {
                    "leave1": "sports game",
                    "leave2": "verified"
                }
            }
        }
    }
}

Denormalization is the opposite technique: it deliberately adds redundancy by repeating information within the data. Denormalization is applied when you need to maintain history, improve query performance (a denormalized record returns all of its fields in one read), or speed up reporting. Because Firestore cannot join data across collections in a single query, denormalization is a common way to keep reads simple.
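As a rough illustration, the sketch below denormalizes a trimmed-down copy of the students and attendance data above by copying each student’s name into every attendance record. It works on plain Python dicts; denormalize_attendance and the student_id and student_name field names are assumptions made for this example.

```python
# Trimmed-down copy of the normalized example above.
normalized = {
    "students": {"students1": {"name": "john thomas"}},
    "attendance": {
        "students1": {
            "attendance1": {"total": "20"},
            "attendance2": {"total": "18"},
        }
    },
}


def denormalize_attendance(data):
    """Copy each student's name into all of that student's attendance records."""
    flat = {}
    for student_id, records in data["attendance"].items():
        student_name = data["students"][student_id]["name"]
        for record_id, record in records.items():
            flat[f"{student_id}_{record_id}"] = {
                **record,
                "student_id": student_id,
                "student_name": student_name,  # repeated on purpose
            }
    return flat


for record_id, record in denormalize_attendance(normalized).items():
    print(record_id, record)
```

The trade-off is that a change to a student’s name now has to be written to every copy, so this approach suits data that is read far more often than it is updated.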


Security In Firestore

In any cloud system, security is of utmost importance. Cloud Firestore provides Security Rules that let you control access to documents and collections.

Every incoming request from a mobile or web client is checked against these rules, and any request that does not satisfy them is rejected. Requests made through the server SDKs, such as the firebase_admin client used above, bypass Security Rules and are governed by IAM instead.

Security Rules provide access control and data validation in a simple, expressive format. Rules are written in the format below:

service cloud.firestore {
  match /databases/{database}/documents {
    match /{document=**} {
      allow read, write: if true;
    }
  }
}

Let’s understand what these statements mean – 

  • service cloud.firestore: Declares which service the rules apply to, in this case Cloud Firestore.
  • match /databases/{database}/documents: Matches the root of the documents in any Cloud Firestore database in the project.
  • match /{document=**}: Uses a recursive wildcard to apply the enclosed rules to every document in the database, including documents in sub-collections.
  • allow read: Grants read access when the condition after if holds. Here the condition is true, so anyone can read.
  • allow write: Grants write access under the same condition, so anyone can write. Rules this open are suitable only for testing.
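For comparison, a more restrictive variant might allow access only to signed-in users. The fragment below is a minimal sketch that assumes your app uses Firebase Authentication; real applications usually need finer-grained rules per collection.

```
rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {
    match /{document=**} {
      // request.auth is null for unauthenticated requests,
      // so only signed-in users can read or write.
      allow read, write: if request.auth != null;
    }
  }
}
```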

Before wrapping up, let’s cover some basics.

What is Cloud Firestore?


Cloud Firestore is part of Firebase and Google Cloud. It is a flexible, scalable database for mobile, web, and server development.

Like Firebase Realtime Database, Firestore keeps data in sync across client apps through real-time listeners. It also offers offline support for mobile and web, so you can build responsive apps that work regardless of network latency or internet connectivity.

Cloud Firestore integrates seamlessly with other Firebase and Google Cloud products.

Cloud Firestore is a NoSQL, document-oriented database, which means data is stored in documents rather than rows and columns. Documents are organized into collections.

Key Features of Firestore:

Cloud Firestore offers several key features, some of which are discussed below –

  • Optimized for app development – Cloud Firestore is optimized for app development and helps developers build apps faster.
  • Synchronizes data between devices in real time – Cloud Firestore syncs data between Android, iOS, and web applications in real time, enabling users to collaborate easily.
  • Offline access – Cloud Firestore lets users access data offline through an on-device cache. When the device comes back online, it synchronizes local changes with Cloud Firestore, which helps prevent data loss when the network drops.
  • Fully managed – Cloud Firestore is a fully managed product from Google. It is built from the ground up to scale automatically with demand.
  • High availability – With multi-region locations, Cloud Firestore replicates data across regions to keep it available even in the case of unexpected disasters.
  • Server SDKs – Cloud Firestore also serves backend developers, with server SDKs for languages including Java, Go, Python, and Node.js.

Conclusion

In this blog post, we have discussed Cloud Firestore and the Firestore data model in detail. We have also discussed normalization and denormalization, along with security in Firestore. Understanding Firestore’s data structure and data modeling principles is crucial for building optimal, scalable data models.


Vishal Agrawal
Technical Content Writer, Hevo Data

Vishal Agarwal is a Data Engineer with 10+ years of experience in the data field. He has designed scalable and efficient data solutions, and his expertise lies in AWS, Azure, Spark, GCP, SQL, Python, and other related technologies. By combining his passion for writing and the knowledge he has acquired over the years, he wishes to help data practitioners solve the day-to-day challenges they face in data engineering. In his article, Vishal applies his analytical thinking and problem-solving approaches to untangle the intricacies of data integration and analysis.
