In present data-driven times, businesses seek accurate historical data to map changes regarding trends and informed decision-making. Managing such evolving records, such as customer address updates, product price changes, or employee promotions, becomes a challenge if set at odds with existing data. This is where Slowly Changing Dimensions (SCDs) help.
For example, a customer has moved from one address to another. The SCD Type-2 maintains not just the new address but the old one as well and also timestamps the validity period. So, if one were to see this person’s behaviors, one would be able to tell the exact address that applied to any given point in time.
In this blog, let’s look at implementing dbt SCD Type 2 using dbt Snapshots, a powerful feature for facilitating the tracking of changes over time. No matter what type of data you’re going to need, whether that would be customer data, product details, or employee records, this method is sure to always keep your data in a true historical context.
Table of Contents
What is dbt?

dbt is open-source software that allows data analysts and engineers to transform raw data into clean datasets organized into the data warehouse based on SQL-based modeling. It also simplifies the transformation layer of ELT internalization by enabling users to write modular, version-controlled, and testable SQL code.
Seamlessly join, aggregate, modify your data on Snowflake. Automate dbt workflows, version control with Git, and preview changes in real-time. Build, test, and deploy transformations effortlessly, all in one place.
🔹 Instant Data Warehouse Integration – Connect in minutes, auto-fetch tables
🔹 Streamlined dbt Automation – Build, test, and run models with ease
🔹 Built-in Version Control – Collaborate seamlessly with Git integration
🔹 Faster Insights – Preview, transform, and push data instantly
What are SCDs?
A slowly changing dimension (SCD) is a dimensional view that does not change regularly but rather changes slowly over time. For example, the address of a customer, the marital status of a customer, or the product category of a particular product might vary infrequently but need proper recording when they do change; some examples of SCDs include the changes made to those attributes in a customer’s record over time. SCDs ensure historical information about those changes within a business’s life for evaluation purposes.
Importance of SCD Type 2 in Data Warehousing
The SCD type 2 is the one employed to track and maintain the historical changes. While SCD Type 1 overwrites the old data with new data, Type 2 will add again a record every time something changes in a dimension. All allows businesses to make queries on “how” it was during that time, which is crucial for analytics in the time perspective.
In a retail scenario, if a customer updates their address, it’s essential to retain the previous address in the data warehouse while storing the new one to maintain historical records.
Typically, there are three types of SCD:
In data warehousing, Slowly Changing Dimensions (SCDs) help track changes in historical data. There are three primary types, each serving different business needs:
1) SCD Type 1 – Overwrite (No History Kept)

This method simply updates the record, replacing the old value with the new one. No history is retained, meaning only the latest version of the data is available.
Example: Updating a customer’s email address.
Customer_ID | Name | E-mail ID |
101 | Alice | new@example.com |
Best for: When past data is irrelevant, such as correcting spelling errors or fixing incorrect values.
2) SCD Type 2 (Historical Tracking)

SCD Type 2 creates new records when data changes, thus storing both the old and new values. It is maintaining historical records, which is an important aspect of enabling trend analysis and reporting.
For example: Tracking one customer’s multiple addresses over time, like when a customer moves from New York to Los Angeles. Instead of updating the address, a new row is inserted, with validity timestamps to track when each record was active.
Customer_ID | Name | Address | Valid_From | Valid_To |
101 | Alice | New York | 2022-01-01 | 2023-06-15 |
101 | Alice | Los Angeles | 2023-06-16 | Null |
Best for: Trend analysis, historical reporting, and regulatory compliance.
3) SCD Type 3 (Limited History)
This method adds additional columns to store the prior value; typically, one or two prior versions are kept. It generally suffices when businesses need to track the smallest amounts of history information.
Example: Keeping in the same record the previous and current salary of the employee.
Employee_ID | Name | Current_salary | Previous_Salary |
201 | Bob | 60,000 | 50,000 |
Best for: When only recent changes matter, such as salary adjustments or position changes
Among these, SCD Type 2 is the most common because it provides a full history of changes, which is vital for accurate analysis and business decision-making.
Why is SCD Type 2 the Most Common?
On a brighter note, SCD Type 2 is the most preferred type to keep an entire history of the changes. This is necessary for:
- Customer behavior analysis (e.g., tracking address changes)
- Employee career pathing (e.g., promotions over time)
- Product price history (e.g., price changes over months/years)
Nevertheless, SCD Type 2 ensures that no past data is ever lost, making it very popular for decision-making and business intelligence.
Why SCD Type 2 Matters in Data Warehousing?
SCD Type 2 is considered effective since it is widely accepted in the business intelligence arena as far as historical tracking is concerned. Businesses need to analyze data as it was at some point in time a higher price change, employee roles, or customer details.
Example:
Prices in a retail store are tracked for the products. If a product goes from $10 to $12, and you want to retain information about the old price, and then update the new one.
Product_ID | Product_Name | Price | dbt_valid_from | dbt_valid_to |
101 | Sneakers | $10 | 2024-01-01 | 2024-02-15 |
101 | Sneakers | $12 | 2024-02-16 | NULL |
Here, the dbt_valid_from and dbt_valid_to columns help track when the price was valid.
Data in the recent data stack arena has had many changes, but dbt sits right on top. With it, new data workflows can be complicated; instead of letting analysts focus on insights and deeper analysis, they should be more concerned with writing complex SQL scripts.
Advantages of dbt Snapshots
Managing historical data changes. The primary advantages of dbt as the implementation method for an SCD Type 2 design are given below:
1. Automated History Tracking:
In this case, dbt Snapshots automate the recording of history. With the snapshot command, one does not have to manually track changes, making for a more efficient and reliable process.
2 . Flexibility:
Multiple snaps are supported by dbt. The timestamp strategy provides for SCD Type 2, whereby if the value of a timestamp column changes, the record is deemed to be updated; otherwise, a new record is inserted.
3. Seamless Integration:
With integration with modern data warehouses such as Snowflake, BigQuery, and Redshift, taking snapshots becomes effortless.
4. Simplicity:
In dbt, you don’t need to do any complex SQL updates or joins for change tracking. One can always rely on dbt for this, leaving you with more time for analysis.
5. Consistency & Reliability:
dbt has very strict tests and validations around its history so that your historical data is reliably updated. It ensures that your reports and analytics run on good data.
Step-by-Step Guide to Implementing SCD Type 2 in dbt
dbt Snapshots provides an automated and scalable solution to track historical changes to your data using Slowly Changing Dimension (SCD) Type 2. In four key steps, this document lays out the entire process to ensure the setup of your dbt project to capture data changes efficiently.
Step 1: Set Up dbt
Before introducing SCD Type 2, you needed to install and set up dbt in your data warehouse. dbt should integrate the database, such as BigQuery, Snowflake, Redshift, or PostgreSQL.
Step 1.1: Initializing a dbt Project
If you haven’t set up a dbt project yet, you can initialize it by running:
dbt init my_dbt_project
This will create the required directory structure, including folders for models, snapshots, and configurations.
Step 1.2: Configuring dbt with Your Data Warehouse
Update the profiles.yml file to connect dbt to your data warehouse. Below is an example configuration for PostgreSQL:
my_dbt_project:
target: dev
outputs:
dev:
type: Postgres
host: your-database-host
user: your-username
password: your password
port: 5432
dbname: your-database-name
schema: analytics
After setting up the connection, test if dbt is working by running:
dbt debug
If everything is correctly configured, dbt will confirm the connection is successful.
Step 2: Define the Source Table
Now that dbt is set up, the next step is to identify the source table that contains the data we want to track changes for.
Let’s take an example of a customer table that stores details like Customer ID, Name, Address, and Last Updated Timestamp.
Example Source Table (customer_table)
customer_id | name | address | updated_at |
101 | Alice | New York | 2023-01-10 12:00:00 |
102 | Bob | Chicago | 2023-01-11 15:45:00 |
103 | Carol | San Francisco | 2023-01-12 08:30:00 |
What happens when data changes?
- If Alice moves to Los Angeles, the address field will change.
- If Bob updates his contact details, the updated_at timestamp will change.
- We need to track these historical changes using a dbt Snapshot.
Step 3: Create a dbt Snapshot
Now, generate a snapshot file that records the alteration in customer data. You will declare the snapshot in a new (customer_snapshots.sql) file under the snapshots/ folder. This will include the timestamp strategy in dbt for creating a record for an updated_at change.
{% snapshot customer_snapshot %}
{{
config(
target_schema='snapshots',
unique_key='customer_id',
strategy='timestamp',
updated_at='updated_at’
)
}}
SELECT * FROM my_database.raw.customers
{% endsnapshot %}
Breaking Down the Snapshot Configuration:
- target_schema=’snapshots’: Stores the snapshot in a separate schema for historical tracking.
- unique_key=’customer_id’: Ensures changes are tracked at a per-customer level.
- strategy=’timestamp’: Uses the updated_at column to determine when changes occur.
- updated_at=’updated_at’: Specifies which column dbt should check for updates.
Step 4: Run dbt Snapshot Command
Once the snapshot file is defined, run the snapshot command to execute the snapshot process:
dbt snapshot
This command will create historical records whenever changes occur in the source data.
Querying and Analyzing Historical Data
Tracking changes over time after executing the snapshot command, querying the snapshot table would allow you to see how the data changed. An example query to check for changes in customer data would be as follows:
Understanding dbt_valid_from and dbt_valid_to:
Between record tracking and modifications, the two fields which dbt uses are extremely important:
- dbt_valid_from: This is the timestamp from which a particular record version was considered valid.
- dbt_valid_to: This is the timestamp until the record is effective and is replaced by a newer version, if applicable. For the latest record, dbt_valid_to will be NULL.
These fields are useful to query historical versions of records and analyze how the data has changed over time.
Best Practices for SCD Type 2 in dbt
1. Optimize Performance
Over the years, SCD Type 2 tables can infinitely grow, so optimizing for performance is necessary:
- Partitioning: You can partition the snapshot table on the date so that queries are faster.
- Indexing: Columns such as updated_at should be indexed for faster lookups.
2. Handle Schema Changes
Because data schemas evolve with time, all modifications must be reviewed during the process:
- Version control systems must be used for Schema change tracking as part of documented procedures that ensure consistency.
- Regular verification of your snapshots ensures the correct tracking of data.
3. Ensure Data Accuracy
Every SCD Type 2 implementation requires accurate data maintenance. To enhance data accuracy:
- Regular Snapshot Execution: Running dbt snapshot jobs at set intervals enables the maintenance of fresh data.
- Data Validation: Validate snapshots against source tables to ascertain correctness..
Challenges and Solutions
1) Management of the Late Coming Data
Sometimes, such late data causes problems in historical tracking. To take care of this, the following points are:
- Backfill: Using backfilling methods, updating records whenever new data comes in late.
2) Performance Issues
The huge tables arise because of SCD Type 2. To take care of that –
- Optimized Queries: accommodate large datasets through optimized queries and indexing strategies.
3) Debugging Snapshot Problems
Debugging becomes most important when issues occur:
- dbt Debug: Use the dbt debug command to get problems identified in the snapshots and resolve them fast.
Real-World Use Cases of dbt SCD Type 2
1) Retail Industry: Price Change
Every retail store has prices that are always changing from time to time. This is where historical prices can be tracked for products. SCD Type 2 helps a business in analyzing customer behavior about price changes to assess their effectiveness over time.
2) E-commerce: Tracking Customer Profiles
In an e-commerce business, it is sometimes evident that a customer’s address, payment method, and preference change over time. SCD Type 2 will help an e-commerce company see a customer’s changing profile.
Conclusion
Implementing dbt Snapshots for handling Slowly Changing Dimensions (SDC) Type 2 provides organizations with a superior method to ensure data consistency and generate thorough trend reports. Businesses achieve efficient data change tracking functions through dbt’s automated processes, removing the need for complex ETL procedures. Data accuracy remains maintained alongside compliance standards to enable well-informed decision-making. The efficiency lifecycle requires three best practices: performance optimization, schema change management, and data precision methods. The implementation of SCD Type 2 enables organizations in any data-driven sector to maintain historical data trends while improving their analytical capabilities for prolonged business advancement. The use of dbt for SCD Type 2 implementation results in an expandable data approach that provides both reliable and durable data strategies. Sign up for Hevo transformer and get dbt-powered data transformation capabilities for free.
FAQs
1. How to create SCD type 2 table in dbt?
Use dbt Snapshots: Define a snapshot file with a unique key and a timestamp column (e.g., updated_at
) to capture record changes, then run dbt snapshot
to update your history table.
2. What is Type 2 SCD approach?
It tracks historical changes by inserting new records when a value changes, preserving previous versions along with their valid time periods for comprehensive trend analysis.
3. What is SCD in dbt?
SCD stands for Slowly Changing Dimensions—a data warehousing technique to maintain and query historical data changes. In dbt, it’s implemented using snapshots to automatically track these changes over time.