How the dbt build Command Fits into Your Data Transformation Puzzle

dbt has revolutionized data transformation in many ways. It introduced engineering practices to the analytics world. It offers commands like dbt init, dbt compile, dbt run, dbt seed, and others. One such important dbt command is dbt build. It simplifies the process of building, testing, and deploying data models. This is particularly beneficial for optimizing your dbt models for production environments.

Let’s explore how you can use the dbt build command. We’ll also see the use cases where you can run dbt build along with examples that differentiate dbt build and dbt run commands.

What is the dbt build command?

dbt build resolves the dependencies between your models, tests, snapshots, and seeds, then runs them in dependency order. Essentially, it combines dbt run, dbt test, dbt snapshot, and dbt seed into a single operation.

This ensures that your models are correctly built, your tests validate the integrity of your data, and your snapshots reflect the latest updates. This makes it essential for maintaining efficiency and data accuracy in production workflows.

Why dbt build?

Runs Everything at Once – It executes models, tests, snapshots, and seeds in a single command.
Works with Incremental Models – Models materialized as incremental process only new or changed rows instead of reloading all data, unless you pass the --full-refresh flag.
Understands Dependencies – It ensures that models run in the right order based on dependencies.
Better Error Handling – If model A depends on model B and a test on model B fails, model A is skipped rather than built on bad data. Once the problem is fixed, you can rebuild just the affected models with --select instead of restarting the whole pipeline.
Saves Time – Eliminates redundant steps and speeds up development.
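
The dependency ordering and skip-on-failure behavior from the list above can be illustrated with a small simulation. This is a toy sketch, not dbt's actual scheduler: the node names (model_a, model_b, test_b) and the simulate_build function are hypothetical, and I'm assuming that a downstream model waits for the tests on its upstream model.

```python
from graphlib import TopologicalSorter

# Hypothetical project: each node maps to the nodes it must wait for.
# model_a selects from model_b, and test_b checks model_b.
# Assumption: model_a also waits for test_b, mirroring how dbt build
# runs upstream tests before building downstream models.
DEPENDS_ON = {
    "model_b": set(),
    "test_b": {"model_b"},
    "model_a": {"model_b", "test_b"},
}


def simulate_build(failing_nodes):
    """Return a status per node, skipping anything downstream of a failure."""
    status = {}
    # static_order() yields every node after all of its prerequisites.
    for node in TopologicalSorter(DEPENDS_ON).static_order():
        parents = DEPENDS_ON[node]
        if any(status[parent] in ("fail", "skip") for parent in parents):
            status[node] = "skip"
        elif node in failing_nodes:
            status[node] = "fail"
        else:
            status[node] = "ok"
    return status


for node, state in simulate_build({"test_b"}).items():
    print(f"{node}: {state}")
```

With test_b marked as failing, model_b is still built, but model_a is skipped, which is the behavior described in the Better Error Handling point.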

How to use the dbt build command?

Here is what I got when I ran dbt build in my project directory from the command prompt.

Running with dbt=1.5.0
Found 4 models, 6 tests, 1 snapshot, 1 seed file, 3 sources

18:30:21 | Concurrency: 2 threads (target='dev')
18:30:21 |
18:30:21 | 1 of 8 START seed file analytics.customer_spending_data............... [RUN]
18:30:22 | 1 of 8 OK loaded seed file analytics.customer_spending_data........... [INSERT 500 in 0.09s]
18:30:22 | 2 of 8 START view model analytics.monthly_spending_trends............ [RUN]
18:30:22 | 2 of 8 OK created view model analytics.monthly_spending_trends....... [CREATE VIEW in 0.15s]
18:30:22 | 3 of 8 START model analytics.top_customers........................... [RUN]
18:30:22 | 3 of 8 OK created table model analytics.top_customers................ [CREATE TABLE in 0.18s]
18:30:22 | 4 of 8 START test not_null_monthly_spending_trends_customer_id....... [RUN]
18:30:22 | 4 of 8 PASS not_null_monthly_spending_trends_customer_id............. [PASS in 0.05s]
18:30:22 | 5 of 8 START test unique_top_customers_customer_id.................. [RUN]
18:30:22 | 5 of 8 PASS unique_top_customers_customer_id......................... [PASS in 0.04s]
18:30:22 | 6 of 8 START snapshot analytics.customer_spending_snapshot........... [RUN]
18:30:22 | 6 of 8 OK snapshotted analytics.customer_spending_snapshot........... [INSERT 30 in 0.20s]
18:30:22 | 7 of 8 START test relationships_monthly_spending_trends_transactions. [RUN]
18:30:22 | 7 of 8 PASS relationships_monthly_spending_trends_transactions....... [PASS in 0.06s]
18:30:22 | 8 of 8 START model analytics.daily_spending_summary.................. [RUN]
18:30:22 | 8 of 8 OK created table model analytics.daily_spending_summary....... [CREATE TABLE in 0.22s]
18:30:22 |
18:30:22 | Finished running 1 seed, 3 models, 3 tests, 1 snapshot in 1.15s.

Completed successfully

Done. PASS=8 WARN=0 ERROR=0 SKIP=0 TOTAL=8

Do you want to run only on a specific model?

Syntax: dbt build --select <model_name>

Do you want to exclude a model?

Syntax: dbt build --exclude <model_name>
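
If you trigger dbt from a script or scheduler, the two flags can be combined in one invocation. Here is a minimal sketch that only assembles the command; the dbt_build_command helper is hypothetical, and the model names are borrowed from the log output above.

```python
import shlex


def dbt_build_command(select=None, exclude=None):
    """Assemble the argument list for a dbt build invocation."""
    args = ["dbt", "build"]
    if select:
        # --select accepts one or more space-separated selectors.
        args += ["--select", *select]
    if exclude:
        args += ["--exclude", *exclude]
    return args


cmd = dbt_build_command(
    select=["top_customers"],
    exclude=["daily_spending_summary"],
)
print(shlex.join(cmd))
```

To actually run the command, you could pass the list to subprocess.run(cmd, check=True) from the project directory. Keeping the arguments as a list avoids shell quoting problems.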

dbt build vs dbt run

dbt build and dbt run aren’t the same; their functionality differs subtly. Let me walk you through a simple example to show the differences.

Imagine you’re analyzing spending patterns of customer transactions to identify loyal, long-term customers.

Step 1: Create a dbt model, high_spenders.sql, that computes each customer's total spending

WITH transactions AS (
    SELECT
        customer_id,
        SUM(amount) AS total_spent
    FROM {{ ref('transactions') }}
    GROUP BY customer_id
)
SELECT
    customer_id,
    total_spent,
    CASE
        WHEN total_spent > 1000 THEN 'high spender'
        ELSE 'regular'
    END AS spender_category
FROM transactions
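
To make the model's logic concrete, here is the same aggregation and CASE rule mirrored in plain Python. It is only an illustration: the sample rows are made up, and the categorize_spenders function is a hypothetical stand-in for what the SQL does in the warehouse.

```python
from collections import defaultdict

# Illustrative rows only; in the real project they come from the
# transactions model referenced in the SQL.
transactions = [
    {"customer_id": 1, "amount": 700.0},
    {"customer_id": 1, "amount": 450.0},
    {"customer_id": 2, "amount": 300.0},
]


def categorize_spenders(rows, threshold=1000):
    """Sum amounts per customer, then label each customer like the CASE expression."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer_id"]] += row["amount"]
    return [
        {
            "customer_id": customer_id,
            "total_spent": total,
            # Strictly greater than the threshold, as in the SQL.
            "spender_category": "high spender" if total > threshold else "regular",
        }
        for customer_id, total in sorted(totals.items())
    ]


for row in categorize_spenders(transactions):
    print(row)
```

Customer 1 totals 1150 and is labeled a high spender, while customer 2 totals 300 and stays regular.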

You can either execute the dbt run or dbt build command, depending on your use case. Let’s explore both scenarios.

Step 2: Execute dbt run

dbt run --select high_spenders

This command builds the high_spenders model but runs no tests, so we can’t be sure the data is clean and accurate. So, we’ll add some tests to our model.

Step 3: Let’s add a dbt test on the total_spent column to make sure its values are not negative.

YAML file:

models:
  - name: high_spenders
    columns:
      - name: total_spent
        tests:
          # Requires the dbt_expectations package to be listed in packages.yml
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0

Step 4: Run dbt build

dbt build --select high_spenders

We know that: dbt build = dbt run + dbt test + dbt snapshot + dbt seed

In a nutshell, when we run dbt build with --select high_spenders, the following happens:

The high_spenders.sql model gets executed.
The tests defined for the model in high_spenders_test.yml run to validate data quality.
Snapshots such as customer_snapshot.sql, and any seeds, run only when the selection includes them; a plain dbt build with no --select runs them all.
JSON artifacts are generated: manifest.json contains the project's metadata, and run_results.json contains the detailed execution results.
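
The run_results.json file mentioned above can be inspected with a short script after a build. The sketch below assumes the file sits in dbt's default target directory and that each entry in its results list carries a status field. The sample document is hand-written for illustration (my_project and the test name are made up), so check the status values in your own file.

```python
import json
from collections import Counter
from pathlib import Path


def load_run_results(path="target/run_results.json"):
    """Read run_results.json; the default path assumes dbt's standard target directory."""
    return json.loads(Path(path).read_text())


def summarize_results(run_results):
    """Count how many nodes ended in each status."""
    return Counter(result["status"] for result in run_results["results"])


# Hand-written sample in the shape of run_results.json, not real dbt output.
sample = {
    "results": [
        {"unique_id": "model.my_project.high_spenders", "status": "success"},
        {"unique_id": "test.my_project.total_spent_between", "status": "pass"},
    ]
}

for status, count in summarize_results(sample).items():
    print(f"{status}: {count}")
```

In a real project, you would call summarize_results(load_run_results()) from the project directory once dbt build has finished.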

Comparison table: dbt build vs dbt run

Feature	dbt run	dbt build
Runs models	Yes	Yes
Runs tests	No	Yes
Runs snapshots	No	Yes
Loads seeds	No	Yes
Recommended for development	Yes	No
Recommended for production	No	Yes

Key Takeaway

That was a lot to go over, but hopefully you’re now much better informed about dbt build and the ways you can use it. The best way to grasp it is to implement it in your own data transformation project, so go ahead.

In a nutshell, these are the key takeaways from this article:

dbt build suits production because it tests each model right after building it and skips downstream models when a test fails, which minimizes data quality issues.
When the schema of an incremental model changes, you can pass the --full-refresh flag to rebuild the model from scratch.
Make use of flags like --select and --exclude to run targeted models, tests, seeds, and snapshots.
Monitor execution logs regularly and look out for unforeseen errors to prevent failed executions.

FAQs

1. What does dbt build do?

The dbt build command does a lot. It runs your models, tests them, captures snapshots to track changes, and loads CSV seed files into tables. This single command builds and tests your entire dbt project in one go.

2. How is dbt build different from dbt run?

dbt run only executes models, while dbt build does everything: it runs models, tests, snapshots, and seeds.

3. Can I run dbt build on specific models?

Yes, you can use the --select flag for it. Syntax: dbt build --select <model_name>

4. Does dbt build run incremental models?

Yes. For models materialized as incremental, dbt build processes only new data instead of rebuilding the entire model, unless you pass the --full-refresh flag.

Srujana Maddula Technical Content Writer

Srujana is a seasoned technical content writer with over 3 years of experience. She specializes in data integration and analysis and has worked as a data scientist at Target. Using her skills, she develops thoroughly researched content that uncovers insights and offers actionable solutions to help organizations navigate and excel in the complex data landscape.
