dbt deps is a command that downloads (or updates) the dependencies of your dbt project. In practical terms, running dbt deps reads your project’s dependency file, by default packages.yml, and pulls the specified packages into your project. It fetches the version of each package named in that file from dbt Hub, a Git repository, or a local path, so your project uses exactly the versions you specified. You can think of it as the dbt equivalent of running pip install (for Python) or npm install (for Node.js), but specifically for dbt packages.
Why is It Useful?
By using dbt deps to manage packages, you can easily incorporate proven, pre-built macros, data tests, and models into your own project without copying code manually. This means you can rapidly introduce new functionality or industry best practices simply by referencing a package rather than constructing SQL logic from the ground up.
For instance, if you wish to incorporate standardized data quality tests or versatile SQL utilities created by the broader community, you can include the relevant package and let dbt deps handle the retrieval. This approach not only accelerates development but also promotes consistency and reliability by leveraging well-vetted, reusable code.
Understanding dbt Dependency Management
Before we dive into using dbt deps, it’s important to understand how dbt handles dependencies. In dbt, a package is essentially a bundle of dbt macros, models, seeds, or tests that can be installed into another dbt project. These packages allow you to modularize and share code just like libraries in other programming ecosystems.
Where are Dependencies Defined?
dbt expects you to list your desired packages in a YAML file at the root of your project. In dbt versions 1.x, this file is typically named packages.yml (or, in newer dbt Cloud projects, dependencies.yml, which serves a similar purpose).
Here’s an example packages.yml:
packages:
  - package: dbt-labs/dbt_utils # Utility macros package
    version: 1.3.0
  - package: calogica/dbt_expectations # Data testing package
    version: 0.5.0
  - package: dbt-labs/audit_helper # Data auditing package
    version: 0.12.1
Each entry under “packages:” specifies a dependency. You can install packages from dbt Hub by using the package: <namespace>/<package_name> syntax with a version number, as shown above.
dbt Hub is the community-driven registry of dbt packages that you can use to extend dbt’s functionality.
In the above example, dbt-labs/dbt_utils refers to the dbt-utils package maintained by dbt Labs, and we’ve pinned it to version 1.3.0. It’s good practice to pin your package versions like this so that an unexpected upstream update cannot cause discrepancies between environments such as dev and prod.
Besides dbt Hub packages, you can also specify dependencies from Git repositories or local directories. For Git-based packages (often used for custom or private packages), you would use a git: URL and a branch or tag revision, and for local packages, you’d use a local: path.
For example, if you would like to specify dependencies from git, your packages.yml could look like:
packages:
  - git: "https://github.com/my-org/my_dbt_package.git"
    revision: main
Similarly, the following installs a package from a local directory:
packages:
  - local: path/to/package
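Because a branch such as main moves over time, a Git package is more reproducible when revision points at a tag or a full commit SHA instead. A minimal sketch, assuming the hypothetical repository above publishes a tag named v1.0.0 (the tag name is an illustration, not taken from a real project):

```yaml
# packages.yml
packages:
  # Hub package, pinned to an exact release
  - package: dbt-labs/dbt_utils
    version: 1.3.0
  # Git package, pinned to a tag rather than a moving branch
  # (v1.0.0 is a hypothetical tag)
  - git: "https://github.com/my-org/my_dbt_package.git"
    revision: v1.0.0
```

Hub, Git, and local entries can sit side by side in the same file, as this sketch does.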
Where Do the Packages Get Installed?
When you run dbt deps, by default dbt will download these packages into a directory called dbt_packages/ inside your project (older versions used a directory named dbt_modules/). This directory is automatically created when you install packages, and it contains the code for each package (SQL files, macros, etc.).
The dbt_packages/ directory is normally listed in .gitignore (projects created with dbt init include it by default), so you don’t commit the package code to your own repository. Instead, you commit just the packages.yml (and a lock file, if present), and anyone who checks out your project can run dbt deps to fetch the packages. This keeps your repo light and avoids duplicating code in source control.
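If dbt_packages/ does not suit your repository layout, the install location can be changed in dbt_project.yml through the packages-install-path setting. A minimal sketch, where the directory name vendor_packages is only an illustrative choice:

```yaml
# dbt_project.yml (fragment)
# Install packages here instead of the default dbt_packages/ directory.
# "vendor_packages" is a hypothetical directory name.
packages-install-path: vendor_packages
```

If you change the path, add the new directory to .gitignore so the downloaded code still stays out of source control.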
Package Version Pinning & Lock File
To ensure reproducibility, dbt (version 1.7 and later) creates a lock file named package-lock.yml when you install packages. This lock file captures the exact versions (and Git commit hashes, if applicable) that were resolved. By committing this lock file to your repository, you ensure that everyone (and every environment, like production or CI) installs the same versions of the dependencies.
When you run dbt deps, dbt will use the lock file to install those exact versions if the lock file is present. If you want to upgrade to newer versions of packages, you can update packages.yml and run dbt deps --upgrade (which refreshes the lock file to the latest versions your packages.yml allows).
How to Use dbt deps in Your Project?
Using dbt deps is straightforward. Navigate to your project and follow the steps below:
- Create a packages.yml file in the project’s root (the same folder as your dbt_project.yml).
- Add your desired packages (like dbt_utils) to packages.yml.
Make sure to specify each package in the correct format, i.e., namespace/name and a version number. You can find the latest package names and versions on the dbt Hub site for community packages.
- Open a terminal and run the dbt deps command.
For example, if you are installing the dbt_utils package, dbt reports the package name as it installs it, along with the version it resolved.
- Verify the packages are installed.
You can simply check that the dbt_packages/ folder has been created and contains directories named after the packages you added.
- (Optional) Use the package contents in your project.
Installing a package doesn’t automatically apply anything to your data; it just makes the package’s resources available to your project. To leverage a package, you’ll reference its macros, models, or dbt tests in your own project. For example, after installing dbt-utils, you can use one of its macros in your SQL models. Suppose you want to select all columns except a few sensitive ones; dbt-utils has a star macro for this. You could do something like:
-- models/example.sql
-- No trailing semicolon: dbt wraps the model's SELECT in its own statement.
SELECT {{ dbt_utils.star(from=ref('raw_customers'), except=['ssn', 'credit_card']) }}
FROM {{ ref('raw_customers') }}
- (Optional) Commit your changes.
Usually, you’ll want to commit the packages.yml file (and the package-lock.yml file if generated) to your repository so that others on your team or your deployment pipeline know about the new dependency. Do not commit the actual dbt_packages/ directory.
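Package tests are used the same way as package macros: once the package is installed, you reference the test by its namespaced name in a properties file. A minimal sketch, assuming the example model above exposes columns named customer_id and signup_date (both column names are hypothetical):

```yaml
# models/schema.yml
version: 2

models:
  - name: example
    tests:
      # Generic test shipped with dbt_utils: fails if any
      # (customer_id, signup_date) pair appears more than once.
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - customer_id
            - signup_date
```

Running dbt test then executes this check alongside any tests defined in your own project. In dbt 1.8 and later, the same block can also be written under data_tests: instead of tests:.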
Updating a package: If you need to update to a newer version of a package, simply edit the version number in packages.yml and run dbt deps again. dbt will download the new version. Alternatively, you can run dbt deps --upgrade to automatically upgrade to the latest versions that satisfy the version ranges in your packages.yml.
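A version range is written as a list of bounds in packages.yml. A minimal sketch, assuming you are willing to accept any 1.x release of dbt_utils from 1.3.0 onward (the bounds are illustrative):

```yaml
# packages.yml
packages:
  - package: dbt-labs/dbt_utils
    # Accept 1.3.0 or newer, but stay below the next major release
    version: [">=1.3.0", "<2.0.0"]
```

With a range like this, dbt deps --upgrade can move to a newer matching release and record it in the lock file, while the committed lock file keeps everyday installs reproducible.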
Conclusion
dbt deps simplifies managing and incorporating dependencies in your dbt projects, streamlining the process of adding and updating functionality. By leveraging pre-built packages, you can reduce the time spent on custom logic and focus on building scalable data models.
For a smoother and more efficient data transformation process, you can integrate dbt with Hevo Transformer. Hevo Transformer allows you to seamlessly handle your data transformations while leveraging dbt’s capabilities, eliminating the complexity of manual data wrangling.
Frequently Asked Questions
1. How do I install dependencies using dbt deps?
To install dependencies, simply run the dbt deps command in your project directory. This will download the necessary packages defined in your packages.yml file into the dbt_packages/ directory.
2. Can I update my dbt dependencies with dbt deps?
Yes, you can update your dependencies by running dbt deps --upgrade. This will pull the latest versions allowed by the entries in your packages.yml file and refresh the lock file to match.
3. What happens if a dependency is not found when running dbt deps?
If a dependency cannot be found, dbt will raise an error indicating the issue. Make sure that every dependency listed in packages.yml is available on dbt Hub or, for custom dependencies, in a Git repository you can reach.