Imagine you’re building a complex data pipeline in dbt, and you find yourself repeating the same logic across multiple models. Every minor change requires updating dozens of files, increasing the risk of errors and inconsistencies. Now, what if you could write that logic once and reuse it everywhere effortlessly? dbt macros are what you need to write cleaner, more efficient, and maintainable SQL.
dbt (data build tool) has become an essential framework for modern data transformation, allowing teams to build modular, scalable, and version-controlled analytics workflows. However, as data models grow in complexity, managing repetitive transformations becomes a challenge.
dbt macros built with the Jinja templating engine allow you to write reusable SQL code, following the DRY (Don’t Repeat Yourself) principle to improve consistency and efficiency. Whether you’re standardizing date conversions, automating column naming conventions, or simplifying business logic across models, macros help streamline development, reduce errors, and enhance collaboration.
In this article, you will learn about dbt macros and its benefits, how to write dbt macros with Jinja with step-by-step examples, and best practices on optimizing your macros., and how to use your dbt macros on the Hevodata platform.
At the end of the article, you will understand the core concepts of dbt macros, write and test a dbt macros and use macros to optimize your data transformation workflows with Hevodata’s dbt integration, known as Hevo Transformer.
Table of Contents
dbt Macros 101: The Basics You Need to Know
What is dbt?
dbt also known as data build tool, is an open-source analytics workflow that enables data analysts and engineers to transform raw data into analytics-ready datasets using SQL in your cloud data platform and warehouses like Snowflake, BigQuery, and Redshift.
It simplifies data modeling by incorporating software engineering best practices like modularity, version control, and automated testing. Instead of manually handling complex ETL (Extract, Transform, Load) processes, dbt automates SQL transformations, allowing analysts and engineers to focus on data quality and business logic.
At its core, dbt allows teams to:
- Write SQL transformations as reusable models, macros and tests.
- Use version control (Git) to track changes and collaborate.
- Automate documentation and testing to ensure data integrity.
As projects/ data pipelines grow, maintaining SQL transformations efficiently becomes a challenge, repetitive SQL logic becomes a bottleneck and is difficult to manage —this is where Jinja templating and dbt macros come into play.
What is Jinja?
Jinja is a templating engine that allows you to generate dynamic and reusable SQL code in dbt. With Jinja, you can write SQL that adapts based on conditions, loops, and reusable functions—just like programming in Python, making transformations more flexible and scalable. An example of using Jinja in dbt is the ref function. Jinja also is the base for writing macros in dbt.
Syntax
Jinja is embedded in dbt using double curly braces ({{ }}) for expressions and curly percent signs ({% %}) for logic statements. Jinja also supports the creation of simple Python objects like variables, lists and dictionaries.
Variables
A variable is a container or name for storing data value. Variables allow you to store values that can be reused across different parts of your SQL code. This makes it easier to manage dynamic logic and ensures consistency.
Syntax
To set a variable in Jinja, you use the set tag, followed by the variable name and value:
{% set variable_name = value %}
For instance, let’s say you work for an online retail company and need to reference product categories in your sales models more often, you can create a variable named product_categories that contains a list of product categories like this:
{% set product_categories = ['electronics', 'furniture', 'clothing'] %}
Calling product_categories will return a list of the predefined categories.
Placeholders
Anything between double curly brackets in jinja represents a variable, expression or function that can be inserted dynamically into SQL. Common placeholders in dbt are the ref and source functions.
The ref()
function is used to reference other dbt models within a project. It ensures that dependencies between models are correctly managed. dbt automatically compiles the ref() function into the correct schema and table name.
SELECT * FROM {{ ref('orders') }}
If orders is a dbt model, dbt will replace {{ ref(‘orders’) }} with the actual schema and table name, such as:
SELECT * FROM analytics.orders
The source()
function is used to reference a raw data source defined in your sources.yml file.
SELECT * FROM {{ source('sales', 'orders') }}
If your sources.yml is defined as this:
version: 2
sources:
- name: sales
tables:
- name: orders
dbt will compile it to look like this:
SELECT * FROM raw.sales.orders
Conditionals
Conditionals in dbt are used to determine the logic flow in your SQL code based on some conditions. Jinja provides if
, elif
, and else
statements to handle conditional logic inside dbt models and macros. These statements allow dbt to generate different SQL depending on conditions. Every conditional must have an if and else statements, elif also known as else if
Syntax
{% if condition %}
-- SQL executed if condition is true
{% elif another_condition %}
-- SQL executed if the first condition is false but this condition is true
{% else %}
-- SQL executed if none of the conditions are true
{% endif %}
Imagine we have an orders table where some regions use different currencies (USD, EUR, GBP). We need to apply currency conversion rates dynamically. Region is a variable that will be passed in when running in dbt.
SELECT
order_id,
customer_id,
region,
total_amount,
{% if var('region') == 'US' %}
total_amount * 1.0 AS converted_price -- USD (no conversion)
{% elif var('region') == 'EU' %}
total_amount * 0.85 AS converted_price -- Convert to EUR
{% elif var('region') == 'UK' %}
total_amount * 0.75 AS converted_price -- Convert to GBP
{% else %}
total_amount AS converted_price -- Default, no conversion
{% endif %}
FROM {{ ref('orders') }}
For Loops
For loops in Jinja allow you to iterate over lists, dictionaries and sequences to generate repetitive SQL statements. This is useful for generating SELECT
statements, UNION
queries, or applying transformations across multiple fields. You can also use an if statement inside a for loop to apply conditional logic, such as excluding specific values or applying different logic to the first or last iteration.
Syntax
{% for item in iterable %}
{{ item }}
{% endfor %}
{% … %}: Jinja control structures (not rendered in the final SQL).
{{ … }}: Expressions that are evaluated and rendered.
Within a for loop, Jinja provides a special loop object that gives metadata about the iteration:
Property | Description |
loop.index | 1-based index (starts from 1) |
loop.index0 | 0-based index (starts from 0) |
loop.first | True for the first iteration |
loop.last | True for the last iteration |
loop.length | Total number of iterations |
{% set columns = ['id', 'name', 'age', 'email'] %}
SELECT
{% for column in columns %}
{{ column }}{% if not loop.last %}, {% endif %}
{% endfor %}
FROM users;
The columns list contains column names. The for loop iterates over the list and generates a SELECT statement. The loop.last condition ensures that the last column does not get a trailing comma.
Here is the compiled SQL code:
SELECT
id, name, age, email
FROM users;
What exactly are dbt macros?
A dbt macro is a reusable SQL function that lets you define logic once and call it anywhere in your project across multiple models. Instead of writing the same SQL over and over, you define a macro once and reuse it wherever needed – just like we have functions in conventional programming. Unlike dbt models, which create tables or views, macros return SQL expressions that are dynamically inserted wherever they are called.
This table summarizes how macros are different from models in dbt.
dbt Models | dbt Macros | |
Purpose | Transform and store data | Reuse SQL logic dynamically |
Execution | Run as a query in the database | Generate SQL dynamically before execution |
Reusability | Limited to specific models | Can be used across multiple models |
How to Write and Use Your First Macro
We are going to use a simple example to teach how to write macros in dbt.
For sample data with product name, category and price columns, you may want to calculate the total price of items based on category. This can be simple if you have two or three categories. However what happens when you add new categories or even rename a category, it becomes repetitive and hard to track. With dbt macros, you only have to write your SQL once and call it everywhere else.
- Identify the repetitive logic in your workflow.
- In your dbt project, create a new file in the macros folder. The file name should be descriptive of what the file does. In this case, we are going with macros/total_price_by_category.sql
- Write the macro
{# calculates the total price by category #}
{% macro total_price_by_category(table_name, category_column, price_column) %}
select {{ category_column }} as category, sum({{ price_column }}) as total_price
from {{ table_name }}
group by {{ category_column }}
{% endmacro %}
This macro takes a table name, category column, and price column as inputs, aggregates PRICE per CATEGORY and returns the total price for each category.
- Use the macro in your model
After writing the macro, use it in a model to ensure it works as expected. In your models folder, create a file category_prices that uses the total_price_by_category macro.
{{ total_price_by_category(source("RAW", "PRODUCTS"), "CATEGORY", "PRICE") }}
Here is the dbt implementation and the compiled SQL code from the macro:
The resulting table looks like this:
dbt Packages
dbt packages are reusable collections of macros, models, tests, and other components that enhance your data transformation workflows. Instead of writing complex logic from scratch, you can leverage these pre-built solutions to streamline development, enforce best practices, and improve maintainability. Some of the most widely used dbt packages that include useful macros are:
Here are some popular dbt packages that include useful macros:
- Dbt-utils:
- Dbt-expectations: A package that enables data testing within dbt.
- Useful macros:
- expect_column_values_to_not_be_null(): Ensures a column has no null values.
- expect_row_values_to_match_other_table(): Validates if data is consistent between tables.
- Useful macros:
How to Use a Macro in a dbt Package
Step 1: Install the Required dbt Package
Add the required package to your packages.yml file in your dbt project. For example, to install dbt-utils:
packages:
- package: dbt-labs/dbt_utils
version: 1.0.0
Then, run the following command to install the package:
dbt deps
This downloads and installs the package into your project.
Step 2: Use a Package Macro in Your dbt Model
Once installed, you can use package macros in your dbt models.
For example, if you need to create a unique product identifier by hashing multiple columns, you can use generate_surrogate_key()
from dbt-utils:
SELECT
{{ dbt_utils.generate_surrogate_key(['category', 'product_name']) }} AS product_id,
category,
product_name
FROM {{ ref('products') }}
When compiled, this macro generates a unique hashed key for each product using product_id and category
Best Practices When Using Macros
Look out for repetitive patterns in your SQL code
If you find the same code across different models, you should consider writing a macro for reusability.
Keep Macros Modular
Macros should focus on one specific function rather than trying to do everything.
Use clear and consistent naming conventions
Your macro name should be descriptive and clear to anyone.
Set Variable Names at the top of a model
When using macros inside models, define important variables at the beginning of the SQL file. This makes it easier to adjust values without modifying multiple lines of code.
Document your macros
Each macro should include a brief description of what it does, parameters and expected data types.
Test Your Macros
Since you will be using a macro across multiple models, it is important to test and ensure it is working as expected.
Conclusion
Hevo Transformer empowers data teams to streamline their transformation workflows by integrating seamlessly with dbt, enabling you to run macros efficiently within a no-code environment. With Hevo Data’s robust automation, you can ingest data from over 150+ sources and transform it effortlessly using dbt, eliminating manual complexities. Whether you’re standardizing transformations, optimizing queries, or implementing dynamic logic, Hevo Transformer simplifies the process, helping you focus on insights rather than infrastructure. Try Hevo Transformer today and experience a seamless, automated approach to data transformation.