Using Macros in dbt to Save Time

By Chris Knight – Senior BI Engineer

Here at Crimson Macaw, we use dbt, a data transformation tool to help build data warehouses for our clients. You can learn more about dbt here.

Macros in dbt are pieces of code that can be referenced in multiple dbt models. They ensure that the code you write is ‘DRY’ (don’t repeat yourself – a principal that is fundamental to the dbt ethos) since they can be reused across multiple dbt models. This results in code that is easier to understand and quicker to write.

Macros in dbt use jinja2; a templating language that can be used in conjunction with SQL.

They must be created in the macros directory in your dbt project, as defined in the macro-paths element within the projects.yml file as shown below.

folder path config

Example 1 – metadata generator

At Crimson Macaw, one way in which we use macros is to add standard metadata columns to our dbt models. We use metadata columns to capture data about each row such as the source system, change data capture (CDC) timestamp and operation, as well as the insert and update dates.

Having a macro that inserts these standard columns saves time as you don’t have to manually add them to each model.  If a change or addition is required, this can be done in one place, with any changes propagating to each model, which also reduces the risk for errors.

The screenshot below shows the macro code. Note the opening and closing jinja2 tags.

metadata macro When we create our dbt models, rather than manually listing every metadata column in every model that requires them, we need only add a reference to the macro, using the appropriate jinja2 syntax.

metadata macro usage

When the dbt model is compiled, the jinja2 code will be replaced by the column definitions from the macro. You can find the compiled code within the target\compiled directory within your dbt project. The compiled code for the above example; a user dimension is shown below.

metadata macro compiledAs you can see, once the model has been compiled, the jinja2 reference to the macro name is replaced by the columns specified in the macro itself.

Example 2 – date dimension

A date dimension is a common feature of many data models. Date dimensions typically include additional attributes relating to a date and allow hierarchies to be defined so that measures can be rolled up. For example they allow daily sales to be rolled up to weekly sales, and weekly sales to be rolled up to monthly sales etc, which is useful for drill-down reports. In addition, they often include details of fiscal periods, national holidays, and leap years.

date dim columns

Date dimensions will often have the same or similar structure across not only multiple data models within a project, but across multiple projects and multiple clients. This is a great scenario where a macro can save time. Here at Crimson Macaw, we have developed such a macro which accepts parameters for the start and end dates to determine the range of dates the finished dimension will contain, as well as the start of the fiscal year. This allows for slightly different requirements across projects and clients, whilst removing the need for multiple variations of the same macro thus ensuring that the code is DRY.

The macro leverages the date_spine macro from the dbt_utils package to provide the list of dates within the range specified by the start and end date parameters and uses this as the basis for the logic for each column in the date dimension. Other dbt packages can be found on the dbt hub.

To use the macro, create your date dimension dbt model and call the macro, passing in the desired variables.

dim date macro usage

To allow this macro to be portable across projects we have included it in our own custom utilities package, which we have referenced in the packages.yml file, so it can be installed in a project simply by running dbt deps.

Here is the markdown that we added to our dbt documentation, which gives further details:

date dim markdown

Conclusion

The examples we’ve looked at here only scratch the surface of what is possible. At Crimson Macaw we have developed macros for a wide variety of uses including batch control, generation of prototype data models and automated schema.yml file generation. We would recommend the use of macros whenever you have code that is repeated across multiple dbt models. If you have macros that can be used across multiple dbt projects, consider including these in a package.

The consistency afforded using macros leads to quicker cross-project familiarisation, which in turn allows us to better support our clients.

Want to find out more? Need support on your own data journey? Get in touch here.