Here at Crimson Macaw, one of our favourite ELT tools for Data Warehousing is Matillion, and we have used it to deliver many complex solutions for our clients.
Matillion has many useful features that can help you orchestrate your ELT pipeline, including precedence constraints, which let you easily set custom actions to perform on success and on failure.
In this blog post we will provide an overview of how we manage orchestration here at Crimson Macaw using templates, and in future posts we will walk you through how to set some up to handle an ELT pipeline, and to handle a few other scenarios too.
While Matillion allows a developer to easily manage their pipeline and handle success and failure for the more complex tasks, the development canvas can start to get a little busy. The main reason for this is the need to have additional components running off each transformation task to handle success and failure. This can make it difficult to follow the flow of the data as the pipeline progresses, and it is inefficient because there is a lot of duplication. There is an additional challenge: if we want to change our pattern, we have to make that change everywhere the pattern is used, which on larger projects can translate into hundreds of changes.
Within Matillion there are two types of tasks: Orchestration and Transformation. Orchestration tasks are extremely powerful because they allow you to call not only Transformation tasks but other Orchestration tasks too. Additionally, you can pass variables in, which allows you to encapsulate reusable logic in a single task. This logic could include things such as writing to audit tables, writing metrics to CloudWatch, e-mailing a distribution list on failure, and facilitating full and incremental loads.
You can easily pass in scalar or grid variables using a simple interface.
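To make the idea concrete outside the Matillion canvas, here is a minimal Python sketch of the same pattern: one reusable template that runs any task it is given and handles success and failure in a single place. It is an analogy only, not Matillion's API; the names `RunTemplate`, `load_customers` and the `load_type` variable are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class RunTemplate:
    """Hypothetical stand-in for a reusable orchestration template."""
    audit_log: List[Dict[str, str]] = field(default_factory=list)
    alerts: List[str] = field(default_factory=list)

    def run(self, task_name: str,
            task: Callable[[Dict[str, str]], None],
            variables: Dict[str, str]) -> bool:
        """Run one task with its variables and record the outcome."""
        try:
            task(variables)
        except Exception as exc:
            # Failure handling lives here once, not beside every task.
            self.audit_log.append(
                {"task": task_name, "status": "FAILED", "detail": str(exc)})
            self.alerts.append(f"{task_name} failed: {exc}")
            return False
        self.audit_log.append(
            {"task": task_name, "status": "SUCCESS", "detail": ""})
        return True


def load_customers(variables: Dict[str, str]) -> None:
    """Hypothetical transformation task that validates its input variable."""
    if variables.get("load_type") not in {"full", "incremental"}:
        raise ValueError("unknown load_type")


template = RunTemplate()
ok = template.run("load_customers", load_customers, {"load_type": "full"})
bad = template.run("load_customers", load_customers, {"load_type": "partial"})
```

Because the audit and alert steps sit inside `run`, changing how failures are reported means editing one method rather than every task that uses it.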
This leads to a much tidier and more concise development environment, which makes managing process flow and debugging significantly easier. There is the added advantage that if we use a single template to manage all our orchestration tasks, any change of pattern requires only a single change, saving time and reducing the potential for silly mistakes.
In this series we are going to walk you through the process of creating Orchestration templates that will cover most of the common use cases such as:
- Master templates that will manage both full and incremental builds and encapsulate audit reporting
- Templates to run individual transformation tasks
- Iteration templates that allow you to quickly automate repetitive tasks
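As a rough preview of the iteration idea, the sketch below runs one task once per row of a small grid, with each row supplying that run's variables, including whether the load is full or incremental. This is plain Python used as an analogy, not Matillion functionality; `iterate_template`, `build_load_statement` and the `table`, `load_type` and `watermark` fields are hypothetical names.

```python
from typing import Callable, Dict, Iterable, List


def iterate_template(rows: Iterable[Dict[str, str]],
                     task: Callable[[Dict[str, str]], str]) -> List[str]:
    """Run the same task once per row, using the row as its variables."""
    return [task(row) for row in rows]


def build_load_statement(variables: Dict[str, str]) -> str:
    """Hypothetical task: describe a full or incremental load for a table."""
    table = variables["table"]
    if variables["load_type"] == "full":
        return f"reload all rows into {table}"
    return f"load rows changed since {variables['watermark']} into {table}"


# Each row plays the role of one set of variables passed to the template.
grid = [
    {"table": "customers", "load_type": "full", "watermark": ""},
    {"table": "orders", "load_type": "incremental", "watermark": "2024-01-01"},
]

results = iterate_template(grid, build_load_statement)
for line in results:
    print(line)
```

Adding another table to the pipeline then means adding a row to the grid rather than copying components.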
Want to know more and cannot wait for the next instalment? Get in touch with us here.