Understanding and capturing user requirements in any technical domain is tricky and in the world of business intelligence and analytics; this is no exception. As a data consultancy it is critical for us to truly understand user requirements but also equally important that the users and stakeholders understand what they are going to get as an output.
What is an “analytical requirement”?
It’s a broad term, but these are common sentiments we see when we ask users “what do you need to understand from your data that will drive value for your business?”:
For Crimson Macaw, our goal is to create solutions that can allow these questions to be answered easily, consistently, timely and with a strong understood definition that gives users confidence to make bold data-driven decisions.
Building Analytical Models
Ultimately, capturing user requirements is used to inform data model design i.e. how to structure the data and define the data processing rules to make these requirements a reality. There are various schools of thought, but at Crimson Macaw we really value the Dimensional Modelling approach popularised by Ralph Kimball and Margy Ross.
If you have ever heard the words “facts”, “dims”, “dimensions” or “measures” when hearing discussion about data models; they were probably talking about Dimension Modelling.
In simple terms, Dimensional Modelling is the approach of storing measurable information pertaining to specific business process events into “fact” tables and storing reference information describing those events into “dimension” tables. Combined we end up with a concept of a “star schema” where multiple dimension tables can join to a fact table to describe the events that are stored.
For Crimson Macaw using Dimension Models for servicing analytical requirements provide the following benefits:
Before Sun Models
So, before we dive into what Sun Models are, let’s think about some other mechanisms that are used to capture analytical requirements:
- Enterprise Bus Matrix
- Report Wireframes
Enterprise Bus Matrix
The Enterprise Bus Matrix is a tabular capture format that shows the required intersections between “measures” and “dimensions”. It can be quite informative and shows where conformed dimensions will emerge across different business processes.
The limitation with the Enterprise Bus Matrix is that it isn’t very user friendly, and it doesn’t really capture useful information such as “what attributes related to product will I see?” or “can I drill up or down the customer group hierarchy?” since that information isn’t typically captured. Trying to capture this information in one place in a tabular format becomes very unwieldly and can disengage the user since it becomes cluttered and hard to read.
Report Wireframes are super-valuable for understanding how a user will use the analytical requirements to build reports and dashboards. We can understand what modelling considerations we need to make based on how they wish the visualisations to interact and navigate.
However, a report-centric approach to assessing requirements can produce narrow thinking. Lots of very useful information may never end up on a report or dashboard and therefore those requirements can be missed when only thinking about specific report type outputs.
Reporting requirements are a valuable input into the requirements process but it should not be the only lens to which requirements should be rationalised.
An Entity Relationship Diagram (ERD) [see the “star schema” image further up] is essentially a technical design artefact that informs a BI Engineer what columns need to be populated to fulfil the requirements. ERDs are complicated and not easy to follow for a business user, especially when the number of dimensions increase, and therefore can be disengaging as a mechanism to discuss and document requirements. There are also other pieces of useful information that are not captured such as:
- How are hierarchies arranged?
- Slowly Changing Dimension configuration
- What dynamic calculation fields are required in the BI tool? For Example, “Returns Rate %” is a calculation that has to happen at run time and cannot be stored in a table since it will return different results based on filters applied.
Enter, Sun Models
Sun Models are niche. They aren’t something many people have heard of before and googling often doesn’t produce many results. Sun Models originated from Professor Mark Whitehorn at Dundee University, but you will find similarities to Starnet or the modelling of Matteo Golferelli.
The idea behind Sun Models is to capture and visualise user requirements without having to discuss the intricacies of data modelling and other complex data terminology. Whereas some technical users may understand technical language, it is better to have a consistent format that can be used and understood by technical and non-technical users alike.
Sun Models are very easy to understand for data consumers, analysts, senior management and C-Suite. It also gives them confidence about what they should expect to see as a deliverable.
In this example Sun Model, we can see requirements captured for tracking Sales events in a retail business. This is a common data model.
Similar to the earlier mechanisms discussed we are capturing measures and dimensions; however, we now have a much richer context to work with. We can see 8 measures that are required to be grouped/sliced by 4 core dimensions, each of which have multiple attributes. Let’s check some out.
In the below image for the Product dimension, we can see the following:
- 12 Product Attributes
- 2 alternate Hierarchies driven from Product Range
The user wants to be able to drill up and down different hierarchical definitions when looking at various measures. This now means that the BI Engineer creating and populating the data model knows what columns will go into the
dim_product table, but also how they should be arranged in the BI tool as well as creating any semantic logic required in the BI Tool to create the desired drill up and drill down capability.
In the below Customer dimension we can see the following:
- 11 customer attributes
- 1 hierarchy for customer location
- Definitions for slowly changing dimensions
You will notice numbers populated in the circles. This is an extension to the sun model format that Crimson Macaw have created to define requirements for Slowly Changing Dimensions.
The numbers translate to the Slowly changing behaviour “type” that the user wishes to see:
0 = Type 0 – Take the original value and ignore any changes
1 = Type 1 – Show the current value
2 = Type 2 – Show the value based on the time of event
3 – Type 3 – 2 attributes for the original and current
6 = Type 6 – (1+2+3 combined) 2 attributes one for the current and one based on the time of the event
For information on Slowly Changing behaviour types check out this link.
The information captured for the date dimension also covers another extension to the format that Crimson Macaw created and that is to show which dimensions are “role playing”.
In the example you can see that there is a requirement to group date based on the Order Date and Despatch date. The Sun Model format captures this requirement effortlessly.
In the example below we can see that there are 8 measures required. 3 of these measures are dynamically calculated and would not be captured in the ERD. Therefore, the developer understands that there is a requirement to build those dynamic calculations in the BI tool in order to fulfil the user requirements.
One of the great things about Sun Models is that as well as giving the user an easy-to-understand representation of the requirements, it also translates perfectly into a star schema design which can be represented as an ERD. This represents the target model that will be populated by the data transformation work and is a great tool for the development team.
Inputs into Sun Models
Crimson Macaw use various information sources to populate Sun Models. Some include:
- Interviews and Workshops with users and stakeholders
- Existing reporting and dashboards
- Industry standard metrics and dimensions for the given business process
- System Data Profiling
Hopefully you are more informed about how Sun Models can help in the requirement gathering and design translation process. At Crimson Macaw we think Sun Models are fantastic and are one of many tools we use in our requirement gathering and design iteration process. Tune in to future blogs where we will cover how Sun Models fit into the wider process we employ and drive our rapid prototype mechanism that helps further bring user requirements to life.
If you want to know more and don’t want to wait for the next blog then get in touch with us here!