BEAM meets Data Vault and wham bam thank you ma’am 

As a Developer
I want to understand how I can leverage BEAM and Data Vault together
So that I can develop faster and safer.

Whether you are pipelining [[link]] your Agile delivery or managing to deliver a thin slice [[link]] every sprint iteration, both should involve BEAM to gather your data requirements in an agile and repeatable way, and Data Vault to model and load data iteratively, in a way that can be easily refactored.

One of the benefits of these two approaches is that they dovetail almost perfectly. It is as if Lawrence Corr and Hans Hultgren worked together to leverage the strengths of each of their respective methods.

As an aside, I know for a fact they haven’t, although not for lack of trying on my part to get these two gurus to find times in their busy schedules to align.

So a quick recap of the core structure of each to set the scene.

BEAM

In BEAM we capture the following artifacts:

  • Event
    Core business processes that are defined by the question of Who does What.
    Example: Customer Orders Product.
  • Detail
    Core things that comprise the event; in BEAM speak these are the 7Ws: Who, What, When, Where, Why, How and How Many.
    Example: Customer, Product, Order
  • Detail of Detail
    Things that help describe or provide context for the core things.
    Example: Customer Name, Customer Address, Customer Age, Product Name, Product Type

As an aside, I am hoping Lawrence Corr will one day do a revision of his BEAM method and rename Detail of Detail to something else. Try saying it repeatedly in a 4 hour modelstorming workshop to experience why!

Data Vault

In Data Vault we model the following artifacts:

  • Hub
    Table that only holds the keys for the core entity.
    Example: Customer Hub, Product Hub, Order Hub
  • Satellite (Sat)
    Table that holds all attributes of the core entity.
    Example: Customer Sat holds customer name, customer age and yes customer address (this is a discussion for a later article)
  • Link
    Table that holds the keys that represent the relationship of core entities that comprise a business process.
    Example: Link table with the following keys, Customer || Product || Order
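The three table types above can be sketched as simple data structures. This is only an illustrative sketch, not a full Data Vault implementation: the class names, the business key names (customer_number, product_code, order_number) and the key_columns helper are assumptions made for the example, and standard Data Vault columns such as load dates and record sources are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Hub:
    """Holds only the key of a core entity."""
    name: str
    business_key: str  # hypothetical key name, for illustration only


@dataclass
class Satellite:
    """Holds the descriptive attributes of the Hub it hangs off."""
    name: str
    parent_hub: Hub
    attributes: list[str] = field(default_factory=list)


@dataclass
class Link:
    """Holds the keys of the Hubs that take part in a relationship."""
    name: str
    hubs: list[Hub] = field(default_factory=list)

    def key_columns(self) -> list[str]:
        # A Link carries one key per participating Hub.
        return [hub.business_key for hub in self.hubs]


customer_hub = Hub("Customer Hub", "customer_number")
product_hub = Hub("Product Hub", "product_code")
order_hub = Hub("Order Hub", "order_number")

customer_sat = Satellite(
    "Customer Sat",
    customer_hub,
    ["customer_name", "customer_age", "customer_address"],
)

order_link = Link(
    "Customer Orders Product Link",
    [customer_hub, product_hub, order_hub],
)

print(order_link.key_columns())
```

The Link here holds nothing but the keys of the three Hubs, which matches the Customer || Product || Order example above.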

So let’s look at examples for these.

BEAM Event and Detail

beam-4

BEAM Detail of Detail

beam-1

Data Vault Model

agilebi-guru-dv-model-v2

Wow, look at that, BEAM and Data Vault align perfectly:

  • BEAM Event = Data Vault Link
  • BEAM Detail = Data Vault Hub
  • BEAM Detail of Detail = Data Vault Sat
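To make the mapping concrete, here is a minimal sketch that takes the BEAM artifacts from the Customer Orders Product example and derives the matching Data Vault table names. The function name beam_to_data_vault, the dictionary layout and the "Hub" / "Sat" / "Link" naming suffixes are assumptions for illustration, not part of either method.

```python
def beam_to_data_vault(event, details, details_of_detail):
    """Map BEAM artifacts to Data Vault tables (illustrative only).

    event             -> one Link
    details           -> one Hub per Detail
    details_of_detail -> one Sat per Detail, holding its attributes
    """
    hubs = [f"{detail} Hub" for detail in details]
    sats = {
        f"{detail} Sat": list(details_of_detail.get(detail, []))
        for detail in details
    }
    link = {"name": f"{event} Link", "keys": list(details)}
    return {"link": link, "hubs": hubs, "sats": sats}


model = beam_to_data_vault(
    event="Customer Orders Product",
    details=["Customer", "Product", "Order"],
    details_of_detail={
        "Customer": ["Customer Name", "Customer Address", "Customer Age"],
        "Product": ["Product Name", "Product Type"],
    },
)

print(model["link"])
print(model["hubs"])
print(model["sats"])
```

The Order Sat comes out empty here because the recap above lists no Detail of Detail for Order.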

In fact we can even take the next step and map these easily to Dimensions and Facts in a Star Schema, but I won’t.

Keep in Mind

Most practitioners in the data warehouse and Business Intelligence domain have experience modeling using the Dimensional Star Schema pattern, and so are used to the concept of a fact table. BEAM and Data Vault treat the fact record slightly differently from the Dimensional pattern.

In BEAM the fact is initially captured as part of the Event template as a How Many.  In this example for Customer Orders Product, there is a How Many of Order Value.

In Data Vault the fact is a record in the Sat that hangs off the Hub for the thing that drives the relationship for the Link. In this example, that is the Order Hub.

One of the benefits of BEAM is that it also closely aligns with a Dimensional model, and that is still the way Lawrence Corr teaches it in his excellent three day workshops. So in the BEAM templates there is a sheet for defining the fact details. At the moment I tend not to use this template and just extend the Event table to hold the How Many’s. However, if you have a large number of How Many’s (Facts), I would typically use a hybrid template to capture them.

In the AgileBI BEAM to Data Vault approach, the Verb in the BEAM Event (in this example Order) becomes a Hub in Data Vault.

The How in the BEAM Event defines the key for the Hub (i.e. Order Number). The How Many’s become entries in the Verb Sat (i.e. Order Value in the Order Sat).
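The Verb rule can be sketched in a few lines. This is a rough illustration under my own assumptions: the function name verb_ensemble and the dictionary layout are hypothetical, and only the Order, Order Number and Order Value names come from the example above.

```python
def verb_ensemble(verb, how, how_many):
    """Build the Hub and Sat for the Verb of a BEAM Event (illustrative).

    verb     -> name of the Hub (and of the Sat that hangs off it)
    how      -> key of the Hub
    how_many -> attributes stored in the Verb Sat
    """
    hub = {"name": f"{verb} Hub", "key": how}
    sat = {
        "name": f"{verb} Sat",
        "parent_hub": hub["name"],
        "attributes": list(how_many),
    }
    return hub, sat


order_hub, order_sat = verb_ensemble(
    verb="Order",
    how="Order Number",
    how_many=["Order Value"],
)

print(order_hub)
print(order_sat)
```

Adding another How Many to the Event later only means adding another attribute to the Order Sat; the Hub and the Link stay as they are.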

A World of Opinions

There are many variations and views on how Data Vaults should be defined; it seems we are at the equivalent of the Betamax vs VHS argument of old. I also know that some of the approaches above are slight variations of what is typically taught, specifically the definition of Facts for BEAM and the use of an Ensemble (Hub and Sat) in Data Vault for the Order key rather than hanging a Sat off the Link table.

However, in my view they allow closer alignment between BEAM and Data Vault, which in turn reduces the effort and latency in delivery and increases the agility of the delivery process.