Enabling Data Culture in Organizations Using Purposeful Documentation

The Power of Documentation in Navigating the Data-Driven Journey

Joon Solutions
Joon Solutions Global


How much documentation do we need to change a light bulb?

A lot, as the end-users might not understand what you wrote in the first place!

Data end-users, especially non-technical business users, often spend a significant amount of time trying to understand the data presented to them before they can use its insights in their next business plans. A business analyst’s biggest fear might be hearing this sentence from the BI developer team: “Sorry, you’ve used the wrong metric for that sales report; it’s not what you think it is.” A new business campaign benchmarked against those “wrong” insights might then have to be reviewed.

The problem faced by non-technical end-users is real: they cannot navigate the data and have little control over which data dimensions are presented to them. The root cause lies in how the management team initially views and handles data as an asset, which leads to insufficient purposeful documentation in the development phase and a lack of suitable tools for navigating and understanding the data in the launch phase. So, what can be done to improve this situation?

Set documentation as a cornerstone for the data project

Documentation is often taken lightly or deprioritized, especially when tight deadlines are involved. Skipping documentation might help a team move quickly in the short term, but it creates a nuisance for end-users, who rely on it more frequently than developers do. Without documentation, it takes time to decipher the technical logic and the business requirements behind it, whereas well-written documentation saves a lot of cognitive load for users, including the developers themselves, who often become the documentation owners. Documentation should be maintained throughout the lifetime of the data so that it stays relevant as business feedback or logic requirements change. Setting the right mindset at the beginning of the project will help clear up potential inconveniences and create a smoother user experience in the long run.

Documentation in dbt

If you are using a modern data stack, where you’ve extracted and loaded all data sources into your data lake, applied dbt to transform the data, and built a beautiful data model, why not take it a step further and push these details to your end-users? Business analysts, for example, might have limited access to your dbt code but work more often in data warehouse platform UIs such as Snowflake or Google BigQuery.

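Consider enabling persist_docs, either in a config block on a single model or as +persist_docs across your entire dbt project, so that the descriptions you write in dbt are persisted as table and column comments in the warehouse. This simple step can greatly improve the lives of many BI end-users.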

For instance, you can include the following in your dbt_project.yml:

    models:
      <resource-path>:
        +persist_docs:
          relation: true
          columns: true

Or set it in the config block of a single model:

    {{ config(
      persist_docs={"relation": true, "columns": true}
    ) }}
    select ...

Check out the dbt documentation to configure persist_docs on your dbt project.
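
persist_docs only has something to push if the descriptions exist in the first place. As a minimal sketch, a properties file for one model might look like the following; the file path, the model name fct_sales, the column names, and the description wording are all hypothetical and only illustrate where the text lives.

```yaml
# models/schema.yml (hypothetical path, model, and columns)
version: 2

models:
  - name: fct_sales
    # Persisted as the table comment when relation: true
    description: "One row per completed order, used for sales reporting."
    columns:
      - name: net_revenue
        # Persisted as a column comment when columns: true
        description: "Order revenue after discounts and refunds, before tax."
      - name: discount_code
        description: "Promotional code applied at checkout; null when no code was used."
```

After the next dbt run, a business analyst browsing the table in the warehouse UI should see these descriptions next to the table and its columns, without needing access to the dbt code.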

Documentation in Looker

Different strokes for different folks: different end-users expect different depths of detail about how a metric is created, and they are best served by tools or training tailored to those expectations and needs. For example, Looker has become a strong modern data platform that lets the entire organization easily see and analyze its data, while also serving as the playing field for LookML developers with more advanced skills. A business user who wants to compare the performance of those who use a specific discount code with those who don’t only needs to know where to find the correct discount_code dimension among many. Meanwhile, a data analyst might want to check how one dataset relates to another and whether that relationship helps produce the correct result. The level of concern varies, and each level calls for a different approach. Several tools available in Looker can be introduced to different users to improve the data exploration experience: dimension definitions, the data dictionary, and the LookML diagram.

The most basic and frequently used tool is the LookML dimension/measure description, a standard parameter that can be declared on any field in a Looker model. End-users can read the definition of a field before deciding which one to use. If you have a Looker Developer role, simply declare the description of the dimension or measure in the view and save the changes; the definition for that field will then appear in Explore.
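
As a rough illustration, field descriptions in a view file might look like the sketch below. The view name orders, the sql expression, the order_count measure, and the description wording are hypothetical; only the discount_code dimension name is taken from the example earlier in this section.

```lookml
# Hypothetical view; only discount_code comes from the article's example
view: orders {
  dimension: discount_code {
    type: string
    sql: ${TABLE}.discount_code ;;
    # Shown to end-users as the field definition in Explore
    description: "Promotional code applied at checkout. Empty when the order used no code."
  }

  measure: order_count {
    type: count
    description: "Number of orders in the current selection."
  }
}
```

Writing the description for the business reader (what the field means and when it is empty) rather than for the developer (how it is computed) is what makes it useful when someone is choosing between similar-looking fields.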

Meanwhile, the Looker Data Dictionary is a summary of all the fields in a LookML project, with their definitions, data types, and how they were created. It is quite helpful for auditing, e.g. checking whether a field has a description or not, and as a collaboration medium between end-users and the data team (through comments). This tool only makes sense when sufficient descriptions have been provided in the Looker model in the step above. Read more on how to enable the Looker Data Dictionary on your LookML project and guide your users to use it.

More advanced users can look at the LookML diagram to check the relationships between datasets and so confirm the accuracy of their data analysis. More details on setting up and using the LookML diagram can be found in its documentation.

Going beyond

As the business grows with an emerging data culture, more people become involved in using data, more data sources come into play, and shared knowledge that ensures trusted and open access becomes more relevant. Data governance outgrows simple documentation and demands a consolidated source of data knowledge built on a collaborative, interactive platform where information is kept alive. The modern data stack will then likely go hand in hand with more advanced metadata management products, e.g. data catalog tools. In this manner, end-users become key drivers of data transformation and co-create to make data usable.

As companies move from an IT department staffed by IT personnel to a data analyst and then to a data team with diversified roles, their vision of improving data availability and data democratization is a long but essential walk toward data-driven solutions. To realize it, start with small steps: minding your documentation so that it serves its most direct users will introduce those changes gradually and thoughtfully, and help build a more holistic data culture in your organization.
