9.5. Data Blending: Behind The Scenes - Live Demo
Understanding Data Blending with a Live Example
We previously explored data blending in slides. Now, let's examine how it works with real tables in Looker Studio.
Imagine we have two tables: one with dates and clicks from the 1st to the 6th of March 2021, and another with dates and sales for some of those dates. However, we don't have the number of sales for all of the dates.
When data blending is applied, it will return a table with all the dates from the left data source, clicks from the left data source, and sales where it can find it from the right data source. When it cannot find matching data, it will show "null" values.
Key Takeaways from the Live Example
- Joint key: In this example, the joint key is "Date." It is essential to have one or more joint keys to blend data.
- Left join: In this scenario, we use a left join with one condition – that the date in Table 1 should exactly match the date in Table 2 for data from Table 2 to be written.
- Nulls: Null values are not zeros or empty strings. They essentially mean "nothing" and don't have a field type. It's crucial to differentiate null values from zero, especially when dealing with missing data.
- Missing data representation: In Looker Studio, you can choose how to represent missing data in tables. Options include showing "null" as it is, a dash, "no data," zero, or blank. However, zero and blank can be confusing, as they may not accurately represent the data.
Remember that null values might indicate gaps in your data that need to be fixed. It's good practice to show "null" unless you're certain that missing data means zero.
Dealing with Null Values and Related Issues
When calculating averages, nulls can affect the results. Let's explore how Looker Studio handles nulls and the potential problems with replacing null values with zeros.
Example: Averaging Sales with Null Values
Suppose you want to calculate the average of sales data that includes null values. When Looker Studio calculates the average, it correctly ignores the null values. However, if you replace the null values with zeros, the calculated average might appear incorrect, especially to someone manually calculating it. This is why showing null values instead of zeros is a better practice.
Common Issues in Blending Data
In some cases, you may encounter data application issues when blending data. For example, imagine you have data on clicks and sales for different states on different days. When blending this data, you may end up with duplicate click values, which can lead to incorrect results.
To fix data application issues, you can either:
- Aggregate the right dataset: Remove the unnecessary field (e.g., state) from the right dataset, causing Looker Studio to aggregate the sales data by date. This will result in one sales value for each day, ensuring correct blending.
- Break down the data: Break down the data by both date and state (e.g., clicks from New York and clicks from California). This will allow you to join the data based on two different values (date and state), which will ensure accurate reporting.
Keeping Blended Data Simple
When creating blended data sources in Looker Studio, it's easy to get carried away and include numerous metrics and dimensions. However, this can lead to overcomplicated and confusing data sources, which might result in incorrect numbers or duplicate values.
To avoid these issues, it's best to keep your blended data sources as simple as possible. Focus on including only the necessary metrics and dimensions for the resulting chart or visualization you want to create. This will make it easier to manage, understand, and troubleshoot your data blends.
Remember, a simple and streamlined approach to data blending can save you time and prevent potential errors in your reports.