Chapter 17: BigQuery: Hands-on Workshop
17.2. Planning the Data Pipeline
In this lesson, we'll create a data pipeline from scratch to extract data from Google Analytics and Google Search Console, process it in BigQuery, and visualize it in Looker Studio.
Extracting Data
First, we want to grab data such as session, landing page, number of sessions, and poll completions from Google Analytics. We also want to extract search query, landing page, impressions, and clicks data from Google Search Console. To do this, we'll use Supermetrics for Sheets to connect to both APIs and pull the data into separate Google Sheets.
Moving Data to BigQuery
Instead of performing all the data modeling in Sheets like we used to do traditionally, we will bring each Google Sheet into BigQuery as separate tables. So we will have two BigQuery tables representing the extracted data from both sources.
Processing Data in BigQuery
Once our tables are created in BigQuery, we'll perform all necessary operations like cleaning up, processing, joining, merging and calculations there. We will cover these operations in Data Modeling in BigQuery lesson.
Connecting Looker Studio
Finally, after processing our data in BigQuery, we'll connect BigQuery directly to Looker Studio for visualization purposes.
Understanding The Data Relationship
It's important to note that multiple search queries can point to one landing page. There isn't a one-to-one relationship between queries and landing pages. All these queries with their impressions and clicks are creating sessions in Google Analytics.
We need a way to calculate the share that each query has in creating these sessions based on their clicks' contribution because the value generated cannot be directly attributed entirely to each query. If you are curious about this way, you can check calculating % share of total lesson!
Calculating Attribution Shares
To calculate attribution shares for each query:
- Calculate the share of total depending on clicks.
- Multiply the total number of sessions by that share of total.
This calculation will give us an accurate distribution of attribution for each query in terms of sessions, conversions, and value.
Final Output
After all the calculations and data analysis are done in BigQuery, we'll have a comprehensive view of the data in Looker Studio, making it easier to understand and visualize the relationship between queries, clicks, sessions, conversions, and generated value.
📩 Receive my weekly Looker Studio tips
🖇 Connect with me on LinkedIn