17.9. Data Modeling in BigQuery

In this lesson, we'll walk through data modeling in BigQuery using Google Search Console and Google Analytics data as examples.

Preparing the Data

First, let's prepare our data. We have two datasets: Google Search Console and Google Analytics. We'll start by removing rows with zero clicks from the Google Search Console dataset, as they don't provide any valuable insights for our analysis.

SELECT *
FROM search_console_data
WHERE clicks > 0

Next, we need to filter the Google Analytics dataset to only include traffic from Google Organic:

SELECT *
FROM google_analytics_data
WHERE source_medium = 'google/organic'

Standardizing Landing Pages

In order to join these two tables together, we need to standardize the format of the landing pages. For example, we need to remove https:// and ensure that both formats have a trailing slash at the end.

For Google Search Console data:

SELECT
  search_query AS search_query,
  REPLACE(landing_page, 'https://', '') AS landing_page,
  impressions AS impressions,
  clicks AS clicks
FROM search_console_data
WHERE clicks > 0

For Google Analytics data:

SELECT
  REGEXP_EXTRACT(landing_page_path, r'^[^?]*') AS landing_page,
  source_medium AS source_medium,
  sessions AS sessions,
  goal_completions_all_goals AS goal_completions,
  total_goal_value_all_goals AS goal_value
FROM google_analytics_data_cleaned_up
WHERE source_medium = 'google/organic'

Now that we've standardized the landing pages in both datasets, we can proceed with joining them.

Joining Tables

We want to join these tables using a left join, with the Google Search Console dataset as the left table and the Google Analytics dataset as the right. We'll use the landing_page field as our join key.

SELECT
  search_query,
  landing_page,
  impressions,
  clicks,
  sessions,
  goal_completions,
  goal_value
FROM search_console_data_cleaned_up
LEFT JOIN google_analytics_data_cleaned_up
ON search_console_data_cleaned_up.landing_page = google_analytics_data_cleaned_up.landing_page

With this data, we can now analyze the impact of search queries on site traffic and conversions. Remember to save your queries so you can easily access them later!

Previous
Previous

17.10. SQL Join in BigQuery

Next
Next

17.8. Writing SQL in BigQuery