
Overview

What is a Decoder?

A Decoder restructures and augments inbound data to optimise the structure and contents for query and analysis.

It is installed between standard Google-managed inbound BigQuery data transfers and subsequent transformation, analysis or automation activities.

flowchart LR
 source_data(["Source Data"]) --> inbound_data["Inbound<br>Data Transfer"] --> decoder{{"Decoder"}} --> outbound_data["Outbound<br>Data Assets"] --> downstream_activities_gc(["Downstream<br>Data Activities"])

 style decoder fill:#ffffff,stroke:#333,stroke-width:2px

It efficiently and responsively curates downstream data assets and is configurable upon deployment, with pre-configured installation functions for connecting to specific business intelligence tools, cross-cloud storage buckets or even Google Bigtable.

It is responsive to newly arriving data in a cost-optimal manner and requires minimal ongoing maintenance or monitoring.

Why would you need one?

Inbound data structures are typically designed for optimal storage and schema stability, not for simplicity of modelling, query or analytics. Specifically, nested data structures pose challenges as the structure of the nested data needs to be known ahead of time in order to extract meaningful data. This also applies to JSON data.

Historically, transforming this data has required painstakingly hand-crafting queries in a verbose, unintuitive and uncommon query syntax, slowing down subsequent data-related activities.
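For example, flattening a repeated nested field alongside a JSON column in standard BigQuery SQL requires explicit UNNEST joins and JSON extraction functions. The table and column names below are purely illustrative:

```sql
-- Illustrative only: flattening a repeated RECORD field and extracting
-- a value from a JSON column in standard BigQuery SQL.
-- Table and column names are hypothetical.
SELECT
  t.event_date,
  item.product_id,
  item.quantity,
  JSON_EXTRACT_SCALAR(t.payload, '$.user.country') AS user_country
FROM
  `project.dataset.raw_events` AS t,
  UNNEST(t.items) AS item
WHERE
  JSON_EXTRACT_SCALAR(t.payload, '$.user.country') IS NOT NULL;
```

Queries like this must be written per-field and break silently if the nested structure changes, which is the maintenance burden a decoder is designed to remove.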

Many Google-managed inbound data transfers also do not arrive at a consistent time every day, which can be challenging to control for efficiently and reliably.

How does it work?

Decoders are installed by executing a BigQuery function with installation-specific configuration arguments. Installation involves the following automated stages:

  1. Profile data structure, contents and types
  2. Build decoder functions based on configuration and data profile
  3. Build output assets (typically date-partitioned tables)
  4. Deploy automation function

flowchart LR
 profile_data["Profile<br>Data"] --> build_decoder_functions["Build<br>Decoder<br>Functions"] --> build_output_assets["Build<br>Output<br>Assets"] --> deploy_automation_function["Deploy<br>Automation<br>Function"] 
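Conceptually, installation is a single function call. The sketch below assumes a procedure-style interface; the actual function name and configuration arguments depend on the specific decoder installation:

```sql
-- Hypothetical installation call. The procedure name and argument
-- shapes here are illustrative, not the product's documented API.
CALL `decoder_project.installer.INSTALL_DECODER`(
  'project.inbound_dataset',                 -- inbound data transfer dataset
  'project.decoder_dataset',                 -- target dataset for output assets
  '{"partition_field": "event_date"}'        -- installation-specific configuration
);
```

The profiling, function-building and deployment stages then run automatically inside BigQuery.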

Once a decoder is installed, the output data assets can then be:

  • queried directly;
  • connected directly to business intelligence tools;
  • used as an input to simplified transformation models; and
  • extended with SQL to customise subsequent data transformation.
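Once decoded, an output asset behaves like any other date-partitioned BigQuery table, so querying it directly is straightforward. Table and column names below are illustrative:

```sql
-- Hypothetical query against a decoded, date-partitioned output table.
-- Filtering on the partition column keeps the query cost-efficient.
SELECT
  product_id,
  SUM(quantity) AS total_quantity
FROM
  `project.decoder_dataset.events_decoded`
WHERE
  event_date BETWEEN '2024-01-01' AND '2024-01-31'
GROUP BY
  product_id;
```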

Decoders are built using a hierarchy of functions and use native (but extended) BigQuery functionality, so they do not require external platforms or API calls. Decoders can be deployed on whitelisted datasets by users with the appropriate permissions on the specific installation function.

A simple RUN_FLOW automation function is deployed in the decoder dataset, which calls an external function to check and compare inbound shard and outbound partition metadata. New date partitions are identified, transformation/augmentation logic is applied to the newly-arrived data and the output data tables are updated incrementally.

This is triggered on a regular schedule using a simple BigQuery scheduled query, and can also be called manually to refresh specific partition subsets of the outbound tables.
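The RUN_FLOW invocation might look like the following sketch, assuming a procedure-style interface (the argument shape for the manual refresh is hypothetical):

```sql
-- Scheduled invocation: identifies and processes any newly-arrived
-- inbound shards, updating output partitions incrementally.
CALL `project.decoder_dataset.RUN_FLOW`();

-- Manual invocation for a specific partition subset
-- (hypothetical signature, shown commented out):
-- CALL `project.decoder_dataset.RUN_FLOW`(['2024-01-01', '2024-01-02']);
```

Scheduling the first form as a BigQuery scheduled query is enough for routine operation, since the function is incremental and skips partitions that are already up to date.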

Access to automation functions is granted to permitted users on whitelisted datasets.

Where is it installed?

A decoder is installed in a configurable dataset within the same region as the inbound data; typically this is the same dataset as the inbound data itself.