Decoded GA4 as a clean
Bruin source — flat YAML,
flat columns, no UNNEST.
Bruin's pitch is simple: define SQL and Python assets with thin YAML metadata, manage dependencies, run against BigQuery. The cleanness of that model falls apart the moment your source is the GA4 export, because every asset that touches it has to flatten event_params first. Decode GA4 keeps Bruin honest. The decoded events table reads like a regular BigQuery source, and your assets stay short.
Bruin's appeal is short YAML and short SQL. Point Bruin at the raw GA4 export and you lose both. Decoding upstream is what makes the YAML-first model actually work for GA4.
The shape of the problem
The GA4 BigQuery export stores every event parameter inside a repeated record. To pull a single page_location into a Bruin SQL asset you write a correlated subquery against event_params, sketched below. To get ga_session_id, another. To get page_referrer, another. The asset SQL grows long while the YAML metadata stays short, and the imbalance is awkward: Bruin is supposed to be the place where the SQL stays small and the orchestration stays declarative.
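For concreteness, here is the classic extraction pattern against the standard export schema; the analytics_123456 dataset name is a placeholder for your own GA4 property:

```sql
-- The traditional way: one correlated subquery per parameter.
SELECT
  event_date,
  event_name,
  (SELECT value.string_value
     FROM UNNEST(event_params)
    WHERE key = 'page_location') AS page_location,
  (SELECT value.int_value
     FROM UNNEST(event_params)
    WHERE key = 'ga_session_id') AS ga_session_id
FROM `your-gcp-project-id.analytics_123456.events_*`
WHERE event_name = 'page_view'
```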
Why this hurts Bruin projects specifically
Two reasons. First, Bruin's AI-assisted authoring works far better when source columns are already named clearly. UNNEST scaffolding is the worst possible context for an LLM-assisted edit. Second, Bruin's strength is dependency management across SQL and Python assets. Pointing those dependencies at a source that silently drops new GA4 parameters means downstream Python assets can run successfully against incomplete data, and you only find out when the dashboard numbers stop matching the GA4 UI.
What changes with Decode GA4
Decode writes a flat events table into a BigQuery dataset of your choice. You add a BigQuery connection to .bruin.yml, then create assets that select directly from your-gcp-project-id.bruin_sources.events. There is no UNNEST in any asset, anywhere. When a new event parameter starts appearing in the export, the next decode run picks it up and your downstream assets see it on the next bruin run.
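The asset body against the decoded table can be as plain as this sketch, using the same placeholder names as above:

```sql
-- The same columns from the decoded table: no UNNEST, no subqueries.
SELECT
  event_date,
  event_name,
  page_location,
  ga_session_id
FROM `your-gcp-project-id.bruin_sources.events`
WHERE event_name = 'page_view'
```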
The traditional approach
UNNEST inside a Bruin SQL asset
Write a base events asset with a CROSS JOIN UNNEST per parameter. Refresh costs scale with the number of parameters. Every new GA4 parameter requires a code change. Bruin's promise of small, focused SQL assets goes out the window for the one source most analytics work depends on.
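Whether you unnest per parameter or unnest once and pivot, the effect is the same; a sketch of the common pivot variant, against the placeholder export dataset used above:

```sql
-- Unnest once, pivot each parameter back out; every new parameter
-- means another MAX(CASE ...) line and a code change.
SELECT
  user_pseudo_id,
  event_timestamp,
  event_name,
  MAX(CASE WHEN ep.key = 'page_location' THEN ep.value.string_value END) AS page_location,
  MAX(CASE WHEN ep.key = 'ga_session_id' THEN ep.value.int_value    END) AS ga_session_id
FROM `your-gcp-project-id.analytics_123456.events_*`
CROSS JOIN UNNEST(event_params) AS ep
GROUP BY user_pseudo_id, event_timestamp, event_name
```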
Adapt a public GA4 package
Lift staging models from a community dbt-for-GA4 package and rewrite them as Bruin SQL assets. They flatten more events than you need, materialise intermediate tables, and lag behind whatever Google last shipped. You inherit modelling decisions you did not make and a release cadence you do not control.
Run a separate flattener before Bruin
Stand up Cloud Functions or a Python job to flatten GA4 and write back to BigQuery, then point Bruin at that. Two pipelines, two failure modes, two places where the schema can drift. Bruin's dependency graph looks tidy. The thing it depends on is now a separate maintenance project.
Decode GA4 vs DIY UNNEST in Bruin
| Feature | Decode GA4 source | Hand-built Bruin staging asset |
|---|---|---|
| SQL to flatten a page_view | 3 lines | ~18 lines per parameter |
| New GA4 parameter handling | Auto-detected, appears in source | Manual SQL update required |
| Refresh cost | Reads compact Parquet; the raw export is never rescanned | Full UNNEST scan of the raw export on every refresh |
| YAML asset readability | Short SQL, clean dependencies | Long SQL, noisy diffs |
| AI-assisted authoring quality | Clear column names, predictable shape | UNNEST scaffolding confuses suggestions |
| Maintenance over a year | Zero | Recurring, every GA4 release |
One install. A clean BigQuery source. Bruin handles the rest.
- [ 1 ]
Subscribe via Google Cloud Marketplace
Decode GA4 is a Marketplace listing. Usage-based pricing, no monthly minimum. The subscription takes under a minute and billing appears on your existing GCP invoice.
- [ 2 ]
Deploy with the events_external template
Pick a BigQuery dataset that Bruin will use as a source — for example bruin_sources. Set destination_dataset_id to that dataset. Decode writes a Parquet-backed external table called events into it.
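If you want to verify the deploy before wiring up Bruin, the standard bq CLI can describe the table; the project and dataset names below are the placeholders used throughout this page:

```sh
# Confirm the external table landed where Bruin expects it.
bq show --format=prettyjson your-gcp-project-id:bruin_sources.events
```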
- [ 3 ]
Add a BigQuery connection to .bruin.yml
Use Application Default Credentials or a service account file. Bruin now knows how to talk to the project. No additional warehouse setup is needed for the decoded events table — it is already a normal BigQuery table from Bruin's perspective.
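A minimal connection block might look like the following sketch; treat the exact key names as assumptions and confirm them against Bruin's connection docs:

```yaml
# .bruin.yml sketch: a single BigQuery connection for the default environment.
environments:
  default:
    connections:
      google_cloud_platform:
        - name: gcp-default
          project_id: your-gcp-project-id
          # Or rely on Application Default Credentials instead of a key file
          # (see Bruin's docs for the ADC configuration).
          service_account_file: /path/to/service-account.json
```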
- [ 4 ]
Run bruin run as normal
Create SQL assets that select from your-gcp-project-id.bruin_sources.events. Bruin resolves dependencies, runs assets in order, and reports status. The decoded events table updates daily and your assets pick up the new partitions on the next run.
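Putting it together, a minimal asset might look like this sketch, assuming Bruin's embedded-YAML asset header; the asset name is illustrative:

```sql
/* @bruin
name: analytics.daily_pageviews
type: bq.sql
materialization:
  type: table
@bruin */

-- Daily page_view counts straight off the decoded table.
SELECT
  event_date,
  COUNT(*) AS pageviews
FROM `your-gcp-project-id.bruin_sources.events`
WHERE event_name = 'page_view'
GROUP BY event_date
```

Running bruin run from the pipeline root executes this asset together with everything downstream of it.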
HOW THE SETUP WORKS
Wire decoded GA4 into a Bruin pipeline in four small steps. Nothing here is Bruin-specific magic — it is the same source pattern you already use for any other BigQuery table.
GCP
Run the Decode GA4 installer with destination_dataset_id pointing at your Bruin sources dataset.
GCP
Grant the BigQuery service account or user Storage Object Viewer on the Decode GCS bucket.
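With standard gcloud tooling the grant looks roughly like this; the bucket and service-account names are placeholders, so substitute the bucket from your Decode deploy:

```sh
# Give the principal that runs Bruin's queries read access to the Parquet files.
gcloud storage buckets add-iam-policy-binding gs://your-decode-ga4-bucket \
  --member="serviceAccount:bruin-runner@your-gcp-project-id.iam.gserviceaccount.com" \
  --role="roles/storage.objectViewer"
```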
Bruin
Install the Bruin CLI and add a BigQuery connection block to .bruin.yml.
Bruin
Create SQL assets that select from your-gcp-project-id.bruin_sources.events. Run bruin run.
The events table is a BigQuery external table backed by Parquet files in GCS. Bruin reads it natively through BigQuery, but the underlying storage stays in your project. No data leaves your perimeter, and there is no extra ingestion step for Bruin to schedule.
What you get in Bruin
A first-class BigQuery source
The events table is referenced through a normal FROM clause inside any SQL asset. Bruin's dependency resolver picks it up like any other source. No special handling required.
Direct event parameter columns
page_location, page_referrer, page_title, ga_session_id, ga_session_number — every standard parameter is a direct column. No correlated subqueries inside your SQL assets.
External table economics
The source reads from Parquet files in GCS. There is no duplicate storage in BigQuery, and Bruin runs against decoded data without paying to scan the raw GA4 export.
Schema evolution that just works
When GA4 adds a new event parameter, the next decode run picks it up. Bruin assets that need the new column can reference it immediately. Assets that do not are unaffected.
Short SQL, clean YAML
Asset SQL stays short and readable. Bruin's YAML metadata stays declarative. The diff between two pipeline versions stops being a wall of UNNEST changes and starts being actual business logic.
Better AI-assisted authoring
Bruin's AI-assisted authoring works far better against clearly named columns than against nested-record UNNEST patterns. Suggestions stop trying to recreate scaffolding you no longer need.
Marketing attribution pipelines
Build a session_facts SQL asset on top of the decoded events table without writing a single UNNEST. Source-medium, campaign, and landing page are direct columns. The path from raw event to attribution mart shrinks from five intermediate assets to one.
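A sketch of what that single asset can look like; the attribution column names (source, medium, campaign) are assumptions here, so check them against your decoded events schema:

```sql
/* @bruin
name: analytics.session_facts
type: bq.sql
materialization:
  type: table
@bruin */

-- One row per session; attribution fields are plain columns, no UNNEST.
SELECT
  user_pseudo_id,
  ga_session_id,
  MIN(event_timestamp)                   AS session_start,
  MIN_BY(page_location, event_timestamp) AS landing_page,
  ANY_VALUE(source)                      AS source,
  ANY_VALUE(medium)                      AS medium,
  ANY_VALUE(campaign)                    AS campaign
FROM `your-gcp-project-id.bruin_sources.events`
GROUP BY user_pseudo_id, ga_session_id
```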
Funnel analysis with Python downstream
Standard ecommerce funnels — view_item, add_to_cart, begin_checkout, purchase — become readable case statements in the SQL asset. The Python asset that runs cohort analysis on top reads from a clean intermediate table rather than re-doing the flattening.
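For illustration, a per-user funnel table really is just one readable case statement per step:

```sql
-- Funnel flags per user; feed this to a downstream Python cohort asset.
SELECT
  user_pseudo_id,
  MAX(CASE WHEN event_name = 'view_item'      THEN 1 ELSE 0 END) AS viewed_item,
  MAX(CASE WHEN event_name = 'add_to_cart'    THEN 1 ELSE 0 END) AS added_to_cart,
  MAX(CASE WHEN event_name = 'begin_checkout' THEN 1 ELSE 0 END) AS began_checkout,
  MAX(CASE WHEN event_name = 'purchase'       THEN 1 ELSE 0 END) AS purchased
FROM `your-gcp-project-id.bruin_sources.events`
GROUP BY user_pseudo_id
```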
Product analytics marts driven by YAML
Custom event parameters that your product team adds — feature flags, plan tier, A/B variant — show up as direct columns the moment they start firing. New asset YAML stays small. New asset SQL stays small. The whole pipeline scales as the data does, not as the unnesting does.
Does this work with the open-source Bruin CLI?
Yes. The integration is at the BigQuery layer — the open-source CLI sees the decoded events table the same way any BigQuery client would. No platform-specific features required. See setup →
Do I need to delete my existing GA4 staging assets?
Not immediately. You can run the decoded events table alongside an existing flattening asset and migrate downstream assets one at a time. Most teams find that the flattening asset can be deleted entirely once Decode is in place.
What permissions does my BigQuery connection need?
The standard BigQuery Data Viewer and BigQuery Job User roles, plus Storage Object Viewer on the Decode GA4 GCS bucket — required because the events table is an external table backed by Parquet files. Full prerequisites →
Can I use Bruin's quality checks against the decoded events?
Yes. The events table behaves like any other BigQuery source — Bruin's column-level checks for not-null, unique, accepted values, and custom SQL all work exactly as they would against a regular table.
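For example, column checks ride along in the asset's metadata block. This is a sketch: the check names match those Bruin documents, but treat the exact YAML keys as assumptions to verify against Bruin's quality-check docs:

```sql
/* @bruin
name: analytics.purchase_events
type: bq.sql
columns:
  - name: event_name
    checks:
      - name: not_null
      - name: accepted_values
        value: [purchase]
  - name: ga_session_id
    checks:
      - name: not_null
@bruin */

SELECT event_date, event_name, ga_session_id, page_location
FROM `your-gcp-project-id.bruin_sources.events`
WHERE event_name = 'purchase'
```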
Deploy in under 5 minutes
Your Bruin pipelines,
without the UNNEST tax.
Subscribe via Google Cloud Marketplace, point destination_dataset_id at your Bruin sources dataset, and have a clean events table available to your assets before the end of the day.
Google Cloud Marketplace · Usage-based · No monthly minimum