Decoded GA4 as a clean Bruin source — flat YAML, flat columns, no UNNEST.

Bruin's pitch is simple: define SQL and Python assets as YAML, manage dependencies, run against BigQuery. The cleanness of that model falls apart the moment your source is the GA4 export, because every asset that touches it has to flatten event_params first. Decode GA4 keeps Bruin honest. The decoded events table reads like a regular BigQuery source, and your assets stay short.

Connection: BigQuery external table · CLI: bruin · Template: events_external · Maintenance: zero schema SQL

Bruin's appeal is short YAML and short SQL. The GA4 export breaks both of those if you point Bruin at it directly. Decoding upstream is what makes the YAML-first model actually work for GA4.

The shape of the problem

The GA4 BigQuery export stores every event parameter inside a repeated record. To pull a single page_location into a Bruin SQL asset you write a correlated subquery against event_params. To get page_referrer, another. To get ga_session_id, another. The asset SQL grows long while the YAML metadata stays short, and the imbalance is awkward: Bruin is supposed to be the place where the SQL stays small and the orchestration stays declarative.
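To make that concrete, here is a sketch of the per-parameter flattening pattern against the raw export. The `analytics_XXXXXX` dataset name stands in for your GA4 property's export dataset; every additional parameter costs another correlated subquery.

```sql
-- Raw-export flattening: one correlated subquery per parameter.
-- `analytics_XXXXXX` is a placeholder for your GA4 export dataset.
SELECT
  event_date,
  event_name,
  (SELECT value.string_value
   FROM UNNEST(event_params)
   WHERE key = 'page_location') AS page_location,
  (SELECT value.string_value
   FROM UNNEST(event_params)
   WHERE key = 'page_referrer') AS page_referrer,
  (SELECT value.int_value
   FROM UNNEST(event_params)
   WHERE key = 'ga_session_id') AS ga_session_id
FROM `your-gcp-project-id.analytics_XXXXXX.events_*`
WHERE event_name = 'page_view'
```

Multiply this by every parameter and every asset that touches the export, and the SQL-to-YAML ratio inverts.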

Why this hurts Bruin projects specifically

Two reasons. First, Bruin's AI-assisted authoring works far better when source columns are already named clearly. UNNEST scaffolding is the worst possible context for an LLM-assisted edit. Second, Bruin's strength is dependency management across SQL and Python assets. Pointing those dependencies at a source that silently drops new GA4 parameters means downstream Python assets can run successfully against incomplete data, and you only find out when the dashboard numbers stop matching the GA4 UI.

What changes with Decode GA4

Decode writes a flat events table into a BigQuery dataset of your choice. You add a BigQuery connection to .bruin.yml, then create assets that select directly from your-gcp-project-id.bruin_sources.events. There is no UNNEST in any asset, anywhere. When GA4 adds a new event parameter, the next decode run picks it up and your downstream assets see it on the next bruin run.
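For contrast, a sketch of the same pull against the decoded table, assuming the flat column names it exposes (page_location, page_referrer, ga_session_id) match your deployment:

```sql
-- Same result, no UNNEST: parameters are direct columns.
SELECT
  event_date,
  event_name,
  page_location,
  page_referrer,
  ga_session_id
FROM `your-gcp-project-id.bruin_sources.events`
WHERE event_name = 'page_view'
```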

Option A

UNNEST inside a Bruin SQL asset

Write a base events asset with a CROSS JOIN UNNEST per parameter. Refresh costs scale with the number of parameters. Every new GA4 parameter requires a code change. Bruin's promise of small, focused SQL assets goes out the window for the one source most analytics work depends on.

Hundreds of lines of glue SQL
Option B

Adapt a public GA4 package

Lift staging models from a community dbt-for-GA4 package and rewrite them as Bruin SQL assets. They flatten more events than you need, materialise intermediate tables, and lag behind whatever Google last shipped. You inherit modelling decisions you did not make and a release cadence you do not control.

Opinionated, slow to update
Option C

Run a separate flattener before Bruin

Stand up Cloud Functions or a Python job to flatten GA4 and write back to BigQuery, then point Bruin at that. Two pipelines, two failure modes, two places where the schema can drift. Bruin's dependency graph looks tidy. The thing it depends on is now a separate maintenance project.

Two pipelines to maintain
| Feature | Decode GA4 source | Hand-built Bruin staging asset |
| --- | --- | --- |
| Lines of SQL to flatten a page_view | 3 | ~18 per parameter |
| New GA4 parameter handling | Auto-detected, appears in source | Manual SQL update required |
| Refresh cost | External table, no scan on bruin run | Full UNNEST scan on every refresh |
| YAML asset readability | Short SQL, clean dependencies | Long SQL, noisy diffs |
| AI-assisted authoring quality | Clear column names, predictable shape | UNNEST scaffolding confuses suggestions |
| Maintenance over a year | Zero | Recurring, every GA4 release |

One install. A clean BigQuery source. Bruin handles the rest.

  1. Subscribe via Google Cloud Marketplace

    Decode GA4 is a Marketplace listing. Usage-based pricing, no monthly minimum. The subscription takes under a minute and billing appears on your existing GCP invoice.

  2. Deploy with the events_external template

    Pick a BigQuery dataset that Bruin will use as a source — for example bruin_sources. Set destination_dataset_id to that dataset. Decode writes a Parquet-backed external table called events into it.

  3. Add a BigQuery connection to .bruin.yml

    Use Application Default Credentials or a service account file. Bruin now knows how to talk to the project. No additional warehouse setup is needed for the decoded events table — it is already a normal BigQuery table from Bruin's perspective.
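A minimal sketch of what that connection block can look like, assuming Bruin's google_cloud_platform connection type; the connection name and file path are illustrative, and omitting service_account_file falls back to Application Default Credentials:

```yaml
# .bruin.yml (sketch; names and paths are illustrative)
default_environment: default
environments:
  default:
    connections:
      google_cloud_platform:
        - name: gcp-default
          project_id: "your-gcp-project-id"
          # Remove this line to use Application Default Credentials.
          service_account_file: "/path/to/service-account.json"
```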

  4. Run bruin run as normal

    Create SQL assets that select from your-gcp-project-id.bruin_sources.events. Bruin resolves dependencies, runs assets in order, and reports status. The decoded events table updates daily and your assets pick up the new partitions on the next run.
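A sketch of one such asset file, assuming Bruin's embedded-YAML asset header; the asset name, materialization, and columns are illustrative:

```sql
/* @bruin
name: marts.page_views
type: bq.sql
materialization:
  type: table
@bruin */

SELECT
  event_date,
  page_location,
  COUNT(*) AS views
FROM `your-gcp-project-id.bruin_sources.events`
WHERE event_name = 'page_view'
GROUP BY event_date, page_location
```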

Wire decoded GA4 into a Bruin pipeline in four small steps. Nothing here is Bruin-specific magic — it is the same source pattern you already use for any other BigQuery table.

01 · GCP: Run the Decode GA4 installer with destination_dataset_id pointing at your Bruin sources dataset.

02 · GCP: Grant the BigQuery service account or user Storage Object Viewer on the Decode GCS bucket.

03 · Bruin: Install the Bruin CLI and add a BigQuery connection block to .bruin.yml.

04 · Bruin: Create SQL assets that select from your-gcp-project-id.bruin_sources.events. Run bruin run.

The events table is a BigQuery external table backed by Parquet files in GCS. Bruin reads it natively through BigQuery, but the underlying storage stays in your project. No data leaves your perimeter, and there is no extra ingestion step for Bruin to schedule.

01 · A first-class BigQuery source

The events table is referenced through a normal FROM clause inside any SQL asset. Bruin's dependency resolver picks it up like any other source. No special handling required.

02 · Direct event parameter columns

page_location, page_referrer, page_title, ga_session_id, ga_session_number — every standard parameter is a direct column. No correlated subqueries inside your SQL assets.

03 · External table economics

The source reads from Parquet files in GCS. There is no duplicate storage in BigQuery, and Bruin runs against decoded data without paying to scan the raw GA4 export.

04 · Schema evolution that just works

When GA4 adds a new event parameter, the next decode run picks it up. Bruin assets that need the new column can reference it immediately. Assets that do not are unaffected.

05 · Short SQL, clean YAML

Asset SQL stays short and readable. Bruin's YAML metadata stays declarative. The diff between two pipeline versions stops being a wall of UNNEST changes and starts being actual business logic.

06 · Better AI-assisted authoring

Bruin's AI-assisted authoring works far better against clearly named columns than against nested-record UNNEST patterns. Suggestions stop trying to recreate scaffolding you no longer need.

01 · Marketing attribution pipelines

Build a session_facts SQL asset on top of the decoded events table without writing a single UNNEST. Source-medium, campaign, and landing page are direct columns. The path from raw event to attribution mart shrinks from five intermediate assets to one.
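As a sketch of that session_facts asset, using event_timestamp from the standard export plus decoded columns; here ga_session_id and page_location are documented, while any further attribution columns would depend on your deployment:

```sql
-- session_facts sketch: one row per session, landing page derived
-- from the earliest page_location in the session.
SELECT
  ga_session_id,
  MIN(event_timestamp) AS session_start_ts,
  ARRAY_AGG(page_location IGNORE NULLS
            ORDER BY event_timestamp
            LIMIT 1)[SAFE_OFFSET(0)] AS landing_page,
  COUNT(*) AS events_in_session
FROM `your-gcp-project-id.bruin_sources.events`
GROUP BY ga_session_id
```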

02 · Funnel analysis with Python downstream

Standard ecommerce funnels — view_item, add_to_cart, begin_checkout, purchase — become readable case statements in the SQL asset. The Python asset that runs cohort analysis on top reads from a clean intermediate table rather than re-doing the flattening.
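Those case statements can be sketched as one session-level funnel query; the event names are GA4's standard ecommerce events, and the table reference assumes the bruin_sources.events deployment described above:

```sql
-- One row per session with a reached/not-reached flag per funnel step.
SELECT
  ga_session_id,
  MAX(CASE WHEN event_name = 'view_item' THEN 1 ELSE 0 END) AS viewed_item,
  MAX(CASE WHEN event_name = 'add_to_cart' THEN 1 ELSE 0 END) AS added_to_cart,
  MAX(CASE WHEN event_name = 'begin_checkout' THEN 1 ELSE 0 END) AS began_checkout,
  MAX(CASE WHEN event_name = 'purchase' THEN 1 ELSE 0 END) AS purchased
FROM `your-gcp-project-id.bruin_sources.events`
GROUP BY ga_session_id
```

A downstream Python asset can then read this intermediate table directly for cohort work instead of re-flattening events.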

03 · Product analytics marts driven by YAML

Custom event parameters that your product team adds — feature flags, plan tier, A/B variant — show up as direct columns the moment they start firing. New asset YAML stays small. New asset SQL stays small. The whole pipeline scales as the data does, not as the unnesting does.

Does this work with the open-source Bruin CLI?

Yes. The integration is at the BigQuery layer — the open-source CLI sees the decoded events table the same way any BigQuery client would. No platform-specific features required. See setup →

Do I need to delete my existing GA4 staging assets?

Not immediately. You can run the decoded events table alongside an existing flattening asset and migrate downstream assets one at a time. Most teams find that the flattening asset can be deleted entirely once Decode is in place.

What permissions does my BigQuery connection need?

The standard BigQuery Data Viewer and BigQuery Job User roles, plus Storage Object Viewer on the Decode GA4 GCS bucket — required because the events table is an external table backed by Parquet files. Full prerequisites →

Can I use Bruin's quality checks against the decoded events?

Yes. The events table behaves like any other BigQuery source — Bruin's column-level checks for not-null, unique, accepted values, and custom SQL all work exactly as they would against a regular table.
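As a sketch, such checks could be declared in an asset's @bruin header like this; the check names follow Bruin's built-in column checks, and the accepted value list is illustrative:

```yaml
# Column checks on an asset reading the decoded events table
# (part of the asset's @bruin metadata block; values are examples).
columns:
  - name: event_name
    type: string
    checks:
      - name: not_null
      - name: accepted_values
        value: ["page_view", "add_to_cart", "purchase"]
  - name: ga_session_id
    type: integer
    checks:
      - name: not_null
```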

Deploy in under 5 minutes

Your Bruin pipelines,
without the UNNEST tax.

Subscribe via Google Cloud Marketplace, point destination_dataset_id at your Bruin sources dataset, and have a clean events table available to your assets before the end of the day.

Get Started on Marketplace → · Read the documentation →

Google Cloud Marketplace · Usage-based · No monthly minimum