Export GA4 BigQuery data
to Azure Blob — automatically.
GA4's BigQuery export wasn't designed to leave GCP. Moving raw nested data to Azure Blob costs more and solves less than you expect. Decode GA4 transforms and compresses your data inside GCP first — clean Parquet files, 75–90% smaller, every event parameter as a direct column.
GA4 exports to BigQuery. Most Azure-native data teams want their analytics data in Blob Storage.
Why GA4 data is hard to get into Azure
This gap is more common than it sounds. Your analytics team uses GA4. Your data platform runs on Azure — Synapse for the warehouse, Microsoft Fabric or OneLake for unified analytics, Power BI for dashboards, Databricks on Azure for modelling. GA4's BigQuery export does not provide a native path to get there.
The DIY trap
The teams that do build something themselves typically end up with one of two outcomes. Either they write a Cloud Function that extracts raw, nested GA4 event data and pushes it to Blob Storage — at which point every downstream consumer still has to deal with UNNEST logic in Synapse or complex Spark transformations in Databricks. Or they spend weeks writing transformation SQL first, then stand up a BigQuery export to GCS, then sync GCS to Blob with AzCopy or Azure Data Factory. Four moving parts, indefinitely.
A simpler path: resolve the complexity once
There is a third option. Resolve the structural complexity once, inside BigQuery, and export the clean result directly to Blob Storage. That is what Decode GA4 does.
The hidden cost of raw exports: egress
There is also a cost argument most teams discover after the fact. Standard GCP egress rates run $0.08–0.12 per GB. Moving raw, uncompressed GA4 event data to Azure means paying for every byte of UNNEST overhead — the repeated arrays, the repeated key names, the repeated metadata. Transforming and compressing to Parquet inside GCP before export reduces payload size by 75–90%. For most GA4 deployments that is a material reduction in the monthly transfer bill, not a rounding error.
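As a back-of-the-envelope illustration (the volume, rate, and compression ratio here are hypothetical, not measurements from any particular property):

```shell
# Hypothetical: 500 GB/month of raw GA4 export data, $0.10/GB egress,
# and an 85% size reduction from flattening plus ZSTD Parquet compression.
awk 'BEGIN {
  raw_gb = 500; rate = 0.10; reduction = 0.85
  printf "raw transfer:        $%.2f/month\n", raw_gb * rate
  printf "compressed transfer: $%.2f/month\n", raw_gb * (1 - reduction) * rate
}'
# raw transfer:        $50.00/month
# compressed transfer: $7.50/month
```

Plug in your own export volume and region's egress rate; the saving scales linearly with both.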
The traditional approach
Export raw GA4 data to Blob
Write a Cloud Function or Cloud Run job that reads from the GA4 BigQuery export and writes to Blob Storage. Fast to build. The problem: you are moving nested, structurally complex data to Azure. Every analyst querying it via Synapse serverless or Databricks still needs correlated subqueries to extract any event parameter. You have moved the problem, not solved it.
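For concreteness, extracting even two parameters from the raw export looks something like this in BigQuery (project, dataset, and parameter names are illustrative):

```sql
-- Raw GA4 export: every parameter read is a correlated subquery over
-- the event_params array, repeated once per parameter you want.
SELECT
  event_date,
  event_name,
  (SELECT value.string_value
   FROM UNNEST(event_params)
   WHERE key = 'page_location') AS page_location,
  (SELECT value.int_value
   FROM UNNEST(event_params)
   WHERE key = 'ga_session_id') AS ga_session_id
FROM `my-project.analytics_123456.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20250101' AND '20250131';
```

Move this structure to Blob as-is and every Synapse or Databricks consumer has to rewrite the equivalent logic in their own dialect.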
Flatten in BigQuery, sync via ADF
Write transformation SQL to unnest the GA4 data in BigQuery, materialise a clean table, set up a BigQuery export to GCS, then sync GCS to Blob via Azure Data Factory or AzCopy. You now maintain: the unnesting SQL (which breaks when GA4 adds a new parameter), the scheduled query, the GCS export job, and the ADF pipeline. Four moving parts, indefinitely.
Use an ETL platform
Fivetran, Airbyte, or Azure Data Factory connectors extract your GA4 data and load it somewhere. They handle movement but not transformation — you still get the same nested structure at the destination, and now your GA4 data has left your Google Cloud project and passed through a third-party system. Most platforms also charge a monthly minimum regardless of usage.
Decode GA4 vs the common way
| Feature | Decode GA4 | DIY / ETL platform |
|---|---|---|
| Setup time | Under 5 minutes | Hours or days of App Registration and pipeline config |
| Compression | Automatic ZSTD — 75–90% size reduction | Manual scripting, often uncompressed |
| Schema drift | Auto-detected and handled | Manual schema updates when GA4 changes |
| Egress costs | Pre-compressed inside GCP before transfer | Full raw payload transferred — no savings |
| Data residency | Data never leaves your GCP project | Most ETL platforms process your data on their infrastructure |
| Maintenance | Zero — deploy once, run forever | Ongoing SQL updates, pipeline monitoring, schema fixes |
One deployment. Clean data in Blob. Zero maintenance.
Step 1
Subscribe via Google Cloud Marketplace
Decode GA4 is available on Google Cloud Marketplace. Usage-based pricing — no monthly minimum, no credit card required. The subscription takes under a minute and billing appears on your existing GCP invoice.
Step 2
Deploy with Blob export configured
The installer takes your GA4 properties, Azure storage account, container name, and the BigQuery connection ID you set up for Azure. The connection setup is a one-time ~5-minute step, and the docs have the exact commands. Decode GA4 itself installs entirely within your GCP project — no data touches any external system except your own Azure storage account.
Step 3
Clean Parquet files appear in Blob, daily
Decoded GA4 data is written to Blob Storage in compressed Parquet format, hive-partitioned by date. Query it with Synapse serverless, read it with Databricks, pick it up in Power BI via Direct Lake, or register it as a OneLake shortcut in Fabric. Each partition is processed once; if GA4 modifies it upstream, the change is detected and handled automatically.
HOW THE SETUP WORKS
Cross-cloud integration, done in four steps. Full commands in the docs.
Azure
Register an Azure AD application and note the Tenant ID and Client ID.
GCP
Create a BigQuery connection pointing at that application's tenant, client ID, and Azure region.
Azure
Add the Google identity BigQuery returns as a federated credential on the application.
Azure
Assign the Storage Blob Data Contributor role to the app on your storage account.
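The four steps above can be sketched as CLI commands. Treat this as orientation rather than copy-paste: all IDs and names are placeholders, and the flag set for the BigQuery connection follows the BigQuery Omni docs at time of writing, so verify against the current docs before running.

```shell
# 1. Azure: register the application and capture its IDs.
az ad app create --display-name "bigquery-blob-export"
# note the appId (client ID) and the app's object ID from the output

# 2. GCP: create the BigQuery connection for Azure.
bq mk --connection --connection_type=AZURE \
  --location=azure-eastus2 \
  --tenant_id=<TENANT_ID> \
  --federated_azure=true \
  --federated_app_client_id=<CLIENT_ID> \
  blob_export_conn
bq show --connection <PROJECT>.azure-eastus2.blob_export_conn
# note the Google identity in the output

# 3. Azure: link that Google identity as a federated credential.
az ad app federated-credential create --id <APP_OBJECT_ID> --parameters '{
  "name": "bigquery",
  "issuer": "https://accounts.google.com",
  "subject": "<GOOGLE_IDENTITY>",
  "audiences": ["api://AzureADTokenExchange"]
}'

# 4. Azure: grant the app write access to the storage account.
az role assignment create \
  --assignee <CLIENT_ID> \
  --role "Storage Blob Data Contributor" \
  --scope "/subscriptions/<SUB>/resourceGroups/<RG>/providers/Microsoft.Storage/storageAccounts/<ACCOUNT>"
```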
Authentication uses Workload Identity Federation rather than stored secrets: at runtime, BigQuery exchanges a short-lived Google OIDC token for an Azure AD access token, so no client secrets or connection strings are ever kept on the GCP side.
What you get in Blob
Clean, flat Parquet files
Every event parameter exposed as a direct column. No UNNEST. No correlated subqueries. Query with Synapse serverless by selecting columns by name, or read in Databricks without Spark schema gymnastics.
Hive-partitioned by date
Data lands in Blob in a standard hive-partitioned folder structure. Synapse OPENROWSET partition pruning works out of the box. So does Databricks Auto Loader, Fabric Lakehouse shortcuts, and most ADLS-aware query engines.
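A minimal sketch of what that looks like from Synapse serverless, assuming a hypothetical storage account and an `event_date=YYYYMMDD` partition layout:

```sql
-- filepath(1) resolves to the first wildcard in the BULK path, i.e. the
-- event_date partition value, so the WHERE clause prunes whole folders.
SELECT r.filepath(1) AS event_date, event_name, COUNT(*) AS events
FROM OPENROWSET(
    BULK 'https://mystorageacct.blob.core.windows.net/ga4/event_date=*/*.parquet',
    FORMAT = 'PARQUET'
) AS r
WHERE r.filepath(1) BETWEEN '20250101' AND '20250131'
GROUP BY r.filepath(1), event_name;
```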
Automatic schema evolution
When GA4 adds a new event parameter, it appears in the next export without any configuration change on your end. The pipeline does not break. You do not get paged at 2am.
Incremental, not full refreshes
Each partition is processed once. If GA4 modifies a historical partition — which it does, unpredictably — Decode GA4 detects the change and reprocesses only that date. Daily runs are cheap.
75–90% lower egress costs
ZSTD-compressed Parquet leaves GCP at a fraction of the size of raw BigQuery exports. Less data transferred means a proportionally smaller egress bill. The compression happens before transfer, inside your project.
The full Azure analytics stack
Flat Parquet in Blob unlocks Synapse for serverless SQL, Fabric and OneLake for unified analytics, Power BI Direct Lake for sub-second dashboards, and Databricks for ML. All from the same clean source files.
Power BI Direct Lake against GA4 event data
Direct Lake skips Import mode entirely: Power BI reads the Parquet data straight from OneLake (for Blob, via a Lakehouse shortcut), with no dataset refresh job and none of Import mode's dataset size limits. Point Direct Lake at the decoded GA4 partitions and users get sub-second query performance over years of event-level data without any intermediate modelling layer. The flat column structure means measures and dimensions map directly; no DAX gymnastics for nested parameters.
Historical data preservation
GA4 retains event-level data for at most 14 months (two by default). Move it to Blob and you own it indefinitely. Lifecycle policies automatically shift older partitions to the Cool, then Archive tier; Archive storage runs around $0.002 per GB per month, so years of GA4 event history fit in most teams' petty cash. Standard year-over-year analysis stops being a problem you have to engineer around.
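A lifecycle rule along these lines does the tiering (the prefix and day thresholds are illustrative; apply it with `az storage account management-policy create`):

```json
{
  "rules": [{
    "name": "age-out-ga4",
    "enabled": true,
    "type": "Lifecycle",
    "definition": {
      "filters": { "blobTypes": ["blockBlob"], "prefixMatch": ["ga4/"] },
      "actions": {
        "baseBlob": {
          "tierToCool":    { "daysAfterModificationGreaterThan": 90 },
          "tierToArchive": { "daysAfterModificationGreaterThan": 365 }
        }
      }
    }
  }]
}
```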
Attribution modelling in Databricks or Fabric
GA4's built-in attribution is last-click or Google's own opaque data-driven model. Clean event-level data in Blob gives your data science team the raw material for proper Markov chain or custom data-driven models in Databricks or Fabric Data Science, without touching BigQuery quotas. The event parameters are already flat. The path sequences are already queryable without subqueries.
Do I need to store an Azure client secret anywhere?
No. BigQuery authenticates to Azure using Workload Identity Federation. A federated credential on your Azure AD app links BigQuery's Google identity directly — no client secrets, no connection strings, no long-lived credentials on the GCP side. How auth works →
What GCP and Azure permissions do I need to install this?
On GCP, the BigQuery Connection Admin role. On Azure, permission to register an application, add a federated credential, and assign Storage Blob Data Contributor on the target storage account. Both are standard for a data platform owner. See prerequisites →
Which Azure regions are supported?
Any Azure region where BigQuery Omni operates. You set the region when you create the BigQuery connection, and it's fixed per connection — one connection per region if you need multiple destinations. Setup guide →
Deploy in under 5 minutes
Your GA4 data in Azure Blob,
today.
No credit card required. Install via Google Cloud Marketplace, configure your Blob container, and have clean Parquet files appearing in Azure before the end of the day.
Google Cloud Marketplace · Usage-based · No monthly minimum