Skip to content

Events

Overview

The events table is a flattened transformation of the raw inbound events_YYYYMMDD table, with the following structural and semantic changes:

  • Simplified flat output structure for optimal query syntax
  • Simple event count metrics for comparison and calculations
  • Standard and custom event parameters in property-specific flat schema
  • Custom user properties in property-specific flat schema
  • Geo-based local timestamps for time-of-day analyses
  • Observation-based data type coalescence to prevent type-based historic data loss
  • Session and event ids to support robust data quality assurance processes

One row represents one single discrete event.

Schema

events_YYYYMMDD
    ├── event_date STRING   
    ├── event_timestamp INTEGER 
    ├── event_name STRING   
    ├── event_params ARRAY<STRUCT>  
    ├── event_previous_timestamp INTEGER    
    ├── event_value_in_usd FLOAT    
    ├── event_bundle_sequence_id INTEGER    
    ├── event_server_timestamp_offset INTEGER   
    ├── user_id STRING  
    ├── user_pseudo_id STRING   
    ├── privacy_info STRUCT 
    ├── user_properties ARRAY<STRUCT>   
    ├── user_first_touch_timestamp INTEGER  
    ├── user_ltv STRUCT 
    ├── device STRUCT   
    ├── geo STRUCT  
    ├── app_info STRUCT 
    ├── traffic_source STRUCT   
    ├── stream_id STRING    
    ├── platform STRING 
    ├── event_dimensions STRUCT 
    ├── ecommerce STRUCT    
    ├── items ARRAY<STRUCT> 
    ├── collected_traffic_source STRUCT 
    └── is_active_user BOOLEAN  
events
    ├── project_id STRING   
    ├── dataset_name STRING 
    ├── analytics_property_id STRING    
    ├── event_id STRING 
    ├── session_id STRING   
    ├── event_date DATE 
    ├── event_timestamp TIMESTAMP   
    ├── event_name STRING   
    ├── event_previous_timestamp TIMESTAMP  
    ├── event_value_in_usd FLOAT    
    ├── event_bundle_sequence_id INTEGER    
    ├── event_server_timestamp_offset TIMESTAMP 
    ├── user_id STRING  
    ├── user_pseudo_id STRING   
    ├── privacy_info STRUCT 
    ├── user_first_touch_timestamp TIMESTAMP    
    ├── user_ltv STRUCT 
    ├── device STRUCT   
    ├── geo STRUCT  
    ├── app_info STRUCT 
    ├── traffic_source STRUCT   
    ├── stream_id STRING    
    ├── platform STRING 
    ├── event_dimensions STRUCT 
    ├── collected_traffic_source STRUCT 
    ├── is_active_user BOOLEAN
    ├── local STRUCT    
    ├── count STRUCT
    ├── parameter STRUCT
    └── property STRUCT     

New Columns (Metadata)

The following metadata columns are added to the output events table.

COLUMN NAME DATA TYPE DESCRIPTION
project_id STRING project_id containing the ga4_dataset_id
dataset_name STRING dataset_name of the GA4 dataset
analytics_property_id STRING analytics_property_id of the GA4 dataset
event_id STRING Unique ID per event
session_id STRING Unique ID per session

Existing Columns

The following columns are passed through directly from the raw events_YYYYMMDD, with no changes to structure or data type:

COLUMN NAME DATA TYPE
event_name STRING
event_value_in_usd FLOAT
event_bundle_sequence_id INTEGER
event_server_timestamp_offset INTEGER
user_id STRING
user_pseudo_id STRING
privacy_info STRUCT
user_ltv STRUCT
device STRUCT
geo STRUCT
app_info STRUCT
traffic_source STRUCT
stream_id STRING
platform STRING
event_dimensions STRUCT
collected_traffic_source STRUCT
is_active_user BOOLEAN
privacy_info STRUCT
user_ltv STRUCT
device STRUCT
geo STRUCT
app_info STRUCT
traffic_source STRUCT
event_dimensions STRUCT
collected_traffic_source STRUCT

Data Type Changes

The following columns are included in the events output table, with appropriate data type changes.

COLUMN NAME FROM DATA TYPE TO DATA TYPE DESCRIPTION
event_date STRING DATE Inbound shard date
event_timestamp INTEGER TIMESTAMP UTC timestamp of event occurrence
event_previous_timestamp INTEGER TIMESTAMP UTC timestamp of previous event occurrence
user_first_touch_timestamp INTEGER TIMESTAMP UTC timestamp of user first touch

New Columns (Augmented)

The following augmentation columns are added to the output events table:

local STRUCT    
    ├── timezone_id STRING  
    ├── timezone_name STRING    
    ├── country_code STRING 
    ├── timezone_source STRING  
    ├── latitude STRING 
    ├── longitude STRING    
    ├── date DATE   
    ├── timestamp DATETIME  
    ├── time TIME   
    ├── hour INTEGER    
    ├── hour_decimal FLOAT  
    ├── previous_timestamp DATETIME 
    └── first_touch_timestamp DATETIME  
COLUMN NAME DATA TYPE DESCRIPTION
local STRUCT NEW local timezone and geolocation STRUCT
local.timezone_id STRING NEW timezone id
local.timezone_name STRING NEW timezone name
local.country_code STRING NEW ISO country code
local.timezone_source STRING NEW source of timezone allocation (city/region/country)
local.latitude STRING NEW latitude
local.longitude STRING NEW longitude
local.date DATE NEW geo-adjusted event_date
local.timestamp DATETIME NEW geo-adjusted event_timestamp
local.time TIME NEW geo-adjusted time
local.hour INTEGER NEW geo-adjusted hour
local.hour_decimal FLOAT NEW geo-adjusted decimal hour
local.previous_timestamp DATETIME NEW geo-adjusted previous_timestamp
local.first_touch_timestamp DATETIME NEW geo-adjusted first_touch_timestamp

New Columns (Restructured)

The following flat STRUCT columns are added, replacing the nested source columns which are excluded from the output events table. The event_name column is still included in the output events table, but is supplemented with the count metrics STRUCT.

SOURCE COLUMN NAME SOURCE DATA TYPE TRANSFORMATION
event_name STRING Transformed to count metric STRUCT
event_params ARRAY Transformed to parameter STRUCT
user_properties ARRAY Transformed to property STRUCT

Count Schema

The following metric sub-columns are included as standard in the count STRUCT, in addition to any other event_name values detected in the data.

count STRUCT    
    ├── events INTEGER  
    ├── conversions INTEGER 
    ├── click INTEGER   
    ├── file_download INTEGER   
    ├── first_visit INTEGER 
    ├── form_start INTEGER  
    ├── form_submit INTEGER 
    ├── page_view INTEGER   
    ├── purchase INTEGER    
    ├── scroll INTEGER  
    ├── search INTEGER  
    ├── session_start INTEGER   
    ├── user_engagement INTEGER 
    ├── video_complete INTEGER  
    ├── video_progress INTEGER  
    ├── video_start INTEGER 
    └── view_search_results INTEGER 

Parameter Schema

The following value sub-columns are included as standard in the parameter STRUCT, in addition to any other event_parameter values detected in the data.

    parameter STRUCT    
        ├── anonymize_ip STRING 
        ├── batch_ordering_id INTEGER   
        ├── batch_page_id INTEGER   
        ├── campaign STRING 
        ├── debug_mode INTEGER  
        ├── engaged_session_event INTEGER   
        ├── engagement_time_msec INTEGER    
        ├── entrances INTEGER   
        ├── file_extension STRING   
        ├── file_name STRING    
        ├── form_destination STRING 
        ├── form_id STRING  
        ├── form_name STRING    
        ├── ga_session_id INTEGER   
        ├── ga_session_number INTEGER   
        ├── ignore_referrer STRING  
        ├── link_classes STRING 
        ├── link_domain STRING  
        ├── link_id STRING  
        ├── link_text STRING    
        ├── link_url STRING 
        ├── medium STRING   
        ├── outbound STRING 
        ├── page_location STRING    
        ├── page_path STRING    
        ├── page_referrer STRING    
        ├── page_title STRING   
        ├── percent_scrolled INTEGER    
        ├── search_term STRING  
        ├── session_engaged STRING  
        ├── source STRING   
        ├── term STRING 
        ├── video_current_time INTEGER  
        ├── video_duration INTEGER  
        ├── video_percent STRING    
        ├── video_provider STRING   
        ├── video_title STRING  
        ├── video_url STRING    
        └── visible STRING  

Property Schema

There are no standard values for user_properties, so sub-columns will only be included in the property STRUCT if they are detected in the data.