November 14, 2023 - v.1.05
Revised the logic for determining the source when auto-tagging is disabled, aligning it closely with GA UA sessionization. When manual tagging is prioritized, the system checks the session for a gclid. If one is present, it then checks for utm parameters and extracts their values. If utm_source and utm_medium are empty, the source / medium defaults to google / cpc.
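The gclid and utm check described above can be sketched in Python as follows. This is only an illustration, not the actual transformation code: the function name resolve_source_medium and the choice to read parameters from a landing-page URL are assumptions, and the changelog does not say what happens when only one of utm_source and utm_medium is empty, so the sketch simply returns whatever was found in that case.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlparse


def resolve_source_medium(page_location: str) -> Optional[Tuple[Optional[str], Optional[str]]]:
    """Hypothetical helper: apply the manual-tagging rule to one landing URL.

    Returns None when no gclid is present, because that case falls
    outside the rule described in this changelog entry.
    """
    # parse_qs drops blank values by default, so "utm_source=" counts as empty.
    params = parse_qs(urlparse(page_location).query)

    def first(name: str) -> Optional[str]:
        values = params.get(name)
        return values[0] if values else None

    if first("gclid") is None:
        return None

    source, medium = first("utm_source"), first("utm_medium")
    if source is None and medium is None:
        # Both utm values are empty: fall back to the default.
        return ("google", "cpc")
    # Assumption: partially filled utm values are passed through unchanged.
    return (source, medium)
```

For example, a URL with only a gclid resolves to google / cpc, while a URL with a gclid plus utm_source=newsletter and utm_medium=email keeps the manually tagged values.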
Implemented a check for the number of events per user per day. Users with over 10,000 events in a day are excluded from the sessionization due to potential bot-like behavior.
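A minimal Python sketch of such a threshold check is shown below. The event shape (dicts with user_id and event_date keys) and the function name drop_suspected_bots are assumptions made for illustration. The sketch also assumes that only the affected user's events for the affected day are dropped, which the entry above does not spell out.

```python
from collections import Counter
from typing import Dict, List

MAX_EVENTS_PER_USER_PER_DAY = 10_000  # threshold stated in the changelog


def drop_suspected_bots(events: List[Dict[str, str]],
                        limit: int = MAX_EVENTS_PER_USER_PER_DAY) -> List[Dict[str, str]]:
    """Hypothetical helper: remove events of users exceeding the daily limit."""
    # Count events per (user, day) pair.
    counts = Counter((e["user_id"], e["event_date"]) for e in events)
    # Keep an event only if its (user, day) pair is at or below the limit.
    return [e for e in events
            if counts[(e["user_id"], e["event_date"])] <= limit]
```

Passing a smaller limit makes the behaviour easy to check on a handful of rows before applying the real threshold.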
Adjusted how hits.transaction.transactionId and hits.transaction.transactionRevenue are populated. If the transaction is passed in the wrong format (e.g., INTEGER or FLOAT instead of STRING) and ecommerce is not filled, the values are now extracted from event_params for the respective parameters.
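The fallback can be pictured with the Python sketch below. It is a simplified illustration under several assumptions: the event is represented as plain dicts, the ecommerce keys transaction_id and purchase_revenue and the event_params keys transaction_id and value are chosen for the example, and the function name transaction_fields is hypothetical. The real transformation may use different parameter names.

```python
from typing import Any, Dict, Optional


def transaction_fields(ecommerce: Optional[Dict[str, Any]],
                       event_params: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: prefer ecommerce values, fall back to event_params."""
    ecommerce = ecommerce or {}
    transaction_id = ecommerce.get("transaction_id")
    revenue = ecommerce.get("purchase_revenue")

    # ecommerce is not filled: take the values from event_params instead.
    if transaction_id is None:
        transaction_id = event_params.get("transaction_id")
    if revenue is None:
        revenue = event_params.get("value")  # assumed parameter name

    return {
        # An ID sent as INTEGER or FLOAT is normalised to a string.
        "transactionId": None if transaction_id is None else str(transaction_id),
        "transactionRevenue": revenue,
    }
```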
September 19, 2023 - v.1.04
Renamed "Sessionization" to "Sessionization - OWOX BI Events Streaming" to prevent confusion between sessionization on OWOX GA4 streaming data and sessionization on GA4 Export data.
Implemented proper handling of session breaks when a user switches from "google / cpc" to "google / organic." Previously, sessions were not interrupted due to the source being labeled as "google / organic" for such events in the "events_intraday_" table. As a result, source changes were not recorded, and new sessions were not initiated.
Adjusted handling of sessions originating from "android:app//google.com." Previously, such sessions were categorized as "android:app / referral," but they are now categorized as "google / organic."
Introduced population of the "trafficSource.channelGrouping" field following old GA UA logic.
Removed "GA4" from operation names for clarity.
July 28, 2023 - v.1.02
Removal of operation U - GA4 Sessionization (gclids from old DT): One of the significant changes was the removal of an operation related to parsing gclids from the old dataTransfer. This update was crucial as the operation was deprecated in April and had ceased to function.
Adjustment of operation U - GA4 Sessionization (gclids from new DT): This operation was renamed to U - GA4 Sessionization (gclids from DT) and was enhanced. Now, there is no need to specify table_suffix when a client sets up data extraction from multiple MCC accounts into a single dataset.
Deletion of the variable dt_objects_suffix: This variable is no longer used and has been removed.
June 22, 2023 - v.1.01
Code optimization and operation merging: Two operations, U - GA4 Sessionization Step 4 and A - GA4 Sessionization Step 5, were combined into a single operation named A - GA4 Sessionization Step 4.
Relocation of operation D - GA4 Sessionization: The operation was moved closer to appending final results, minimizing cases where data deletion was followed by a transformation that hit limits and couldn't write new data.
June 16, 2023 - v.1.00
Addition of manual tagging handling: In this update, the capability to handle manual tagging in conjunction with Bing and Facebook auto-tagging was added.
Exclusion of certain fields from hits.customDimensions storage: Fields like source, medium, campaign, term, content, and others were excluded from storage in hits.customDimensions.
June 09, 2023
Correction of parsing logic for product labels: This update focused on correcting the parsing logic for product labels, especially when gclid is present in page_location.
Addition of source/medium determination at transformation level: Now, if mscklid is present in the markup, the field trafficSource.mscklid is added to the data schema.
Introduction of the user.user_pseudo_id field: This field was introduced to keep track of the user identifier when forming sessions.
May 20, 2023
Preservation of user_pseudo_id: user_pseudo_id is now preserved in the user.user_pseudo_id field. Previously, this field wasn't saved during sessionization based on owox_user_id.
Filling "(not set)" for campaign, keyword, adContent fields: If campaign, keyword, or adContent values are undefined, "(not set)" is now assigned instead of null.
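In Python terms, the substitution amounts to something like the sketch below. The session is modelled as a plain dict and the helper name fill_not_set is hypothetical. Only missing (None) values are replaced here, and whether empty strings are treated the same way is not stated in the changelog.

```python
from typing import Any, Dict, Iterable

NOT_SET = "(not set)"


def fill_not_set(session: Dict[str, Any],
                 fields: Iterable[str] = ("campaign", "keyword", "adContent")) -> Dict[str, Any]:
    """Hypothetical helper: replace undefined traffic fields with "(not set)"."""
    filled = dict(session)  # copy so the input row is left untouched
    for name in fields:
        if filled.get(name) is None:
            filled[name] = NOT_SET
    return filled
```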
Enhancement of gclid parsing for by-request stream: Logic for parsing gclid from the ByRequest pipeline was improved to consider adGroup and offer the option to select based on table_suffix.
May 08, 2023
Correction of code: The "D - GA4 Sessionization" step is now enabled by default and no longer fails when the owoxbi_ga4_sessions table is absent. The adContent value is now sourced from the appropriate field. The events first_visit and session_start are now categorized as non-interactive events. Parsing of adGroup was added for both the new and old dataTransfer; it will be added for the by-request pipeline when a client with that setup appears.
March 31, 2023
Correction of code: When a source is present in the referral exclusion list, a session with source / medium = (direct) / (none) is now created instead of the previous (direct) / (not set).
March 17, 2023
Correction of code: The code was optimized to handle large data volumes. The process was split into several steps that use temporary tables, which avoids errors caused by memory consumption limits and speeds up data processing.
February 28, 2023
Addition of identifier selection: This update introduced the ability to choose an identifier. This parameter can be set on the "GA4 Sessionization Inputs" sheet in the Primary Identifier field. Two options were added: user_pseudo_id and owox.user_id.
Change in referral exclusion list check: The check is now based on inclusion in the list, rather than exact matching. To specify the referral exclusion list, a "|" separator is now used instead of ",".
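One way to read the new check is sketched below in Python. This is an interpretation rather than the actual implementation: "inclusion" is assumed to mean that the referrer contains any entry of the list as a substring, matching is assumed to be case-insensitive, and the function names are hypothetical.

```python
from typing import List


def parse_exclusion_list(raw: str) -> List[str]:
    """Split a "|"-separated referral exclusion list, ignoring blank entries."""
    return [item.strip().lower() for item in raw.split("|") if item.strip()]


def is_excluded_referrer(referrer: str, raw_list: str) -> bool:
    """Hypothetical helper: True if the referrer contains any listed entry."""
    referrer = referrer.lower()
    return any(entry in referrer for entry in parse_exclusion_list(raw_list))
```

With the list "paypal.com|example-bank.com", a referrer of checkout.paypal.com is excluded under this reading, whereas exact matching would have let it through.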
January 23, 2023
Exclusion of profiling code: Profiling code was excluded from sessionization because profiling is now done during the modeling stage.
Division of code into stages: Sessionization was divided into two stages - profiling and sessionization. This ensured proper handling of large data volumes and prevented errors related to memory limits.
January 09, 2023
Correction of profiling code and handling of identifiers: In this update, corrections were made to the profiling code and handling of cases where clients provide identifiers '0', '-', '', 'null'. This prevented the creation of a single large "superuser" and avoided errors in queries.
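The placeholder check can be illustrated with the short Python sketch below. The set of placeholder values comes from the entry above. The helper name usable_identifier and the whitespace trimming are assumptions added for the example.

```python
from typing import Optional

# Placeholder values listed in the changelog entry.
PLACEHOLDER_IDS = {"0", "-", "", "null"}


def usable_identifier(value: Optional[str]) -> Optional[str]:
    """Hypothetical helper: return the identifier, or None for placeholders."""
    if value is None:
        return None
    cleaned = value.strip()  # assumption: surrounding whitespace is ignored
    return None if cleaned in PLACEHOLDER_IDS else cleaned
```

Treating these values as missing keeps unrelated visitors from being merged under one shared identifier, which is how the "superuser" described above would otherwise form.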
Division of code into stages: Sessionization was divided into multiple stages to avoid errors due to memory consumption limits. The stages include extracting event data, preparing session and hit-level parameters, preparing a temporary table for changing labels, applying the model to data, and appending data to the final table.