Differences between the session data collection algorithms

OWOX BI can collect session data using one of two methods: the first is based on the Google Analytics API, while the second is based on raw hit data processed with OWOX BI's own algorithm.

Why the new algorithm?

Because we want you to get complete and accurate data on user behavior on your website. Here are the benefits of using the OWOX BI algorithm:

  1. The OWOX BI algorithm doesn’t depend on the GA Core Reporting API and calculates sessions based on raw, unsampled hit data
  2. Session table collection isn't interrupted when the GA Core Reporting API limits are exceeded or access to Google Analytics is lost, and it isn't delayed by importing session table fields from Google Analytics
  3. OWOX BI doesn't limit data uploads, while Google Analytics does. All your data gets into Google BigQuery tables

With the new algorithm, you can also:

  • Track whether a direct visit is truly direct or its source is taken from a paid source. The OWOX BI algorithm makes this possible with the trafficSource.isTrueDirect field, which lets you attribute site visits by two models: Last Non-Direct Click and Last Click
  • Analyze how audiences from different websites overlap by collecting an additional anonymous identifier, OWOX User ID
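As a rough illustration of the audience overlap analysis, the sketch below counts users shared by each pair of websites. It assumes flat session rows with a hypothetical "hostname" key and a "userOwoxId" key; the real table layout and query would differ.

```python
from collections import defaultdict

def audience_overlap(sessions):
    """Return {(site_a, site_b): number of shared users} for every pair of sites."""
    users_by_site = defaultdict(set)
    for row in sessions:
        # "hostname" is a hypothetical key used only for this illustration.
        users_by_site[row["hostname"]].add(row["userOwoxId"])
    sites = sorted(users_by_site)
    overlap = {}
    for i, site_a in enumerate(sites):
        for site_b in sites[i + 1:]:
            overlap[(site_a, site_b)] = len(users_by_site[site_a] & users_by_site[site_b])
    return overlap

# Made-up sample rows: only user "u2" visited both sites.
sessions = [
    {"hostname": "shop.example", "userOwoxId": "u1"},
    {"hostname": "shop.example", "userOwoxId": "u2"},
    {"hostname": "blog.example", "userOwoxId": "u2"},
    {"hostname": "blog.example", "userOwoxId": "u3"},
]
print(audience_overlap(sessions))
```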

When it’s time to move to the OWOX BI algorithm

  • The number of sessions on a website is close to or more than 500K for the selected date range (applies to Analytics Standard)
  • You often run into data sampling in Google Analytics
  • You often hit the limit of 500 hits per session
  • You need to track the true source of a visit using the trafficSource.isTrueDirect field, which is otherwise available only in BigQuery Export for Google Analytics 360
  • You want to unite audiences across domains using an additional user identifier, OWOX User ID, and analyze how these audiences overlap

Differences in the table data structure

Session tables created with the OWOX BI algorithm have the same structure as the tables collected based on Google Analytics data. There are only a few differences in some fields and their values:

  1. The totals.* fields previously contained the total hit values from Google Analytics. Now these fields duplicate the values from the totalsStreaming.* fields, which show the total number of hits collected by OWOX BI.
  2. The tables contain the customDimensions, customMetrics, and customGroups fields. However, they all have hit-level scope. Defining the scope of custom dimensions will become possible in future updates.
  3. The tables contain the isTrueDirect field, which shows whether a visit is direct (then the value is true) or its source/medium is taken from a paid source.
  4. The tables contain the userOwoxId field.
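Because all custom dimensions are hit-scoped, a session-level value has to be derived from the hits. The sketch below shows one way to do that on a simplified session row. The nesting (hits, then customDimensions with index and value) and the placement of userOwoxId at the top level are assumptions made for illustration, not a confirmed schema.

```python
def custom_dimension_values(session, index):
    """Collect the distinct values of one custom dimension across a session's hits."""
    values = []
    for hit in session.get("hits", []):
        for dim in hit.get("customDimensions", []):
            if dim["index"] == index and dim["value"] not in values:
                values.append(dim["value"])
    return values

# Simplified, made-up session row; field placement is an assumption.
session = {
    "userOwoxId": "u1",
    "trafficSource": {"source": "google", "medium": "cpc", "isTrueDirect": False},
    "totals": {"hits": 2},           # duplicates totalsStreaming
    "totalsStreaming": {"hits": 2},  # hits collected by OWOX BI
    "hits": [
        {"customDimensions": [{"index": 1, "value": "guest"}]},
        {"customDimensions": [{"index": 1, "value": "member"}]},
    ],
}
print(custom_dimension_values(session, 1))
```

Which value to treat as the session-level one (first, last, or all) is a reporting decision that this sketch leaves open.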

The settings needed to set up the OWOX BI algorithm

  • If you have already set up session data collection based on Google Analytics, update the tracking code once you've set up session data collection based on the OWOX BI algorithm
  • If you use Google Ads auto-tagging, first turn on the upload of raw reports from Google Ads to BigQuery using the Google Data Transfer native integration to get the auto-tagging data (with gclid). Then, in the session data collection settings in OWOX BI, specify the path to the BigQuery dataset containing these reports. Skip this step if you use manual tagging with utm tags.
  • The data in the user.id field is collected from the userId (&uid) parameter, not from a custom dimension. If your website doesn't track and collect &uid yet, set it up using the standard method.

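Differences in data collection

Each aspect below is compared for two methods: collection based on the Google Analytics API (described first) and collection based on the OWOX BI algorithm (described second).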

Sessions calculation

  • Based on Google Analytics API: The SessionID values, traffic source, geo, and device type data are uploaded using the Google Analytics Core Reporting API. The beginning and the end of sessions are defined by GA logic. Once we get this session data, we add raw hit data from the "streaming" tables to the session tables.
  • Based on OWOX BI algorithm: Sessions are formed in Google BigQuery based on raw data collected with the OWOX BI algorithm. The triggers for the beginning and the end of a session are the same as in Google Analytics.
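The article doesn't list the session triggers themselves. As a minimal sketch, the code below assumes the commonly documented Google Analytics defaults of a 30-minute inactivity timeout and a new session at midnight, and leaves out the campaign-change trigger for brevity. It is not the OWOX BI implementation.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumption: GA default inactivity timeout

def split_into_sessions(hit_times):
    """Group one user's hit timestamps (in the property timezone) into sessions."""
    sessions = []
    for ts in sorted(hit_times):
        if sessions:
            last = sessions[-1][-1]
            timed_out = ts - last > SESSION_TIMEOUT
            new_day = ts.date() != last.date()
            if not (timed_out or new_day):
                sessions[-1].append(ts)
                continue
        # First hit, inactivity timeout, or date change: start a new session.
        sessions.append([ts])
    return sessions

hits = [
    datetime(2024, 1, 1, 23, 40),
    datetime(2024, 1, 1, 23, 55),
    datetime(2024, 1, 2, 0, 5),  # midnight passed: new session
    datetime(2024, 1, 2, 1, 0),  # more than 30 minutes idle: new session
]
print([len(s) for s in split_into_sessions(hits)])
```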

Sessions calculation when sending data via Measurement Protocol

  • Based on Google Analytics API: If the value of the &qt parameter sent via Measurement Protocol is greater than 4 hours, the hit is lost and won't get into any session. If the &qt parameter is sent with an empty value, a new separate session is automatically created for this hit.
  • Based on OWOX BI algorithm: Hits sent via Measurement Protocol with a &qt value of up to 30 days retrospectively get into the session data table ("bi_sessions") for the corresponding date and are assigned to the correct session. Hits get into the hit data table ("streaming") regardless of the &qt parameter.
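The &qt rules can be summarized as a small decision function. This sketch assumes that the 4-hour and empty-value rules describe collection based on the Google Analytics API and that the 30-day rule describes the OWOX BI algorithm. The function name and the returned labels are illustrative only, and cases the article doesn't cover are reported as such. In Measurement Protocol, &qt is expressed in milliseconds.

```python
FOUR_HOURS_MS = 4 * 60 * 60 * 1000
THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000

def hit_outcome(qt_ms, method):
    """Describe how a Measurement Protocol hit ends up in session data.

    qt_ms:  value of &qt in milliseconds, or None when it is sent empty.
    method: "ga_api" or "owox" (hypothetical labels for the two methods).
    """
    if method == "ga_api":
        if qt_ms is None:
            return "separate new session"
        if qt_ms > FOUR_HOURS_MS:
            return "not included in any session"
        return "regular session processing"
    if method == "owox":
        if qt_ms is not None and qt_ms <= THIRTY_DAYS_MS:
            return "assigned retroactively to its session"
        return "not specified in this article"
    raise ValueError(f"unknown method: {method}")

five_hours = 5 * 60 * 60 * 1000
print(hit_outcome(five_hours, "ga_api"))
print(hit_outcome(five_hours, "owox"))
```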

Traffic source determination

  • Based on Google Analytics API: The session source is defined according to the Last Non-Direct Click model. This means all direct visits get the source of the last non-direct visit within the previous 6 months. There’s no way to tell whether a visit was actually direct.
  • Based on OWOX BI algorithm: Session sources are attributed by applying the Last Non-Direct Click attribution model, the same as in Google Analytics. To track the actual traffic source, we have added values to the trafficSource.isTrueDirect field. It indicates whether the session started as a direct site visit or follows a session generated by an ad source.
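A minimal sketch of Last Non-Direct Click attribution with an isTrueDirect flag is shown below. It approximates the 6-month lookback as 183 days and uses simplified session dictionaries, both of which are assumptions for illustration. It shows the model, not the OWOX BI code.

```python
from datetime import datetime, timedelta

LOOKBACK = timedelta(days=183)  # assumption: "6 months" approximated as 183 days

def attribute(sessions):
    """Apply Last Non-Direct Click to one user's sessions.

    Each input session has "start" (datetime) and "source", where "(direct)"
    marks a direct visit. Returns new dicts with the attributed "source" and
    an "isTrueDirect" flag that keeps the information Last Click needs.
    """
    result = []
    last_non_direct = None  # (start, source) of the most recent non-direct session
    for s in sorted(sessions, key=lambda item: item["start"]):
        is_direct = s["source"] == "(direct)"
        source = s["source"]
        if is_direct:
            if last_non_direct and s["start"] - last_non_direct[0] <= LOOKBACK:
                source = last_non_direct[1]  # inherit the earlier source
        else:
            last_non_direct = (s["start"], s["source"])
        result.append({"start": s["start"], "source": source, "isTrueDirect": is_direct})
    return result

visits = [
    {"start": datetime(2024, 1, 1), "source": "google"},
    {"start": datetime(2024, 1, 5), "source": "(direct)"},  # inherits "google"
    {"start": datetime(2025, 1, 1), "source": "(direct)"},  # outside the lookback
]
for row in attribute(visits):
    print(row["start"].date(), row["source"], row["isTrueDirect"])
```

With the flag in place, a Last Click report can treat every session where isTrueDirect is true as direct, while a Last Non-Direct Click report uses the attributed source.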

Definition of the utm values for Google Ads auto-tagging (gclid)

  • Based on Google Analytics API: Defined by the Google Analytics API, which has a native integration with Google Ads.
  • Based on OWOX BI algorithm: You need reports with raw Google Ads data in BigQuery. They can be easily set up with the Google Data Transfer native integration.
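Conceptually, the raw Google Ads reports are needed because a gclid carries no campaign details on its own: they have to be looked up. The sketch below shows that lookup with plain dictionaries. The click report layout, the field names, and the google / cpc values are assumptions for illustration, not the actual report schema.

```python
def resolve_gclid(session, click_report):
    """Fill campaign fields of an auto-tagged session from a Google Ads click report.

    click_report maps gclid -> {"campaign": ...}; this layout is hypothetical.
    """
    gclid = session.get("gclid")
    click = click_report.get(gclid) if gclid else None
    if click is None:
        # Manually tagged (utm) session or unknown click: leave it unchanged.
        return session
    return {**session, "source": "google", "medium": "cpc", "campaign": click["campaign"]}

click_report = {"abc123": {"campaign": "spring_sale"}}
tagged = resolve_gclid({"gclid": "abc123"}, click_report)
print(tagged["campaign"])
```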

Table structure

  • Based on Google Analytics API: Tables are divided by days according to the Google Analytics property timezone. Every session is a separate row with nested fields containing raw hit data.
  • Based on OWOX BI algorithm: No differences in the table structure.
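Splitting tables by day in the property timezone means that a hit late in the UTC evening can belong to the next day's table. The sketch below illustrates the date calculation. The YYYYMMDD format and the fixed UTC+1 offset standing in for a property timezone are assumptions.

```python
from datetime import datetime, timedelta, timezone

def table_date(hit_time_utc, property_tz):
    """Return the date (YYYYMMDD, an assumed format) of the daily table for a hit."""
    return hit_time_utc.astimezone(property_tz).strftime("%Y%m%d")

# Stand-in for a property timezone one hour ahead of UTC; a named zone from
# zoneinfo.ZoneInfo could be passed instead.
property_tz = timezone(timedelta(hours=1))
late_hit = datetime(2024, 1, 1, 23, 30, tzinfo=timezone.utc)
print(table_date(late_hit, property_tz))
```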

The start time of session data collection for the previous day

  • Based on Google Analytics API: At 5 a.m., since data becomes available in the Google Analytics Core Reporting API at 4 a.m. according to the Google Analytics property timezone.
  • Based on OWOX BI algorithm: At 1 a.m. according to the Google Analytics property timezone.

Data filtering

  • Based on Google Analytics API: Session data from Google Analytics is filtered according to the current property filters.
  • Based on OWOX BI algorithm: Session data is not filtered in any way.

