August 15, 2018 — OWOX BI Pipeline. Changes in session data collection

Attention all users collecting session data with the Google Analytics→Google BigQuery pipeline.

Currently, if you collect session data based on hit data (with the OWOX BI algorithm), OWOX BI also collects session data based on Google Analytics at the same time.

As a result, you get two session data tables for each day: “owoxbi_sessions_<date>”, based on hit data, and “session_streaming_<date>”, based on Google Analytics data.

Starting November 1, 2018, if you have session data collection based on hit data enabled, OWOX BI will collect the “owoxbi_sessions_<date>” tables only.

What will happen to the old table format and to session data collection based on Google Analytics?

You can still use the “session_streaming_<date>” tables, but we are not updating this data collection method and do not plan to.

If you collect session data based on hit data and still query both the “owoxbi_sessions_<date>” and “session_streaming_<date>” tables for your reports, we recommend that you switch to the “owoxbi_sessions_<date>” tables and use this table name format in all your SQL queries.
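If you keep your SQL queries as text, the renaming step can be scripted. The following is a minimal Python sketch, not an OWOX BI tool: the function name, the sample query, and the project, dataset, and date values in it are hypothetical placeholders. It only renames table references; the two table formats may differ in their fields, so review the rest of each query after the rename.

```python
import re

# Table-name prefixes taken from the announcement above.
OLD_PREFIX = "session_streaming_"
NEW_PREFIX = "owoxbi_sessions_"

# Match the old prefix only where it starts an identifier, so an unrelated
# (hypothetical) name such as "my_session_streaming_copy" is left untouched.
_OLD_TABLE = re.compile(r"(?<![A-Za-z0-9_])" + re.escape(OLD_PREFIX))


def migrate_table_names(sql: str) -> str:
    """Return the query text with session_streaming_ table references renamed."""
    return _OLD_TABLE.sub(NEW_PREFIX, sql)


# Hypothetical query: project, dataset, and date suffix are placeholders.
query = "SELECT COUNT(*) FROM `my_project.my_dataset.session_streaming_20181101`"
print(migrate_table_names(query))
```

Running the sketch on your saved queries and diffing the result before deploying is a cheap way to confirm that only table names changed.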

If you want to continue collecting session data in the “session_streaming_<date>” tables, use session data collection based on Google Analytics.

Why the new algorithm?

Because we want you to get complete and accurate data on user behavior on your website. Here are the benefits of using the OWOX BI algorithm:

  1. The OWOX BI algorithm doesn’t depend on the GA Core Reporting API and calculates sessions based on raw, non-sampled hit data.
  2. Session table collection is not interrupted when the GA Core Reporting API limits are exceeded or access to Google Analytics is lost, and it is not delayed by the import of session table fields from Google Analytics.
  3. The OWOX BI algorithm has no limits such as 500,000 sessions a day, 10 million hits a day, or 500 hits per session. All your data gets into Google BigQuery tables.
  4. The data collected doesn’t depend on the sessionId value sent in a custom dimension (&cd). If that dimension has been set up incorrectly, it can cause data discrepancies that can’t be fixed retrospectively. For example, the beginning and the end of a session can be incorrect, or some hits can end up in the wrong session. With the OWOX BI algorithm, all hits are assigned to their respective sessions, and the discrepancies with the Google Analytics data will be minor.

That’s not all. With the new algorithm, you can also:

  • Track whether a direct click is truly direct rather than one that came from a paid source. The trafficSource.isTrueDirect field makes this possible and lets you attribute site visits using two models: Last Non-Direct Click and Last Click
  • Analyze how audiences from different websites overlap by collecting an additional anonymous identifier, OWOX User ID
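To illustrate how the two attribution models named above differ, here is a minimal Python sketch. It is an illustration under stated assumptions, not the OWOX BI implementation: each session is a hypothetical dict, the "is_true_direct" key stands in for the trafficSource.isTrueDirect field, the source values are made up, and sessions are assumed to be ordered from oldest to newest.

```python
from typing import Dict, List, Optional

# Hypothetical session record: {"source": str, "is_true_direct": bool}.
Session = Dict[str, object]


def last_click(sessions: List[Session]) -> Optional[object]:
    """Last Click: credit the source of the most recent session."""
    return sessions[-1]["source"] if sessions else None


def last_non_direct_click(sessions: List[Session]) -> Optional[object]:
    """Last Non-Direct Click: credit the most recent session that was not
    truly direct; fall back to Last Click if every session was direct."""
    for session in reversed(sessions):
        if not session["is_true_direct"]:
            return session["source"]
    return last_click(sessions)


# Made-up visit history: a paid click followed by a direct visit.
history = [
    {"source": "google / cpc", "is_true_direct": False},
    {"source": "(direct)", "is_true_direct": True},
]
print(last_click(history))
print(last_non_direct_click(history))
```

For this history, Last Click credits the direct visit, while Last Non-Direct Click credits the earlier paid source.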

We continue to improve our session data collection algorithm, so the list of its benefits will only grow. Stay tuned for updates!

Read also:

Difference between two session collection algorithms

Discrepancy in the number of sessions in different collection methods
