Storing and updating transaction data in Google BigQuery

This article provides you with tips on choosing the right table type for CRM data storing and recommendations on CRM data updating.

Learn more about importing transaction data to Google BigQuery from this article.

Storing CRM data

To store CRM data, it is highly recommended to use a date partitioned table in Google BigQuery. This table type has advantages over a standard Google BigQuery table. Read more below.

1. Standard table is easy to create. However, when you run an SQL request against a standard table, all the table cells are scanned for the data you query. Therefore, the more data you upload to this table, the more expenses you might have on data processing and updating.

2. Date partitioned table means that the table data is divided into separate segments (partitions). Queries against date partitioned tables can determine which partitions contain the required data. It allows reducing the amount of data scanned while increasing query performance. For you, it also means better cost control.

To partition the BigQuery table by date, you need:

  • Create an empty table with a schema definition.
  • Add a DATE-type column to the table schema.
  • For Partition and cluster settings, click No partition, then select Partition by field and choose the DATE column. This option is only available if the schema contains a DATE-type column.
  • Upload CRM data into the table.

For more information, see Google BigQuery documentation.

The date in the DATE-type column must be provided according to the timezone of Google Analytics View used in your streaming pipeline.

We do not recommend using a Wildcard table for CRM data storing as this type of table does not support updating information per specified date range in one step. Because the Wildcard table represents a group of tables that share a common prefix, you’ll have to run the same wildcard query multiple times to update each table separately, therefore, you’ll be billed for each query.

If you face the error “This report does not support the partitioning type used in table {table name}” while querying the data partitioned table, follow the recommended steps from this article.

By importing CRM data to Google BigQuery, you’ll get an efficient tool for data management and analysis. While working with the table, please consider its notional structure:

  • Table entities

- User (user_id, client_id, user_phone, and user_email fields)

- Transaction (transaction_* field)

- Product (product_* and promo_* fields)

  • Nested levels

- each User may have multiple Transaction entities

- each Transaction may have multiple Product entities

- Product is at the lowest nested level of table entities

Therefore, each string of your query result outputs information about Product and corresponding Transaction. Because a single Transaction may include multiple Products, you may see the same User and Transaction data for several Products, yet the transaction_revenue field is unique for each Product.

Note, CRM data table does not contain Session-level parameters.

Such a CRM table structure allows using it as a data source for an attribution model calculation and marketing report creation.

Updating CRM data

Transaction status, as well as the price and quantity of the ordered items, may change during an order processing. For further data analysis, it is important to upload all these changes to the CRM table. To do it, use the append method that adds new data to the table without overwriting it. Make sure you specify the date and time of those changes in the transaction_changed field.

You can update your CRM data with the frequency you need. Yet, the recommended frequency is once a day. It allows synchronizing the CRM data actualization with updating of session and cost data tables.

Might be helpful:
Standard SQL in Google BigQuery: Advantages and Examples of Use in Marketing  

Was this article helpful?
1 out of 1 found this helpful
Have more questions? Submit a request


Please sign in to leave a comment.