The ad services → Google BigQuery pipelines collect data into partitioned tables. The data from each ad service is collected into a separate table split into partitions by date/timestamp.
Thanks to this partitioning method, an OWOX BI pipeline uploads historical data under the exact dates on which it is stored in the ad service. BigQuery processes such tables much faster than tables partitioned by ingestion time, so an SQL query against them takes less time and costs less to run.
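For instance, filtering on the partitioning column lets BigQuery scan only the matching date partitions instead of the whole table (the project, dataset, and table names below are placeholders):

```sql
#standardSQL
SELECT
  campaign_id,
  SUM(spend) AS cost
FROM `Project.Dataset.Table`
WHERE date BETWEEN '2020-07-01' AND '2020-07-07'  -- only these partitions are scanned
GROUP BY campaign_id
```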
However, date/timestamp-partitioned tables have a limitation: Google BigQuery supports only the standard SQL dialect for them.
If it's crucial for you to run queries written in legacy SQL, read below to learn how you can do it.
Step 1. In Google BigQuery, compose a query against a table collected by an ad service → Google BigQuery pipeline. The query must be written in standard SQL and must save the data you need from the table into a structure you can later query with legacy SQL.
Query example:
#standardSQL
SELECT
  account_id,
  CAST(date AS STRING) AS date,
  CAST(campaign_id AS STRING) AS campaign_id,
  campaign.name AS campaign_name,
  SUM(impressions) AS impressions,
  SUM((SELECT inline_link_clicks FROM UNNEST(insights.data))) AS clicks,
  SUM(spend) AS cost,
  account_currency AS currency
FROM `Project.Dataset.Table`
WHERE date >= '2020-07-04'
GROUP BY date, account_id, account_currency, campaign_id, campaign_name
ORDER BY date
Step 2. Create a scheduled query from the query you've composed:
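If you prefer the command line to the BigQuery web console, a scheduled query can also be created with the `bq` tool. This is a minimal sketch, assuming the placeholder project, dataset, and table names from the example above; check the flags against the current `bq` reference before relying on them:

```shell
# Create a scheduled query that runs daily and materializes the
# standard SQL result into a destination table.
# Project, Dataset, and table names here are placeholders.
bq query \
  --use_legacy_sql=false \
  --destination_table='Dataset.DestinationTable' \
  --schedule='every 24 hours' \
  --display_name='Ad service data for legacy SQL' \
  'SELECT account_id, CAST(date AS STRING) AS date, SUM(spend) AS cost
   FROM `Project.Dataset.Table`
   GROUP BY account_id, date'
```

Running this requires an authenticated Google Cloud SDK session with access to the project.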
Done. Now you can use legacy SQL with the tables created by the queries you've scheduled.
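Once the scheduled query has populated its destination table, that table can be queried in the legacy dialect. A hypothetical example, assuming the destination table is `Project:Dataset.DestinationTable` (note that legacy SQL uses the `[project:dataset.table]` notation and the column names produced by the query in Step 1):

```sql
#legacySQL
SELECT
  date,
  campaign_name,
  SUM(clicks) AS clicks,
  SUM(cost) AS cost
FROM [Project:Dataset.DestinationTable]
GROUP BY date, campaign_name
ORDER BY date
```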
Note: The method described above is the simplest way to create tables that can be queried with legacy SQL. Scheduled queries also let you create sharded tables using destination table naming templates.
Sharded tables let you write only the new data instead of overwriting existing data, so storing and querying it takes less time and costs less.
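For example, setting the scheduled query's destination table name to a template with the `run_date` parameter produces one shard per day, and legacy SQL can then query a range of shards with TABLE_DATE_RANGE (the table names below are hypothetical):

```sql
-- Destination table name in the scheduled query settings:
--   DestinationTable_{run_date}
-- produces DestinationTable_20200704, DestinationTable_20200705, ...

#legacySQL
SELECT date, SUM(cost) AS cost
FROM TABLE_DATE_RANGE([Project:Dataset.DestinationTable_],
                      TIMESTAMP('2020-07-01'),
                      TIMESTAMP('2020-07-07'))
GROUP BY date
```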