How to delete personal data from BigQuery on a user demand

If a user asks you to delete all their personal data you have collected, use Standard SQL queries in Google BigQuery to find and delete such data. To make a query, you need either a User ID or user email.

Below, you can find the BigQuery tables containing the personal data collected with the OWOX BI pipelines. Also, there are examples of the queries you need to use to find and delete data:

BigQuery tables containing personal data from the sources Query for table search Query for deleting personal data

Source: Google Analytics

Table: streaming_

SELECT date FROM `project.dataset.streaming_*` WHERE userEmail = 'user@email.com' OR userId = '0000' GROUP BY date ORDER BY date DELETE FROM `project.dataset.streaming_YYYYMMDD` WHERE userEmail = 'user@email.com' OR userid = '0000'

Source: Google Analytics 

Table: session_streaming_

SELECT date FROM `project.dataset.session_streaming_*` WHERE user.email = 'user@email.com' OR user.id = '0000' GROUP BY date ORDER BY date  DELETE FROM `project.dataset.session_streaming_YYYYMMDD` WHERE user.email = 'user@email.com' OR user.id = '0000

Source: Google Analytics 

Table: owoxbi_sessions_

SELECT date FROM `project.dataset.owoxbi_sessions_*` WHERE user.email = 'user@email.com' OR user.id = '0000' GROUP BY date ORDER BY date DELETE FROM `project.dataset.owoxbi_sessions_YYYYMMDD` WHERE user.email = 'user@email.com' OR user.id = '0000'

Source: AppsFlyer

Table: android_events_

SELECT FORMAT_DATE("%G%m%d", DATE(TIMESTAMP(event_time))) AS date FROM `project.OWOXBI_AppsFlyerEvents.android_events_*` WHERE appsflyer_device_id = '0000' OR advertising_id = 'xxxx' GROUP BY date ORDER BY date DELETE FROM `project.OWOXBI_AppsFlyerEvents.android_events_YYYYMMDD` WHERE appsflyer_device_id = '0000' OR advertising_id = 'xxxx'

Source: AppsFlyer

Table: ios_events_

SELECT FORMAT_DATE("%G%m%d", DATE(TIMESTAMP(event_time))) AS date FROM `project.OWOXBI_AppsFlyerEvents.ios_events_*` WHERE appsflyer_device_id = '0000' OR idfa = 'xxxx' GROUP BY date ORDER BY date DELETE FROM `project.OWOXBI_AppsFlyerEvents.ios_events_YYYYMMDD` WHERE appsflyer_device_id = '0000' OR idfa = 'xxxx

Source: AppsFlyer

Table: windows_phone_events_

SELECT FORMAT_DATE("%G%m%d", DATE(TIMESTAMP(event_time))) AS date FROM `project.OWOXBI_AppsFlyerEvents.windows_phone_events_*` WHERE appsflyer_device_id = '0000' GROUP BY date ORDER BY date DELETE FROM `project.OWOXBI_AppsFlyerEvents.windows_phone_events_*` WHERE appsflyer_device_id = '0000

Source: SendGrid 

Table: events_

SELECT FORMAT_DATE("%G%m%d", DATE(_PARTITIONTIME)) FROM `project.dataset.events` WHERE email = ‘user@email.com’ GROUP BY date ORDER BY date

DELETE FROM `project.dataset.events` WHERE email = ‘user@email.com’ 

Source: Mandrill 

Table: events_

SELECT FORMAT_DATE("%G%m%d", DATE(ts)) as date
FROM `project.dataset.events_*`
WHERE msg.email = 'user@email.com' GROUP BY date ORDER BY date
DELETE FROM `project.dataset.events_YYYYMMDD`
WHERE msg.email = 'user@email.com'
 

Source: Sparkpost 

Table: owoxbi_sparkpost

SELECT FORMAT_DATE("%G%m%d", DATE(timestamp)) FROM `project.dataset.events_*` WHERE message_event.rcpt_to = 'user@email.com' OR message_event.raw_rcpt_to = 'user@email.com' GROUP BY date ORDER BY date DELETE FROM `project.dataset.events_YYYYMMDD` WHERE message_event.rcpt_to = 'user@email.com' OR message_event.raw_rcpt_to = 'user@email.com

Source: UniOne

Table: email_status_events_

SELECT FORMAT_DATE("%G%m%d", DATE(_PARTITIONTIME)) as date FROM `project.dataset.email_status_events` WHERE email = ‘user@email.com’ GROUP BY date ORDER BY date DELETE FROM DELETE FROM `project.dataset.email_status_events` WHERE email = ‘user@email.com’ 

 

Values highlighted with bold must be replaced with the values relevant to your project and users.

In your queries, replace the values:

  • project with the project name
  • dataset with the dataset name
  • YYYYMMDD with the date of the table with the data you need

How to delete data from tables

The data can be deleted from tables with the Standard SQL queries only.

1. In the query for table search, type in the User ID or user email.

2. The query results will show you the list of tables which contain the personal data by dates. These dates you need for the DELETE query.

Date_search_results.png

Exceptions are partition tables in SendGrid and UniOne pipelines.

3. In the DELETE query, type in one of the dates you got in the “YYYYMMDD” format and again the User ID or email.

You can’t apply the DELETE query to the data in several tables at once — you’ll have to do it for each date one-by-one.

Please note: The DELETE function is one of the DML functions in Standard SQL. To use it, uncheck the Legacy SQL box in Options. BigQuery will also show how many lines were deleted as the result of the query:

image1.png

Using the DELETE query is limited. Read about the BigQuery limits and quotas before running queries.

Was this article helpful?
0 out of 0 found this helpful
Have more questions? Submit a request

0 Comments

Please sign in to leave a comment.