I have a table in BigQuery that looks something like this:
schema = [ bigquery.SchemaField('timestamp', 'TIMESTAMP', mode='REQUIRED', description='Data point timestamp'), bigquery.SchemaField('event_id', 'STRING', description='EventID'), [...] ]
The table has a fairly large dataset, and I’m trying to find write an efficient query that returns the number of events that happened in the last 24 hours but also within the last N days. That is, two different records with different conditions but the same
event_id. I don’t care so much about the actual
event_id, but rather the distribution.
Ideally, the query would return something like this:
7_days: 20 30_days: 15 60_days: 7
If it’s impossible to do this in pure SQL, I also have Pandas available at my disposal.