BigQuery
This page describes how to integrate Tableflow with Google BigQuery so that you can query Iceberg tables created by WarpStream directly in BigQuery.
Tableflow can automatically register tables in BigQuery and update a table's metadata location to point to the latest snapshot.
Prerequisites
To use this integration, the WarpStream Agents must be running v737 or later.
1. Create the BigQuery Dataset
Create a BigQuery dataset to hold your Tableflow tables. The dataset must exist before enabling the integration.
```sh
bq mk --dataset --location=<gcs_bucket_region> <project_id>:<dataset_id>
```

Critical requirement: The BigQuery dataset location must match your GCS bucket region. For example, if your bucket is in `us-east1`, your dataset must also be in `us-east1`. If the locations do not match, BigQuery will be unable to read the data files.
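After creating the dataset, you can confirm its location with `bq show`; the `location` field in the output should match your bucket's region:

```sh
# Prints dataset metadata as JSON; verify the "location" field.
bq show --format=prettyjson <project_id>:<dataset_id>
```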
2. Grant IAM Permissions
The Tableflow Agent's service account requires the following IAM roles:

| Role | Purpose |
| --- | --- |
| `roles/bigquery.dataEditor` | Create and update external tables |
| `roles/storage.objectViewer` | Read Iceberg metadata from GCS |
Grant them via:
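A minimal sketch using `gcloud`, assuming hypothetical project, service account, and bucket names (substitute your own):

```sh
# Hypothetical names; replace with your project, service account, and bucket.
SA="tableflow-agent@my-gcp-project.iam.gserviceaccount.com"

# Allow the Agent to create and update BigQuery tables.
gcloud projects add-iam-policy-binding my-gcp-project \
  --member="serviceAccount:${SA}" \
  --role="roles/bigquery.dataEditor"

# Allow the Agent to read Iceberg metadata from the GCS bucket.
gcloud storage buckets add-iam-policy-binding gs://my-warpstream-bucket \
  --member="serviceAccount:${SA}" \
  --role="roles/storage.objectViewer"
```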
3. Add Table Configuration
Add the following BigQuery configuration to your table config (see the sketch after the tables below):
Top-Level Defaults (bigquery_defaults)
These defaults apply to all tables unless overridden per-table.

| Field | Description |
| --- | --- |
| `project_id` | The GCP project ID containing the BigQuery dataset |
| `dataset_id` | The BigQuery dataset ID where tables will be created |
Per-Table Configuration (bigquery_table_config)
| Field | Description |
| --- | --- |
| `enabled` | Set to `true` to enable BigQuery sync for this table |
| `table_id` | The BigQuery table name to create/update |
| `project_id` | Overrides the default `project_id` for this table |
| `dataset_id` | Overrides the default `dataset_id` for this table |
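Putting it together, a minimal sketch of what this might look like in a YAML table config. Only the `bigquery_defaults` and `bigquery_table_config` field names come from the tables above; the surrounding `tables:` structure and all values are illustrative assumptions.

```yaml
# Sketch only: field names are from this page, structure and values are assumed.
bigquery_defaults:
  project_id: my-gcp-project      # GCP project containing the BigQuery dataset
  dataset_id: tableflow_dataset   # dataset created in step 1

tables:
  - name: clicks
    bigquery_table_config:
      enabled: true               # turn on BigQuery sync for this table
      table_id: clicks            # BigQuery table to create/update
  - name: orders
    bigquery_table_config:
      enabled: true
      table_id: orders_v2
      dataset_id: analytics       # per-table override of the default dataset
```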
4. Query the Data
Once enabled, your tables will appear in the BigQuery console. Query them using standard SQL:
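For example, using the illustrative project, dataset, and table names from the sketch above:

```sql
-- Replace with your own project, dataset, and table IDs.
SELECT *
FROM `my-gcp-project.tableflow_dataset.clicks`
LIMIT 10;
```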