Google Spanner (beta)

Flags

To use Spanner as the backing store, pass the agent a bucket URL in the following format:

spanner://projects/$PROJECT/instances/$INSTANCE/databases/$DATABASE

This is the most common way to address an individual Spanner database in GCP. Replace $PROJECT, $INSTANCE and $DATABASE with your GCP project ID, Spanner instance ID and database name.

The database must already be provisioned by the user. We recommend not sharing it with other applications, to prevent accidental deletions or other incidents. On startup, WarpStream agents create the tables the data plane needs inside this database if they do not exist yet. There are two simple tables: warpstream_files and warpstream_chunks. Modifying these tables by any means other than the WarpStream agent itself results in undefined behavior and most likely a broken cluster.
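As a minimal sketch of how the three components fit into the URL, the snippet below formats them and rejects values that would corrupt the path. The helper name spanner_bucket_url and the sample values are hypothetical, not part of WarpStream, and the only validation is a basic check for empty segments and slashes.

```python
def spanner_bucket_url(project: str, instance: str, database: str) -> str:
    """Format the Spanner bucket URL from its three components.

    Hypothetical helper for illustration only; it performs no checks against GCP.
    """
    parts = (("project", project), ("instance", instance), ("database", database))
    for label, value in parts:
        # An empty segment or an embedded "/" would change the path structure.
        if not value or "/" in value:
            raise ValueError(f"invalid {label}: {value!r}")
    return f"spanner://projects/{project}/instances/{instance}/databases/{database}"


# Sample values are placeholders, not real resources.
print(spanner_bucket_url("my-project", "my-instance", "my-database"))
```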

As with S3 Express and DynamoDB, we recommend replacing the -bucketURL flag with separate -ingestionBucketURL and -compactionBucketURL flags. The former should point to Spanner and the latter to GCS. See the last two paragraphs of S3 Express above for details.
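The split can be sketched as a small argument builder. This is an illustration under assumptions: the function name is hypothetical, the GCS URL is a placeholder whose exact scheme should be checked against the GCS section of these docs, and passing each flag and its value as separate arguments is assumed rather than confirmed.

```python
from typing import List


def agent_storage_flags(spanner_url: str, gcs_url: str) -> List[str]:
    """Return the two split-bucket flags that replace a single -bucketURL.

    Hypothetical helper; only the flag names come from the text above.
    """
    return [
        "-ingestionBucketURL", spanner_url,  # ingestion points to Spanner
        "-compactionBucketURL", gcs_url,     # compaction points to GCS
    ]


flags = agent_storage_flags(
    "spanner://projects/my-project/instances/my-instance/databases/my-database",
    "gs://my-compaction-bucket",  # assumption: placeholder GCS bucket URL
)
print(" ".join(flags))
```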

We also recommend setting the -batchTimeout flag as low as 50 ms. With S3 as the backing store, lowering this value increases costs: S3 API usage is billed per request regardless of payload size, so larger batches are cheaper. Spanner charges for compute and storage regardless of the number of API calls, so a lower batch timeout reduces produce latency without affecting cost.
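To make the cost argument concrete, here is a rough back-of-the-envelope sketch. It assumes a simplified model in which a single agent flushes exactly once per batch timeout around the clock, uses 250 ms purely as an illustrative comparison point, and uses an illustrative per-request price of $0.005 per 1,000 requests; check current pricing before relying on the numbers.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 30-day month


def flushes_per_month(batch_timeout_ms: int) -> int:
    """Flushes per month for one agent flushing once per timeout (simplified model)."""
    return SECONDS_PER_MONTH * 1000 // batch_timeout_ms


def per_request_cost(batch_timeout_ms: int, price_per_1000_requests: float) -> float:
    """Monthly cost under per-request billing; the price is an assumed input."""
    return flushes_per_month(batch_timeout_ms) / 1000 * price_per_1000_requests


for timeout_ms in (250, 50):
    requests = flushes_per_month(timeout_ms)
    cost = per_request_cost(timeout_ms, 0.005)  # assumption: illustrative price
    print(f"{timeout_ms} ms: {requests} requests/month, ${cost:.2f} per agent")
```

Under this model, dropping the timeout from 250 ms to 50 ms multiplies the request count, and therefore any per-request bill, by five. With Spanner the request count does not enter the bill, which is why the lower timeout is recommended here.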

Finally, WarpStream's own control plane batching can be tuned for lower latency. See Control Plane Latency above.

Google IAM Permissions

We recommend running the agents with the IAM role roles/spanner.databaseUser assigned to them for the relevant database. This role gives agents all the permissions they need to run the data plane.
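One way to grant this role at the database level is the gcloud CLI. The sketch below only assembles the command so it can be reviewed before it is run. It assumes the agents authenticate as a service account, and the function name and all sample values are hypothetical.

```python
import shlex
from typing import List


def grant_database_user_cmd(database: str, instance: str, service_account_email: str) -> List[str]:
    """Build a gcloud command that binds roles/spanner.databaseUser on one database.

    Hypothetical helper; assumes the agents run as the given service account.
    """
    return [
        "gcloud", "spanner", "databases", "add-iam-policy-binding", database,
        f"--instance={instance}",
        f"--member=serviceAccount:{service_account_email}",
        "--role=roles/spanner.databaseUser",
    ]


cmd = grant_database_user_cmd(
    "my-database",
    "my-instance",
    "warpstream-agent@my-project.iam.gserviceaccount.com",  # placeholder account
)
print(shlex.join(cmd))
```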
