
Use the Agent with an S3-Compatible Object Store (MinIO, Oracle Cloud, R2, etc.)

This page explains how to use the WarpStream Agent with an S3-compatible object store like MinIO, Oracle Cloud, and R2.
The WarpStream Agent has native support (it embeds the provider-specific SDK) for AWS S3, GCP GCS, and Azure Blob Storage. In addition, it can use the AWS S3 SDK to integrate with any "S3 compatible" object store, such as MinIO or Oracle Cloud Object Storage.
To use the Agent with an S3 compatible object store, provide credentials for the store and force the S3 client to construct "path style" URLs by adding s3ForcePathStyle=true (along with the store's endpoint) to the bucket URL.
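To make "path style" concrete, here is a minimal sketch of how the same object is addressed in each URL style. The endpoint, bucket, and object key are hypothetical placeholders, and the two helper functions exist only for illustration; they are not part of the Agent or the AWS SDK. Self-hosted and third-party stores are typically reached through a single hostname or IP, where a per-bucket subdomain may not resolve, which is why path style is usually the safer choice for them.

```shell
#!/bin/sh
# Illustrative only: shows how one object is addressed in each S3 URL style.
# All endpoint, bucket, and key values below are hypothetical placeholders.

# Path style: the bucket is the first segment of the URL path.
path_style_url() {
  # $1 = endpoint (scheme://host[:port]), $2 = bucket, $3 = object key
  printf '%s/%s/%s\n' "$1" "$2" "$3"
}

# Virtual-hosted style: the bucket becomes a subdomain of the endpoint host.
virtual_hosted_url() {
  scheme="${1%%://*}"   # text before "://"
  host="${1#*://}"      # text after "://"
  printf '%s://%s.%s/%s\n' "$scheme" "$2" "$host" "$3"
}

path_style_url "http://localhost:9000" "warpstream-minio-bucket" "some/object"
virtual_hosted_url "http://localhost:9000" "warpstream-minio-bucket" "some/object"
```

The first call prints a URL that only requires `localhost:9000` to be reachable; the second requires `warpstream-minio-bucket.localhost` to resolve as well.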
If you have a MinIO Docker container running locally on port 9000, create an Access Key in the MinIO UI, then run the Agent like this:
AWS_ACCESS_KEY_ID="wKghTMkQrFqszshHJcop" \
warpstream demo \
-bucketURL "s3://warpstream-minio-bucket?s3ForcePathStyle=true&endpoint="
The MinIO team has a more detailed integration guide on their website as well.
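The bucket URLs on this page all follow the same shape: bucket name, s3ForcePathStyle=true, an optional region, and an endpoint. As a rough sketch, a small shell helper can assemble that string so the pieces are easier to see. The function name `build_bucket_url` is hypothetical and not part of the warpstream CLI, and the `http://localhost:9000` endpoint is an assumption based on the local MinIO setup described above; substitute the endpoint for your own store.

```shell
#!/bin/sh
# Hypothetical helper (not part of the warpstream CLI): assembles a bucket URL
# using the query parameters that appear in the examples on this page.
build_bucket_url() {
  # $1 = bucket name, $2 = endpoint URL, $3 = optional region (e.g. "auto")
  url="s3://$1?s3ForcePathStyle=true"
  if [ -n "${3:-}" ]; then
    url="$url&region=$3"
  fi
  printf '%s&endpoint=%s\n' "$url" "$2"
}

# Assumed endpoint for a MinIO container listening locally on port 9000.
build_bucket_url "warpstream-minio-bucket" "http://localhost:9000"
```

The result can be passed to the Agent with `-bucketURL "$(build_bucket_url ...)"`. Keep the double quotes, since an unquoted `&` would be interpreted by the shell.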
To use the Agent with Cloudflare R2:
  1. Create an account with Cloudflare.
  2. Create an R2 bucket.
  3. Create an R2 access token.
Then run the Agent:
warpstream demo -bucketURL "s3://warpstream-demo-for-fun?s3ForcePathStyle=true&region=auto&endpoint="
Note that if you run multiple WarpStream Agents this way in non-demo mode, by default they must all run on the same internal network. If the Agents believe they are running in the same "availability zone", they will attempt to form a distributed cache with each other to reduce R2 API GET requests, and that requires them to reach each other directly.
However, if you want to run multiple Agents in separate networks or regions while still letting them function as a single "Kafka Cluster", assign each one a dedicated availability zone.
For example, Agent 1:
WARPSTREAM_AVAILABILITY_ZONE="personal_laptop_chicago" \
warpstream agent -bucketURL "s3://warpstream-demo-for-fun?s3ForcePathStyle=true&region=auto&endpoint="
Agent 2:
WARPSTREAM_AVAILABILITY_ZONE="personal_laptop_nashville" \
warpstream agent -bucketURL "s3://warpstream-demo-for-fun?s3ForcePathStyle=true&region=auto&endpoint="
This signals to each Agent that it should not attempt to communicate with the others directly over the local network, and that it should behave as if it were running in a different availability zone. Data can still be streamed from Chicago to Nashville (or vice versa) because the Agents use R2 as "the network".
The net result is a "multi-region" Cluster that can read and write all topics/partitions from multiple regions at the same time.
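When more than two locations are involved, it can help to generate the launch commands from a list of zone names. The following is a dry-run sketch only: it prints each command instead of executing it, the function name `print_agent_command` is hypothetical, and the `<your-r2-endpoint>` placeholder must be replaced with your own endpoint. The environment variable and the `warpstream agent -bucketURL` invocation are the ones shown in the examples above.

```shell
#!/bin/sh
# Dry run: prints one launch command per availability zone instead of running it.
# BUCKET_URL contains a placeholder endpoint; replace it with your own value.
BUCKET_URL='s3://warpstream-demo-for-fun?s3ForcePathStyle=true&region=auto&endpoint=<your-r2-endpoint>'

print_agent_command() {
  # $1 = availability zone name, unique to the network this Agent runs in
  printf 'WARPSTREAM_AVAILABILITY_ZONE="%s" warpstream agent -bucketURL "%s"\n' "$1" "$BUCKET_URL"
}

for zone in personal_laptop_chicago personal_laptop_nashville; do
  print_agent_command "$zone"
done
```

Each printed line is meant to be run on the machine in the corresponding location. Agents that do share an internal network can share a zone name.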