Object Storage Configuration

Bucket Configuration

Tableflow manages the lifecycle of the objects it writes to object storage, including deleting compacted or expired data files and unneeded snapshots. For that reason, do not configure a retention policy on your bucket, and make sure that object versioning and object soft deletion are disabled. A retention policy that removes files Tableflow still considers live would make the Iceberg table unqueryable, while versioning or soft deletion causes the storage service to keep copies of objects that Tableflow has already deleted.
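
For an S3 bucket managed with Terraform, the guidance above might look like the following minimal sketch. The resource name `tableflow` is hypothetical, and the example assumes version 4 or later of the AWS provider and a bucket that has never had versioning enabled. Soft deletion is a separate setting on other providers (for example GCS and Azure Blob Storage) and is not shown here.

```hcl
# Hypothetical bucket for Tableflow data. Note what is intentionally absent:
# there is no aws_s3_bucket_lifecycle_configuration resource, so S3 never
# expires objects on its own.
resource "aws_s3_bucket" "tableflow" {
  bucket = "<my-bucket>"
}

# Keep versioning off. "Disabled" is only valid for buckets that have never
# been versioned; a previously versioned bucket can only be "Suspended".
resource "aws_s3_bucket_versioning" "tableflow" {
  bucket = aws_s3_bucket.tableflow.id

  versioning_configuration {
    status = "Disabled"
  }
}
```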

Bucket Permissions

The Tableflow Agents need the appropriate permissions to interact with the bucket.

Specifically, the Agents need permission to perform the following operations:

  • PutObject

    • To create data files and snapshots.

  • GetObject

    • To read existing data files during compaction.

  • DeleteObject

    • To enforce retention, clean up compacted files, and prune old snapshots.

  • ListBucket

    • To list the objects in the bucket when enforcing retention, cleaning up compacted files, and pruning old snapshots.

Below is an example Terraform configuration for an AWS IAM policy document that provides Tableflow with the appropriate permissions to access an S3 bucket:

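data "aws_iam_policy_document" "warpstream_s3_policy_document" {
  # Object-level actions apply to the objects under the Tableflow prefix.
  statement {
    sid     = "AllowS3Objects"
    effect  = "Allow"
    actions = [
      "s3:PutObject",
      "s3:GetObject",
      "s3:DeleteObject",
    ]
    resources = [
      "arn:aws:s3:::<my-bucket>/warpstream/_tableflow/*",
    ]
  }

  # s3:ListBucket is a bucket-level action, so it must target the bucket ARN
  # itself. Granting it on an object ARN has no effect.
  statement {
    sid       = "AllowS3List"
    effect    = "Allow"
    actions   = ["s3:ListBucket"]
    resources = ["arn:aws:s3:::<my-bucket>"]
  }
}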
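
A policy document data source grants nothing until it is turned into a policy and attached to an identity. The sketch below shows one way to do that, assuming the Agents run under an existing IAM role. The policy name `warpstream-tableflow-s3` and the placeholder `<my-agent-role-name>` are hypothetical and should be replaced with values from your own environment.

```hcl
# Render the policy document above into a managed IAM policy.
# The policy name is a hypothetical example.
resource "aws_iam_policy" "warpstream_s3_policy" {
  name   = "warpstream-tableflow-s3"
  policy = data.aws_iam_policy_document.warpstream_s3_policy_document.json
}

# Attach the policy to the role the Agents assume.
# "<my-agent-role-name>" is a placeholder, not a real role.
resource "aws_iam_role_policy_attachment" "warpstream_s3_attachment" {
  role       = "<my-agent-role-name>"
  policy_arn = aws_iam_policy.warpstream_s3_policy.arn
}
```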
