Object Storage Configuration
Bucket Configuration
Tableflow will manage the lifecycle of the objects in object storage, such as deleting compacted or expired data files and unneeded snapshots. As such, do not configure a retention policy on your bucket, and make sure that object versioning and object soft deletion are disabled. Removing files that Tableflow still considers live would make the Iceberg table unqueryable.
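As a rough illustration of the guidance above on AWS, the following Terraform sketch creates a bucket with versioning left off and deliberately defines no lifecycle rules. The resource labels (`tableflow`) and the `<my-bucket>` placeholder are assumptions for illustration, not names that Tableflow requires. Soft deletion is a feature of other object stores and has no equivalent setting here.

```hcl
# Hypothetical bucket; replace <my-bucket> with your own bucket name.
resource "aws_s3_bucket" "tableflow" {
  bucket = "<my-bucket>"
}

# Record explicitly that versioning stays off. The "Disabled" status is only
# valid for buckets that have never had versioning enabled.
resource "aws_s3_bucket_versioning" "tableflow" {
  bucket = aws_s3_bucket.tableflow.id

  versioning_configuration {
    status = "Disabled"
  }
}

# Intentionally no aws_s3_bucket_lifecycle_configuration resource:
# Tableflow deletes expired and compacted objects itself.
```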
Bucket Permissions
The Tableflow Agents need the appropriate permissions to interact with the bucket.
Specifically, they need permission to perform the following operations:
PutObject
To create data files and snapshots.
GetObject
To read existing data files during compaction.
DeleteObject
To enforce retention, clean up compacted files, and prune old snapshots.
ListBucket
To list the objects that need to be deleted when enforcing retention, cleaning up compacted files, and pruning old snapshots.
Below is an example Terraform configuration for an AWS IAM policy document that gives Tableflow the permissions it needs to access an S3 bucket:
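data "aws_iam_policy_document" "warpstream_s3_policy_document" {
  # Object-level actions apply to object ARNs under the Tableflow prefix.
  statement {
    sid    = "AllowS3Objects"
    effect = "Allow"
    actions = [
      "s3:PutObject",
      "s3:GetObject",
      "s3:DeleteObject",
    ]
    resources = [
      "arn:aws:s3:::<my-bucket>/warpstream/_tableflow/*",
    ]
  }

  # s3:ListBucket is a bucket-level action, so it must target the bucket ARN.
  statement {
    sid     = "AllowS3List"
    effect  = "Allow"
    actions = ["s3:ListBucket"]
    resources = [
      "arn:aws:s3:::<my-bucket>",
    ]
  }
}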
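A policy document data source grants nothing on its own; it has to be rendered into a policy and attached to the identity the Agents run as. The sketch below shows one way to do that, assuming the Agents use an existing IAM role. The policy name `warpstream-tableflow-s3` and the `<my-agent-role>` placeholder are hypothetical and should be replaced with values from your own deployment.

```hcl
# Render the policy document above into a managed IAM policy.
resource "aws_iam_policy" "warpstream_s3_policy" {
  name   = "warpstream-tableflow-s3" # hypothetical name
  policy = data.aws_iam_policy_document.warpstream_s3_policy_document.json
}

# Attach the policy to the role assumed by the Agents.
# <my-agent-role> is a placeholder for the name of an existing role.
resource "aws_iam_role_policy_attachment" "warpstream_s3" {
  role       = "<my-agent-role>"
  policy_arn = aws_iam_policy.warpstream_s3_policy.arn
}
```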