# AWS DynamoDB

## Flags

To use DynamoDB as the agent's backing store, pass a bucket URL in the following format.

```
dynamodb://$aws_region/$files_table<>$chunks_table
```

`$aws_region` is the agent's current region. If `$files_table` and `$chunks_table` are existing DynamoDB tables accessible from the same region, the agent will use those for storage. If the tables don't exist, the agent will create them. The need for two separate tables is an implementation detail that shouldn't otherwise affect developers.
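The URL format above can be assembled and checked with a small helper. This is only a sketch: the function names `build_dynamodb_bucket_url` and `parse_dynamodb_bucket_url`, and the example region and table names, are hypothetical and not part of WarpStream.

```python
def build_dynamodb_bucket_url(region: str, files_table: str, chunks_table: str) -> str:
    """Assemble a bucket URL of the form dynamodb://$aws_region/$files_table<>$chunks_table."""
    for label, value in (("region", region),
                         ("files_table", files_table),
                         ("chunks_table", chunks_table)):
        if not value:
            raise ValueError(f"{label} must not be empty")
    return f"dynamodb://{region}/{files_table}<>{chunks_table}"


def parse_dynamodb_bucket_url(url: str) -> tuple[str, str, str]:
    """Split a DynamoDB bucket URL back into (region, files_table, chunks_table)."""
    prefix = "dynamodb://"
    if not url.startswith(prefix):
        raise ValueError("URL must start with dynamodb://")
    # Everything before the first "/" is the region, the rest names the two tables.
    region, slash, tables = url[len(prefix):].partition("/")
    files_table, separator, chunks_table = tables.partition("<>")
    if not (region and slash and files_table and separator and chunks_table):
        raise ValueError("expected dynamodb://$aws_region/$files_table<>$chunks_table")
    return region, files_table, chunks_table


# Hypothetical region and table names, for illustration only.
url = build_dynamodb_bucket_url("us-east-1", "ws_files", "ws_chunks")
print(url)
```

Note that `<>` is interpreted by most shells as redirection, so the URL should be quoted when it is passed on a command line.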

As with S3 Express, we recommend replacing the `-bucketURL` flag with separate `-ingestionBucketURL` and `-compactionBucketURL` flags. The former should point to DynamoDB and the latter to S3. See the last two paragraphs of [S3 Express](#s3-express) above for details.

We also recommend lowering the [`-batchTimeout` flag](#batch-timeout) to as little as 50 ms. When S3 is the backing store, lowering this value increases costs: S3 API usage is billed per request regardless of payload size, so larger batches are cheaper. DynamoDB instead charges per byte written and read, regardless of the number of API calls. A lower batch timeout therefore reduces produce latency without affecting cost.
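The flag recommendations above can be collected into a single argument list, sketched below. The helper name `agent_storage_flags` is hypothetical, the S3 URL is a placeholder rather than a verified format, and the `50ms` value syntax is an assumption; check the agent's flag reference for the exact forms it accepts.

```python
def agent_storage_flags(ingestion_url: str, compaction_url: str,
                        batch_timeout: str = "50ms") -> list[str]:
    """Return the storage-related agent flags recommended for a DynamoDB setup."""
    return [
        "-ingestionBucketURL", ingestion_url,    # newly produced data goes to DynamoDB
        "-compactionBucketURL", compaction_url,  # compacted data goes to S3
        "-batchTimeout", batch_timeout,          # assumed duration syntax
    ]


flags = agent_storage_flags(
    "dynamodb://us-east-1/ws_files<>ws_chunks",  # hypothetical region and tables
    "s3://example-compaction-bucket",            # placeholder, not a verified URL format
)
print(flags)
```

Passing the flags as a list (for example to `subprocess.run`) rather than as one shell string avoids having to quote the `<>` separator.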

Finally, WarpStream's own control plane batching can be tuned for lower latency. See [Control Plane Latency](#control-plane-latency) above.

## AWS IAM Permissions

The process running the agent requires the following IAM permissions to use DynamoDB as the backing store.

```
"dynamodb:BatchWrite*",
"dynamodb:CreateTable",
"dynamodb:DeleteItem",
"dynamodb:Update*",
"dynamodb:PutItem",
"dynamodb:TagResource",
"dynamodb:BatchGet*",
"dynamodb:DescribeStream",
"dynamodb:DescribeTable",
"dynamodb:Get*",
"dynamodb:Query",
"dynamodb:Scan"
```
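For reference, the actions listed above can be wrapped in a complete IAM policy document. The sketch below generates one in Python. Scoping the statement to the two table ARNs and their streams is an assumption rather than a documented requirement, and the function name, account ID, and table names are hypothetical. Widen the `Resource` list if the agent reports access errors.

```python
import json

# The actions listed in this section, in the same order.
DYNAMODB_ACTIONS = [
    "dynamodb:BatchWrite*",
    "dynamodb:CreateTable",
    "dynamodb:DeleteItem",
    "dynamodb:Update*",
    "dynamodb:PutItem",
    "dynamodb:TagResource",
    "dynamodb:BatchGet*",
    "dynamodb:DescribeStream",
    "dynamodb:DescribeTable",
    "dynamodb:Get*",
    "dynamodb:Query",
    "dynamodb:Scan",
]


def build_agent_policy(region: str, account_id: str, table_names: list[str]) -> dict:
    """Build an IAM policy document granting the actions above on the given tables."""
    resources = []
    for table in table_names:
        table_arn = f"arn:aws:dynamodb:{region}:{account_id}:table/{table}"
        resources.append(table_arn)
        # Assumption: include the table's streams so DescribeStream can match.
        resources.append(f"{table_arn}/stream/*")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": DYNAMODB_ACTIONS, "Resource": resources}
        ],
    }


# Hypothetical account ID and table names, for illustration only.
policy = build_agent_policy("us-east-1", "123456789012", ["ws_files", "ws_chunks"])
print(json.dumps(policy, indent=2))
```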

## Cost estimates

The table below presents the rough cost of each AWS service that can be used as the agent's storage layer for hypothetical workloads of a hundred kilobytes, one megabyte, and ten megabytes per second. These estimates rest on several assumptions, for example that one agent is deployed in each of three availability zones and that the compression ratio is 1:4. In the case of DynamoDB with provisioned usage, the budget is over-provisioned by a factor of 2 for headroom. Most importantly, these numbers only reflect the storage cost of keeping the last five seconds of data at any time. Since we recommend storing compacted data in S3 regardless of where it's first ingested, the table below excludes any storage costs incurred after compaction. See the last two paragraphs of [S3 Express](#s3-express) above for details.

| Storage layer        | 100 KB / s | 1 MB / s | 10 MB / s |
| -------------------- | ---------- | -------- | --------- |
| S3                   | $ 159      | $ 159    | $ 159     |
| S3 Express           | $ 235      | $ 235    | $ 235     |
| DynamoDB on-demand   | $ 81       | $ 810    | $ 8100    |
| DynamoDB provisioned | $ 7.5      | $ 75     | $ 750     |

While these numbers are only estimates, they illustrate the advantage of using DynamoDB as the agent's storage layer for workloads with sufficiently low throughput.
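To make "sufficiently low throughput" concrete, the sketch below computes the throughput at which each DynamoDB mode would cost the same as the flat S3 and S3 Express estimates. It assumes the DynamoDB figures scale linearly with throughput, which the three table columns suggest but which is otherwise an extrapolation. All names in the code are illustrative.

```python
# Flat estimates from the table (identical across the three throughput columns).
FLAT_COSTS = {"S3": 159.0, "S3 Express": 235.0}

# DynamoDB estimates from the 1 MB/s column, used as a per-MB/s rate.
DYNAMODB_COST_PER_MB_S = {"on-demand": 810.0, "provisioned": 75.0}


def break_even_mb_per_s(flat_cost: float, cost_per_mb_s: float) -> float:
    """Throughput in MB/s at which a linearly scaling cost equals a flat cost."""
    return flat_cost / cost_per_mb_s


for flat_name, flat_cost in FLAT_COSTS.items():
    for mode, rate in DYNAMODB_COST_PER_MB_S.items():
        threshold = break_even_mb_per_s(flat_cost, rate)
        print(f"DynamoDB {mode} matches {flat_name} at about {threshold:.2f} MB/s")
```

Under these assumptions, DynamoDB on-demand stays cheaper than S3 below roughly 0.2 MB/s, and DynamoDB provisioned stays cheaper below roughly 2.1 MB/s.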

