# AWS Glue

Tableflow can automatically register tables in Glue and update a table's metadata location to point to the latest snapshot.

### Prerequisites

Before enabling the integration, upgrade the WarpStream Agents to at least [version 710](https://docs.warpstream.com/warpstream/overview/change-log#release-v710). The Agents must also have an IAM policy attached that grants the `glue:GetTable`, `glue:CreateTable`, and `glue:UpdateTable` permissions on the catalogs, databases, and tables that you want Tableflow to manage. The IAM policy should look like the following:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:CreateTable",
        "glue:GetTable",
        "glue:UpdateTable"
      ],
      "Resource": [
        "arn:aws:glue:<region>:<account-id>:catalog",
        "arn:aws:glue:<region>:<account-id>:database/<database-name>",
        "arn:aws:glue:<region>:<account-id>:table/<database-name>/<table-name>"
      ]
    }
  ]
}
```
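If you manage several tables, it can help to generate the policy document rather than edit the placeholders by hand. The following is a minimal sketch, not an official tool: the function name `build_glue_policy` and the sample region, account ID, database, and table values are hypothetical, and the output only mirrors the policy shape shown above.

```python
import json


def build_glue_policy(region, account_id, database_name, table_name):
    """Return an IAM policy dict with the same shape as the example above."""
    prefix = f"arn:aws:glue:{region}:{account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "glue:CreateTable",
                    "glue:GetTable",
                    "glue:UpdateTable",
                ],
                "Resource": [
                    f"{prefix}:catalog",
                    f"{prefix}:database/{database_name}",
                    f"{prefix}:table/{database_name}/{table_name}",
                ],
            }
        ],
    }


# Hypothetical values, for illustration only.
policy = build_glue_policy("us-east-1", "123456789012", "analytics", "json_logs")
print(json.dumps(policy, indent=2))
```

The printed JSON can then be attached to the role that the Agents run as, using whichever tooling you already use to manage IAM.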

### Configuration

To enable this feature, update the configuration YAML with the following:

```yaml
tables:
    - source_topic: "example_json_logs_topic"
      ...
      aws_glue_table_config:
        enabled: true # or false to disable the integration
        catalog_id: '<glue-catalog-id>'
        database_name: '<glue-database-name>'
        table_name: '<glue-table-name>'
      schema:
      ...
```

**Required parameters**

`enabled`

Specifies whether the Glue integration should run.

`database_name`

Specifies the database in which to create the table. The database must already exist; Tableflow does not create it automatically.

`table_name`

Specifies the name the Glue table should be created with. This can be different from the name of the table in Tableflow.

**Optional parameters**

`catalog_id`

Specifies the ID of the Data Catalog in which to create the table. If none is supplied, the AWS account ID is used.

Note that the `database_name` and `table_name` parameters must match the resources in the IAM policy.
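A mismatch between the configuration and the policy is easy to miss, so a quick local check can be useful. The sketch below is an assumption-laden illustration, not part of Tableflow: the helper `missing_glue_resources` is hypothetical, it assumes `Statement` is a list, it only looks at `Allow` statements and their `Resource` entries (not actions or conditions), and it approximates IAM wildcard matching with Python's `fnmatch`.

```python
from fnmatch import fnmatchcase


def missing_glue_resources(policy, region, account_id, glue_config):
    """Return the Glue ARNs for this table config that no Allow statement covers."""
    database = glue_config["database_name"]
    table = glue_config["table_name"]
    prefix = f"arn:aws:glue:{region}:{account_id}"
    needed = [
        f"{prefix}:catalog",
        f"{prefix}:database/{database}",
        f"{prefix}:table/{database}/{table}",
    ]

    # Collect every resource pattern from the Allow statements.
    allowed = []
    for statement in policy["Statement"]:
        if statement.get("Effect") == "Allow":
            resource = statement["Resource"]
            allowed.extend([resource] if isinstance(resource, str) else resource)

    # fnmatch is only an approximation of IAM's "*" and "?" wildcards.
    return [
        arn for arn in needed
        if not any(fnmatchcase(arn, pattern) for pattern in allowed)
    ]


# Hypothetical policy and config values, for illustration only.
example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["glue:CreateTable", "glue:GetTable", "glue:UpdateTable"],
            "Resource": [
                "arn:aws:glue:us-east-1:123456789012:catalog",
                "arn:aws:glue:us-east-1:123456789012:database/analytics",
                "arn:aws:glue:us-east-1:123456789012:table/analytics/*",
            ],
        }
    ],
}
example_config = {"database_name": "analytics", "table_name": "json_logs"}

missing = missing_glue_resources(
    example_policy, "us-east-1", "123456789012", example_config
)
print("missing resources:", missing)
```

An empty list means every ARN derived from `database_name` and `table_name` is covered by at least one resource pattern; any ARN that is returned points at a name in the configuration that the policy does not allow.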
