# DuckDB

A video walkthrough can be found below:

{% embed url="<https://youtu.be/k--BNATvkZE>" %}

## Introduction

DuckDB cannot connect directly to an Apache Kafka-compatible service. However, [Kwack](https://github.com/rayokota/kwack), a command-line tool built on DuckDB, bridges that gap. This guide explains how to connect the two systems so that you can run analytics on your WarpStream-managed topics.

## Prerequisites

1. Have DuckDB [installed](https://duckdb.org/docs/installation/).
2. Have Kwack [installed](https://github.com/rayokota/kwack) (requires Java 11 or higher).
3. Have a WarpStream account. You can get access to WarpStream by registering [here](https://console.warpstream.com/signup).
4. Have a Serverless WarpStream cluster up and running with a populated topic.

## Step 1: Get your WarpStream credentials

Obtain the Bootstrap Broker from the WarpStream console by navigating to your cluster and clicking the Connect tab. If you don't have SASL credentials, you can also [create a set of credentials](/warpstream/kafka/manage-security/sasl-authentication.md#creating-credentials) from the console.

<figure><img src="/files/49rbkV2voeB6c95DCl8u" alt=""><figcaption><p>WarpStream Cluster Management</p></figcaption></figure>

Save these values for the next step.

## Step 2: Prepare your Kwack parameters

Kwack accepts all of the connection information, and even SQL queries, as command-line switches. A more reproducible approach is to keep the settings in a "properties" file, such as the one below:

```
# Topics to manage
topics=topic1

# Key serdes (default is binary)
key.serdes=topic1=string

# Value serdes (default is latest)
value.serdes=topic1=json:@/mypath/topic1_schema.json

# The bootstrap servers for your Kafka cluster
bootstrap.servers=<YOUR_BOOTSTRAP_BROKER>:<YOUR_PORT>
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='<YOUR_SASL_USERNAME>' password='<YOUR_SASL_PASSWORD>';
```
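If you would rather not hard-code credentials, the properties file can be generated from environment variables. The following is a minimal sketch rather than part of Kwack itself: the variable names (`WARPSTREAM_BOOTSTRAP`, `WARPSTREAM_SASL_USERNAME`, `WARPSTREAM_SASL_PASSWORD`, `KWACK_CONFIG`) and their fallback values are hypothetical placeholders chosen for illustration.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical environment variables; the fallbacks are placeholders only.
BOOTSTRAP="${WARPSTREAM_BOOTSTRAP:-localhost:9092}"
SASL_USER="${WARPSTREAM_SASL_USERNAME:-example-user}"
SASL_PASS="${WARPSTREAM_SASL_PASSWORD:-example-password}"
CONFIG_FILE="${KWACK_CONFIG:-myconfig.properties}"

# The file holds a secret, so new files created below are
# readable and writable only by the current user.
umask 077

# Write the same keys shown in the properties example above.
# Note: a password containing a single quote would break the JAAS line.
cat > "$CONFIG_FILE" <<EOF
topics=topic1
key.serdes=topic1=string
value.serdes=topic1=json:@/mypath/topic1_schema.json
bootstrap.servers=${BOOTSTRAP}
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='${SASL_USER}' password='${SASL_PASS}';
EOF

echo "Wrote ${CONFIG_FILE}"
```

Export the three `WARPSTREAM_*` variables with the values from Step 1 before running the script, and keep the generated file out of version control.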

A schema registry or a local file can describe your data in various formats. For this example, we use a local schema definition in JSON format. Assuming a simple "customers" layout, the JSON schema would look something like the following:

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "customerId": {
      "type": "string"
    },
    "name": {
      "type": "string"
    },
    "zone": {
      "type": "string"
    },
    "address": {
      "type": "string"
    },
    "membership": {
      "type": "string"
    }
  }
}
```

To connect to more than one topic, separate the per-topic values with commas, as follows:

```
# Topics to manage
topics=topic1,topic2

# Key serdes (default is binary)
key.serdes=topic1=string,topic2=string

# Value serdes (default is latest)
value.serdes=topic1=json:@/mypath/topic1_schema.json,topic2=json:@/mypath/topic2_schema.json

# The bootstrap servers for your Kafka cluster
bootstrap.servers=<YOUR_BOOTSTRAP_BROKER>:<YOUR_PORT>
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='<YOUR_SASL_USERNAME>' password='<YOUR_SASL_PASSWORD>';
```

## Step 3: Consuming with Kwack

Kwack can combine run-time switches with a properties file. To launch Kwack with a properties file, use the `-F` switch:

```bash
kwack -F myconfig.properties
```

At this point, you can run SQL queries against the active Kafka topics in WarpStream, including joins across multiple topics for analytics. To persist the topics into a DuckDB database, add the `-d` switch:

```bash
kwack -F myconfig.properties -d mydb.duckdb
```
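
A query can also be passed on the command line instead of being typed interactively. The sketch below assumes that Kwack's `-q` switch accepts a single SQL statement and that each topic is exposed as a table named after the topic (`topic1` here); check `kwack --help` for your version before relying on either assumption.

```bash
# Assumption: -q runs one SQL statement and topic1 is exposed as a table.
QUERY='SELECT * FROM topic1 LIMIT 10'

# Only run the query when Kwack is actually installed and on the PATH.
if command -v kwack >/dev/null 2>&1; then
  kwack -F myconfig.properties -q "$QUERY"
else
  echo "kwack is not on PATH; skipping query" >&2
fi
```

Keeping the statement in a variable makes it easy to swap in a join or an aggregate once you know the column names of your topics.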

## Next Steps

Congratulations! You can now read your WarpStream topics directly with Kwack and optionally save them as a DuckDB database. Kwack can also export your topics as Parquet files, among many other useful features.
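
As one possible starting point for the Parquet export, DuckDB's `COPY ... TO` statement can write a table to a Parquet file. This sketch assumes that Kwack's `-q` switch passes the statement through to DuckDB and that the topic is available as the table `topic1`; the output file name `topic1.parquet` is a placeholder.

```bash
# Assumption: Kwack forwards this DuckDB COPY statement unchanged.
EXPORT_SQL="COPY topic1 TO 'topic1.parquet' (FORMAT 'parquet')"

# Only attempt the export when Kwack is installed and on the PATH.
if command -v kwack >/dev/null 2>&1; then
  kwack -F myconfig.properties -q "$EXPORT_SQL"
else
  echo "kwack is not on PATH; skipping export" >&2
fi
```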

