# Rockset

## Prerequisites

1. WarpStream account - get access to WarpStream by registering [here](https://console.warpstream.com/signup).
2. Rockset account - get access to Rockset by registering [here](https://rockset.com/create/).
3. WarpStream cluster up and running.

## Step 1: Create a topic in your WarpStream cluster

Obtain the Bootstrap Broker from the WarpStream console by navigating to your cluster and then clicking the Connect tab. If you don't have SASL credentials yet, you can also [create a set of credentials](https://docs.warpstream.com/warpstream/kafka/manage-security/sasl-authentication#creating-credentials) from the console.

Store these values as environment variables for easy reference:

```bash
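# Single quotes keep special characters in the values from being interpreted by the shell.
export BOOTSTRAP_HOST='<YOUR_BOOTSTRAP_BROKER>'
export SASL_USERNAME='<YOUR_SASL_USERNAME>'
export SASL_PASSWORD='<YOUR_SASL_PASSWORD>'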
```
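
Before running the commands in the following steps, it can help to confirm that all three variables are set in the current shell. The snippet below is a minimal sketch; `check_warpstream_env` is a hypothetical helper name introduced here, not part of the WarpStream CLI.

```bash
# Hypothetical helper (not part of the WarpStream CLI): reports any of the
# three variables from this step that are unset or empty.
check_warpstream_env() {
  missing=0
  for var in BOOTSTRAP_HOST SASL_USERNAME SASL_PASSWORD; do
    # Look up the value of the variable whose name is stored in $var.
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "Missing environment variable: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}

check_warpstream_env && echo "All WarpStream variables are set."
```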

Then, create a topic using the WarpStream CLI:

{% code overflow="wrap" %}

```bash
warpstream kcmd -bootstrap-host "$BOOTSTRAP_HOST" -tls -username "$SASL_USERNAME" -password "$SASL_PASSWORD" -type create-topic -topic rockset_demo
```

{% endcode %}

You should see the following output in your terminal:

`Created topic rockset_demo.`

## Step 2: Produce some records

Using the WarpStream CLI, produce several messages to your topic:

{% code overflow="wrap" %}

```bash
warpstream kcmd -bootstrap-host "$BOOTSTRAP_HOST" -tls -username "$SASL_USERNAME" -password "$SASL_PASSWORD" -type produce -topic rockset_demo --records '{"action": "click", "user_id": "user_0", "page_id": "home"},,{"action": "hover", "user_id": "user_0", "page_id": "home"},,{"action": "scroll", "user_id": "user_0", "page_id": "home"},,{"action": "click", "user_id": "user_1", "page_id": "home"},,{"action": "click", "user_id": "user_1", "page_id": "home"},,{"action": "click", "user_id": "user_2", "page_id": "home"}'
```

{% endcode %}

Note that the WarpStream CLI uses double commas (`,,`) as a delimiter between JSON records.
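
If your records are available one JSON object per line, the `--records` value can be assembled by joining the lines with the double-comma delimiter. The following is a sketch under that assumption; `join_records` and `RECORDS` are names introduced here for illustration and are not part of the WarpStream CLI.

```bash
# Hypothetical helper: joins the lines read from stdin with ",,",
# the record delimiter described above.
join_records() {
  awk 'NR > 1 { printf ",," } { printf "%s", $0 } END { print "" }'
}

# Example input: one JSON record per line, piped into the helper.
RECORDS="$(printf '%s\n' \
  '{"action": "click", "user_id": "user_0", "page_id": "home"}' \
  '{"action": "hover", "user_id": "user_0", "page_id": "home"}' \
  | join_records)"

echo "$RECORDS"
```

The resulting value can then be passed as `--records "$RECORDS"` in the produce command. To read from a file instead, redirect it into the helper, for example `join_records < records.jsonl` (a hypothetical file name).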

## Step 3: Set up a Rockset Integration with WarpStream

In the Rockset Console, navigate to [Integrations > WarpStream](https://console.rockset.com/integrations/new?path=warpstream).

Click "Start", and fill in the information for your WarpStream cluster.

<figure><img src="https://77315434-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FjB7FxO8ty4EXO4HsQP4E%2Fuploads%2Fgit-blob-32a9f981b0027e633c28c56c4dc1dbb41e6478ff%2FScreenshot%202023-12-22%20at%202.14.46%E2%80%AFPM.png?alt=media" alt=""><figcaption><p>Be sure to add the port in your Bootstrap Server URL! Your environment variable from Step 1 omits this because the WarpStream CLI defaults to using port 9092 for Kafka requests.</p></figcaption></figure>
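
Because `BOOTSTRAP_HOST` from Step 1 has no port, one way to get the value for the Bootstrap Server URL field is to append the port in the shell and copy the result. This sketch assumes the default port 9092 mentioned in the caption; `with_port`, `BOOTSTRAP_SERVER`, and the `example.invalid` fallback are placeholders introduced here for illustration.

```bash
# Hypothetical helper: appends the default Kafka port unless the value
# already contains a colon (treated here as "already has a port").
with_port() {
  case "$1" in
    *:*) printf '%s\n' "$1" ;;
    *)   printf '%s:9092\n' "$1" ;;
  esac
}

# Falls back to a placeholder host if BOOTSTRAP_HOST is not set.
BOOTSTRAP_SERVER="$(with_port "${BOOTSTRAP_HOST:-example.invalid}")"
echo "$BOOTSTRAP_SERVER"
```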

Click "Save Integration."

## Step 4: Create a Collection from your Integration

In the Rockset Console, after creating your Integration, click "Create Collection from Integration".

<figure><img src="https://77315434-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FjB7FxO8ty4EXO4HsQP4E%2Fuploads%2Fgit-blob-42297bbe45c49293649fbf9dde281d4880477527%2FScreenshot%202023-12-22%20at%202.27.25%E2%80%AFPM.png?alt=media" alt=""><figcaption></figcaption></figure>

Set the Kafka Topic name to `rockset_demo`, and set Starting Offset to `Earliest`. Select `JSON` as the Data Format. In the Source Preview window, you should see the messages that you produced in Step 2.

## Step 5: Filter out records with null values for user\_id

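In the Rockset Console, write a query in the Ingest Transformation Query Editor that filters out messages with a null or empty `user_id`. Ingest transformations read from the `_input` source, so one possible form, shown here only as a sketch, is `SELECT * FROM _input WHERE user_id IS NOT NULL AND user_id != ''`.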

<figure><img src="https://77315434-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FjB7FxO8ty4EXO4HsQP4E%2Fuploads%2Fgit-blob-62eda214a956ce1b674927f186daf04945e30f3d%2FScreenshot%202023-12-22%20at%202.30.34%E2%80%AFPM.png?alt=media" alt=""><figcaption></figcaption></figure>

## Step 6: Save your Collection

In the Rockset Console, name your Collection and save it with the default settings.

<figure><img src="https://77315434-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FjB7FxO8ty4EXO4HsQP4E%2Fuploads%2Fgit-blob-f8d73531729a96b1422acfa30163a213ff8c7358%2FScreenshot%202023-12-22%20at%202.33.05%E2%80%AFPM.png?alt=media" alt=""><figcaption></figcaption></figure>

## Step 7: Produce more messages to your WarpStream topic

In your terminal, produce several more messages:

{% code overflow="wrap" %}

```bash
warpstream kcmd -bootstrap-host "$BOOTSTRAP_HOST" -tls -username "$SASL_USERNAME" -password "$SASL_PASSWORD" -type produce -topic rockset_demo --records '{"action": "click", "user_id": "user_2", "page_id": "home"},,{"action": "hover", "user_id": "user_1", "page_id": "home"},,{"action": "scroll", "user_id": "user_0", "page_id": "home"},,{"action": "click", "user_id": "", "page_id": "home"},,{"action": "click", "user_id": "", "page_id": "home"},,{"action": "click", "user_id": "user_0", "page_id": "home"}'
```

{% endcode %}

Look closely! We sent six records to WarpStream, but two of them have an empty string (`""`) for `user_id`.

In the Rockset Console, look at the Summary tab of your Collection overview. You should notice that only **four records** from this latest batch made it into the Collection. Rockset filtered out the two records with an empty `user_id`!
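
The expected count can also be confirmed locally by splitting the batch on the `,,` delimiter and tallying the records whose `user_id` is the empty string `""`, the value sent in this batch. This is only a sanity check written with standard shell tools; it does not query Rockset, and the variable names are introduced here for illustration.

```bash
# The same batch that was produced in this step.
RECORDS='{"action": "click", "user_id": "user_2", "page_id": "home"},,{"action": "hover", "user_id": "user_1", "page_id": "home"},,{"action": "scroll", "user_id": "user_0", "page_id": "home"},,{"action": "click", "user_id": "", "page_id": "home"},,{"action": "click", "user_id": "", "page_id": "home"},,{"action": "click", "user_id": "user_0", "page_id": "home"}'

# Put each record on its own line by replacing the ",," delimiter with a newline.
SPLIT="$(printf '%s\n' "$RECORDS" | awk '{ gsub(/,,/, "\n"); print }')"

# Count all records, then those with an empty user_id.
TOTAL="$(printf '%s\n' "$SPLIT" | grep -c '"user_id"')"
EMPTY="$(printf '%s\n' "$SPLIT" | grep -c '"user_id": ""')"
KEPT=$((TOTAL - EMPTY))

echo "sent=$TOTAL empty_user_id=$EMPTY expected_in_collection=$KEPT"
# prints: sent=6 empty_user_id=2 expected_in_collection=4
```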

## Next steps

Congrats! Now you know how to integrate WarpStream with Rockset. Next, check out the WarpStream docs on [how to configure the Agent](https://docs.warpstream.com/warpstream/agent-setup/deploy) for production, or review the [Rockset docs](https://rockset.com/docs/) to learn more about what's possible with WarpStream and Rockset.
