Rockset

This page describes how to integrate WarpStream with Rockset, ingest data into Rockset from WarpStream, and query the data in Rockset.

Prerequisites

  1. WarpStream account - get access to WarpStream by registering here.

  2. Rockset account - get access to Rockset by registering here.

  3. WarpStream cluster up and running.

Step 1: Create a topic in your WarpStream cluster

Obtain the Bootstrap Broker from the WarpStream console by navigating to your cluster and then clicking the Connect tab. If you don't have SASL credentials yet, you can also create a set of credentials from the console.

Store these values as environment variables for easy reference:

export BOOTSTRAP_HOST="<YOUR_BOOTSTRAP_BROKER>" \
SASL_USERNAME="<YOUR_SASL_USERNAME>" \
SASL_PASSWORD="<YOUR_SASL_PASSWORD>"
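Before running the commands that follow, it can help to confirm that all three variables are actually set in your current shell. The snippet below is a minimal sketch; `missing_vars` is a hypothetical helper written for this page, not part of the WarpStream CLI.

```shell
# Hypothetical helper (not part of the WarpStream CLI): list which of the
# connection variables from this step are still unset or empty.
missing_vars() {
  missing=""
  for name in BOOTSTRAP_HOST SASL_USERNAME SASL_PASSWORD; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  # Print the collected names without the leading space.
  echo "${missing# }"
}

result="$(missing_vars)"
if [ -n "$result" ]; then
  echo "Missing: $result" >&2
else
  echo "All connection variables are set."
fi
```

If anything is reported as missing, re-run the export command above in the same terminal session.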

Then, create a topic using the WarpStream CLI:

warpstream kcmd -bootstrap-host "$BOOTSTRAP_HOST" -tls -username "$SASL_USERNAME" -password "$SASL_PASSWORD" -type create-topic -topic rockset_demo

You should see the following output in your terminal:

Created topic rockset_demo.

Step 2: Produce some records

Using the WarpStream CLI, produce several messages to your topic:

warpstream kcmd -bootstrap-host "$BOOTSTRAP_HOST" -tls -username "$SASL_USERNAME" -password "$SASL_PASSWORD" -type produce -topic rockset_demo --records '{"action": "click", "user_id": "user_0", "page_id": "home"},,{"action": "hover", "user_id": "user_0", "page_id": "home"},,{"action": "scroll", "user_id": "user_0", "page_id": "home"},,{"action": "click", "user_id": "user_1", "page_id": "home"},,{"action": "click", "user_id": "user_1", "page_id": "home"},,{"action": "click", "user_id": "user_2", "page_id": "home"}'

Note that the WarpStream CLI uses double commas (,,) as a delimiter between JSON records.
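Long records strings are easy to mistype, so you may prefer to build them from individual JSON objects. The following is a minimal sketch of that idea; `join_records` is a hypothetical helper defined here, and the only behavior it relies on is the double-comma delimiter described above.

```shell
# Hypothetical helper: join each argument (one JSON record) with the ",,"
# delimiter that the WarpStream CLI expects between records.
join_records() {
  joined=""
  for record in "$@"; do
    if [ -z "$joined" ]; then
      joined="$record"
    else
      joined="$joined,,$record"
    fi
  done
  printf '%s\n' "$joined"
}

# Build a two-record payload and show the resulting string.
RECORDS="$(join_records \
  '{"action": "click", "user_id": "user_0", "page_id": "home"}' \
  '{"action": "hover", "user_id": "user_0", "page_id": "home"}')"
echo "$RECORDS"
```

You can then pass the result to the produce command shown above as `--records "$RECORDS"`.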

Step 3: Set up a Rockset Integration with WarpStream

In the Rockset Console, navigate to Integrations > WarpStream.

Click "Start", and fill in the connection details for your WarpStream cluster, such as the Bootstrap Broker and SASL credentials from Step 1.

Click "Save Integration."

Step 4: Create a Collection from your Integration

In the Rockset console, after creating your Integration, click "Create Collection from Integration".

Set the Kafka Topic name to rockset_demo, and set Starting Offset to Earliest. Select JSON as the Data Format. In the Source Preview window, you should see the messages that you produced in Step 2.

Step 5: Filter out records with null values for user_id

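In the Rockset Console, write a query in the Ingest Transformation Query Editor that filters out messages with null user_ids. The demo records produced later in Step 7 represent a missing user_id as an empty string (""), so the filter should exclude empty strings as well as nulls.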
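The exact query is up to you. As a sketch, the snippet below stores one candidate query in a shell variable and prints it so you can paste it into the editor. The query text is an assumption rather than the original tutorial's wording: it reads from `_input`, the source name Rockset uses in ingest transformations, and treats both NULL and empty-string user_id values as missing. The variable name `INGEST_TRANSFORMATION` is made up for this example.

```shell
# Candidate ingest transformation (an assumption, adjust as needed).
# The quoted heredoc delimiter keeps the SQL text exactly as written.
INGEST_TRANSFORMATION="$(cat <<'SQL'
SELECT *
FROM _input
WHERE user_id IS NOT NULL
  AND user_id != ''
SQL
)"

# Print the query so it can be copied into the Rockset Console.
printf '%s\n' "$INGEST_TRANSFORMATION"
```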

Step 6: Save your Collection

In the Rockset Console, name your Collection and save it with the default settings.

Step 7: Produce more messages to your WarpStream topic

In your terminal, produce several more messages:

warpstream kcmd -bootstrap-host "$BOOTSTRAP_HOST" -tls -username "$SASL_USERNAME" -password "$SASL_PASSWORD" -type produce -topic rockset_demo --records '{"action": "click", "user_id": "user_2", "page_id": "home"},,{"action": "hover", "user_id": "user_1", "page_id": "home"},,{"action": "scroll", "user_id": "user_0", "page_id": "home"},,{"action": "click", "user_id": "", "page_id": "home"},,{"action": "click", "user_id": "", "page_id": "home"},,{"action": "click", "user_id": "user_0", "page_id": "home"}'

Look closely! We sent six records to WarpStream, but two of them have an empty string ("") for user_id, which this tutorial treats as a missing (null) value.

In the Rockset Console, look at the Summary tab of your Collection overview. You should notice that only four records made it into the Collection from this latest batch. Rockset filtered out the two records with empty values for user_id!
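To sanity-check the expected count without leaving your terminal, you can approximate the filter locally. This is only a sketch: `count_kept` is a hypothetical helper, it matches on the exact `"user_id": ""` formatting used by the demo records instead of parsing JSON, and it assumes no record contains a double comma inside a value.

```shell
# Hypothetical helper: split a ",,"-delimited records string and count the
# records that have a non-empty user_id (the ones expected to be ingested).
count_kept() {
  rest="$1"
  kept=0
  while [ -n "$rest" ]; do
    case "$rest" in
      *,,*) record="${rest%%,,*}"; rest="${rest#*,,}" ;;  # take the first record
      *)    record="$rest"; rest="" ;;                    # last record
    esac
    case "$record" in
      *'"user_id": ""'*) ;;                      # empty user_id: filtered out
      *'"user_id":'*)    kept=$((kept + 1)) ;;   # non-empty user_id: kept
    esac
  done
  echo "$kept"
}

# The six records produced in this step.
BATCH='{"action": "click", "user_id": "user_2", "page_id": "home"},,{"action": "hover", "user_id": "user_1", "page_id": "home"},,{"action": "scroll", "user_id": "user_0", "page_id": "home"},,{"action": "click", "user_id": "", "page_id": "home"},,{"action": "click", "user_id": "", "page_id": "home"},,{"action": "click", "user_id": "user_0", "page_id": "home"}'

count_kept "$BATCH"   # prints 4
```

Records that lack a user_id key entirely are also left uncounted, which mirrors how a null check would treat them.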

Next steps

Congrats! Now you know how to integrate WarpStream with Rockset. Next, check out the WarpStream docs on how to configure the Agent for production, or review the Rockset docs to learn more about what's possible with WarpStream and Rockset.

Apache, Apache Kafka, Kafka, and associated open source project names are trademarks of the Apache Software Foundation. Kinesis is a trademark of Amazon Web Services.