Security and Privacy Considerations

Overview

WarpStream upholds the security, privacy, and compliance standards required to handle mission-critical workloads for our customers.

In the interest of transparency, WarpStream maintains a compliance portal that documents our security and compliance practices, including certification reports and details of the controls that we have implemented.

In addition to following the best practices and controls documented on our compliance portal, WarpStream also supports Kafka ACLs, as well as SASL/PLAIN and SASL/SCRAM-SHA-512 authentication for both Serverless clusters and clusters with Agents running in your environment (BYOC).
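As an illustration of the authentication options above, here is a minimal sketch of a client configuration for SASL. It uses librdkafka-style property names (as accepted by clients such as confluent-kafka); the helper name `sasl_client_config`, the bootstrap address, and the credentials are hypothetical placeholders, and whether TLS is used depends on how your cluster is deployed.

```python
# Sketch only: builds a librdkafka-style config dict for SASL authentication.
# The function name, host, and credentials are illustrative assumptions.

SUPPORTED_MECHANISMS = {"PLAIN", "SCRAM-SHA-512"}  # mechanisms named in this page


def sasl_client_config(bootstrap, username, password,
                       mechanism="PLAIN", use_tls=True):
    """Return a client config dict for the given SASL mechanism."""
    if mechanism not in SUPPORTED_MECHANISMS:
        raise ValueError(f"unsupported SASL mechanism: {mechanism}")
    return {
        "bootstrap.servers": bootstrap,
        # SASL_SSL wraps the connection in TLS; SASL_PLAINTEXT does not.
        "security.protocol": "SASL_SSL" if use_tls else "SASL_PLAINTEXT",
        "sasl.mechanisms": mechanism,
        "sasl.username": username,
        "sasl.password": password,
    }


# Placeholder values; substitute your own cluster address and credentials.
config = sasl_client_config(
    "kafka.example.internal:9092", "my-user", "my-secret",
    mechanism="SCRAM-SHA-512",
)
```

The resulting dict could then be passed to a Kafka client constructor that accepts librdkafka properties.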

Data isolation for Bring Your Own Cloud (BYOC) clusters

WarpStream's "Bring Your Own Cloud" product is designed to maintain strict security and privacy by ensuring that raw data written to WarpStream clusters never leaves your VPC or object storage buckets.

The only data that ever leaves your VPC is metadata about your Kafka workloads that is required for the correct functioning of your clusters, which includes the following:

  1. Topic names

  2. Topic metadata (partition counts, configuration, etc.)

  3. File metadata (object store bucket name, compressed size, uncompressed size, etc.)

  4. Record timestamps and offsets (but never record keys or record contents)

  5. Consumer group names, configuration and offsets

  6. Kafka client IDs

  7. Producer IDs, epochs, and sequence numbers

  8. Agent Metadata (stored ephemerally in memory, never persisted to disk)

    1. Number of connections (for load balancing)

    2. Number of vCPUs (for determining how many concurrent jobs the Agent can run) and their utilization.

    3. Internal / Private IP addresses. These addresses are not routable from the internet, and are required so that the Agents can cluster with each other within a single availability zone.

    4. Availability zone.

  9. A small sample of the Agent's logs so that we can help diagnose and debug issues remotely. This can be disabled by setting the -disableLogsCollection flag or the WARPSTREAM_DISABLE_LOGS_COLLECTION=true environment variable. These logs never contain raw data; they contain only things like error messages and high-level statistics.

  10. The Agent's profiling data so that we can investigate performance degradations remotely. This can be disabled with the -disableProfileForwarding flag or the WARPSTREAM_DISABLE_PROFILE_FORWARDING environment variable. These profiles only contain information about program execution.
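The two opt-outs in items 9 and 10 can be sketched as environment variables set before the Agent process starts. The helper name `agent_env_without_telemetry` is hypothetical, and the value "true" for WARPSTREAM_DISABLE_PROFILE_FORWARDING is an assumption made by analogy with the logs variable; this page only names that variable.

```python
import os


def agent_env_without_telemetry(base_env=None):
    """Return a copy of the environment with both opt-outs set.

    Sketch only: the helper name is illustrative, not part of WarpStream.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Documented above: disables forwarding of the Agent's log sample.
    env["WARPSTREAM_DISABLE_LOGS_COLLECTION"] = "true"
    # Assumption: "true" mirrors the logs variable; only the variable
    # name itself is given in the list above.
    env["WARPSTREAM_DISABLE_PROFILE_FORWARDING"] = "true"
    return env


env = agent_env_without_telemetry({"PATH": "/usr/bin"})
```

The returned mapping can be handed to whatever launches the Agent in your deployment, for example as the `env` argument of `subprocess.run`. The same opt-outs are available as the -disableLogsCollection and -disableProfileForwarding flags.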

Data isolation for BYOC Schema Registry clusters

WarpStream's BYOC Schema Registry is also designed to maintain strict security and privacy by ensuring that raw schemas registered to the BYOC Schema Registry clusters never leave your VPC or object storage buckets.

The only data that leaves your VPC is metadata about your schemas that is necessary for the correct functioning of your schema registry clusters, which includes the following:

  1. Schema metadata: schema data format, schema ID

  2. Schema subject names

  3. Schema subject metadata: schema context name, compatibility rule, subject version, schema ID, soft-deleted status

  4. File metadata: object store bucket name, schema size

  5. Schema reference metadata: subject, subject version

  6. Schema context name

    1. Global configuration: default compatibility rule

  7. Agent Metadata (stored ephemerally in memory, never persisted to disk)

    1. Number of connections (for load balancing)

    2. Number of vCPUs (for determining how many concurrent jobs the Agent can run) and their utilization.

    3. Internal / Private IP addresses. These addresses are not routable from the internet, and are required so that the Agents can cluster with each other within a single availability zone.

    4. Availability zone.

  8. A small sample of the Agent's logs so that we can help diagnose and debug issues remotely. This can be disabled by setting the -disableLogsCollection flag or the WARPSTREAM_DISABLE_LOGS_COLLECTION=true environment variable. These logs never contain raw data; they contain only things like error messages and high-level statistics.

  9. The Agent's profiling data so that we can investigate performance degradations remotely. This can be disabled with the -disableProfileForwarding flag or the WARPSTREAM_DISABLE_PROFILE_FORWARDING environment variable. These profiles only contain information about program execution.
