Network Architecture Considerations
This page describes a variety of different approaches that can be used to deploy WarpStream with more advanced network setups.
Normal Network Architecture Setup
In most WarpStream deployments, client applications must connect directly to the WarpStream agents. This requires direct layer 3 network connectivity between the client applications and agents with no proxies, load balancers, NATing, etc. in the middle.
The architecture below shows a normal network setup in which all of the client applications can communicate directly with all of the WarpStream Agents.

This is the recommended architecture for most WarpStream deployments because it is the easiest, most effective, and most performant way to run a WarpStream cluster.
Approaches for Connectivity Between Unconnected Networks
In some situations, the direct connectivity described in the previous section is not always possible or desired. One example situation would be when the WarpStream agents are deployed in a Kubernetes cluster, but the client applications are deployed outside of the Kubernetes cluster in a completely different VPC.
There are two different approaches for enabling connectivity between Kafka clients and the WarpStream Agents when the clients and Agents are running in different networks with no direct connectivity between them: Agent Groups and a TCP load balancer.
In general, Agent Groups are the preferred solution. They're easier to set up, (generally) more cost-effective, and they don't suffer from any of the performance penalties that are associated with using a TCP load balancer.
Agent Groups (recommended)
WarpStream's diskless architecture means that any Agent can write or read data for any topic-partition. As a result, WarpStream clusters can be split into distinct "groups" that are completely isolated from each other at the networking / service discovery layer.

This feature is called Agent Groups and is very useful for enabling a single WarpStream cluster to be flexed across multiple disparate networks with no inter-connectivity without incurring the cost and performance penalties of using a TCP load balancer.
For more details, read the Agent Groups documentation.
TCP Load Balancer
Make sure you've at least considered using the Agent Groups approach before deploying a TCP load balancer for WarpStream.
While you can run the WarpStream Agents behind a load balancer, keep in mind that it may result in reduced performance. Whenever possible, direct connectivity between Kafka clients and the WarpStream Agents is preferred, especially for high volume workloads.
If the Agent Group approach is not viable for some reason, you'll have to set up a TCP load balancer instead.

Agent configuration:
WARPSTREAM_DEFAULT_VIRTUAL_CLUSTER_ID=$VIRTUAL_CLUSTER_ID
WARPSTREAM_REQUIRE_SASL_AUTHENTICATION=true
WARPSTREAM_DISCOVERY_KAFKA_HOSTNAME_OVERRIDE=$LOAD_BALANCER_HOSTNAME
In some cases, the load balancer may be listening on a port that's different from the port the Agents are listening on (default 9092 for TCP/Kafka protocol traffic). In that scenario, you'll need to add one additional environment variable to the Agent configuration:
WARPSTREAM_DISCOVERY_KAFKA_PORT_OVERRIDE=$EXTERNAL_NLB_PORT
This instructs the Agents to advertise the load balancer's port within the Kafka protocol instead of the port that the Agents are listening on.
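For example, assuming the load balancer is reachable at kafka-lb.internal.example.com on port 9094 (both placeholder values), the full Agent environment from this section would look like:
WARPSTREAM_DEFAULT_VIRTUAL_CLUSTER_ID=$VIRTUAL_CLUSTER_ID
WARPSTREAM_REQUIRE_SASL_AUTHENTICATION=true
WARPSTREAM_DISCOVERY_KAFKA_HOSTNAME_OVERRIDE=kafka-lb.internal.example.com
WARPSTREAM_DISCOVERY_KAFKA_PORT_OVERRIDE=9094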
Kubernetes
Keep in mind that introducing TCP load balancers between Kafka clients and the WarpStream Agents can lead to performance issues.
Whenever possible, try to solve connectivity problems with Agent Groups instead.

Running WarpStream within Kubernetes can be simple and straightforward with our Helm charts.
However, when applications running outside of the Kubernetes cluster / VPC need to connect to WarpStream, additional configuration may be required.
In this example setup, we will have at least 3 Helm deployments for 3 different Agent Groups. See Agent Groups for more information about groups.
Agent Group One will handle applications running in the same Kubernetes cluster as the agents via direct connectivity within Kubernetes.
Agent Group Two will handle applications running in the same VPC as the Kubernetes cluster but not running in the Kubernetes cluster itself.
Agent Group Three will handle applications running outside of the VPC, for example connecting over the internet.
In all three cases, the bootstrap server will be printed out in the NOTES section during the helm install.
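If you need to see that bootstrap address again later, the chart notes can usually be re-printed with helm get notes, shown here for the first release name used in this example:
helm get notes warpstream-agent-one --namespace $YOUR_NAMESPACE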
Below are the recommended Helm values to set for the various groups.
# one-values.yaml (Agent Group One)
config:
  agentGroup: one
  bucketURL: <WARPSTREAM_BUCKET_URL>
  apiKey: <WARPSTREAM_AGENT_APIKEY>
  virtualClusterID: <WARPSTREAM_VIRTUAL_CLUSTER_ID>
  region: <WARPSTREAM_CLUSTER_REGION>
# two-values.yaml (Agent Group Two)
config:
  agentGroup: two
  bucketURL: <WARPSTREAM_BUCKET_URL>
  apiKey: <WARPSTREAM_AGENT_APIKEY>
  virtualClusterID: <WARPSTREAM_VIRTUAL_CLUSTER_ID>
  region: <WARPSTREAM_CLUSTER_REGION>

kafkaService:
  enabled: true
  annotations:
    # Uncomment one of the following annotations depending on your Cloud Provider
    # networking.gke.io/load-balancer-type: "Internal"
    # service.beta.kubernetes.io/azure-load-balancer-internal: "true"
    # service.beta.kubernetes.io/aws-load-balancer-scheme: "internal"
  type: LoadBalancer
  port: 9092

# Override the hostname to be the hostname of the internal TCP Load Balancer.
# In some environments this isn't needed if your Kubernetes pod IPs are routable.
# See your Kubernetes provider network documentation for details.
extraEnv:
  - name: WARPSTREAM_DISCOVERY_KAFKA_HOSTNAME_OVERRIDE
    # Replace this with the hostname of your internal TCP load balancer
    value: nlb-internal.xxx
# three-values.yaml (Agent Group Three)
config:
  agentGroup: three
  bucketURL: <WARPSTREAM_BUCKET_URL>
  apiKey: <WARPSTREAM_AGENT_APIKEY>
  virtualClusterID: <WARPSTREAM_VIRTUAL_CLUSTER_ID>
  region: <WARPSTREAM_CLUSTER_REGION>

kafkaService:
  enabled: true
  annotations:
    # If using AWS EKS uncomment the following annotation
    # service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
  type: LoadBalancer
  port: 9092

# Set a certificate since this load balancer is exposed to the internet
certificate:
  enableTLS: true
  # The Kubernetes TLS secret that contains a certificate and private key,
  # see https://kubernetes.io/docs/concepts/configuration/secret/#tls-secrets
  secretName: warpstream-external-tls
  # If using mTLS uncomment the following
  # mtls:
  #   enabled: true
  #
  #   # The secret key reference for the certificate authority public key
  #   certificateAuthoritySecretKeyRef:
  #     name: "warpstream-external-tls"
  #     key: "ca.crt"

# Override the hostname to be the hostname of the external TCP Load Balancer
extraEnv:
  - name: WARPSTREAM_DISCOVERY_KAFKA_HOSTNAME_OVERRIDE
    # Replace this with the hostname of your external TCP load balancer
    value: nlb-external.xxx
  # If using SASL authentication uncomment the following
  # - name: WARPSTREAM_REQUIRE_SASL_AUTHENTICATION
  #   value: "true"
  #
  # If using mTLS authentication uncomment the following
  # - name: WARPSTREAM_REQUIRE_MTLS_AUTHENTICATION
  #   value: "true"
You can then install all three agent groups by running the following commands:
helm upgrade --install warpstream-agent-one warpstream/warpstream-agent \
  --namespace $YOUR_NAMESPACE \
  --values one-values.yaml

helm upgrade --install warpstream-agent-two warpstream/warpstream-agent \
  --namespace $YOUR_NAMESPACE \
  --values two-values.yaml

helm upgrade --install warpstream-agent-three warpstream/warpstream-agent \
  --namespace $YOUR_NAMESPACE \
  --values three-values.yaml
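Once the groups are installed, the hostnames assigned to the internal and external load balancers (the values to plug into WARPSTREAM_DISCOVERY_KAFKA_HOSTNAME_OVERRIDE above) can be found by listing the Services created by the chart; the EXTERNAL-IP column of the LoadBalancer Services shows the assigned hostname or IP:
kubectl get svc --namespace $YOUR_NAMESPACE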
When using AWS EKS, it is recommended to use the AWS Load Balancer Controller; the older in-tree and out-of-tree cloud providers for EKS are considered legacy by AWS. While it is still possible to use the legacy providers for load balancers, there is little public documentation available and the required annotations may differ.
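As a rough sketch, if the AWS Load Balancer Controller is installed, the internet-facing group three Service could be handed to it with annotations along the following lines; treat these as assumptions and check the controller's documentation for the authoritative annotation set:
kafkaService:
  enabled: true
  annotations:
    # Handled by the AWS Load Balancer Controller rather than the legacy provider
    service.beta.kubernetes.io/aws-load-balancer-type: "external"
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
  type: LoadBalancer
  port: 9092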
Additional Configuration
Client Specific Override
It is sometimes useful to override the hostname on a client level. This is typically needed when using kubectl port-forward. Set the ws_host_override parameter within the client's ID when creating the Kafka client (check Configuring Kafka Client ID Features for more details):
kgo.NewClient(...,
    kgo.ClientID("ws_host_override=127.0.0.1"),
)
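As a fuller illustration, a minimal franz-go client using this override might look like the sketch below; it assumes a local kubectl port-forward is listening on 127.0.0.1:9092:
package main

import (
    "github.com/twmb/franz-go/pkg/kgo"
)

func main() {
    // The seed broker points at the local kubectl port-forward tunnel, and the
    // ws_host_override client ID parameter tells WarpStream to advertise
    // 127.0.0.1 back to this client instead of the Agent's in-cluster address.
    client, err := kgo.NewClient(
        kgo.SeedBrokers("127.0.0.1:9092"),
        kgo.ClientID("ws_host_override=127.0.0.1"),
    )
    if err != nil {
        panic(err)
    }
    defer client.Close()
}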
Our recommendation is to only use the above configuration in debugging situations and not long-term deployments.
Internal Listener Override
WarpStream Agents must be able to communicate directly with each other so that they can efficiently share files and data.
In rare situations it may be necessary to override the internal agent to agent hostname.
This can be done by setting the -advertiseHostnameStrategy flag or the WARPSTREAM_ADVERTISE_HOSTNAME_STRATEGY environment variable to custom. Then, provide the custom hostname by setting either the -advertiseHostnameCustom flag or the WARPSTREAM_ADVERTISE_HOSTNAME_CUSTOM environment variable.
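For example, as environment variables on the Agents (the hostname below is purely illustrative):
WARPSTREAM_ADVERTISE_HOSTNAME_STRATEGY=custom
WARPSTREAM_ADVERTISE_HOSTNAME_CUSTOM=warpstream-agent.internal.example.com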
However, our recommendation is to always allow Agents to communicate directly with each other and not adjust the above-mentioned configurations.
FAQ
Why TCP Load Balancers Can Cause Performance Problems
There are two reasons that introducing a load balancer between Kafka clients and the WarpStream Agent can result in performance problems:
WarpStream has a built-in load balancing mechanism that keeps the WarpStream Agents evenly utilized.
While any Agent can handle writes or reads for any partition, WarpStream will generally try to align writes/reads for the same topic-partition from different clients on the same Agent. This improves data locality which dramatically improves performance in a variety of different dimensions (latency, utilization, compression, etc).
Both of these mechanisms rely on WarpStream controlling (via the Kafka protocol) which clients connect to which Agents. As a result, these mechanisms degrade when a load balancer is inserted between the Kafka clients and the WarpStream Agents.
For well behaved workloads, WarpStream can still work well when running behind a load balancer, but direct connectivity between Kafka clients and the WarpStream Agents is always recommended for the most demanding workloads.
Typical issues when hostname is not overridden correctly
WarpStream Agents advertise their private IPs and ports for ongoing connections after the initial bootstrap. Without the correct configuration, clients might bootstrap successfully yet run into issues when progressing beyond that initial phase.
For example, you may receive the following errors when the hostname override is set incorrectly:
% warpstream cli -bootstrap-host my-kafka.example.com -type diagnose-connection
running diagnose-connection sub-command with bootstrap-host: my-kafka.example.com and bootstrap-port: 9092
Broker Details
---------------
10.212.2.26:9092 (NodeID: 1195648645)
failed to communicate with Agent returned as part of Kafka Metadata response, err: <nil>, this usually means that the provided bootstrap host: my-kafka.example.com:9092 is accessible on the current network, but the URL that the Agent is advertising as its broker host/ip: 10.212.2.26:9092 is not accessible on this network. If this is occurring during local development whilst running the Agent in a docker container, consider adding the following flag to the docker run command: --env "WARPSTREAM_PRIVATE_IP_OVERRIDE=127.0.0.1" which will force the Agent to advertise its hostname/IP address as localhost for development purposes.
% kafka-topics --bootstrap-server my-kafka.example.com --list
[2025-02-03 15:08:19,631] WARN [AdminClient clientId=adminclient-1] Connection to node 1195648645 (10.212.2.26:9092) could not be established. Node may not be available. (org.apache.kafka.clients.NetworkClient)
In these examples, we are trying to connect to my-kafka.example.com. However, the WARPSTREAM_DISCOVERY_KAFKA_HOSTNAME_OVERRIDE environment variable is not set to that hostname on the Agent. We can see that the Kafka clients are trying to connect to 10.212.2.26:9092, which is the private IP of the Agent. Our Kafka clients cannot reach that IP, so they fail with connection errors.
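In a case like this, the fix is to set the hostname override on the Agents to the address that clients actually use, for example:
WARPSTREAM_DISCOVERY_KAFKA_HOSTNAME_OVERRIDE=my-kafka.example.com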