Network Architecture Considerations

This page describes several approaches for deploying WarpStream with more advanced network setups.

Normal Network Architecture Setup

In most WarpStream deployments, client applications must connect directly to the WarpStream agents. This requires direct layer 3 network connectivity between the client applications and agents with no proxies, load balancers, NATing, etc. in the middle.

The architecture below is a normal network architecture in which all of the applications can communicate directly with all of the WarpStream Agents.

This is the recommended architecture for most WarpStream deployments because it is the easiest, most effective, and most performant way to run a WarpStream cluster.
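
To confirm that clients have the required direct connectivity, you can run the WarpStream CLI's diagnose-connection sub-command (shown again in the FAQ below) from the same network as your client applications. In this sketch, $AGENT_HOSTNAME is a placeholder for the address your Agents are reachable at:

warpstream cli -bootstrap-host $AGENT_HOSTNAME -type diagnose-connection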

Approaches for Connectivity Between Unconnected Networks

In some situations, the direct connectivity described in the previous section is not possible or desired. For example, the WarpStream Agents might be deployed in a Kubernetes cluster while the client applications are deployed outside of the Kubernetes cluster, in a completely different VPC.

There are two different approaches for enabling connectivity between Kafka clients and the WarpStream Agents when the clients and Agents are running in different networks with no direct connectivity between them:

  • Agent Groups

  • A TCP load balancer

In general, Agent Groups are the preferred solution. They're easier to set up, (generally) more cost-effective, and they don't suffer from any of the performance penalties that are associated with using a TCP load balancer.

Agent Groups

Agent Groups are the recommended approach for solving a lack of direct connectivity between Kafka clients and WarpStream Agents. The only scenario where we don't recommend this approach is when it would require a very large number of Agent Groups.

WarpStream's diskless architecture means that any Agent can write or read data for any topic-partition. As a result, WarpStream clusters can be split into distinct "groups" that are completely isolated from each other at the networking / service discovery layer.

This feature is called Agent Groups and is very useful for enabling a single WarpStream cluster to be flexed across multiple disparate networks with no inter-connectivity without incurring the cost and performance penalties of using a TCP load balancer.

For more details, read the Agent Groups documentation.

TCP Load Balancer

If the Agent Group approach is not viable for some reason, you'll have to set up a TCP load balancer instead.

When exposing WarpStream to external networks (i.e., over the internet), it is highly recommended to configure TLS and authentication. See Protect Data in Motion with TLS Encryption, SASL Authentication, and Mutual TLS (mTLS) for configuration details.

Agent configuration:

  • WARPSTREAM_DEFAULT_VIRTUAL_CLUSTER_ID=$VIRTUAL_CLUSTER_ID

  • WARPSTREAM_REQUIRE_SASL_AUTHENTICATION=true

  • WARPSTREAM_DISCOVERY_KAFKA_HOSTNAME_OVERRIDE=$LOAD_BALANCER_HOSTNAME

In some cases, the load balancer may be listening on a port that's different from the port the Agents are listening on (default 9092 for TCP/Kafka protocol traffic). In that scenario, you'll need to add one additional environment variable to the Agent configuration:

WARPSTREAM_DISCOVERY_KAFKA_PORT_OVERRIDE=$EXTERNAL_NLB_PORT

This instructs the Agents to advertise the load balancer's port within the Kafka protocol instead of the port that the Agents are listening on.

Note that this change will make the Agents advertise a different port within the Kafka protocol, but they'll continue listening on the same port (default 9092) so traffic between the load balancer and the Agents will not be impacted by this change. It's just required due to a quirk of how service discovery within the Kafka protocol works.
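
Putting the pieces together, a minimal sketch of the Agent environment for this setup could look like the following. How the variables are supplied depends on how you run the Agents, and $VIRTUAL_CLUSTER_ID, $LOAD_BALANCER_HOSTNAME, and $EXTERNAL_NLB_PORT are placeholders:

# Agent environment for running behind a TCP load balancer (values are placeholders)
export WARPSTREAM_DEFAULT_VIRTUAL_CLUSTER_ID=$VIRTUAL_CLUSTER_ID
export WARPSTREAM_REQUIRE_SASL_AUTHENTICATION=true
# Advertise the load balancer's hostname to Kafka clients instead of the Agent's own address
export WARPSTREAM_DISCOVERY_KAFKA_HOSTNAME_OVERRIDE=$LOAD_BALANCER_HOSTNAME
# Only needed when the load balancer listens on a different port than the Agents (default 9092)
export WARPSTREAM_DISCOVERY_KAFKA_PORT_OVERRIDE=$EXTERNAL_NLB_PORT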

Kubernetes

Running WarpStream within Kubernetes can be simple and straightforward with our Helm charts.

However, when applications running outside of the Kubernetes cluster / VPC need to connect to WarpStream, additional configuration may be required.

In this example setup, we will have at least three Helm deployments for three different Agent Groups. See Agent Groups for information about groups.

Agent Group One will handle applications running in the same Kubernetes cluster as the agents via direct connectivity within Kubernetes.

Agent Group Two will handle applications running in the same VPC as the Kubernetes cluster but not running in the Kubernetes cluster itself.

In some setups this group isn't needed because pod IPs are routable on the VPC; consult your cloud provider's Kubernetes documentation for details about routable pod IPs.

Agent Group Three will handle applications running outside of the VPC, for example connecting over the internet.

In all three cases, the bootstrap server will be printed in the NOTES section during the helm install.
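
For groups two and three, you will also need the hostname that your cloud provider assigns to each provisioned load balancer in order to set the hostname override in the values files below. One way to look it up (assuming the namespace used in the install commands later on) is:

kubectl get svc --namespace $YOUR_NAMESPACE
# The LoadBalancer Service created for each group's kafkaService shows the assigned
# hostname or IP in its EXTERNAL-IP column.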

Below are the recommended Helm values to set for the various groups.

one-values.yaml
config:
    agentGroup: one
    bucketURL: <WARPSTREAM_BUCKET_URL>
    apiKey: <WARPSTREAM_AGENT_APIKEY>
    virtualClusterID: <WARPSTREAM_VIRTUAL_CLUSTER_ID>
    region: <WARPSTREAM_CLUSTER_REGION>
two-values.yaml
config:
    agentGroup: two
    bucketURL: <WARPSTREAM_BUCKET_URL>
    apiKey: <WARPSTREAM_AGENT_APIKEY>
    virtualClusterID: <WARPSTREAM_VIRTUAL_CLUSTER_ID>
    region: <WARPSTREAM_CLUSTER_REGION>
kafkaService:
    enabled: true
    annotations:
        # Uncomment one of the following annotations depending on your Cloud Provider
        # networking.gke.io/load-balancer-type: "Internal"
        # service.beta.kubernetes.io/azure-load-balancer-internal: "true"
        # service.beta.kubernetes.io/aws-load-balancer-scheme: "internal"
    type: LoadBalancer
    port: 9092
# Override the hostname to be the hostname of the internal TCP Load Balancer
# In some environments this isn't needed if your Kubernetes pod IPs are routable.
# See your Kubernetes provider network documentation for details.
extraEnv:
    - name: WARPSTREAM_DISCOVERY_KAFKA_HOSTNAME_OVERRIDE
      # Replace this with the hostname of your internal TCP load balancer
      value: nlb-internal.xxx
three-values.yaml
config:
    agentGroup: three
    bucketURL: <WARPSTREAM_BUCKET_URL>
    apiKey: <WARPSTREAM_AGENT_APIKEY>
    virtualClusterID: <WARPSTREAM_VIRTUAL_CLUSTER_ID>
    region: <WARPSTREAM_CLUSTER_REGION>
kafkaService:
    enabled: true
    annotations:
        # If using AWS EKS uncomment the following annotation
        # service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
    type: LoadBalancer
    port: 9092
# Set a certificate since this load balancer is exposed to the internet
certificate:
    enableTLS: true
    # The Kubernetes TLS secret that contains a certificate and private key
    # see https://kubernetes.io/docs/concepts/configuration/secret/#tls-secrets
    secretName: warpstream-external-tls
    
    # If using mtls uncomment the following
    # mtls:
    #     enabled: true
    #
    #     # The secret key reference for the certificate authority public key
    #     certificateAuthoritySecretKeyRef:
    #       name: "warpstream-external-tls"
    #       key: "ca.crt"
# Override the hostname to be the hostname of the external TCP Load Balancer
extraEnv:
    - name: WARPSTREAM_DISCOVERY_KAFKA_HOSTNAME_OVERRIDE
      # Replace this with the hostname of your external TCP load balancer
      value: nlb-external.xxx
    # If using SASL authentication uncomment the following
    # - name: WARPSTREAM_REQUIRE_SASL_AUTHENTICATION
    #   value: "true"
    #
    # If using mTLS authentication uncomment the following
    # - name: WARPSTREAM_REQUIRE_MTLS_AUTHENTICATION
    #   value: "true"

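Since three-values.yaml references a Kubernetes TLS secret named warpstream-external-tls, that secret must exist in the same namespace before the third group is installed. A minimal sketch, assuming you already have a certificate and private key on disk:

kubectl create secret tls warpstream-external-tls \
    --cert=path/to/tls.crt \
    --key=path/to/tls.key \
    --namespace $YOUR_NAMESPACE
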
You can then install all three agent groups by running the following commands:

helm upgrade --install warpstream-agent-one warpstream/warpstream-agent \
    --namespace $YOUR_NAMESPACE \
    --values one-values.yaml

helm upgrade --install warpstream-agent-two warpstream/warpstream-agent \
    --namespace $YOUR_NAMESPACE \
    --values two-values.yaml

helm upgrade --install warpstream-agent-three warpstream/warpstream-agent \
    --namespace $YOUR_NAMESPACE \
    --values three-values.yaml

When using AWS EKS, it is recommended to use the AWS Load Balancer Controller; the old in-tree and out-of-tree cloud providers for EKS are considered legacy by AWS. While it is still possible to use the legacy provider for load balancers, there is little public documentation available and the required annotations may be different.

Additional Configuration

Client Specific Override

It is sometimes useful to override the hostname at the client level. This is typically needed when using kubectl port-forward.
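
For example, you might forward a local port to the Agents' Kafka Service (or to a single Agent pod) while debugging. In this sketch, $AGENT_KAFKA_SERVICE is a placeholder for the name of that Service:

kubectl port-forward --namespace $YOUR_NAMESPACE svc/$AGENT_KAFKA_SERVICE 9092:9092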

Set the ws_host_override parameter within the client's ID when creating the Kafka client (check Configuring Kafka Client ID Features for more details):

kgo.NewClient(..., 
    kgo.ClientID("ws_host_override=127.0.0.1"),
)

Our recommendation is to only use the above configuration in debugging situations and not long-term deployments.

Internal Listener Override

WarpStream Agents must be able to communicate directly with each other so that they can efficiently share files and data.

In rare situations, it may be necessary to override the internal Agent-to-Agent hostname.

This can be done by setting the -advertiseHostnameStrategy flag or the WARPSTREAM_ADVERTISE_HOSTNAME_STRATEGY environment variable to custom. Then, provide the custom hostname by setting either the -advertiseHostnameCustom flag or the WARPSTREAM_ADVERTISE_HOSTNAME_CUSTOM environment variable.
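
For example, using the environment variable form, the override could be supplied as follows. The hostname below is a placeholder; use an address that the other Agents can actually reach:

# Make this Agent advertise a custom hostname to the other Agents
export WARPSTREAM_ADVERTISE_HOSTNAME_STRATEGY=custom
# Placeholder hostname; replace with the address other Agents should use to reach this Agent
export WARPSTREAM_ADVERTISE_HOSTNAME_CUSTOM=agent-0.warpstream.internal.example.com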

However, our recommendation is to always allow agents to directly communicate with each other and not adjust the above mentioned configurations.

FAQ

Why TCP Load Balancers Can Cause Performance Problems

There are two reasons that introducing a load balancer between Kafka clients and the WarpStream Agent can result in performance problems:

  1. WarpStream has a built-in load balancing mechanism that keeps the WarpStream Agents evenly utilized.

  2. While any Agent can handle writes or reads for any partition, WarpStream generally tries to align writes/reads for the same topic-partition from different clients onto the same Agent. The resulting data locality dramatically improves performance across a variety of dimensions (latency, utilization, compression, etc.).

Both of these mechanisms rely on WarpStream controlling (via the Kafka protocol) which clients connect to which Agents. As a result, these mechanisms degrade when a load balancer is inserted between the Kafka clients and the WarpStream Agents.

For well-behaved workloads, WarpStream can still work well when running behind a load balancer, but direct connectivity between Kafka clients and the WarpStream Agents is always recommended for the most demanding workloads.

Whenever possible, try to solve connectivity problems with Agent Groups instead.

Typical Issues When the Hostname Is Not Overridden Correctly

WarpStream Agents advertise their private IPs and ports for ongoing connections after the initial bootstrap. Without the correct configuration, clients may connect to the bootstrap address successfully yet fail once they try to progress beyond the initial phase.

For example, you may receive errors like the following when the hostname override is set incorrectly:

% warpstream cli -bootstrap-host my-kafka.example.com -type diagnose-connection
running diagnose-connection sub-command with bootstrap-host: my-kafka.example.com and bootstrap-port: 9092


Broker Details
---------------
  10.212.2.26:9092 (NodeID: 1195648645)
failed to communicate with Agent returned as part of Kafka Metadata response, err: <nil>, this usually means that the provided bootstrap host: my-kafka.example.com:9092 is accessible on the current network, but the URL that the Agent is advertising as its broker host/ip: 10.212.2.26:9092 is not accessible on this network. If this is occurring during local development whilst running the Agent in a docker container, consider adding the following flag to the docker run command: --env "WARPSTREAM_PRIVATE_IP_OVERRIDE=127.0.0.1" which will force the Agent to advertise its hostname/IP address as localhost for development purposes.
% kafka-topics --bootstrap-server my-kafka.example.com --list
[2025-02-03 15:08:19,631] WARN [AdminClient clientId=adminclient-1] Connection to node 1195648645 (10.212.2.26:9092) could not be established. Node may not be available. (org.apache.kafka.clients.NetworkClient)

In these examples, we are trying to connect to my-kafka.example.com, but the WARPSTREAM_DISCOVERY_KAFKA_HOSTNAME_OVERRIDE environment variable on the Agent is not set to that hostname. As a result, the Kafka clients are told to connect to 10.212.2.26:9092, which is the private IP of the Agent. Since our Kafka clients cannot reach that IP, they fail with connection errors.
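
In this scenario, the fix is to make the Agents advertise the externally reachable hostname rather than their private IPs, and then re-run the connectivity check, for example:

# On the Agents (for example via extraEnv in the Helm values shown earlier)
WARPSTREAM_DISCOVERY_KAFKA_HOSTNAME_OVERRIDE=my-kafka.example.com

# Then verify again from the client network
warpstream cli -bootstrap-host my-kafka.example.com -type diagnose-connection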
