Network Architecture Considerations
In most WarpStream deployments, client applications must connect directly to the WarpStream agents. This requires direct layer 3 network connectivity between the client applications and the agents, with no proxies, load balancers, NAT, etc. in the middle.
In some situations this type of connectivity is not possible or not desired. One example is when the WarpStream agents are deployed in a Kubernetes cluster but the client applications run outside of that cluster.
This guide explains the key concepts and configuration steps for setting up network connectivity in these situations.
For details on how WarpStream's service discovery mechanism works, see Service Discovery.
The architecture below is the standard network architecture, in which every application can communicate directly with every WarpStream agent. This is the recommended architecture for most WarpStream deployments.
This type of network architecture is recommended when applications connect to WarpStream agents from outside the agents' local network, for example over the internet.
In this situation, applications cannot connect directly to the WarpStream agents and must connect through a TCP load balancer.
When exposing WarpStream to external networks, it is highly recommended to configure TLS and authentication. See Protect Data in Motion with TLS Encryption, SASL Authentication, and Mutual TLS (mTLS) for configuration details.
Agent configuration:
WARPSTREAM_DEFAULT_VIRTUAL_CLUSTER_ID=$VIRTUAL_CLUSTER_ID
WARPSTREAM_REQUIRE_SASL_AUTHENTICATION=true
WARPSTREAM_DISCOVERY_KAFKA_HOSTNAME_OVERRIDE=$LOAD_BALANCER_HOSTNAME
Running WarpStream within Kubernetes can be simple and straightforward with our Helm charts.
However, when applications running outside of the Kubernetes cluster need to connect to WarpStream, additional configuration is required.
In this example setup we will have at least three Helm deployments for three different agent groups. See Agent Groups for information about groups.
Agent Group One will handle applications running in the same Kubernetes cluster as the agents.
Agent Group Two will handle applications running in the same VPC as the Kubernetes cluster but not running in the Kubernetes cluster itself.
Agent Group Three will handle applications running outside of the VPC, for example connecting over the internet.
In all three cases the bootstrap server will be printed out in the NOTES section during the helm install.
Below are the recommended Helm values to set for the various groups.
You can then install all three agent groups by running the following commands:
When using AWS EKS, it is recommended to use the AWS Load Balancer Controller; AWS considers the old in-tree or out-of-tree cloud provider for EKS to be legacy. While it is still possible to use the legacy provider for load balancers, there is little public documentation available for it and the required annotations may be different.
It is sometimes useful to override the hostname at the client level. This is typically needed when using kubectl port-forward.
Set the ws_host_override parameter within the client's ID when creating the Kafka client. See Configuring Kafka Client ID Features for more details.
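As a minimal sketch of that step, the snippet below builds a client ID string that carries the override. The `key=value` form and the comma separator between features are assumptions about the format described in Configuring Kafka Client ID Features, and `build_client_id` is a hypothetical helper, not part of any WarpStream or Kafka library.

```python
from typing import Optional, Sequence


def build_client_id(host_override: str,
                    other_features: Optional[Sequence[str]] = None) -> str:
    """Return a Kafka client ID string carrying a ws_host_override feature.

    Assumption: features are written as key=value and joined with commas.
    Check Configuring Kafka Client ID Features for the authoritative format.
    """
    if not host_override:
        raise ValueError("host_override must be a non-empty hostname")
    if "," in host_override or "=" in host_override:
        raise ValueError("host_override must not contain ',' or '='")
    features = list(other_features or [])
    features.append(f"ws_host_override={host_override}")
    return ",".join(features)


# Example: a client that reaches an agent through `kubectl port-forward`
# on the local machine, so every connection should go to localhost.
client_id = build_client_id("localhost")
print(client_id)
```

The resulting string would be passed as the `client.id` setting of whichever Kafka client library is in use.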
We recommend using the above configuration only for debugging, not in long-term deployments.
WarpStream agents must be able to communicate directly with each other so that they can share files and data efficiently.
In rare situations it may be necessary to override the internal agent-to-agent hostname.
This can be done by setting the -advertiseHostnameStrategy flag or the WARPSTREAM_ADVERTISE_HOSTNAME_STRATEGY environment variable to custom. Then, provide the custom hostname by setting either the -advertiseHostnameCustom flag or the WARPSTREAM_ADVERTISE_HOSTNAME_CUSTOM environment variable.
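Because the two settings only make sense together, a small pre-flight check can catch a half-applied override before an agent starts. The sketch below is illustrative only: `check_advertise_settings` is a hypothetical helper, and the second check assumes the custom hostname is only meaningful when the strategy is `custom`, which the text above implies but does not state outright.

```python
import os
from typing import List, Mapping


def check_advertise_settings(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems with the agent-to-agent hostname settings."""
    problems = []
    strategy = env.get("WARPSTREAM_ADVERTISE_HOSTNAME_STRATEGY", "")
    custom = env.get("WARPSTREAM_ADVERTISE_HOSTNAME_CUSTOM", "")
    if strategy == "custom" and not custom:
        problems.append(
            "strategy is 'custom' but WARPSTREAM_ADVERTISE_HOSTNAME_CUSTOM is not set"
        )
    # Assumption: a custom hostname without the 'custom' strategy is a mistake.
    if custom and strategy != "custom":
        problems.append(
            "WARPSTREAM_ADVERTISE_HOSTNAME_CUSTOM is set but the strategy is not 'custom'"
        )
    return problems


# Check the environment the agent would be started with.
for problem in check_advertise_settings(os.environ):
    print("warning:", problem)
```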
However, we recommend always allowing agents to communicate directly with each other and not adjusting the above-mentioned settings.
Unless configured otherwise, WarpStream agents advertise their private IP addresses and ports, and clients use those addresses for all connections after the initial bootstrap. Without the correct configuration, clients may bootstrap successfully and then fail as soon as they try to connect to the advertised addresses.
For example, you may receive the following errors when the hostname override is set incorrectly:
In these examples we are trying to connect to my-kafka.example.com. However, the WARPSTREAM_DISCOVERY_KAFKA_HOSTNAME_OVERRIDE environment variable is not set on the agent to that hostname. We can see that the Kafka clients are trying to connect to 10.212.2.26:9092, which is the private IP of the agent. Our Kafka clients cannot connect to that IP, so they fail with connection errors.
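The same diagnosis can be sketched in code: compare the hostname used for bootstrap with the broker addresses the client was told to use afterwards, and flag any that are private IPs. This is a rough heuristic rather than a WarpStream tool. `suspicious_brokers` is a hypothetical helper, and obtaining the advertised addresses (for example from your Kafka client's metadata or error logs) is left outside the sketch.

```python
import ipaddress
from typing import List


def suspicious_brokers(bootstrap_host: str, advertised: List[str]) -> List[str]:
    """Return advertised host:port addresses that look unreachable.

    An address is flagged when its host differs from the bootstrap host
    and is a private IP literal, which suggests the hostname override
    was not applied on the agent.
    """
    flagged = []
    for address in advertised:
        host, _, _port = address.rpartition(":")
        host = host.strip("[]")  # tolerate bracketed IPv6 literals
        if host == bootstrap_host:
            continue
        try:
            ip = ipaddress.ip_address(host)
        except ValueError:
            continue  # a hostname rather than an IP literal; not flagged
        if ip.is_private:
            flagged.append(address)
    return flagged


# Values taken from the scenario described above.
for address in suspicious_brokers("my-kafka.example.com", ["10.212.2.26:9092"]):
    print(f"{address} is a private IP; check WARPSTREAM_DISCOVERY_KAFKA_HOSTNAME_OVERRIDE")
```

A private IP is not always wrong: clients inside the same network can legitimately use one, so treat a flagged address as a prompt to check the agent configuration rather than as proof of a misconfiguration.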