Networking

Overview

All external traffic enters the cluster through a single custom gateway (edd-gateway) running on s0. The gateway handles TLS termination, protocol detection, request routing, and SSH tunneling. The default K3s networking components (Traefik ingress and servicelb) are disabled.

Network Stack

Physical Network

All nodes share a flat L2 network on 192.168.3.0/24:

| Node | IP | Architecture |
|------|----|--------------|
| s0 | 192.168.3.100 | amd64 |
| s1 | 192.168.3.101 | amd64 |
| s2 | 192.168.3.102 | amd64 |
| s3 | 192.168.3.103 | amd64 |
| rp1 | 192.168.3.201 | arm64 |
| rp2 | 192.168.3.202 | arm64 |
| rp3 | 192.168.3.203 | arm64 |
| rp4 | 192.168.3.204 | arm64 |

Pod Networking — Calico (VXLAN)

Calico provides the CNI with a VXLAN overlay network. Each node gets a /26 pod CIDR from the cluster range 10.42.0.0/16. Cross-node pod traffic is encapsulated in VXLAN tunnels over the physical network.

Key components:

  • calico-node — DaemonSet running Felix (dataplane programming) and BIRD (route distribution) on every node
  • calico-typha — Aggregates Kubernetes API watches and fans out to calico-node pods to reduce API server load

Load Balancing — MetalLB (L2)

MetalLB runs in L2 mode, responding to ARP requests for allocated virtual IPs. When a LoadBalancer service is created, MetalLB assigns a VIP from the configured pool and one node's speaker announces it.

| Pool | Range | Purpose |
|------|-------|---------|
| compute-pool | 192.168.3.150-192.168.3.200 | Gateway VIP + compute namespace services |
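In MetalLB's CRD-based configuration (v0.13+), a pool like this is declared with `IPAddressPool` and `L2Advertisement` resources. A sketch of what `compute-pool` could look like, assuming MetalLB runs in the conventional `metallb-system` namespace:

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: compute-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.3.150-192.168.3.200
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: compute-pool-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - compute-pool
```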

The gateway service sets loadBalancerIP: 192.168.3.200 to receive a stable VIP from the pool for all external traffic on ports 80, 443, and 2222.

Client Source IP Preservation: The gateway LoadBalancer uses externalTrafficPolicy: Local, which directs kube-proxy to preserve the real client IP address instead of SNAT-ing it to the node IP. Without this, all connections would appear to originate from internal node IPs (e.g., 192.168.3.100), breaking session tracking and IP-based security features.
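Putting the two settings together, the gateway Service might look roughly like this (the selector labels and the `metallb.universe.tf/address-pool` annotation are illustrative assumptions, not taken from the actual manifest):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: gateway
  annotations:
    metallb.universe.tf/address-pool: compute-pool
spec:
  type: LoadBalancer
  loadBalancerIP: 192.168.3.200
  externalTrafficPolicy: Local   # preserve the real client IP
  selector:
    app: edd-gateway
  ports:
    - name: http
      port: 80
    - name: https
      port: 443
    - name: ssh
      port: 2222
    # container ports 8000-8999 omitted for brevity
```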

Disabled Components

The following K3s defaults are explicitly disabled in /etc/rancher/k3s/config.yaml:

  • Traefik — Replaced by the custom edd-gateway
  • servicelb (Klipper) — Replaced by MetalLB
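In K3s these components are turned off via the `disable` list. A minimal sketch of the relevant stanza (the exact file contents are an assumption; note that running Calico as the CNI typically also requires disabling the default flannel backend):

```yaml
# /etc/rancher/k3s/config.yaml (illustrative excerpt)
disable:
  - traefik
  - servicelb
flannel-backend: none   # assumed: needed when Calico replaces the default CNI
```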

Domain Structure

eddisonso.com
├── cloud.eddisonso.com # Main dashboard (React SPA)
├── auth.cloud.eddisonso.com # Authentication API
├── storage.cloud.eddisonso.com # Storage API (SFS)
├── compute.cloud.eddisonso.com # Compute API
├── health.cloud.eddisonso.com # Health/Monitoring API + Log streaming
├── notifications.cloud.eddisonso.com # Notification API + WebSocket push
└── docs.cloud.eddisonso.com # Documentation site

cloud-api.eddisonso.com is deprecated. All APIs use *.cloud.eddisonso.com subdomains.

DNS Configuration

| Record | Type | Value |
|--------|------|-------|
| *.eddisonso.com | A | 192.168.3.200 (gateway VIP) |
| *.cloud.eddisonso.com | A | 192.168.3.200 (gateway VIP) |

Both wildcard records resolve to the MetalLB VIP. MetalLB's L2 speaker on s0 responds to ARP for this IP, so the network routes packets directly to s0 where kube-proxy DNATs them to the gateway pod.

TLS Certificates

Managed by cert-manager using Let's Encrypt with Cloudflare DNS-01 challenge (required for wildcard certs):

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: eddisonso-wildcard
spec:
  secretName: eddisonso-wildcard-tls
  issuerRef:
    name: letsencrypt-cloudflare
    kind: ClusterIssuer
  dnsNames:
    - eddisonso.com
    - "*.eddisonso.com"
    - "*.cloud.eddisonso.com"
```

The gateway loads the wildcard certificate from the eddisonso-wildcard-tls Kubernetes secret and terminates TLS for all static routes. For user container HTTPS traffic, the gateway supports TLS passthrough.

Gateway Routing

The gateway determines the backend target based on the Host header and request path. Routes are stored in PostgreSQL (gateway_db) and cached in memory with a 100-entry LRU cache.

Static Routes

| Host | Path | Backend Service |
|------|------|-----------------|
| cloud.eddisonso.com | / | simple-file-share-frontend:80 |
| auth.cloud.eddisonso.com | / | auth-service:80 |
| storage.cloud.eddisonso.com | / | simple-file-share-backend:80 |
| compute.cloud.eddisonso.com | / | edd-compute:80 |
| health.cloud.eddisonso.com | / | cluster-monitor:80 |
| health.cloud.eddisonso.com | /logs | log-service:80 |
| notifications.cloud.eddisonso.com | / | notification-service:80 |
| docs.cloud.eddisonso.com | / | edd-cloud-docs:80 |

Route configuration is managed in the gateway-routes ConfigMap (edd-gateway/manifests/gateway-routes.yaml).

Route Priority

Routes are matched by priority (highest first):

  1. Exact path matches (e.g., /sse/health)
  2. Prefix matches (e.g., /compute)
  3. Root path (/)

Container Routing

When a user creates a compute container with ingress rules, the gateway dynamically routes traffic to the container pod. The gateway subscribes to NATS events for container start/stop to update its routing table in real time.

Container traffic on ports 8000-8999 is forwarded directly to user container pods based on the configured ingress rules.

Protocol Detection

The gateway inspects the first bytes of each connection to determine the protocol:

| Protocol | Detection | Action |
|----------|-----------|--------|
| TLS | First byte 0x16 (ClientHello) | TLS termination or passthrough |
| HTTP | Method prefix (GET, POST, etc.) | Route based on Host header |
| SSH | SSH- prefix | Proxy to container SSH |

Supported: HTTP/1.1, HTTPS, WebSocket, SSH.
Not supported: HTTP/2, gRPC passthrough.

SSH Tunneling

The gateway exposes port 2222 for SSH access to user containers:

  1. Client connects to cloud.eddisonso.com:2222
  2. Gateway authenticates using the user's uploaded SSH public keys (stored in compute_db)
  3. Gateway resolves the target container pod IP
  4. Connection is proxied to the container's SSH daemon

Connection Pooling (Browser)

Services are split across subdomains to avoid browser connection limits (6 per domain in HTTP/1.1):

| Domain | Connection Usage |
|--------|------------------|
| cloud.eddisonso.com | Dashboard, auth redirects |
| storage.cloud.eddisonso.com | File uploads/downloads, SSE progress |
| compute.cloud.eddisonso.com | Container CRUD, WebSocket status |
| health.cloud.eddisonso.com | Metrics SSE, log streaming SSE |
| notifications.cloud.eddisonso.com | Notification API, WebSocket push |

Internal Network

Service Discovery

Services communicate internally via Kubernetes DNS:

<service>.<namespace>.svc.cluster.local

| Service | Address | Protocol |
|---------|---------|----------|
| PostgreSQL (via HAProxy) | haproxy.core.svc.cluster.local:5432 | PostgreSQL |
| PostgreSQL (direct) | postgres.core.svc.cluster.local:5432 | PostgreSQL |
| GFS Master | gfs-master.core.svc.cluster.local:9000 | gRPC |
| NATS | nats.core.svc.cluster.local:4222 | NATS |
| Log Service (gRPC) | log-service.core.svc.cluster.local:50051 | gRPC |

GFS Chunkserver Network

GFS chunkservers run with hostNetwork: true on s1, s2, and s3, binding directly to the node IP. This avoids the VXLAN overhead for large data transfers:

| Chunkserver | Address | Ports |
|-------------|---------|-------|
| s1 | 192.168.3.101 | 9080 (client), 9081 (replication) |
| s2 | 192.168.3.102 | 9080 (client), 9081 (replication) |
| s3 | 192.168.3.103 | 9080 (client), 9081 (replication) |

Internal Service Ports

| Service | Type | Ports | Protocol |
|---------|------|-------|----------|
| gateway | LoadBalancer | 80, 443, 2222, 8000-8999 | HTTP/HTTPS/SSH/TCP |
| auth-service | ClusterIP | 80 | HTTP |
| simple-file-share-backend | ClusterIP | 80 | HTTP |
| simple-file-share-frontend | ClusterIP | 80 | HTTP |
| edd-compute | ClusterIP | 80 | HTTP |
| cluster-monitor | ClusterIP | 80 | HTTP |
| log-service | ClusterIP | 50051, 80 | gRPC, HTTP |
| notification-service | ClusterIP | 80 | HTTP, WebSocket |
| edd-cloud-docs | ClusterIP | 80 | HTTP |
| gfs-master | ClusterIP | 9000 | gRPC |
| gfs-chunkserver-N | hostNetwork | 9080, 9081 | TCP (client), TCP (replication) |
| postgres | ClusterIP | 5432 | PostgreSQL |
| haproxy | ClusterIP | 5432 | PostgreSQL |
| nats | ClusterIP | 4222, 8222 | NATS, HTTP monitoring |

CORS Configuration

Each backend service implements CORS middleware. The origin is reflected from the request:

```go
func corsMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Reflect the caller's Origin so credentialed cross-subdomain
		// requests succeed (a wildcard origin is not permitted together
		// with Access-Control-Allow-Credentials: true).
		origin := r.Header.Get("Origin")
		if origin != "" {
			w.Header().Set("Access-Control-Allow-Origin", origin)
			w.Header().Set("Access-Control-Allow-Credentials", "true")
			w.Header().Set("Access-Control-Allow-Methods", "GET, POST, PUT, DELETE, OPTIONS")
			w.Header().Set("Access-Control-Allow-Headers", "Content-Type, Authorization")
		}
		// Short-circuit CORS preflight requests.
		if r.Method == http.MethodOptions {
			w.WriteHeader(http.StatusOK)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```

Traffic Flow Summary