---
title: Service Discovery
---
# Service Discovery
SMG can automatically discover workers in Kubernetes by watching pods that match a label selector. Workers are registered and removed as pods scale up or down, so no manual URL management is needed.
#### Before you begin
- Completed the [Getting Started](index.md) guide
- A Kubernetes cluster with worker pods deployed
- `kubectl` configured for your cluster
---
## Basic Setup
Enable service discovery with a label selector that matches your worker pods:
```bash
smg \
  --service-discovery \
  --selector app=sglang-worker \
  --service-discovery-namespace inference \
  --service-discovery-port 8000
```
SMG watches for pods matching the selector and automatically adds or removes workers.
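The add/remove behavior can be pictured with a small Python sketch. This is a toy model, not SMG's implementation: `WorkerRegistry` and `handle_event` are hypothetical names, and dropping a pod that is no longer ready is an assumption made for illustration. The event types (`ADDED`, `MODIFIED`, `DELETED`) are the standard Kubernetes watch event types.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class WorkerRegistry:
    """Toy registry keyed by pod name (hypothetical, for illustration only)."""

    port: int
    workers: Dict[str, str] = field(default_factory=dict)

    def handle_event(
        self, event_type: str, pod_name: str, pod_ip: Optional[str], ready: bool
    ) -> None:
        if event_type in ("ADDED", "MODIFIED") and ready and pod_ip:
            # A ready pod with an IP is registered (or refreshed) as host:port.
            self.workers[pod_name] = f"{pod_ip}:{self.port}"
        else:
            # DELETED events, and pods that are not ready (assumption),
            # remove the worker if it was registered.
            self.workers.pop(pod_name, None)


registry = WorkerRegistry(port=8000)
registry.handle_event("ADDED", "sglang-worker-0", "10.0.0.5", ready=True)
registry.handle_event("ADDED", "sglang-worker-1", "10.0.0.6", ready=True)
registry.handle_event("DELETED", "sglang-worker-0", "10.0.0.5", ready=True)
print(registry.workers)  # {'sglang-worker-1': '10.0.0.6:8000'}
```

The port comes from `--service-discovery-port`, which is why it is fixed per registry rather than read from each pod.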
### Parameters
| Parameter | Default | Description |
|-----------|---------|-------------|
| `--service-discovery` | `false` | Enable Kubernetes service discovery |
| `--selector` | — | Label selector for worker pods (required) |
| `--service-discovery-namespace` | (all namespaces) | Kubernetes namespace to watch |
| `--service-discovery-port` | `80` | Port to use for worker connections |
Connection mode (HTTP or gRPC) is probed automatically during worker registration, so no protocol flag is required. The first protocol that responds successfully is used; when both succeed, HTTP takes priority.
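The selection rule can be sketched in Python as follows. This models only the outcome described above (HTTP preferred when both protocols respond); the function name `detect_connection_mode` and the injected probe callables are hypothetical, and SMG's actual probing may run concurrently or differ in other details.

```python
from typing import Callable, Dict, Optional

# Probe order encodes the stated priority: HTTP wins when both succeed.
PROBE_ORDER = ("http", "grpc")


def detect_connection_mode(
    address: str, probes: Dict[str, Callable[[str], bool]]
) -> Optional[str]:
    """Return the first protocol in PROBE_ORDER whose probe succeeds, else None."""
    for protocol in PROBE_ORDER:
        try:
            if probes[protocol](address):
                return protocol
        except OSError:
            # A connection error counts as a failed probe; try the next protocol.
            continue
    return None


def refuses(address: str) -> bool:
    """Stand-in probe that simulates a refused connection."""
    raise ConnectionRefusedError(address)


both = {"http": lambda a: True, "grpc": lambda a: True}
grpc_only = {"http": refuses, "grpc": lambda a: True}

print(detect_connection_mode("10.0.0.5:8000", both))       # http
print(detect_connection_mode("10.0.0.5:8000", grpc_only))  # grpc
```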
---
## Label Selectors
### Single Label
```bash
smg --service-discovery --selector app=vllm
```
### Multiple Labels
Pass multiple `key=value` pairs separated by spaces:
```bash
smg --service-discovery --selector app=sglang environment=production
```
This matches only pods that carry every listed label.
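The "every listed label" rule is a logical AND over the `key=value` pairs. A minimal Python sketch of that matching logic, with hypothetical helper names `parse_selector` and `matches` (this is not SMG's parser):

```python
from typing import Dict, List


def parse_selector(pairs: List[str]) -> Dict[str, str]:
    """Turn ["app=sglang", "environment=production"] into a dict."""
    selector: Dict[str, str] = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        selector[key] = value
    return selector


def matches(selector: Dict[str, str], pod_labels: Dict[str, str]) -> bool:
    """A pod matches only if it has every selector label with the same value."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


selector = parse_selector(["app=sglang", "environment=production"])

# Extra labels on the pod do not prevent a match.
print(matches(selector, {"app": "sglang", "environment": "production", "tier": "gpu"}))  # True
# A pod with the right app but the wrong environment does not match.
print(matches(selector, {"app": "sglang", "environment": "staging"}))  # False
```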
---
## PD Disaggregation Discovery
For prefill-decode deployments, use separate selectors:
```bash
smg \
  --service-discovery \
  --pd-disaggregation \
  --prefill-selector app=sglang role=prefill \
  --decode-selector app=sglang role=decode \
  --service-discovery-namespace inference
```
Label your pods accordingly:
```yaml
# Prefill worker pod
metadata:
  labels:
    app: sglang
    role: prefill
---
# Decode worker pod
metadata:
  labels:
    app: sglang
    role: decode
```
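To see how the two selectors partition pods, here is a small Python sketch that assigns a role from a pod's labels. The function `classify_pod` is hypothetical, and checking the prefill selector first when a pod matches both is an arbitrary choice in this sketch; the source does not say how SMG handles that case, so keep the two selectors mutually exclusive.

```python
from typing import Dict, Optional

PREFILL_SELECTOR = {"app": "sglang", "role": "prefill"}
DECODE_SELECTOR = {"app": "sglang", "role": "decode"}


def matches(selector: Dict[str, str], pod_labels: Dict[str, str]) -> bool:
    """True when the pod carries every selector label with the same value."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


def classify_pod(pod_labels: Dict[str, str]) -> Optional[str]:
    """Return "prefill", "decode", or None if neither selector matches."""
    if matches(PREFILL_SELECTOR, pod_labels):
        return "prefill"
    if matches(DECODE_SELECTOR, pod_labels):
        return "decode"
    return None


print(classify_pod({"app": "sglang", "role": "prefill"}))  # prefill
print(classify_pod({"app": "sglang", "role": "decode"}))   # decode
print(classify_pod({"app": "sglang"}))                     # None
```

A pod with `app: sglang` but no `role` label matches neither selector, which is a common reason for a worker to be missing from both pools.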
---
## RBAC
SMG needs permissions to watch pods. Apply these resources to your cluster:
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: smg
  namespace: inference
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: smg-discovery
  namespace: inference
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: smg-discovery
  namespace: inference
subjects:
  - kind: ServiceAccount
    name: smg
    namespace: inference
roleRef:
  kind: Role
  name: smg-discovery
  apiGroup: rbac.authorization.k8s.io
```
For cross-namespace discovery, use a `ClusterRole` and `ClusterRoleBinding` instead.
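As a rough illustration of what the cluster-scoped variant looks like, the Python sketch below builds a `ClusterRole` and `ClusterRoleBinding` with the same pod permissions and prints them as JSON, which `kubectl apply -f -` accepts. The helper name `cluster_rbac` is hypothetical, and reusing the name `smg-discovery` and the `smg` service account in `inference` is an assumption carried over from the namespaced example.

```python
import json
from typing import Any, Dict


def cluster_rbac(
    service_account: str, namespace: str, name: str = "smg-discovery"
) -> Dict[str, Any]:
    """Build a v1 List holding a ClusterRole and its ClusterRoleBinding."""
    cluster_role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        # Cluster-scoped resources have no namespace in their metadata.
        "metadata": {"name": name},
        "rules": [
            {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "watch"]}
        ],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": name},
        # The ServiceAccount subject is still namespaced.
        "subjects": [
            {"kind": "ServiceAccount", "name": service_account, "namespace": namespace}
        ],
        "roleRef": {
            "kind": "ClusterRole",
            "name": name,
            "apiGroup": "rbac.authorization.k8s.io",
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [cluster_role, binding]}


manifest = cluster_rbac("smg", "inference")
print(json.dumps(manifest, indent=2))
```

If this were saved as a script (say, a hypothetical `rbac.py`), its output could be piped to `kubectl apply -f -`. The `ServiceAccount` from the namespaced example is still required.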
---
## Deployment Example
### SMG Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: smg
  namespace: inference
spec:
  replicas: 1
  selector:
    matchLabels:
      app: smg
  template:
    metadata:
      labels:
        app: smg
    spec:
      serviceAccountName: smg
      containers:
        - name: smg
          image: ghcr.io/lightseekorg/smg:latest
          args:
            - --service-discovery
            - --selector=app=sglang-worker
            - --service-discovery-namespace=inference
            - --service-discovery-port=8000
            - --policy=cache_aware
          ports:
            - containerPort: 8000
              name: http
```
### Worker StatefulSet
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: sglang-worker
  namespace: inference
spec:
  serviceName: sglang-worker
  replicas: 3
  selector:
    matchLabels:
      app: sglang-worker
  template:
    metadata:
      labels:
        app: sglang-worker
    spec:
      containers:
        - name: sglang
          image: lmsysorg/sglang:latest
          args:
            - --model-path=meta-llama/Llama-3.1-8B-Instruct
            - --port=8000
          ports:
            - containerPort: 8000
```
---
## Verify
```bash
# Check discovered workers
curl http://localhost:30000/workers | jq
# Check pod labels match selector
kubectl get pods -n inference -l app=sglang-worker
# Verify RBAC permissions
kubectl auth can-i watch pods -n inference --as=system:serviceaccount:inference:smg
```
---
## Troubleshooting
| Symptom | Cause | Solution |
|---------|-------|----------|
| No workers discovered | Wrong selector | Verify labels match: `kubectl get pods -l <selector>` |
| RBAC error | Missing permissions | Apply Role and RoleBinding above |
| Workers not ready | Health check failing | Check worker health endpoint |
| Stale workers | Watch disconnected | Check Kubernetes API connectivity |
---
## Next Steps
- [Service Discovery Concepts](../concepts/architecture/service-discovery.md) — Worker lifecycle, monitoring metrics, cross-namespace discovery
- [Load Balancing](load-balancing.md) — Choose a routing policy for discovered workers
- [PD Disaggregation](pd-disaggregation.md) — Full PD setup with vLLM and SGLang