StackGres provides comprehensive pod scheduling options to control where cluster pods run. This enables optimizing for performance, availability, compliance, and resource utilization.
Pod scheduling in StackGres is configured through `spec.pods.scheduling`:

```yaml
apiVersion: stackgres.io/v1
kind: SGCluster
metadata:
  name: my-cluster
spec:
  pods:
    scheduling:
      nodeSelector:
        node-type: database
      tolerations:
      - key: "dedicated"
        operator: "Equal"
        value: "postgresql"
        effect: "NoSchedule"
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
              - us-east-1a
              - us-east-1b
```
Note: changing the scheduling configuration may require a restart of the cluster pods to take effect.
A `nodeSelector` is the simplest way to constrain pods to specific nodes using labels:

```yaml
spec:
  pods:
    scheduling:
      nodeSelector:
        node-type: database
        disk-type: ssd
```
Dedicated database nodes:

```yaml
nodeSelector:
  workload: postgresql
```

Specific hardware:

```yaml
nodeSelector:
  cpu-type: amd-epyc
  memory-size: high
```

Region/zone placement:

```yaml
nodeSelector:
  topology.kubernetes.io/zone: us-east-1a
```
Label nodes to match your selectors:
```shell
# Add labels
kubectl label node node-1 node-type=database
kubectl label node node-2 node-type=database

# Verify
kubectl get nodes -l node-type=database
```
Tolerations allow pods to be scheduled on nodes with matching taints:
```yaml
spec:
  pods:
    scheduling:
      tolerations:
      - key: "dedicated"
        operator: "Equal"
        value: "postgresql"
        effect: "NoSchedule"
```
| Field | Description |
|---|---|
| `key` | Taint key to match |
| `operator` | `Equal` or `Exists` |
| `value` | Taint value (for the `Equal` operator) |
| `effect` | `NoSchedule`, `PreferNoSchedule`, or `NoExecute` |
| `tolerationSeconds` | How long to tolerate `NoExecute` taints before eviction |
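The `tolerationSeconds` field only applies to `NoExecute` taints. As a sketch, the following bounds how long pods remain on a node after the node controller applies its built-in not-ready taint:

```yaml
tolerations:
- key: "node.kubernetes.io/not-ready"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 300  # evict pods 5 minutes after the taint appears
```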
Tolerate dedicated database nodes:

```yaml
tolerations:
- key: "dedicated"
  operator: "Equal"
  value: "postgresql"
  effect: "NoSchedule"
```

Tolerate any taint with a given key, regardless of its value:

```yaml
tolerations:
- key: "database-only"
  operator: "Exists"
  effect: "NoSchedule"
```

Tolerate node memory pressure:

```yaml
tolerations:
- key: "node.kubernetes.io/memory-pressure"
  operator: "Exists"
  effect: "NoSchedule"
```
Set up taints on dedicated nodes:
```shell
# Add taint
kubectl taint nodes node-1 dedicated=postgresql:NoSchedule
kubectl taint nodes node-2 dedicated=postgresql:NoSchedule

# Remove taint
kubectl taint nodes node-1 dedicated=postgresql:NoSchedule-
```
Node affinity provides more expressive node selection rules:
Pods must be scheduled on matching nodes:
```yaml
spec:
  pods:
    scheduling:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: node-type
              operator: In
              values:
              - database
              - database-high-memory
```
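Note that multiple entries under `nodeSelectorTerms` are ORed, while the `matchExpressions` inside a single term are ANDed. As an illustration (the label keys here are hypothetical):

```yaml
nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
    # Either a dedicated database node...
    - matchExpressions:
      - key: node-type
        operator: In
        values:
        - database
    # ...or any node with NVMe storage
    - matchExpressions:
      - key: disk-type
        operator: In
        values:
        - nvme
```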
Pods prefer matching nodes but can run elsewhere:
```yaml
spec:
  pods:
    scheduling:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          preference:
            matchExpressions:
            - key: disk-type
              operator: In
              values:
              - nvme
        - weight: 50
          preference:
            matchExpressions:
            - key: disk-type
              operator: In
              values:
              - ssd
```
| Operator | Description |
|---|---|
| `In` | Value in list |
| `NotIn` | Value not in list |
| `Exists` | Key exists |
| `DoesNotExist` | Key doesn't exist |
| `Gt` | Greater than (numeric) |
| `Lt` | Less than (numeric) |
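The `Gt` and `Lt` operators parse the label value as an integer, so they only work on labels whose values are plain numbers. A sketch requiring nodes with more than 8 CPUs, assuming a custom `cpu-count` label that you maintain yourself:

```yaml
nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
    - matchExpressions:
      - key: cpu-count    # hypothetical custom label, e.g. cpu-count=16
        operator: Gt
        values:
        - "8"             # values are strings, compared as integers
```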
Allow scheduling only in specific availability zones:

```yaml
nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
    - matchExpressions:
      - key: topology.kubernetes.io/zone
        operator: In
        values:
        - us-east-1a
        - us-east-1b
        - us-east-1c
```
Control co-location with other pods:
Schedule near specific pods:
```yaml
spec:
  pods:
    scheduling:
      podAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: my-application
          topologyKey: kubernetes.io/hostname
```
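Required pod affinity leaves pods unschedulable when no matching pod exists. A softer variant (a sketch, reusing the same hypothetical `app` label) expresses the co-location as a weighted preference instead:

```yaml
podAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
  - weight: 100
    podAffinityTerm:
      labelSelector:
        matchLabels:
          app: my-application
      topologyKey: kubernetes.io/hostname
```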
Avoid co-location with specific pods:
```yaml
spec:
  pods:
    scheduling:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: StackGresCluster
              stackgres.io/cluster-name: my-cluster
          topologyKey: kubernetes.io/hostname
```
Note: StackGres automatically configures pod anti-affinity in the production profile to spread instances across nodes.
| Key | Scope |
|---|---|
| `kubernetes.io/hostname` | Single node |
| `topology.kubernetes.io/zone` | Availability zone |
| `topology.kubernetes.io/region` | Region |
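Any of these keys can serve as the anti-affinity `topologyKey`. For instance, a sketch that keeps replicas of the cluster in different availability zones rather than just on different nodes:

```yaml
podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
  - labelSelector:
      matchLabels:
        stackgres.io/cluster-name: my-cluster
    topologyKey: topology.kubernetes.io/zone
```

Note that a required zone-level rule caps the instance count at the number of zones; prefer `preferredDuringSchedulingIgnoredDuringExecution` if the cluster may grow beyond that.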
Fine-grained control over pod distribution:
```yaml
spec:
  pods:
    scheduling:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: StackGresCluster
            stackgres.io/cluster-name: my-cluster
```
| Field | Description |
|---|---|
| `maxSkew` | Maximum difference in pod count between topology domains |
| `topologyKey` | Node label defining the topology domain |
| `whenUnsatisfiable` | `DoNotSchedule` or `ScheduleAnyway` |
| `labelSelector` | Pods to consider for spreading |
For example, spread the cluster's instances evenly across zones:

```yaml
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: DoNotSchedule
  labelSelector:
    matchLabels:
      stackgres.io/cluster-name: my-cluster
```
Set pod priority for scheduling and preemption:
```yaml
spec:
  pods:
    scheduling:
      priorityClassName: high-priority-database
```
Create a PriorityClass:
```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority-database
value: 1000000
globalDefault: false
description: "Priority class for PostgreSQL databases"
```
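By default, a high-priority pod may preempt (evict) lower-priority pods to make room. If that is undesirable, `PriorityClass` also supports `preemptionPolicy` (a sketch; the class name here is illustrative):

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority-database-no-preempt
value: 1000000
preemptionPolicy: Never  # high queue priority, but never evicts running pods
globalDefault: false
description: "High priority without preempting other workloads"
```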
Configure separate scheduling for backup pods:
```yaml
spec:
  pods:
    scheduling:
      backup:
        nodeSelector:
          workload: backup
        tolerations:
        - key: "backup-only"
          operator: "Exists"
          effect: "NoSchedule"
```
This allows running backups on different nodes than the database.
A complete high-availability production example combining these options:

```yaml
apiVersion: stackgres.io/v1
kind: SGCluster
metadata:
  name: ha-cluster
spec:
  instances: 3
  postgres:
    version: '16'
  profile: production
  pods:
    persistentVolume:
      size: '100Gi'
    scheduling:
      # Run only on dedicated database nodes
      nodeSelector:
        node-type: database
      # Tolerate dedicated node taints
      tolerations:
      - key: "dedicated"
        operator: "Equal"
        value: "postgresql"
        effect: "NoSchedule"
      # Prefer NVMe storage nodes
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          preference:
            matchExpressions:
            - key: storage-type
              operator: In
              values:
              - nvme
      # Spread across availability zones
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            stackgres.io/cluster-name: ha-cluster
      # High priority
      priorityClassName: database-critical
```
A development cluster that prefers spot/preemptible nodes:

```yaml
apiVersion: stackgres.io/v1
kind: SGCluster
metadata:
  name: dev-cluster
spec:
  instances: 1
  postgres:
    version: '16'
  profile: development
  pods:
    persistentVolume:
      size: '10Gi'
    scheduling:
      # Prefer spot/preemptible nodes
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          preference:
            matchExpressions:
            - key: node-lifecycle
              operator: In
              values:
              - spot
      tolerations:
      - key: "spot-instance"
        operator: "Exists"
        effect: "NoSchedule"
```
A multi-region disaster-recovery cluster:

```yaml
apiVersion: stackgres.io/v1
kind: SGCluster
metadata:
  name: dr-cluster
spec:
  instances: 5
  postgres:
    version: '16'
  pods:
    persistentVolume:
      size: '100Gi'
    scheduling:
      # Require specific regions
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: topology.kubernetes.io/region
              operator: In
              values:
              - us-east-1
              - us-west-2
      # Spread across regions and zones
      topologySpreadConstraints:
      - maxSkew: 2
        topologyKey: topology.kubernetes.io/region
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            stackgres.io/cluster-name: dr-cluster
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            stackgres.io/cluster-name: dr-cluster
```
A cluster with separate scheduling for database and backup pods:

```yaml
apiVersion: stackgres.io/v1
kind: SGCluster
metadata:
  name: my-cluster
spec:
  instances: 3
  pods:
    scheduling:
      # Database pods on high-performance nodes
      nodeSelector:
        workload: database
        performance: high
      # Backup pods on cost-optimized nodes
      backup:
        nodeSelector:
          workload: backup
          cost: optimized
        tolerations:
        - key: "backup-workload"
          operator: "Exists"
          effect: "NoSchedule"
```