StackGres uses an OpenTelemetry Collector to gather metrics from PostgreSQL clusters and expose them to monitoring systems like Prometheus.
The OpenTelemetry Collector acts as a central hub for metrics:
┌─────────────────┐     ┌─────────────────────┐     ┌─────────────────┐
│   SGCluster     │────▶│   OpenTelemetry     │────▶│   Prometheus    │
│   (metrics)     │     │     Collector       │     │                 │
└─────────────────┘     └─────────────────────┘     └─────────────────┘
        │                         │
        │                         │
┌───────▼─────────┐               │
│     Envoy       │───────────────┘
│ (proxy metrics) │
└─────────────────┘
By default, StackGres deploys an OpenTelemetry Collector as part of the operator installation.
Configure the collector during StackGres operator installation:
# values.yaml
collector:
  enabled: true
  config:
    receivers:
      prometheus:
        config:
          scrape_configs:
            - job_name: 'stackgres'
              scrape_interval: 30s
    exporters:
      prometheus:
        endpoint: "0.0.0.0:9090"
    service:
      pipelines:
        metrics:
          receivers: [prometheus]
          exporters: [prometheus]
Configure the collector through the SGConfig CRD:
apiVersion: stackgres.io/v1
kind: SGConfig
metadata:
  name: stackgres-config
  namespace: stackgres
spec:
  collector:
    config:
      exporters:
        prometheus:
          endpoint: "0.0.0.0:9090"
      receivers:
        otlp:
          protocols:
            grpc:
              endpoint: "0.0.0.0:4317"
            http:
              endpoint: "0.0.0.0:4318"
Configure how the collector scrapes metrics:
spec:
  collector:
    receivers:
      prometheus:
        enabled: true
        # Additional Prometheus scrape configs
Enable OTLP protocol for receiving metrics:
spec:
  collector:
    config:
      receivers:
        otlp:
          protocols:
            grpc:
              endpoint: "0.0.0.0:4317"
            http:
              endpoint: "0.0.0.0:4318"
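To confirm that the OTLP HTTP receiver accepts data, one option is to post a single hand-built gauge to it. The sketch below only builds the request body, following the OTLP JSON encoding; the function name `otlp_gauge_payload`, the metric name, and the `otlp-smoke-test` service name are illustrative assumptions, not part of StackGres.

```shell
# Build a minimal OTLP/HTTP JSON body carrying one gauge data point.
# Arguments: metric name, numeric value (both hypothetical test values).
otlp_gauge_payload() {
  name="$1"
  value="$2"
  # OTLP JSON encodes 64-bit timestamps as strings of nanoseconds.
  now_ns="$(date +%s)000000000"
  printf '{"resourceMetrics":[{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"otlp-smoke-test"}}]},"scopeMetrics":[{"scope":{"name":"manual"},"metrics":[{"name":"%s","gauge":{"dataPoints":[{"asDouble":%s,"timeUnixNano":"%s"}]}}]}]}]}\n' "$name" "$value" "$now_ns"
}
```

Assuming port 4318 of the collector has been made reachable on localhost (for example with a port-forward, which this document does not show for that port), the body could be sent with `otlp_gauge_payload smoke_test_gauge 1 | curl -s -X POST -H 'Content-Type: application/json' --data-binary @- http://localhost:4318/v1/metrics`.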
Configure the Prometheus endpoint:
spec:
  collector:
    config:
      exporters:
        prometheus:
          endpoint: "0.0.0.0:9090"
          namespace: stackgres
          const_labels:
            environment: production
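With the exporter settings above, exported metric names should be prefixed with `stackgres_` and every series should carry the label `environment="production"`. A rough way to check this is to count matching samples in the scraped text. The function name `check_exporter_output` is an assumption made for this sketch, and the label match is a plain substring test.

```shell
# Read Prometheus text exposition on stdin and report how many samples
# carry the expected name prefix and the expected constant label.
# Arguments: namespace prefix, label name, label value.
check_exporter_output() {
  awk -v prefix="$1_" -v label="$2=\"$3\"" '
    /^#/ || NF == 0 { next }             # skip HELP/TYPE comments and blank lines
    { total++ }
    index($0, prefix) == 1 { prefixed++ }  # name starts with "<namespace>_"
    index($0, label) > 0   { labelled++ }  # line contains label="value"
    END { printf "samples=%d prefixed=%d labelled=%d\n", total, prefixed, labelled }
  '
}
```

For example, `curl -s http://localhost:9090/metrics | check_exporter_output stackgres environment production` prints the three counts; if the configuration took effect, they would be expected to match.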
If you have Prometheus Operator installed, StackGres can automatically create PodMonitor/ServiceMonitor resources.
apiVersion: stackgres.io/v1
kind: SGConfig
metadata:
  name: stackgres-config
  namespace: stackgres
spec:
  collector:
    prometheusOperator:
      # Allow discovery of Prometheus instances in all namespaces
      allowDiscovery: true
      # Create monitors automatically
      # monitors:
      #   - name: prometheus
Enable automatic binding to discovered Prometheus instances:
apiVersion: stackgres.io/v1
kind: SGCluster
metadata:
  name: my-cluster
spec:
  configurations:
    observability:
      prometheusAutobind: true
This automatically creates the necessary ServiceMonitor resources.
apiVersion: stackgres.io/v1
kind: SGCluster
metadata:
  name: my-cluster
spec:
  configurations:
    observability:
      # Enable/disable metrics collection
      disableMetrics: false
      # Prometheus auto-discovery
      prometheusAutobind: true
      # Receiver name for collector scraper
      receiver: my-receiver
For clusters where you don’t need metrics:
spec:
  configurations:
    observability:
      disableMetrics: true
Configure multiple collector replicas:
spec:
  collector:
    receivers:
      enabled: true
      deployments: 2  # Number of collector deployments
Set resource limits for the collector:
# Helm values
collector:
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 500m
      memory: 512Mi
spec:
  collector:
    config:
      processors:
        batch:
          timeout: 10s
          send_batch_size: 1000
        memory_limiter:
          check_interval: 1s
          limit_mib: 400
      service:
        pipelines:
          metrics:
            receivers: [prometheus, otlp]
            processors: [memory_limiter, batch]
            exporters: [prometheus]
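The `limit_mib: 400` setting is only useful if it sits below the container memory limit (512Mi in the Helm values shown earlier), because the collector process generally uses more memory than the heap target alone. The arithmetic can be sketched as follows; the function name `memory_limiter_headroom` is a hypothetical helper, and how much headroom is enough is left to the reader.

```shell
# Compare memory_limiter's limit_mib with the container memory limit.
# Arguments: limit_mib, container memory limit in MiB.
memory_limiter_headroom() {
  limit_mib="$1"
  container_mib="$2"
  headroom=$((container_mib - limit_mib))       # MiB left above the limiter
  pct=$((limit_mib * 100 / container_mib))      # limiter as % of container limit
  echo "limit_mib=${limit_mib} container_mib=${container_mib} headroom_mib=${headroom} limit_pct=${pct}"
}
```

Running `memory_limiter_headroom 400 512` reports 112 MiB of headroom, with the limiter at 78% of the container limit.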
spec:
  collector:
    config:
      receivers:
        otlp:
          protocols:
            grpc:
              endpoint: "0.0.0.0:4317"
              tls:
                cert_file: /etc/ssl/certs/collector.crt
                key_file: /etc/ssl/private/collector.key
# View collector pods
kubectl get pods -n stackgres -l app=stackgres-collector
# View collector logs
kubectl logs -n stackgres -l app=stackgres-collector
# Check metrics endpoint (port-forward keeps running, so run curl from a second terminal)
kubectl port-forward -n stackgres svc/stackgres-collector 9090:9090
curl http://localhost:9090/metrics
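The raw output of the metrics endpoint can be long. A small filter that reduces it to the distinct metric names makes it easier to see what the collector is exporting. This is a minimal sketch; `metric_names` is a name chosen here for illustration.

```shell
# Read Prometheus text exposition on stdin and print each distinct metric name.
metric_names() {
  grep -v -e '^#' -e '^[[:space:]]*$' |  # drop HELP/TYPE comments and blank lines
    sed 's/[{ ].*$//' |                  # keep only the name before labels or value
    sort -u
}
```

With the port-forward active, `curl -s http://localhost:9090/metrics | metric_names` lists one metric name per line.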
The collector exposes its own health metrics:
otelcol_receiver_received_metric_points: Received metric points
otelcol_exporter_sent_metric_points: Exported metric points
otelcol_processor_dropped_metric_points: Dropped metric points
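These counters are easiest to read side by side. The sketch below sums each of them across all label sets from Prometheus text output. The function name `collector_health_summary` is an assumption, and the prefix match is deliberate so that names with a `_total` suffix, which some collector versions emit, are also counted.

```shell
# Sum the collector self-metrics from Prometheus text on stdin and print
# received, sent, and dropped totals across all label sets.
collector_health_summary() {
  awk '
    /^#/ { next }                                              # skip HELP/TYPE lines
    /^otelcol_receiver_received_metric_points/  { received += $NF }
    /^otelcol_exporter_sent_metric_points/      { sent += $NF }
    /^otelcol_processor_dropped_metric_points/  { dropped += $NF }
    END { printf "received=%d sent=%d dropped=%d\n", received, sent, dropped }
  '
}
```

Pipe the output of whichever endpoint serves the collector's own telemetry into the function, for example `curl -s <collector-telemetry-url> | collector_health_summary`, where the URL depends on your deployment. A growing `dropped` count relative to `received` suggests the processors are shedding data.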