A Helm chart that deploys OpenTelemetry-based deep monitoring for PostgreSQL using a sidecar collector pattern. It installs stored functions on the target database, creates a dedicated monitoring user, and runs an OpenTelemetryCollector sidecar that emits metrics via OTLP.
When argoEvents.enabled=true, the chart also installs Argo Events to run the database setup Job after the OTel Operator injects the sidecar into a target Pod.
| Repository | Name | Version |
|---|---|---|
| https://argoproj.github.io/argo-helm | argo-events | 2.4.22 |
┌───────────────────────────────────────────┐
│ Your PostgreSQL Pod │
│ │
│ ┌─────────────┐ ┌─────────────────┐ │
│ │ PostgreSQL │ │ OTel Collector │ │
│ │ container │◄──│ (sidecar) │ │
│ └─────────────┘ └────────┬────────┘ │
│ │ OTLP/gRPC │
└──────────────────────────────┼────────────┘
▼
Node IP :4317
(DaemonSet collector
or any OTLP endpoint)
The chart creates a single shared ConfigMap (postgresql-monitoring-setup) that embeds pg-monitoring-setup.sql (the stored functions), and then for each database entry in values.yaml:
<name>-pg-monitor-credentials Secret holding the auto-generated (or idempotently preserved) password for the otel_monitor user. When postgres.databases[*].namespace differs from the Helm release namespace, the same Secret is replicated into the target namespace so the injected sidecar can resolve secretKeyRef in the Pod’s namespace.otel database, and rotates the otel_monitor password to match the Secret. See Argo Events setup below for when and how this Job is created.opentelemetry.io/v1beta1 CR in the Helm release namespace that configures the contrib collector with all receivers and exports metrics to $K8S_NODE_IP:4317.When argoEvents.enabled=true and argoEvents.triggerSetupJob=true (the default), the chart installs Argo Events and uses it to run the setup Job after the OTel Operator injects the sidecar into a target Pod. This avoids racing the database setup against sidecar startup and supports deploying DB monitoring in one namespace while Postgres runs in another.
┌─────────────────────────────────────────────────────────────────────────┐
│ Helm release namespace (e.g. observability) │
│ │
│ EventBus ──► EventSource ──► NATS ──► Sensor ──► Setup Job │
│ │ │ │
│ OpenTelemetryCollector CR ◄─────────────────────┘ (same release ns) │
│ Monitor Secret (for Job) │
│ ConfigMap (setup SQL) │
│ Argo Events controller (argo-events subchart) │
└─────────────────────────────────────────────────────────────────────────┘
▲ │
│ cross-namespace inject │ cross-namespace pod watch
│ annotation │ (RBAC in target namespace)
│ ▼
┌─────────────────────────────────────────────────────────────────────────┐
│ Target namespace (e.g. demo) │
│ │
│ PostgreSQL Pod + injected OTel sidecar │
│ Monitor Secret (replica for sidecar secretKeyRef) │
└─────────────────────────────────────────────────────────────────────────┘
The flow for each postgres.databases entry:
argoEvents.eventBusName, default default).ADD events in the database target namespace(s). Uses filter.afterStart: true, so only Pods created after the EventSource starts trigger events (existing Pods are ignored).sidecar.opentelemetry.io/inject annotation (argoEvents.sidecarInjectAnnotation). When the annotation matches the expected value, it creates the setup Job in the release namespace.<host>.<namespace>.svc.cluster.local) when the database runs in another namespace.When argoEvents.enabled=false, or argoEvents.triggerSetupJob=false, the setup Job is created directly as a Helm post-install / post-upgrade hook instead.
The OTel Operator must inject the sidecar before the Sensor fires. Annotate the target Pod (or its controller template) with sidecar.opentelemetry.io/inject:
| Scenario | Expected annotation value |
|---|---|
| Postgres in the same namespace as the Helm release | postgres-dbm-sidecar (the sidecar-name) |
| Postgres in a different namespace | <releaseNamespace>/<sidecar-name> (e.g. observability/postgres-dbm-sidecar) |
The Sensor filter is derived automatically from postgres.databases[*].namespace and sidecar-name. The OpenTelemetryCollector CR always lives in the release namespace; cross-namespace injection is driven by the annotation prefix.
Install DB monitoring in observability, target Postgres in demo:
# values.yaml for: helm install dbm … -n observability
postgres:
enabled: true
databases:
- name: postgresql
namespace: demo
sidecar-name: postgres-dbm-sidecar
host: postgresql
user: root
pwd: otel
port: 5432
argoEvents:
enabled: true
triggerSetupJob: true
On the Postgres Pod (in demo):
podAnnotations:
sidecar.opentelemetry.io/inject: "observability/postgres-dbm-sidecar"
After install, restart the Postgres Pod so the EventSource emits a fresh ADD event:
kubectl delete pod -n demo -l app.kubernetes.io/name=postgresql
kubectl get jobs -n observability -l app.kubernetes.io/component=db-monitoring-setup -w
argoEvents:
enabled: false # do not install the controller or EventBus/EventSource/Sensor
triggerSetupJob: false # use Helm hook Jobs even when argoEvents.enabled=true
Argo Events CRDs are bundled under crds/ and installed before other chart resources. The argo-events subchart is gated by argoEvents.enabled; its own CRD install is disabled (argo-events.crds.install: false) because this chart ships the CRDs.
OpenTelemetryCollector CRD and injects sidecars)pg_stat_progress_vacuum)CREATEROLE and CREATEDB) available to run the setup Job — credentials are passed via postgres.databases[*].user / postgres.databases[*].pwdotel database must exist on each target PostgreSQL instance before install (the Job connects to it: psql … -d otel)argoEvents.enabled=true): Argo Events CRDs must be available (installed from this chart’s crds/ directory on first install)helm repo add tsuga-charts https://tsuga-dev.github.io/helm-charts/
helm repo update
helm install pg-monitoring tsuga-charts/opentelemetry-database-monitoring \
--set postgres.enabled=true \
--set postgres.databases[0].name=postgresql \
--set postgres.databases[0].host=postgresql \
--set postgres.databases[0].user=root \
--set postgres.databases[0].pwd=<superuser-password>
With Argo Events enabled (default), install into the namespace where you want the sidecar CRs, EventBus, and setup Jobs to live. Set postgres.databases[*].namespace when Postgres runs elsewhere.
postgres:
enabled: true
databases:
- name: postgresql
sidecar-name: postgres-dbm-sidecar
user: root
pwd: otel
port: 5432
host: postgresql
argoEvents:
enabled: true
triggerSetupJob: true
Annotate the Postgres Pod with sidecar.opentelemetry.io/inject: postgres-dbm-sidecar.
postgres:
enabled: true
databases:
- name: postgresql
namespace: demo
sidecar-name: postgres-dbm-sidecar
user: root
pwd: otel
port: 5432
host: postgresql
argoEvents:
enabled: true
triggerSetupJob: true
Annotate the Postgres Pod in demo with sidecar.opentelemetry.io/inject: <releaseNamespace>/postgres-dbm-sidecar.
Each entry produces an independent set of resources (Secret, Sensor, sidecar CR, and setup Job trigger).
postgres:
enabled: true
databases:
- name: postgresql
namespace: demo
sidecar-name: postgres-dbm-sidecar
user: root
pwd: otel
port: 5432
host: postgresql
- name: postgresql2
namespace: staging
sidecar-name: postgres2-dbm-sidecar
user: root
pwd: otel
port: 5432
host: postgresql2
The setup Job runs assets/pg-monitoring-setup.sql idempotently. It:
otel_monitor user (if absent) and grants it pg_monitor + SELECT on pg_stat_database.pg_stat_statements extension.otel schema and grants USAGE to otel_monitor.STABLE, safe for repeated execution):| Function | Description |
|---|---|
otel.pg_table_stats() |
Vacuum/analyze ages, autovacuum counts, rows modified since last analyze. Top 50 tables by n_mod_since_analyze. |
otel.pg_bloat() |
Table bloat ratio and wasted bytes using the standard check_postgres algorithm. Top 20 tables by wasted bytes. |
otel.pg_connections() |
Connection count and max state age per database per connection state. |
otel.pg_lock_counts() |
Lock counts by type, mode, and granted status. |
otel.pg_blocking_queries() |
Count of queries currently waiting on a lock. |
otel.pg_idle_in_transaction() |
Idle-in-transaction connection count and max age per database. |
otel.pg_replication_slots() |
Replication slot lag in bytes per slot. |
otel.pg_wal() |
WAL segment file count and total WAL directory size. |
otel.pg_vacuum_progress() |
Active vacuum operations with phase and block progress (PG 10+). |
otel.pg_top_queries() |
Top 20 queries by call count from pg_stat_statements. Query text truncated to 200 characters. |
EXECUTE on all otel functions to otel_monitor.otel_monitor password to match the value in the Secret.Note: The Job connects to the
oteldatabase (-d otel). This database must exist before the chart is installed.
The sidecar collector runs two categories of receivers.
postgresql receiverPolls the PostgreSQL stats system views directly. Default collection interval.
| Metric | Unit | Description |
|---|---|---|
postgresql.bgwriter.buffers.allocated |
{buffer} |
Buffers allocated by bgwriter |
postgresql.bgwriter.buffers.writes |
{buffer} |
Buffers written by bgwriter |
postgresql.bgwriter.checkpoint.count |
{checkpoint} |
Checkpoint count |
postgresql.bgwriter.duration |
ms |
Checkpoint write/sync duration |
postgresql.bgwriter.maxwritten |
1 |
Times bgwriter stopped a round due to too many buffers written |
postgresql.blks_hit |
{hit} |
Buffer cache hits |
postgresql.blks_read |
{read} |
Disk blocks read |
postgresql.connection.max |
{connection} |
Max connections configured |
postgresql.database.locks |
{lock} |
Database lock counts |
postgresql.deadlocks |
{deadlock} |
Deadlocks detected |
postgresql.replication.data_delay |
By |
Replication data delay |
postgresql.sequential_scans |
{scan} |
Sequential scans |
postgresql.temp_files |
{file} |
Temp files created |
postgresql.tup_deleted / tup_fetched / tup_inserted / tup_returned / tup_updated |
{tuple} |
Row operation counters |
postgresql.wal.age |
s |
WAL archiver age |
postgresql.wal.lag |
s |
Replication WAL lag |
sqlquery receivers (stored functions)| Receiver | Interval | Metrics |
|---|---|---|
sqlquery/postgresql_top_queries |
default | postgresql.query.calls, .rows, .exec_time, .plan_time, .blocks.shared.*, .blocks.temp.* — attributed by db.namespace, db.user.name, server.address, db.query.text, db.version |
sqlquery/pg_table_stats |
5 min | postgresql.table.vacuum.age, .autovacuum.age, .analyze.age, .autoanalyze.age, .autovacuum.count, .analyze.count, .autoanalyze.count, .rows.modified.since.analyze |
sqlquery/pg_bloat |
1 hour | postgresql.table.bloat.ratio, postgresql.table.bloat.bytes |
sqlquery/pg_connections |
30 s | postgresql.backends (count), postgresql.backends.age (max state age) |
sqlquery/pg_locks |
30 s | postgresql.lock.count, postgresql.lock.blocking_queries, postgresql.connection.idle_in_transaction, postgresql.connection.idle_in_transaction.age |
sqlquery/pg_replication |
30 s | postgresql.replication.slot.lag |
sqlquery/pg_wal |
60 s | postgresql.wal.file_count, postgresql.wal.bytes |
sqlquery/pg_vacuum_progress |
60 s | postgresql.vacuum.heap_blks_total, .heap_blks_scanned, .heap_blks_vacuumed, postgresql.vacuum.age |
All metrics are batched (max 5000 per batch) and exported via OTLP/gRPC to $K8S_NODE_IP:4317.
user/pwd) are used only by the setup Job and are not stored in any long-lived secret.otel_monitor password from the Secret. With Helm hook Jobs, the password is rotated on every helm upgrade. With Argo Events, rotation happens each time a new setup Job completes.lookup to preserve the existing password across upgrades — a new random 24-character password is only generated on first install.otel_monitor user holds pg_monitor (read-only system views) and EXECUTE on the otel.* functions. It has no write access to application data.sslmode=disable. Enable TLS by overriding assets/pg-monitoring-config.yaml if your PostgreSQL requires it.Without Argo Events (argoEvents.enabled=false or argoEvents.triggerSetupJob=false): the setup Job runs on both post-install and post-upgrade Helm hooks with a before-hook-creation deletion policy, so it re-runs on every helm upgrade.
With Argo Events (default): the setup Job is created by a Sensor when a new target Pod is added with the correct inject annotation. Upgrading the chart updates EventSource/Sensor configuration but does not re-run setup on existing Pods. Restart the target Postgres Pod after upgrade if you need setup to run again.
All SQL statements are idempotent and the password rotation uses the existing Secret value.
Setup Job never created (Argo Events enabled)
kubectl get eventbus,eventsource,sensor -n <release-namespace>
sidecar-name; cross-namespace → <releaseNamespace>/<sidecar-name>):
kubectl get sensor -n <release-namespace> -o yaml | rg 'value:|eventName:'
kubectl get pod -n <target-namespace> <postgres-pod> -o jsonpath='{.metadata.annotations.sidecar\.opentelemetry\.io/inject}{"\n"}'
afterStart: true). Restart the Postgres Pod after install or upgrade:
kubectl delete pod -n <target-namespace> -l app.kubernetes.io/name=postgresql
kubectl get jobs -n <release-namespace> -l app.kubernetes.io/component=db-monitoring-setup -w
Job fails immediately
Check that the otel database exists and the superuser credentials are correct:
kubectl logs -l app.kubernetes.io/component=db-monitoring-setup -n <release-namespace>
Sidecar init container fails with secret not found
The monitor Secret must exist in the Pod’s namespace. Set postgres.databases[*].namespace so the chart replicates the Secret into the target namespace.
No metrics in the collector
Verify the OpenTelemetryCollector CR was injected as a sidecar into the target Pod. The OTel Operator must be running and the Pod must have the sidecar.opentelemetry.io/inject annotation (or the operator must be configured for namespace-wide injection).
pg_stat_statements not available
The extension must be listed in shared_preload_libraries in postgresql.conf and PostgreSQL must be restarted before it can be created. Add shared_preload_libraries = 'pg_stat_statements' to your PostgreSQL configuration and restart the server before installing the chart.
Top queries missing
Queries run by the collector itself are excluded via the /* otel-collector-ignore */ comment prefix. If otel.pg_top_queries() returns no rows, check that pg_stat_statements.track is not set to none.
| Key | Type | Default | Description |
|---|---|---|---|
| argo-events.crds.install | bool | false |
|
| argo-events.enabled | bool | true |
|
| argoEvents.enabled | bool | true |
|
| argoEvents.eventBus.auth | string | "none" |
|
| argoEvents.eventBus.create | bool | true |
|
| argoEvents.eventBus.replicas | int | 1 |
|
| argoEvents.eventBusName | string | "default" |
|
| argoEvents.eventSource.name | string | "" |
|
| argoEvents.eventSource.serviceAccount.create | bool | true |
|
| argoEvents.eventSource.serviceAccount.name | string | "" |
|
| argoEvents.eventSource.watchNamespace | string | "" |
|
| argoEvents.job.generateNameSuffix | string | "setup-" |
|
| argoEvents.sensor.serviceAccount.create | bool | true |
|
| argoEvents.sensor.serviceAccount.name | string | "" |
|
| argoEvents.sidecarInjectAnnotation | string | "sidecar.opentelemetry.io/inject" |
|
| argoEvents.triggerSetupJob | bool | true |
|
| postgres.databases[0].host | string | "" |
|
| postgres.databases[0].name | string | "postgresql" |
|
| postgres.databases[0].namespace | string | "" |
|
| postgres.databases[0].port | int | 5432 |
|
| postgres.databases[0].pwd | string | "" |
|
| postgres.databases[0].sidecar-name | string | "postgres-dbm-sidecar" |
|
| postgres.databases[0].user | string | "" |
|
| postgres.enabled | bool | false |
|
| postgres.image | string | "ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-contrib" |