The multi-tenant architecture uses a form of hierarchical database modeling to distribute queries across nodes: the tenant ID is stored in a column on each table, and Citus uses it to route every query to the worker node holding that tenant's data.
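As a minimal sketch (the orders table and its columns are hypothetical, not part of the examples below):
CREATE TABLE orders (
    tenant_id bigint NOT NULL,
    order_id bigint NOT NULL,
    total numeric,
    PRIMARY KEY (tenant_id, order_id)
);
SELECT create_distributed_table('orders', 'tenant_id');
-- Filtering on the distribution column lets Citus route the whole
-- query to the single worker holding this tenant's shard
SELECT count(*) FROM orders WHERE tenant_id = 42;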
Real-time architectures depend on specific distribution properties to achieve highly parallel processing: choose a distribution column with high cardinality and an even value distribution, so that data and query load spread uniformly across the workers.
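As a sketch (the page_views table and its columns are illustrative), distributing by a high-cardinality column such as a user ID spreads shards, and therefore work, evenly across all workers:
CREATE TABLE page_views (
    user_id bigint NOT NULL,
    path text,
    viewed_at timestamp
);
SELECT create_distributed_table('page_views', 'user_id');
-- Without a distribution-key filter, the query fans out to every
-- worker in parallel and the coordinator merges the partial results
SELECT path, count(*) FROM page_views GROUP BY path;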
Important: Do NOT use the timestamp as the distribution column for time-series data. Citus hashes the distribution column, so nearby timestamps are scattered across unrelated shards, and a query over a time range must contact every worker instead of just a few. The time-series example below shows the recommended pattern: distribute by an entity key and partition by time.
Co-located tables are distributed tables that share the same distribution column. Rows with equal distribution-key values are placed on the same worker node, so joins and foreign keys on that key execute locally inside each worker instead of pulling data across the network.
Example:
-- Distribute both tables by tenant_id; colocate_with ensures their
-- shards for the same tenant land on the same worker
SELECT create_distributed_table('event', 'tenant_id');
SELECT create_distributed_table('page', 'tenant_id', colocate_with => 'event');
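A join on the distribution key then stays local to each worker, for example:
-- The event and page shards for a given tenant live on the same
-- worker, so this join needs no cross-node data movement
SELECT e.tenant_id, count(*)
FROM event e
JOIN page p USING (tenant_id)
GROUP BY e.tenant_id;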
Reference tables are replicated across all worker nodes and automatically kept in sync during modifications. Use them for small tables that need to be joined with distributed tables.
-- Replicate the small lookup table to every worker node
SELECT create_reference_table('geo_ips');
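A join against the replicated table then runs locally on every worker; in this sketch, the client_ip and network columns are assumptions for illustration:
-- geo_ips is present on every worker, so joining it with the
-- distributed event table requires no network transfer
SELECT e.tenant_id, g.country_code
FROM event e
JOIN geo_ips g ON e.client_ip <<= g.network;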
Adding a new shard is simple: increase the value of the clusters field in the shards section:
apiVersion: stackgres.io/v1alpha1
kind: SGShardedCluster
metadata:
  name: my-sharded-cluster
spec:
  shards:
    clusters: 3 # Increased from 2
After provisioning, rebalance data using the resharding operation:
apiVersion: stackgres.io/v1
kind: SGShardedDbOps
metadata:
  name: reshard
spec:
  sgShardedCluster: my-sharded-cluster
  op: resharding
  resharding:
    citus: {}
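When the operation finishes, you can verify from SQL that shard placements now cover the new worker; a quick sanity check against the citus_shards view (available in Citus 10 and later):
-- Count shard placements per worker node; the new node should
-- appear with a roughly even share after rebalancing
SELECT nodename, count(*) AS shard_count
FROM citus_shards
GROUP BY nodename
ORDER BY nodename;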
For time-series workloads, Citus allows creating tables that are both partitioned by time and distributed. With partitioned tables, removing old historical data is fast and doesn't generate bloat, because whole partitions are dropped instead of individual rows being deleted:
-- Partition the table by time...
CREATE TABLE github_events (
    event_id bigint,
    event_type text,
    repo_id bigint,
    created_at timestamp
) PARTITION BY RANGE (created_at);

-- ...but distribute it by an entity key, not by the timestamp
SELECT create_distributed_table('github_events', 'repo_id');

-- Create monthly partitions covering the next 12 months
SELECT create_time_partitions(
    table_name := 'github_events',
    partition_interval := '1 month',
    end_at := now() + '12 months'
);
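The matching maintenance procedure removes expired data by dropping whole partitions rather than deleting rows; the 12-month retention below is an arbitrary example:
-- Drop partitions lying entirely before the cutoff; this is a fast
-- metadata-only operation and generates no bloat
CALL drop_old_time_partitions(
    'github_events',
    now() - interval '12 months' /* older_than */
);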
Citus supports columnar storage for distributed partitioned tables. This append-only format can greatly reduce data size and improve query performance, especially for numerical values:
CALL alter_old_partitions_set_access_method(
    'github_events',
    '2015-01-01 06:00:00' /* older_than */,
    'columnar'
);
Note: Columnar storage disallows updating and deleting individual rows, but you can still drop entire partitions.
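To confirm the conversion, you can list each partition's access method from the system catalogs; the LIKE pattern assumes the default partition names produced by create_time_partitions:
-- Show the storage access method (heap or columnar) per partition
SELECT c.relname, am.amname
FROM pg_class c
JOIN pg_am am ON am.oid = c.relam
WHERE c.relname LIKE 'github_events_p%'
ORDER BY c.relname;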
Create the SGShardedCluster resource:
apiVersion: stackgres.io/v1alpha1
kind: SGShardedCluster
metadata:
  name: cluster
spec:
  type: citus
  database: mydatabase
  postgres:
    version: '15'
  coordinator:
    instances: 2
    pods:
      persistentVolume:
        size: '10Gi'
  shards:
    clusters: 4
    instancesPerCluster: 2
    pods:
      persistentVolume:
        size: '10Gi'
This configuration creates a coordinator with 2 Pods and 4 shards with 2 Pods each.
By default, the coordinator has a synchronous replica to avoid losing any of the Citus metadata that the sharded cluster cannot function without.
The shards are where the sharded data lives; each shard has a replica to provide high availability to the cluster.

After all the Pods are Ready, you can view the topology of the newly created sharded cluster by issuing the following command:
kubectl exec -n my-cluster cluster-coord-0 -c patroni -- patronictl list
+ Citus cluster: cluster --+------------------+--------------+---------+----+-----------+
| Group | Member           | Host             | Role         | State   | TL | Lag in MB |
+-------+------------------+------------------+--------------+---------+----+-----------+
|     0 | cluster-coord-0  | 10.244.0.16:7433 | Leader       | running |  1 |           |
|     0 | cluster-coord-1  | 10.244.0.34:7433 | Sync Standby | running |  1 |         0 |
|     1 | cluster-shard0-0 | 10.244.0.19:7433 | Leader       | running |  1 |           |
|     1 | cluster-shard0-1 | 10.244.0.48:7433 | Replica      | running |  1 |         0 |
|     2 | cluster-shard1-0 | 10.244.0.20:7433 | Leader       | running |  1 |           |
|     2 | cluster-shard1-1 | 10.244.0.42:7433 | Replica      | running |  1 |         0 |
|     3 | cluster-shard2-0 | 10.244.0.22:7433 | Leader       | running |  1 |           |
|     3 | cluster-shard2-1 | 10.244.0.43:7433 | Replica      | running |  1 |         0 |
|     4 | cluster-shard3-0 | 10.244.0.27:7433 | Leader       | running |  1 |           |
|     4 | cluster-shard3-1 | 10.244.0.45:7433 | Replica      | running |  1 |         0 |
+-------+------------------+------------------+--------------+---------+----+-----------+
You may also check that the nodes are already registered in Citus by running the following command:
$ kubectl exec -n my-cluster cluster-coord-0 -c patroni -- psql -d mydatabase -c 'SELECT * FROM pg_dist_node'
 nodeid | groupid |  nodename   | nodeport | noderack | hasmetadata | isactive | noderole | nodecluster | metadatasynced | shouldhaveshards
--------+---------+-------------+----------+----------+-------------+----------+----------+-------------+----------------+------------------
      1 |       0 | 10.244.0.34 |     7433 | default  | t           | t        | primary  | default     | t              | f
      3 |       2 | 10.244.0.20 |     7433 | default  | t           | t        | primary  | default     | t              | t
      2 |       1 | 10.244.0.19 |     7433 | default  | t           | t        | primary  | default     | t              | t
      4 |       3 | 10.244.0.22 |     7433 | default  | t           | t        | primary  | default     | t              | t
      5 |       4 | 10.244.0.27 |     7433 | default  | t           | t        | primary  | default     | t              | t
(5 rows)
Note that the groupid column of the pg_dist_node table matches the Group column in the patronictl output above. In particular, group 0 is the coordinator group: the coordinator has shouldhaveshards set to f because it stores the distributed metadata rather than shard data.
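For instance, to list only the groups that actually store shard data, you can filter on that column:
-- Worker groups holding shards; the coordinator (group 0) is
-- excluded because shouldhaveshards is false for it
SELECT groupid, nodename, nodeport
FROM pg_dist_node
WHERE shouldhaveshards
ORDER BY groupid;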
For a more complete configuration, please have a look at the Create Citus Sharded Cluster section.