What Is Multi-Tenancy?
Multi-tenancy is an architecture where a single instance of an application serves multiple customers — each called a tenant. The customers share the same underlying infrastructure (compute, database, storage, networking), yet each experiences the platform as if it were their own dedicated environment.
The isolation spectrum runs from shared everything (all tenants in one database table, lowest cost, highest blast radius) to dedicated everything (each tenant gets their own servers and database, highest cost, maximum isolation). Most SaaS platforms land somewhere in between, choosing isolation at the database, schema, or application layer.
SaaS businesses adopt multi-tenancy because it dramatically reduces cost — you run one fleet of servers instead of N. Updates are centralized: deploy once, all tenants get the fix. Observability is unified. Onboarding a new tenant is a database record, not a new cloud deployment. The challenge is ensuring tenant data never crosses boundaries — a data leak is both a security incident and an existential threat to customer trust.
Tenancy Models Compared
Three primary models dominate SaaS architecture decisions. Choosing the wrong one early is painful to reverse — pick based on compliance requirements, expected tenant count, and engineering capacity.
| Model | DB Isolation | Cost | Compliance Fit | Migration Difficulty | Best For |
|---|---|---|---|---|---|
| Silo (DB per tenant) | Highest — separate DB instance | Highest ($$$) | Excellent — HIPAA, PCI DSS, SOC 2 | Complex — N migrations to run | Enterprise, healthcare, fintech |
| Pool (Shared DB + RLS) | Row-level, enforced by database RLS policies | Lowest ($) | Adequate, needs RLS audit | Simple, one migration run | SMB SaaS, 1,000+ tenants, dev tools |
| Bridge (Schema per tenant) | Schema-level (PostgreSQL) | Medium ($$) | Good — schema can be dumped cleanly | Medium — N schemas, one runner | Mid-market SaaS, < 5,000 tenants |
Silo Model: Database per Tenant
In the Silo model each tenant gets a physically isolated database instance. There is zero risk of data bleed between tenants at the query layer — a misconfigured ORM query simply cannot reach another tenant's data.
When to Use Silo
- Customers require HIPAA Business Associate Agreements (BAA) — PHI must be isolatable
- PCI DSS scope — cardholder data must never share storage with other entities
- Enterprise contracts that mandate dedicated database clauses
- SOC 2 Type II auditors expect demonstrable logical or physical separation
- Tenants have vastly different load profiles (one tenant should not starve another)
Connection Pooling with PgBouncer
At 100 tenants you have 100 database endpoints. Your application cannot open a persistent connection to each — PostgreSQL's connection limit becomes a ceiling fast. Deploy one PgBouncer pool per tenant database in transaction pooling mode. The application connects to a PgBouncer sidecar, which multiplexes dozens of app connections onto a handful of real DB connections per tenant.
Dynamic Connection Routing
Store connection strings encrypted in a master control plane database. On each request, resolve the tenant from the subdomain or JWT claim, look up the connection string, and inject it into the ORM connection pool for that request's lifecycle.
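As a minimal sketch of that lookup, assuming tenants are addressed as `{slug}.example.com` and that a plain dictionary stands in for the encrypted control plane table (the domain, the `tenant` claim name, and the DSNs are all hypothetical):

```python
from typing import Optional

# Hypothetical stand-in for the control plane table; in a real deployment
# the DSNs would be stored encrypted and decrypted on lookup.
CONTROL_PLANE = {
    "acme": "postgresql://app@acme-db.internal:5432/app",
    "globex": "postgresql://app@globex-db.internal:5432/app",
}

BASE_DOMAIN = "example.com"  # assumed platform domain


def tenant_from_host(host: str) -> Optional[str]:
    """Return the tenant slug for hosts like 'acme.example.com', else None."""
    host = host.split(":")[0].lower()  # drop any port
    suffix = "." + BASE_DOMAIN
    if not host.endswith(suffix):
        return None
    slug = host[: -len(suffix)]
    # Reject empty or nested subdomains such as 'a.b.example.com'.
    return slug if slug and "." not in slug else None


def resolve_dsn(host: str, jwt_claims: Optional[dict] = None) -> str:
    """Resolve the tenant from the subdomain, falling back to a JWT claim."""
    slug = tenant_from_host(host) or (jwt_claims or {}).get("tenant")
    if slug not in CONTROL_PLANE:
        raise LookupError(f"unknown tenant for host {host!r}")
    return CONTROL_PLANE[slug]
```

The returned DSN would then be handed to whatever per-request pool or session factory your ORM provides. Failing closed with an exception, rather than falling back to a default database, is the safer choice here.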
Running Migrations Across Tenant DBs
Every schema change must run against all tenant databases. Build a migration runner that iterates the tenants table and applies Flyway or a custom SQL migration in parallel with a concurrency cap (10 at a time is safe). Log success and failure per tenant. One tenant's failed migration should not block the rest of the rollout: run migrations as a pre-deploy job with rollback capability, and retry or roll back the failed tenants individually.
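A runner along those lines might look like the sketch below. It assumes a caller-supplied `apply_migration` callable (hypothetical; it could shell out to Flyway or execute SQL) and uses a thread pool whose size acts as the concurrency cap.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Tuple


def migrate_all(
    tenants: Iterable[str],
    apply_migration: Callable[[str], None],
    max_concurrency: int = 10,
) -> Dict[str, str]:
    """Apply a migration to every tenant; return 'ok' or the error per tenant."""

    def run_one(tenant: str) -> Tuple[str, str]:
        try:
            apply_migration(tenant)
            return tenant, "ok"
        except Exception as exc:  # record the failure and keep going
            return tenant, f"failed: {exc}"

    # The pool size is the concurrency cap: at most N migrations run at once.
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return dict(pool.map(run_one, tenants))
```

The per-tenant result map is what you would persist or log, so that failed tenants can be retried without re-running the ones that succeeded.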
AWS RDS Cost Estimate (Silo Model)
| Tenants | RDS Instance | Est. Monthly Cost |
|---|---|---|
| 10 tenants | db.t3.micro × 10 | ~$150/mo |
| 50 tenants | db.t3.small × 50 | ~$1,250/mo |
| 100 tenants | db.t3.medium × 100 | ~$4,800/mo |
Pool Model: Shared DB with Row-Level Security
The Pool model places all tenants in the same database — even the same tables — with a tenant_id column on every row. Isolation is enforced by PostgreSQL's Row-Level Security (RLS) policy engine, not by physical separation.
At 10,000 tenants with mostly light usage, the pool model is dramatically more cost-effective than Silo. A single db.r6g.xlarge RDS instance can serve the entire fleet. The engineering challenge shifts to data safety rigor: every query path must correctly set and propagate the tenant context.
PostgreSQL RLS Setup
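One way to keep RLS setup consistent is to generate the DDL per table from a single helper. The sketch below assumes a `uuid` column named `tenant_id` and a custom session setting called `app.current_tenant`; both names, and the policy name `tenant_isolation`, are illustrative rather than prescribed.

```python
import re

_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")
TENANT_SETTING = "app.current_tenant"  # assumed custom setting name


def rls_statements(table: str) -> list:
    """Build the DDL that turns on tenant-scoped RLS for one table."""
    if not _IDENT.match(table):
        raise ValueError(f"unsafe table name: {table!r}")
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;",
        # FORCE applies the policy to the table owner as well.
        f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY;",
        f"CREATE POLICY tenant_isolation ON {table} "
        f"USING (tenant_id = current_setting('{TENANT_SETTING}')::uuid);",
    ]
```

Note that superusers and roles with `BYPASSRLS` skip policies entirely, so the application should connect as an ordinary role.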
Setting Tenant Context per Request
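The tenant context can be set with PostgreSQL's `set_config` function at the start of each transaction. The following is a sketch, assuming a DB-API connection such as psycopg2 with autocommit off and the hypothetical setting name `app.current_tenant`.

```python
import uuid
from contextlib import contextmanager


@contextmanager
def tenant_context(conn, tenant_id: str):
    """Run a block inside a transaction scoped to one tenant.

    `conn` is any DB-API connection to PostgreSQL (for example psycopg2).
    """
    uuid.UUID(tenant_id)  # raises ValueError on a malformed id
    cur = conn.cursor()
    try:
        # is_local = true limits the setting to the current transaction, so a
        # pooled connection cannot carry one tenant's context into the next.
        cur.execute(
            "SELECT set_config('app.current_tenant', %s, true)", (tenant_id,)
        )
        yield cur
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
```

Usage would be `with tenant_context(conn, tenant_id) as cur: cur.execute(...)`. Scoping the setting to the transaction matters most behind a transaction-mode pooler, where consecutive transactions on one server connection may belong to different tenants.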
Index Strategy
Every table should have a composite index on (tenant_id, id), plus indexes led by tenant_id for any other commonly filtered columns. Without them, PostgreSQL may fall back to a sequential scan across all tenants' rows whenever the RLS policy filters by tenant_id.
Query Performance: EXPLAIN ANALYZE
Critical Risk: A single query missing WHERE tenant_id = ?, or a raw SQL path that sets the session config to the wrong tenant, can expose another customer's data. Enforce tenant filtering at both the database layer (RLS) and the application layer. Never let RLS be your only control.
Bridge Model: Schema per Tenant
PostgreSQL allows multiple schemas within a single database. The Bridge model creates one schema per tenant — all data physically co-located on one database server, yet logically separated at the schema level. Switching between tenants is a SET search_path command.
Schema Naming and search_path
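A simple convention is to derive the schema name from the tenant UUID and switch `search_path` per transaction. The sketch below assumes a `tenant_` prefix followed by the UUID in hex form (no dashes, so the name needs no quoting) and a shared `public` schema as fallback; both are illustrative choices.

```python
import uuid


def schema_for(tenant_id: str) -> str:
    """Map a tenant UUID to a schema name: 'tenant_' plus 32 hex digits."""
    return "tenant_" + uuid.UUID(tenant_id).hex


def search_path_sql(tenant_id: str) -> str:
    """Build the statement that points unqualified table names at one tenant."""
    # The schema name comes from a parsed UUID, so it contains only
    # [0-9a-f_] and is safe to interpolate as an identifier.
    return f"SET LOCAL search_path TO {schema_for(tenant_id)}, public;"
```

`SET LOCAL` lasts only until the end of the current transaction, which avoids leaking one tenant's path to the next request on a pooled connection.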
Migrations with Flyway/Liquibase
Running schema migrations in the Bridge model means iterating all tenant schemas. Flyway supports a schemas parameter per datasource. Build a migration runner that queries information_schema.schemata for all tenant_% prefixed schemas, then applies migrations in parallel batches.
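The discovery query and batching step could be sketched as follows. The helper names are hypothetical, and the Flyway invocation assumes the standard `-schemas` command-line option with connection settings supplied elsewhere (for example in `flyway.conf`).

```python
from typing import Iterator, List, Sequence

# In LIKE patterns '_' matches any single character, so it is escaped here
# to match the literal prefix 'tenant_'.
LIST_TENANT_SCHEMAS = (
    "SELECT schema_name FROM information_schema.schemata "
    "WHERE schema_name LIKE 'tenant\\_%' ORDER BY schema_name;"
)


def batches(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` schema names."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def flyway_args(schema: str) -> List[str]:
    """Command line for migrating one tenant schema."""
    return ["flyway", f"-schemas={schema}", "migrate"]
```

Each batch can then be handed to a worker pool, with one Flyway run per schema so that a failure is attributable to a single tenant.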
Pros and Cons vs Silo and Pool
Advantages over Silo
- One database server — 80% cheaper at 100 tenants
- Single connection pool, no PgBouncer per tenant
- Backups are simpler: one database dump
Advantages over Pool
- No tenant_id columns polluting every table
- GDPR erasure: DROP SCHEMA tenant_X CASCADE
- No RLS policy complexity; data can only bleed across tenants if search_path is set incorrectly
Scale ceiling: PostgreSQL has a practical limit of approximately 10,000 schemas before catalog table queries become slow. Beyond ~5,000 tenants, migrate to Silo (dedicated databases) for heavy tenants or use a sharded Bridge across multiple database clusters.
Kubernetes Namespace-per-Tenant Isolation
For compute isolation (not just database isolation), Kubernetes namespaces provide a logical boundary. Each tenant namespace contains its own Deployments, Services, ConfigMaps, and Secrets. Cluster-level policies enforce that no namespace can reach another's pods.
NetworkPolicy: Block Cross-Namespace Traffic
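A policy that admits ingress only from pods in the same namespace can be generated per tenant. This sketch builds the manifest as a Python dictionary (Kubernetes accepts JSON manifests as well as YAML); the policy name is an arbitrary choice, and enforcement assumes a CNI plugin that implements NetworkPolicy.

```python
import json


def same_namespace_only_policy(namespace: str) -> dict:
    """NetworkPolicy allowing ingress only from pods in the same namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "deny-cross-namespace", "namespace": namespace},
        "spec": {
            "podSelector": {},  # applies to every pod in the namespace
            "policyTypes": ["Ingress"],
            # A bare podSelector in 'from' matches pods in this namespace only.
            "ingress": [{"from": [{"podSelector": {}}]}],
        },
    }


def to_manifest(obj: dict) -> str:
    """Serialize for `kubectl apply -f -`."""
    return json.dumps(obj, indent=2)
```

Because this also blocks traffic from an ingress controller running in another namespace, a real policy usually adds a `namespaceSelector` rule for that controller.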
ResourceQuota and LimitRange
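A ResourceQuota caps the total a tenant namespace may consume, while a LimitRange supplies per-container defaults so that pods without explicit requests still count against the quota. The figures below are placeholder values, not recommendations, and the object names are arbitrary.

```python
def tenant_quota(namespace: str, cpu: str = "4", memory: str = "8Gi",
                 pods: str = "50") -> dict:
    """ResourceQuota capping the namespace's aggregate requests and limits."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "tenant-quota", "namespace": namespace},
        "spec": {"hard": {
            "requests.cpu": cpu,
            "requests.memory": memory,
            "limits.cpu": cpu,
            "limits.memory": memory,
            "pods": pods,
        }},
    }


def tenant_limit_range(namespace: str) -> dict:
    """LimitRange giving containers default requests and limits."""
    return {
        "apiVersion": "v1",
        "kind": "LimitRange",
        "metadata": {"name": "tenant-defaults", "namespace": namespace},
        "spec": {"limits": [{
            "type": "Container",
            "default": {"cpu": "500m", "memory": "512Mi"},         # limits
            "defaultRequest": {"cpu": "100m", "memory": "128Mi"},  # requests
        }]},
    }
```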
Tenant-Scoped RBAC
Create a ServiceAccount per tenant namespace with a RoleBinding granting only the permissions needed within that namespace. This prevents a compromised tenant workload from using the Kubernetes API to access secrets or pods in other namespaces.
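The three objects involved can be generated together per namespace. In this sketch the names and the single read-only rule on ConfigMaps are placeholders; a real Role would list whatever the workload actually needs.

```python
def tenant_rbac(namespace: str, sa_name: str = "tenant-app") -> list:
    """ServiceAccount, Role, and RoleBinding scoped to one tenant namespace."""
    role_name = "tenant-app-role"
    rbac_api = "rbac.authorization.k8s.io"
    return [
        {"apiVersion": "v1", "kind": "ServiceAccount",
         "metadata": {"name": sa_name, "namespace": namespace}},
        {"apiVersion": f"{rbac_api}/v1", "kind": "Role",
         "metadata": {"name": role_name, "namespace": namespace},
         # Example rule only: read access to ConfigMaps in this namespace.
         "rules": [{"apiGroups": [""], "resources": ["configmaps"],
                    "verbs": ["get", "list"]}]},
        {"apiVersion": f"{rbac_api}/v1", "kind": "RoleBinding",
         "metadata": {"name": role_name + "-binding", "namespace": namespace},
         "subjects": [{"kind": "ServiceAccount", "name": sa_name,
                       "namespace": namespace}],
         "roleRef": {"apiGroup": rbac_api, "kind": "Role", "name": role_name}},
    ]
```

Using a namespaced Role and RoleBinding, rather than a ClusterRole with a ClusterRoleBinding, is what keeps the grant from reaching other tenants' namespaces.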
Helm Chart: One Chart, Values per Tenant
Maintain one Helm chart for the tenant workload. Deploy per tenant with a values-{tenant}.yaml override file. Use ArgoCD ApplicationSets with a list generator to auto-deploy a new namespace when a tenant record is added to the control plane database.
Automated Tenant Onboarding Pipeline
Tenant onboarding must be fully automated, idempotent (safe to retry), and observable. A manual provisioning step is a scaling bottleneck and an error source at 3am when a sales team closes a deal.
Account Creation
Capture email, company name, plan selection. Validate domain uniqueness. Generate a tenant UUID and slug. Write the tenant record to the control plane database with status = provisioning.
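A sketch of this step is shown below. It assumes the slug doubles as the tenant's subdomain and that uniqueness is checked against a set of existing slugs; in practice a unique constraint in the control plane database would be the real guard. The field names are hypothetical.

```python
import re
import uuid


def slugify(company: str) -> str:
    """Lowercase, hyphen-separated slug that is valid as a DNS label."""
    slug = re.sub(r"[^a-z0-9]+", "-", company.lower()).strip("-")
    slug = slug[:63].strip("-")  # DNS labels are limited to 63 characters
    if not slug:
        raise ValueError("company name yields an empty slug")
    return slug


def new_tenant_record(email: str, company: str, plan: str,
                      taken_slugs: set) -> dict:
    """Build the control plane row for a tenant that is about to be provisioned."""
    slug = slugify(company)
    if slug in taken_slugs:
        raise ValueError(f"slug {slug!r} is already taken")
    return {
        "id": str(uuid.uuid4()),
        "slug": slug,
        "email": email,
        "company": company,
        "plan": plan,
        "status": "provisioning",
    }
```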
Database / Schema Provisioning
Trigger an idempotent provisioning job (queue-backed). For Silo: provision RDS instance via Terraform module, wait for endpoint, store encrypted connection string. For Bridge: CREATE SCHEMA IF NOT EXISTS tenant_{uuid}, run Flyway migrations. For Pool: no action needed — rows are tenant-scoped automatically.
Stripe Subscription Creation
Create a Stripe Customer object and attach a Subscription to the selected price ID. Store stripe_customer_id and stripe_subscription_id on the tenant record. For paid plans, require card upfront (Stripe Checkout or Payment Element).
Welcome Email + Onboarding Flow
Send a transactional welcome email with the subdomain URL, a magic-link login, and a link to the onboarding checklist. Trigger in-app onboarding tasks (e.g., "Connect your first integration", "Invite a teammate").
Admin Panel Tenant Record
Update tenant status to active. Emit a tenant.created event to your internal event bus. Create a PagerDuty/OpsGenie escalation policy for this tenant if they are Enterprise SLA. Log all provisioning steps to the audit trail table.
Idempotency with Redis SETNX
Provisioning jobs may be retried on failure. Wrap the entire pipeline in a distributed lock in Redis with a 10-minute TTL. Plain SETNX cannot set an expiry, so use SET with the NX and EX options to create the key and its TTL in one atomic command. If the lock exists, skip the job (a previous run is in progress or succeeded). On success, write the completion state to the control plane DB so subsequent retries are instant no-ops.
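The lock-then-record pattern can be sketched as follows. `store` is assumed to expose a redis-py style `set(name, value, nx=..., ex=...)` method, which returns a falsy value when the key already exists; the callables and the key format are hypothetical.

```python
from typing import Callable

LOCK_TTL_SECONDS = 600  # the 10-minute TTL


def provision_once(
    store,
    tenant_id: str,
    is_done: Callable[[str], bool],
    run_pipeline: Callable[[str], None],
    mark_done: Callable[[str], None],
) -> str:
    """Run the provisioning pipeline at most once per tenant."""
    if is_done(tenant_id):
        return "already-done"  # completed earlier: instant no-op
    # NX creates the key only if absent; EX attaches the expiry atomically.
    acquired = store.set(
        f"provision:{tenant_id}", "1", nx=True, ex=LOCK_TTL_SECONDS
    )
    if not acquired:
        return "skipped-locked"  # another run holds the lock
    run_pipeline(tenant_id)
    mark_done(tenant_id)
    return "provisioned"
```

In this sketch the lock is never released explicitly and simply expires, so a crashed run delays the next retry by up to the TTL. Deleting the key on failure is a reasonable variation.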
Compliance and Data Isolation
Compliance frameworks have opinions about multi-tenancy. Choosing the wrong isolation model can disqualify you from enterprise deals or create audit findings. Here is how each framework maps to your architecture choices.
SOC 2 Type II
- Requires per-tenant audit trail — log every data access with tenant context
- Access logging: who accessed which tenant data and when
- Pool model requires RLS audit evidence for auditors
- Silo model simplifies access control audit scope
HIPAA
- PHI must be isolatable for Business Associate Agreement compliance
- Silo model (DB per tenant) is strongly preferred by auditors
- PHI in a pool model requires demonstrated RLS guarantees
- Encryption at rest per tenant DB simplifies breach notification scope
GDPR
- Right to erasure: can you delete all data for one tenant cleanly?
- Silo: drop the database. Bridge: DROP SCHEMA tenant_x CASCADE.
- Pool: DELETE WHERE tenant_id = ? across all tables — error-prone
- Data residency: silo allows per-tenant region selection
PCI DSS
- Cardholder data must never be commingled across entities
- Silo model required for any tenant storing raw card data
- Network segmentation at Kubernetes NetworkPolicy layer
- Dedicated nodes (node affinity) for PCI-scope tenant workloads
Per-Tenant Billing with Stripe
Multi-tenant SaaS billing is more complex than single-tenant because usage must be accurately attributed to each tenant before it is reported to the billing provider. Stripe's Meters API (introduced in 2024) is now the canonical solution for usage-based billing.
Metered Billing: Aggregate and Report
For each tenant, track API calls, storage bytes, and active seats in a Redis counter or time-series database (TimescaleDB). Every hour, aggregate per-tenant usage and report to Stripe Meters via the POST /v1/billing/meter_events endpoint.
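The hourly aggregation step might be sketched like this. It only builds the form parameters for the endpoint and makes no HTTP call. The payload keys `stripe_customer_id` and `value` assume the meter uses Stripe's default customer mapping and value settings, and the event name and identifier format are illustrative.

```python
from collections import Counter
from typing import Dict, Iterable, List

METER_EVENTS_PATH = "/v1/billing/meter_events"


def hourly_meter_events(
    calls: Iterable[str],
    customer_ids: Dict[str, str],
    hour_start: int,
    event_name: str = "api_calls",
) -> List[dict]:
    """Aggregate one hour of per-request tenant ids into meter event params.

    `calls` yields one tenant id per API request; `hour_start` is a Unix
    timestamp marking the start of the hour being reported.
    """
    counts = Counter(calls)
    events = []
    for tenant_id in sorted(counts):
        events.append({
            "event_name": event_name,
            "timestamp": hour_start,
            # A stable identifier lets a retried report be deduplicated.
            "identifier": f"{tenant_id}:{event_name}:{hour_start}",
            "payload[stripe_customer_id]": customer_ids[tenant_id],
            "payload[value]": str(counts[tenant_id]),
        })
    return events
```

Each dictionary would be sent as the form body of one POST to `METER_EVENTS_PATH`.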
Usage-Based Dimensions
| Dimension | How to measure | Reporting cadence |
|---|---|---|
| API calls | Redis INCR per request | Hourly |
| Storage (GB) | S3 ListObjectsV2 per tenant prefix | Daily |
| Active seats | COUNT(users) with last_login within the last 30 days | Monthly |
Credit System
Pre-paid credit models (common in AI SaaS) require a credit_balance table keyed by tenant. Decrement atomically using a PostgreSQL transaction with SELECT ... FOR UPDATE to prevent race conditions. Reject requests when the balance reaches zero, and trigger a low-credit notification when 20% of the purchased credits remain.
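To keep the example runnable without a PostgreSQL server, the sketch below uses SQLite, which has no SELECT ... FOR UPDATE. It swaps in a single conditional UPDATE instead, which gives the same no-overdraft guarantee and also works unchanged on PostgreSQL. The table layout and the `is_low` helper are assumptions.

```python
import sqlite3


def spend_credits(conn: sqlite3.Connection, tenant_id: str, amount: int) -> bool:
    """Atomically deduct credits; return False and change nothing if short."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "UPDATE credit_balance SET balance = balance - ? "
            "WHERE tenant_id = ? AND balance >= ?",
            (amount, tenant_id, amount),
        )
        # The WHERE clause makes check-and-decrement a single statement,
        # so two concurrent spenders cannot both pass the balance check.
        return cur.rowcount == 1


def is_low(balance: int, purchased: int, threshold: float = 0.20) -> bool:
    """True when the remaining balance is at or below 20% of what was bought."""
    return balance <= purchased * threshold
```

A `False` return is the signal to reject the request, and `is_low` would be checked after each successful spend to decide whether to send the notification.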
Proration on Mid-Month Upgrades
When a tenant upgrades their plan mid-billing-cycle, Stripe automatically calculates proration if you use proration_behavior: 'create_prorations' on the subscription update. Capture the proration preview before applying it and surface the cost delta to the tenant in a confirmation modal.
Ready to Build Your Multi-Tenant SaaS?
Codazz architects and builds scalable multi-tenant platforms — from model selection to Kubernetes deployment and compliance readiness. Book a free architecture review.