Architecture

•12 min read

Designing a multi-tenant B2B SaaS data model when you have ten customers, not ten thousand

A pragmatic decision guide for early-stage B2B SaaS teams choosing between shared-schema, schema-per-tenant, and database-per-tenant data models. Includes a default that survives the first hundred customers, the five markers that should push you to isolate, and how to design today so the migration is cheap when it comes.

Author: By DevLume
Published: Published 22 May 2026

Key takeaways

For a B2B SaaS with a handful of customers, the right default is one Postgres database, one schema, a tenant_id column on every tenant-scoped table, and Row-Level Security enforcing isolation in the database itself (AWS Prescriptive Guidance).

Schema-per-tenant looks tidy at five tenants and turns into a pg_dump and migration nightmare at five hundred. It is the model most teams regret first.

Database-per-tenant is rarely the right answer below twenty tenants — connection-pool math punishes it before you have a sales team to justify it (Crunchy Data).

Microsoft's own guidance flags the cost up front: "Switching to a different model later is sometimes costly" — so design the seams now even if you do not switch the model (Microsoft Learn).

The signals that justify isolating a tenant are specific and individually checkable: a contractual data-residency clause, a compliance regime your platform cannot satisfy in a pooled model, a single tenant whose workload starves the rest, or a customer paying enough that a dedicated database is cheaper than the negotiation.

TL;DR

If you have ten B2B customers, pick the pooled model. One database, one schema, tenant_id on every tenant-scoped table, Postgres Row-Level Security enforcing that no query ever returns another tenant's row. Set the tenant on every connection at the application boundary, not inside business logic. Resist the urge to give each customer their own schema or database "because it's cleaner" — the cleanliness is illusory and the operational tax shows up later, usually at the worst possible time. Spend the engineering budget you save on three things: a tenant-aware connection layer you trust, a migration tool that runs once per database (not once per tenant), and a way to promote a single tenant out of the pool to a dedicated database when one of five specific markers fires. Treat the pooled model as your default and the isolated model as an exception you can justify, not a starting position.

Why this is the wrong question to read other blog posts about

Most multi-tenant SaaS content on the web is written for an audience that does not exist yet. Search results are dominated by AWS reference architectures, Microsoft Azure SQL elastic-pool guides, and Snowflake-scale data platform pieces. They are not wrong, but their reader has thousands of tenants, a platform team, and a compliance officer. You probably have ten paying customers, two engineers, and a roadmap that does not have "rework the tenancy model" anywhere on it.

The cost of getting tenancy wrong is asymmetric. A pooled model that you eventually have to split per tenant is annoying but tractable — the data has a tenant_id on it and you can ETL it out one customer at a time. A schema-per-tenant model that you have to consolidate, or a database-per-tenant model that you have to operate before you have the headcount, can sink a quarter. So the decision worth making carefully at ten customers is not "which model wins at a thousand" — it is "which model is cheapest to live with at ten and cheapest to escape from if I am wrong."

The three models, briefly and honestly

Vendor documentation calls these things by different names. AWS uses silo, pool, and bridge (AWS). Microsoft uses standalone, database-per-tenant, and sharded multi-tenant (Microsoft Learn). The Postgres community usually says shared schema, schema-per-tenant, and database-per-tenant. They map roughly onto each other. The cleanest mental model is the third one.

Shared schema (pooled). One database, one set of tables, a tenant_id column on every row that belongs to a tenant. Every query filters by tenant. Isolation is enforced by the database via RLS, not by the application. Microsoft's own scaling table puts this model at "1–1,000,000s" of tenants — it has the widest range of any option (Microsoft Learn).

Schema-per-tenant. One database, one Postgres schema per tenant, the same tables created inside each schema. The application sets search_path based on the current tenant. Isolation is structural — no rows can leak because no tenant's tables are visible to another tenant's session. This sounds appealing for exactly as long as it takes to realise that every migration now runs N times, that pg_dump performance degrades non-linearly with schema count, and that the Postgres system catalogues are not designed for thousands of objects of the same shape.

Database-per-tenant (siloed). One Postgres database per customer, sometimes one cluster per customer. Maximum isolation, maximum operational overhead. The connection-pool math is brutal: PgBouncer pools are calculated per (user, database) combination, which means a database-per-tenant model will exhaust your max_connections long before you exhaust your CPU (Crunchy Data). AWS frames the threshold plainly: "If you have 20 siloed accounts for each of your tenants… that may be manageable. However, if you have a thousand tenants, that would likely begin to impact operational efficiency and agility" (AWS).

The default for ten customers: pooled with RLS

I have set this up enough times to call it the default. Pick the pooled model, and make Row-Level Security non-negotiable. AWS does not soften this — it says RLS "is required to maintain tenant data isolation in a pooled model with PostgreSQL" and that it "centralizes the enforcement of isolation policies at the database level and removes the burden of maintaining this isolation from software developers" (AWS Prescriptive Guidance).

The reason this matters is operational, not theoretical. The classic multi-tenant disaster is a missing WHERE tenant_id = $1 in a query somewhere in your codebase. RLS turns that from a data-leakage incident into a returned-empty-set bug — the database engine adds the predicate itself. AWS describes it as "an automated WHERE clause that the database engine manages itself… every SQL statement your developers write will look the same, regardless of tenant context, and PostgreSQL enforces isolation for you" (AWS Database Blog). That property — that you cannot accidentally write a leaky query — is what makes pooled tenancy survivable past customer five.

The pattern, in practice, is a session variable plus a policy. Set app.current_tenant_id on every connection checkout from your pool. Each policy compares the row's tenant_id against that variable. Craig Kerstiens, formerly of Heroku Postgres and Citus, summarises the approach succinctly: "Keep in mind that when designing your app for multi-tenancy, ideally you have that org_id in every table… Instead of a user per customer, what we're going to do is set a session variable when we connect" (Crunchy Data).

Two things are easy to get wrong here. First, the tenant variable must be set before the application code is allowed to issue queries — typically in a connection-acquisition middleware, not inside route handlers. Second, your superuser and migration roles must explicitly bypass RLS, but your application role must not. Test this with a hostile query: log in as your app user, set the tenant variable to one value, attempt to read a row belonging to another tenant by primary key, and confirm you get nothing back. That test should live in CI forever.

Why I will not start you on schema-per-tenant

The model that seduces small teams most often is schema-per-tenant. It looks structurally clean. Each customer's data is visibly partitioned. You can drop a customer by dropping a schema. Recoveries feel scoped. The trouble is that the appeal is aesthetic, and the costs are operational and they grow with you.

Three problems show up in this order. Migrations multiply: every schema change runs N times, against N schemas, and your migration tool needs to know how to do that transactionally. The complexity of that orchestration tends to be underestimated until the first time a migration fails halfway through tenant 47. Connection pooling stays a problem because search_path is per-session state, which means pooler reuse becomes risky if you do not reset it carefully. And the Postgres system catalogues — pg_class, pg_attribute, pg_namespace — accumulate one entry per object per schema. At a hundred schemas with twenty tables and forty indexes each, you have hundreds of thousands of catalogue rows, and operations that scan the catalogue (including pg_dump, schema introspection in ORMs, and some autovacuum work) get slower in ways that are hard to diagnose.

There are legitimate reasons to choose schema-per-tenant — strong logical isolation without separate clusters, the ability to drop a schema for compliance deletion, simpler per-tenant backup semantics. But none of those reasons apply at ten customers in a way the pooled model cannot also handle. RLS gives you isolation. Tombstoning a tenant gives you deletion. Logical backups can be scoped by tenant_id. You are paying the schema-per-tenant tax to solve problems you do not yet have.

Why database-per-tenant is rarely the right starting answer

Database-per-tenant is the maximum-isolation answer and it is genuinely cheaper to operate now than it was five years ago. Serverless Postgres providers have made the "one project per customer" pattern viable; Neon, for instance, positions database-per-tenant as the natural solution to data residency and HIPAA constraints, and notes that some customers operate hundreds of thousands of databases via their API (Neon). That is real, and it is a meaningful shift in the tradeoff space.

It is still not where most small B2B SaaS teams should start. The reasons are mostly the things vendor blogs do not lead with: managing N migrations, N sets of credentials, N monitoring targets, N backup configurations, and N upgrade windows. The work scales linearly with customers — exactly the kind of work that breaks small engineering teams. Cross-tenant reporting becomes a federated query problem instead of a GROUP BY tenant_id. New-feature rollouts get gated on per-tenant migration completion. The first time you have to ship a hot-fix and your CI takes forty-five minutes to migrate all the tenants serially, you will understand why pooled SaaS exists.

If your customer count is genuinely growing past a few hundred and your isolation requirements are real, you will end up somewhere in the bridge model: a default pool plus a small number of dedicated databases for the customers who actually need them. Notion ended up at "480 logical shards distributed across 32 physical databases" sharded by workspace ID, but they got there from a single monolith and only after VACUUM stalls became a routine production incident (Notion). They did not start there. Neither should you.

The five markers that should push you to isolate

Isolating a tenant is a specific decision with a specific cost. It is worth doing when you have a specific reason. Below are the five markers that, in my experience, actually justify lifting a customer out of the pool. Any one of them is enough.

1. A contractual data-residency clause you cannot satisfy in the pool. A customer's contract requires their data to be physically stored in Germany, or Australia, or US-East. If your pooled database is in a single region, you cannot honour that clause without isolating that tenant into a database hosted in the target region. This is the most common driver in regulated industries and the cleanest justification.

2. A regulatory regime that demands provable isolation. HIPAA, FedRAMP, certain SOC 2 controls under specific customer contracts, or a national data-protection regulator who has decided that "logical isolation via RLS" is not the same thing as "physical isolation." Microsoft's compliance guidance walks through how isolation level maps to these regimes (Microsoft Learn). Treat the question as a contractual and audit one, not a technical one — your security team's answer to "can you prove this customer's data never touches another customer's compute" is what determines whether you have to isolate.

3. A single noisy-neighbour tenant whose workload is starving the rest. Sometimes one customer is ten times the size of the next-largest. Their analytics queries hold locks; their writes saturate IOPS; their full-table scans evict everyone else's cache. The fix is not "buy a bigger instance" indefinitely — at some point it is cheaper to give that customer their own database.

4. A contract big enough that a dedicated database is cheaper than the negotiation. A six-figure enterprise customer asks for "dedicated infrastructure." The actual cost of a managed Postgres instance is small. The cost of arguing for two weeks about what "dedicated" means in your pooled architecture is large. Just give them the database. This is a commercial answer dressed up as a technical one, and that is fine.

5. A measurable scaling marker on the shared database. Figma's engineering team published a useful threshold from their own scaling story: by early 2022 their primary database was "approaching 65% CPU utilization during peak traffic," which is the point at which they began vertical partitioning (Figma). Pick a marker like this and write it down — sustained peak CPU above a threshold, replication lag above a threshold, P99 query latency on a critical table above a threshold — so that the decision to start isolating becomes an operational trigger, not a vibe.

What is conspicuously absent from this list is "we have a hundred customers and it feels like time." Customer count alone is not a marker. You can run a pooled model into the tens of thousands of tenants with good schema design; you should isolate a single tenant on month one if their contract requires it.

How to design today so isolation is cheap tomorrow

If the decision is "pooled now, with the option to isolate a tenant later," the cheap-to-escape version of that has four properties. Build them in from day one — they are nearly free at the start and expensive to retrofit.

First, every tenant-scoped table has tenant_id NOT NULL, indexed, and present on every secondary index that you query through. This is the property that makes a per-tenant export a one-line query instead of a multi-week ETL.

Second, the application acquires connections through a layer that knows the tenant, and that layer sets the session tenant variable before returning the connection to the caller. The application's data-access code should never see a tenant-less connection. If you cannot enumerate the call sites that set tenant context on one hand, you have a leak waiting to happen.

Third, migrations are written once and applied once per database, not once per tenant. This means avoiding any DDL that depends on tenant-specific naming. If you ever feel tempted to write CREATE TABLE customer_42_orders, stop.

Fourth, a tenant lift-and-shift script exists — a documented procedure for exporting one tenant's data from the pooled database, restoring it into a fresh dedicated database, switching the application's tenant-to-database routing, and decommissioning the source rows. It does not need to be automated on day one; it needs to exist as a runbook and to have been tested once on a real tenant in staging. The first time you need it will be under contract pressure with a deadline. You do not want to be designing it then.

These four properties cost almost nothing to implement at the start of a project. They cost a quarter of engineering work to retrofit two years later.

What to skip at ten customers

A short list of things vendor blogs will tell you to do that you should explicitly not do yet:

Sharding. You have ten tenants. Sharding solves problems that begin somewhere north of the ones you have. A correctly designed pooled model can be sharded later, by tenant, with the same effort either way.
A bespoke per-tenant configuration service. Until you have at least three configuration values that genuinely vary per tenant, a tenant_settings table is enough.
A "schema-evolution framework" that abstracts over your migration tool. Pick one migration tool and run its CLI. Frameworks-over-frameworks are pure cost at this stage.
Cross-cluster replication for analytics. If you have ten customers, a nightly logical dump to a separate reporting database is enough. Real-time analytics infrastructure should wait until a customer asks for it in a contract.

Save the budget for the four properties above. They are the ones that determine whether next year's tenancy decision is a Tuesday afternoon or a re-architecture.

Closing

The trap with multi-tenant data models is treating the decision as if it had to be made at FAANG scale. It does not. At ten customers, the best decision is the boring one: pooled tenancy with RLS, a tenant column on every table, connection-level isolation, and a written escape hatch for the customers who eventually justify a dedicated database. The model that wins at ten customers is also the model that survives to a thousand, provided you put the seams in early. The models that look cleaner in the architecture diagram tend to be the ones that cost most in the operational diagram, and you find out which is which the year you can least afford to.

DevLume designs and modernises data architectures for B2B platforms at exactly this stage — happy to talk if you would like a second pair of eyes on the tenancy model before you ship it.