Engineering 10 min read

Building a multi-tenant agent runtime

How we run thousands of customer agents on shared infrastructure, without leaks, noisy neighbors or surprise bills.

Yaroslav Demir
Principal Engineer
Mar 28, 2026

Why multi-tenant

Giving every customer dedicated infrastructure is wasteful, because most agents are idle most of the time. Multi-tenancy dramatically lowers the cost per customer, and the savings flow through to pricing.

But multi-tenant comes with hard problems: data isolation, fair scheduling, cost attribution. Get any of them wrong and you have a customer fire.

Data isolation

We use schema-per-tenant in Postgres for all customer data. The query layer enforces tenant scoping at the connection level. There's no API path that can read another tenant's data, full stop.
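As a rough illustration of connection-level scoping, here is a minimal sketch that builds the statement a query layer might run when it checks out a connection for a tenant. The `tenant_` schema prefix, the tenant ID format, and the function names are assumptions made for this example, not a description of our actual query layer.

```python
import re

# Hypothetical tenant ID format: lowercase letter, then letters, digits, underscores.
_TENANT_ID = re.compile(r"[a-z][a-z0-9_]{0,39}")


def tenant_schema(tenant_id: str) -> str:
    """Map a tenant ID to its schema name, rejecting anything unexpected."""
    if not _TENANT_ID.fullmatch(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    return f"tenant_{tenant_id}"


def scope_statement(tenant_id: str) -> str:
    """Build the statement that pins a connection to one tenant's schema.

    The ID is validated before it is interpolated, so the quoted
    identifier cannot be broken out of.
    """
    return f'SET search_path TO "{tenant_schema(tenant_id)}"'


if __name__ == "__main__":
    print(scope_statement("acme"))
```

With the search path pinned this way, unqualified table names resolve only inside the tenant's schema. A pooled connection would also need its search path reset before being handed to another tenant.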

Vector stores use namespaces per tenant. LLM context never crosses tenants. Agent memories are tenant-scoped at storage time.
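To make the namespace idea concrete, here is a toy in-memory store where every read and write requires a tenant ID, so a query can only ever rank vectors from its own namespace. The class and method names are hypothetical, and a production vector store would use its own namespace feature rather than this brute-force search.

```python
import math
from collections import defaultdict


def _cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class NamespacedVectorStore:
    """Toy vector store that keeps one isolated namespace per tenant."""

    def __init__(self):
        self._namespaces = defaultdict(list)  # tenant_id -> [(doc_id, vector)]

    def add(self, tenant_id, doc_id, vector):
        self._namespaces[tenant_id].append((doc_id, list(vector)))

    def query(self, tenant_id, vector, top_k=3):
        # Only the caller's namespace is searched; there is no cross-tenant path.
        candidates = self._namespaces.get(tenant_id, [])
        scored = sorted(
            ((_cosine(vector, v), doc_id) for doc_id, v in candidates),
            reverse=True,
        )
        return [doc_id for _, doc_id in scored[:top_k]]


if __name__ == "__main__":
    store = NamespacedVectorStore()
    store.add("tenant_a", "a-doc", [1.0, 0.0])
    store.add("tenant_b", "b-doc", [1.0, 0.0])
    print(store.query("tenant_a", [1.0, 0.0]))
```

Because the tenant ID is a required argument rather than an optional filter, forgetting it is an error instead of a silent cross-tenant read.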

Fair scheduling

The noisy neighbor problem: one customer running a huge campaign starves everyone else of inference capacity. We use weighted fair queuing on inference, with weights tied to plan tier and recent usage.
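The sketch below shows one simple way to get this behavior: a self-clocked approximation of weighted fair queuing, where each request gets a virtual finish time of its cost divided by the tenant's weight, and the request with the smallest finish time is served next. The class name, the `cost` parameter, and the fixed weights are assumptions for illustration. How a weight is derived from plan tier and recent usage is left out.

```python
import heapq
import itertools


class WeightedFairQueue:
    """Self-clocked approximation of weighted fair queuing."""

    def __init__(self, weights):
        self._weights = dict(weights)    # tenant -> weight (higher = larger share)
        self._last_finish = {}           # tenant -> finish tag of its newest request
        self._virtual_time = 0.0         # finish tag of the last request served
        self._heap = []
        self._seq = itertools.count()    # tie-breaker that keeps FIFO order

    def enqueue(self, tenant, request, cost=1.0):
        # A backlogged tenant queues behind its own earlier requests;
        # an idle tenant starts from the current virtual time.
        start = max(self._virtual_time, self._last_finish.get(tenant, 0.0))
        finish = start + cost / self._weights[tenant]
        self._last_finish[tenant] = finish
        heapq.heappush(self._heap, (finish, next(self._seq), tenant, request))

    def dequeue(self):
        finish, _, tenant, request = heapq.heappop(self._heap)
        self._virtual_time = finish
        return tenant, request

    def __len__(self):
        return len(self._heap)


if __name__ == "__main__":
    queue = WeightedFairQueue({"campaign": 1.0, "quiet": 1.0})
    for i in range(5):
        queue.enqueue("campaign", f"c{i}")
    queue.enqueue("quiet", "q0")
    while queue:
        print(queue.dequeue())
```

In the usage example, the quiet tenant's single request is served second even though it arrived behind five campaign requests, because its finish tag is computed from its own backlog rather than from the shared queue length.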

Bursts are absorbed by spillover capacity that costs slightly more. That extra cost is billed to the bursting customer, not absorbed by the platform.
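The attribution rule can be sketched as a small function that splits a tenant's usage into units served from its regular allocation and units served from spillover, then prices each at its own rate. All names and rates here are hypothetical, and amounts are kept in integer cents to avoid floating-point rounding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Charge:
    base_units: int
    spillover_units: int
    cost_cents: int


def attribute_cost(used_units, allocated_units,
                   base_cents_per_unit, spillover_cents_per_unit):
    """Split usage into base and spillover units and price each separately."""
    base = min(used_units, allocated_units)
    spillover = max(0, used_units - allocated_units)
    cost = base * base_cents_per_unit + spillover * spillover_cents_per_unit
    return Charge(base, spillover, cost)


if __name__ == "__main__":
    # Illustrative numbers only: 120 units used against an allocation of 100.
    print(attribute_cost(120, 100, base_cents_per_unit=10,
                         spillover_cents_per_unit=12))
```

A tenant that stays within its allocation has zero spillover units, so only the tenant that actually burst pays the higher rate.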

#architecture #engineering
Yaroslav Demir
Principal Engineer

Owns platform reliability. 10+ years building high-throughput systems. Will defend Go in any thread.
