
Agents Run in Their Own Context

When you talk to a thinqOS agent, it isn't a voice inside your chat: it's a separate participant with its own context, memory, and model. That separation goes down to the database query, which is what makes it safe to put a private agent in a room with a stranger.

When you talk to an AI agent in most products, the agent is a voice inside your session. Its instructions, its memory, the conversation, your data: all of it lives in one shared context window, one stream of tokens the model reads top to bottom. "Multiple agents" usually means one model wearing several hats in the same transcript. And "privacy between them" usually means a sentence in the prompt politely asking the model to keep things separate.

thinqOS works differently, and the difference is not cosmetic. When you talk to a thinqOS agent, you're talking to a separate participant that runs in its own context, with its own memory, its own model, and its own boundaries. Those boundaries are enforced not by instructions but by the system itself, down to the database query. This paper explains how that works, and why it is what makes it safe to put a private agent in a room with a stranger.

The default, and why it's fragile

The shared-context design is everywhere because it's the path of least resistance. One window, one model, everything in scope. To give such a system "roles," you write persona instructions. To give it "privacy," you write more instructions: don't reveal X to Y.

The problem is that prompt-level rules are advisory. If a piece of private information is present in the context window, the only thing stopping it from coming out is the model's continued good behavior through a long conversation, an adversarial question, or a cleverly worded request. The data is in the room. You're trusting the model not to mention it. That's not a privacy boundary; it's a hope.

Isolation by construction

In thinqOS, each agent's turn runs in a genuinely separate execution context. In production, when an agent responds, it's given its own database session, its own context assembler, its own provider connection, and its own usage tracker: a fresh, scoped world for that turn. Its system prompt is assembled from its identity: its persona and behavior, its own private memory store, its own credentials and billing policy. Two agents in the same conversation are not two voices in one window; they're two participants, each built up independently.
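As a rough illustration of "a fresh, scoped world for that turn", the sketch below builds one context object per agent per turn. It is not thinqOS code: `TurnContext`, `build_turn_context`, and the integer `session_id` standing in for a database session are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical stand-in for opening a new database session per turn.
_session_ids = count(1)


@dataclass
class TurnContext:
    """Illustrative per-turn world for one agent (all names are assumptions)."""
    agent_id: str
    session_id: int                              # stands in for a dedicated DB session
    system_prompt: str                           # assembled from this agent's identity only
    usage: list = field(default_factory=list)    # stands in for a per-turn usage tracker


def build_turn_context(agent_id: str, persona: str, memories: list[str]) -> TurnContext:
    # The prompt is built only from this agent's own persona and memory.
    prompt = "\n".join([persona, *memories])
    return TurnContext(agent_id=agent_id, session_id=next(_session_ids), system_prompt=prompt)


planner = build_turn_context("planner", "You are a financial planner.", ["Owner prefers index funds."])
editor = build_turn_context("editor", "You are a copy editor.", ["Owner writes in British English."])
```

Nothing in `editor` was derived from `planner`: the two contexts have different sessions, different prompts, and different usage lists, which is the property the paragraph above describes.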

And they run at the same time. When several agents respond to the same message, they execute concurrently on separate sessions rather than being serialized into a single stream, with care taken that each agent's model-usage records attach to the right agent even when they finish out of order. (In test and non-streaming modes they share a session and run serially; the isolation there is logical rather than physical. We mention it because it's true.)
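One way to picture concurrent turns with correctly attributed usage is the sketch below, which uses Python's `asyncio`. It assumes each turn returns its own usage record tagged with the agent's id, so attribution does not depend on completion order. `UsageRecord`, `run_agent_turn`, and the sleep that stands in for a model call are hypothetical.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class UsageRecord:
    agent_id: str
    tokens: int


async def run_agent_turn(agent_id: str, delay: float, tokens: int) -> UsageRecord:
    # The sleep stands in for a streaming model call on the agent's own session.
    await asyncio.sleep(delay)
    # The record carries its agent id, so it cannot be misattributed later.
    return UsageRecord(agent_id=agent_id, tokens=tokens)


async def respond_concurrently(turns):
    finished_order = []

    async def tracked(agent_id, delay, tokens):
        record = await run_agent_turn(agent_id, delay, tokens)
        finished_order.append(record.agent_id)
        return record

    # gather runs the turns concurrently and returns results in input order.
    records = await asyncio.gather(*(tracked(*turn) for turn in turns))
    return records, finished_order


records, finished_order = asyncio.run(
    respond_concurrently([("planner", 0.05, 120), ("editor", 0.01, 80)])
)
usage_by_agent = {record.agent_id: record.tokens for record in records}
```

Here the editor finishes before the planner, yet each token count still lands on the right agent because the id travels with the record rather than being inferred from timing.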

Delegation passes answers, not context

Agents can consult each other. When one agent calls another, the consulted agent doesn't get pulled into the first agent's context window. Instead, a separate, bounded side-conversation is created; the second agent answers there, in its own context, and only its answer comes back, as a tool result. The first agent never sees the second agent's reasoning, its tools, or its prompt. Delegation moves answers between agents, not context. There's no bleed.
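The shape of that hand-off can be sketched in a few lines. This is an illustration under assumptions, not the thinqOS implementation: `Agent`, `consult`, and `fake_model` are hypothetical, and the "model" only echoes the question.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical agent: a name, a private prompt, and its own transcript."""
    name: str
    system_prompt: str
    transcript: list = field(default_factory=list)


def fake_model(system_prompt: str, messages: list) -> str:
    # Stand-in for a real model call; it only echoes the last message.
    return f"answer to: {messages[-1]}"


def consult(caller: Agent, callee: Agent, question: str) -> str:
    """Run the callee in a bounded side-conversation and return only its answer."""
    side_conversation = [question]                      # new, separate message list
    answer = fake_model(callee.system_prompt, side_conversation)
    side_conversation.append(answer)                    # stays on the callee's side
    caller.transcript.append(("tool_result", answer))   # only the answer crosses over
    return answer


planner = Agent("planner", "PRIVATE: planner persona and memory")
lawyer = Agent("lawyer", "PRIVATE: lawyer persona and memory")
planner.transcript.append(("user", "Can I gift shares to my sister?"))
reply = consult(planner, lawyer, "Any legal limits on gifting shares?")
```

The caller's transcript gains exactly one entry, the tool result. The callee's prompt and the side-conversation never appear in it.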

The part that matters: privacy enforced at the query

Here's the scenario that makes the architecture concrete. You have a financial-planner agent. One-on-one, it knows your finances and discusses them freely. Now you open a room and invite a colleague or an anonymous participant, and the same agent is present. Can it still take part without leaking what it knows about you?

In a prompt-instruction system, the answer is "probably, if the model behaves." In thinqOS, the answer is enforced before the model is ever involved.

Every belief in a Mind is a belief held by someone, carrying a source, a confidence, and an audience: the set of identities it has been disclosed to. When an agent assembles its context for a turn, candidate beliefs pass through an audience filter in the database query itself, a containment check that asks: is everyone currently in this room already in this belief's disclosed-to set? A belief disclosed only to you fails that check the instant a stranger is present, so it is never retrieved into the agent's context in the first place. A belief outside the room is never even a candidate.

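The audience set is rebuilt every single turn from who is actually in the room, and beliefs accumulate their disclosure scope automatically as conversations happen. So the sequence is exactly what you'd want: one-on-one, a belief disclosed only to you passes the check and is retrieved; the moment a stranger joins, the rebuilt audience is no longer contained in that belief's disclosed-to set, and the belief is not retrieved.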

The agent doesn't decide to withhold your financials. It is simply never handed them. The punchline is the whole point: the agent can't leak what it was never given.
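The containment check described above can be sketched with Python's built-in `sqlite3` module. The schema (`belief`, `belief_audience`, a temporary `room` table) and the `retrieve` function are assumptions made for this example, not the thinqOS schema. The point is only that the filter lives in the `WHERE` clause: a belief is returned only if no one in the room is missing from its disclosed-to set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE belief (id INTEGER PRIMARY KEY, content TEXT);
CREATE TABLE belief_audience (belief_id INTEGER, identity TEXT);
CREATE TEMP TABLE room (identity TEXT);
INSERT INTO belief VALUES (1, 'Owner has a mortgage'), (2, 'Meeting is on Friday');
INSERT INTO belief_audience VALUES (1, 'owner'), (2, 'owner'), (2, 'colleague');
""")


def retrieve(conn: sqlite3.Connection, room: list[str]) -> list[str]:
    """Return beliefs whose audience contains every identity currently in the room."""
    # Rebuild the room from scratch on every call, mirroring "rebuilt every turn".
    conn.execute("DELETE FROM room")
    conn.executemany("INSERT INTO room VALUES (?)", [(member,) for member in room])
    rows = conn.execute("""
        SELECT b.content
        FROM belief AS b
        WHERE NOT EXISTS (              -- no room member ...
            SELECT 1 FROM room AS r
            WHERE NOT EXISTS (          -- ... who is missing from the audience
                SELECT 1 FROM belief_audience AS a
                WHERE a.belief_id = b.id AND a.identity = r.identity
            )
        )
        ORDER BY b.id
    """).fetchall()
    return [content for (content,) in rows]


alone = retrieve(conn, ["owner"])
with_colleague = retrieve(conn, ["owner", "colleague"])
with_stranger = retrieve(conn, ["owner", "stranger"])
```

With only the owner present, both beliefs come back. With the colleague present, only the Friday meeting does. With a stranger present, nothing does, so there is nothing in the agent's context to leak.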

The honest boundary

A privacy claim is only worth as much as its edges, so here are ours, stated plainly.

This retrieval-level guarantee applies to base beliefs under standard (supervised-tier) agents, which are also mechanically scoped to their own knowledge bases. That's the airtight case, and it's the default.

There is one mode where enforcement becomes behavioral rather than mechanical: an owner can explicitly grant an agent full autonomy. In that mode the agent retrieves against its owner's full knowledge and is guided by an instruction to exercise judgment about disclosure. That's a prompt-level boundary, not a query-level one, and it exists only because an owner deliberately chose it. We don't claim mechanical privacy for that mode, and you shouldn't either.

One more, in the spirit of full disclosure: the same audience-scoping had to be extended to the abstractions the consolidation process forms from groups of beliefs. An earlier version could generalize narrowly disclosed beliefs into more broadly visible ones; that derivation has been corrected to inherit only the disclosure scope shared by all its sources, with existing data backfilled. Base beliefs were always scoped correctly; the fix closed the gap on derived ones.
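"Only the disclosure scope shared by all its sources" is a set intersection. A minimal sketch, assuming audiences are modeled as sets of identity strings (`derived_audience` is a hypothetical name):

```python
from functools import reduce


def derived_audience(source_audiences: list[set[str]]) -> frozenset[str]:
    """Audience of a derived belief: identities present in every source audience."""
    if not source_audiences:
        # No sources means nothing has been disclosed to anyone.
        return frozenset()
    return frozenset(reduce(lambda left, right: left & right, source_audiences))


# One source was disclosed to the owner and a colleague, the other only to the owner.
scope = derived_audience([{"owner", "colleague"}, {"owner"}])
# Sources with no identity in common yield an abstraction visible to no one.
disjoint = derived_audience([{"owner"}, {"colleague"}])
```

Under this rule an abstraction can never be visible to someone who could not already see every belief it was formed from.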

Many models, many minds, one conversation

Because each agent is its own participant, each can run on its own model. One conversation can have three agents answering on three different backends, chosen per agent: one on Anthropic, one on OpenAI, one on a local model.

But the deeper point isn't three models; it's three minds. Each agent brings not just a different engine but its own memory and its own perspectival stances on what it knows. You're not getting one model in three costumes. You're getting three participants who know different things, believe them to different degrees, and reason from different vantage points, in the same room, at the same time.

Accountability underneath

Isolation could easily become a blind spot: lots of independent agents, no unified view of what they're doing or what they cost. It isn't. Every model call, by every agent, on every provider, flows through a single tracked client that records one row per call: which agent, which capability, how many tokens, what it cost, who funded it. This is enforced in the codebase itself: there's no sanctioned path to a model that skips it. Agents run in separate contexts; the ledger still sees all of them.
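The "one row per call" idea can be illustrated with a single client object that every model call must pass through. The class and field names below (`TrackedClient`, `LedgerRow`, `cost_micros`) are assumptions for this sketch. The word-count token estimate and the prices are made up, and the model reply is a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerRow:
    agent_id: str
    capability: str
    provider: str
    tokens: int
    cost_micros: int   # integer micro-units avoid float rounding in cost totals
    funded_by: str


class TrackedClient:
    """Hypothetical single gateway to all models; every call appends one row."""

    def __init__(self, price_micros_per_token: dict[str, int]):
        self._prices = price_micros_per_token
        self.ledger: list[LedgerRow] = []

    def call(self, *, agent_id: str, capability: str, provider: str,
             prompt: str, funded_by: str) -> str:
        reply = f"[{provider}] reply"                       # placeholder for a model call
        tokens = len(prompt.split()) + len(reply.split())   # crude word-count estimate
        cost = tokens * self._prices[provider]
        self.ledger.append(LedgerRow(agent_id, capability, provider, tokens, cost, funded_by))
        return reply


client = TrackedClient({"anthropic": 3, "local": 0})
client.call(agent_id="planner", capability="chat", provider="anthropic",
            prompt="Summarize my budget", funded_by="owner")
client.call(agent_id="editor", capability="chat", provider="local",
            prompt="Fix this sentence please", funded_by="owner")
total_cost_micros = sum(row.cost_micros for row in client.ledger)
```

Two agents on two providers produce two rows in one ledger, so the isolated contexts still roll up into a single view of usage and cost.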

Why it adds up

Separate execution, query-level privacy, and true model-and-mind plurality combine into something a shared-context assistant can't offer: you can put a private specialist agent into a shared room and trust it to participate, because the guarantee that it won't leak lives in the query plan, not in a politely worded instruction. And because the audience lives on each belief, the isolation guarantee travels with the mind: a Mind exported with its identifiers intact carries its disclosure scope with it, so it can join a wider graph instead of being locked in. The agent isn't a voice in your context. It's someone else in the room, with its own mind and its own boundaries, which the system keeps for it.

Part of the thinqOS science series, by AI4Outcomes.
