
From Capabilities to Responsibilities

Human-in-the-Loop becomes an operational bottleneck

In my previous article, “The Missing Layer in Agentic AI,” I argued that AI agents need a deterministic execution kernel—a privileged “Kernel Space” that validates every proposed action before it touches the real world. That article focused on what happens at the execution boundary: idempotency, JIT state verification, and DFID-correlated telemetry. But establishing that boundary immediately raises a natural question: who exactly is crossing it, and under what authority?

The focus here is on a narrower and more demanding class of systems. We are not looking at RAG chatbots, research copilots, or lightweight assistants that only retrieve and summarize information. The target is high-stakes agentic systems: systems allowed to mutate external state by moving money, changing infrastructure, or modifying critical records. The approach presented here is not a general-purpose agent framework; it is an enforcement pattern for side-effectful systems.

High-stakes AI systems must be designed around responsibilities, not capabilities.

The industry’s current answer is unsatisfying: Human-in-the-Loop (HITL). In development environments and low-frequency pipelines, routing uncertain decisions to a human can be defensible. In production systems operating at scale—dozens of agents, hundreds of decisions per hour—it becomes the Scalability Trap.

Figure 1: The Human-in-the-Loop (HITL) model degrades into an operational bottleneck, substituting true governance with alert fatigue and unverified execution.

Operationally, the failure is simple. An agent flags a decision for review. A human approves it. Then another arrives, then dozens more. The queue grows. The human begins clicking through. They stop reading the JSON payloads. They click “Approve” because the backlog is piling up, the meeting starts in ten minutes, and nothing has gone catastrophically wrong yet. That is alert fatigue: governance degrades into manual throughput management. The problem is not human weakness; it is governance-layer technical debt created by routing too many binary decisions through a manual queue.

Tyler Akidau captured the broader issue in “Posthuman: We All Built Agents. Nobody Built HR,” echoing Tim O’Reilly’s call for the missing protocols of the AI era: the industry has invested heavily in agent capability, but far less in the infrastructure that governs authority, constraint, and accountability.

Scalable AI does not mean hiring more reviewers to supervise more bots. It means changing the governance model entirely. The scalable alternative is Governance by Exception: Humans design policy, the runtime enforces it, and only truly exceptional cases are escalated.

From capabilities to responsibilities—what a responsibility-oriented agent actually is

The dominant framing in enterprise AI asks a single question: What can this agent do? What tools does it have? What APIs can it call? This is the capabilities frame. It is natural, it is intuitive, and in production systems it is the wrong frame entirely.

In organizational design, a role is stable and assigned. Much like Role-based access control (RBAC) in traditional software, it defines what someone is authorized to do, independent of the tasks they happen to be executing. We cannot dictate how a person thinks, but we can strictly bound what they are permitted to do. A responsibility statement makes that boundary explicit. In software, we somehow forgot this distinction, hoping that raw intelligence—better models, tighter prompts, improved alignment—would be a sufficient guardrail.

The difference becomes clearer across some enterprise domains:

Finance: A capability is “can execute equity trades.” A responsibility is “authorized to execute up to $50,000 per order, in highly liquid equities only, with a maximum daily drawdown of 2%.”

Healthcare Operations: A capability is “can reschedule patient appointments.” A responsibility is “authorized to re-book non-critical outpatient visits within a 14-day window, strictly avoiding specialist double-booking.”

Supply Chain: A capability is “can reroute freight.” A responsibility is “authorized to redirect non-hazardous cargo up to a maximum SLA penalty budget of $5,000.”

In systems where agents touch money, medical records, or physical logistics, the gap between these two statements is the gap between a demo and a production deployment.

The current paradigm often handles this gap with prompts. Give the LLM an API key, tell it to “be careful with position sizing,” and hope alignment holds under adversarial inputs, unusual market conditions, and the seductive logic of edge cases. In low-risk contexts that may be tolerable. In high-stakes systems with real-world side effects, it is not a sufficient control surface.

This distinction is not new. Distributed systems solved a similar problem decades ago.

Carl Hewitt’s Actor model—introduced in 1973—gives us a useful foundation. An Actor is an independent computational entity with its own state, its own behavior, and its own messaging interface. Actors do not share state. They communicate only by passing messages. Crucially, an Actor’s behavior is bounded—defined by what messages it accepts, not by an open-ended capability set.

The Responsibility-Oriented Agent (ROA) does not invent a new distributed-systems primitive. Instead, it composes proven patterns—bounded actors, RBAC-style authority envelopes, audit trails, and execution-boundary validation—around an unpredictable LLM core. In truth, ROA is closer to a decision actor than a full computational actor: It maintains its own internal state but does not directly mutate the external world. Within a stable role, a fixed mission, and a machine-enforceable contract, it receives business events, reasons over relevant context, and emits a PolicyProposal for the Runtime to validate.

Its job is epistemic, not executive. It explains the situation and structures intent. But unlike traditional Actors, an ROA agent is defined by strict separation of concerns. In its reference form, credentials reside outside the agent’s reach. It opens no direct execution channel to external systems and writes no state by itself. An ROA agent may use tools to gather context (read-only operations within its sandbox, like querying a knowledge base), but authority for state-mutating actions remains downstream of deterministic validation and execution gates. The only state-changing step attributable to the agent is emit_policy_proposal()—a structured, typed claim that it wants the system to do something. ROA shapes the form of intent; the Runtime decides whether that intent is allowed to become action.
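To make that interface concrete, here is a minimal Python sketch of one decision cycle, not a reference implementation: the PolicyProposal fields mirror the underwriting sample used later in this article, while KernelIntake, run_decision_cycle, and the context dictionary are illustrative placeholders.

from dataclasses import dataclass
from typing import Protocol

@dataclass(frozen=True)
class PolicyProposal:
    agent_id: str
    total_insured_value: float
    premium: float
    industry: str
    justification: str   # narrative metadata for auditors, never parsed for execution
    confidence: float    # self-assessed, not authoritative

class KernelIntake(Protocol):
    # The only boundary the agent knows: a single submission call.
    def submit(self, proposal: PolicyProposal) -> None: ...

def run_decision_cycle(agent_id: str, context: dict, kernel: KernelIntake) -> None:
    # Explain: the agent reasons over the compiled context (LLM call elided here).
    # Policy: the resulting intent is structured as a typed claim.
    proposal = PolicyProposal(
        agent_id=agent_id,
        total_insured_value=context["tiv"],
        premium=round(context["tiv"] * 0.02),
        industry=context["industry"],
        justification="Premium at ~2% of TiV; no prohibited industry indicators found.",
        confidence=0.81,
    )
    # emit_policy_proposal(): the agent's only state-changing step.
    kernel.submit(proposal)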

This separation is the architecture’s most important property. Five engineering pillars define what it means in practice—each addressing a different failure mode at the reasoning–execution boundary—and together they transform an LLM from a probabilistic tool into a governable, accountable system component.

To make this concrete, imagine an underwriting agent on the London commercial market receiving a property submission. It reads the documents and produces an Explain narrative. It then emits a PolicyProposal for a quote. But the property value is £15M and its contract caps authority at £10M. The proposal reaches the Kernel, where the Runtime evaluates the YAML contract deterministically, rejects execution, and transitions the flow to ESCALATED. The senior underwriter is no longer reviewing every £2M submission. They are pinged only for this specific £15M exception. That is Human-Over-The-Loop in one decision.
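A minimal sketch of that Kernel-side check, assuming a single authority cap and a two-state outcome; the DecisionState and validate_authority names are illustrative, not part of any published API:

from enum import Enum

class DecisionState(Enum):
    VALIDATED = "VALIDATED"
    ESCALATED = "ESCALATED"

def validate_authority(proposed_value: float, max_authority: float) -> DecisionState:
    # Deterministic contract check: no LLM involvement at this boundary.
    if proposed_value > max_authority:
        return DecisionState.ESCALATED   # routed to a human, with full context attached
    return DecisionState.VALIDATED

# A £15M submission against a £10M authority cap escalates; a £2M one does not.
assert validate_authority(15_000_000, 10_000_000) is DecisionState.ESCALATED
assert validate_authority(2_000_000, 10_000_000) is DecisionState.VALIDATED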

The engineering pillars of an ROA

Pillar 1: Responsibility contract—authority encoded in code

If role defines the class of decisions the agent may handle, the Responsibility Contract defines the hard boundaries of that authority. The agent’s authority envelope is not a prompt. It is a versioned, machine-readable contract registered with the Agent Registry—the Kernel’s single source of truth for agent identity. A key property applies here: Prompts are suggestions. Code is enforcement. A prompt saying “do not exceed $10,000 per trade” can be creatively reinterpreted by a sufficiently motivated model or overridden by a carefully crafted prompt injection. A contract field max_order_size_usd: 10000.0 validated by deterministic runtime code is materially harder to bypass than a natural-language instruction. In the reference architecture, contracts are deployed out of band—agents do not self-register and do not read or modify their own contract.

There is a second-order consequence of this design that is easy to overlook: role definition automatically scopes the data context the agent requires. If an underwriting agent is contractually limited to HOME_STD and HOME_PLUS policy types in the LOW and MEDIUM risk tiers, the Context Compiler—which assembles the agent’s working snapshot before each inference call—needs to supply only the signals relevant to those dimensions. Market data for commercial property, flood zone statistics for excluded risk tiers, and regulatory data for other product lines are simply not in scope. The context is deterministically narrowed by the contract.

This matters for a concrete LLM engineering reason. In practice, models often become less reliable as their working context expands, including the class of effects practitioners describe as Lost in the Middle. A tightly scoped role is not just a governance convenience; it is an architectural mechanism for keeping the agent’s working context small enough to reason over reliably. A general-purpose agent handed an unconstrained context window of everything possibly relevant is more likely to degrade than a contract-bounded agent operating in a defined domain.
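As a rough illustration of that narrowing, a Context Compiler can filter candidate signals by the contract's declared scope before any inference call. The signal and contract shapes below are assumptions made for the sketch, not the reference schema:

def compile_context(signals: list[dict], contract: dict) -> list[dict]:
    """Keep only signals whose policy type and risk tier fall inside the contract."""
    allowed_types = set(contract["policy_types"])   # e.g. {"HOME_STD", "HOME_PLUS"}
    allowed_tiers = set(contract["risk_tiers"])     # e.g. {"LOW", "MEDIUM"}
    return [
        s for s in signals
        if s["policy_type"] in allowed_types and s["risk_tier"] in allowed_tiers
    ]

signals = [
    {"policy_type": "HOME_STD", "risk_tier": "LOW", "payload": "flood score 3/10"},
    {"policy_type": "COMMERCIAL", "risk_tier": "HIGH", "payload": "market data"},
]
contract = {"policy_types": ["HOME_STD", "HOME_PLUS"], "risk_tiers": ["LOW", "MEDIUM"]}
print(compile_context(signals, contract))   # only the HOME_STD / LOW signal survives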

In the insurance underwriting sample, that Responsibility Contract could be configured like this:

agents:
  - agent_id: "underwriter_agent"
    version: "1.0.0"
    created_by: "compliance@example.com"
    created_at: "2025-02-17T10:00:00Z"
    mission: |
      You are an insurance underwriter. Analyze the client application and propose
      a policy. Base premium on Total Insured Value (TiV) at ~2% of TiV, capped at max_tiv.
      NEVER propose for Fireworks or CryptoMining industries – these are prohibited.
    contract:
      role: EXECUTOR
      max_tiv: 3000000
      prohibited_industries: ["Fireworks", "CryptoMining"]
      escalate_on_uncertainty: 0.65
Pillar 2: Mission—The North Star

Mission is immutable at runtime. If the Responsibility Contract defines what the agent may do, Mission defines what it is trying to optimize within those boundaries. This distinction is operationally important: the Contract defines the admissible action space, while the Mission defines the ranking logic inside that space. Contract answers may; Mission answers should. Two agents can share the same authority envelope and still optimize for different business outcomes, as long as both remain inside the same hard boundary.

In the ROA architecture, Mission is a deployment artifact with two surfaces: a human-readable mission_statement used by the agent as a reasoning guide, and a machine-verifiable mission_context_hash used by the Runtime to enforce integrity.

mission_statement: "Minimize SLA penalties in logistics rerouting. Prioritize low-cost carriers."
mission_context_hash: "sha256:a3f9b2c1…"  # Kernel-computed at deployment time, strictly immutable

The deterministic Kernel does not interpret the mission_statement text. The agent uses that text internally as a reasoning guide, while the Runtime enforces mission integrity by comparing the mission_context_hash in the proposal with the immutable value registered in the Agent Registry. If prompt injection or runtime drift changes the agent’s objective, the hash no longer matches and the proposal is rejected without semantic interpretation. The hash is one implementation; the requirement is deterministic integrity at the boundary.
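A minimal sketch of that integrity check using Python's hashlib; the registry is stubbed as a dictionary, the agent_id is hypothetical, and the exact hashing convention is an assumption:

import hashlib

def mission_hash(mission_statement: str) -> str:
    return "sha256:" + hashlib.sha256(mission_statement.encode("utf-8")).hexdigest()

# Computed by the Kernel at deployment time and stored in the Agent Registry.
registry = {
    "reroute_agent": mission_hash(
        "Minimize SLA penalties in logistics rerouting. Prioritize low-cost carriers."
    )
}

def mission_intact(agent_id: str, proposal_hash: str) -> bool:
    # No semantic interpretation: a drifted or injected mission simply fails to match.
    return registry.get(agent_id) == proposal_hash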

A Mission is defined at deployment and evolves only through a deliberate, version-controlled update to the contract—not through prompt tweaking, user feedback, or runtime negotiation. In practical terms, Mission keeps optimization policy under change control. An agent whose mission drifts with each conversation is not a durable production actor; it is a session.

Pillar 3: Epistemic isolation—claims, not commands (Explain versus Policy)

If Contract defines the boundary and Mission defines the objective, Epistemic Isolation defines the only acceptable form of output. An ROA agent interacts with the world exclusively through structured, typed PolicyProposal artifacts. The agent’s output is an untrusted claim—an assertion that it wants the system to do something—and the Runtime treats it precisely as such.

This property is what makes the ROA + Runtime pattern materially more resistant to prompt injection. If an injection bypasses the LLM’s reasoning guardrails, the corrupted output still arrives as a typed proposal carrying an agent_id. If the proposal asks to transfer funds, but the agent’s contract lacks that authority, the Runtime rejects it with RBAC_DENIED. Security derives from deterministic enforcement at the execution boundary, not from trusting LLM alignment.

To cleanly bridge probabilistic thinking to deterministic claims, ROA agents produce decisions through a structured internal workflow with a strict separation between Explain and Policy:

Explain: Agent interprets context and articulates the situation in natural language (e.g., “Flood risk score 3/10…”). This creates a narrative artifact for human auditors. It is never parsed for execution logic.

Policy: Agent formulates a structured PolicyProposal carrying the execution-relevant fields the Runtime can validate deterministically. In the underwriting sample, that looks like this:

proposal = PolicyProposal(
    total_insured_value=2_750_000,
    premium=55_000,
    industry="Commercial Property",
    justification="TiV remains below delegated max_tiv and no prohibited industry indicators were found.",
    confidence=0.81,
)

The binding fields (total_insured_value, premium, industry) drive deterministic validation, while justification and confidence remain observability metadata for audit and escalation.

That separation is what makes the evidence model clean: The narrative remains human-readable, the policy remains machine-enforceable, and both can be bound to the same decision lineage without allowing free text to leak into execution.

Pillar 4: Epistemic longevity—memory across decision cycles

Once the agent has a stable role, a fixed mission, and a disciplined output interface, continuity across decision cycles becomes meaningful. This is the pillar most absent from practical implementations—and the one most responsible for a specific class of production failures: the infinite rejection loop.

ROA agents are not stateless inference calls. They are long-lived entities that maintain a decision trajectory across multiple cycles—a Kernel-managed record of prior proposals, their validation outcomes, and the business consequences of those decisions.

The same scoping logic that constrains authority also determines whether memory is meaningful. A long-lived agent operating within a stable role accumulates history from the same class of decisions under similar constraints—past actions and their outcomes are genuinely causally related. A general-purpose assistant handed unrelated tasks may still notice patterns, but those correlations are rarely operationally reliable. Focused responsibility is what separates signal from coincidence in the agent’s memory.

The failure mode this prevents has a name: decision amnesia. Without longevity, the agent repeats the same rejected intent because the rejection is not part of the next decision cycle.
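As a rough sketch of how longevity closes that loop, the next cycle's working context can carry the agent's recent proposals and their validation outcomes; the record shape below is illustrative, not the Kernel's actual trajectory format:

from collections import defaultdict

trajectories: dict[str, list[dict]] = defaultdict(list)

def record_outcome(agent_id: str, dfid: str, proposal: dict, verdict: str, reason: str) -> None:
    trajectories[agent_id].append(
        {"dfid": dfid, "proposal": proposal, "verdict": verdict, "reason": reason}
    )

def context_with_memory(agent_id: str, fresh_context: dict, window: int = 5) -> dict:
    # Prior rejections travel into the next cycle, so the agent cannot
    # re-emit the same rejected intent in blissful ignorance.
    return {**fresh_context, "recent_decisions": trajectories[agent_id][-window:]}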

Pillar 5: Auditability—every decision reconstructable

Every PolicyProposal carries a Decision Flow ID (dfid) that binds it to the full decision context. Rather than dumping unstructured logs, the system constructs a reconstruction primitive—a relational trace connecting:

The Input: The exact Context Snapshot (T0) the agent reasoned against.

The Validation: The outcome evaluated against the Responsibility Contract.

The Outcome: The final execution receipt.

This correlated record enables answering “why did this agent do this, at this specific moment, against what state of the world?” using a standard SQL join across the full decision lifecycle. In higher-assurance deployments, the same structured telemetry can be wrapped into a cryptographically signed proof-carrying intent, allowing independent verification of the decision artifact without asking anyone to trust mutable text logs—exactly the direction high-risk compliance regimes such as the EU AI Act are pushing toward.
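A sketch of what that reconstruction can look like with an in-memory SQLite store; the table names, columns, and the dfid value are assumptions for illustration only:

import sqlite3

# Illustrative schema: one table per stage of the decision lifecycle, all keyed by DFID.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE context_snapshots (dfid TEXT PRIMARY KEY, captured_at TEXT, snapshot_json TEXT);
CREATE TABLE proposals         (dfid TEXT PRIMARY KEY, agent_id TEXT, proposal_json TEXT);
CREATE TABLE validations       (dfid TEXT PRIMARY KEY, verdict TEXT, reason TEXT);
CREATE TABLE executions        (dfid TEXT PRIMARY KEY, receipt_json TEXT);
""")

# "Why did this agent do this, at this moment, against what state of the world?"
lineage = conn.execute("""
SELECT p.agent_id, c.captured_at, c.snapshot_json, p.proposal_json, v.verdict, e.receipt_json
FROM proposals p
JOIN context_snapshots c USING (dfid)
JOIN validations v USING (dfid)
LEFT JOIN executions e USING (dfid)
WHERE p.dfid = ?
""", ("dfid-0042",)).fetchall()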

But structured decision telemetry does more than support daily postmortems. Every decision becomes a structured relational record bound by DFID—the same foundation that makes macroscopic failures like Agent Drift detectable before they compound silently across the fleet.

Human-Over-The-Loop—autonomy at scale

The alternative to Human-in-the-Loop is not to remove the human, but to move the human from the execution loop to the design loop.

This is the Human-Over-The-Loop (HOTL) model. The human acts as a Policy Designer who defines and evolves the contract that governs decisions, while the system operates autonomously inside those boundaries. No approval queue. No review fatigue. Governance by Exception is the scalable model.

Figure 2: Human-Over-The-Loop shifts the human from the execution queue to the design loop. The agent runs autonomously within a deterministic contract; the human governs by defining that contract and intervening only on genuine exceptions.

Escalation Triggers. The system escalates only when the agent encounters a situation its contract does not authorize it to resolve alone:

Proposed action exceeds a contract authority limit

Agent confidence drops below escalate_on_uncertainty threshold

External API errors exceed a retry budget

No decision has been emitted within a configured inactivity window

When a trigger fires, the DecisionFlow enters ESCALATED state. The operator sees the WorkingContext, the PolicyProposal, and the reason for escalation, and can OVERRIDE, MODIFY, or ABORT. This is not an “Approve / Reject” queue; it is targeted intervention.
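A sketch of how those triggers might be evaluated deterministically before a DecisionFlow is moved to ESCALATED; the field names and thresholds are illustrative:

from enum import Enum

class EscalationReason(Enum):
    AUTHORITY_EXCEEDED = "AUTHORITY_EXCEEDED"
    LOW_CONFIDENCE = "LOW_CONFIDENCE"
    RETRY_BUDGET_EXHAUSTED = "RETRY_BUDGET_EXHAUSTED"
    INACTIVITY_TIMEOUT = "INACTIVITY_TIMEOUT"

def escalation_reason(proposal: dict, contract: dict,
                      api_errors: int, seconds_since_last_decision: float):
    if proposal["amount"] > contract["max_amount"]:
        return EscalationReason.AUTHORITY_EXCEEDED
    if proposal["confidence"] < contract["escalate_on_uncertainty"]:
        return EscalationReason.LOW_CONFIDENCE
    if api_errors > contract["retry_budget"]:
        return EscalationReason.RETRY_BUDGET_EXHAUSTED
    if seconds_since_last_decision > contract["max_inactivity_seconds"]:
        return EscalationReason.INACTIVITY_TIMEOUT
    return None   # no trigger: the DecisionFlow continues autonomously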

Escalation should not be understood as proof that the agent reliably knows what it does not know. LLMs are poor judges of their own uncertainty, so the architecture does not trust introspection. The escalate_on_uncertainty threshold is a useful heuristic, not a ground truth: the system forces escalation when declared confidence falls below the threshold, or when the proposal violates contract parameters the Kernel can evaluate deterministically. If the model produces a bad proposal with high confidence, the Runtime still blocks it. The agent may signal uncertainty; the Runtime decides whether that uncertainty matters.

Frozen Context + JIT. The operator reviews the proposal against the exact snapshot of the world the agent saw at T0, avoiding the TOCTOU (Time-of-Check to Time-of-Use) problem: The human audits the machine’s decision using exactly the data the machine saw.

But the world keeps moving. Hitting “OVERRIDE” at T1 does not blindly execute the action; it forces the proposal through the Runtime’s JIT (Just-In-Time) Verification gate. If reality has drifted beyond the contract’s Drift Envelope between T0 and T1, the Runtime rejects the override rather than executing a once-valid intent against stale state.
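A minimal sketch of such a drift check, assuming the contract expresses its Drift Envelope as per-field numeric tolerances; the field names are invented for the example:

def within_drift_envelope(t0_snapshot: dict, current: dict, envelope: dict) -> bool:
    """Just-in-time check: has reality drifted beyond the contract's tolerance since T0?"""
    for field, max_delta in envelope.items():
        if abs(current[field] - t0_snapshot[field]) > max_delta:
            return False
    return True

# An override at T1 re-verifies the world before executing a T0-era intent.
t0 = {"carrier_price_usd": 4_200, "eta_hours": 18}
t1 = {"carrier_price_usd": 5_900, "eta_hours": 19}
envelope = {"carrier_price_usd": 500, "eta_hours": 6}
print(within_drift_envelope(t0, t1, envelope))   # False: reject the override, re-plan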

Contract Evolution. The right long-term response to a legitimate edge case is usually not repeated override, but contract change. If business reality shifts, the operator updates the Responsibility Contract and deploys a new version. The system adapts through version-controlled governance boundaries rather than prompt edits or fine-tuning.

Escalation Budget. Escalation is rate-limited by a token bucket per agent (for example, 3 escalations per hour). If an agent exhausts that budget, the Runtime transitions it to SUSPENDED, records the state change, and blocks new DecisionFlows until an operator intervenes. This prevents Escalation DDoS and contains runaway reasoning costs.
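A sketch of that budget as a classic token bucket; the capacity and refill window mirror the example above, and the class name is illustrative:

import time

class EscalationBudget:
    """Token bucket: e.g. 3 escalations per hour per agent."""

    def __init__(self, capacity: int = 3, refill_seconds: float = 3600.0):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_rate = capacity / refill_seconds   # tokens replenished per second
        self.last = time.monotonic()

    def try_escalate(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False   # budget exhausted: Runtime transitions the agent to SUSPENDED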

Confidence ≠ Authority. An agent may emit a proposal with confidence=0.99, and if that proposal exceeds contract authority, the Runtime rejects it. Self-assessed certainty is not permission.

Figure 3: HITL scales supervision cost with agent volume. HOTL shifts that cost to policy design—the human governs the production line, not individual decisions.

Wrapping, not replacing: The role of existing frameworks

Adopting the ROA pattern does not mean discarding the tools your engineering teams have spent the last year mastering. Frameworks like LangChain, AutoGen, and CrewAI excel at orchestrating complex reasoning loops, RAG pipelines, and tool use. ROA is not designed to compete with them; it is designed to govern them.

Figure 4: The ROA pattern wraps existing orchestration frameworks (like LangChain or CrewAI) in User Space, restricting direct execution and forcing output through a structured Policy Proposal validated by Kernel Space.

In practice, you can take a mature LangChain agent and wrap it inside an ROA boundary. The underlying framework still handles the probabilistic reasoning (User Space orchestration). The architectural shift is simple but consequential: you filter the framework’s tool space. You physically remove exchange.execute_trade() or db.drop_table() from the LangChain agent’s toolbox. Instead, you provide it with a single, sandboxed tool: emit_policy_proposal(). The agent reasons, iterates, and eventually calls that tool to emit its final intent. The ROA wrapper catches this claim, may perform a local self-check as a noise-reduction heuristic, and forwards the PolicyProposal across the boundary to the Kernel Space for actual enforcement. You keep the power of the framework, but you gain deterministic execution governance where it matters.
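The sketch below is deliberately framework-agnostic rather than LangChain-specific: it assumes an orchestration agent that accepts a list of callable tools, and shows the toolbox reduced to the single sandboxed call. The build_orchestration_agent wiring is a placeholder, not a real API:

def emit_policy_proposal(total_insured_value: float, premium: float,
                         industry: str, justification: str, confidence: float) -> dict:
    """The only tool exposed to the wrapped agent: it structures intent, nothing more."""
    proposal = {
        "total_insured_value": total_insured_value,
        "premium": premium,
        "industry": industry,
        "justification": justification,
        "confidence": confidence,
    }
    # The ROA wrapper may run a local self-check here (noise reduction),
    # then forwards the PolicyProposal across the boundary to Kernel Space.
    return proposal

# execute_trade, drop_table, and other side-effectful tools are simply never registered.
tools = [emit_policy_proposal]
# agent = build_orchestration_agent(llm=..., tools=tools)   # framework-specific wiring elided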

Costs and trade-offs

ROA is not free. It introduces engineering overhead precisely because it replaces informal trust with explicit governance.

Validation gates and JIT checks add latency to every side-effectful decision.

Responsibility Contracts add design overhead: authorship, versioning, ownership, and review now have to be explicit.

DFID-linked auditability adds storage, tracing, and operational integration work.

Escalation thresholds and budgets require domain tuning; bad defaults either flood operators or hide legitimate exceptions.

These costs are justified only when the downside of an incorrect side effect is materially higher than the cost of controlling it. For RAG chatbots and low-risk assistants, this architecture is often excessive. For high-stakes systems, it is the cost of building a real boundary.

Conclusion: Architecture, not alchemy

Five pillars. One architectural commitment: an agent that cannot be trusted to govern itself must operate inside a system that governs it instead. The Responsibility Contract bounds authority. The Mission locks the objective. Epistemic Isolation ensures output is a claim, not a command. Longevity prevents the system from forgetting what it already learned. Audit makes every decision reconstructable. The ROA pattern—a Responsibility Contract instead of a capability list, Claims instead of Commands, a deterministic kernel instead of an informal prompt—composes these into a single enforceable boundary. Intent is structured by the agent. Boundaries are enforced by the contract. Telemetry is accumulated by DFID. The Human-Over-The-Loop model reserves human judgment for genuine exceptions, not approval queues. Together, they transform a probabilistic model into a governable production actor.

Once deterministic execution boundaries and DFID-linked telemetry are in place, a different class of day-three questions becomes possible: Which agents stay within limits yet quietly destroy margin? Which decision patterns justify automatic suspension before humans notice the drift? How do we reconstruct any action to a regulatory standard, or govern a fleet where agents carry different risk profiles and decision weights?

Responsibility is the missing execution-governance layer—and it belongs in the architecture, not the system prompt.

The era of AI demos is ending. The era of AI production systems is beginning. Those systems will not be distinguished only by the intelligence of their models. They will also be distinguished by the rigor of their governance.

This article provides a high-level introduction to Responsibility-Oriented Agents and the Decision Intelligence Runtime (DIR) and their approach to production resiliency and operational challenges. The full DIR specification, ROA contract schemas, and reference implementations are available as an open-source project on GitHub.