
Human-in-the-loop is not a checkbox that makes an AI workflow safe.
It is an architecture pattern.
If the human reviewer receives a vague AI recommendation, no source evidence, no risk tier, no policy context, no rollback path, and no audit trail, the review step is just theater. The system has inserted a person into the workflow without giving that person the authority, information, or interface needed to make a responsible decision.
The practical question for a CIO, DSI, CISO, enterprise architect, AI platform team, or product engineer is:
How should human approval be designed when an AI agent can affect money, customers, employees, security, legal exposure, infrastructure, or physical operations?
My answer: treat human approval as a controlled state transition, not as a button after the model response. The workflow needs risk tiering, identity, evidence packs, policy checks, separation of duties, approval scopes, timeout behavior, override rules, rollback paths, and replayable audit logs before the AI action is executed.
This article extends the control-plane model from AI governance architecture, the abuse-case framing from threat modeling enterprise AI agents, and the retrieval controls from RAG governance. It also reuses the JSON-contract discipline from AI agent architecture: the model may propose, but deterministic systems must decide what is allowed.
Key takeaways
- Human-in-the-loop approval is only useful when the system defines what is being approved, by whom, for which risk tier, with which evidence, and under which policy.
- Approval should be a state machine around the AI workflow, not a UI button attached to an answer.
- The approver needs an evidence pack: user request, retrieved sources, model output, tool call, risk tier, policy decision, expected impact, rollback option, and alternatives.
- High-risk workflows need separation of duties. In those workflows, the person who requested the action should not also be the person who approves it.
- Approval logs must be replayable. Store the decision context, not only “approved=true”.
- The durable artifact is an approval matrix that maps AI workflow risk to required evidence, approver role, allowed actions, timeout behavior, and audit fields.
Citation-ready answer
Human-in-the-loop approval for enterprise AI is the control architecture that forces high-risk AI actions through an explicit review state before execution. A useful design includes risk tiering, identity checks, policy-as-code, evidence packs, approver authorization, separation of duties, timeout and escalation rules, rollback paths, and immutable audit events. The goal is not to make a human rubber-stamp the model; it is to make high-impact AI actions accountable, reviewable, reversible, and enforceable before they touch production systems.
Start with action risk, not model risk
The first mistake is asking whether the model is risky.
Ask what the workflow can do.
An AI assistant that summarizes a public document is not in the same category as an AI agent that can approve a refund, close a support case, modify an IAM policy, change a price, submit a purchase order, email a customer, deploy code, or update an HR record.
Use the action as the unit of control:
| AI workflow action | Risk driver | Approval default |
|---|---|---|
| Draft a response | Reputational or accuracy risk | Optional review based on domain |
| Send an external email | Customer impact and disclosure risk | Human approval before send |
| Update CRM field | Data quality and business process risk | Approval or sampled review |
| Approve refund | Financial impact | Approval above threshold |
| Change IAM permission | Security blast radius | Mandatory security approval |
| Modify production config | Availability and incident risk | Change-management approval |
| Trigger physical operation | Safety and equipment risk | Supervisor plus local safety gate |
This maps directly to the way the NIST AI Risk Management Framework asks teams to govern, map, measure, and manage AI risk. In engineering terms, the approval design should be driven by the actual consequence of the action, not by a generic “AI risk” label.
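As a minimal sketch, the table above can be held as a lookup keyed by action rather than by model. The snake_case action names and the `rule_for` helper are illustrative, and the fail-closed default for unlisted actions is an assumption, not something the table prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionRule:
    risk_driver: str
    approval_default: str


# Rows mirror the table above; the snake_case keys are illustrative names.
ACTION_RULES = {
    "draft_response": ActionRule("reputational or accuracy risk", "optional review based on domain"),
    "send_external_email": ActionRule("customer impact and disclosure risk", "human approval before send"),
    "update_crm_field": ActionRule("data quality and business process risk", "approval or sampled review"),
    "approve_refund": ActionRule("financial impact", "approval above threshold"),
    "change_iam_permission": ActionRule("security blast radius", "mandatory security approval"),
    "modify_production_config": ActionRule("availability and incident risk", "change-management approval"),
    "trigger_physical_operation": ActionRule("safety and equipment risk", "supervisor plus local safety gate"),
}

# Assumption: an action missing from the table fails closed instead of running unreviewed.
UNCLASSIFIED = ActionRule("unclassified action", "block until classified")


def rule_for(action: str) -> ActionRule:
    """Look up the control rule by action name, never by model name."""
    return ACTION_RULES.get(action, UNCLASSIFIED)
```

Keying on the action keeps the control stable when the underlying model or prompt changes.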
The approval state machine
A high-risk AI workflow should not jump from model output to execution.
Use an explicit state machine:
requested → classified → evidence_collected → policy_checked → pending_approval → executed → verified → audited
Each transition should have an owner and a log event.
| State | System responsibility | Human responsibility |
|---|---|---|
| requested | Capture user, session, task, and intent | Provide clear instruction |
| classified | Assign risk tier and action type | None |
| evidence_collected | Assemble sources, tool inputs, model output, impact | Review evidence quality |
| policy_checked | Apply RBAC/ABAC, limits, and policy-as-code | None unless exception |
| pending_approval | Route to authorized approver | Approve, reject, request changes, or escalate |
| executed | Call tool only within approved scope | Monitor result if required |
| verified | Check execution outcome and drift | Confirm business result if needed |
| audited | Store replayable event chain | Accountable sign-off for high-risk cases |
The key is that approval is a workflow state, not a comment in a chat thread.
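One way to enforce this is a small transition table that refuses any jump the workflow does not allow. The sketch below uses the state names from the table. The `rejected` state, the `Workflow` class, and the event fields are assumptions added for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Forward transitions follow the state table. "rejected" is an assumed
# terminal branch for reject decisions; the table does not name one.
TRANSITIONS = {
    "requested": {"classified"},
    "classified": {"evidence_collected"},
    "evidence_collected": {"policy_checked"},
    "policy_checked": {"pending_approval", "rejected"},
    "pending_approval": {"executed", "rejected"},
    "executed": {"verified"},
    "verified": {"audited"},
    "rejected": {"audited"},
    "audited": set(),
}


class InvalidTransition(Exception):
    """Raised when a caller tries to skip or reorder workflow states."""


@dataclass
class Workflow:
    workflow_id: str
    state: str = "requested"
    events: list = field(default_factory=list)

    def advance(self, new_state: str, owner: str) -> None:
        """Move to new_state if allowed, recording who owned the transition."""
        if new_state not in TRANSITIONS[self.state]:
            raise InvalidTransition(f"{self.state} -> {new_state} is not allowed")
        self.events.append({
            "workflow_id": self.workflow_id,
            "from": self.state,
            "to": new_state,
            "owner": owner,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        self.state = new_state
```

With this shape, an agent that tries to go from `classified` straight to `executed` gets an exception instead of a tool call, and every legal transition leaves a log event with an owner.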
Build an approval matrix
Every enterprise AI platform needs an approval matrix before it scales agents across departments.
| Risk tier | Example | Required approver | Required evidence | Execution rule |
|---|---|---|---|---|
| Tier 0: no external effect | Summarize internal non-sensitive content | None | Source citation | No approval |
| Tier 1: low business effect | Draft a support reply | Requesting user or team queue | Draft, source snippets, confidence warning | Human send |
| Tier 2: reversible system update | Update CRM stage | Data owner or process owner | Before/after diff, source record, rollback path | Execute after approval |
| Tier 3: financial or customer impact | Issue refund, send contract clause | Manager or delegated owner | Impact estimate, policy match, customer record, alternatives | Approval plus threshold check |
| Tier 4: security or production impact | Change IAM, deploy config | Security/change approver | Risk analysis, diff, test result, rollback plan | Dual approval and change window |
| Tier 5: safety or regulated impact | Trigger physical process or regulated decision | Authorized operator and risk owner | Full evidence pack, hazard check, manual override path | Human-supervised execution only |
The exact tiers will differ by company, but the pattern is stable: higher consequence requires stronger identity, better evidence, stricter approver authorization, clearer rollback, and more durable audit.
The NIST Generative AI Profile is useful here because it emphasizes governance around generative AI risks such as confabulation, misuse, data leakage, and operational impact. The approval matrix is one way to turn that risk language into runtime control.
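To make the matrix enforceable at runtime, it can live as data that the workflow engine consults before routing. This is a rough sketch: the snake_case identifiers are illustrative, and the `approvals_required` counts are one reading of the "Required approver" and "Execution rule" columns (two for Tier 4 dual approval and for the Tier 5 operator plus risk owner).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TierRule:
    approver_role: Optional[str]
    required_evidence: frozenset
    approvals_required: int  # assumption: derived from the matrix wording


APPROVAL_MATRIX = {
    0: TierRule(None, frozenset({"source_citation"}), 0),
    1: TierRule("requesting_user_or_team_queue",
                frozenset({"draft", "source_snippets", "confidence_warning"}), 1),
    2: TierRule("data_or_process_owner",
                frozenset({"before_after_diff", "source_record", "rollback_path"}), 1),
    3: TierRule("manager_or_delegated_owner",
                frozenset({"impact_estimate", "policy_match", "customer_record", "alternatives"}), 1),
    4: TierRule("security_or_change_approver",
                frozenset({"risk_analysis", "diff", "test_result", "rollback_plan"}), 2),
    5: TierRule("authorized_operator_and_risk_owner",
                frozenset({"full_evidence_pack", "hazard_check", "manual_override_path"}), 2),
}


def missing_evidence(tier: int, provided: set) -> set:
    """Return evidence items still missing for a tier; unknown tiers raise KeyError."""
    return set(APPROVAL_MATRIX[tier].required_evidence) - set(provided)
```

A workflow would only enter the approval queue when `missing_evidence` returns an empty set for its tier.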
The evidence pack
The reviewer should not approve “the AI answer.”
The reviewer should approve a bounded action with evidence.
A useful evidence pack includes:
| Evidence field | Why it matters |
|---|---|
| Requesting identity | Establishes accountability and access context |
| AI system identity | Identifies which agent, model route, and app produced the request |
| Action type | Separates drafting, reading, writing, sending, deleting, approving |
| Target system | CRM, ERP, IAM, ticketing, code repo, customer email, robot supervisor |
| Risk tier | Determines approval rule and audit depth |
| Input summary | Shows what the user or workflow asked for |
| Retrieved sources | Shows grounding and source authority |
| Proposed action | Shows exactly what will happen |
| Before/after diff | Makes system changes reviewable |
| Policy decision | Shows which policy allowed or blocked the action |
| Alternatives | Gives the reviewer lower-risk choices |
| Rollback path | Explains how to undo the action |
| Expiration time | Prevents stale approval reuse |
Without this pack, approval quality collapses. The reviewer either trusts the model blindly or redoes the entire investigation manually. Both patterns fail at scale.
This is where RAG governance becomes important. If retrieved sources are stale, unauthorized, or low authority, the approval screen should show that. A human cannot make a strong decision from weak evidence.
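The evidence fields above translate naturally into a typed record plus a pre-review check. The following is a sketch under assumptions: the field names are snake_case versions of the table rows, and `review_blockers` is a hypothetical helper. The Tier 2+ thresholds for diff and rollback follow the approval matrix and the rollback rule stated elsewhere in this article.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class EvidencePack:
    requesting_identity: str
    ai_system_identity: str
    action_type: str
    target_system: str
    risk_tier: int
    input_summary: str
    proposed_action: str
    policy_decision: str
    expires_at: datetime
    retrieved_sources: list = field(default_factory=list)
    before_after_diff: Optional[str] = None
    alternatives: list = field(default_factory=list)
    rollback_path: Optional[str] = None


def review_blockers(pack: EvidencePack, now: Optional[datetime] = None) -> list:
    """List reasons the pack should not be shown as ready for approval."""
    now = now or datetime.now(timezone.utc)
    blockers = []
    if now >= pack.expires_at:
        blockers.append("evidence pack expired")
    if not pack.retrieved_sources:
        blockers.append("no retrieved sources")
    if pack.risk_tier >= 2 and not pack.before_after_diff:
        blockers.append("no before/after diff for Tier 2+ action")
    if pack.risk_tier >= 2 and not pack.rollback_path:
        blockers.append("no rollback path for Tier 2+ action")
    return blockers
```

Surfacing the blockers on the approval screen, rather than hiding the request, lets the reviewer see why the evidence is weak.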
Approval scopes must be narrow
Approvals should be scoped like credentials.
Bad approval:
Thomas approved the AI agent.
Useful approval:
approver_id: user_482
That scope prevents approval reuse. It also makes the audit log meaningful.
Approving a draft email should not authorize the agent to update the customer record. Approving a refund up to 250 EUR should not authorize a refund of 2,500 EUR. Approving a production change during a maintenance window should not authorize the same change next week after the system state has changed.
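A scoped approval can be checked mechanically right before the tool call. In the sketch below, only `approver_id` comes from the example above. The other scope fields, the `ToolCall` shape, and the `order_991` target used in the usage note are hypothetical, chosen to match the refund example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class ApprovalScope:
    approver_id: str
    action_type: str
    target_id: str
    max_amount: Decimal
    currency: str
    expires_at: datetime


@dataclass(frozen=True)
class ToolCall:
    action_type: str
    target_id: str
    amount: Decimal
    currency: str


def within_scope(scope: ApprovalScope, call: ToolCall, now: Optional[datetime] = None) -> bool:
    """True only if the call matches the approved action, target, limit, and time window."""
    now = now or datetime.now(timezone.utc)
    return (
        now < scope.expires_at
        and call.action_type == scope.action_type
        and call.target_id == scope.target_id
        and call.currency == scope.currency
        and call.amount <= scope.max_amount
    )
```

With a scope of `refund` on `order_991` up to 250 EUR, a 200 EUR refund passes, while a 2,500 EUR refund, a customer record update, or the same refund after expiry all fail the check.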
The common approval anti-patterns
| Anti-pattern | Why it fails | Better pattern |
|---|---|---|
| Rubber-stamp approval | Humans click approve because the UI gives no evidence | Evidence pack with explicit risks and alternatives |
| Approval after execution | Human review becomes post-hoc blame assignment | Approval before irreversible or high-impact action |
| Same user requests and approves | No separation of duties | Route by role, risk tier, and ownership |
| Global approval | One approval unlocks too much authority | Narrow approval token with expiry and scope |
| No timeout | Stale approvals execute in changed context | Expire and revalidate before execution |
| No rollback | Reviewer cannot judge operational risk | Require rollback path for Tier 2+ |
| No audit context | Cannot reconstruct why action happened | Log evidence, policy, approver, and final result |
| Model-written evidence only | Reviewer sees the model’s story, not source data | Include sources, diffs, tool inputs, and system checks |
These are not UX details. They are control failures.
The OWASP Top 10 for LLM Applications 2025 is especially relevant around excessive agency, sensitive information disclosure, and improper output handling. Human approval should reduce those risks, but only when the approval is attached to a constrained tool call and a real policy decision.
Routing approval to the right human
The “human” in human-in-the-loop is not generic.
Approver routing should use ownership and authority:
| Workflow | Approver should be |
|---|---|
| Customer refund | Support lead or revenue operations owner |
| Contract clause draft | Legal reviewer or contract owner |
| IAM permission change | IAM owner or security approver |
| Production config change | Service owner plus change manager |
| HR record update | HR data owner |
| Vendor payment | Finance approver with limit authority |
| Robot task execution | Authorized operator or safety supervisor |
This is a governance design issue. The AI platform can provide the routing engine, but business owners must define who is allowed to approve each class of action.
If nobody owns the approval route, nobody owns the risk.
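A routing engine for the single-approver rows of the table might look like the sketch below. The role and workflow identifiers are illustrative, rows that need two approvers are left out, and excluding the requester on every route is a deliberately strict assumption rather than a rule the table states.

```python
# Roles allowed to approve each workflow, taken from the single-approver
# rows of the routing table. Identifiers are illustrative.
APPROVER_ROLES = {
    "customer_refund": {"support_lead", "revenue_operations_owner"},
    "contract_clause_draft": {"legal_reviewer", "contract_owner"},
    "iam_permission_change": {"iam_owner", "security_approver"},
    "hr_record_update": {"hr_data_owner"},
    "vendor_payment": {"finance_approver"},
}


class NoApprovalRoute(Exception):
    """Raised when no business owner has defined a route for the workflow."""


def eligible_approvers(workflow: str, requester_id: str, directory: dict) -> list:
    """Return user IDs that hold an approving role and are not the requester.

    directory maps user ID -> set of role names (a stand-in for an IAM lookup).
    """
    if workflow not in APPROVER_ROLES:
        raise NoApprovalRoute(f"no approval route defined for {workflow!r}")
    allowed = APPROVER_ROLES[workflow]
    return sorted(
        user_id
        for user_id, roles in directory.items()
        if roles & allowed and user_id != requester_id
    )
```

Raising on an undefined route makes the missing ownership visible instead of silently falling back to whoever is available.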
Policy-as-code before approval
A human reviewer should not be asked to catch every rule violation manually.
Policy checks should run before the approval screen:
AI proposed action
The reviewer should see the result of those checks.
If the policy engine says the user cannot access the customer record, the action should not reach a normal approval queue. If the requested refund exceeds the manager’s limit, the workflow should route upward or block. If the model uses a low-authority source, the evidence pack should warn or require a stronger source.
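The three rules in the previous paragraph can be expressed as a deterministic pre-approval check. This is a simplified sketch, not a real policy engine: the `ProposedAction` fields, the outcome labels `block`, `escalate`, and `route`, and the two-level source authority are assumptions.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class ProposedAction:
    requester_can_access_target: bool
    amount: Decimal
    approver_limit: Decimal
    source_authority: str  # assumed values: "high" or "low"


@dataclass(frozen=True)
class PolicyResult:
    outcome: str  # "block", "escalate", or "route"
    warnings: tuple


def check_policy(action: ProposedAction) -> PolicyResult:
    """Run policy checks before anything reaches a normal approval queue."""
    # Access failures never reach a human queue.
    if not action.requester_can_access_target:
        return PolicyResult("block", ("requester cannot access target record",))
    warnings = []
    # Weak grounding is surfaced to the reviewer instead of being hidden.
    if action.source_authority == "low":
        warnings.append("low-authority source; stronger source recommended")
    # Amounts above the approver's limit route upward.
    if action.amount > action.approver_limit:
        return PolicyResult("escalate", tuple(warnings))
    return PolicyResult("route", tuple(warnings))
```

The result, including warnings, belongs in the evidence pack so the reviewer sees what the deterministic checks already concluded.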
The NCSC guidelines for secure AI system development are useful because they frame secure AI across design, development, deployment, operation, and maintenance. Approval is part of operation, but it must be designed into the system from the start.
Audit logs must reconstruct the decision
For high-risk workflows, this is not enough:
{"approved": true}
A useful audit event chain should include:
| Event | Minimum fields |
|---|---|
| AI action proposed | agent ID, model ID, prompt template, tool name, target object |
| Risk classified | action type, risk tier, classifier version, policy version |
| Evidence assembled | source IDs, retrieval timestamps, before/after diff, confidence flags |
| Policy checked | RBAC/ABAC decision, data classification, threshold result |
| Approval requested | approver role, approver queue, expiration time |
| Approval decision | approver ID, decision, reason code, changes requested |
| Execution performed | tool call ID, parameters hash, target system response |
| Verification completed | result status, rollback used, incident link if any |
This log is for more than compliance. It is how engineering learns which approval rules are too loose, too noisy, or too slow.
It also supports incident response. If an AI agent sends the wrong email, updates the wrong field, or triggers the wrong workflow, the team needs to know whether the failure came from retrieval, model output, policy classification, human review, tool execution, or post-execution verification.
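One common way to make such an event chain tamper-evident is to hash-chain the records. The article does not prescribe this technique, so treat the sketch below as one possible approach. The function names and record layout are assumptions, and a production system would also need durable, access-controlled storage.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64


def _digest(event_type: str, fields: dict, prev_hash: str) -> str:
    """Hash the event body together with the previous record's hash."""
    body = {"type": event_type, "fields": fields, "prev_hash": prev_hash}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_event(chain: list, event_type: str, fields: dict) -> dict:
    """Append an audit event linked to the previous one by hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    record = {
        "type": event_type,
        "fields": fields,
        "prev_hash": prev_hash,
        "hash": _digest(event_type, fields, prev_hash),
    }
    chain.append(record)
    return record


def verify_chain(chain: list) -> bool:
    """Recompute every hash; any edited, removed, or reordered record fails."""
    prev_hash = GENESIS_HASH
    for record in chain:
        expected = _digest(record["type"], record["fields"], record["prev_hash"])
        if record["prev_hash"] != prev_hash or record["hash"] != expected:
            return False
        prev_hash = record["hash"]
    return True
```

Each event type from the table would become one `append_event` call, with its minimum fields passed as the `fields` dictionary.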
Human oversight is not the same as human burden
Some teams push every AI action into a human queue and call it safe.
That does not scale. It also creates alert fatigue.
Use risk-based approval:
| Pattern | Best for | Risk |
|---|---|---|
| No approval, full audit | Low-risk read-only work | Hidden drift if never sampled |
| Sampled review | High volume, low impact updates | Rare edge cases may pass |
| Approval before external action | Emails, customer messages, tickets | Slower workflow |
| Approval before system write | CRM, ERP, HR, IAM, production config | Queue bottleneck |
| Dual approval | Security, finance, regulated, safety impact | Expensive but defensible |
| Human-supervised execution | Physical systems or live operations | Requires trained operator |
The goal is not maximum friction. The goal is proportional control.
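For the sampled-review pattern, a deterministic selector keeps the sample reproducible during audits. The sketch below hashes a hypothetical action ID into a bucket. The function name and the use of SHA-256 for bucketing are assumptions, not part of any framework cited here.

```python
import hashlib


def needs_sampled_review(action_id: str, sample_rate: float) -> bool:
    """Select roughly sample_rate of actions for human review.

    The same action_id always gives the same answer, so an auditor can
    re-derive which actions should have been sampled.
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    digest = hashlib.sha256(action_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    return bucket < int(sample_rate * 2**64)
```

Raising the rate for a workflow after an incident, and lowering it only by explicit decision, keeps sampling proportional to observed risk.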
The EU AI Act is not an engineering manual, but its high-risk system framing is a useful reminder: human oversight has to be effective in practice. Engineering teams should translate that into reviewable evidence, real authority, intervention paths, and traceable decisions.
Implementation checklist
Before shipping a high-risk AI workflow, confirm:
- The action type is explicit: read, draft, send, update, delete, approve, execute.
- The workflow has a risk tier.
- The approver role is defined by ownership, not convenience.
- The approver can see source evidence, policy checks, and before/after diffs.
- The approval is scoped to one action, target, time window, and threshold.
- The workflow expires stale approvals.
- The policy engine runs before the human approval step.
- The tool call cannot exceed the approved scope.
- Rollback or compensation is defined for Tier 2+ actions.
- Audit logs can reconstruct request, evidence, policy, approval, execution, and verification.
- Rejected approvals are used to improve prompts, policies, classifiers, or workflow design.
If the team cannot satisfy those points, the workflow may still be a prototype, but it should not be treated as a governed enterprise AI workflow.
FAQ
Is human-in-the-loop enough to make an AI agent safe?
No. Human-in-the-loop only helps when the person has the right authority, evidence, interface, and time to review the action. The surrounding system still needs identity, permissions, policy checks, scoped approval, audit logs, and rollback.
What should the human approve: the answer or the action?
Approve the action. A model answer is text. A high-risk workflow action has a target system, side effect, scope, and consequence. The approval screen should show exactly what will happen if the reviewer approves.
When should approval be mandatory?
Approval should normally be mandatory for external communications, irreversible actions, regulated decisions, financial thresholds, production changes, security changes, sensitive data disclosure, and any workflow that affects physical systems or safety.
Can approval be automated later?
Some approvals can move to sampled review or policy-only execution once there is enough evidence, a low incident rate, and a clear rollback path. But that should be an explicit risk decision, not a shortcut because the approval queue became annoying.
Who owns approval rules?
The AI platform team can implement the approval engine. Business process owners, IAM, security, legal, finance, HR, or operations should own the rules for their domains. The DSI or CIO function should make that ownership explicit.
What is the biggest design mistake?
The biggest mistake is putting a human approval button after a vague AI recommendation. The reviewer needs a bounded action, evidence pack, risk tier, policy result, rollback path, and audit context. Otherwise the approval is not a control.