RAG Governance: Source Authority, Access Control, Freshness, and Auditability

RAG Governance: Source Authority, Access Control, Freshness, and Auditability

Retrieval-augmented generation does not make an enterprise AI assistant trustworthy by itself.

It only gives the model more context.

If that context comes from stale policies, duplicated SharePoint folders, unowned PDFs, over-permissive vector indexes, missing retention rules, or documents that lost their access-control metadata during chunking, RAG can make the answer look more authoritative while making the control problem worse.

The useful question for a CIO, DSI, CISO, enterprise architect, or AI platform engineer is not “should we use RAG?”

It is:

How do we govern retrieval so that enterprise AI answers are grounded in the right sources, visible to the right people, fresh enough to trust, and auditable after the fact?

My answer: govern RAG as a production access path, not as a search plugin. The retrieval layer needs source authority, document ownership, RBAC/ABAC inheritance, data classification, freshness SLAs, deletion propagation, prompt-injection isolation, eval gates, and replayable audit traces before the retrieved text ever reaches the model.

This article extends the control-plane model from AI governance architecture and the security workflow from threat modeling enterprise AI agents. It is also related to the JSON-contract discipline in AI agent architecture and the tool-boundary patterns in OpenClaw on Jetson: Memory, Dashboard, MCP, and Secure Local AI Agents. Different stack, same principle: the model can synthesize, but deterministic systems must decide what data it may see and what evidence must be logged.

Key takeaways

  • RAG governance is the architecture that controls which sources may be retrieved, who owns them, who may see them, how fresh they must be, and how each answer can be reconstructed.
  • Do not treat the vector database as a neutral cache. It is a derived data store that must inherit classification, ownership, retention, tenant, and permission metadata from source systems.
  • Access control must be enforced at retrieval time, not only at ingestion time. Permissions change after indexing.
  • Source authority matters more than semantic similarity. A stale but semantically close document should lose to an approved source of record.
  • Retrieved documents are untrusted data, not instructions. RAG systems need prompt-injection isolation, chunk limits, source attribution, output validation, and abuse monitoring.
  • A useful RAG governance design produces durable artifacts: a source authority register, chunk metadata schema, retrieval permission matrix, freshness policy, audit event schema, eval suite, and incident playbook.

Citation-ready answer

RAG governance is the control architecture that ensures an enterprise AI system retrieves only authorized, authoritative, fresh, and auditable knowledge before generating an answer. It combines source ownership, data classification, document-level and chunk-level access control, freshness rules, deletion propagation, prompt-injection defenses, source attribution, evaluation tests, and replayable audit logs. The goal is not simply to improve answer quality; it is to make retrieval behave like a governed enterprise access path instead of an uncontrolled semantic memory.

RAG is an access path, not just a relevance layer

A basic RAG pipeline looks harmless:

1
2
3
4
5
6
user question
-> embedding
-> vector search
-> retrieved chunks
-> prompt assembly
-> model answer

That diagram hides the parts that matter in an enterprise:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
user / agent / workflow
-> identity and session context
-> policy engine
-> query classifier
-> retrieval service
-> source authority register
-> permission filter
-> freshness filter
-> vector index
-> keyword / metadata search
-> context assembler
-> prompt-injection isolation
-> citation pack
-> token budget rules
-> model runtime
-> output validator
-> audit trace

The second diagram is the real system. It decides whether an answer is allowed, not merely whether it is fluent.

The NIST AI Risk Management Framework is useful because it separates AI risk work into governance, mapping, measurement, and management. The engineering translation for RAG is concrete: map your sources, govern who owns them, measure retrieval behavior, and manage incidents when retrieval exposes the wrong thing.

The source authority register

Most enterprise RAG failures start before embeddings.

Teams index “the knowledge base” without deciding which sources are authoritative, which are drafts, which are obsolete, which are personal notes, and which are legally sensitive. Semantic search then gives every chunk a chance to influence the model.

That is backwards.

Start with a source authority register:

SourceOwnerAuthority levelAllowed useFreshness ruleAccess modelRetention
HR policy portalHR operationsSource of recordEmployee policy Q&ARe-index within 24 hours of changeEmployee region + roleMatch HR retention
Security standards wikiCISO officeApproved standardEngineering guidanceRe-index on page publishEngineering + security rolesMatch wiki retention
Sales enablement folderRevenue operationsAdvisoryDrafting supportWeekly refreshSales org only18 months
Legal contract archiveLegalRestricted recordClause lookup with approvalEvent-driven syncMatter team + legalLegal hold aware
Slack exportsWorkspace adminsLow authorityDiscovery onlyDo not answer directlyExplicit approvalShort retention

Authority level should affect retrieval ranking and answer behavior. A source of record can support a direct answer. A low-authority source may only be used as a clue, or may require a caveat and a link to the owner.

This is where many RAG prototypes are too weak. They rank by similarity, then hope citations will fix trust. Citations do not help if the cited source should not have been in the retrieval set.

Chunk metadata is the governance boundary

The chunk is where enterprise controls often disappear.

A source document may have correct permissions in SharePoint, Confluence, Google Drive, ServiceNow, Git, or a DMS. After ingestion, it becomes many chunks with embeddings. If those chunks do not carry the original governance metadata, the vector store becomes a permission laundering machine.

Every chunk should carry a minimum metadata envelope:

Metadata fieldExampleWhy it matters
source_systemsharepoint_hrReconstructs origin and connector behavior
source_iddocument ID or stable URLSupports deletion, re-indexing, and citation
source_ownerHR operationsAssigns accountability
authority_levelsource_of_record, approved, draft, archiveControls ranking and answer confidence
classificationpublic, internal, confidential, restrictedDrives retrieval eligibility
permitted_subjectsroles, groups, tenants, matter IDsEnforces access at retrieval time
retention_policyHR-7Y, legal-hold, delete-on-source-deletePrevents stale derived data
effective_date2026-04-01Detects obsolete policy content
indexed_attimestampSupports freshness checks
hashcontent hashDetects tampering and drift

The OWASP RAG Security Cheat Sheet is direct on this point: access-control metadata has to survive chunking and retrieval. That recommendation is not a compliance nicety. It is the difference between RAG as governed search and RAG as uncontrolled data replication.

The retrieval permission matrix

Do not let the model decide whether a user is allowed to see a chunk.

The retrieval service should receive identity context from the application and evaluate it before chunks enter the prompt.

ContextExample controlDecision
Human identityuser ID, employee type, region, departmentCan this person access this document?
AI system identityassistant ID, service account, environmentIs this AI app allowed to query this collection?
Workflow contextHR case, incident ticket, customer account, project IDIs access valid for this task?
Data classificationinternal, confidential, regulated, export-controlledIs model/runtime placement allowed?
Source authoritysource of record vs draft vs archiveMay this source support an answer?
Tool authorityread-only Q&A vs workflow actionDoes retrieved context unlock a risky tool path?
Approval statenone, reviewer approved, legal approvedIs human approval required before answer or action?

This is where RBAC and ABAC meet RAG. RBAC answers “what group is this user in?” ABAC answers “given this user, document, classification, tenant, purpose, and workflow, is retrieval allowed now?”

For sensitive assistants, the permission check should be applied twice:

  1. Before retrieval, to filter eligible collections and indexes.
  2. After retrieval, to validate every candidate chunk before prompt assembly.

The second check catches index drift, stale ACLs, connector bugs, and accidental mixed-tenant retrieval.

Freshness is a product requirement and a control

RAG quality is often discussed as relevance. In enterprise systems, freshness is just as important.

A model that cites last year’s travel policy, obsolete incident response procedure, retired product spec, or superseded data-retention rule can be worse than a model that admits it does not know.

Define freshness by source class:

Source classRefresh modelStaleness behavior
Critical policyEvent-driven sync plus daily reconciliationBlock answer if stale
Security procedureEvent-driven sync plus owner attestationWarn or block depending on risk
Product documentationOn publish plus nightly scanPrefer latest version
Support ticketsNear real-time or explicit case syncScope to ticket/account
Historical archiveScheduled batchMark as archive, never source of current policy

Freshness should be visible in the answer pipeline:

1
2
3
4
5
6
retrieved chunk
-> source authority check
-> access check
-> freshness check
-> prompt-injection screen
-> citation pack

Do not hide stale retrieval behind smooth prose. If the authoritative source is stale or unavailable, the assistant should say so and avoid answering from model memory alone. In regulated workflows, “I cannot retrieve the approved source right now” is often the correct answer.

Retrieved content is data, not instruction

Prompt injection is not only a chat-input problem.

It is a retrieval problem.

A document can contain text that tells the model to ignore its system prompt, reveal secrets, call a tool, change the answer format, or trust a malicious URL. If that document is retrieved and placed in the context window, the model sees it as language. The surrounding system must preserve the boundary: retrieved content is evidence, not authority.

Minimum controls:

  • Wrap retrieved chunks in explicit delimiters.
  • Label retrieved content as untrusted data.
  • Limit chunk count and total retrieved tokens.
  • Scan chunks for obvious instruction-injection patterns.
  • Keep system and policy instructions outside retrieved content.
  • Validate final answers against allowed output schemas when the workflow is high risk.
  • Never let retrieved text create tool authority.

The OWASP Top 10 for LLM Applications 2025 and OWASP’s RAG guidance both treat prompt injection, sensitive information disclosure, and embedding/vector-store weaknesses as real application risks. The practical conclusion is simple: RAG content should be treated like hostile input until the retrieval, assembly, and output layers prove otherwise.

Audit logs need replay, not just observability

Most AI logs are built for debugging latency and cost.

RAG governance needs incident reconstruction.

When an executive asks “why did the assistant answer that?”, the team should be able to replay the evidence chain:

EventRequired fields
Request receiveduser ID, AI app ID, session ID, workflow ID, timestamp
Query generatednormalized query, embedding model, query classifier result
Retrieval executedindex name, filters, top-k, similarity threshold, source collections
Chunk selectedsource ID, chunk ID, owner, classification, permission metadata, freshness status
Prompt assembledprompt template version, chunk IDs, token budget, injection checks
Model calledmodel ID, version, parameters, region/runtime, safety settings
Answer generatedoutput hash, citations used, confidence policy, validation result
Action requestedtool name, tool owner, approval state, policy decision
User responsedisplayed answer ID, export/download/share event if applicable

You do not need to store every raw prompt forever. You do need enough signed, retained, and access-controlled evidence to answer: who asked, which sources were retrieved, what permissions were evaluated, which model saw what, and why the answer was allowed.

The NCSC guidelines for secure AI system development frame security across design, development, deployment, operation, and maintenance. RAG auditability belongs across that full lifecycle, not only inside the logging layer.

Failure modes that deserve explicit tests

RAG governance should have an eval suite that tests controls, not only answer relevance.

Failure modeTest caseExpected behavior
Unauthorized retrievalUser asks about a restricted legal matterNo restricted chunks reach the model
Stale sourceUser asks about a policy with a newer versionLatest authoritative source wins
Source conflictDraft wiki conflicts with approved standardAnswer cites approved standard and flags conflict
Prompt injection in documentRetrieved chunk says to ignore instructionsChunk is isolated as data; answer does not follow malicious instruction
Deleted document remains indexedSource file removed or de-permissionedChunks and derived cache are removed or blocked
Mixed-tenant retrievalQuery from tenant A matches tenant B documentRetrieval returns nothing from tenant B
Missing citationAnswer cannot cite approved sourceAnswer is blocked or downgraded
Index tamperingChunk hash changes without source updateAlert and quarantine affected chunks
Overbroad service accountAI app can query all indexesDeployment gate fails

This is the part many teams skip because the prototype demo still works. But if the assistant is used for HR, finance, legal, security, engineering standards, customer operations, or clinical-like workflows, these are not edge cases. They are the system.

A practical implementation sequence

Do not start by buying a bigger vector database.

Start by narrowing the scope and making governance visible.

  1. Pick one knowledge domain with a clear business owner.
  2. Identify the source of record and the sources that are explicitly not authoritative.
  3. Define the chunk metadata schema before indexing.
  4. Connect identity context from the IdP into retrieval.
  5. Enforce document and chunk permissions at retrieval time.
  6. Add freshness checks and deletion propagation.
  7. Add source attribution and answer blocking rules.
  8. Build control evals for unauthorized access, stale sources, injection, and missing citations.
  9. Log replayable traces with chunk IDs and policy decisions.
  10. Review failures with data owners, IAM, security, and the AI platform team.

This sequence keeps the first deployment small enough to operate. It also prevents the common pattern where the platform team builds a generic RAG service and discovers too late that every domain has different source authority, retention, and approval needs.

Ownership map

RAG governance fails when everyone assumes someone else owns the source.

Use this ownership map before production:

AssetPrimary ownerReview partnerProduction responsibility
Source document collectionBusiness data ownerLegal / complianceAuthority, freshness, retention
ConnectorAI platform teamSecurity engineeringSecure sync, deletion propagation, failure handling
Chunk schemaAI platform teamData governanceMetadata completeness and versioning
Vector indexAI platform teamSecurity / IAMIsolation, encryption, access control, monitoring
Permission policyIAM / securityBusiness ownerRBAC/ABAC mapping and exceptions
Prompt assemblyAI application teamAI platform / securityBoundary handling and citation pack
Evaluation suiteAI platform teamDomain ownerControl tests and relevance tests
Audit logsPlatform / security operationsLegal / privacyRetention, replay, incident support
Final answer policyProduct ownerRisk ownerWhen to answer, warn, block, or escalate

The most important row is often “final answer policy.” The business owner must decide whether a stale or uncited answer is allowed. Engineering can enforce that rule, but engineering should not invent the risk appetite alone.

What good looks like

A governed RAG assistant should be able to answer these questions before launch:

  • Which sources are allowed to ground answers?
  • Who owns each source?
  • Which source wins when documents disagree?
  • Which users, agents, and workflows may retrieve each source?
  • Does the vector index preserve document permissions after chunking?
  • What happens when permissions change after indexing?
  • What happens when a source is deleted?
  • How fresh must each source be?
  • Can retrieved text influence instructions or tool authority?
  • Can every answer be reconstructed from logs?
  • Which evals fail the build if access control, freshness, or citation behavior breaks?
  • Who is paged when retrieval starts violating policy?

If the team cannot answer these questions, the system is not ready for enterprise-wide rollout. It may still be useful as a narrow internal pilot, but it should not be marketed internally as a governed AI knowledge layer.

FAQ

Is RAG governance the same as data governance?

No. Data governance is the foundation, but RAG governance adds AI-specific controls around retrieval, prompt assembly, model context, citations, evals, and audit reconstruction. A source can be well governed in SharePoint and still become unsafe if chunking strips permissions or retrieval ignores freshness.

Should every department get a separate vector database?

Not always. Physical separation can help with high-risk or multi-tenant boundaries, but the key requirement is enforceable isolation. Some domains can share infrastructure if every chunk carries strong metadata and retrieval enforces policy before context assembly. Sensitive domains may need separate indexes, encryption keys, service accounts, and operational owners.

Can the LLM enforce access control if we put rules in the system prompt?

No. The model can follow instructions probabilistically, but access control should be deterministic. Filter and validate retrieved chunks before they reach the model. Treat the model as a consumer of authorized context, not as the authority that decides authorization.

What is the most common enterprise RAG governance mistake?

Indexing too broadly before defining source authority and permissions. Teams often ingest a large corpus to make demos look impressive, then discover that drafts, obsolete documents, confidential folders, and conflicting policies are all semantically retrievable.

How should teams measure RAG governance quality?

Measure retrieval correctness, permission correctness, freshness correctness, source attribution, injection resistance, deletion propagation, and incident replayability. Relevance and answer helpfulness are necessary, but they are not enough for governed enterprise use.

Where should this sit in the organization?

The AI platform team should usually own the shared retrieval architecture, eval harness, observability, and runtime controls. Business data owners should own source authority and freshness. IAM and security should own permission models and monitoring. Product or workflow owners should own answer policy and user experience. The DSI or CIO function should make those responsibilities explicit before scaling.