AI governance for banking operations: from model validation to runtime control.

When an AI agent acts inside banking operations, governance can no longer stop at validating a model. An agent that prepares a payment, drafts a credit memo or closes a case needs controls that run at the moment of action: permissions, policy gates, approval thresholds and an audit record per case. This page sets out an operating model for governing such agents. It is written for CDOs, CROs and compliance leaders, and it is intended to be useful to a risk committee whether or not the institution ever evaluates CoreFi. It is an operating-model resource, not legal advice; obligations depend on each institution's licence, jurisdiction and system classification.

Why operations is different

Model governance reviews predictions. Operational governance controls actions.

Most institutions already run model governance: an inventory, validation reports, periodic revalidation, documentation per model. That framework was built for systems that produce a score or a prediction which a human then acts on. Agentic systems change the shape of the risk: the output is not a number on a dashboard, it is an action with a side effect on the ledger, the customer or the case file.

The difference is practical, not philosophical. A validation report tells you the model behaved acceptably on a test set last quarter. It does not stop an agent from releasing a payment outside its mandate this afternoon. For systems that act, the institution needs runtime controls that sit between the model's proposal and the operational system that executes it, plus a record of every decision that both internal record-keeping and supervisory review can rely on.

This shifts the question a risk committee should ask from "is the model good enough?" to "what is this agent allowed to do, what stops it doing anything else, and what evidence exists afterwards?" Those are the same questions the second line already asks about human operators, which is why the most workable governance model treats an agent like any other operator with a written authority. The same logic applies whether executive ownership sits with the digital function (the CDO view) or with the second line (the compliance and risk view).

For the lending-specific version of this discussion, including the questions supervisors ask about AI in credit decisioning, see the companion article AI Governance in Lending: What Regulators Actually Expect. This page generalises that operating model from lending to banking operations as a whole.

The operating model

Seven controls a risk committee can hold an agent to.

The operating model below is deliberately vendor-neutral: each control can be assessed on any platform that runs AI agents against operational systems. Together they form a lifecycle in which the agent senses a case, plans an action, is checked against policy, acts only through governed interfaces, is audited per workflow, escalates to a human where policy requires, and feeds outcomes into periodic review. In short: Sense, Plan, Check, Act, Audit, Escalate, Learn.

One lifecycle for every agent action: policy gates before any side effect, a reviewer lane where policy requires it, and one audit record per case.

01

Scoped permissions per agent role

Each agent operates under a written authority: which data it may read, which interfaces it may call, which transaction and exposure limits apply, which customer segments and jurisdictions it may touch. An agent whose scope is implicit is an agent whose scope nobody can defend in review.
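A written authority becomes easier to defend when it exists as data that can be checked, not only as a policy paragraph. The following is a minimal sketch under stated assumptions: the class names, field names, role, owner, limits and jurisdictions are all hypothetical and chosen only to mirror the scope dimensions listed above.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import FrozenSet, List


@dataclass(frozen=True)
class AgentAuthority:
    """Written authority for one agent role. All field names are illustrative."""
    role: str
    owner: str                        # named accountable person or function
    effective_from: str               # ISO date the authority takes effect
    readable_data: FrozenSet[str]
    callable_interfaces: FrozenSet[str]
    max_transaction: Decimal          # per-action monetary limit
    segments: FrozenSet[str]
    jurisdictions: FrozenSet[str]


@dataclass(frozen=True)
class ProposedAction:
    interface: str
    amount: Decimal
    segment: str
    jurisdiction: str


def out_of_scope_reasons(authority: AgentAuthority, action: ProposedAction) -> List[str]:
    """Return every way the action exceeds the authority; an empty list means in scope."""
    reasons = []
    if action.interface not in authority.callable_interfaces:
        reasons.append(f"interface '{action.interface}' not granted")
    if action.amount > authority.max_transaction:
        reasons.append(f"amount {action.amount} exceeds limit {authority.max_transaction}")
    if action.segment not in authority.segments:
        reasons.append(f"segment '{action.segment}' not granted")
    if action.jurisdiction not in authority.jurisdictions:
        reasons.append(f"jurisdiction '{action.jurisdiction}' not granted")
    return reasons


# Hypothetical authority for an agent that prepares, but never releases, payments.
PAYMENTS_PREP = AgentAuthority(
    role="payments-preparation",
    owner="Head of Payments Operations",
    effective_from="2025-01-01",
    readable_data=frozenset({"customer_profile", "payment_instruction"}),
    callable_interfaces=frozenset({"payments.prepare"}),
    max_transaction=Decimal("10000"),
    segments=frozenset({"retail"}),
    jurisdictions=frozenset({"DE", "NL"}),
)

in_scope = ProposedAction("payments.prepare", Decimal("2500"), "retail", "DE")
too_far = ProposedAction("payments.release", Decimal("25000"), "retail", "FR")
```

Calling `out_of_scope_reasons(PAYMENTS_PREP, too_far)` returns one reason per breached dimension, which is the kind of explicit scope a reviewer can read back against the written authority.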

02

Policy gates evaluated before action

Every proposed action passes through versioned policy rules before any side effect: permission checks, limits, sanctions and AML filters, output guardrails, consent and jurisdiction rules. A failed gate stops the workflow with a structured reason. Gates that run after the fact are reporting, not control.
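The ordering is the point of this control: gates run first, and the side effect happens only if every gate passes. A minimal sketch follows, assuming a hypothetical action dictionary, two illustrative gates, a placeholder sanctions entry and an invented policy version label. It is not a description of any particular platform's rule engine.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable, Dict, List, Optional

POLICY_VERSION = "policy-v12"            # hypothetical label for the versioned rule set
SANCTIONED_PARTIES = {"ACME SHELL LTD"}  # placeholder data, not a real sanctions list


@dataclass(frozen=True)
class GateResult:
    gate: str
    passed: bool
    reason: Optional[str] = None         # structured reason, set only on refusal


def limit_gate(action: Dict) -> GateResult:
    if action["amount"] > action["limit"]:
        return GateResult("limit", False, "amount exceeds the role's transaction limit")
    return GateResult("limit", True)


def sanctions_gate(action: Dict) -> GateResult:
    if action["beneficiary"].upper() in SANCTIONED_PARTIES:
        return GateResult("sanctions", False, "beneficiary matches a sanctions entry")
    return GateResult("sanctions", True)


def run_gated(action: Dict, gates: List[Callable[[Dict], GateResult]],
              execute: Callable[[Dict], None]) -> Dict:
    """Evaluate every gate before the side effect; stop at the first refusal."""
    results: List[GateResult] = []
    for gate in gates:
        result = gate(action)
        results.append(result)
        if not result.passed:
            # The workflow stops here: execute() is never called.
            return {"policy_version": POLICY_VERSION, "executed": False, "results": results}
    execute(action)
    return {"policy_version": POLICY_VERSION, "executed": True, "results": results}


ledger: List[Dict] = []                  # stands in for the operational system
gates = [limit_gate, sanctions_gate]

ok = run_gated(
    {"amount": Decimal("500"), "limit": Decimal("1000"), "beneficiary": "Jane Doe"},
    gates, ledger.append)
blocked = run_gated(
    {"amount": Decimal("500"), "limit": Decimal("1000"), "beneficiary": "Acme Shell Ltd"},
    gates, ledger.append)
```

In this sketch the refused action never reaches `ledger`, and the returned dictionary carries the policy version and the refusing gate's reason, so the refusal itself can be stored as evidence.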

03

Human-approval thresholds

Policy declares which classes of action require a person: monetary movements, high-risk classifications, filings, offboarding. The reviewer sees the same context the agent saw and can approve, reject or edit. Effective oversight also means that reviewers have the time and authority to override, and that the override rate is monitored.
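Routing to a reviewer can be expressed as a small, testable rule. The sketch below is illustrative only: the action-class names are hypothetical, and the monetary threshold of 5000 is an assumption added for the example (an institution may equally require review of every monetary movement). An unknown amount fails closed and goes to a person.

```python
from decimal import Decimal
from enum import Enum
from typing import Dict, Optional

# Hypothetical action classes that always require a person.
ALWAYS_REVIEW = frozenset({"high_risk_classification", "filing", "offboarding"})
# Assumed threshold for illustration; set by the institution's own policy.
MONETARY_REVIEW_THRESHOLD = Decimal("5000")


class ReviewerDecision(Enum):
    APPROVE = "approve"
    REJECT = "reject"
    EDIT = "edit"


def needs_human(action_class: str, amount: Optional[Decimal] = None) -> bool:
    """Decide whether policy routes this action to the reviewer lane."""
    if action_class in ALWAYS_REVIEW:
        return True
    if action_class == "monetary_movement":
        # Fail closed: a missing amount is treated as requiring review.
        return amount is None or amount >= MONETARY_REVIEW_THRESHOLD
    return False


def resolve(proposed: Dict, decision: ReviewerDecision,
            edited: Optional[Dict] = None) -> Optional[Dict]:
    """Return the action to execute after review, or None if it was rejected."""
    if decision is ReviewerDecision.APPROVE:
        return proposed
    if decision is ReviewerDecision.EDIT:
        if edited is None:
            raise ValueError("an EDIT decision must supply the edited action")
        return edited
    return None
```

Keeping the three reviewer outcomes explicit makes rejections and edits countable later, which is what monitoring the override rate depends on.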

04

Model and prompt versioning

Every model and prompt version that ever acted in production is registered, with its deployment window and intended scope. Each case links back to the version that produced it, so a question about a past decision can be answered against the system as it was, not as it is.
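One way to picture the registry is as a lookup from a case to the version record that was live when the case was decided. This is a sketch under assumptions: the record fields, version identifiers, model and prompt names and dates are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass(frozen=True)
class VersionRecord:
    version_id: str
    model: str
    prompt: str
    intended_scope: str
    deployed_from: datetime
    deployed_until: Optional[datetime] = None   # None means still in production

    def was_live_at(self, when: datetime) -> bool:
        """True if 'when' falls inside this version's deployment window."""
        return self.deployed_from <= when and (
            self.deployed_until is None or when < self.deployed_until)


def version_for_case(registry: Dict[str, VersionRecord], case: Dict) -> VersionRecord:
    """Resolve the version a case links to and check the link is plausible."""
    record = registry[case["version_id"]]       # KeyError if never registered
    if not record.was_live_at(case["decided_at"]):
        raise ValueError(
            f"case {case['case_id']} cites {record.version_id} outside its deployment window")
    return record


UTC = timezone.utc
# Hypothetical registry entries.
REGISTRY = {
    "v1": VersionRecord("v1", "model-2024-10", "credit-memo-prompt-3", "SME credit memos",
                        datetime(2024, 10, 1, tzinfo=UTC), datetime(2025, 2, 1, tzinfo=UTC)),
    "v2": VersionRecord("v2", "model-2025-01", "credit-memo-prompt-4", "SME credit memos",
                        datetime(2025, 2, 1, tzinfo=UTC)),
}

past_case = {"case_id": "C-1001", "version_id": "v1",
             "decided_at": datetime(2024, 12, 15, tzinfo=UTC)}
```

Here `version_for_case(REGISTRY, past_case)` returns the retired `v1` record, so a question about that case is answered against the system as it was at the time.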

05

Immutable per-workflow audit records

One append-only record per workflow: trigger, retrieved data, plan, policy decisions, actions taken, escalations, human decisions and final state. Re-running the model later is not an audit trail; the original record is.
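A common way to make alteration of an append-only record detectable is to chain each entry to the hash of the one before it. The sketch below illustrates that technique only. The class and field names are hypothetical, and hash chaining by itself detects tampering rather than preventing it, so a production design would typically pair it with write-once storage or an external anchor.

```python
import copy
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64   # prev_hash value used by the first entry


def _digest(body: Dict[str, Any]) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class WorkflowAuditRecord:
    """One append-only, hash-chained record per workflow (illustrative)."""

    def __init__(self, workflow_id: str) -> None:
        self.workflow_id = workflow_id
        self._entries: List[Dict[str, Any]] = []

    def append(self, kind: str, payload: Dict[str, Any]) -> str:
        """Add an entry; payload must be JSON-serialisable. Returns the entry hash."""
        body = {
            "workflow_id": self.workflow_id,
            "seq": len(self._entries),
            "kind": kind,   # e.g. trigger, plan, policy_decision, action, escalation
            "payload": copy.deepcopy(payload),
            "prev_hash": self._entries[-1]["hash"] if self._entries else GENESIS,
        }
        entry = dict(body, hash=_digest(body))
        self._entries.append(entry)
        return entry["hash"]

    def export(self) -> List[Dict[str, Any]]:
        """Return a copy for reviewers; the stored entries are not exposed."""
        return copy.deepcopy(self._entries)

    def verify(self) -> bool:
        """Recompute the chain; any edited, removed or reordered entry breaks it."""
        prev = GENESIS
        for seq, entry in enumerate(self._entries):
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["seq"] != seq or entry["prev_hash"] != prev or _digest(body) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True


record = WorkflowAuditRecord("wf-0001")
record.append("trigger", {"source": "payment_instruction"})
record.append("policy_decision", {"gate": "limit", "passed": True})
record.append("action", {"interface": "payments.prepare", "amount": "500.00"})
```

If any stored entry is changed after the fact, `record.verify()` returns `False`, which is one form of the "cannot be silently altered" evidence a reviewer may ask for.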

06

Escalation paths

Pre-defined criteria for what gets escalated, to whom, and what pauses while it is open: gate refusals, threshold breaches, anomalous patterns, suspected model error. Vague language is not a control; named owners and documented criteria are.
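Escalation criteria can be held as a table of trigger, named owner and pause scope, so that an undocumented trigger fails loudly instead of being handled ad hoc. The sketch below assumes hypothetical owner titles and pause scopes; the four trigger names simply mirror the criteria listed above.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class EscalationRule:
    trigger: str
    owner: str    # named owner who is notified (illustrative titles)
    pauses: str   # what stays paused while the escalation is open


RULES: Dict[str, EscalationRule] = {
    rule.trigger: rule
    for rule in (
        EscalationRule("gate_refusal", "Operations Team Lead", "case"),
        EscalationRule("threshold_breach", "Second-Line Risk Officer", "case"),
        EscalationRule("anomalous_pattern", "Second-Line Risk Officer", "agent_role"),
        EscalationRule("suspected_model_error", "Model Owner", "agent_role"),
    )
}


def open_escalation(trigger: str, case_id: str) -> Dict[str, str]:
    """Open an escalation using the documented rule for this trigger."""
    rule = RULES.get(trigger)
    if rule is None:
        # An undocumented trigger is a gap in the control, not a case to improvise.
        raise LookupError(f"no documented escalation rule for trigger '{trigger}'")
    return {
        "case_id": case_id,
        "trigger": trigger,
        "notify": rule.owner,
        "pause": rule.pauses,
        "status": "open",
    }
```

The returned dictionary records who was notified and what was paused, which lines up with the case evidence reviewers tend to request for this control.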

07

Periodic review

A recurring first-line and second-line review of how the agents actually behaved: override and exception rates, gate-refusal trends, drift in case outcomes, incidents and near-misses, and changes to scope or thresholds. The output feeds the risk committee and closes the loop back into control design.
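The headline rates for such a review can be computed directly from case records. The following sketch assumes a hypothetical case shape with two fields, `gate_refused` and `human_decision`, and defines the override rate as the share of human-reviewed cases that the reviewer rejected or edited; an institution may define these measures differently.

```python
from typing import Dict, List


def _rate(part: int, whole: int) -> float:
    """Safe ratio: returns 0.0 when there is nothing to divide by."""
    return part / whole if whole else 0.0


def review_metrics(cases: List[Dict]) -> Dict[str, float]:
    """Summarise one review period from per-case records (illustrative fields)."""
    reviewed = [c for c in cases if c["human_decision"] is not None]
    overridden = [c for c in reviewed if c["human_decision"] in ("reject", "edit")]
    refused = [c for c in cases if c["gate_refused"]]
    return {
        "cases": float(len(cases)),
        "gate_refusal_rate": _rate(len(refused), len(cases)),
        "human_review_rate": _rate(len(reviewed), len(cases)),
        "override_rate": _rate(len(overridden), len(reviewed)),
    }


# Invented sample period: four cases, one gate refusal, two human reviews.
sample_period = [
    {"gate_refused": False, "human_decision": "approve"},
    {"gate_refused": False, "human_decision": "reject"},
    {"gate_refused": True, "human_decision": None},
    {"gate_refused": False, "human_decision": None},
]

metrics = review_metrics(sample_period)
```

Comparing these figures period over period gives the trend view; the numbers alone do not say whether a rate is acceptable, which remains a judgement for the review itself.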

The regulatory frame

What frameworks expect, at the operating-model level.

No single rulebook covers "AI agents in banking operations." Several existing frameworks converge on the same expectations, and a risk committee can use them as the test of whether the operating model above is sufficient. The summaries below are high-level orientation, not legal interpretation; how each framework applies is the institution's determination with its own advisers.

EU AI Act

The EU AI Act takes a risk-based approach: obligations scale with the classification of the AI system, and certain uses in financial services, such as creditworthiness assessment of natural persons, are classified high-risk. For high-risk systems the regulation splits obligations between the provider of the system and the institution deploying it: risk management, logging capability and post-market monitoring sit primarily with the provider, while the deploying institution must assign and operate human oversight, use the system within its intended purpose, monitor its operation and retain the logs under its control. The lending-specific picture is covered in the companion article.

DORA and EBA outsourcing guidelines

Under DORA, which since January 2025 carries the EU's ICT risk-management expectations for financial entities, and the EBA's guidelines on outsourcing arrangements, institutions are expected to evidence control over the ICT systems and third-party providers their operations depend on: access management, logging and monitoring, due diligence, documented responsibility allocation and a credible exit strategy. An AI platform run by a vendor sits squarely inside that perimeter; the resilience angle is covered in depth in DORA, AI governance and operational resilience.

Model risk expectations

Supervisory model-risk frameworks generally expect an inventory of models in use, independent validation, ongoing performance monitoring and documented change control. Agentic systems extend each of these from the model to the action: the inventory must cover agent roles and scopes, and monitoring must cover what the agent did, not only what it predicted.

The common thread: responsibility for operating the controls stays with the institution, even where a framework such as the EU AI Act also places obligations on the provider. A platform can supply the control surfaces and the evidence; the institution remains responsible for operating them and for its own compliance. Platform-level security posture is summarised on Security & Compliance.

Control to evidence

What a reviewer asks for, control by control.

Internal audit, second-line review and supervisory inspection differ in depth but rhyme in shape: for each control, the reviewer wants the artefact that proves it operates. The table below maps each control in the operating model to the evidence typically requested. If a control has no producible artefact, treat it as not yet implemented.

Governance control	Evidence a reviewer asks for
Scoped permissions per agent role	The written authority per agent: role definition, data and interface scopes, transaction and exposure limits, segment and jurisdiction restrictions, with effective dates and an owner.
Policy gates before action	The versioned rule set with change history, evaluation statistics, and sample refusal records showing a gate stopping an out-of-policy action with a structured reason.
Human-approval thresholds	The threshold policy; reviewer-queue records with reviewer identity, timestamp and decision; override and rejection rates over time, demonstrating reviewers actually overturn the agent.
Model and prompt versioning	The registry of model and prompt versions with deployment windows and intended scope, and the linkage from any selected case to the version that produced it.
Immutable per-workflow audit records	Exported records for sampled cases covering trigger, retrieved data, plan, policy decisions, actions, escalations and human decisions, plus evidence the record cannot be silently altered after write.
Escalation paths	Documented escalation criteria with named owners, and case evidence that escalations fire in practice: who was notified, what was paused, who resumed it and on what basis.
Periodic review	The review calendar; override, exception and gate-refusal trend reports; minutes and resulting actions, showing the review changes scopes, thresholds or policies when the data says it should.
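The rule of thumb above, that a control without a producible artefact counts as not yet implemented, can be run as a simple gap check before a review. This is a minimal sketch: the control names follow the table, while the artefact file names are invented placeholders.

```python
from typing import Dict, List

# The seven controls, named as in the table above.
CONTROLS = (
    "Scoped permissions per agent role",
    "Policy gates before action",
    "Human-approval thresholds",
    "Model and prompt versioning",
    "Immutable per-workflow audit records",
    "Escalation paths",
    "Periodic review",
)


def unevidenced_controls(artefacts: Dict[str, List[str]]) -> List[str]:
    """Return controls with no producible artefact, in table order."""
    return [control for control in CONTROLS if not artefacts.get(control)]


# Hypothetical evidence inventory: two controls have nothing to show yet.
inventory = {
    "Scoped permissions per agent role": ["authority-payments-prep.pdf"],
    "Policy gates before action": ["ruleset-changelog.csv", "refusal-samples.json"],
    "Human-approval thresholds": ["threshold-policy.pdf"],
    "Model and prompt versioning": ["version-registry-export.csv"],
    "Immutable per-workflow audit records": ["sampled-cases-export.json"],
    "Escalation paths": [],
}

gaps = unevidenced_controls(inventory)
```

In this example `gaps` lists "Escalation paths" and "Periodic review": one has an empty artefact list and the other is missing from the inventory altogether, and both are treated the same way.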

Where CoreFi fits

The operating model, implemented as a control plane.

CoreFi ships the seven controls above as the working machinery of its AI workflow control plane: every agent runs the Sense, Plan, Check, Act, Audit, Escalate, Learn lifecycle, with scoped permissions, policy gates ahead of any side effect, human-approval thresholds, model and prompt versioning and one append-only audit record per workflow. The platform runs in production today across 20+ deployments and 6 geographies, serving 200k+ end-customer accounts at 99.9% platform uptime against operational SLOs.

The boundary matters as much as the capability. CoreFi is the platform provider; the customer remains the regulated entity, holds the licence and remains responsible for its own compliance. The platform is designed to support the governance obligations described on this page; it does not make an institution compliant, and nothing here is legal advice. Control documentation and audit-record samples are available on request, and the platform's trust posture, responsibility model and evidence pack are described on the Trust Center.

Take the operating model to your risk committee. Then test it against a live platform.

A governance walkthrough takes your control framework through the seven controls on a running workflow: the scoped permissions, the policy gates, a reviewer approval, and the audit record the case leaves behind.

Continue in the Knowledge Hub

Related resources for risk and governance leaders.

DORA, AI governance and operational resilience

How AI platforms sit inside the ICT third-party perimeter, and what resilience reviewers ask of them.

A board guide to agentic AI in core banking

The questions a board should ask before agents act inside operations, in board-level language.