Quick Read

AI systems present distinctive risks—including learned biases, unpredictable emergent behaviour, and sociotechnical vulnerabilities—that traditional software risk frameworks cannot adequately address, making AI-specific risk management essential. ISO 42001 requires organisations to establish a documented risk management process tailored to AI, including systematic identification of technical, data, operational, and supply chain risks, with risk criteria and prioritisation determined by the organisation's own risk appetite and tolerance. The standard does not mandate a specific methodology, allowing organisations to apply qualitative, semi-quantitative, or quantitative approaches suited to their AI activities, provided the process is consistent and documented.

Executive Summary

Risk management is the operational core of any AI management system. ISO/IEC 42001:2023 requires organisations to conduct AI risk assessments, implement risk treatment measures, and monitor the effectiveness of those measures over time. The NIST AI Risk Management Framework (AI RMF 1.0) provides complementary guidance on how to structure these activities around four functions: GOVERN, MAP, MEASURE and MANAGE. This whitepaper explains how these two frameworks work together, introduces a practical approach to building an AI risk register, and addresses the distinctive features of AI risk that make conventional risk management approaches insufficient.

Why AI Risk Is Different

AI systems pose risks that are qualitatively different from those associated with conventional software. Understanding these differences is essential before any risk management approach can be effectively designed. The NIST AI RMF identifies several distinctive characteristics of AI risk that traditional risk frameworks fail to capture adequately.

First, AI systems learn from data. This means their behaviour is not entirely determined by explicit programming but by patterns in training data that may be biased, incomplete, or no longer representative of real-world conditions. A model trained on historical credit data, for example, may encode and amplify historical patterns of discrimination against particular demographic groups — even when the developers had no discriminatory intent.

Second, AI systems can exhibit emergent behaviour. Complex models may behave in ways that are difficult to predict from their design specifications, particularly when deployed in environments that differ from their training conditions. This makes AI risk inherently harder to assess than traditional software risk, where behaviour is generally a direct function of code.

Third, AI risks are sociotechnical. They arise from the interaction of technical systems with human behaviour, social contexts, and institutional structures. A technically sound AI system deployed in an organisational context with poor human oversight, inadequate training, or misaligned incentives may cause significant harm regardless of its technical quality.

Fourth, AI risks can be systemic and interconnected. When AI systems are used at scale across industries — as in financial services, healthcare or hiring — their failures can have systemic effects that individual risk assessments at the organisational level may not capture.

NIST on AI Risk Uniqueness

The NIST AI RMF notes that AI risks differ from traditional software risks in their potential for amplifying existing inequities, their inherent sociotechnical nature, and the difficulty of detecting failures. Risks can emerge from the interplay of technical design, social context, and the behaviour of human actors who use, oversee, or are affected by the AI system. Traditional software risk frameworks that focus on technical defects and system failures are insufficient for this broader landscape.

The ISO 42001 Risk Framework

ISO 42001 places risk management at the centre of the AI Management System. Clause 6 requires organisations to establish a risk management process that is AI-specific, and Clause 8 requires that process to be operationally executed and its results documented.

AI Risk Assessment (Clauses 6.1.2 and 8.2)

The AI risk assessment requires organisations to define AI-specific risk criteria — the organisation’s risk appetite and tolerance for AI-related harms — and to systematically identify sources of risk across the AI systems in scope. These risk sources include technical failures, data quality problems, model drift, inadequate human oversight, misuse by operators or end users, adverse impacts on individuals or communities, and supply chain dependencies.

ISO 42001 does not prescribe a specific risk methodology. Organisations may use qualitative, semi-quantitative or quantitative approaches, provided that the methodology is consistent, documented, and appropriate to the scale and nature of their AI activities. Risk likelihood and impact must be evaluated, and risks must be prioritised for treatment based on the organisation’s defined criteria.
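Because the standard leaves the methodology open, the sketch below is purely illustrative: a minimal semi-quantitative approach in which likelihood and impact are rated on 1 to 5 scales and their product is compared against prioritisation thresholds. The class name, scales, and thresholds are all assumptions standing in for the organisation's own documented risk criteria, not anything mandated by ISO 42001.

```python
from dataclasses import dataclass

@dataclass
class AIRiskScore:
    """Illustrative semi-quantitative risk score for one identified AI risk."""
    risk_id: str
    likelihood: int  # 1 (rare) .. 5 (almost certain)
    impact: int      # 1 (negligible) .. 5 (severe)

    @property
    def score(self) -> int:
        return self.likelihood * self.impact

    @property
    def priority(self) -> str:
        # Thresholds are placeholders; Clause 6.1.2 requires the organisation
        # to define and document its own risk criteria and prioritisation.
        if self.score >= 15:
            return "treat immediately"
        if self.score >= 8:
            return "treat, scheduled"
        return "accept / monitor"

print(AIRiskScore("R-001", likelihood=4, impact=5).priority)  # treat immediately
```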

AI System Impact Assessment (Clauses 6.1.4 and 8.4)

Distinct from the AI risk assessment, the AI system impact assessment (ASIA) focuses on the potential effects of specific AI systems on people, society and the environment. It must consider impacts across the AI system lifecycle — from design through deployment, use, monitoring and decommissioning. The ASIA is prospective (conducted before deployment or significant modification) as well as periodic (reassessed as context changes).

The ASIA process addresses questions such as: What populations could be affected by this AI system’s outputs? How might the system affect individuals’ rights, opportunities or wellbeing? What are the potential environmental impacts of training and operating the system? What are the consequences if the system makes errors, is misused, or is compromised?

AI Risk Treatment (Clauses 6.1.3 and 8.3)

Having assessed and prioritised risks, organisations must select and implement risk treatment options. ISO 42001 provides four categories of treatment consistent with ISO 31000: modify (implement controls to reduce likelihood or impact), avoid (decide not to develop or deploy the AI system), transfer (share the risk through contracts or insurance), and accept (retain the risk within the organisation’s risk appetite).

In practice, most AI risks will be treated through modification — implementing controls drawn from Annex A of ISO 42001. These controls address the specific characteristics of AI risk: data quality, model governance, human oversight, transparency, and supplier management.

The NIST AI RMF: A Complementary Lens

The NIST AI Risk Management Framework organises AI risk management activities around four core functions. These functions are not sequential steps but overlapping, iterative activities that should be embedded throughout the AI lifecycle.

GOVERN

The GOVERN function establishes the organisational structures, cultures, and policies that enable effective AI risk management. It addresses leadership commitment, organisational accountability, policies and processes, team culture, and supply chain risk management. GOVERN applies at the organisational level and is designed to enable the other three functions. It maps closely onto ISO 42001’s Clauses 4 (Context), 5 (Leadership), and 7 (Support).

MAP

The MAP function involves identifying and understanding the context in which AI systems are used, the risks they present, and the potential impacts on affected populations. It requires organisations to categorise AI systems by risk level, identify the relevant stakeholders and use cases, and understand the sociotechnical environment in which the system will operate. MAP corresponds closely to ISO 42001’s Clauses 6.1 (risk identification) and 6.1.4 (AI system impact assessment).

MEASURE

The MEASURE function develops methods and metrics to assess, analyse, and track AI risks and their associated impacts. This includes technical testing and evaluation, bias and fairness assessment, red-teaming, explainability methods, and ongoing monitoring during deployment. MEASURE corresponds to ISO 42001’s Clause 9 (Performance Evaluation) and to many of the Annex A controls related to AI system testing, monitoring and documentation.

MANAGE

The MANAGE function implements risk treatment plans and monitors their effectiveness. It involves prioritising identified risks for response, allocating resources for treatment, monitoring risk treatment outcomes, and maintaining documentation of risk management decisions. MANAGE corresponds to ISO 42001’s Clauses 6.1.3 (risk treatment), 8 (Operation), and 10 (Improvement).

Mapping the Frameworks: Where They Align

The function descriptions that follow in the next section show how the NIST AI RMF aligns with ISO 42001's clauses at the framework level. The table below takes the mapping to an operational level: it catalogues common AI risk topics, with typical examples, root causes, and the ISO 42001 clauses and Annex A controls that address each, so organisations can use both frameworks efficiently without duplicating effort.

Risk Topic | Examples | Typical Root Cause | ISO 42001 Link
Data Quality | Biased training data, missing data, stale data | Inadequate data governance pre-training | Annex A.4, Cl. 6.1.2
Model Performance | Degraded accuracy, distributional shift, model drift | Deployment environment differs from training | Annex A.6, Cl. 9.1
Human Oversight | Over-reliance on AI, automation bias | Insufficient human-in-the-loop design | Annex A.7, Cl. 6.1.4
Transparency | Inability to explain AI decisions to users | Black-box model architecture | Annex A.8, A.9
Fairness & Bias | Discriminatory outcomes for protected groups | Biased training data or proxy variables | Annex A.5, Cl. 6.1.4
Supply Chain | Third-party AI model failures, vendor lock-in | Inadequate supplier assessment | Annex A.10, Cl. 8
Security | Adversarial attacks, data poisoning, model inversion | Insufficient security by design | Annex A.6, Cl. 6.1.2
Shadow AI | Unvetted AI tools deployed outside governance; employees using unauthorised LLM services with organisational data | No AI use case intake process; absent acceptable use policy | Cl. 4.3, 5.2, Annex A.6
Legal/Regulatory | Non-compliance with EU AI Act, data protection law | Inadequate compliance monitoring | Cl. 4.2, 4.3, 9.1

Building an AI Risk Register

The AI risk register is the central operational tool of an AI risk management process. It documents identified risks, their assessed likelihood and impact, assigned owners, treatment decisions, and current status. An effective AI risk register has several characteristics that distinguish it from conventional IT risk registers.

It is AI-system-specific rather than domain-generic. Each AI system in scope should have its own risk profile, reflecting the unique characteristics of that system — its training data, use case, deployment environment, affected populations, and integration with other systems.

It is dynamic rather than static. AI risk registers must be updated as models are retrained, deployment contexts change, new regulations are enacted, or incidents occur. A static risk register that is updated only at annual review cycles is inadequate for a technology that evolves continuously.

It links to controls. Each risk entry should reference the specific ISO 42001 Annex A controls that have been implemented to treat the risk, with evidence that those controls are operating effectively. This linkage is what transforms a risk register from a documentation exercise into a live governance tool.
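What such a control-linked register entry might look like in practice is sketched below. The field names, status values, and the example Annex A references are illustrative assumptions, not a structure prescribed by ISO 42001; the point is that each risk carries an owner, a treatment decision, and explicit links to controls and their evidence.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class RiskRegisterEntry:
    """One AI-system-specific risk, linked to the controls that treat it."""
    risk_id: str
    ai_system: str                  # the specific AI system this risk belongs to
    description: str
    likelihood: str                 # e.g. "low" / "medium" / "high"
    impact: str
    owner: str                      # accountable risk owner
    treatment: str                  # modify / avoid / transfer / accept
    annex_a_controls: list[str] = field(default_factory=list)
    control_evidence: list[str] = field(default_factory=list)  # links to evidence records
    last_reviewed: date = field(default_factory=date.today)
    status: str = "open"

# Illustrative entry; control references follow the risk-topic table above.
entry = RiskRegisterEntry(
    risk_id="CR-2024-003",
    ai_system="credit-scoring-v2",
    description="Training data under-represents recent applicant population",
    likelihood="medium",
    impact="high",
    owner="Head of Model Risk",
    treatment="modify",
    annex_a_controls=["A.4 data quality", "A.7 human oversight"],
    control_evidence=["data-quality-report-2024Q4.pdf"],
)
```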

Practical Tip: Starting Small

Organisations new to AI risk management often try to build a comprehensive risk register covering all AI systems simultaneously — a project that can stall under its own weight. A more effective approach is to begin with the two or three AI systems that present the greatest potential impact, build a thorough risk register for those systems, and use that experience to develop templates and processes that can be rolled out progressively to lower-impact systems.

Shadow AI: The Hidden Risk Register Entry

One of the most frequently missed entries in AI risk registers is Shadow AI — AI systems and tools deployed within the organisation without governance oversight. Shadow AI proliferates when business units, individual employees, or vendors integrate AI capabilities outside the organisation’s sanctioned procurement and governance processes. Common examples include employees using external generative AI services to process proprietary documents, teams building model-driven automation on platforms that have not undergone AI system assessment, and software vendors quietly embedding AI features into existing tools without disclosure.

Shadow AI is a governance failure rather than a technical one. Its root cause is typically the absence of an AI use case intake process — a mechanism through which proposed AI deployments are reviewed, assessed, and either approved or declined before going live. ISO 42001 Clause 4.3 (scope of the AIMS) and Clause 5.2 (AI policy) both implicitly require that the organisation have some control over which AI systems fall within its governance boundary. Without an intake process, that boundary has holes that grow with every unsanctioned deployment.

Why Shadow AI Belongs in Your Risk Register

Shadow AI entries in the risk register should capture: the likelihood that employees are already using unsanctioned AI tools (high in most organisations); the potential impact of proprietary or personal data being processed by unvetted third-party AI services; the control gap created by the absence of an AI use case intake process; and the Annex A.6 and Clause 6.1.2 obligations that apply once a system is identified. Risk treatment typically involves both preventive controls (acceptable use policy, procurement requirements) and detective controls (technology scanning, shadow IT discovery).
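As a sketch of the detective side, the snippet below scans an exported outbound proxy log for destinations on a watchlist of unsanctioned AI service domains. The domain list, log format, and column names are assumptions and would need to reflect the organisation's actual proxy or CASB tooling; it illustrates the idea of turning discovery into risk register entries rather than a specific product integration.

```python
import csv

# Hypothetical watchlist of external AI services the organisation has not
# sanctioned; maintained alongside the acceptable use policy.
UNSANCTIONED_AI_DOMAINS = {"chat.example-llm.com", "api.example-genai.io"}

def find_shadow_ai_usage(proxy_log_csv: str) -> list[dict]:
    """Return log rows whose destination matches an unsanctioned AI domain.

    Assumes a CSV export with 'user', 'destination_host' and 'timestamp'
    columns; adjust to the real export format in use.
    """
    hits = []
    with open(proxy_log_csv, newline="") as f:
        for row in csv.DictReader(f):
            if row.get("destination_host", "").lower() in UNSANCTIONED_AI_DOMAINS:
                hits.append(row)
    return hits

# Each hit becomes a candidate Shadow AI entry for the risk register and a
# trigger for the AI use case intake process.
```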

Risk Proportionality: Triage Before You Treat

Not every AI use case requires the same governance investment. Applying uniform, intensive controls to every AI system regardless of its risk profile wastes resources, creates delays, and can paradoxically reduce governance effectiveness by overloading the teams responsible for it. ISO 42001’s risk-based approach under Clause 6.1 is explicitly proportionate: controls should be selected and calibrated to the level of risk presented by each specific AI system.

In practice, this means triaging AI use cases before determining the governance treatment they require. A simple, high-impact triage process asks three questions for each AI system in scope: What is the consequence of failure? Who is affected if the system performs incorrectly or is compromised? Is the system making or informing decisions with significant individual or organisational impact? Systems that score high on all three warrant the full weight of Annex A controls, human oversight requirements, and formal review cycles. Systems that score low can be governed through lighter-touch controls, periodic sampling, and simplified documentation.

The practical output of risk triage is a tiered governance model: Tier 1 systems (highest risk) receive intensive governance treatment including mandatory AI system impact assessments, formal deployment approval gates, ongoing monitoring, and periodic third-party review. Tier 2 systems receive standard treatment aligned to Annex A defaults. Tier 3 systems (lowest risk) are governed through baseline acceptable use, logging, and periodic spot-check. This tiered approach ensures that governance investment concentrates where it matters most, rather than being diluted uniformly across all AI deployments.
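A minimal sketch of how the three-question triage might be encoded is shown below. The yes/no inputs and the mapping from answers to tiers are illustrative defaults only; an organisation applying this would calibrate the mapping to its own risk appetite and document it under Clause 6.1.

```python
def triage_tier(severe_consequence: bool,
                affects_individuals: bool,
                significant_decisions: bool) -> int:
    """Map the three triage questions to a governance tier (1 = highest risk).

    The answer-to-tier mapping below is an assumed default, not a standard.
    """
    yes_count = sum([severe_consequence, affects_individuals, significant_decisions])
    if yes_count == 3:
        return 1   # intensive governance: ASIA, approval gates, monitoring, external review
    if yes_count >= 1:
        return 2   # standard treatment aligned to Annex A defaults
    return 3       # baseline acceptable use, logging, periodic spot-checks

print(triage_tier(True, True, False))  # -> 2
```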

Using Open Frameworks to Populate Your AI Risk Register

One of the practical challenges organisations face when building an AI risk register is knowing where to start: what are the specific threats to AI systems that should inform risk identification? Established, publicly available threat frameworks can provide a systematic starting point, ensuring that risk registers capture the full landscape of AI-specific threats rather than only those that practitioners happen to be aware of.

SAFE-AI, published by the MITRE Corporation in April 2025 and approved for public release and unlimited distribution, is one of the most practical and comprehensive such resources currently available. Building on MITRE’s ATLAS™ adversarial AI threat taxonomy and harmonised with both the NIST Risk Management Framework and the NIST AI RMF, SAFE-AI provides a structured mapping of AI threats to security controls that organisations can use directly to inform both risk identification and control selection.

How SAFE-AI Is Structured

SAFE-AI organises AI threats across four system elements that together describe a complete AI-enabled system: the Environment (infrastructure and network), the AI Platform (components and software), the AI Model (ML models and LLMs), and AI Data (training and validation data). For each threat in its catalogue, SAFE-AI identifies which system elements are affected and maps relevant NIST SP 800-53 Rev 5 controls to each element — giving organisations both a threat description and a ready-made control shortlist.

The SAFE-AI threat catalogue spans the most significant adversarial and non-adversarial AI risk categories: model poisoning (adversaries modifying training data or parameters to embed malicious behaviour); supply chain infiltration through unvetted open-source models, tools, or training data of unclear provenance; model exposure and intellectual property theft via inference APIs; sensitive information disclosure through prompt engineering or training data memorisation; insecure APIs enabling adversarial manipulation of model inputs; configuration errors exposing AI components to attack; data poisoning embedding backdoors into AI system behaviour; and adversarial input attacks designed to cause incorrect or harmful model outputs.
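To use a catalogue of this kind to seed a risk register, organisations can encode each threat together with the system elements it affects and the candidate controls mapped to it, then filter by the elements a given system actually uses. The encoding below is an assumed illustration: the element names and threat categories follow the descriptions above, but the control references are placeholder strings, not SAFE-AI's actual mappings.

```python
# Illustrative threat-to-element-to-control encoding in the style of SAFE-AI.
AI_SYSTEM_ELEMENTS = ("Environment", "AI Platform", "AI Model", "AI Data")

threat_catalogue = [
    {
        "threat": "Data poisoning embedding backdoors into system behaviour",
        "affected_elements": ["AI Data", "AI Model"],
        "candidate_controls": ["<control-id>", "<control-id>"],  # placeholders
    },
    {
        "threat": "Model exposure and IP theft via inference APIs",
        "affected_elements": ["AI Platform", "AI Model"],
        "candidate_controls": ["<control-id>"],
    },
]

def relevant_threats(catalogue: list[dict], elements_in_use: set[str]) -> list[dict]:
    """Return the catalogue threats that touch the elements a system uses."""
    return [t for t in catalogue
            if set(t["affected_elements"]) & elements_in_use]

# Example: a hosted LLM integration that holds no training data of its own.
print(len(relevant_threats(threat_catalogue, {"AI Platform", "AI Model"})))  # 2
```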

What the Framework Contains

SAFE-AI’s Appendix C provides a detailed threat-and-concern table with residual risk notes and MITRE ATLAS™ identifiers for cross-referencing. Appendix D lists 100 NIST SP 800-53 controls identified as AI-affected, mapped by system element — giving implementers a prioritised control baseline for AI-enabled systems. Appendix E goes further: it provides assessment interview question-and-answer sets that auditors can use when conducting security control assessments of AI systems, covering each control across all four system elements. This represents a ready-made audit toolkit for AI security assessment that would otherwise take significant time to develop from scratch.

SAFE-AI: A Free Threat-to-Control Mapping Resource

SAFE-AI (MITRE Corporation, April 2025, Publication MP250397) is approved for public release and available for unlimited distribution at no cost. It provides: a structured AI threat catalogue covering 20+ threat categories; 100 AI-relevant NIST SP 800-53 security controls mapped to four AI system elements; and assessment interview Q&A sets for auditors evaluating AI-enabled systems. While its control references are drawn from the US federal NIST SP 800-53 framework, the threat taxonomy and control logic apply directly to ISO 42001 Annex A controls and any organisation’s AI risk register, regardless of jurisdiction. It is one of the most useful free resources currently available for AI threat-informed risk assessment.

Organisations implementing ISO 42001 can use SAFE-AI as a threat reference when conducting AI risk assessments required by Clauses 6.1.2 and 8.2. The threat catalogue ensures that AI-specific attack vectors — many of which conventional IT risk frameworks miss entirely — are systematically considered. The control mappings provide a bridge between threat identification and Annex A control selection, shortcutting what would otherwise require significant independent research for each AI system assessed. The assessment Q&A sets in Appendix E are also directly relevant to ISO 42001 internal audit and certification audit preparation.

The Databricks AI Security Framework (DASF)

A complementary free resource is the Databricks AI Security Framework (DASF), which takes a component-level approach to AI risk rather than a threat-taxonomy approach. DASF maps 62 AI-specific risks across 12 AI system components — ranging from raw data and feature engineering through to model serving, inference responses, and platform operations — and pairs those risks with 64 security controls. The component-by-component structure makes DASF particularly useful when organisations are assessing the security posture of a specific AI pipeline rather than conducting a broad threat landscape survey.

DASF also provides a useful taxonomy of AI deployment models, noting that different deployment patterns (internally trained predictive models, fine-tuned LLMs, retrieval-augmented generation systems, external model API calls) carry materially different risk profiles and shared responsibility boundaries. An organisation calling a third-party foundation model API assumes different governance obligations than one training and deploying its own model on proprietary data. DASF’s risk and control structure helps organisations calibrate their governance posture to the deployment pattern they are actually using.

DASF: A Free Component-Level AI Risk Framework

The Databricks AI Security Framework (DASF) is publicly available at no cost. It catalogues 62 AI-specific risks across 12 AI system components and maps 64 security controls to those risks. Unlike threat-taxonomy frameworks, DASF works component by component through the AI pipeline — making it well suited to gap analysis of a specific AI system’s security architecture. Used alongside SAFE-AI, which provides the adversarial threat perspective, DASF gives AI governance practitioners two complementary lenses on the same risk landscape. Both frameworks are free to use and require no licensing or registration.

The Distinctive Challenge of High-Risk AI

Not all AI systems carry the same risk profile. High-risk AI systems — those that make or inform decisions with significant consequences for individuals — require more intensive risk management than systems used for routine process optimisation or administrative tasks. The EU AI Act provides a useful (if geographically scoped) taxonomy of high-risk AI, covering systems used in employment, education, credit, critical infrastructure, healthcare, law enforcement, and immigration.

For high-risk AI systems, risk management under ISO 42001 should involve more frequent risk assessment cycles, more rigorous AI system impact assessments, stronger human oversight mechanisms, more extensive testing and validation before deployment, documented audit trails for significant AI-influenced decisions, and clear escalation and override procedures. ISO 42001’s Annex A controls are designed to scale with risk — organisations implementing a risk-proportionate approach will naturally invest more heavily in controls for higher-risk systems.

Ethical Frameworks for AI Design Trade-offs

AI risk management is not only about identifying and mitigating technical failures. It also involves navigating genuine ethical trade-offs — situations where two legitimate values are in tension and a deliberate choice must be made about which to prioritise. ISO 42001’s requirements around fairness, human oversight and transparency create the governance conditions for these decisions; ethical philosophy provides the tools for making them well.

Three frameworks from moral philosophy have direct practical relevance to AI design trade-offs.

Deontological Ethics: Rules and Rights

Deontological ethics holds that certain actions are intrinsically right or wrong, regardless of their consequences. Applied to AI, a deontological approach asks: does this system violate any individual’s rights or dignity? Are there categories of AI application that are impermissible regardless of their efficiency benefits? The EU AI Act’s prohibited AI applications — social scoring, subliminal manipulation, mass biometric surveillance — reflect a deontological logic: some uses of AI are wrong in themselves, not because they produce bad outcomes in specific cases. Organisations with a strong rights-based culture will tend toward deontological constraints in their AI ethics frameworks: absolute limits that no business case can override.

Consequentialist Ethics: Outcomes and Welfare

Consequentialist ethics judges actions by their outcomes — the right action is the one that produces the greatest overall welfare. Applied to AI, a consequentialist approach asks: across all affected populations, does this system produce a net positive outcome? Are the harms to some individuals justified by greater benefits to others? Fairness and bias assessments are implicitly consequentialist: they evaluate whether the distribution of AI system outputs is acceptable across populations. Consequentialist reasoning is powerful but requires careful definition of whose welfare counts, over what timeframe, and with what weight given to concentrated harms versus diffuse benefits.

Virtue Ethics: Character and Trustworthiness

Virtue ethics asks not whether a specific action is right, but what kind of organisation one wants to be. Applied to AI, it asks: would a trustworthy, responsible organisation deploy AI in this way? Would we be comfortable if our customers, employees, and regulators could see exactly how this system works and why we built it? Virtue ethics is particularly useful for evaluating AI governance posture — whether an organisation’s approach to AI reflects a genuine commitment to responsible practice or merely the minimum required to avoid liability. It also underlies the concept of building a responsible AI culture, which is increasingly recognised as essential to sustainable governance.

Applying Multiple Frameworks

In practice, AI governance benefits from applying all three frameworks rather than choosing one. Deontological thinking sets absolute limits. Consequentialist thinking evaluates impacts across populations. Virtue ethics asks whether the organisation’s overall posture reflects genuine trustworthiness. When all three converge on an answer, organisations can proceed with confidence. When they diverge — when an AI application is consequentially beneficial but rights-concerning, for example — that divergence itself signals the need for escalation to a more senior deliberative process.

Building an Ethical Escalation Pathway

Ethical trade-offs in AI governance do not resolve themselves. Organisations need structured processes for surfacing, evaluating and deciding on ethically contested AI decisions — processes that are embedded in operations, not reserved for external ethics committees that meet quarterly. An ethical escalation pathway operationalises the governance discipline that ISO 42001’s human oversight and incident management requirements imply.

Stage 1: Flag

Any employee, operator or affected user who has a concern about an AI system’s behaviour, outputs, or impacts should have a clear and accessible channel through which to raise that concern. This channel must be genuinely safe — non-retaliation must be enforced — and it must be monitored. Flags can be triggered by specific outputs (the system produced a result that seems wrong or harmful), by patterns (the system seems to be systematically producing certain types of outputs), or by design concerns raised before deployment. ISO 42001’s Clause 7.4 (communication) and Annex A controls on human oversight provide the governance foundation for this stage.

Stage 2: Triage

Not every flag requires the same response. A triage process — conducted by an accountable risk owner — categorises flags by severity, urgency, and type. Purely technical issues (a model producing errors due to distributional shift) can be routed to the AI system team for technical remediation. Potential regulatory violations require legal assessment. Ethical concerns that do not have an obvious technical fix — where a deliberate design choice is being challenged — require escalation to the next stage. The triage process should be documented, time-bound, and have clear criteria for escalation versus resolution at the operational level.

Stage 3: Pause and Escalate

For concerns that cannot be resolved at the operational level, the organisation must have the ability and willingness to pause AI system operation or deployment pending resolution. This is culturally challenging — pausing a revenue-generating AI system has a cost, and there will be organisational pressure to continue operating while the issue is investigated. Effective AI governance requires that this pressure be resisted when the concern is material. The escalation should go to a designated AI ethics or governance function — a role or committee with the authority to direct operational changes and to escalate further to leadership if required. ISO 42001’s management review process (Clause 9.3) and corrective action requirements (Clause 10.2) provide the formal management system basis for this stage.

Stage 4: Override or Deliberate

At the highest level of the escalation pathway, a decision must be made: override the concern (proceed with the AI system, with documented rationale for why the concern does not justify stopping), or deliberate (engage in a structured ethical deliberation process involving the relevant stakeholders and, where appropriate, external perspectives). The decision and its rationale must be documented regardless of outcome. Override decisions that are later found to have been wrong — where an organisation proceeded over legitimate ethical objections and harm resulted — represent a significant governance failure. Documented deliberation at least demonstrates that the organisation took the concern seriously and applied structured judgement to it.
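The four stages can be operationalised as a simple state model so that every flag has a recorded disposition and no concern disappears between stages. The states and transition rules below are an illustrative sketch rather than a prescribed workflow; real criteria, time bounds, and decision authorities belong in the documented triage and escalation procedure.

```python
from enum import Enum, auto

class FlagState(Enum):
    FLAGGED = auto()
    TRIAGED_TECHNICAL = auto()   # routed to the AI system team for remediation
    TRIAGED_LEGAL = auto()       # routed for legal / regulatory assessment
    ESCALATED_PAUSED = auto()    # operation paused pending governance decision
    OVERRIDDEN = auto()          # proceed, with documented rationale
    DELIBERATION = auto()        # structured ethical deliberation underway
    RESOLVED = auto()

# Illustrative transition rules; any transition not listed is rejected.
ALLOWED_TRANSITIONS = {
    FlagState.FLAGGED: {FlagState.TRIAGED_TECHNICAL, FlagState.TRIAGED_LEGAL,
                        FlagState.ESCALATED_PAUSED},
    FlagState.TRIAGED_TECHNICAL: {FlagState.RESOLVED},
    FlagState.TRIAGED_LEGAL: {FlagState.RESOLVED, FlagState.ESCALATED_PAUSED},
    FlagState.ESCALATED_PAUSED: {FlagState.OVERRIDDEN, FlagState.DELIBERATION},
    FlagState.OVERRIDDEN: {FlagState.RESOLVED},
    FlagState.DELIBERATION: {FlagState.RESOLVED},
}

def transition(current: FlagState, nxt: FlagState) -> FlagState:
    """Move a flag to its next state, rejecting undocumented shortcuts."""
    if nxt not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"{current.name} -> {nxt.name} is not a permitted transition")
    return nxt
```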

Why Escalation Pathways Matter

The existence of an ethical escalation pathway is not only a governance best practice — it is increasingly a regulatory expectation. The EU AI Act’s human oversight requirements for high-risk AI imply that humans must be able to meaningfully intervene in AI system operation when things go wrong. A documented escalation pathway, with clear roles, triage criteria, and decision authority, is the operational evidence that human oversight is real rather than nominal.

Model Lifecycle Governance: From Staging to Archiving

AI risk management extends beyond the moment of model deployment. A well-governed AI management system treats the full model lifecycle — from development through production to eventual decommissioning — as a continuous risk management activity. This is what MLOps governance means in the context of ISO 42001: not merely the efficient automation of model deployment, but the systematic application of risk controls at each transition in the model’s life.

ISO 42001’s Annex A.6 (AI system lifecycle) requires that development and operational activities for AI systems are planned, controlled, and documented. In practice, this means establishing formal governance gates at the critical transitions in a model’s lifecycle.

Development to Staging

Before a model moves from active development into a staging environment for pre-production evaluation, it must satisfy documented acceptance criteria. These criteria should address model performance against defined benchmarks, bias and fairness metrics across relevant population segments, explainability requirements (is the model’s behaviour sufficiently interpretable for its use case?), and security testing including adversarial robustness evaluation. The gate review should be documented, with a named approver and a record of the criteria assessed. Models that do not satisfy the gate criteria must be returned for further development — not deployed with known deficiencies and a plan to fix them later.
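A minimal sketch of such a gate check is shown below. The criterion names and thresholds are illustrative placeholders for the documented acceptance criteria described above; the essential point is that every criterion is evaluated, the result is recorded with a named approver, and a single failure blocks promotion.

```python
# Illustrative staging gate: all criteria must pass before promotion.
# Metric names and thresholds are placeholders for documented acceptance criteria.
STAGING_GATE_CRITERIA = {
    "accuracy_vs_benchmark": lambda m: m["accuracy"] >= 0.90,
    "max_fairness_gap":      lambda m: m["demographic_parity_gap"] <= 0.05,
    "explainability_review": lambda m: m["explainability_signoff"] is True,
    "adversarial_testing":   lambda m: m["robustness_tests_passed"] is True,
}

def staging_gate(metrics: dict, approver: str) -> dict:
    """Evaluate all gate criteria and return a record suitable for retention."""
    results = {name: bool(check(metrics)) for name, check in STAGING_GATE_CRITERIA.items()}
    return {
        "approver": approver,
        "criteria": results,
        "promote": all(results.values()),  # any failure -> back to development
    }

record = staging_gate(
    {"accuracy": 0.93, "demographic_parity_gap": 0.02,
     "explainability_signoff": True, "robustness_tests_passed": True},
    approver="Model Risk Lead",
)
print(record["promote"])  # True
```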

Staging to Production

The production gate is the most consequential transition in the model lifecycle. Once a model is in production, it begins affecting real decisions, real people, and real organisational risk. The production gate should require: sign-off from the risk owner documented in the AI risk register; confirmation that the AI system impact assessment has been completed and reviewed; evidence that human oversight mechanisms are in place and tested; and a documented rollback plan in the event that the model performs unexpectedly in production. This gate aligns directly with ISO 42001’s Clause 8 (Operation) requirements for implementing risk treatment plans before operational deployment.
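The production gate lends itself to a checklist rather than metric thresholds. The sketch below encodes the four checks named above as a record that must be fully satisfied before deployment; the field names are assumptions, and the record itself would normally be retained as Clause 8 operational evidence.

```python
from dataclasses import dataclass

@dataclass
class ProductionGateChecklist:
    """Illustrative production gate record; field names are assumptions."""
    risk_owner_signoff: bool            # documented in the AI risk register
    impact_assessment_reviewed: bool    # AI system impact assessment completed
    human_oversight_tested: bool        # oversight mechanisms in place and exercised
    rollback_plan_documented: bool      # fallback if the model misbehaves in production

    def approve(self) -> bool:
        # All four checks must hold before the model goes live.
        return all([self.risk_owner_signoff,
                    self.impact_assessment_reviewed,
                    self.human_oversight_tested,
                    self.rollback_plan_documented])
```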

Production Monitoring and Version Management

In production, models must be continuously monitored against the performance and risk metrics established in the risk assessment. Monitoring should detect concept drift (where the relationship between inputs and the correct output changes over time), data drift (where the statistical properties of inputs change), and performance degradation across different population segments. When monitoring detects a threshold breach, a documented response process must be triggered — which may include model retraining, rollback to a previous version, or escalation to the ethical escalation pathway described earlier in this whitepaper.
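One widely used way to quantify data drift is the population stability index (PSI), computed over binned distributions of a feature or model score. The sketch below is illustrative: the conventional rule-of-thumb thresholds noted in the comments are assumptions to be replaced by the thresholds documented in the risk assessment.

```python
import numpy as np

def population_stability_index(expected: np.ndarray, actual: np.ndarray,
                               bins: int = 10) -> float:
    """PSI between a reference sample (training/validation) and live inputs."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    exp_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    act_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Clip to avoid division by zero / log(0) in sparse bins.
    exp_pct = np.clip(exp_pct, 1e-6, None)
    act_pct = np.clip(act_pct, 1e-6, None)
    return float(np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct)))

# Rule-of-thumb interpretation (assumed, not mandated):
#   PSI < 0.1 stable, 0.1-0.25 moderate drift, > 0.25 significant drift.
psi = population_stability_index(np.random.normal(0, 1, 5000),
                                 np.random.normal(0.5, 1.2, 5000))
print(f"PSI = {psi:.3f}")  # a breach of the documented threshold triggers the response process
```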

Retirement and Archiving

When an AI system is retired — replaced by a successor model or decommissioned entirely — governance obligations do not end. ISO 42001’s documentation requirements imply that records of retired AI systems, their risk assessments, their performance history, and the decisions made during their operation must be retained for a period appropriate to the organisation’s legal, regulatory, and accountability obligations. Model artefacts should be archived rather than deleted, so that the behaviour of the system during its operational life can be reconstructed if required by regulators, auditors, or litigation.

MLOps and ISO 42001: A Natural Alignment

Well-implemented MLOps practices — automated testing, staged deployment, continuous monitoring, version management, and audit logging — are not in tension with ISO 42001 governance requirements. They are the technical infrastructure through which governance controls are operationalised. Organisations with mature MLOps pipelines will find that much of the Annex A.6 lifecycle control evidence they need for certification already exists within their MLOps tooling, provided that tooling is configured to produce and retain auditable records.

AI/ML Asset Cataloguing: Governance Through Visibility

You cannot govern what you cannot see. One of the most consistent findings in AI governance assessments is that organisations lack a complete, accurate, and current inventory of the AI systems and models they operate. Without a systematic AI/ML asset catalogue, risk assessments are incomplete, human oversight obligations cannot be consistently applied, and audit trails are fragmented. ISO 42001's Clause 4.3 (determining the scope of the AIMS) and Annex A.8 (documentation of AI systems) together create a governance obligation that can only be satisfied through systematic asset cataloguing.

What an AI/ML Asset Catalogue Contains

A governance-grade AI/ML asset catalogue documents each AI system in scope with enough information to assess and manage its risks. The essential elements include: system identity and ownership (name, version, business owner, technical owner, date deployed); risk classification (risk tier from the AI risk assessment, high-risk determination under applicable regulation); lineage (training dataset versions, feature engineering pipeline reference, experiment tracking reference, model artefact location); current status (active, under review, pending retirement); human oversight specification (the oversight level required by Annex A.7, and the designated oversight roles); and compliance status (which Annex A controls have been implemented and evidenced for this system).
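A sketch of what a single catalogue record might capture is shown below, with field names mirroring the essential elements just listed. The names, types, and status values are illustrative assumptions rather than a required schema.

```python
from dataclasses import dataclass, field

@dataclass
class AIAssetRecord:
    """Illustrative catalogue record; field names mirror the elements above."""
    system_name: str
    version: str
    business_owner: str
    technical_owner: str
    date_deployed: str
    risk_tier: int                                  # from the AI risk assessment
    high_risk_under_regulation: bool                # e.g. EU AI Act determination
    lineage: dict = field(default_factory=dict)     # dataset versions, pipelines, artefacts
    status: str = "active"                          # active / under review / pending retirement
    oversight_level: str = ""                       # per Annex A.7, with designated roles
    annex_a_controls_evidenced: list[str] = field(default_factory=list)
```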

Model Lineage Tracking

Model lineage — the documented chain from raw data through processing, training, validation, and deployment — is both a reproducibility requirement and a risk management tool. If a model is found to produce biased outputs, lineage tracking makes it possible to determine whether the bias originates in the training data, the feature engineering pipeline, or the model architecture. Without lineage, root cause analysis is speculation. Lineage tracking is particularly important where models are retrained on updated datasets or where transfer learning is used — changes in upstream data or base models must flow through the lineage record to all dependent downstream systems.

Metadata Governance

The catalogue is only valuable if it is current, accurate, and trusted. Metadata governance — the policies and processes that ensure catalogue records are maintained — is therefore as important as the catalogue itself. This means defining who is responsible for updating catalogue records when models are modified, deployed, or retired; establishing validation processes to check catalogue completeness and accuracy at defined intervals; and integrating catalogue update requirements into the MLOps deployment gate process so that production deployments cannot proceed without an updated catalogue record. ISO 42001's internal audit requirements (Clause 9.2) should include periodic verification of catalogue completeness and accuracy as a standard audit procedure.

The Catalogue as Certification Evidence

During an ISO 42001 Stage 2 certification audit, auditors will seek evidence that the organisation knows which AI systems it operates, has assessed their risks, and has implemented appropriate controls. An AI/ML asset catalogue that is well-maintained, linked to risk assessment records, and integrated with Annex A control documentation significantly streamlines the audit process. It also demonstrates the kind of systematic, evidence-based governance posture that certification is designed to recognise and validate.

Monitoring and Continual Improvement

Risk management is not a one-time activity. ISO 42001’s Clause 9 requires ongoing monitoring of AI system performance and AIMS effectiveness, and Clause 10 requires continual improvement in response to what monitoring reveals. This aligns with the NIST AI RMF’s MEASURE and MANAGE functions, which emphasise ongoing risk monitoring as a core capability.

Effective AI risk monitoring combines quantitative metrics (model accuracy, fairness indicators, decision audit trails, incident rates) with qualitative assessment (user feedback, stakeholder concerns, regulatory developments). Monitoring results should feed directly into the management review process required by Clause 9.3 and should trigger corrective action under Clause 10.2 when performance falls below acceptable thresholds.
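As one example of a quantitative fairness indicator that can feed the management review, the sketch below computes the demographic parity difference: the absolute gap in favourable-outcome rates between two groups. The choice of metric, the group definitions, and any alert threshold are illustrative assumptions that each organisation must set and document for itself.

```python
def demographic_parity_difference(outcomes_group_a: list[int],
                                  outcomes_group_b: list[int]) -> float:
    """Absolute gap in favourable-outcome rates between two groups (0 = parity).

    Outcomes are 1 for a favourable decision, 0 otherwise. Which fairness
    metric is appropriate, and what gap triggers corrective action under
    Clause 10.2, are decisions the organisation documents in its risk criteria.
    """
    rate_a = sum(outcomes_group_a) / len(outcomes_group_a)
    rate_b = sum(outcomes_group_b) / len(outcomes_group_b)
    return abs(rate_a - rate_b)

gap = demographic_parity_difference([1, 1, 0, 1, 0], [1, 0, 0, 0, 0])
print(f"Demographic parity gap: {gap:.2f}")  # 0.40 in this toy example
```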

The Role of Speeki

Speeki’s certification audits assess not only whether an organisation has documented an AI risk management process but whether that process is actually operating effectively. Our audit teams look for evidence of real risk assessments — assessments that reflect the specific characteristics of the AI systems in scope, not boilerplate templates. We look for treatment plans that have been implemented, controls that are functioning, and monitoring that is informing management decisions.

For organisations seeking to build or improve their AI risk management capability, Speeki offers pre-assessment services and gap analysis support. We can help organisations understand where their current risk management approach falls short of ISO 42001 requirements and what steps are needed to close the gap.

Conclusion

AI risk management is the operational heart of responsible AI governance. ISO 42001 provides the management system framework; the NIST AI RMF provides a complementary operational lens. Together, they equip organisations to manage the distinctive and evolving risks of AI systematically, proportionately, and with independent verifiability. The organisations that treat AI risk management as a genuine operational discipline — not a documentation exercise — will be better positioned to deploy AI responsibly, avoid costly failures, and demonstrate trustworthiness to regulators, customers and partners.

About Speeki

Speeki is an ISO certification body specialising in AI management systems certification under ISO/IEC 42001:2023. We help organisations design, implement and certify AI governance programs that meet international standards and build stakeholder trust.

Visit speeki.com to learn more, or contact our team to discuss your AI governance journey.