Quick Read
Generative AI systems present distinct governance challenges compared to traditional predictive AI because their foundation models rely on opaque, largely undisclosed training datasets that deploying organisations cannot fully characterise or control, creating documentation and data quality gaps that ISO 42001 frameworks must address through compensating controls at the application layer. Hallucination — the tendency of large language models to produce confident but factually incorrect outputs — requires governance programmes to implement output validation, human oversight, and user communication calibrated to the specific risk profile of each use case, particularly in high-stakes domains such as medical, legal, and financial applications. Prompt injection vulnerabilities add a further layer of operational risk that demands security-focused governance design distinct from conventional AI risk management approaches.
Executive Summary
Generative AI — including large language models (LLMs), image generators, multimodal foundation models, and agentic AI systems — presents a governance challenge that is qualitatively different from conventional AI systems. The opacity of foundation model training, the unpredictability of generative outputs, the speed of capability development, and the emerging complexity of agentic systems that chain decisions autonomously all create risks that standard AI governance frameworks address only partially. ISO/IEC 42001:2023 provides a management system backbone that is capable of addressing generative AI governance when its requirements are interpreted and applied with appropriate specificity. This whitepaper explains the distinctive risk profile of generative AI, maps those risks onto ISO 42001 requirements and Annex A controls, and provides practical guidance for organisations governing their use of foundation models and LLM-based applications.
What Makes Generative AI Different
The AI governance frameworks developed over the past decade — ISO 42001, NIST AI RMF, EU AI Act — were largely conceived with predictive and decision-support AI systems in mind: models that classify, score, recommend, or predict based on structured inputs. Generative AI introduces a fundamentally different architecture and a fundamentally different risk profile.
Foundation Models and the Training Opacity Problem
Generative AI systems are typically built on foundation models — large neural networks trained on vast, heterogeneous datasets, often scraped from the public internet. The training datasets for major foundation models are rarely fully disclosed. This creates a provenance problem: organisations deploying an LLM-based application may be unable to fully characterise the knowledge, biases, and capability constraints embedded in the model they are using. Unlike a conventionally trained model with a documented training dataset, a foundation model’s “training data” in any meaningful operational sense is largely unknowable.
The implications for governance are significant. AI system documentation requirements under ISO 42001’s Annex A cannot be fully discharged through self-assessment when the underlying model is a third-party foundation model. Data quality controls (Annex A.4) cannot be applied to training data that the deploying organisation did not curate. Organisations must document the extent of training data visibility, acknowledge the gaps, and implement compensating controls at the application and deployment layer.
Hallucination and Factual Drift
Large language models can produce outputs that are confident, fluent, and factually incorrect. This characteristic — commonly called hallucination — arises because generative models are trained to produce plausible text, not verified truth. The model has no internal access to ground truth and no reliable way to distinguish between things it has reliably learned and things it has confabulated from patterns in training data. Hallucination rates vary by model, by domain, and by the way in which the model is prompted and used.
For governance purposes, hallucination is a model performance risk (Annex A.6) that must be addressed through output validation controls, human oversight mechanisms, and clear user communication about the limitations of generative outputs. In high-stakes applications — medical information, legal research, financial advice, compliance guidance — the consequences of uncorrected hallucination can be severe. Governance programmes must ensure that human oversight requirements are calibrated to the hallucination risk profile of the specific use case.
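To make this concrete, the sketch below shows one way an application layer might route generative outputs according to a use case's hallucination risk tier. It is a minimal sketch under assumed conditions: the tier names, the `fact_check` hook, and the release statuses are illustrative, not mechanisms prescribed by ISO 42001.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class RiskTier(Enum):
    LOW = "low"        # e.g. internal brainstorming aids
    MEDIUM = "medium"  # e.g. customer-facing summaries
    HIGH = "high"      # e.g. medical, legal, financial guidance


@dataclass
class GeneratedOutput:
    use_case: str
    risk_tier: RiskTier
    text: str


def release_output(output: GeneratedOutput,
                   fact_check: Optional[Callable[[str], bool]] = None) -> str:
    """Gate a generative output according to the use case's hallucination risk tier."""
    # High-stakes use cases: block release until a human reviewer signs off.
    if output.risk_tier is RiskTier.HIGH:
        return "QUEUED_FOR_HUMAN_REVIEW"

    # Medium-stakes: run an automated validation hook (for example a retrieval-based
    # fact check) and escalate to human review on failure.
    if output.risk_tier is RiskTier.MEDIUM:
        if fact_check is not None and not fact_check(output.text):
            return "QUEUED_FOR_HUMAN_REVIEW"
        return "RELEASED_WITH_AI_DISCLOSURE"

    # Low-stakes: release with a user-facing disclosure that the content is AI-generated.
    return "RELEASED_WITH_AI_DISCLOSURE"
```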
Prompt Injection and Adversarial Inputs
Generative AI systems that accept natural language inputs are vulnerable to prompt injection attacks — adversarial inputs designed to override the model’s instructions, extract sensitive information, generate harmful content, or cause the model to take unintended actions. Prompt injection is analogous to SQL injection in conventional software security: it exploits the model’s tendency to follow instructions embedded in its input rather than distinguishing between operator instructions and user content.
Prompt injection is particularly acute for agentic AI systems — those that act autonomously by executing tools, querying databases, sending communications, or initiating transactions on behalf of users. In these systems, a successful prompt injection can cause the model to take real-world actions beyond generating text. ISO 42001’s security controls (Annex A.6 and related clauses) must be interpreted to encompass LLM-specific attack vectors, including input validation, output sanitisation, and access control for agentic actions.
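The sketch below illustrates, under simplified and assumed conditions, three of the mitigations named above: keeping operator instructions and untrusted user content in separate message roles, screening input for obvious instruction-override phrasing, and enforcing a tool allowlist before any agentic capability is exposed to the model. The function names, patterns, and request shape are hypothetical, and pattern matching alone is not a complete defence against prompt injection.

```python
import re

# Hypothetical tool allowlist: the only agentic actions the application exposes.
ALLOWED_TOOLS = {"search_knowledge_base", "summarise_document"}

# Crude screening for common instruction-override phrasing. This is a heuristic,
# not a complete defence; injection attempts can be phrased in countless ways.
SUSPICIOUS_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"you are now",
    r"reveal (your )?system prompt",
]


def screen_user_input(user_text: str) -> bool:
    """Return True if the input looks like an attempted instruction override."""
    lowered = user_text.lower()
    return any(re.search(p, lowered) for p in SUSPICIOUS_PATTERNS)


def build_request(system_instructions: str, user_text: str, requested_tool: str) -> dict:
    """Assemble an LLM request that keeps operator and user content in separate roles
    and enforces a tool allowlist before any agentic action is exposed to the model."""
    if requested_tool not in ALLOWED_TOOLS:
        raise PermissionError(f"Tool '{requested_tool}' is not on the approved action list")
    if screen_user_input(user_text):
        raise ValueError("Input rejected by prompt-injection screening; route to review")
    return {
        "messages": [
            # Operator instructions travel in the system role only.
            {"role": "system", "content": system_instructions},
            # Untrusted user content never shares a role with operator instructions.
            {"role": "user", "content": user_text},
        ],
        "tools": [requested_tool],
    }
```

In practice these checks would sit alongside provider-side safety features and output sanitisation; the point of the sketch is that input handling and action permissions are application-layer responsibilities.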
Agentic AI: When Models Act, Not Just Generate
The most significant emerging governance challenge in generative AI is the shift from models that generate text to agents that take actions. Agentic AI systems use LLMs as reasoning engines that can call external tools — web search, code execution, email, databases, APIs — and chain decisions across multiple steps without continuous human intervention. An agentic AI system might autonomously research a topic, draft a document, review it against criteria, send it for approval, and file the result — with minimal human involvement in the intermediate steps.
This creates governance challenges that extend well beyond text generation. The human oversight requirements of ISO 42001 (Annex A.7) must address not just whether humans review outputs but whether and how humans are able to supervise, interrupt, and override autonomous action chains. The impact assessment (Clause 6.1.4) must consider the full range of actions an agentic system could take, not just its primary stated function. And accountability must be clearly assigned — when an agentic system takes a harmful action, who is responsible?
The Agentic Governance Gap: Current AI governance frameworks, including ISO 42001, were not designed with fully autonomous agentic systems in mind. As agentic AI moves from research prototype to enterprise deployment, governance programmes must develop explicit policies for agentic system design — defining which actions agents can take autonomously, which require human confirmation, and which are prohibited entirely. This policy layer does not yet exist in most organisations’ AIMS documentation.
The Distinctive Risk Categories of Generative AI
The table below maps the distinctive risk categories of generative AI systems to their governance implications under ISO 42001.
| GenAI Risk | Description and Governance Implication | ISO 42001 Controls |
|---|---|---|
| Hallucination | Model produces confident but factually incorrect outputs. Risk is highest in high-stakes domains. Requires output validation, human review, and user disclosure controls. | Annex A.6 (performance), A.7 (oversight), A.9 (transparency) |
| Prompt Injection | Adversarial inputs override model instructions. Risk is highest for models accepting untrusted user input or operating agentically. Requires input validation and access controls. | Annex A.6 (security), Cl. 6.1.2 (risk assessment) |
| Training Data Provenance | Foundation model training data is unknown or partially known. Creates bias, copyright, and accountability gaps. Requires third-party supplier assessment and compensating controls. | Annex A.4 (data), A.10 (suppliers), Cl. 8 |
| Agentic Action Risk | LLM agents that execute tools or transactions autonomously. Amplifies impact of errors or adversarial inputs. Requires explicit agentic action policies and human intervention points. | Annex A.7 (oversight), Cl. 6.1.4 (impact assessment) |
| Content Harm | Model generates harmful, offensive, misleading, or discriminatory content. Requires content filtering, output monitoring, and incident reporting. | Annex A.5 (fairness), A.9 (transparency), Cl. 9.1 |
| Intellectual Property | Model outputs may incorporate copyrighted material from training data. Requires IP risk assessment and output review for deployment in IP-sensitive contexts. | Annex A.4 (data governance), Cl. 6.1.2 |
| Model Capability Drift | Foundation model updates by third-party providers can change behaviour without notice. Requires change management controls for third-party models. | Annex A.10 (suppliers), Cl. 8.3 (risk treatment) |
| Privacy and Data Leakage | Prompts containing personal data may be logged, used for fine-tuning, or exposed. Requires data minimisation policies for prompts and output retention controls. | Annex A.4 (data quality/governance), Cl. 6.1.2 |
Free Reference: MITRE SAFE-AI Threat-to-Control Mapping. The SAFE-AI framework (MITRE Corporation, April 2025, publicly released for unlimited use) provides a structured mapping of adversarial AI threats to NIST SP 800-53 security controls, including specific coverage of LLM-specific threats such as prompt injection, model exposure via inference APIs, sensitive information disclosure through training data memorisation, and supply chain infiltration through unvetted foundation models. Its threat catalogue references MITRE ATLAS™ identifiers and maps controls across four system elements (Environment, AI Platform, AI Model, AI Data). For organisations assessing GenAI security risks, SAFE-AI provides a ready-made threat catalogue and control shortlist that complements ISO 42001’s Annex A controls.
AI Deployment Models and Shared Governance Responsibility
A foundational principle of generative AI governance — and one that ISO 42001 implementers frequently underestimate — is that governance obligations are not uniform across all deployment patterns. The way an organisation deploys AI materially changes both its risk exposure and its share of accountability for the outcomes the AI system produces.
Generative AI and classical ML deployments can be categorised into six broad archetypes, each with a distinct risk profile and a different shared responsibility boundary between the deploying organisation and its technology providers.
Predictive and Classical ML
Internally trained models (classification, regression, forecasting) built on the organisation’s own data. The deploying organisation owns the full governance stack: data sourcing and quality, model development, validation, deployment, and monitoring. There is no external model provider to share accountability with. ISO 42001’s Annex A controls apply end-to-end, and reproducibility requirements under A.6 are the organisation’s responsibility alone.
Foundation Model APIs
The organisation calls a third-party foundation model via API, using the model as-is without modification. Governance responsibility for the model’s training, safety alignment, and base capabilities rests with the model provider. The deploying organisation’s obligations focus on application design, prompt engineering controls, output monitoring, and supplier assessment under Annex A.10. Under typical enterprise API terms, the organisation’s data is not used to train the model, which limits certain risks but also limits visibility into model behaviour.
Fine-Tuned LLMs
The organisation adapts a pre-trained foundation model using its own proprietary data. This is a high-governance-investment scenario: the organisation is now introducing its own data into the model’s parameters, which amplifies both the value and the risk. Fine-tuning increases the risk of sensitive information being embedded in model weights and subsequently exposed through inference. It also means the organisation assumes greater responsibility for the fine-tuned model’s behaviour. A full AI system risk assessment (ISO 42001 Clause 6.1.2) is required.
RAG, AI Agents, and External Models
Retrieval-augmented generation (RAG) systems ground LLM responses in documents retrieved from the organisation’s own knowledge bases at inference time. AI agents chain LLM reasoning with tool use and real-world actions. Both architectures significantly expand the governance perimeter: the risk is no longer just what the model was trained on but what it can access and what it can do at runtime. Data access controls, agent permission boundaries, and human oversight requirements (Annex A.7) are critical governance obligations. Systems that use external models — third-party AI services embedded in products the organisation uses rather than deploys — require supplier due diligence and contractual risk allocation even when the organisation’s own development teams are not involved.
The Core Principle: More Enterprise Data, More Governance Obligation. Governance investment should scale with the degree to which the organisation’s own proprietary data is incorporated into the AI system — whether at training time (fine-tuning), at inference time (RAG, agent tool access), or both. A system that calls a third-party API with no organisational data carries the lightest governance burden; a fine-tuned model with RAG access to sensitive internal documents carries the heaviest. ISO 42001’s risk-proportionate approach under Clause 6.1 requires organisations to assess this dimension of AI risk explicitly when scoping their AIMS controls.
How ISO 42001 Applies to Generative AI
ISO 42001 was designed to be technology-neutral and adaptable to emerging AI architectures. Its requirements apply to generative AI systems, but applying them effectively requires interpreting the standard’s requirements with specific awareness of the characteristics described above. The following sections address the most significant application points.
Defining Scope (Clause 4.3)
Organisations using generative AI must ensure that their AIMS scope explicitly addresses LLM-based applications, regardless of whether those applications are internally developed or deployed through third-party APIs. Many organisations erroneously assume that responsibility for a foundation model’s behaviour rests with the model provider rather than the deploying organisation. Under ISO 42001, the deploying organisation is responsible for how AI systems behave in its operational context and must assess and manage the risks of any AI system it deploys, regardless of who built the underlying model.
AI Risk Assessment for LLMs (Clauses 6.1.2 and 8.2)
AI risk assessments for LLM-based applications must explicitly address the generative AI risk categories described above. Generic AI risk templates that address model accuracy, data quality, and bias in the conventional sense are insufficient for generative AI. Risk assessments must specifically evaluate: the hallucination risk profile for the specific use case; the prompt injection attack surface; the extent of training data visibility and its implications for bias and IP risk; the agentic action surface (if any); and the model capability drift risk for third-party foundation models. These assessments should be reviewed whenever the underlying model is updated or the application is significantly modified.
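One way to make these assessment dimensions auditable is to capture them in a structured record attached to each LLM-based system in the AI inventory. The following is an illustrative sketch; the field names and example values are assumptions, not a prescribed Clause 6.1.2 format.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class LLMRiskAssessment:
    """Structured record of the GenAI-specific dimensions a Clause 6.1.2 assessment should cover."""
    system_name: str
    use_case: str
    hallucination_risk: str          # e.g. "high - outputs inform clinical staff"
    prompt_injection_surface: str    # e.g. "public chat interface, untrusted input"
    training_data_visibility: str    # e.g. "provider model card only; corpus undisclosed"
    agentic_action_surface: str      # e.g. "none - text generation only"
    model_drift_risk: str            # e.g. "provider updates model without notice"
    compensating_controls: List[str] = field(default_factory=list)
    last_reviewed: date = field(default_factory=date.today)
    review_triggers: List[str] = field(
        default_factory=lambda: ["model version change", "significant application change"]
    )
```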
AI System Impact Assessment for Generative Applications (Clause 6.1.4)
The AI system impact assessment must reflect the broad and variable output space of generative models. Unlike a classification model whose output is constrained to a defined label set, an LLM can produce almost any text output — which means the potential impact space is correspondingly broad. Impact assessments must consider: who is exposed to the model’s outputs; what decisions those outputs might influence; what harm could result from hallucination, bias, or harmful content; and — for agentic systems — what actions the system could take and what harm those actions could cause. The assessment should be use-case specific, not generic.
Human Oversight for Generative AI (Annex A.7)
Human oversight requirements are particularly critical for generative AI. The fluency and apparent confidence of LLM outputs create automation bias: users may trust model outputs more than they should, particularly when the outputs are coherent and plausible even if incorrect. Oversight mechanisms must be designed to counteract this. For high-stakes generative AI applications, this may require mandatory human review of all outputs before action; for lower-stakes applications, it may mean user disclosure, output flagging, and feedback mechanisms. For agentic systems, oversight must explicitly define which steps in an action chain require human confirmation.
Supplier Management for Foundation Model Providers (Annex A.10)
Foundation model providers — whether accessed via API or deployed locally — are AI system suppliers under ISO 42001’s Annex A.10. Organisations must assess the governance practices of their foundation model providers, including: training data governance and bias testing practices; security and safety testing (red-teaming, adversarial robustness); terms governing data use and retention for prompts and outputs; change management and model update notification practices; and availability and reliability commitments. Many foundation model providers publish model cards, system cards, or responsible AI documentation that can inform this assessment. Where such documentation is absent or inadequate, organisations must document the gap and implement compensating controls.
Transparency and Documentation for Generative AI (Annex A.8 and A.9)
Documentation requirements under Annex A.8 must be adapted for generative AI to acknowledge the inherent limitations of foundation model transparency. Where full training data provenance is unavailable, organisations should document what is known, what is unknown, and what compensating controls are in place. User-facing transparency (Annex A.9) must include appropriate disclosure about the generative nature of outputs, their potential for error, and the limitations of the specific application. Users who rely on AI-generated content for decisions must be informed of those limitations.
Governing Third-Party LLM APIs
Most organisations deploying generative AI do not train their own foundation models. They access foundation models through APIs provided by major AI laboratories — and this creates a specific governance challenge. The deploying organisation has limited visibility into the model’s training, limited ability to audit its behaviour, and limited control over updates.
Governance of third-party LLM APIs requires a structured approach that addresses four dimensions. First, provider assessment: evaluating the governance practices of the foundation model provider against the criteria described in the supplier management section above. Second, contractual controls: ensuring that API terms of service address data use, model update notification, security obligations, and liability allocation in a manner consistent with the organisation’s risk appetite. Third, application-layer controls: implementing governance controls at the application layer that compensate for the opacity of the foundation model — input validation, output monitoring, content filtering, human review workflows, and audit logging. Fourth, change management: establishing a process for reviewing and assessing the impact of foundation model updates on the organisation’s AI risk profile and governance controls.
The “Wrap” Governance Model: Organisations cannot govern foundation models directly — but they can govern the applications they build on top of them. The governance approach for LLM-based applications should be thought of as a “wrap”: ISO 42001’s requirements are implemented at the application layer, wrapping the foundation model with the controls that the model provider cannot or does not provide. The wrap includes the risk assessment, the impact assessment, the human oversight mechanisms, the content filters, the audit logs, and the user disclosures. The foundation model is treated as a supplier-provided component — assessed, contracted, and monitored — but not assumed to be inherently trustworthy without compensating controls.
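As a rough illustration of the wrap, the sketch below places audit logging, a content filter, and a human-review gate around a generic model call. The `call_model` and `content_filter` callables are placeholders for whatever provider API and filtering service the organisation actually uses, and the record fields are assumptions rather than a mandated schema.

```python
import json
import logging
from datetime import datetime, timezone
from typing import Callable

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("aims.genai.audit")


def wrapped_completion(call_model: Callable[[str], str],
                       prompt: str,
                       content_filter: Callable[[str], bool],
                       requires_human_review: bool) -> dict:
    """Wrap a third-party model call with the application-layer controls the provider
    does not supply: audit logging, output filtering, and a human-review gate."""
    raw_output = call_model(prompt)

    # Content filtering: flag outputs that fail the organisation's acceptability checks.
    passed_filter = content_filter(raw_output)

    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prompt": prompt,
        "output": raw_output,
        "passed_content_filter": passed_filter,
        "released_without_review": not requires_human_review and passed_filter,
    }
    # Audit logging supports performance evaluation and incident investigation.
    audit_log.info(json.dumps(record))

    if not passed_filter or requires_human_review:
        record["status"] = "held_for_human_review"
    else:
        record["status"] = "released"
    return record
```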
Agentic AI: A Governance Framework
Agentic AI systems that execute real-world actions require a governance layer beyond what is needed for generative text applications. The following principles provide a starting framework for organisations developing AIMS documentation for agentic AI.
Define the Action Surface
The first governance step for any agentic AI system is to explicitly enumerate the actions the agent is permitted to take. This action inventory — covering tools, APIs, databases, communication systems, and transactional capabilities the agent can access — is the foundation for impact assessment and human oversight design. Actions should be classified by risk level: low-risk actions (read-only data retrieval) may be permitted without human confirmation; medium-risk actions (drafting communications for human review) require oversight before execution; high-risk actions (sending communications, initiating transactions, modifying records) require human authorisation. This classification should be documented in the AIMS and reviewed whenever the agent’s capabilities change.
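A minimal sketch of such an action inventory follows, assuming a hypothetical internal research-assistant agent; the tool names and tier assignments are illustrative, and any action not in the inventory defaults to prohibited.

```python
from enum import Enum


class ActionRisk(Enum):
    LOW = "no confirmation required"
    MEDIUM = "human review before execution"
    HIGH = "explicit human authorisation required"
    PROHIBITED = "never executed autonomously"


# Hypothetical action inventory for an internal research-assistant agent.
ACTION_SURFACE = {
    "read_knowledge_base": ActionRisk.LOW,
    "draft_email":         ActionRisk.MEDIUM,
    "send_email":          ActionRisk.HIGH,
    "initiate_payment":    ActionRisk.PROHIBITED,
}


def oversight_required(action: str) -> ActionRisk:
    """Look up the documented risk tier; unknown actions default to prohibited."""
    return ACTION_SURFACE.get(action, ActionRisk.PROHIBITED)
```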
Design for Human Interruption
Agentic AI systems must be designed so that humans can interrupt and redirect the agent at any point in an action chain. This is not only a governance best practice but an emerging regulatory expectation under the EU AI Act’s human oversight requirements for high-risk AI. Technical design must implement interrupt mechanisms — checkpoints at which the agent pauses and presents its plan for human review before proceeding. The frequency and placement of these checkpoints should be proportionate to the risk level of the actions in the chain.
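The sketch below shows one simplified way to place checkpoints in an action chain, with a human approval callback standing in for whatever review workflow the organisation uses. The step descriptions and the `approve` function are hypothetical.

```python
from typing import Callable, List, Tuple


def run_action_chain(steps: List[Tuple[str, bool]],
                     execute: Callable[[str], None],
                     approve: Callable[[str], bool]) -> None:
    """Execute an agent's planned steps, pausing at checkpoints for human approval.

    `steps` is a list of (description, needs_checkpoint) pairs; `approve` represents
    the human reviewer's decision (for example, a UI prompt or ticketing workflow)."""
    for description, needs_checkpoint in steps:
        if needs_checkpoint and not approve(description):
            # Human interruption: stop the chain and leave remaining steps unexecuted.
            print(f"Chain halted by reviewer before: {description}")
            return
        execute(description)


# Usage sketch: only the externally visible step requires confirmation.
run_action_chain(
    steps=[("retrieve policy documents", False),
           ("draft summary memo", False),
           ("email memo to compliance team", True)],
    execute=lambda step: print(f"executing: {step}"),
    approve=lambda step: True,  # stand-in for a real human confirmation
)
```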
Audit Logging for Agentic Actions
Every action taken by an agentic AI system must be logged in sufficient detail to enable retrospective review. Logs must capture the input that triggered each action, the reasoning process (where visible), the action taken, the outcome, and any human interventions or overrides. This logging is essential both for incident investigation and for demonstrating compliance with ISO 42001’s documentation and performance evaluation requirements. Log retention periods must reflect the potential latency between action and consequence in the specific use case.
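A minimal sketch of such a record follows, capturing the fields listed above in an append-only JSON Lines log. The field names, file path, and format are assumptions rather than a required schema.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AgentActionRecord:
    """One audit record per agentic action, capturing the fields needed for retrospective review."""
    triggering_input: str
    reasoning_trace: Optional[str]   # where the agent framework exposes it
    action_taken: str
    outcome: str
    human_intervention: Optional[str]
    timestamp: str = ""

    def __post_init__(self):
        if not self.timestamp:
            self.timestamp = datetime.now(timezone.utc).isoformat()


def append_to_audit_log(record: AgentActionRecord, path: str = "agent_audit.jsonl") -> None:
    # Append-only JSON Lines file; the retention period should reflect the latency
    # between action and consequence for the specific use case.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record)) + "\n")
```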
Connecting GenAI Governance to Certification
Organisations pursuing ISO 42001 certification should ensure that their AIMS documentation explicitly addresses their generative AI activities. A common gap in certification audits is that AIMS documentation was developed for conventional AI systems and does not address the distinctive governance requirements of generative and agentic AI.
Certification auditors will look for: explicit identification of LLM-based applications in the AI system inventory; risk assessments that address hallucination, prompt injection, and model provenance; AI system impact assessments appropriate to the use case; documented human oversight mechanisms for generative applications; supplier assessments for foundation model providers; and application-layer controls that compensate for the governance limitations of third-party foundation models.
Organisations that deploy generative AI without explicit governance documentation for those systems will face significant audit findings, regardless of the quality of their governance for conventional AI systems. Generative AI must be treated as a first-class governance object, not an afterthought in an AIMS built for traditional machine learning.
The Role of Speeki
Speeki’s certification and advisory services include specific expertise in generative AI governance. Our audit teams understand the distinctive risk profile of LLM-based applications and agentic AI systems, and our pre-assessment services can help organisations identify gaps in their AIMS documentation for generative AI before formal certification engagement.
As generative AI governance frameworks develop — through ISO technical reports, sector guidance, and regulatory requirements — Speeki will update our audit methodologies and advisory services to reflect emerging best practice. Organisations that engage with Speeki on generative AI governance will benefit from this evolving expertise as the field matures.
Conclusion
Generative AI is not a niche or emerging technology — it is already deployed at scale across virtually every sector of the economy, with capabilities expanding rapidly. Its governance challenges are real, significant, and distinct from those of conventional AI systems. ISO 42001 provides the management system backbone to address these challenges, but only if its requirements are applied with appropriate specificity to the hallucination risk, prompt injection vulnerability, training data opacity, supplier complexity, and agentic action capability that define the generative AI risk landscape.
Organisations that govern generative AI seriously — that build AIMS documentation that genuinely addresses these characteristics rather than applying generic AI governance templates — will be better positioned to deploy generative AI responsibly, maintain certification credibility, and demonstrate trustworthy AI governance to the customers, regulators, and investors who are increasingly demanding it.
About Speeki
Speeki is an ISO certification body specialising in AI management systems certification under ISO/IEC 42001:2023. We help organisations design, implement and certify AI governance programmes that meet international standards and build stakeholder trust.
Visit speeki.com to learn more, or contact our team to discuss your AI governance journey.