# Bias, Fairness and Explainability Evidence for AI Governance

Canonical URL: https://xpertdpo.com/bias-fairness-explainability-evidence-ai-governance/

Content type: Article

Published: 2026-06-25T15:11:16+01:00

Updated: 2026-06-25T17:57:01+01:00

Author: Philipa Jane Farley, Head of Legal and Operations

Summary: AI fairness and explainability work best when they are treated as governance evidence, not slogans. DPOs, legal teams and boards need a clear record of the use case, bias risks, testing, explanations, human oversight and review triggers.

## Article

*This article accompanies Hour 4: Bias, Fairness & Explainability in AI in our full-day CPD programme on [XpertAcademy](https://xpertacademy.com/). Completion of the full one-hour session, including the related learning materials, contributes to the one-hour CPD certificate issued for that session. You can access the course here: [CPD Event B: Full-Day AI, Technical Privacy & Emerging Technology Training](https://xpertacademy.com/cpd-event-b-ai-technical/).*

Bias, fairness and explainability are often adopted as AI governance vocabulary before they are treated as evidence questions. A policy may say that AI must be fair. A supplier may say that a model has been tested for bias. A slide deck may say that an AI system is explainable. None of that is enough on its own.

 For a DPO, privacy team, legal team or board, the practical question is different: what evidence shows that the organisation understood the people affected, tested the risks that mattered, explained the system in a way that matched the context, and kept the route open for challenge and review?

 That question matters under GDPR and UK GDPR because fairness, transparency, accountability, accuracy, data protection by design and individual rights need to be visible in the way the AI use case is scoped, tested, deployed and monitored. It also matters under wider AI governance frameworks, including the EU AI Act where relevant, because risk classification, transparency and human oversight depend on real facts about the system and its intended use.

 This is general guidance, not legal advice.

> The governance test is not whether the organisation can say "we considered fairness". It is whether the record shows what fairness meant for this use case, what was tested, what limits remain and who will act if the evidence changes.

### Start With the Decision or Effect on People

 Bias and explainability cannot be assessed properly in the abstract. The first step is to describe what the AI system does and how it may affect people.

 That may be a direct decision, such as whether someone is shortlisted, prioritised, flagged, scored, escalated, investigated or refused. It may also be a decision-support effect, where the system recommends, summarises, ranks, classifies or drafts material that a human then uses. Decision-support systems can still have significant effects if staff rely on the output or if the system is embedded into a high-volume workflow.

 The record should identify who may be affected and whether the use case involves special category data, inferred data, profiling, monitoring, financial vulnerability, employment consequences, healthcare, education, access to essential services or another sensitive context.

 This is where AI governance and DPIA discipline connect. A DPIA that says "AI tool for productivity" is too thin if the real use is summarising employee grievances or ranking customer vulnerability referrals. See XpertDPO's note on [AI governance and data protection impact assessments](https://xpertdpo.com/ai-governance-and-data-protection-impact-assessments-dpias/): the assessment has to follow the governed use case, not the product name.

### Bias Evidence Is Not One Metric

 Bias can enter an AI system in several places. It may sit in the training data, the labels, the way historical decisions were recorded, the sample used for testing, the features selected, the objective function, the measurement method, the user interface, the deployment environment or the way human reviewers respond to the output.

 That is why a single statement that "the model was bias tested" is rarely enough. The organisation needs to know what bias risk was tested, for which groups, with what data, against what benchmark, using which metric, and with what result. It also needs to know what was not tested and why.
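
As a rough illustration, the questions in the paragraph above can be captured as one structured record per test. This is a minimal sketch, not a prescribed template: the class name `BiasTestRecord`, its field names and the `missing_fields` helper are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class BiasTestRecord:
    """One bias test, recorded so a reviewer can see what was and was not covered.

    All field names are hypothetical; adapt them to the organisation's own template.
    """

    risk_tested: str   # the bias risk this test addresses
    groups: list[str]  # groups or characteristics compared
    dataset: str       # data the test ran on
    benchmark: str     # what the result was compared against
    metric: str        # metric used
    result: str        # outcome, in plain terms
    # Gaps the test did not cover, mapped to the reason each was left out.
    not_tested: dict[str, str] = field(default_factory=dict)

    def missing_fields(self) -> list[str]:
        """Return the names of required fields that were left empty."""
        required = ["risk_tested", "groups", "dataset", "benchmark", "metric", "result"]
        return [name for name in required if not getattr(self, name)]
```

A record with an empty `benchmark` would come back from `missing_fields()` as `["benchmark"]`, which gives a reviewer a specific question to put to the data science team or the supplier.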

 Fairness testing may look different depending on the use case. The issue may be representativeness, proxy variables, subgroup performance, measurement quality or operational design. In some contexts, the most important fairness issue may not be algorithmic at all. It may be who can override the system, whether staff are trained to challenge outputs, and whether affected people can obtain meaningful human review.
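
Representativeness is one of the few items in that list that lends itself to a quick numeric check. The sketch below compares each group's share of a test sample with its share of a reference population. The function name, the group labels and the idea of using a single reference distribution are assumptions for illustration; the choice of reference population is itself a judgement that belongs in the record.

```python
def representation_gaps(
    sample_counts: dict[str, int],
    reference_shares: dict[str, float],
) -> dict[str, float]:
    """Return (sample share - reference share) for every group in the reference.

    A negative value means the group is under-represented in the sample.
    Groups missing from the sample are treated as a count of zero.
    """
    total = sum(sample_counts.values())
    if total == 0:
        raise ValueError("sample is empty")
    return {
        group: sample_counts.get(group, 0) / total - share
        for group, share in reference_shares.items()
    }
```

A group that appears in the reference population but not in the sample shows up with a gap equal to minus its full reference share, which makes "missing groups" visible rather than silent.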

 The ICO's AI guidance is useful here because it treats fairness in AI as broader than discrimination alone. It links fairness to processing, outcomes, reasonable expectations, adverse effects, statistical accuracy, safeguards and accountability. Organisations may also have equality, consumer, employment or sector duties outside data protection law, so the DPIA record should not pretend that GDPR is the only lens.

 The evidence should be proportionate but specific. For a low-risk internal drafting tool, the fairness record may be short. For AI used in recruitment, healthcare triage, credit, fraud, education, housing, employment monitoring or public services, the record should be much stronger.

### Fairness Means Process and Outcome

 In data protection terms, fairness is not only about whether the organisation intended to be fair. It is also about whether the processing is misleading, unexpected, unduly detrimental or likely to produce unjustified adverse effects.

 That has two practical consequences.

 First, fairness has to be considered during design and deployment, not only after complaints arise. The organisation should be able to show why the system is suitable, why the data is relevant and necessary, how accuracy and error have been considered, what safeguards exist, and whether less risky alternatives were considered.

 Second, fairness has to be revisited when the use case changes. A model that performs acceptably in a pilot may behave differently when deployed to another location, population, language group, business process or data source. A vendor update, retraining exercise, new connector or changed threshold can alter the fairness position.

 This is one reason [AI DPIAs become harder than they first appear](https://xpertdpo.com/why-ai-dpias-become-harder-than-they-first-appear/). The assessment needs to preserve the judgement trail: what was approved, what evidence supported approval, what limits were accepted, and what would cause reopening.
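
The change events named in this section can be turned into a simple screening step in a change-control process. The following is a sketch under stated assumptions: the event labels in `REOPEN_TRIGGERS` are hypothetical names invented here to mirror the examples in the text, and a real list would be agreed with the system owner and the DPO.

```python
# Hypothetical event labels mirroring the examples in this section.
REOPEN_TRIGGERS = {
    "new_location",
    "new_population",
    "new_language_group",
    "new_business_process",
    "new_data_source",
    "vendor_update",
    "retraining",
    "new_connector",
    "threshold_change",
}


def triggers_review(events: list[str]) -> list[str]:
    """Return the events that should prompt a fairness re-check, in input order."""
    return [event for event in events if event in REOPEN_TRIGGERS]
```

An empty result does not prove the fairness position is unchanged. It only shows that none of the agreed triggers fired, so the scheduled review remains the backstop.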

### Explainability Is Not the Same as Opening the Model

 Explainability is sometimes framed as a technical problem: can the model be interpreted, can feature importance be shown, or can a supplementary explanation tool describe the output? Those questions matter, but they are only part of the governance picture.

 For privacy and legal teams, explainability should start with the audience and the decision context. A senior risk committee, a DPO, a customer, an employee, an auditor and a human reviewer may each need different information. The explanation does not have to disclose trade secrets or overwhelm people with technical detail, but it does need to be meaningful enough for the purpose.

The ICO and The Alan Turing Institute guidance on explaining decisions made with AI is helpful because it breaks explanations into practical types: rationale, responsibility, data, fairness, safety and performance, and impact. In plain terms, people may need to know why an outcome happened, who is responsible, what data mattered, how unfairness was addressed, how reliably and safely the system performs, and what the outcome may mean for them.

 That framework also helps internal governance. If the organisation cannot explain ownership, data use, performance monitoring, fairness testing, human review and what affected people will be told, it probably does not yet have enough evidence for confident sign-off.

 This is true even where the AI system is described as low, limited or minimal risk. Sometimes the question is whether the organisation can give a simple, accurate explanation of what the system does and does not do. XpertDPO has covered that point separately in [When Low, Limited or Minimal Risk AI Still Needs Explaining](https://xpertdpo.com/when-low-limited-or-minimal-risk-ai-still-needs-explaining/).

### What Good Evidence Looks Like

 Good evidence does not mean a longer form. It means a record that connects the use case, risk, control and decision. The following table gives a practical starting point.

| Evidence area | What the record should show |
| --- | --- |
| Use case and affected people | The approved purpose, user group, affected individuals, decision or support function, excluded uses and sensitive contexts. |
| Data and proxy review | The data used, source and quality of that data, inferred data, proxy variables, minimisation decisions and any special category or vulnerable-person risk. |
| Bias and fairness testing | The fairness risks considered, groups or characteristics tested where appropriate, metrics used, limitations, results, residual risk and retesting plan. |
| Explainability plan | What will be explained to affected people, staff, reviewers, DPOs, boards or auditors, and how the explanation will stay accurate after change. |
| Human oversight and challenge | Who reviews outputs, what authority they have, when they must intervene, how people can challenge outcomes and how reviews are recorded. |
| Monitoring and review triggers | Scheduled review, drift or performance monitoring, complaint and incident routes, supplier change controls and events that reopen the DPIA or AI assessment. |

 This evidence should sit with the DPIA, AI impact assessment, vendor review, security review and board or risk sign-off where relevant. It should not be stranded in a data science notebook, a procurement answer or a supplier slide deck.
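
One hedged way to stop evidence being stranded is to keep a single index keyed by the six evidence areas in the table and to check it for empty entries before sign-off. The sketch below assumes the record is a plain mapping from area name to a short summary or document reference. The function name `evidence_gaps` is an illustration, not an established tool.

```python
# The six evidence areas from the table above.
EVIDENCE_AREAS = [
    "Use case and affected people",
    "Data and proxy review",
    "Bias and fairness testing",
    "Explainability plan",
    "Human oversight and challenge",
    "Monitoring and review triggers",
]


def evidence_gaps(record: dict[str, str]) -> list[str]:
    """Return the evidence areas with no entry, or only whitespace, in the record."""
    return [area for area in EVIDENCE_AREAS if not record.get(area, "").strip()]
```

A check like this only confirms that something has been written against each area. Whether the content is adequate for the risk remains a judgement for the reviewer.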

### What a Good EU-Centred Bias Audit Should Look Like

 There is no single universal "EU bias audit" that works for every model or system. A useful audit is specific to the use case, the people affected, the legal role of the organisation, the data being processed and the decision or service context. But an EU-centred audit should have a recognisable shape.

 First, it should start with scope. The audit should identify whether the organisation is reviewing a model, a configured product, a full AI system, a business process using AI outputs, or a supplier system deployed in a specific setting. That distinction matters. Bias may sit in the model, but it may also sit in the workflow, the threshold chosen by the deployer, the data fed into the system, the human review process or the way the output is used.

Second, the audit should connect GDPR, AI Act and fundamental-rights questions rather than treating them as separate languages. Under data protection law, the audit should test fairness, transparency, accuracy, data minimisation, lawful basis, necessity, proportionality, individual rights, Article 22 risk (solely automated decisions with legal or similarly significant effects) where relevant, and DPIA evidence. Under the AI Act, where the system is high-risk or close to a high-risk use case, the audit should test risk management, data governance, dataset quality, logging, documentation, information for deployers, human oversight, robustness, cybersecurity and accuracy. Where people may be excluded, deprioritised, profiled or disadvantaged, the audit should also consider equality, employment, consumer, financial, education, health or public-service obligations that sit outside GDPR.

 For a practical review, the audit pack should usually include:

| Audit area | What good evidence should show |
| --- | --- |
| Use-case and role scope | The approved purpose, whether the organisation is provider, deployer, controller, processor or another actor, and whether the audit covers the model alone or the whole system in use. |
| Population and impact | The people affected, relevant groups, vulnerability factors, protected characteristics where lawful and appropriate to assess, and the decisions or service effects the system may influence. |
| Dataset and proxy review | Data sources, representativeness, missing groups, labelling quality, proxy variables, historic bias, data minimisation choices and limits on what can lawfully be measured. |
| Testing approach | The fairness questions tested, metrics used, subgroup analysis where appropriate, false positive and false negative effects, confidence limits, and why the chosen metrics fit the use case. |
| Human oversight | Who reviews outputs, when they must intervene, what training they receive, how overrides are recorded, and whether human review is meaningful rather than rubber-stamping. |
| Explanation and challenge | What affected people, staff, DPOs, auditors and boards can understand about the system, and how a person can query, challenge or obtain review of an outcome. |
| Residual risk and action | Known limits, unresolved evidence gaps, mitigation actions, sign-off conditions, monitoring thresholds and events that require re-audit. |
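
The "confidence limits" entry in the testing row matters most when a subgroup is small, because a rate measured on a handful of cases can look far more certain than it is. The sketch below uses the Wilson score interval, a standard method for a proportion. Its use here is this example's choice rather than a requirement of any framework, and the function name and default `z` value (1.96, roughly a 95% interval) are assumptions.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion, e.g. an error rate in one subgroup.

    successes: number of cases with the outcome of interest
    n:         number of cases in the subgroup
    z:         normal quantile; 1.96 gives an approximate 95% interval
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return (max(0.0, centre - half_width), min(1.0, centre + half_width))
```

With 5 outcomes in 10 cases the interval runs from roughly 0.24 to 0.76, while 50 in 100 narrows it to roughly 0.40 to 0.60. An audit that reports subgroup rates without this kind of range may overstate what the test actually showed.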

 The audit should be honest about data limitations. In some EU contexts, organisations may not be able to process protected-characteristic data freely for testing. That does not mean the audit stops. It means the record should explain what lawful testing was possible, whether representative or proxy evidence was used, what limitations remain, and whether the organisation needs legal, equality or sector-specific advice before proceeding.

 It should also separate model performance from system fairness. A model may score well on a technical benchmark and still create unfair outcomes when used in the wrong population, with poor data quality, weak human oversight or a threshold that has not been tested against real-world consequences. Conversely, a system may have known model limitations but remain proportionate if the use case is narrow, the output is advisory, affected people are not disadvantaged, and review controls are strong.
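
The point about thresholds can be made concrete with a small sketch that computes false positive and false negative rates per group at a given threshold. It assumes each test row carries a group label, a model score and the known true outcome. The function name and row layout are hypothetical, and whether group labels can lawfully be used for testing is a separate question covered above.

```python
from collections import defaultdict


def subgroup_error_rates(rows, threshold):
    """Compute per-group error rates when scores >= threshold are flagged.

    rows: iterable of (group, score, actual), where actual is True or False.
    Returns {group: {"fpr": ..., "fnr": ..., "n": ...}}; a rate is None when
    the group has no cases of the relevant kind.
    """
    counts = defaultdict(lambda: {"fp": 0, "fn": 0, "neg": 0, "pos": 0})
    for group, score, actual in rows:
        flagged = score >= threshold
        c = counts[group]
        if actual:
            c["pos"] += 1
            if not flagged:
                c["fn"] += 1  # true case the system missed
        else:
            c["neg"] += 1
            if flagged:
                c["fp"] += 1  # person flagged without cause
    return {
        group: {
            "fpr": c["fp"] / c["neg"] if c["neg"] else None,
            "fnr": c["fn"] / c["pos"] if c["pos"] else None,
            "n": c["pos"] + c["neg"],
        }
        for group, c in counts.items()
    }
```

Running the same rows at two thresholds shows why a deployer's configuration choice belongs in the audit: the gap between groups can open or close without any change to the model itself.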

 For board and legal assurance, the final audit output should not be a dense technical appendix alone. It should provide a short decision record: what was audited, what evidence was reviewed, what bias risks were found, what mitigations are in place, what residual risk remains, who owns the risk, and when the audit must be repeated. That is the part most likely to matter if a complaint, regulator question, procurement challenge, employment dispute or customer harm issue arises later.

### Vendor Assurance Needs Context

 Many organisations will rely on third-party AI systems. That is normal, but it changes the evidence problem.

 A vendor may provide model cards, evaluation reports, audit summaries, instructions for use, technical documentation, human oversight guidance, performance information, security statements and terms about model improvement or data retention. Those materials may be useful. They are not the same as assurance for the organisation's own deployment.

 The organisation still needs to ask whether the vendor's evidence matches the actual use case. Was testing done on a comparable population? Does the system perform reliably in the language, domain and data context in which it will be used? Are the supplied explanations suitable? Does the contract support the transparency, retention, deletion, incident and change-control commitments being made in the DPIA?

 Under the EU AI Act, role mapping may also matter. Providers, deployers and other actors have different obligations, and high-risk systems bring more formal evidence requirements. Privacy teams do not need to own every AI Act workstream, but they should know where AI Act evidence overlaps with DPIAs, transparency, human oversight and board assurance. XpertDPO's article on [EU AI Act provider and deployer obligations for privacy teams](https://xpertdpo.com/eu-ai-act-provider-deployer-obligations-privacy-teams/) is a useful companion piece.

### Board and Legal Assurance: Ask for the Judgement Trail

 Senior teams do not need every technical detail, but they do need a clear judgement trail. That trail should show what the organisation is approving, what it is relying on, what remains uncertain, who owns the residual risk and when the position will be reviewed.

 For higher-risk AI use cases, a board or legal assurance pack should not simply say that bias, fairness and explainability have been considered. It should summarise the evidence: use case, affected people, personal data, fairness risks, testing performed, explanation approach, human oversight, vendor limitations, residual risks, DPO advice and review triggers.

It should also be honest about gaps. If bias testing is incomplete, protected-characteristic data is unsuitable for testing, a supplier cannot provide subgroup performance evidence, explanations are unfinished, or human review is not yet operational, the sign-off record should say so. The choice is then visible: pause, narrow the use case, add controls, approve conditionally, or reject the deployment.

That is the kind of evidence record that supports [Board / Legal Privacy Assurance](https://xpertdpo.com/board-legal-privacy-assurance/) work. It does not guarantee that a regulator, court or complainant will later agree with every judgement, but it does show that the organisation made a reasoned decision with the evidence available at the time.

### How This Connects to DPIA and DPO Support

 Bias, fairness and explainability evidence should not be bolted onto a DPIA at the end. Where an AI use case processes personal data and may affect people, these issues are part of the DPIA conversation from the start.

The DPIA should capture the nature, scope, context and purpose of the processing; necessity and proportionality; risks to individuals; measures to address those risks; residual risk; DPO advice where applicable; and whether prior consultation with the supervisory authority is needed if high risk cannot be reduced. For AI, that should connect to data quality, statistical accuracy, reasonable expectations, transparency, human review, challenge routes, vendor evidence and monitoring.

 Where the use case is uncertain, the first step may be a short screening record. Where risk is material, the work may need a full DPIA, AI impact assessment, AI Act role mapping, vendor review and board assurance. The important point is that the records should agree with each other. A procurement file should not describe one use case while the DPIA describes another.
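
Checking that records agree with each other can be partly mechanical if the key descriptors are held as structured fields. The sketch below assumes each record (for example a procurement file and a DPIA) exposes the same field names. Those names, and the function `record_mismatches`, are hypothetical. The comparison is a literal text match after trimming and lower-casing, so it will flag differences in wording that a human may judge to be harmless.

```python
def record_mismatches(
    records: dict[str, dict[str, str]],
    fields: list[str],
) -> dict[str, dict[str, str]]:
    """Return, for each field where the records disagree, the value held by each record.

    records: mapping of record name -> {field name: value}
    fields:  the field names to compare across all records
    """
    mismatches = {}
    for field_name in fields:
        values = {
            name: record.get(field_name, "").strip().lower()
            for name, record in records.items()
        }
        if len(set(values.values())) > 1:
            mismatches[field_name] = values
    return mismatches
```

A mismatch on a field such as the stated purpose is a prompt to ask which record reflects the real deployment, not an automatic finding of non-compliance.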

 XpertDPO's [AI Governance and DPIA Lifecycle Support](https://xpertdpo.com/ai-governance-dpia-lifecycle-support/) is designed for this connection point: keeping AI use-case review, DPIA evidence, supplier evidence and review cycles aligned. For individual assessments, [DPIA Support](https://xpertdpo.com/data-protection-impact-assessment-dpia-support/) can help structure the risk record. For in-house DPOs, [DPO Support](https://xpertdpo.com/dpo-support/) can provide a second view without transferring ownership away from the controller.

### The Practical Takeaway

 Bias, fairness and explainability should not be treated as decorative AI ethics language. They are governance evidence questions.

 A useful AI governance record should answer, in plain terms: what system is being used, who may be affected, what personal data is involved, what bias or unfairness risks were considered, what testing was done, what explanations will be provided, what human oversight exists, what the organisation has accepted, and what will trigger review.

 The level of evidence should match the risk. A low-risk internal AI feature may need a brief explanation and sensible controls. A high-impact tool affecting employment, health, finance, education, public services or access to essential services needs stronger proof. The difference should be deliberate, documented and reviewable.

 For CPD Event B Hour 4, the core learning point is simple: fairness and explainability are not solved by saying the words. They are evidenced through the lifecycle of the AI use case.

 *This article is intended to support the learning covered in Hour 4 of our [XpertAcademy](https://xpertacademy.com/) CPD programme. The relevant CPD certificate is issued for completion of the full one-hour session on XpertAcademy, rather than for reading this article on its own. You can return to the course here: [CPD Event B: Full-Day AI, Technical Privacy & Emerging Technology Training](https://xpertacademy.com/cpd-event-b-ai-technical/).*

### Sources

- Information Commissioner's Office, Guidance on AI and data protection: [https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/guidance-on-ai-and-data-protection/](https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/guidance-on-ai-and-data-protection/)
- Information Commissioner's Office, How do we ensure fairness in AI?: [https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/guidance-on-ai-and-data-protection/how-do-we-ensure-fairness-in-ai/](https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/guidance-on-ai-and-data-protection/how-do-we-ensure-fairness-in-ai/)
- Information Commissioner's Office, How do we ensure transparency in AI?: [https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/guidance-on-ai-and-data-protection/how-do-we-ensure-transparency-in-ai/](https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/guidance-on-ai-and-data-protection/how-do-we-ensure-transparency-in-ai/)
- Information Commissioner's Office and The Alan Turing Institute, Explaining decisions made with AI: [https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/explaining-decisions-made-with-artificial-intelligence/](https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/explaining-decisions-made-with-artificial-intelligence/)
- Information Commissioner's Office, AI and data protection risk toolkit: [https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/guidance-on-ai-and-data-protection/ai-and-data-protection-risk-toolkit/](https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/guidance-on-ai-and-data-protection/ai-and-data-protection-risk-toolkit/)
- European Data Protection Board, Opinion 28/2024 on certain data protection aspects related to the processing of personal data in the context of AI models: [https://www.edpb.europa.eu/documents/opinion-of-the-board-art-64/opinion-282024-on-certain-data-protection-aspects-related-to_en](https://www.edpb.europa.eu/documents/opinion-of-the-board-art-64/opinion-282024-on-certain-data-protection-aspects-related-to_en)
- European Commission, AI Act overview: [https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai](https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai)
- European Commission, Navigating the AI Act: [https://digital-strategy.ec.europa.eu/en/faqs/navigating-ai-act](https://digital-strategy.ec.europa.eu/en/faqs/navigating-ai-act)
- NIST AI Resource Center, AI Risks and Trustworthiness / AI RMF 1.0: [https://airc.nist.gov/airmf-resources/airmf/3-sec-characteristics/](https://airc.nist.gov/airmf-resources/airmf/3-sec-characteristics/)
- OECD, AI Principles: [https://www.oecd.org/en/topics/sub-issues/ai-principles.html](https://www.oecd.org/en/topics/sub-issues/ai-principles.html)

## General Information Only

This article is provided for general information and does not constitute legal, regulatory, or professional advice. Data protection obligations depend on the specific facts, context, and jurisdiction involved. You should not rely on this content as a substitute for advice tailored to your organisation.

If you would like support with a specific issue, please contact us: https://xpertdpo.com/contact/
