
What Is an AI Fairness Certificate and Why Your Model Needs One

AI fairness certificates provide verifiable proof that machine learning models have been tested for bias. Learn what they contain, how they work, and why regulators are requiring them.



Every AI model makes decisions. Some of those decisions affect people’s lives — whether they get a loan, a job interview, or a medical diagnosis. A Fairness Certificate answers one question: was this model tested for bias, and what were the results?

The Problem with AI Fairness Today

Most organizations handle AI fairness in one of three ways:

  1. Ignore it — ship the model and hope for the best
  2. Internal reports — run some tests, write a PDF, file it away
  3. Third-party audits — expensive, slow, and the auditor holds the keys

None of these are verifiable. None of them are standardized. And none of them survive a regulator asking “prove it.”

What a Fairness Certificate Contains

A Fairness Certificate is a structured, cryptographically signed document with three layers:

Fairness Metrics

  • Demographic parity: Does the model’s positive prediction rate differ across groups?
  • Equalized odds: Do the model’s true positive and false positive rates differ across groups?
  • Bottleneck dimensionality (d): The size of the model’s internal representation. This is the critical number — it determines how much protected information the model can encode, and therefore how fair the model is architecturally. (More on this below.)
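The two classical metrics above reduce to simple rate comparisons. Here is a minimal numpy sketch of both gaps; the toy arrays and function names are illustrative, not part of the certificate format:

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Largest difference in positive prediction rate across groups."""
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return max(rates) - min(rates)

def equalized_odds_gap(y_true, y_pred, group):
    """Largest difference in TPR (on positives) or FPR (on negatives) across groups."""
    gaps = []
    for label in (1, 0):
        mask = y_true == label
        rates = [y_pred[mask & (group == g)].mean() for g in np.unique(group)]
        gaps.append(max(rates) - min(rates))
    return max(gaps)

# Toy data: two groups, three samples each
y_true = np.array([1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0])
group  = np.array(["a", "a", "a", "b", "b", "b"])

print(demographic_parity_gap(y_pred, group))      # positive-rate gap between groups
print(equalized_odds_gap(y_true, y_pred, group))  # worst of the TPR and FPR gaps
```

A gap of 0 means the groups are treated identically on that metric; a certificate would report these gaps per protected attribute.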

Integrity Guarantees

  • SHA-256 hash: A unique fingerprint of the certificate contents. Any modification invalidates it.
  • QR code: Links to verification at paragondao.org
  • Timestamp: When the certificate was generated
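The hash layer is straightforward to reproduce: serialize the certificate deterministically, then fingerprint it. A sketch using Python's standard library, with hypothetical field names:

```python
import hashlib
import json

def certificate_fingerprint(cert: dict) -> str:
    """SHA-256 over a canonical JSON serialization: any edit changes the hash."""
    canonical = json.dumps(cert, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Illustrative certificate contents
cert = {"model": "prs-demo", "d": 32, "demographic_parity_gap": 0.04}
original = certificate_fingerprint(cert)

# Tampering with any metric produces a different fingerprint
tampered = dict(cert, demographic_parity_gap=0.01)
print(certificate_fingerprint(tampered) != original)  # True
```

Sorting keys and fixing separators matters: two semantically identical certificates must serialize to the same bytes, or verification would fail spuriously.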

Model Documentation

  • Dataset size and protected attributes tested
  • Model architecture and bottleneck dimension
  • Per-group performance breakdown

Why Bottleneck Dimensionality Matters

Traditional fairness metrics tell you what happened. Bottleneck dimensionality tells you why.

A model can achieve demographic parity by accident — or by gaming the metric. Bottleneck dimensionality measures something more fundamental: how much capacity does the model’s internal representation have to encode protected attributes?

The parameter d is an integer representing the number of dimensions in the model’s latent space:

  • d=8: Small representation. The model is forced to prioritize — it keeps task-relevant features and discards protected attributes because there is not enough room for both. Fairness is structural.
  • d=128: Large representation. The model has excess capacity. It can store both the task signal and ancestry, gender, or age signals. Any observed fairness at d=128 is coincidental, not guaranteed.

This is the difference between a model that happens to be fair on your test set and one that is architecturally constrained to be fair.
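The capacity argument can be seen even in a linear toy model: whatever the encoder does, a d-dimensional latent space can never carry more than d independent directions of information. A numpy sketch (random linear encoder and dimensions chosen purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 256))  # raw input features, 256 dims

def bottleneck_rank(X, d, rng):
    """Project X through a random linear encoder into a d-dim latent space."""
    W = rng.normal(size=(X.shape[1], d))  # encoder weights
    return int(np.linalg.matrix_rank(X @ W))

r8 = bottleneck_rank(X, 8, rng)
r128 = bottleneck_rank(X, 128, rng)
print(r8, r128)  # the latent rank is capped at d: 8 vs 128
```

At d=8, the 256 input dimensions must compete for 8 slots; at d=128 there is ample room for the task signal and protected attributes to coexist.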

Our research across genomics (5 ancestral populations, 6 clinical traits) and electroencephalography (20 subjects) demonstrates that d accounts for 46.6 percentage points of variation in bias, compared to just 2.2 percentage points from adversarial training strength. Read the full analysis.

Machine-Verifiable, Not Just Human-Readable

A PDF report sits in a folder. A Fairness Certificate lives on infrastructure:

# Verify any certificate by its hash
curl https://paragondao.org/api/v1/verify/abc123...

# Response includes full metrics, timestamp, and integrity status

Regulators, customers, and partners can independently verify any certificate without trusting your internal processes.

Who Needs Fairness Certificates

The EU AI Act (August 2026) is the most immediate regulatory driver, but it is not the only one. High-risk AI systems in healthcare and genomics are the primary use cases today:

  • Genomics and precision medicine: PRS models that predict disease risk across diverse ancestral populations
  • Health AI: Diagnostic models, treatment recommendations, risk scoring deployed to diverse patient populations
  • Clinical trials: FDORA 2022 mandates diversity action plans for AI-assisted trial design

As fairness documentation becomes standard practice, financial services (lending, credit scoring), HR (resume screening, candidate ranking), and insurance (risk assessment) will follow the same trajectory.

Generate Your First Certificate

docker pull ghcr.io/paragon-dao/paragon-fairness:latest

docker run --rm -v $(pwd)/data:/data \
  ghcr.io/paragon-dao/paragon-fairness:latest \
  train --data /data/dataset.csv \
        --target outcome \
        --protected race gender

Your data stays on your machine. The certificate is generated locally. The free tier supports up to 1,000 samples at two bottleneck dimensions (d=32 and d=64).

Try the interactive demo to see how fairness changes as you adjust the bottleneck dimension — or follow the step-by-step Docker tutorial for a complete walkthrough.