AI Ethics & Responsible Use
The questions everyone should ask before trusting AI with anything that matters.
The algorithm that sent innocent people to jail
In 2016, a ProPublica investigation revealed that COMPAS — an AI system used by US courts to predict whether a defendant would re-offend — showed racial disparities in both error types: Black defendants were nearly twice as likely to be falsely labeled high-risk, while white defendants were nearly twice as likely to be falsely labeled low-risk (Angwin et al., ProPublica, 2016). Judges were using these risk scores to make bail and sentencing decisions. Real people went to jail or stayed in jail longer because of a biased algorithm.
Nobody programmed the bias intentionally. The system learned from historical data — decades of arrest records that already reflected racial disparities in policing. The algorithm looked at that data and concluded: "people from this zip code get re-arrested more often." It didn't know it was encoding systemic racism. It just found the pattern.
This is the central challenge of AI ethics: these systems absorb the world as it is — not as it should be. And when you deploy them at scale, they amplify whatever patterns they learned, fair or not.
Bias: the mirror that doesn't lie
AI systems are mirrors. They reflect the data they were trained on — including every prejudice, stereotype, and imbalance in that data.
Think of it this way: if you trained an AI on 100 years of newspaper photos and asked it to generate an image of a "CEO," it would overwhelmingly generate images of white men in suits. Not because the AI is sexist or racist, but because that's what 100 years of CEO photos look like. The AI learned the pattern of the past and projected it into the future.
Where bias hides
| Source of bias | What happens | Real example |
|---|---|---|
| Training data | Historical inequities get baked in | Hiring AI trained on past hires penalises women because past hires were mostly men |
| Label bias | Human labellers bring their own prejudices | Medical AI trained on data labelled primarily by doctors in wealthy countries misses diseases common in developing nations |
| Selection bias | Some groups are underrepresented in training data | Facial recognition works well on light-skinned faces, poorly on dark-skinned faces — because training data was predominantly light-skinned |
| Measurement bias | The thing you're measuring is a proxy for something else | Using zip code as a feature inadvertently encodes race due to residential segregation |
There Are No Dumb Questions
"Can't you just remove race and gender from the training data to eliminate bias?"
Unfortunately, no. This is called "fairness through blindness" and it doesn't work. Even if you remove race explicitly, the model can infer it from correlated features — zip code, name, school attended, hobbies. A model can reconstruct race with high accuracy from these proxies. Real bias mitigation requires active testing and intervention, not just removing columns from a spreadsheet.
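The proxy effect is easy to demonstrate with a toy simulation. Everything below is synthetic — invented groups, ten fake zip codes — and is only meant to show that a model which never sees the protected attribute can still recover it from a single correlated feature:

```python
import random
from collections import Counter, defaultdict

random.seed(0)

# Toy data: the model never sees the protected attribute ("group"), but
# residential segregation makes zip code a strong proxy for it.
def make_person():
    group = random.choice(["A", "B"])
    # 90% of group A live in zips 0-4; 90% of group B live in zips 5-9
    home_zips = range(0, 5) if group == "A" else range(5, 10)
    away_zips = range(5, 10) if group == "A" else range(0, 5)
    zip_code = random.choice(home_zips if random.random() < 0.9 else away_zips)
    return group, zip_code

train = [make_person() for _ in range(5000)]
test = [make_person() for _ in range(1000)]

# "Model": for each zip code, predict the majority group seen in training.
by_zip = defaultdict(Counter)
for group, zip_code in train:
    by_zip[zip_code][group] += 1
predict = {z: counts.most_common(1)[0][0] for z, counts in by_zip.items()}

accuracy = sum(predict[z] == g for g, z in test) / len(test)
print(f"Group recovered from zip code alone: {accuracy:.0%}")
```

Even this one-feature lookup recovers the hidden group roughly nine times out of ten; a real model with dozens of correlated proxies has far more signal to work with.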
"Is all AI bias bad? What about a medical AI that gives different treatment recommendations to different demographics?"
Good question. Some differences are medically appropriate — certain conditions are more prevalent in certain populations, and treatment guidelines genuinely differ. The problem is when the AI treats people differently for reasons that aren't medically justified, like giving lower pain management recommendations to Black patients (which studies show happens in human medicine too, and AI trained on that data perpetuates it). The line between appropriate differentiation and harmful bias requires careful domain expertise to draw.
Hallucination: the student who never says "I don't know"
You know that kid in class who always raises their hand — even when they don't know the answer? They'll say something that sounds smart, delivered with total confidence, and half the time it's completely wrong. But they never, ever say "I don't know."
That's an LLM hallucinating.
Hallucination is when an AI generates information that sounds plausible but is factually incorrect. The model doesn't know it's wrong — it doesn't have a concept of "knowing." It just predicted the most likely next tokens, and those tokens happened to form a false statement.
Why hallucinations happen
| Reason | What's going on |
|---|---|
| Pattern completion | The model predicts what text usually comes next, even if it's wrong in this specific case |
| Training data conflicts | The internet contains contradictory information — the model can pick the wrong version |
| No knowledge boundary | The model has no internal flag that says "I don't have reliable information about this" |
| Confidence is baked in | Fine-tuning and RLHF (Reinforcement Learning from Human Feedback — where human raters score responses to teach the model what "good" looks like) train the model to sound helpful and confident — even when it shouldn't be |
The hallucination spectrum
Not all hallucinations are equally dangerous:
| Type | Risk level | Example |
|---|---|---|
| Trivial | Low | AI says a movie came out in 2019 when it was 2020 |
| Misleading | Medium | AI provides outdated medical dosage information |
| Fabricated sources | High | AI cites a legal case or scientific paper that doesn't exist |
| Dangerous | Critical | AI provides incorrect drug interaction information to a patient |
In 2023, attorneys in Mata v. Avianca (SDNY, 2023) made headlines when they submitted a legal brief containing six case citations generated by ChatGPT. None of the cases existed. The model had invented them — complete with realistic case names, docket numbers, and legal reasoning. The attorneys were sanctioned by the court.
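A minimal safety net for this failure mode is to treat every citation as unverified until it matches a trusted source. The sketch below uses a hard-coded allowlist as a stand-in for a real legal database, and the regex is an illustrative assumption, not a production citator:

```python
import re

# Stand-in for a real citation database (Westlaw, CourtListener, etc.).
KNOWN_CASES = {
    "Brown v. Board of Education, 347 U.S. 483 (1954)",
}

# Rough pattern for "Party v. Party, Reporter (Court Year)" citations.
CITATION_PATTERN = re.compile(r"[A-Z][\w'.-]+ v\. [^,]+, [^(]+\([^)]*\d{4}\)")

def audit_citations(draft: str) -> list[str]:
    """Return citations in the draft that could not be verified."""
    found = CITATION_PATTERN.findall(draft)
    return [c for c in found if c not in KNOWN_CASES]

# One of the citations ChatGPT actually fabricated in the Avianca brief:
draft = (
    "As held in Varghese v. China Southern Airlines, 925 F.3d 1339 "
    "(11th Cir. 2019), the limitations period is tolled."
)
unverified = audit_citations(draft)
print(unverified)  # the fabricated case is flagged for human review
```

The point isn't the regex — it's the workflow: the model proposes, an external source of truth disposes, and anything that can't be verified goes to a human before it goes to a judge.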
Privacy: what goes in can come out
Here's a thought experiment. You paste your company's confidential financial projections into ChatGPT and ask it to summarise them. Where did that data just go?
It went to OpenAI's servers. Depending on the terms of service, it might be used to train future models. Which means fragments of your confidential data could theoretically surface in responses to other users. Even if the company says they won't use your data for training, you've still transmitted it to a third party.
The privacy challenge with AI has three dimensions:
| Dimension | The risk | Example |
|---|---|---|
| Input privacy | Data you send to the AI leaves your control | An employee pastes customer SSNs into an AI tool |
| Training data privacy | People's personal data was used to train the model without consent | Your blog posts, photos, and social media comments were scraped to train an AI |
| Output privacy | The AI might reveal information from its training data | A model trained on medical records could surface patient information |
What organizations are doing about it
- Data classification policies: Define what can and can't be sent to AI tools (no PII, no trade secrets, no customer data)
- Self-hosted models: Run AI models on your own servers so data never leaves your infrastructure
- Enterprise agreements: Contracts that guarantee your data won't be used for training
- Anonymisation: Strip identifying information before sending data to AI tools
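Anonymisation in its simplest form is a redaction pass before any text leaves your infrastructure. The patterns below are illustrative assumptions (US-style SSNs, simple e-mail and card formats); real PII detection needs named-entity recognition and human review on top of this:

```python
import re

# Minimal redaction patterns -- illustrative, not exhaustive.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "CARD": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace matched identifiers with typed placeholders before the
    text is sent to any external AI tool."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

msg = "Customer Jane (SSN 123-45-6789, jane@example.com) disputed a charge."
print(redact(msg))
# The SSN and e-mail are replaced, but the name "Jane" still leaks:
# regexes alone are not enough for real anonymisation.
```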
There Are No Dumb Questions
"If I use an enterprise version of ChatGPT or Claude, is my data safe?"
Enterprise versions typically guarantee that your data won't be used for training and include stronger security agreements. But "safe" depends on your definition. The data still travels to and from external servers (unless you self-host). For truly sensitive data — classified information, medical records, financial data — many organizations require self-hosted models or on-premise solutions.
"What about the data the model was trained on? Did those people consent?"
This is one of the biggest unresolved questions in AI. Most LLMs were trained on publicly available internet data, which includes content people posted without expecting it to be used for AI training. Lawsuits are pending — the New York Times is suing OpenAI, artists are suing Stability AI, and new regulations like the EU AI Act are establishing rules around training data consent. The legal landscape is evolving rapidly.
Environmental cost: the hidden price tag
Training large frontier models has been estimated to consume tens of thousands of megawatt-hours (estimates vary; OpenAI has not disclosed official figures) — equivalent to the annual energy use of thousands of homes. Early estimates suggested LLM queries used around 10× the energy of a web search (Goldman Sachs, May 2023). More recent analysis finds modern models have become far more efficient — current estimates put a ChatGPT query at roughly 0.3 Wh, comparable to a standard web search (IEA/Goldman Sachs, 2024 — verify as efficiency continues to improve).
| Activity | Energy per query |
|---|---|
| Google search | ~0.3 Wh |
| ChatGPT query | ~0.3 Wh (est.) |
| Training large frontier models (total) | Tens of thousands of MWh (est.; varies by model) |
| Image generation (DALL-E) | ~3-10 Wh |
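To put the table in perspective, here's a back-of-envelope calculation. The query volume is a hypothetical assumption, not a disclosed figure, and the household average is a rough US figure:

```python
# Back-of-envelope scale check using the ~0.3 Wh/query estimate above
# and an assumed volume of 1 billion queries per day (hypothetical).
wh_per_query = 0.3
queries_per_day = 1_000_000_000

mwh_per_day = wh_per_query * queries_per_day / 1_000_000  # 1 MWh = 1,000,000 Wh
print(f"{mwh_per_day:.0f} MWh/day")  # 300 MWh/day under these assumptions

# A US household uses roughly 10 MWh of electricity per year, so daily
# inference at this volume works out to about:
homes_equivalent = mwh_per_day * 365 / 10
print(f"~{homes_equivalent:,.0f} household-years of electricity per year")
```

Tiny per query, significant in aggregate — which is exactly why per-query efficiency gains matter so much.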
The environmental cost is real but needs context:
- AI's total energy consumption is still a fraction of many industries (streaming video uses far more electricity globally than AI — for now)
- Efficiency is improving rapidly — newer models do more with less energy
- Many AI companies are investing heavily in renewable energy — Microsoft, Google, and others have made carbon-neutral pledges
The responsible approach: use AI where it creates genuine value. Don't use a 200-billion-parameter model to answer a question you could Google in 5 seconds.
Job impact: tasks change, not (usually) entire jobs
Every time a major technology arrives, people panic about jobs. The printing press. The assembly line. The spreadsheet. ATMs. Self-checkout. Each one eliminated specific tasks but rarely eliminated entire jobs.
AI is following the same pattern — with some important differences.
| Pattern | Historical example | AI example |
|---|---|---|
| Task automation | ATMs automated cash withdrawal, but banks hired MORE tellers (for sales and service) | AI automates first-draft writing, but editors focus on strategy and quality |
| Task augmentation | Spreadsheets made accountants faster, not obsolete | A controlled study found developers completed specific coding tasks up to 55% faster with Copilot (Peng et al., GitHub/Microsoft Research, 2022); broader surveys suggest 30–55% improvement on defined tasks, though gains vary significantly by task type |
| New job creation | The internet created jobs that didn't exist before (social media manager, SEO specialist) | AI creates new roles: prompt engineer, AI safety researcher, AI ethicist |
| Job transformation | Photographers shifted from darkroom skills to digital editing | Designers shift from pixel-pushing to directing AI tools |
The key insight: AI automates tasks, not jobs. Most jobs are bundles of many tasks. AI might automate 30% of a knowledge worker's tasks — the repetitive, routine ones — freeing them to focus on the 70% that requires judgment, creativity, and human connection.
That said, the transition isn't painless. The people whose jobs consist primarily of automatable tasks face real disruption. And the transition period — when old jobs are shrinking but new ones haven't fully formed — is genuinely difficult.
Misinformation and deepfakes: the trust crisis
AI has made it trivially easy to create convincing fake content:
- Text: Generate a fake news article indistinguishable from real journalism in seconds
- Images: Create photorealistic images of events that never happened
- Audio: Clone someone's voice from a 3-second sample
- Video: Generate realistic video of people saying things they never said
The technology isn't inherently evil — it's also used for movie special effects, accessibility tools, and creative expression. But the potential for harm is enormous.
In 2024, a deepfake audio clip of a school principal appearing to make racist remarks went viral. It was fake — created by the school's athletic director with free AI tools. But by the time it was debunked, the principal had received death threats and the school district was in crisis.
The defense toolkit
| Defense | How it works | Limitation |
|---|---|---|
| Media literacy | Teach people to verify before sharing | Doesn't scale — fakes spread faster than fact-checks |
| Digital watermarking | Embed invisible markers in AI-generated content | Only works if all AI tools participate |
| Detection tools | AI that detects AI-generated content | Arms race — detectors and generators keep improving |
| Provenance tracking | Cryptographic proof of where content originated | Requires industry-wide adoption |
| Platform policies | Social media platforms label or remove AI-generated content | Inconsistent enforcement |
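The provenance idea in the table can be sketched in a few lines. Real systems such as C2PA add cryptographic signatures and embed the manifest in the media file itself; this toy version shows only the tamper-evidence mechanism:

```python
import hashlib

# Sketch of content provenance: a manifest commits to the exact bytes of
# the content and records where it came from. Any edit changes the hash.
def make_manifest(content: bytes, origin: str) -> dict:
    return {
        "sha256": hashlib.sha256(content).hexdigest(),
        "origin": origin,
        "issued_at": "2024-01-01T00:00:00Z",  # fixed timestamp for the example
    }

def verify(content: bytes, manifest: dict) -> bool:
    return hashlib.sha256(content).hexdigest() == manifest["sha256"]

photo = b"...image bytes..."
manifest = make_manifest(photo, origin="NewsCo camera app")

print(verify(photo, manifest))         # untouched content checks out
print(verify(photo + b"x", manifest))  # any edit breaks the chain
```

The limitation in the table applies here too: a hash manifest only helps if creators publish manifests and platforms check them, which is why provenance requires industry-wide adoption.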
Responsible use principles: the framework that ties it all together
After all these challenges — bias, hallucination, privacy, environmental cost, jobs, misinformation — what does responsible AI use actually look like? Here are five principles that every organization should adopt:
✗ AI used carelessly
- Bias amplified at scale
- Opaque decision-making
- Job displacement
- Misinformation generation
- Privacy erosion
- Concentration of power

✓ AI used responsibly
- Medical diagnosis accuracy
- Climate modelling
- Scientific research speed
- Accessibility tools
- Education personalisation
- Economic productivity
1. Transparency
Tell people when they're interacting with AI. Tell them what data you're collecting. Tell them how decisions are made. No hidden algorithms making consequential choices about people's lives.
2. Accountability
Someone — a real human, not "the algorithm" — must be accountable for AI decisions. If the AI denies a loan, a human must be able to explain why and override the decision.
3. Human oversight
Keep humans in the loop for high-stakes decisions. AI can recommend, but humans decide — especially when the decision affects health, freedom, finances, or safety.
4. Fairness testing
Actively test for bias across demographic groups before deployment. Don't wait for a ProPublica investigation to discover your system is discriminating.
5. Proportional use
Use the right tool for the job. Don't deploy a powerful AI system where a simple rule-based system would work. Don't use AI for decisions where the stakes are too high to tolerate any error rate.
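Fairness testing (principle 4) can start as something very simple: compare selection rates across groups before deployment. The sketch below applies the "four-fifths rule" used in US employment law — no group's selection rate should fall below 80% of the highest group's — to synthetic decisions invented for illustration:

```python
from collections import defaultdict

# Synthetic (group, approved?) decisions for illustration only.
decisions = [
    ("group_a", True), ("group_a", True), ("group_a", False), ("group_a", True),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]

totals, approved = defaultdict(int), defaultdict(int)
for group, decision in decisions:
    totals[group] += 1
    approved[group] += decision

# Selection rate per group, and each rate relative to the best-treated group.
rates = {g: approved[g] / totals[g] for g in totals}
best = max(rates.values())
ratios = {g: r / best for g, r in rates.items()}

# Four-fifths rule: flag any group below 80% of the top selection rate.
flagged = [g for g, ratio in ratios.items() if ratio < 0.8]
print(rates)
print(flagged)
```

A check like this belongs in the deployment pipeline, run on every retrained model — not in a postmortem after the system has already made thousands of decisions.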
- ⚖️ Fairness: AI should treat people equitably and not discriminate based on protected characteristics.
- 🔍 Transparency: People affected by AI decisions should be able to understand how those decisions were made.
- 🛡️ Safety: AI systems should be reliable, secure, and designed to minimise harm.
- 🔐 Privacy: Personal data used to train or operate AI must be handled with consent and care.
- 🧑‍⚖️ Accountability: There must be humans responsible for AI outcomes. "The algorithm decided" is not a defence.
There Are No Dumb Questions
"This all sounds overwhelming. Do I personally need to worry about all of this?"
You don't need to solve every problem. But you DO need to ask the right questions. Before using AI for any consequential decision, ask: "What happens if this is wrong? Who gets hurt? Did we test for bias? Is a human checking the output?" If you can't answer those questions, you're not ready to deploy.
"Isn't this going to slow down AI development?"
Some friction is intentional. Building a bridge takes longer when you add safety inspections — and that's a feature, not a bug. The companies that invest in responsible AI now will avoid the lawsuits, scandals, and regulatory penalties that are hitting companies that moved too fast. Responsibility isn't the opposite of speed — it's what makes speed sustainable.
Back to COMPAS
The COMPAS system didn't set out to be biased — it set out to be accurate. It found real statistical patterns in historical arrest data. The problem is that those patterns encoded decades of racially unequal policing, not individual risk. When the algorithm confidently scored Black defendants as higher-risk based on zip code and social network proxies, it was doing exactly what it was trained to do — and that's precisely the danger. Removing the word "race" from the input fields changed nothing, because the correlated proxies remained. What COMPAS lacked was any of the five principles: no transparency to defendants, no accountability when scores were wrong, no fairness testing across racial groups before deployment, and human oversight that amounted to judges trusting a number they couldn't interrogate. The lesson isn't that algorithms are untrustworthy — it's that unexamined algorithms inherit the world as it is and amplify it at scale.
Key takeaways
- AI mirrors its training data — bias included. If historical data contains discrimination, the AI will learn and amplify that discrimination. Removing protected attributes from the data doesn't fix this because proxies exist.
- Hallucination is a feature of how LLMs work, not a bug to be fixed. Models predict likely text, not truthful text. For any high-stakes use, you need verification layers (RAG, human review, confidence thresholds).
- What goes into an AI can come out. Treat every piece of data you send to an AI service as potentially public. Classify your data before exposing it to AI tools.
- AI changes tasks, not (usually) entire jobs. The transition is real, but it follows historical patterns of technology transforming roles rather than eliminating them wholesale.
- Five principles for responsible use: transparency, accountability, human oversight, fairness testing, and proportional use. Apply all five before deploying AI for any consequential decision.
- Asking the right questions matters more than having all the answers. "What happens if this is wrong?" is the single most important question in AI ethics.
Knowledge Check
1. A hiring AI rejects candidates from certain zip codes at a higher rate. The system never uses race as an input feature. Why might this still constitute racial bias?
2. A lawyer submits a legal brief citing six court cases generated by an LLM. None of the cases exist. Which aspect of how LLMs work explains this failure?
3. An employee pastes confidential customer data into a public AI chatbot to generate a summary. What are the primary privacy risks?
4. Which of the five responsible AI principles specifically addresses the concern that "the algorithm decided" should never be an acceptable explanation for a consequential decision?