AI Ethics & Responsible Use
The questions everyone should ask before trusting AI with anything that matters.
The algorithm that sent innocent people to jail
In 2016, a ProPublica investigation revealed that COMPAS — an AI system used by US courts to predict whether a defendant would re-offend — showed racial disparities in both error types: Black defendants were nearly twice as likely to be falsely labeled high-risk, while white defendants were nearly twice as likely to be falsely labeled low-risk (Angwin et al., ProPublica, 2016). Judges were using these risk scores to make bail and sentencing decisions. Real people went to jail or stayed in jail longer because of a biased algorithm.
Nobody programmed the bias intentionally. The system learned from historical data — decades of arrest records that already reflected racial disparities in policing. The algorithm looked at that data and concluded: "people from this zip code get re-arrested more often." It didn't know it was encoding systemic racism. It just found the pattern.
This is the central challenge of AI ethics: these systems absorb the world as it is — not as it should be. And when you deploy them at scale, they amplify whatever patterns they learned, fair or not.
Bias: the mirror that doesn't lie
AI systems are mirrors. They reflect the data they were trained on — including every prejudice, stereotype, and imbalance in that data.
Think of it this way: if you trained an AI on 100 years of newspaper photos and asked it to generate an image of a "CEO," it would overwhelmingly generate images of white men in suits. Not because the AI is sexist or racist, but because that's what 100 years of CEO photos look like. The AI learned the pattern of the past and projected it into the future.
Where bias hides
| Source of bias | What happens | Real example |
|---|---|---|
| Training data | Historical inequities get baked in | Hiring AI trained on past hires penalises women because past hires were mostly men |
| Label bias | Human labellers bring their own prejudices | Medical AI trained on data labelled primarily by doctors in wealthy countries misses diseases common in developing nations |
| Selection bias | Some groups are underrepresented in training data | Facial recognition works well on light-skinned faces, poorly on dark-skinned faces — because training data was predominantly light-skinned |
| Measurement bias | The thing you're measuring is a proxy for something else | Using zip code as a feature inadvertently encodes race due to residential segregation |
There Are No Dumb Questions
"Can't you just remove race and gender from the training data to eliminate bias?"
Unfortunately, no. This is called "fairness through blindness" and it doesn't work. Even if you remove race explicitly, the model can infer it from correlated features — zip code, name, school attended, hobbies. A model can reconstruct race with high accuracy from these proxies. Real bias mitigation requires active testing and intervention, not just removing columns from a spreadsheet.
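The proxy effect is easy to demonstrate with a toy simulation. Everything below is synthetic — invented groups, ten fake zip codes — and is only meant to show that a model which never sees the protected attribute can still recover it from a single correlated feature:

```python
import random
from collections import Counter, defaultdict

random.seed(0)

# Toy data: the model never sees the protected attribute ("group"), but
# residential segregation makes zip code a strong proxy for it.
def make_person():
    group = random.choice(["A", "B"])
    # 90% of group A live in zips 0-4; 90% of group B live in zips 5-9
    home_zips = range(0, 5) if group == "A" else range(5, 10)
    away_zips = range(5, 10) if group == "A" else range(0, 5)
    zip_code = random.choice(home_zips if random.random() < 0.9 else away_zips)
    return group, zip_code

train = [make_person() for _ in range(5000)]
test = [make_person() for _ in range(1000)]

# "Model": for each zip code, predict the majority group seen in training.
by_zip = defaultdict(Counter)
for group, zip_code in train:
    by_zip[zip_code][group] += 1
predict = {z: counts.most_common(1)[0][0] for z, counts in by_zip.items()}

accuracy = sum(predict[z] == g for g, z in test) / len(test)
print(f"Group recovered from zip code alone: {accuracy:.0%}")
```

Even this one-feature lookup recovers the hidden group roughly nine times out of ten; a real model with dozens of correlated proxies has far more signal to work with.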
"Is all AI bias bad? What about a medical AI that gives different treatment recommendations to different demographics?"
Good question. Some differences are medically appropriate — certain conditions are more prevalent in certain populations, and treatment guidelines genuinely differ. The problem is when the AI treats people differently for reasons that aren't medically justified, like giving lower pain management recommendations to Black patients (which studies show happens in human medicine too, and AI trained on that data perpetuates it). The line between appropriate differentiation and harmful bias requires careful domain expertise to draw.
Hallucination: the student who never says "I don't know"
You know that kid in class who always raises their hand — even when they don't know the answer? They'll say something that sounds smart, delivered with total confidence, and half the time it's completely wrong. But they never, ever say "I don't know."
That's an LLM hallucinating.
Hallucination is when an AI generates information that sounds plausible but is factually incorrect. The model doesn't know it's wrong — it doesn't have a concept of "knowing." It just predicted the most likely next tokens, and those tokens happened to form a false statement.
Why hallucinations happen
| Reason | What's going on |
|---|---|
| Pattern completion | The model predicts what text usually comes next, even if it's wrong in this specific case |
| Training data conflicts | The internet contains contradictory information — the model can pick the wrong version |
| No knowledge boundary | The model has no internal flag that says "I don't have reliable information about this" |
| Confidence is baked in | Fine-tuning and RLHF (Reinforcement Learning from Human Feedback — where human raters score responses to teach the model what "good" looks like) train the model to sound helpful and confident — even when it shouldn't be |
The hallucination spectrum
Not all hallucinations are equally dangerous:
| Type | Risk level | Example |
|---|---|---|
| Trivial | Low | AI says a movie came out in 2019 when it was 2020 |
| Misleading | Medium | AI provides outdated medical dosage information |
| Fabricated sources | High | AI cites a legal case or scientific paper that doesn't exist |
| Dangerous | Critical | AI provides incorrect drug interaction information to a patient |
In 2023, attorneys in Mata v. Avianca (SDNY, 2023) made headlines when they submitted a legal brief containing six case citations generated by ChatGPT. None of the cases existed. The model had invented them — complete with realistic case names, docket numbers, and legal reasoning. The attorneys were sanctioned by the court.
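A minimal safety net for this failure mode is to treat every citation as unverified until it matches a trusted source. The sketch below uses a hard-coded allowlist as a stand-in for a real legal database, and the regex is an illustrative assumption, not a production citator:

```python
import re

# Stand-in for a real citation database (Westlaw, CourtListener, etc.).
KNOWN_CASES = {
    "Brown v. Board of Education, 347 U.S. 483 (1954)",
}

# Rough pattern for "Party v. Party, Reporter (Court Year)" citations.
CITATION_PATTERN = re.compile(r"[A-Z][\w'.-]+ v\. [^,]+, [^(]+\([^)]*\d{4}\)")

def audit_citations(draft: str) -> list[str]:
    """Return citations in the draft that could not be verified."""
    found = CITATION_PATTERN.findall(draft)
    return [c for c in found if c not in KNOWN_CASES]

# One of the citations ChatGPT actually fabricated in the Avianca brief:
draft = (
    "As held in Varghese v. China Southern Airlines, 925 F.3d 1339 "
    "(11th Cir. 2019), the limitations period is tolled."
)
unverified = audit_citations(draft)
print(unverified)  # the fabricated case is flagged for human review
```

The point isn't the regex — it's the workflow: the model proposes, an external source of truth disposes, and anything that can't be verified goes to a human before it goes to a judge.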
Privacy: what goes in can come out
Here's a thought experiment. You paste your company's confidential financial projections into ChatGPT and ask it to summarise them. Where did that data just go?
It went to OpenAI's servers. Depending on the terms of service, it might be used to train future models. Which means fragments of your confidential data could theoretically surface in responses to other users. Even if the company says they won't use your data for training, you've still transmitted it to a third party.
The privacy challenge with AI has three dimensions:
| Dimension | The risk | Example |
|---|---|---|
| Input privacy | Data you send to the AI leaves your control | An employee pastes customer SSNs into an AI tool |
| Training data privacy | People's personal data was used to train the model without consent | Your blog posts, photos, and social media comments were scraped to train an AI |
| Output privacy | The AI might reveal information from its training data | A model trained on medical records could surface patient information |
What organizations are doing about it
- Data classification policies: Define what can and can't be sent to AI tools (no PII, no trade secrets, no customer data)
- Self-hosted models: Run AI models on your own servers so data never leaves your infrastructure
- Enterprise agreements: Contracts that guarantee your data won't be used for training
- Anonymisation: Strip identifying information before sending data to AI tools
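Anonymisation in its simplest form is a redaction pass before any text leaves your infrastructure. The patterns below are illustrative assumptions (US-style SSNs, simple e-mail and card formats); real PII detection needs named-entity recognition and human review on top of this:

```python
import re

# Minimal redaction patterns -- illustrative, not exhaustive.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "CARD": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace matched identifiers with typed placeholders before the
    text is sent to any external AI tool."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

msg = "Customer Jane (SSN 123-45-6789, jane@example.com) disputed a charge."
print(redact(msg))
# The SSN and e-mail are replaced, but the name "Jane" still leaks:
# regexes alone are not enough for real anonymisation.
```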
There Are No Dumb Questions
"If I use an enterprise version of ChatGPT or Claude, is my data safe?"
Enterprise versions typically guarantee that your data won't be used for training and include stronger security agreements. But "safe" depends on your definition. The data still travels to and from external servers (unless you self-host). For truly sensitive data — classified information, medical records, financial data — many organizations require self-hosted models or on-premise solutions.
"What about the data the model was trained on? Did those people consent?"
This is one of the biggest unresolved questions in AI. Most LLMs were trained on publicly available internet data, which includes content people posted without expecting it to be used for AI training. Lawsuits are pending — the New York Times is suing OpenAI, artists are suing Stability AI, and new regulations like the EU AI Act are establishing rules around training data consent. The legal landscape is evolving rapidly.
Environmental cost: the hidden price tag
Training large frontier models has been estimated to consume tens of thousands of megawatt-hours (estimates vary; OpenAI has not disclosed official figures) — equivalent to the annual energy use of thousands of homes. Early estimates suggested LLM queries used around 10× the energy of a web search (Goldman Sachs, May 2023). More recent analysis finds modern models have become far more efficient — current estimates put a ChatGPT query at roughly 0.3 Wh, comparable to a standard web search (IEA/Goldman Sachs, 2024 — verify as efficiency continues to improve).
| Activity | Energy per query |
|---|---|
| Google search | ~0.3 Wh |
| ChatGPT query | ~0.3 Wh (est.) |
| Training large frontier models (total) | Tens of thousands of MWh (est.; varies by model) |
| Image generation (DALL-E) | ~3-10 Wh |
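To put the table in perspective, here's a back-of-envelope calculation. The query volume is a hypothetical assumption, not a disclosed figure, and the household average is a rough US figure:

```python
# Back-of-envelope scale check using the ~0.3 Wh/query estimate above
# and an assumed volume of 1 billion queries per day (hypothetical).
wh_per_query = 0.3
queries_per_day = 1_000_000_000

mwh_per_day = wh_per_query * queries_per_day / 1_000_000  # 1 MWh = 1,000,000 Wh
print(f"{mwh_per_day:.0f} MWh/day")  # 300 MWh/day under these assumptions

# A US household uses roughly 10 MWh of electricity per year, so daily
# inference at this volume works out to about:
homes_equivalent = mwh_per_day * 365 / 10
print(f"~{homes_equivalent:,.0f} household-years of electricity per year")
```

Tiny per query, significant in aggregate — which is exactly why per-query efficiency gains matter so much.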
The environmental cost is real but needs context:
- AI's total energy consumption is still a fraction of many industries (streaming video uses far more electricity globally than AI — for now)
- Efficiency is improving rapidly — newer models do more with less energy
- Many AI companies are investing heavily in renewable energy — Microsoft, Google, and others have made carbon-neutral pledges
The responsible approach: use AI where it creates genuine value. Don't use a 200-billion-parameter model to answer a question you could Google in 5 seconds.
Job impact: tasks change, not (usually) entire jobs
Every time a major technology arrives, people panic about jobs. The printing press. The assembly line. The spreadsheet. ATMs. Self-checkout. Each one eliminated specific tasks but rarely eliminated entire jobs.
AI is following the same pattern — with some important differences.
| Pattern | Historical example | AI example |
|---|---|---|
| Task automation | ATMs automated cash withdrawal, but banks hired MORE tellers (for sales and service) | AI automates first-draft writing, but editors focus on strategy and quality |
| Task augmentation | Spreadsheets made accountants faster, not obsolete | A controlled study found developers completed specific coding tasks up to 55% faster with Copilot (Peng et al., GitHub/Microsoft Research, 2022); broader surveys suggest 30–55% improvement on defined tasks, though gains vary significantly by task type |
| New job creation | The internet created jobs that didn't exist before (social media manager, SEO specialist) | AI creates new roles: prompt engineer, AI safety researcher, AI ethicist |
| Job transformation | Photographers shifted from darkroom skills to digital editing | Designers shift from pixel-pushing to directing AI tools |
The key insight: AI automates tasks, not jobs. Most jobs are bundles of many tasks. AI might automate 30% of a knowledge worker's tasks — the repetitive, routine ones — freeing them to focus on the 70% that requires judgment, creativity, and human connection.
That said, the transition isn't painless. The people whose jobs consist primarily of automatable tasks face real disruption. And the transition period — when old jobs are shrinking but new ones haven't fully formed — is genuinely difficult.
Misinformation and deepfakes: the trust crisis
AI has made it trivially easy to create convincing fake content:
- Text: Generate a fake news article indistinguishable from real journalism in seconds
- Images: Create photorealistic images of events that never happened
- Audio: Clone someone's voice from a 3-second sample
- Video: Generate realistic video of people saying things they never said
The technology isn't inherently evil — it's also used for movie special effects, accessibility tools, and creative expression. But the potential for harm is enormous.
In 2024, a deepfake audio clip of a school principal appearing to make racist remarks went viral. It was fake — created by the school's athletic director with free AI tools. But by the time it was debunked, the principal had received death threats and the school district was in crisis.
The defense toolkit
| Defense | How it works | Limitation |
|---|---|---|
| Media literacy | Teach people to verify before sharing | Doesn't scale — fakes spread faster than fact-checks |
| Digital watermarking | Embed invisible markers in AI-generated content | Only works if all AI tools participate |
| Detection tools | AI that detects AI-generated content | Arms race — detectors and generators keep improving |
| Provenance tracking | Cryptographic proof of where content originated | Requires industry-wide adoption |
| Platform policies | Social media platforms label or remove AI-generated content | Inconsistent enforcement |
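The provenance idea in the table can be sketched in a few lines. Real systems such as C2PA add cryptographic signatures and embed the manifest in the media file itself; this toy version shows only the tamper-evidence mechanism:

```python
import hashlib

# Sketch of content provenance: a manifest commits to the exact bytes of
# the content and records where it came from. Any edit changes the hash.
def make_manifest(content: bytes, origin: str) -> dict:
    return {
        "sha256": hashlib.sha256(content).hexdigest(),
        "origin": origin,
        "issued_at": "2024-01-01T00:00:00Z",  # fixed timestamp for the example
    }

def verify(content: bytes, manifest: dict) -> bool:
    return hashlib.sha256(content).hexdigest() == manifest["sha256"]

photo = b"...image bytes..."
manifest = make_manifest(photo, origin="NewsCo camera app")

print(verify(photo, manifest))         # untouched content checks out
print(verify(photo + b"x", manifest))  # any edit breaks the chain
```

The limitation in the table applies here too: a hash manifest only helps if creators publish manifests and platforms check them, which is why provenance requires industry-wide adoption.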
Responsible use principles: the framework that ties it all together
After all these challenges — bias, hallucination, privacy, environmental cost, jobs, misinformation — what does responsible AI use actually look like? Here are five principles that every organization should adopt:
✗ AI used carelessly
- Bias amplified at scale
- Opaque decision-making
- Job displacement
- Misinformation generation
- Privacy erosion
- Concentration of power

✓ AI used responsibly
- Medical diagnosis accuracy
- Climate modelling
- Scientific research speed
- Accessibility tools
- Education personalisation
- Economic productivity
1. Transparency
Tell people when they're interacting with AI. Tell them what data you're collecting. Tell them how decisions are made. No hidden algorithms making consequential choices about people's lives.
2. Accountability
Someone — a real human, not "the algorithm" — must be accountable for AI decisions. If the AI denies a loan, a human must be able to explain why and override the decision.
3. Human oversight
Keep humans in the loop for high-stakes decisions. AI can recommend, but humans decide — especially when the decision affects health, freedom, finances, or safety.
4. Fairness testing
Actively test for bias across demographic groups before deployment. Don't wait for a ProPublica investigation to discover your system is discriminating.
5. Proportional use
Use the right tool for the job. Don't deploy a powerful AI system where a simple rule-based system would work. Don't use AI for decisions where the stakes are too high to tolerate any error rate.
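Fairness testing (principle 4) can start as something very simple: compare selection rates across groups before deployment. The sketch below applies the "four-fifths rule" used in US employment law — no group's selection rate should fall below 80% of the highest group's — to synthetic decisions invented for illustration:

```python
from collections import defaultdict

# Synthetic (group, approved?) decisions for illustration only.
decisions = [
    ("group_a", True), ("group_a", True), ("group_a", False), ("group_a", True),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]

totals, approved = defaultdict(int), defaultdict(int)
for group, decision in decisions:
    totals[group] += 1
    approved[group] += decision

# Selection rate per group, and each rate relative to the best-treated group.
rates = {g: approved[g] / totals[g] for g in totals}
best = max(rates.values())
ratios = {g: r / best for g, r in rates.items()}

# Four-fifths rule: flag any group below 80% of the top selection rate.
flagged = [g for g, ratio in ratios.items() if ratio < 0.8]
print(rates)
print(flagged)
```

A check like this belongs in the deployment pipeline, run on every retrained model — not in a postmortem after the system has already made thousands of decisions.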
- ⚖️ Fairness: AI should treat people equitably and not discriminate based on protected characteristics.
- 🔍 Transparency: People affected by AI decisions should be able to understand how those decisions were made.
- 🛡️ Safety: AI systems should be reliable, secure, and designed to minimise harm.
- 🔐 Privacy: Personal data used to train or operate AI must be handled with consent and care.
- 🧑‍⚖️ Accountability: There must be humans responsible for AI outcomes. "The algorithm decided" is not a defence.
There Are No Dumb Questions
"This all sounds overwhelming. Do I personally need to worry about all of this?"
You don't need to solve every problem. But you DO need to ask the right questions. Before using AI for any consequential decision, ask: "What happens if this is wrong? Who gets hurt? Did we test for bias? Is a human checking the output?" If you can't answer those questions, you're not ready to deploy.
"Isn't this going to slow down AI development?"
Some friction is intentional. Building a bridge takes longer when you add safety inspections — and that's a feature, not a bug. The companies that invest in responsible AI now will avoid the lawsuits, scandals, and regulatory penalties that are hitting companies that moved too fast. Responsibility isn't the opposite of speed — it's what makes speed sustainable.
Back to COMPAS
The COMPAS system didn't set out to be biased — it set out to be accurate. It found real statistical patterns in historical arrest data. The problem is that those patterns encoded decades of racially unequal policing, not individual risk. When the algorithm confidently scored Black defendants as higher-risk based on zip code and social network proxies, it was doing exactly what it was trained to do — and that's precisely the danger. Removing the word "race" from the input fields changed nothing, because the correlated proxies remained. What COMPAS lacked was any of the five principles: no transparency to defendants, no accountability when scores were wrong, no fairness testing across racial groups before deployment, and human oversight that amounted to judges trusting a number they couldn't interrogate. The lesson isn't that algorithms are untrustworthy — it's that unexamined algorithms inherit the world as it is and amplify it at scale.
Key takeaways
- AI mirrors its training data — bias included. If historical data contains discrimination, the AI will learn and amplify that discrimination. Removing protected attributes from the data doesn't fix this because proxies exist.
- Hallucination is a feature of how LLMs work, not a bug to be fixed. Models predict likely text, not truthful text. For any high-stakes use, you need verification layers (RAG, human review, confidence thresholds).
- What goes into an AI can come out. Treat every piece of data you send to an AI service as potentially public. Classify your data before exposing it to AI tools.
- AI changes tasks, not (usually) entire jobs. The transition is real, but it follows historical patterns of technology transforming roles rather than eliminating them wholesale.
- Five principles for responsible use: transparency, accountability, human oversight, fairness testing, and proportional use. Apply all five before deploying AI for any consequential decision.
- Asking the right questions matters more than having all the answers. "What happens if this is wrong?" is the single most important question in AI ethics.
Knowledge Check
1. A hiring AI rejects candidates from certain zip codes at a higher rate. The system never uses race as an input feature. Why might this still constitute racial bias?
2. A lawyer submits a legal brief citing six court cases generated by an LLM. None of the cases exist. Which aspect of how LLMs work explains this failure?
3. An employee pastes confidential customer data into a public AI chatbot to generate a summary. What are the primary privacy risks?
4. Which of the five responsible AI principles specifically addresses the concern that "the algorithm decided" should never be an acceptable explanation for a consequential decision?