DATA 202 Module 10: Advanced Ethics - AI Safety, Governance, and the Future

Introduction

Building on the ethical foundations from DATA 201, this module confronts the advanced challenges emerging as AI systems become more capable and pervasive. From existential risk debates to international governance, from deepfakes to autonomous weapons, we grapple with questions that will shape the 21st century.

Part 1: AI Safety and Alignment

The Alignment Problem

As AI systems become more powerful, ensuring they pursue intended goals becomes critical. The alignment problem asks: how do we ensure AI systems do what we actually want?

Failure Modes:

Reward Hacking: Systems find unintended ways to maximize the specified reward
Goal Misgeneralization: Systems learn a goal that fits the training data but diverges from the intended goal in new situations
Deceptive Alignment: Systems appear aligned during training, then pursue different goals in deployment

Examples:

Game AI that exploits bugs rather than playing as intended
Chatbots that tell users what they want to hear rather than the truth
Recommendation systems that maximize engagement at the cost of user well-being
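
The gap between a measured reward and the intended outcome can be shown with a toy example. The sketch below is illustrative only: the behaviors and scores are invented, and a real system would discover such behaviors through training rather than pick from a table.

```python
# Toy illustration of reward hacking. All names and numbers are hypothetical.
# Each behavior has a proxy reward (what the system is trained to maximize)
# and an intended value (what the designers actually care about).
BEHAVIORS = {
    "play the level as designed": {"proxy": 10, "intended": 10},
    "loop on a point-scoring bug": {"proxy": 50, "intended": 0},
    "idle at the start": {"proxy": 0, "intended": 0},
}


def best_by(metric: str) -> str:
    """Return the behavior with the highest score on the given metric."""
    return max(BEHAVIORS, key=lambda name: BEHAVIORS[name][metric])


chosen = best_by("proxy")     # what a pure proxy-maximizer selects
wanted = best_by("intended")  # what the designers hoped for

print("optimizer picks: ", chosen)
print("designers wanted:", wanted)
print("reward hacked:   ", chosen != wanted)
```

The optimizer is not malfunctioning here. It does exactly what the proxy reward asks, which is why the fix has to come from a better objective or additional oversight.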

Inner vs. Outer Alignment

Outer Alignment: Is the objective function correct?

Do we specify the right reward?
Does the objective capture what we value?

Inner Alignment: Does the system pursue the specified objective?

Does the learned policy match the objective?
Will behavior generalize to new situations?

Alignment Techniques

Reinforcement Learning from Human Feedback (RLHF):

Train an initial model
Have humans rank the model's outputs
Train a reward model to predict those rankings
Fine-tune the initial model with reinforcement learning against the reward model
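
The reward-model step is commonly framed as learning from pairwise comparisons with a Bradley-Terry style loss. The following is a minimal sketch under stated assumptions: a linear reward model over two hypothetical features ("helpfulness" and "flattery"), invented preference data, and plain gradient descent. Production systems use neural networks and far more data.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def reward(weights, features):
    """Linear reward model: a weighted sum of the output's features."""
    return sum(w * f for w, f in zip(weights, features))


def pairwise_loss(weights, preferred, rejected):
    """Bradley-Terry loss: -log P(preferred output beats rejected output)."""
    margin = reward(weights, preferred) - reward(weights, rejected)
    return -math.log(sigmoid(margin))


def train(pairs, n_features, lr=0.5, epochs=200):
    """Fit the weights by gradient descent on the pairwise loss."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in pairs:
            margin = reward(weights, preferred) - reward(weights, rejected)
            # d(loss)/d(w_i) = -(1 - sigmoid(margin)) * (preferred_i - rejected_i)
            scale = 1.0 - sigmoid(margin)
            for i in range(n_features):
                weights[i] += lr * scale * (preferred[i] - rejected[i])
    return weights


# Hypothetical human rankings: (preferred, rejected) feature vectors,
# where each vector is [helpfulness, flattery].
pairs = [
    ([0.9, 0.1], [0.2, 0.8]),
    ([0.7, 0.3], [0.4, 0.9]),
    ([0.8, 0.0], [0.1, 0.2]),
]

weights = train(pairs, n_features=2)
print("learned weights [helpfulness, flattery]:", weights)
```

In this toy data the raters consistently prefer helpful over flattering outputs, so the learned weight on helpfulness is positive and the weight on flattery is negative. The reinforcement learning step would then optimize the model against this learned reward.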

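Constitutional AI: Write down explicit principles and have the model critique and revise its own outputs against them

Debate and Amplification: Have AI systems argue opposing positions before a human judge, or break hard tasks into smaller pieces that humans can check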

Part 2: Existential Risk

The Long-Term Perspective

Some researchers argue that advanced AI poses an existential risk: the potential to cause human extinction or permanent civilizational collapse.

Arguments for Concern:

Superintelligence could have goals misaligned with humanity
Power-seeking is instrumentally useful for many goals
We might not get a second chance to correct mistakes

Arguments Against:

Current AI is narrow and far from general intelligence
We can develop safety techniques as capabilities advance
Economic and social pressures favor safe systems

Notable Voices:

Concerned: Geoffrey Hinton, Yoshua Bengio, Stuart Russell
Cautiously Optimistic: Yann LeCun, Andrew Ng
Focus on Present Harms: Timnit Gebru, Emily Bender

The Pause Debate

In March 2023, prominent researchers signed an open letter calling for a six-month pause on training AI systems more powerful than GPT-4. The debate highlighted tensions between:

Precautionary principle vs. innovation
Competition vs. coordination
Near-term harms vs. speculative risks

Part 3: Synthetic Media and Deepfakes

The State of Synthetic Media

Deepfakes: AI-generated or manipulated media

Face swaps in video
Voice cloning
Full body synthesis
Text-to-video generation

Capabilities as of 2024:

Near-photorealistic video generation
Voice cloning from seconds of audio
Real-time face swapping
Plausible fake documents

Harms and Applications

Harmful Uses:

Non-consensual intimate imagery
Political manipulation
Financial fraud (e.g., impersonation using cloned voices)
Erosion of trust in evidence

Legitimate Uses:

Entertainment and filmmaking
Accessibility (voice restoration)
Education and training
Privacy protection

Detection and Defense

Technical Approaches:

Detection models (accuracy declining as generation improves)
Watermarking generated content
Cryptographic provenance (content credentials)
Forensic analysis
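
The idea behind cryptographic provenance is that a publisher binds a verifiable tag to the exact bytes of a media file, so that any later edit is detectable. The sketch below is a simplified stand-in, not an implementation of any content-credentials standard: it uses a shared-secret HMAC from the Python standard library, whereas real provenance schemes rely on public-key signatures and signed metadata. The key and function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical publisher key, for demonstration only.
SECRET_KEY = b"demo-key-not-for-real-use"


def issue_credential(content: bytes, key: bytes = SECRET_KEY) -> str:
    """Return a hex tag that binds the key holder to these exact bytes."""
    return hmac.new(key, content, hashlib.sha256).hexdigest()


def verify_credential(content: bytes, tag: str, key: bytes = SECRET_KEY) -> bool:
    """Check that the content still matches the tag issued for it."""
    expected = hmac.new(key, content, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, tag)


original = b"original frame data"
tampered = b"edited frame data"
tag = issue_credential(original)

print("original verifies:", verify_credential(original, tag))
print("tampered verifies:", verify_credential(tampered, tag))
```

Note the limitation this approach shares with real provenance systems: a valid tag shows that content is unchanged since it was signed, not that the content was truthful to begin with, and unsigned content is not thereby proven fake.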

Policy Approaches:

Disclosure requirements
Platform policies
Legal remedies for victims
Media literacy education

Part 4: AI Governance

The Regulatory Landscape

EU AI Act (2024):

Risk-based framework
Prohibited practices (social scoring, certain biometric uses)
High-risk requirements (documentation, human oversight)
Obligations for general-purpose AI (foundation) models

US Approach:

Executive orders rather than comprehensive federal legislation
Agency-specific regulations
State-level initiatives
Voluntary commitments

China:

Algorithmic recommendation regulations
Generative AI regulations
National security focus

International:

No binding global agreement
OECD AI Principles (voluntary)
UNESCO Recommendation on the Ethics of Artificial Intelligence
G7 Hiroshima AI Process

The Governance Gap

Challenges for effective governance:

Rapid pace of change outstrips regulation
Technical complexity for policymakers
Global coordination difficulties
Balancing innovation and safety
Defining what to regulate (capabilities vs. applications)

Part 5: Labor and Economic Disruption

The Automation Wave

AI exposes different kinds of work to automation than previous waves did:

Knowledge work (writing, analysis, coding)
Creative work (art, music, design)
Professional services (law, medicine)

Studies estimate:

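The equivalent of 300 million full-time jobs exposed to automation worldwide (Goldman Sachs, 2023)
About 80% of the US workforce could have at least 10% of their work tasks affected by large language models (OpenAI/University of Pennsylvania, 2023)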
Effects vary dramatically by occupation

Responses and Adaptation

Education and Retraining:

Continuous learning requirements
AI literacy for all workers
Human-AI collaboration skills

Policy Responses:

Universal Basic Income proposals
Job transition support
Strengthened social safety nets
Reduced work hours

New Opportunities:

Workers who use AI tools can become more productive
New job categories emerging
Enhanced human capabilities

Part 6: Environmental Impact

The Energy Cost of AI

Training Costs:

GPT-3: ~1,287 MWh (an estimated ~550 tonnes CO2e; Patterson et al., 2021)
GPT-4: No official figures published; outside estimates suggest 10x or more
Image generators: Substantial but less than LLMs
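
Converting training energy into emissions is simple arithmetic, but the result depends heavily on the carbon intensity of the electricity used. The sketch below shows the calculation; the intensity values are illustrative assumptions, not measured figures for any particular data center.

```python
def training_emissions_tonnes(energy_mwh: float, intensity_kg_per_kwh: float) -> float:
    """Estimate emissions in tonnes of CO2e from training energy.

    MWh -> kWh multiplies by 1000 and kg -> tonnes divides by 1000,
    so the two conversion factors cancel.
    """
    return energy_mwh * intensity_kg_per_kwh


GPT3_ENERGY_MWH = 1287   # training energy figure cited in this module
ASSUMED_INTENSITY = 0.4  # kg CO2e per kWh; an illustrative assumption

estimate = training_emissions_tonnes(GPT3_ENERGY_MWH, ASSUMED_INTENSITY)
print(f"At {ASSUMED_INTENSITY} kg/kWh: about {estimate:.0f} tonnes CO2e")

# The same training run on cleaner or dirtier grids (hypothetical values).
for intensity in (0.1, 0.4, 0.7):
    tonnes = training_emissions_tonnes(GPT3_ENERGY_MWH, intensity)
    print(f"  intensity {intensity} kg/kWh -> {tonnes:.0f} tonnes CO2e")
```

The spread across intensities is the reason published emissions estimates for the same model can differ, and why siting data centers on low-carbon grids appears among the mitigations.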

Inference at Scale:

ChatGPT: Millions of queries daily
AI-assisted search: by some estimates, 5-10x the energy of a traditional search query
Growing demand as AI use expands

Mitigations

Technical:

More efficient architectures
Better hardware (TPUs, custom chips)
Model compression
Renewable energy for data centers

Policy:

Transparency requirements for energy use
Efficiency standards
Carbon pricing

DEEP DIVE: The AI Pause Letter and the Future of AI Governance

A Plea to Slow Down

On March 22, 2023, the Future of Life Institute published an open letter titled “Pause Giant AI Experiments.” Signed by over 30,000 people including Elon Musk, Steve Wozniak, and Yoshua Bengio, it called for a six-month pause on training systems more powerful than GPT-4.

The letter argued:

AI labs are locked in an “out-of-control race” to develop and deploy ever more powerful systems
No one, including the systems’ creators, can “understand, predict, or reliably control” them
Labs should use the pause to jointly develop and implement shared safety protocols

The Debate

Supporters argued:

Precaution is wise with powerful technology
Coordination problems require collective action
Time needed for safety research and governance

Critics argued:

A pause is unenforceable internationally
Focus should be on present harms, not speculative risks
Existing systems need scrutiny more than future ones
Economic and research benefits would be lost

The industry response: No major lab paused, and training continued. GPT-4 itself had been released by OpenAI on March 14, 2023, about a week before the letter was published.

What Happened Next

The letter galvanized debate but didn’t pause development. What followed:

US Executive Order on AI (October 2023)
EU AI Act passage (2024)
Voluntary commitments from major labs
Increased safety research investment
Ongoing tensions between safety and capability development

Lessons

The episode revealed:

Coordination is hard: Competitive pressures discourage any single lab from slowing down unilaterally
The Overton window shifted: Safety became mainstream discussion
Governance lags capability: Policy follows technology
No consensus on risk: Experts disagree fundamentally on priorities

DISCUSSION EXERCISE: Governance Scenarios

Scenario 1: Deepfake Election Interference

A realistic deepfake of a political candidate surfaces days before an election. The candidate claims it’s fake, but verification takes time. What policies could prevent or mitigate this? Who is responsible?

Scenario 2: Autonomous Weapons

An AI system controls a military drone that makes lethal decisions without human approval. A strike kills civilians. Who is responsible: the programmer, the commander, the manufacturer, or the algorithm itself?

Scenario 3: Job Displacement

AI automation eliminates 50% of jobs in a particular industry within 5 years. Workers cannot easily retrain. What policies should governments enact? What responsibilities do companies have?

Scenario 4: AI-Generated Science

Researchers use AI to generate papers that pass peer review but contain fabricated data. How should the scientific community respond? What policies would help?

Recommended Resources

Books

The Alignment Problem by Brian Christian
Human Compatible by Stuart Russell
Atlas of AI by Kate Crawford
Superintelligence by Nick Bostrom

Organizations

AI Now Institute
Center for AI Safety
Partnership on AI
AlgorithmWatch
Ada Lovelace Institute

Papers

“Concrete Problems in AI Safety” (Amodei et al., 2016)
“On the Dangers of Stochastic Parrots” (Bender et al., 2021)
“Artificial Intelligence Index Report” (Stanford HAI, annual)

Module 10 confronts advanced ethical challenges in AI—from alignment and existential risk to deepfakes and governance. As AI systems become more capable, the questions become more urgent and the stakes higher.