@dw2 Page 1
Assessing the risks of AI catastrophe?
@dw2 David Wood
And how best to respond
4pm Sat 16 March
Updated from BGI24 unconference
@dw2 Page 2
@dw2 David Wood
Sustainable superabundance for all
Transcendent possibilities
@dw2 Page 3
The risks of AI catastrophe?
Understanding which risks are the most credible and most serious
@dw2 David Wood
@dw2 Page 4
It's Time to Take Back Control of Technology
@dw2 Page 5
https://news.sky.com/story/south-korea-man-crushed-to-death-by-robot-that-mistook-him-for-a-box-13003635
South Korea: Man crushed to death by robot that mistook him for a box
The man was reportedly inspecting the mechanical arm when the accident happened
The mechanical arm pushed
the man’s upper body onto a
conveyor belt and crushed his
face and chest
AIs and robots
sometimes
have bugs!
Superintelligence
won’t have
bugs!?
@dw2 Page 6
AI image recognition: hits and misses
https://gradientscience.org/intro_adversarial/
"pig" + 0.005 x (adversarial perturbation) -> classified as "airliner"
AI sometimes makes mistakes that are “alien” – very different to human mistakes
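To make this concrete, here is a minimal fast-gradient-sign sketch in Python. It assumes PyTorch and torchvision are installed; the model choice and the input file "pig.jpg" are illustrative placeholders rather than the exact gradientscience.org setup, and the epsilon of 0.005 echoes the perturbation scale quoted above.

```python
# Illustrative sketch only: a fast-gradient-sign (FGSM-style) perturbation.
# Assumes PyTorch + torchvision; "pig.jpg" is a hypothetical input image.
import torch
import torch.nn.functional as F
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
normalize = T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
to_tensor = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor()])

x = to_tensor(Image.open("pig.jpg")).unsqueeze(0)  # pixel values in [0, 1]
x.requires_grad_(True)

logits = model(normalize(x))
label = logits.argmax(dim=1)               # the model's original prediction
F.cross_entropy(logits, label).backward()  # gradient of the loss w.r.t. the pixels

# Step each pixel slightly in the direction that increases the loss
epsilon = 0.005
x_adv = (x + epsilon * x.grad.sign()).clamp(0, 1)

print("before:", label.item(),
      "after:", model(normalize(x_adv)).argmax(dim=1).item())
```

A perturbation of this size is imperceptible to a human viewer, yet it can flip the model's prediction to an unrelated class – the "alien" character of the mistake.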
@dw2 Page 7
https://edition.cnn.com/2021/04/29/tech/nijeer-parks-facial-recognition-police-arrest
Photo on fake driving license left at the scene of a crime
Photo of Nijeer Parks, Paterson, New Jersey
11 days in prison; a year-long legal nightmare
@dw2 Page 8
“Google accused of directing motorist to drive off collapsed bridge”
https://www.bbc.co.uk/news/world-us-canada-66873982, 22nd Sept 2023, Philip Paxson, Hickory, North Carolina
Human vandals had recently damaged some warning signs
Bad human behaviour + Bad AI implementation -> Catastrophe
Misguided humans + Misguided AI -> Catastrophe
@dw2 Page 9
AI technology that deeply exploits human psychology?
AI technology designed to make money for social media platforms by keeping users engaged
Molly Russell
@dw2 Page 10
AI technology that deeply exploits human psychology?
Rohingya refugees in a refugee camp in Bangladesh, 2017
"The military and the local Rakhine population killed at least 25,000 Rohingya people and perpetrated gang rapes and other forms of sexual violence against 18,000 Rohingya women and girls. They estimated that 116,000 Rohingya were beaten, and 36,000 were thrown into fires"
en.wikipedia.org/wiki/Rohingya_genocide
@dw2 Page 11
Lieutenant-Colonel Stanislav Petrov
https://en.wikipedia.org/wiki/Stanislav_Petrov
Yuri Andropov, Soviet leader, Nov 1982 to Feb 1984
KAL 007, 1 Sept 1983: shot down by a Soviet missile; all 269 on board killed, including a member of the US House of Representatives
Ronald Reagan: "The Korean Air Massacre"
26 Sept 1983: Alarm system indicated incoming US missile(s)
Protocol dictated that Petrov urgently inform his superiors
Petrov declined to follow orders
World Citizen Award
"The Man Who Saved The World"
Future of Life Award
@dw2 Page 12
https://www.rac.co.uk/drive/advice/road-safety/autonomous-emergency-braking-what-you-need-to-know/
Autonomous Emergency Braking
Expected to save over 1,000 lives and prevent over 100,000 casualties in the UK over the next decade
@dw2 Page 13
Lion Air Flight 610
Domestic flight inside Indonesia
29 October 2018
189 people on board
Ethiopian Airlines Flight 302
Addis Ababa, Ethiopia to Nairobi, Kenya
10 March 2019
157 people on board
Both flights used Boeing 737 Max aircraft
A (very safe) Boeing 737 design, pushed to the “max”
Airplane could become unstable in some circumstances
Hence introduced MCAS: Maneuvering Characteristics Augmentation System (AI)
Automatically push down the airplane nose in some emergency(?) situations
Pilots could in theory override this, but needed specialist training (skipped)
Jan 2021: Boeing paid fines of over $2.5 billion after being charged with fraud
Responding to competitive pressure from Airbus
Victims of deteriorating corporate culture and deteriorating societal culture
@dw2 Page 14
Bhopal, India, 2 December 1984
“Accidental” release of 30 tons of a highly toxic chemical gas (methyl isocyanate)
2,259 deaths in short term, up to 14,000 more later, numerous birth defects
Safety systems in disrepair; inadequate training of staff in safety processes
Previous leaks not fully investigated; internal audit warning report not followed up
Company management had little long-term interest in the plant
“The World’s Worst Industrial Disaster”
www.theatlantic.com/photo/2014/12/bhopal-the-worlds-worst-industrial-disaster-30-years-later/100864/
Management blamed sabotage from disgruntled employees
Canary signal
Victims of deteriorating corporate culture
@dw2 Page 15
“The World’s Worst Ransomware Disaster?”
WannaCry – May 2017, devastated NHS hospitals throughout the UK
Seemingly earned the North Koreans very little actual money
Ransomware incompletely understood, out of control…
“Bad guys” re-used hacking tools developed by some “good guys”
Disgruntled nation state
Incompetent managers
@dw2 Page 16
“The religion for the elite” Disgruntled cult
Shoko Asahara (1955–2018), founder and leader of Aum Shinrikyo
20 March 1995
Sarin nerve agent was released on five different Tokyo subway trains
13 people killed, 50 others severely injured (some of whom later died)
https://en.wikipedia.org/wiki/Tokyo_subway_sarin_attack
The group also assembled:
• Traditional explosives
• Chemical weapons
• A Russian military helicopter
• Hydrogen cyanide poison
• Samples of Ebola
• Samples of anthrax
Motivation + Technology + Knowledge + Vulnerability = Catastrophe
AI++? ++?
@dw2 Page 17
Three objections to this narrative
1. The problem is with humans, not AI (?)
Real-world disasters typically have multiple overlapping causes
It's the combination of human failures and tech failures that could cause the biggest catastrophes
2. AI is the solution, not the problem (?)
Yes, but: proceed with care!
Sometimes "solutions" make things worse
Novel tech failures (e.g. AI) can trigger unexpected complications
AI-in-a-rush is not the solution
3. These all just show minor catastrophes (?)
Don't just consider "normal distributions"
Consider situations involving fat tails:
Sudden mass extinctions / tipping points
Particularly deadly pandemics
The first nuclear explosions…
Consider potential exponential escalation
@dw2 Page 18
Leslie Groves: Are you saying that there’s a chance that when we push that button... we destroy the world?
J. Robert Oppenheimer: The chances are near zero...
Groves: Near zero?
Oppenheimer: What do you want from theory alone?
Groves: Zero would be nice!
“Oppenheimer’s ‘NEAR ZERO’ probability, Explained!” - https://www.youtube.com/watch?v=wx1DkmIdKLI
@dw2 Page 19
Calculating consequences can be hard
• Castle Bravo hydrogen bomb test, 1st March 1954, Bikini Atoll
‒ Explosive yield was expected to be from 4 to 6 Megatons
‒ It was 15 Megatons, two and a half times the expected maximum
‒ "Physics error" by the designers at Los Alamos National Lab: they wrongly considered the lithium-7 isotope to be inert in the bomb
‒ The crew of a nearby Japanese fishing boat became ill after direct contact with the fallout; one of the crew died
http://en.wikipedia.org/wiki/Castle_Bravo
AI catastrophic risk: Cascading catastrophe
Flawed human reasoning + Flawed human emotions + Flawed social systems + Existentially powerful AI + Fragile infrastructure
Fragile infrastructure: nukes, biofailure, geoengineering failure, hatred…
Existentially powerful AI: this is changing much more quickly
The human, social and infrastructure factors: Improve!?
We can slow down the most dangerous aspects
We can learn to harness the best outcomes
Singularity Activism
@dw2 Page 21
@dw2 Page 22
Jailbreaking, according to Dilbert
@dw2 Page 23
@dw2 Page 24
Upton Sinclair, 1935: "It is difficult to get a man to understand something, when his salary depends on his not understanding it"
https://libraries.indiana.edu/lilly-library/upton-sinclair
Salary?
Ideology?
Worldview?
Identity?
Tribal status?
We are a rationalizing species at least as much as a rational one
@dw2 Page 25
Primary beliefs:
• World government would be awful
• Open source must be preserved
• We’ll all die of aging soon, without AGI
Assumptions:
• AI regulations would imply world government
• AI regulations would kill open source
• AGI is the best solution to various x-risks
Conclusion(?!):
• There must be no real risk of AI catastrophe
@dw2 Page 26
Primary beliefs:
• World government would be awful
• Open source must be preserved
• We’ll all die of aging soon, without AGI
Assumptions:
• AI regulations would imply world government
• AI regulations would kill open source
• AGI is the best solution to various x-risks
Conclusion(?!):
• There must be no real risk of AI catastrophe
Consensual safe AI is possible
Consensual safe AI is vital
@dw2 Page 27
Consensual safe AI is possible
Consensual safe AI is vital
Catastrophic AI risks are by no means science fiction
Catastrophic AI risks arise straightforwardly from assumptions that everyone shares
There are solutions that are technically possible
There are solutions that are possible politically and geo-politically, and that respect human values
@dw2 Page 28
The narrow corridor
Social wellbeing faces threats from powerful groups:
• Big Armaments
• Big Tobacco
• Big Oil
• Big Finance
• Big Crime
• Big Theology
• Big Media
• Big Money
The state needs power to control these potentially powerful cancers: Big State
Society needs power to control the state!
• Independent media
• Independent judiciary
• Independent academia
• Independent opposition parties
The separation of powers!
• Checks and balances
@dw2 Page 29
https://www.history.com/news/gorbachev-reagan-cold-war
Carl Sagan
Open, vivid, credible communications of future possibilities
“Trust, but verify”
@dw2 Page 30
Verifiably safe AI?!
AI is an inscrutable black box!?
• Distinguish the inner workings (inscrutable) from the individual recommendations (testable)
• Compare finding a solution to a puzzle with verifying that solution (see the sketch below)
• Test critical recommendations in a safe environment (emulation) first
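A minimal illustrative sketch of that find-versus-verify asymmetry, using a 9x9 Sudoku grid as a stand-in puzzle (the puzzle choice and the function name are assumptions for illustration): however inscrutable the system that produced a candidate solution, checking that solution is a short, transparent, testable procedure.

```python
# Illustrative sketch: verifying a proposed Sudoku solution is simple and
# transparent, even if the solver that produced it is an inscrutable black box.
def is_valid_sudoku(grid):
    """Return True if every row, column and 3x3 box of a 9x9 grid holds 1..9."""
    expected = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    boxes = [
        {grid[r + dr][c + dc] for dr in range(3) for dc in range(3)}
        for r in range(0, 9, 3)
        for c in range(0, 9, 3)
    ]
    return all(group == expected for group in rows + cols + boxes)

# Usage: candidate_grid could come from an opaque AI, a human, or brute-force
# search; the verdict does not depend on how the answer was produced.
# print(is_valid_sudoku(candidate_grid))
```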
But AI will never consent to being confined!?
• That depends on whether the AI (tool) has volition / agency / sentience
• Hence the importance of studying the possible emergence of volition, etc.
But risky AI might be developed below the radar!?
• So build in tamper-proof remote shutdown capabilities
• (We don’t yet know how/whether this will be possible…)
• But the scale of these risks makes these investigations vital
Take back control of technology!
Adv AI –
Adv AI +
Reducing risks of catastrophic harm from Adv AI
Default path
Adv AI potentially uninterested in, or hostile to, human wellbeing
Redesigning the context for the development of Adv AI
Engaging education for all: spreading acute awareness of Adv AI risks and possibilities
Transforming mental dispositions worldwide: compassion, openness, humble ambition
Re-engineering incentives: bonuses for public goods, stronger infrastructure, monitoring, sanctions
Help from trusted narrow AIs: review & suggest improvements in designs & implementations
Two possible design choices to depart from the default dangerous path for the creation of Advanced AI
In practice a combination of Adv AI+ and Adv AI– may prove best
Both design choices are likely very difficult and will require considerable analysis. To avoid other (dangerous) designs getting to Adv AI first, the redesigned context is an essential part of the overall recommendation
Adv AI –: Avoid the inclusion or subsequent acquisition of features that would make the Adv AI truly dangerous, e.g. autonomous will, fully general reasoning (?)
Adv AI +: Design in extra features that will be preserved through all subsequent evolution, e.g. benevolence, compassion, superwisdom alongside superintelligence (?)
@dw2 Page 32
2040: about 50% chance?
Distraction, Confusion, Myths, Fear -> Chaos, Catastrophe, Humanity diminished
Focus, Collaboration, Science, Vision -> SS4A: Sustainable Superabundance for All
Promote! Transcend!
@dw2 Page 33
Failure modes
Desire to build superintelligence as fast as possible ("accelerate regardless")
Being pulled into a "Moloch" race ("accelerate despite our better intentions")
Doom-mongering: adverse psychology
Distraction by concerns of simpler, more immediate failures of AI systems
Virtue signaling (talk without action)
Inflexible, unhelpful, heavy legislation
Not listening to key insights (closed minds)
Discussions without data (ideology first)
Success modes
Spread deep understanding of the risks associated with superintelligence
Establish higher-level incentives that reward and penalize appropriately
Strengthen the vision of good outcomes too
Assign sufficient time (and respect) to consider all varieties of AI failures
Virtue action (step-by-step)
Adaptive, agile, lean legislation
Embrace diverse creativity and criticism
Gather data and analyze it (with AI help!)
https://magazine.mindplex.ai/cautionary-tales-and-a-ray-of-hope/
“Cautionary Tales And A Ray Of Hope: 4 scenarios for the transition to AGI”
@dw2 Page 34
Agreeing a path forward – transcendent goals
Define and pay attention to canary signals (wake-up calls)
• Beware the distractions of (e.g.) partisan tribalism, political correctness
• Deepen our understanding of which landmines need most care
Humanity should protect our most vulnerable infrastructure
(against action from dangerous humans and/or dangerous AI)
• Access to nuclear weapons
• Access to creation/jailbreaking of especially dangerous biopathogens
• The IT infrastructure on which we all depend
• The health of the environment
Protect human lives (the most basic of human rights)
• Enable all-round health and flourishing (not just GDP)
Prioritize differential development (steering plus acceleration)
• Mechanisms for safety, auditing, disabling, and building trust
@dw2 Page 35
Think harder about the consequences in advance
Monitor closely once deployed, ready to intervene
Question desirability
Clarify externalities
Require peer reviews
Involve multiple perspectives
Analyse the whole system
Anticipate fat tails
Reject opacity
Promote resilience
Promote verifiability
Promote auditability
Clarify risks to users
Clarify trade-offs
Insist on accountability
Penalise disinformation
Design for cooperation
Analyse via simulations
Maintain human oversight
Build consensus regarding principles
Provide incentives to address omissions
Halt development if principles not upheld
Consolidate progress via legal frameworks