How should we respond to claims that forthcoming new versions of AI pose unacceptable risks of human catastrophe?
These slides were presented at a London Futurists webinar on 16th March 2024, by David Wood, chair of the organisation. They are an updated version of a presentation he originally shared at the recent BGI unconference in Panama City, https://bgi24.ai/. That talk was described by a number of participants as "the best of the entire event", but others said that it was "a mistake that so much time was given to this subject".
Ahead of this webinar, the talk given in Panama was significantly revised in light of feedback received at the BGI unconference.
The talk aims to improve understanding of which risks are the most credible and serious (as opposed to fanciful or unfounded). It also reviews a variety of options for how to respond to these risks. This includes varieties of so-called "accelerationism" and "singularity activism".
For more about the event, see https://www.meetup.com/london-futurists/events/299727151/.
For a recording of the event, see https://www.youtube.com/watch?v=sYK4eYTZmXU
6. @dw2 Page 6
AI image recognition: hits and misses
https://gradientscience.org/intro_adversarial/
[Image: a photo correctly classified as "pig", plus 0.005 × a carefully chosen perturbation, is classified as "airliner"]
AI sometimes makes mistakes that are “alien” – very different to human mistakes
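The "+0.005 ×" perturbation above is the hallmark of an adversarial example: a change far too small for a human to notice, yet chosen to push the classifier across a decision boundary. As a purely illustrative sketch (not the exact construction behind the slide's image), the fast gradient sign method shows how such a perturbation can be computed; the pretrained model, the input tensors, and epsilon=0.005 below are assumptions for the example, not details from the talk.

```python
# Minimal sketch of an adversarial perturbation (fast gradient sign method).
# Illustrative only: the model, the input image, and epsilon are stand-ins,
# not the slide's exact example.
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()

def adversarial_example(image, true_label, epsilon=0.005):
    """Return image + epsilon * sign(gradient of the loss w.r.t. the image)."""
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    # The perturbation is imperceptible to humans, but aligned with the loss gradient
    return (image + epsilon * image.grad.sign()).detach()

# Hypothetical usage: `x` is a preprocessed 1x3x224x224 image tensor of a pig,
# `y` a length-1 tensor with its correct class index. The perturbed image often
# receives a different label, such as "airliner", despite looking identical to a person.
# x_adv = adversarial_example(x, y)
# print(model(x_adv).argmax(dim=1))
```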
8. @dw2 Page 8
“Google accused of directing motorist to drive off collapsed bridge”
https://www.bbc.co.uk/news/world-us-canada-66873982, 22nd Sept 2023, Philip Paxson, Hickory, North Carolina
Human vandals had recently damaged some warning signs
Bad human behaviour + Bad AI implementation -> Catastrophe
Misguided humans + Misguided AI -> Catastrophe
9. @dw2 Page 9
AI technology that deeply exploits human psychology?
AI technology designed to make money for social media platforms by keeping users engaged
Molly Russell
10. @dw2 Page 10
AI technology that deeply exploits human psychology?
Rohingya refugees in a refugee camp in Bangladesh, 2017
"The military and the local Rakhine population killed at least 25,000 Rohingya people and perpetrated gang rapes and other forms of sexual violence against 18,000 Rohingya women and girls. They estimated that 116,000 Rohingya were beaten, and 36,000 were thrown into fires"
en.wikipedia.org/wiki/Rohingya_genocide
11. @dw2 Page 11
Lieutenant-Colonel Stanislav Petrov
https://en.wikipedia.org/wiki/Stanislav_Petrov
Yuri Andropov, Soviet leader, Nov 1982 to Feb 1984
KAL 007, 1 Sept 1983: shot down by a Soviet missile; all 269 on board were killed, including a member of the US House of Representatives. Ronald Reagan called it "The Korean Air Massacre"
26 Sept 1983: an alarm system indicated incoming US missile(s). Protocol dictated that Petrov urgently inform his superiors. Petrov declined to follow orders
Petrov was later called "The Man Who Saved The World", and received the World Citizen Award and the Future of Life Award
13. @dw2 Page 13
Lion Air Flight 610
Domestic flight inside Indonesia
29 October 2018
189 people on board
Ethiopian Airlines Flight 302
Addis Ababa, Ethiopia to Nairobi, Kenya
10 March 2019
157 people on board
Both flights used Boeing 737 Max aircraft
A (very safe) Boeing 737 design, pushed to the “max”
Airplane could become unstable in some circumstances
Hence introduced MCAS: Maneuvering Characteristics Augmentation System (AI)
Automatically push down the airplane nose in some emergency(?) situations
Pilots could in theory override this, but needed specialist training (skipped)
Jan 2021: Boeing paid fines of over $2.5 billion after being charged with fraud
Responding to competitive pressure from Airbus
Victims of deteriorating corporate culture, and of deteriorating societal culture
14. @dw2 Page 14
Bhopal, India, 2 December 1984
“Accidental” release of 30 tons of a highly toxic chemical gas (methyl isocyanate)
2,259 deaths in short term, up to 14,000 more later, numerous birth defects
Safety systems in disrepair; inadequate training of staff in safety processes
Previous leaks not fully investigated; internal audit warning report not followed up
Company management had little long-term interest in the plant
“The World’s Worst Industrial Disaster”
www.theatlantic.com/photo/2014/12/bhopal-the-worlds-worst-industrial-disaster-30-years-later/100864/
Management blamed sabotage from disgruntled employees
Canary signal
Victims of deteriorating corporate culture
15. @dw2 Page 15
“The World’s Worst Ransomware Disaster?”
WannaCry – May 2017, devastated NHS hospitals throughout the UK
Seemingly earned the North Koreans very little actual money
Ransomware incompletely understood, out of control…
“Bad guys” re-used hacking tools developed by some “good guys”
Disgruntled nation state
Incompetent managers
16. @dw2 Page 16
“The religion for the elite” Disgruntled cult
Shoko Asahara (1955-2018), founder and leader of Aum Shinrikyo
20 March 1995: sarin gas was released on five different Tokyo subway trains
13 people killed, 50 others severely injured (some of whom later died)
https://en.wikipedia.org/wiki/Tokyo_subway_sarin_attack
The group also assembled:
• Traditional explosives
• Chemical weapons
• A Russian military helicopter
• Hydrogen cyanide poison
• Samples of Ebola
• Samples of anthrax
Motivation + Technology + Knowledge + Vulnerability = Catastrophe
(AI++? Each of these factors could be amplified by advanced AI)
17. @dw2 Page 17
Three objections to this narrative
1. The problem is with humans, not AI (?)
• Real-world disasters typically have multiple overlapping causes
• It's the combination of human failures and tech failures that could cause the biggest catastrophes
2. AI is the solution, not the problem (?)
• Yes, but: proceed with care!
• Sometimes "solutions" make things worse
• Novel tech failures (e.g. AI) can trigger unexpected complications
• AI-in-a-rush is not the solution
3. These all just show minor catastrophes (?)
• Don't just consider "Normal distributions"; consider situations involving fat tails (see the sketch after this list)
• Sudden mass extinctions / tipping points
• Particularly deadly pandemics
• The first nuclear explosions…
• Consider potential exponential escalation
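To make the fat-tails point concrete: under a normal distribution, a six-sigma event is so unlikely it is effectively ruled out, while under a fat-tailed distribution the same deviation remains entirely plausible, so intuitions trained on "normal" experience can badly underestimate catastrophe. A minimal sketch, assuming SciPy is available; the Student-t distribution with 3 degrees of freedom and the six-sigma threshold are illustrative choices, not taken from the talk.

```python
# Compare tail probabilities: a "six-sigma" event under a normal distribution
# versus under a fat-tailed Student-t distribution. Illustrative numbers only.
from scipy import stats

threshold = 6.0  # a six-standard-deviation event

p_normal = stats.norm.sf(threshold)   # survival function: P(X > 6)
p_fat = stats.t.sf(threshold, df=3)   # Student-t with 3 degrees of freedom

print(f"P(X > 6), thin-tailed Normal:    {p_normal:.1e}")  # ~1e-9: "never happens"
print(f"P(X > 6), fat-tailed Student-t3: {p_fat:.1e}")     # ~5e-3: happens routinely
```

The absolute numbers matter less than the ratio: the fat-tailed model assigns the extreme outcome millions of times more probability, which is exactly the gap that makes planning only for "normal" failures dangerous.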
18. @dw2 Page 18
Leslie Groves: Are you saying that there’s a chance that when we push that button... we destroy the world?
J. Robert Oppenheimer: The chances are near zero...
Groves: Near zero?
Oppenheimer: What do you want from theory alone?
Groves: Zero would be nice!
“Oppenheimer’s ‘NEAR ZERO’ probability, Explained!” - https://www.youtube.com/watch?v=wx1DkmIdKLI
19. @dw2 Page 19
Calculating consequences can be hard
• The Castle Bravo test (the first US test of a dry-fuel hydrogen bomb), 1st March 1954, Bikini Atoll
‒ Explosive yield was expected to be from 4 to 6 Megatons
‒ It was 15 Megatons, two and a half times the expected maximum
‒ A "physics error" by the designers at Los Alamos National Lab
‒ They wrongly considered the lithium-7 isotope to be inert in the bomb
‒ The crew of a nearby Japanese fishing boat became ill after direct contact with the fallout; one of the crew died
http://en.wikipedia.org/wiki/Castle_Bravo
20. AI catastrophic risk
Cascading catastrophe:
Flawed human reasoning + Flawed human emotions + Flawed social systems + Fragile infrastructure + Existentially powerful AI
• Nukes, biofailure, geoengineering failure, hatred…
• The AI factor is changing much more quickly than the others
• The other factors: Improve!?
• We can slow down the most dangerous aspects
• We can learn to harness the best outcomes
• Singularity Activism
24. @dw2 Page 24
Upton Sinclair, 1935:
"It is difficult to get a man to understand something, when his salary depends on his not understanding it"
https://libraries.indiana.edu/lilly-library/upton-sinclair
Salary? Ideology? Worldview? Identity? Tribal status?
We are a rationalizing species at least as much as a rational one
25. @dw2 Page 25
Primary beliefs:
• World government would be awful
• Open source must be preserved
• We’ll all die of aging soon, without AGI
Assumptions:
• AI regulations would imply world government
• AI regulations would kill open source
• AGI is the best solution to various x-risks
Conclusion(?!):
• There must be no real risk of AI catastrophe
26. @dw2 Page 26
Consensual safe AI is possible
Consensual safe AI is vital
27. @dw2 Page 27
Consensual safe AI is possible
Consensual safe AI is vital
Catastrophic AI risks are by no means science fiction
Catastrophic AI risks arise straightforwardly from
assumptions that everyone shares
There are solutions that are technically possible
There are solutions that are possible politically
and geo-politically, and that respect human values
28. @dw2 Page 28
The narrow corridor
Social wellbeing faces threats from powerful groups:
• Big Armaments
• Big Tobacco
• Big Oil
• Big Finance
• Big Crime
• Big Theology
• Big Media
• Big Money
The state needs power to control these potentially cancerous powers, which creates another one: Big State
Society needs power to control the state!
• Independent media
• Independent judiciary
• Independent academia
• Independent opposition parties
The separation of powers!
• Checks and balances
30. @dw2 Page 30
Verifiably safe AI?!
AI is an inscrutable black box!?
• Distinguish the inner workings (inscrutable) from the individual recommendations (testable)
• Compare finding a solution to a puzzle with verifying that solution (see the sketch below)
• Test critical recommendations in a safe environment (emulation) first
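The find-versus-verify asymmetry in the bullets above can be made concrete: checking a completed puzzle takes a few transparent lines of code and runs in milliseconds, even if the solution was produced by an opaque search or an inscrutable model. A minimal sketch in Python, using Sudoku purely as a stand-in for "a recommendation we can test without understanding how it was generated".

```python
# Verifying a proposed solution is far easier than finding one.
# Here: check a completed 9x9 Sudoku grid without knowing how it was produced.
def is_valid_sudoku(grid):
    """grid: 9x9 list of lists of digits 1-9. True if every row, column,
    and 3x3 box contains each digit exactly once."""
    def ok(group):
        return sorted(group) == list(range(1, 10))

    rows = grid
    cols = [[grid[r][c] for r in range(9)] for c in range(9)]
    boxes = [
        [grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)]
        for br in range(0, 9, 3) for bc in range(0, 9, 3)
    ]
    return all(ok(group) for group in rows + cols + boxes)

# A "black box" may have produced `candidate` by inscrutable means; we can
# still accept or reject it purely by inspecting the output:
# accept = is_valid_sudoku(candidate)
```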
But AI will never consent to being confined!?
• That depends on whether the AI (tool) has volition / agency / sentience
• Hence the importance of studying the possible emergence of volition etc
But risky AI might be developed below the radar!?
• So build in tamper-proof remote shutdown capabilities (see the watchdog sketch after this list)
• (We don’t yet know how/whether this will be possible…)
• But the scale of these risks makes these investigations vital
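The slide itself stresses that nobody yet knows whether genuinely tamper-proof shutdown is achievable. At most, familiar engineering patterns hint at what a partial mechanism might look like; one such pattern is a watchdog that only keeps a long-running job alive while fresh, cryptographically signed "continue" tokens keep arriving from an external authority. A minimal sketch under that assumption: the key handling, the token source, and the `training_step` callable are hypothetical placeholders, and nothing here would resist a determined adversary who controls the code.

```python
# Illustrative watchdog / remote-shutdown pattern: run only while fresh,
# signed "continue" tokens arrive. Hypothetical sketch, not a tamper-proof design.
import hmac, hashlib, time

SHARED_KEY = b"placeholder-key"   # hypothetical; real systems need proper key management
HEARTBEAT_TIMEOUT = 60.0          # seconds without a valid token before halting

def token_is_valid(message: bytes, signature: bytes) -> bool:
    expected = hmac.new(SHARED_KEY, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)

def run_with_watchdog(training_step, receive_token):
    """training_step: one unit of work. receive_token: returns (message, signature)
    or None. Both are hypothetical callables supplied by the surrounding system."""
    last_ok = time.monotonic()
    while True:
        token = receive_token()
        if token is not None and token_is_valid(*token):
            last_ok = time.monotonic()
        if time.monotonic() - last_ok > HEARTBEAT_TIMEOUT:
            print("No valid heartbeat received: halting.")
            break
        training_step()
```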
Take back control of technology!
31. Reducing risks of catastrophic harm from Adv AI
Default path: Adv AI potentially uninterested in, or hostile to, human wellbeing
Two possible design choices to depart from the default dangerous path for the creation of Advanced AI:
• Adv AI –: avoid the inclusion, or subsequent acquisition, of features that would make the Adv AI truly dangerous, e.g. autonomous will, fully general reasoning (?)
• Adv AI +: design in extra features that will be preserved through all subsequent evolution, e.g. benevolence, compassion, superwisdom alongside superintelligence
• In practice a combination of Adv AI+ and Adv AI– may prove best
• Both design choices are likely very difficult and will require considerable analysis. To avoid other (dangerous) designs getting to Adv AI first, the redesigned context is an essential part of the overall recommendation
Redesigning the context for the development of Adv AI:
• Engaging education for all: spreading acute awareness of Adv AI risks and possibilities
• Transforming mental dispositions worldwide: compassion, openness, humble ambition
• Re-engineering incentives: bonuses for public goods, stronger infrastructure, monitoring, sanctions
• Help from trusted narrow AIs: review & suggest improvements in designs & implementations
33. @dw2 Page 33
Failure modes, with corresponding success modes:
• Failure: desire to build superintelligence as fast as possible ("accelerate regardless") → Success: spread deep understanding of the risks associated with superintelligence
• Failure: being pulled into a "Moloch" race ("accelerate despite our better intentions") → Success: establish higher-level incentives that reward and penalize appropriately
• Failure: doom-mongering, adverse psychology → Success: strengthen the vision of good outcomes too
• Failure: distraction by concerns of simpler, more immediate failures of AI systems → Success: assign sufficient time (and respect) to consider all varieties of AI failures
• Failure: virtue signaling (talk without action) → Success: virtue action (step-by-step)
• Failure: inflexible, unhelpful, heavy legislation → Success: adaptive, agile, lean legislation
• Failure: not listening to key insights (closed minds) → Success: embrace diverse creativity and criticism
• Failure: discussions without data (ideology first) → Success: gather data and analyze it (with AI help!)
https://magazine.mindplex.ai/cautionary-tales-and-a-ray-of-hope/
“Cautionary Tales And A Ray Of Hope: 4 scenarios for the transition to AGI”
34. @dw2 Page 34
Agreeing a path forward – transcendent goals
Define and pay attention to canary signals (wake-up calls)
• Beware the distractions of (e.g.) partisan tribalism, political correctness
• Deepen our understanding of which landmines need most care
Humanity should protect our most vulnerable infrastructure
(against action from dangerous humans and/or dangerous AI)
• Access to nuclear weapons
• Access to creation/jailbreaking of especially dangerous biopathogens
• The IT infrastructure on which we all depend
• The health of the environment
Protect human lives (the most basic of human rights)
• Enable all-round health and flourishing (not just GDP)
Prioritize differential development (steering plus acceleration)
• Mechanisms for safety, auditing, disabling, and building trust
35. @dw2 Page 35
Think harder about the consequences in advance
Monitor closely once deployed, ready to intervene
Question desirability
Clarify externalities
Require peer reviews
Involve multiple perspectives
Analyse the whole system
Anticipate fat tails
Reject opacity
Promote resilience
Promote verifiability
Promote auditability
Clarify risks to users
Clarify trade-offs
Insist on accountability
Penalise disinformation
Design for cooperation
Analyse via simulations
Maintain human oversight
Build consensus regarding principles
Provide incentives to address omissions
Halt development if principles not upheld
Consolidate progress via legal frameworks