Meredith L.
Patterson
BSidesLV
August 5, 2014
STRATEGIES
WITHOUT
FRONTIERS
 I hate boring problems
 I especially hate solving tiny variations on the same
boring problem over and over again
 The internet is full of the same boring problems over
and over again
 Both in the cloud …
 … and in the circus
 Not my circus, not my monkeys
MOTIVATION
 Information theory
 Probability theory
 Formal language theory (of course)
 Control theory
 First-order logic
 Haskell
ALSO APPEARING IN THIS TALK
 When an unknown agent acts, how do you react?
 Observation of side effects
 Signals the agent sends
 Past interactions with others
 Formal language theory
(if you’re a computer)
 Systematic knowledge about the
structure of interactions and the
incentives involved in them
IT IS PITCH BLACK. YOU ARE LIKELY TO
BE EATEN BY A GRUE.
 Everything You Actually Need to Know About
Classical Game Theory
 in math …
 … and psychology
 Changing the Game
 Extensive form and signaling games
 Multiplayer and long-running games
 Reasoning Under Uncertainty, Over Real Data
OUTLINE
EVERYTHING YOU ACTUALLY
NEED TO KNOW ABOUT
CLASSICAL GAME THEORY
 Players
 Information available at each decision point
 Possible actions at each decision point
 Payoffs for each outcome
 Strategies (pure or mixed)
 Or behaviour, in iterated or turn-taking games
 Equilibria
 Different kinds of games have different kinds of equilibria
WHAT’S IN A GAME?
a, b c, d
e, f g, h
A NORMAL FORM GAME
Cooperate
Defect
Cooperate Defect
 Pure strategy: fully specified set of moves for every
situation
 Mixed strategy: probability assigned to each possible
move, random path through game tree
 Behaviour strategies: probabilities assigned at
information sets
STRATEGIES
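As an illustration (not from the talk; Python stands in for the deck's Haskell), a mixed strategy is just a probability distribution over actions, and expected payoffs fall out as a weighted sum. The matrix here is Matching Pennies from a later slide; a pure strategy is the degenerate case with probability 1 on one action:

```python
# Expected payoffs in a 2x2 normal-form game under mixed strategies
# (illustrative sketch; action 0 = first row/column of the slide matrices).

def expected_payoffs(payoffs, p_row, p_col):
    """payoffs[i][j] = (row payoff, col payoff); p_row, p_col = Pr(action 0)."""
    er = ec = 0.0
    for i, pi in enumerate((p_row, 1 - p_row)):
        for j, pj in enumerate((p_col, 1 - p_col)):
            r, c = payoffs[i][j]
            er += pi * pj * r
            ec += pi * pj * c
    return er, ec

# Matching Pennies: the 50/50 mix is its equilibrium.
pennies = [[(1, -1), (-1, 1)], [(-1, 1), (1, -1)]]
print(expected_payoffs(pennies, 0.5, 0.5))  # (0.0, 0.0)
```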
PRISONER’S DILEMMA
-1, -1 -3, 0
0, -3 -2, -2
Cooperate
Defect
Cooperate Defect
d, e > a, b > g, h > c, f
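The ordering above can be checked mechanically: it makes Defect strictly dominant for each player, which is why mutual defection is the dilemma's outcome. A quick sketch (Python, illustrative):

```python
# The Prisoner's Dilemma matrix from the slide, entries as (row, column) payoffs.
PD = [[(-1, -1), (-3, 0)],   # row player Cooperates
      [(0, -3), (-2, -2)]]   # row player Defects

def dominant_row_action(game):
    """The row action strictly better against every column action, if any."""
    for a in range(2):
        if all(game[a][j][0] > game[1 - a][j][0] for j in range(2)):
            return a
    return None

# Defect (action 1) strictly dominates Cooperate, so both players defect
# even though mutual cooperation would pay each of them more.
print(dominant_row_action(PD))  # 1
```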
MATCHING PENNIES
1, -1 -1, 1
-1, 1 1, -1
Heads
Tails
Heads Tails
a = d = f = g > b = c = e = h
DEADLOCK
1, 1 0, 3
3, 0 2, 2
Cooperate
Defect
Cooperate Defect
e > g > a > c and d > h > b > f
STAG HUNT
2, 2 0, 1
1, 0 1, 1
Stag
Hare
Stag Hare
a = b > d = e = g = h > c = f
CHICKEN
0, 0 -1, 1
1, -1 -10, -10
Swerve
Straight
Swerve Straight
e > a > c > g and d > b > f > h
HAWK/DOVE
V/2, V/2 0, V
V, 0 (V−C)/2, (V−C)/2
Share
Fight
Share Fight
e > a > c > g and d > b > f > h
BATTLE OF THE SEXES
3, 2 0, 0
0, 0 2, 3
Opera
Football
Opera Football
(a > g and h > b) > c = d = e = f
 Games can be zero-sum or non-zero-sum
 Games can be about conflict or cooperation
 Actions are not inherently morally valenced
 Payoffs determine type of game, strategy
WHAT HAVE WE SEEN SO FAR?
 Cournot equilibrium: each actor’s output maximizes
its profit given the outputs of other actors
 Nash equilibrium: each actor is making the best
decision they can, given what they know about each
other’s decisions
 Subgame perfect equilibrium: eliminates non-
credible threats
 Trembling hand equilibrium: considers the possibility
that a player might make an unintended move
EQUILIBRIUM
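For 2×2 games, pure-strategy Nash equilibria can be enumerated directly: a cell is an equilibrium when neither player gains by unilateral deviation. A minimal sketch (Python, illustrative; `pure_nash` is my helper, not from the talk), using the Stag Hunt and Chicken matrices from earlier slides:

```python
# Pure-strategy Nash equilibria of a 2x2 game by direct enumeration.

def pure_nash(game):
    """Action pairs (i, j) where neither player gains by unilateral deviation."""
    eq = []
    for i in range(2):
        for j in range(2):
            row_ok = game[i][j][0] >= game[1 - i][j][0]   # row can't improve
            col_ok = game[i][j][1] >= game[i][1 - j][1]   # column can't improve
            if row_ok and col_ok:
                eq.append((i, j))
    return eq

stag_hunt = [[(2, 2), (0, 1)], [(1, 0), (1, 1)]]      # 0 = Stag, 1 = Hare
chicken = [[(0, 0), (-1, 1)], [(1, -1), (-10, -10)]]  # 0 = Swerve, 1 = Straight
print(pure_nash(stag_hunt))  # [(0, 0), (1, 1)]: both-Stag and both-Hare
print(pure_nash(chicken))    # [(0, 1), (1, 0)]: one swerves, the other doesn't
```

Note how different payoff structures yield different equilibrium sets: Stag Hunt has two symmetric equilibria, Chicken two asymmetric ones.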
TRANSACTIONAL
ANALYSIS:
GAMES PEOPLE PLAY
MIND GAMES
“As far as the theory of games is concerned,
the principle which emerges here is that any
social intercourse whatsoever has a biological
advantage over no intercourse at all.”
 Procedures
 Operations
 Rituals
 Pastimes
 (Predatory) Games
TYPES OF INTERACTIONS
 “Hands” or roles = players
 Extensive form; players move in response to each
other
 Advantages
 Existential advantage: confirmation of existing beliefs
 Internal psychological advantage: direct emotional payoff
 External psychological advantage: avoiding a feared
situation
 Internal social advantage: structure/position with respect
to other players
 External social advantage: as above, wrt non-players
BERNE’S GAMES: STRUCTURE
 Kick Me
 Goal: Sympathy
 Find someone to beat on you, then whine about it
 “My misfortunes are better than yours”
 Ain’t It Awful
 Can be a pastime, but also manifests as a game
 Player displays distress; payoff is sympathy and help
 Why Don’t You – Yes, But
 Player claims to want advice. Player doesn’t really want it.
 Goal: Reassurance
BERNE’S GAMES: EXAMPLES
 Now I’ve Got You, You Son Of A Bitch
 Goal: Justification (or just money)
 Three-handed version is the badger game
 Roles
 Victim
 Aggressor
 Confederate
 Moves
 Provocation → Accusation
 Defence → Accusation
 Defence → Punishment
THE BADGER GAME
 “Schlemiel,” in Berne’s glossary
 Moves:
 Provocation → resentment
 (repeat)
 If B responds with anger, A appears justified in more
anger
 If B keeps their cool, A still keeps pushing
TROLLING
 Social media
 Organic responses against predatory games
 Predator Alert Tool
 /r/TumblrInAction “known trolls” wiki
 Those just happen to be ones I know about
 A truly generic reputation system is probably a pipe dream
 Wikipedia
 eBay
 But for these, we have to extend the basic
mathematical model.
OTHER MONKEY GAMEBOARDS
DISSECTING A
SIGNALING GAME
THE SETUP
THE TYPE
[Game tree: decision node 1 with branches Split and Steal]
BOTH SPLIT
[Game tree: node 1 branches to Split or Steal; players A and B both choose Split at information set 2; payoff 6800, 6800]
ONE SPLITS, ONE STEALS
[Game tree extended: one player Splits while the other Steals; payoffs 0, 13600 and 13600, 0]
BOTH STEAL
[Complete game tree: Split/Split pays 6800, 6800; Split/Steal pays 0, 13600 (or 13600, 0); Steal/Steal pays 0, 0]
NORMAL FORM
Also known as the Friend-or-Foe game.
1, 1 0, 2
2, 0 0, 0
Split
Steal
Split Steal
d = e > a = b > c = f = g = h
OBSERVATION
FIRST MOVE: NICK’S CHOICE
[Signaling game tree: each player chooses Split or Steal and announces "I'm likely to split" or "I'm likely to steal"; Nick's initial choice is the highlighted first move. Payoffs 6800, 6800 / 0, 13600 / 13600, 0 / 0, 0]
SIGNALING
SECOND MOVE: NICK’S SIGNAL
[Same signaling tree, now with Nick's announcement ("I'm likely to split" or "I'm likely to steal") highlighted as the second move]
THE BIG REVEAL
THE COMPLETE PATH
[Same signaling tree with the full realized path highlighted: choice, signal, and the opponent's response, ending at the final payoff]
GAMES IN THE
TRANSPARENT SOCIETY
 Strategies now depend on payoff matrix and history
 Axelrod, 1981: how well do these strategies perform
against each other over time?
 “Ecological” tournaments: players abandon bad strategies
 Rapoport: if the only information you have is how
player X interacted with you last time, the best you
can do is Tit-for-Tat
 TFT cannot score higher than its opponent
 Axelrod: “Don’t be envious”
 Against TFT, no one can do better than cooperate
 Axelrod: “Don’t be too clever”
ITERATED GAMES
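A toy round robin makes the tournament idea concrete. This sketch (Python, illustrative; conventional 3/5/1/0 iterated-PD payoffs, not Axelrod's exact setup) pits Tit-for-Tat against All-Cooperate and All-Defect:

```python
# Iterated Prisoner's Dilemma round robin (illustrative values).
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tft(mine, theirs):
    # Tit-for-Tat: cooperate first, then copy the opponent's last move.
    return theirs[-1] if theirs else 'C'

def all_c(mine, theirs):
    return 'C'

def all_d(mine, theirs):
    return 'D'

def play(s1, s2, rounds=200):
    h1, h2 = [], []
    score1 = score2 = 0
    for _ in range(rounds):
        a, b = s1(h1, h2), s2(h2, h1)
        p, q = PAYOFF[(a, b)]
        score1, score2 = score1 + p, score2 + q
        h1.append(a)
        h2.append(b)
    return score1, score2

strategies = {'TFT': tft, 'AllC': all_c, 'AllD': all_d}
totals = {name: 0 for name in strategies}
for n1, s1 in strategies.items():
    for n2, s2 in strategies.items():
        totals[n1] += play(s1, s2)[0]
print(totals)
```

In every pairing Tit-for-Tat scores no higher than its opponent, exactly as the slide says; in this tiny fixed pool All-Defect edges ahead overall only because exploitable All-Cooperate props it up, which is the kind of prey an ecological tournament eliminates.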
 Nice: S is a nice strategy iff it will not defect on
someone who has not defected on it
 Retaliatory: S is a retaliatory strategy iff it will
defect on someone who defects on it
 Forgiving: S is a forgiving strategy iff it will stop
defecting on someone who stops defecting on it
PROPERTIES
 Ord/Blair, 2002: what happens when strategies can
take into account all past interactions?
 We can express strategies in convenient first-order
logic, as it turns out
 Tit-for-Tat: D(c, r, p)
 Tit-for-Two-Tats: D(c, r, p) ∧ D(c, r, b(p))
 Grim: ∃t D(c, r, t)
 Bully: ¬∃t D(c, r, t)
 Spiteful-Bully: ¬∃t D(c, r, t) ∨ ∃s (D(c, r, s) ∧ D(c, r, b(s)) ∧
D(c, r, b(b(s))))
 Vigilante: ¬∃j D(c, j, p)
 Police: D(c, r, p) ∨ ∃j (D(c, j, p) ∧ ¬∃k D(j, k, b(p)))
SOCIETAL ITERATED GAME THEORY
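The quantified formulas above can be executed directly as predicates over the full history. A sketch (Python, illustrative; the history encoding and the reading of D(c, r, t) as "r defected on c in round t" are my assumptions — Ord and Blair's paper fixes the exact convention):

```python
# First-order-logic strategies as predicates over the full interaction history.

def D(history, c, r, t):
    """True iff r defected on c in round t; history[t] is a set of
    (defector, victim) pairs."""
    return 0 <= t < len(history) and (r, c) in history[t]

def tit_for_tat_defects(history, c, r):   # D(c, r, p): p = previous round
    return D(history, c, r, len(history) - 1)

def grim_defects(history, c, r):          # ∃t D(c, r, t): any past defection
    return any(D(history, c, r, t) for t in range(len(history)))

def bully_defects(history, c, r):         # ¬∃t D(c, r, t): prey on the nice
    return not grim_defects(history, c, r)

h = [{('B', 'A')}, set()]  # B defected on A in round 0; round 1 was clean
print(tit_for_tat_defects(h, 'A', 'B'))  # False: B cooperated last round
print(grim_defects(h, 'A', 'B'))         # True: Grim never forgets
```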
EVOLUTION IS A HARSH MISTRESS
Tit-for-Tat All-Cooperate Spiteful-Bully
PEACEKEEPING
Police All-Cooperate Spiteful-Bully
 In a society, niceness is more nuanced
 Individually nice: will not defect on someone who has not
defected on it
 Meta-individually nice: will not defect on individually nice
 Communally nice: will not defect on someone who has not
defected at all
 Meta-communally nice: will not defect on communally nice
 Same applies to forgiveness and retaliation
 Loyalty: will not defect on the same strategy as itself
NICENESS AND LOYALTY
 Peacekeepers don’t always agree
 Police will defect on Vigilantes and vice versa
 Peacekeepers protect non-peacekeeping strategies
at their own expense
META-PEACEKEEPING
Police
All-Cooperate
Spiteful-Bully
Tit-for-Tat
REDUCTIO AD ABSURDUM: ABSOLUTIST
∃t ∃j (D(r, j, t) ⊕ D(c, j, t))
Tit-for-Tat All-Cooperate Spiteful-Bully Absolutist
ABSOLUTISM UBER ALLES
Tit-for-Tat All-Cooperate Spiteful-Bully Absolutist
REASONING UNDER
UNCERTAINTY
 Frequentist: probability is the long-term frequency of
events
 Reasoning from absolute probabilities
 What happens if an event only happens once?
 Returns an estimate
 Bayesian: probability is a measure of confidence that
an event will occur
 Reasoning from relative probabilities
 Returns a probability distribution over outcomes
 Update beliefs (confidence) as new evidence arrives
TWO INTERPRETATIONS OF PROBABILITY
P(A|X) = P(X|A) · P(A) / P(X)
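Plugging numbers into Bayes' rule shows how a single observation shifts belief. A sketch (Python; the scenario and numbers are hypothetical):

```python
# Bayes' rule P(A|X) = P(X|A) P(A) / P(X), expanding P(X) over A and not-A.
# Hypothetical scenario: A = "the agent is hostile",
# X = "we observed a suspicious signal".

def posterior(p_a, p_x_given_a, p_x_given_not_a):
    p_x = p_x_given_a * p_a + p_x_given_not_a * (1 - p_a)
    return p_x_given_a * p_a / p_x

# A rare hypothesis plus one noisy observation: belief rises from 1% to ~8%.
print(round(posterior(0.01, 0.9, 0.1), 3))  # 0.083
```

Repeating the update with each new piece of evidence is exactly the "update beliefs as new evidence arrives" loop from the previous slide.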
 Probability distribution function: assigns
probabilities to outcomes
 Discrete: a finite set of values (enumeration)
 Function also called a probability mass function
 Poisson, binomial, Bernoulli, discrete uniform…
 Continuous: arbitrary-precision values
 Function also called a probability density function
 Exponential, Gaussian (normal), chi-squared, continuous
uniform…
 Mixed: both discrete and continuous
 Narrower distribution = greater certainty
DISTRIBUTIONS
E[Z | λ] = λ (Poisson) E[Z | λ] = 1/λ (Exponential)
 Game theory is great when you know the payoffs
 What can you do if you don’t know the payoffs?
 Or what the game tree looks like?
 Well…
 You usually have some educated guesses about who the
players are
 You have some idea what your possible actions are, as
well as the other players’
 You can look at past interactions and make inferences
 Which of these can be random variables? All of them.
 Deterministic: if all inputs are known, value is known
 Stochastic: even if all inputs are known, still random
YOU DON’T KNOW WHAT YOU DON’T KNOW
 Figure out what distribution to use
 Figure out what parameter you need to estimate
 Figure out a distribution for it, and any parameters
 Observing data tells you what your priors are
 Fixing values for stochastic variables
 Markov Chain Monte Carlo: sampling the posterior
distribution thousands of times
DON’T WAIT — SIMULATE
 Prerequisites:
 A Markov chain with an equilibrium distribution
 A function f proportional to the density of the distribution
you care about
 Choose some initial set of values for all variables
(state, S)
 Modify S according to Markov chain state transitions
 If f(S’)/f(S) ≥ 1, S’ is more likely than S, so accept
 Otherwise, accept S’ with probability f(S’)/f(S)
 Repeat
CONVERGING ON EXPECTED VALUES
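The accept/reject rule above is the Metropolis algorithm. A minimal sketch (Python, illustrative; the target is an unnormalized Gaussian rather than a game posterior, and the symmetric proposal keeps the simple ratio rule valid):

```python
import math
import random

# Metropolis sampling, matching the accept/reject rule on the slide.

def metropolis(f, proposal, s0, n):
    """Draw n (correlated) samples from the density proportional to f."""
    s, samples = s0, []
    for _ in range(n):
        s_new = proposal(s)
        ratio = f(s_new) / f(s)
        # Accept if S' is more likely; otherwise accept with probability ratio.
        if ratio >= 1 or random.random() < ratio:
            s = s_new
        samples.append(s)
    return samples

f = lambda x: math.exp(-x * x / 2)              # proportional to N(0, 1)
proposal = lambda x: x + random.uniform(-1, 1)  # symmetric random-walk step
samples = metropolis(f, proposal, 0.0, 50000)
print(round(sum(samples) / len(samples), 1))    # near 0 for this symmetric target
```

Note that f only needs to be *proportional* to the target density: the normalizing constant cancels in the ratio, which is what makes this usable on posteriors you can't integrate.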
A GAME WITHOUT PAYOFFS
type Outcome = Measure (Bool, Bool)
type Trust = Double
type Strategy = Trust -> Bool -> Bool -> Measure Bool
tit :: Trust -> Bool -> Bool -> Measure Bool
tit me True _ = conditioned $ bern 0.9
tit me False _ = conditioned $ bern me
CHOOSING WHICH HOLE TO FILL IN
play :: Strategy -> Strategy ->
        (Bool, Bool) -> (Trust, Trust) -> Outcome
play strat_a strat_b (last_a, last_b) (a, b) = do
  a_action <- strat_a a last_b last_a
  b_action <- strat_b b last_a last_b
  return (a_action, b_action)

iterated_game :: Measure (Double, Double)
iterated_game = do
  let a_initial = False
  let b_initial = False
  a <- unconditioned $ uniform 0 1
  b <- unconditioned $ uniform 0 1
  rounds <- replicateM 10 $ return (a, b)
  foldM_ (play tit tit) (a_initial, b_initial) rounds
  return (a, b)
LET’S PLAY A GAME
games = [Just (toDyn False), Just (toDyn False),
         Just (toDyn False), Just (toDyn True),
         Just (toDyn False), Just (toDyn False),
         Just (toDyn False), Just (toDyn True),
         Just (toDyn False), Just (toDyn True),
         Just (toDyn False), Just (toDyn False),
         Just (toDyn False), Just (toDyn True),
         Just (toDyn False), Just (toDyn True),
         Just (toDyn False), Just (toDyn True),
         Just (toDyn False), Just (toDyn False)]

do
  l <- mcmc iterated_game games
  return [makeHistogram 30 (Data.Vector.fromList $ map fst (take 5000 l)) "A's paranoia",
          makeHistogram 30 (Data.Vector.fromList $ map snd (take 5000 l)) "B's paranoia"]
HOW MUCH TINFOIL IS IN THAT HAT?
MORE STRATEGIES
allCooperate :: Trust -> Bool -> Bool -> Measure Bool
allCooperate _ _ _ = conditioned $ bern 0.1
allDefect :: Trust -> Bool -> Bool -> Measure Bool
allDefect _ _ _ = conditioned $ bern 0.9
grimTrigger :: Trust -> Bool -> Bool -> Measure Bool
grimTrigger me True False = conditioned $ bern 0.9
grimTrigger me False False = conditioned $ bern 0.1
grimTrigger me _ True = conditioned $ bern 0.9
STRATEGY AS A RANDOM VARIABLE
data SChoice = Tit | GrimTrigger | AllDefect | AllCooperate
  deriving (Eq, Ord, Enum, Typeable, Show)
chooseStrategy :: SChoice -> Strategy
chooseStrategy Tit = tit
chooseStrategy AllDefect = allDefect
chooseStrategy AllCooperate = allCooperate
chooseStrategy GrimTrigger = grimTrigger
strat :: Measure SChoice
strat = unconditioned $ categorical [(AllCooperate, 0.25),
                                     (AllDefect, 0.25),
                                     (GrimTrigger, 0.25),
                                     (Tit, 0.25)]
LET’S PLAY ANOTHER GAME
iterated_game2 :: Measure (SChoice, SChoice)
iterated_game2 = do
  let a_initial = False
  let b_initial = False
  a <- unconditioned $ uniform 0 1
  b <- unconditioned $ uniform 0 1
  na <- strat
  let a_strat = chooseStrategy na
  nb <- strat
  let b_strat = chooseStrategy nb
  rounds <- replicateM 10 $ return (a, b)
  foldM_ (play a_strat b_strat) (a_initial, b_initial) rounds
  return (na, nb)

do
  l <- mcmc iterated_game2 games
  return [makeDiscrete (map fst (take 1000 l)) "A strategy",
          makeDiscrete (map snd (take 1000 l)) "B strategy"]
WHO’S WHO?
 Probabilistic SIPD
 Extensive form SIPD with signaling
 And channels with decidable vs. heuristic recognisers
 Coordination. Enough said.
 System 1/System 2 conflict
 Sentiment analysis → payoff data
 Start small: the stroke is the smallest unit of interaction
 Data where information about players is limited
 IP flows
 Anonymity networks
 Signaling game about type: are two actors the same
person?
FUTURE WORK
QUESTIONS?
mlp@upstandinghackers.com
@maradydd
More Related Content

What's hot

Introduction to the Strategy of Game Theory
Introduction to the Strategy of Game TheoryIntroduction to the Strategy of Game Theory
Introduction to the Strategy of Game TheoryJonathon Flegg
 
Lecture 3 MMX3043 Game Design and Development
Lecture 3 MMX3043 Game Design and DevelopmentLecture 3 MMX3043 Game Design and Development
Lecture 3 MMX3043 Game Design and DevelopmentLaili Farhana M.I.
 
Learning to Play Complex Games
Learning to Play Complex GamesLearning to Play Complex Games
Learning to Play Complex Gamesbutest
 
2012.12 Games We Play: Defenses & Disincentives
2012.12 Games We Play: Defenses & Disincentives2012.12 Games We Play: Defenses & Disincentives
2012.12 Games We Play: Defenses & DisincentivesAllison Miller
 
GAME THEORY NOTES FOR ECONOMICS HONOURS FOR ALL UNIVERSITIES BY SOURAV SIR'S ...
GAME THEORY NOTES FOR ECONOMICS HONOURS FOR ALL UNIVERSITIES BY SOURAV SIR'S ...GAME THEORY NOTES FOR ECONOMICS HONOURS FOR ALL UNIVERSITIES BY SOURAV SIR'S ...
GAME THEORY NOTES FOR ECONOMICS HONOURS FOR ALL UNIVERSITIES BY SOURAV SIR'S ...SOURAV DAS
 
Forecasting Online Game Addictiveness
Forecasting Online Game AddictivenessForecasting Online Game Addictiveness
Forecasting Online Game AddictivenessAcademia Sinica
 
Game theory in network security
Game theory in network securityGame theory in network security
Game theory in network securityRahmaSallam
 
Oligopoly and Game Theory
Oligopoly and Game TheoryOligopoly and Game Theory
Oligopoly and Game Theorytutor2u
 
Motivations for AR Gaming - Presentation at NZ GDC 2004
Motivations for AR Gaming - Presentation at NZ GDC 2004Motivations for AR Gaming - Presentation at NZ GDC 2004
Motivations for AR Gaming - Presentation at NZ GDC 2004Trond Nilsen
 
UX: USA Network Playing House Website
UX: USA Network Playing House WebsiteUX: USA Network Playing House Website
UX: USA Network Playing House WebsiteDarren Lou
 

What's hot (20)

Introduction to the Strategy of Game Theory
Introduction to the Strategy of Game TheoryIntroduction to the Strategy of Game Theory
Introduction to the Strategy of Game Theory
 
203CR
203CR203CR
203CR
 
report
reportreport
report
 
Got game
Got gameGot game
Got game
 
Lecture 3 MMX3043 Game Design and Development
Lecture 3 MMX3043 Game Design and DevelopmentLecture 3 MMX3043 Game Design and Development
Lecture 3 MMX3043 Game Design and Development
 
Learning to Play Complex Games
Learning to Play Complex GamesLearning to Play Complex Games
Learning to Play Complex Games
 
2012.12 Games We Play: Defenses & Disincentives
2012.12 Games We Play: Defenses & Disincentives2012.12 Games We Play: Defenses & Disincentives
2012.12 Games We Play: Defenses & Disincentives
 
GAME THEORY NOTES FOR ECONOMICS HONOURS FOR ALL UNIVERSITIES BY SOURAV SIR'S ...
GAME THEORY NOTES FOR ECONOMICS HONOURS FOR ALL UNIVERSITIES BY SOURAV SIR'S ...GAME THEORY NOTES FOR ECONOMICS HONOURS FOR ALL UNIVERSITIES BY SOURAV SIR'S ...
GAME THEORY NOTES FOR ECONOMICS HONOURS FOR ALL UNIVERSITIES BY SOURAV SIR'S ...
 
Forecasting Online Game Addictiveness
Forecasting Online Game AddictivenessForecasting Online Game Addictiveness
Forecasting Online Game Addictiveness
 
Week Five, Game Design
Week Five, Game DesignWeek Five, Game Design
Week Five, Game Design
 
Game theory
Game theoryGame theory
Game theory
 
Page 7
Page 7Page 7
Page 7
 
Game theory in network security
Game theory in network securityGame theory in network security
Game theory in network security
 
Libratus
LibratusLibratus
Libratus
 
Game Theory
Game TheoryGame Theory
Game Theory
 
Oligopoly and Game Theory
Oligopoly and Game TheoryOligopoly and Game Theory
Oligopoly and Game Theory
 
3. research
3. research3. research
3. research
 
Motivations for AR Gaming - Presentation at NZ GDC 2004
Motivations for AR Gaming - Presentation at NZ GDC 2004Motivations for AR Gaming - Presentation at NZ GDC 2004
Motivations for AR Gaming - Presentation at NZ GDC 2004
 
UX: USA Network Playing House Website
UX: USA Network Playing House WebsiteUX: USA Network Playing House Website
UX: USA Network Playing House Website
 
Game Theory
Game TheoryGame Theory
Game Theory
 

Similar to Strategies Without Frontiers

Lecture OverviewSolving the prisoner’s dilemmaInstrumental r.docx
Lecture OverviewSolving the prisoner’s dilemmaInstrumental r.docxLecture OverviewSolving the prisoner’s dilemmaInstrumental r.docx
Lecture OverviewSolving the prisoner’s dilemmaInstrumental r.docxSHIVA101531
 
Advanced Game Theory guest lecture
Advanced Game Theory guest lectureAdvanced Game Theory guest lecture
Advanced Game Theory guest lectureJonas Heide Smith
 
Designing Ethical Dilemmas - Long
Designing Ethical Dilemmas - LongDesigning Ethical Dilemmas - Long
Designing Ethical Dilemmas - LongManveer Heir
 
Game theory intro_and_questions_2009[1]
Game theory intro_and_questions_2009[1]Game theory intro_and_questions_2009[1]
Game theory intro_and_questions_2009[1]evamstrauss
 
navingameppt-191018085333.pdf
navingameppt-191018085333.pdfnavingameppt-191018085333.pdf
navingameppt-191018085333.pdfDebadattaPanda4
 
Designing balance (takeaway version)
Designing balance (takeaway version)Designing balance (takeaway version)
Designing balance (takeaway version)Kacper Szymczak
 
AI Strategies for Solving Poker Texas Hold'em
AI Strategies for Solving Poker Texas Hold'emAI Strategies for Solving Poker Texas Hold'em
AI Strategies for Solving Poker Texas Hold'emGiovanni Murru
 
Making Causal Claims as a Data Scientist: Tips and Tricks Using R
Making Causal Claims as a Data Scientist: Tips and Tricks Using RMaking Causal Claims as a Data Scientist: Tips and Tricks Using R
Making Causal Claims as a Data Scientist: Tips and Tricks Using RLucy D'Agostino McGowan
 

Similar to Strategies Without Frontiers (20)

Lecture OverviewSolving the prisoner’s dilemmaInstrumental r.docx
Lecture OverviewSolving the prisoner’s dilemmaInstrumental r.docxLecture OverviewSolving the prisoner’s dilemmaInstrumental r.docx
Lecture OverviewSolving the prisoner’s dilemmaInstrumental r.docx
 
Advanced Game Theory guest lecture
Advanced Game Theory guest lectureAdvanced Game Theory guest lecture
Advanced Game Theory guest lecture
 
gt_2007
gt_2007gt_2007
gt_2007
 
Lecture 2 Social Preferences I
Lecture 2 Social Preferences ILecture 2 Social Preferences I
Lecture 2 Social Preferences I
 
Designing Ethical Dilemmas - Long
Designing Ethical Dilemmas - LongDesigning Ethical Dilemmas - Long
Designing Ethical Dilemmas - Long
 
Designing
DesigningDesigning
Designing
 
Game theory intro_and_questions_2009[1]
Game theory intro_and_questions_2009[1]Game theory intro_and_questions_2009[1]
Game theory intro_and_questions_2009[1]
 
navingameppt-191018085333.pdf
navingameppt-191018085333.pdfnavingameppt-191018085333.pdf
navingameppt-191018085333.pdf
 
Game Ethology 2
Game Ethology 2Game Ethology 2
Game Ethology 2
 
game THEORY ppt
game THEORY pptgame THEORY ppt
game THEORY ppt
 
Economic cognition
Economic cognitionEconomic cognition
Economic cognition
 
Moral Apes2
Moral Apes2Moral Apes2
Moral Apes2
 
Designing balance (takeaway version)
Designing balance (takeaway version)Designing balance (takeaway version)
Designing balance (takeaway version)
 
AI Strategies for Solving Poker Texas Hold'em
AI Strategies for Solving Poker Texas Hold'emAI Strategies for Solving Poker Texas Hold'em
AI Strategies for Solving Poker Texas Hold'em
 
Structural Language
Structural LanguageStructural Language
Structural Language
 
Making Causal Claims as a Data Scientist: Tips and Tricks Using R
Making Causal Claims as a Data Scientist: Tips and Tricks Using RMaking Causal Claims as a Data Scientist: Tips and Tricks Using R
Making Causal Claims as a Data Scientist: Tips and Tricks Using R
 
Week2 class2010
Week2 class2010Week2 class2010
Week2 class2010
 
Game theory
Game theoryGame theory
Game theory
 
nips-gg
nips-ggnips-gg
nips-gg
 
Theory of decision making
Theory of decision makingTheory of decision making
Theory of decision making
 

Recently uploaded

Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesTimothy Spann
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhYasamin16
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一F sss
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 

Recently uploaded (20)

Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 

Strategies Without Frontiers

  • 12. DEADLOCK
1, 1 0, 3
3, 0 2, 2
Cooperate
Defect
Cooperate Defect
e > g > a > c and d > h > b > f
  • 13. STAG HUNT
2, 2 0, 1
1, 0 1, 1
Stag
Hare
Stag Hare
a = b > d = e = g = h > c = f
  • 14. CHICKEN
0, 0 -1, 1
1, -1 -10, -10
Swerve
Straight
Swerve Straight
e > a > c > g and d > b > f > h
  • 16. BATTLE OF THE SEXES
3, 2 0, 0
0, 0 2, 3
Opera
Football
Opera Football
(a > g and h > b) > c = d = e = f
  • 17. WHAT HAVE WE SEEN SO FAR?
 Games can be zero-sum or non-zero-sum
 Games can be about conflict or cooperation
 Actions are not inherently morally valenced
 Payoffs determine type of game, strategy
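The payoff orderings above are enough to compute mechanically which of these games have a dominant strategy. A minimal Python sketch (the talk's own code is Haskell; this helper and its name `dominant_action` are mine, for illustration only):

```python
# A 2x2 normal-form game as a dict of payoff pairs indexed by
# (row_action, col_action); pairs are (row payoff, col payoff),
# matching the slides' matrices.

def dominant_action(game, actions, player):
    """Return a strictly dominant action for `player` (0 = row, 1 = col), or None."""
    for a in actions:
        others = [b for b in actions if b != a]
        if all(
            (game[(a, opp)][0] > game[(b, opp)][0] if player == 0
             else game[(opp, a)][1] > game[(opp, b)][1])
            for b in others
            for opp in actions
        ):
            return a
    return None

# Prisoner's Dilemma payoffs from the slides: Defect dominates for both.
pd = {
    ("C", "C"): (-1, -1), ("C", "D"): (-3, 0),
    ("D", "C"): (0, -3),  ("D", "D"): (-2, -2),
}

# Matching Pennies: no pure strategy is a best response to everything.
mp = {
    ("H", "H"): (1, -1), ("H", "T"): (-1, 1),
    ("T", "H"): (-1, 1), ("T", "T"): (1, -1),
}

print(dominant_action(pd, ["C", "D"], 0))  # D
print(dominant_action(mp, ["H", "T"], 0))  # None
```

The same check explains the dilemma: defection dominates individually even though mutual cooperation pays both players more.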
  • 18. EQUILIBRIUM
 Cournot equilibrium: each actor’s output maximizes its profit given the outputs of other actors
 Nash equilibrium: each actor is making the best decision they can, given what they know about each other’s decisions
 Subgame perfect equilibrium: eliminates non-credible threats
 Trembling hand equilibrium: considers the possibility that a player might make an unintended move
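The Nash condition (no player gains by deviating unilaterally) is mechanical to check in a 2×2 game. A Python sketch using the Stag Hunt payoffs from the earlier slide; the function name is mine:

```python
from itertools import product

def pure_nash_equilibria(game, actions):
    """All pure-strategy profiles where neither player can improve
    their own payoff by unilaterally switching actions."""
    eqs = []
    for (ra, ca) in product(actions, actions):
        row_ok = all(game[(ra, ca)][0] >= game[(alt, ca)][0] for alt in actions)
        col_ok = all(game[(ra, ca)][1] >= game[(ra, alt)][1] for alt in actions)
        if row_ok and col_ok:
            eqs.append((ra, ca))
    return eqs

# Stag Hunt payoffs from the slides: two pure equilibria, as stated in
# the speaker's notes (both cooperate, or both defect).
stag_hunt = {
    ("Stag", "Stag"): (2, 2), ("Stag", "Hare"): (0, 1),
    ("Hare", "Stag"): (1, 0), ("Hare", "Hare"): (1, 1),
}
print(pure_nash_equilibria(stag_hunt, ["Stag", "Hare"]))
# [('Stag', 'Stag'), ('Hare', 'Hare')]
```

(Stag, Stag) is payoff dominant and (Hare, Hare) risk dominant; the check above finds both but does not rank them.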
  • 20. MIND GAMES
“As far as the theory of games is concerned, the principle which emerges here is that any social intercourse whatsoever has a biological advantage over no intercourse at all.”
  • 21. TYPES OF INTERACTIONS
 Procedures
 Operations
 Rituals
 Pastimes
 (Predatory) Games
  • 22. BERNE’S GAMES: STRUCTURE
 “Hands” or roles = players
 Extensive form; players move in response to each other
 Advantages
 Existential advantage: confirmation of existing beliefs
 Internal psychological advantage: direct emotional payoff
 External psychological advantage: avoiding a feared situation
 Internal social advantage: structure/position with respect to other players
 External social advantage: as above, wrt non-players
  • 23. BERNE’S GAMES: EXAMPLES
 Kick Me
 Goal: Sympathy
 Find someone to beat on you, then whine about it
 “My misfortunes are better than yours”
 Ain’t It Awful
 Can be a pastime, but also manifests as a game
 Player displays distress; payoff is sympathy and help
 Why Don’t You – Yes, But
 Player claims to want advice. Player doesn’t really want it.
 Goal: Reassurance
  • 24. THE BADGER GAME
 Now I’ve Got You, You Son Of A Bitch
 Goal: Justification (or just money)
 Three-handed version is the badger game
 Roles
 Victim
 Aggressor
 Confederate
 Moves
 Provocation → Accusation
 Defence → Accusation
 Defence → Punishment
  • 25. TROLLING
 “Schlemiel,” in Berne’s glossary
 Moves:
 Provocation → resentment
 (repeat)
 If B responds with anger, A appears justified in more anger
 If B keeps their cool, A still keeps pushing
  • 26. OTHER MONKEY GAMEBOARDS
 Social media
 Organic responses against predatory games
 Predator Alert Tool
 /r/TumblrInAction “known trolls” wiki
 Those just happen to be ones I know about
 A truly generic reputation system is probably a pipe dream
 Wikipedia
 eBay
 But for these, we have to extend the basic mathematical model.
  • 31. BOTH SPLIT [extensive-form tree diagram: both players choose Split, payoff (6800, 6800)]
  • 32. ONE SPLITS, ONE STEALS
  • 33. ONE SPLITS, ONE STEALS [tree diagram extended with the Steal branches: payoffs (0, 13600) and (13600, 0)]
  • 35. BOTH STEAL [full tree diagram: mutual Steal pays (0, 0)]
  • 36. NORMAL FORM
Also known as the Friend-or-Foe game.
1, 1 0, 2
2, 0 0, 0
Split
Steal
Split Steal
d = e > a = b > c = f = g = h
  • 38. FIRST MOVE: NICK’S CHOICE [tree diagram: Nick chooses Split or Steal; payoffs (6800, 6800), (0, 13600), (13600, 0), (0, 0)]
  • 40. SECOND MOVE: NICK’S SIGNAL [same tree: Nick signals “I’m likely to split” or “I’m likely to steal”]
  • 42. THE COMPLETE PATH [same tree: the path actually played]
  • 44. ITERATED GAMES
 Strategies now depend on payoff matrix and history
 Axelrod, 1981: how well do these strategies perform against each other over time?
 “Ecological” tournaments: players abandon bad strategies
 Rapoport: if the only information you have is how player X interacted with you last time, the best you can do is Tit-for-Tat
 TFT cannot score higher than its opponent
 Axelrod: “Don’t be envious”
 Against TFT, no one can do better than cooperate
 Axelrod: “Don’t be too clever”
  • 45. PROPERTIES
 Nice: S is a nice strategy iff it will not defect on someone who has not defected on it
 Retaliatory: S is a retaliatory strategy iff it will defect on someone who defects on it
 Forgiving: S is a forgiving strategy iff it will stop defecting on someone who stops defecting on it
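A tiny Axelrod-style match makes both the properties and the "TFT cannot score higher than its opponent" point concrete. This Python sketch is mine, not Axelrod's exact tournament; it assumes the conventional payoffs T=5, R=3, P=1, S=0 over 200 rounds:

```python
# Strategies are functions of the opponent's move history (True = cooperate).
def tit_for_tat(opp_history):
    # Nice, retaliatory, and forgiving: copy the opponent's last move.
    return opp_history[-1] if opp_history else True

def all_cooperate(opp_history):
    return True

def all_defect(opp_history):
    return False

# Conventional IPD payoffs: (my move, their move) -> (my score, their score).
PAYOFF = {(True, True): (3, 3), (True, False): (0, 5),
          (False, True): (5, 0), (False, False): (1, 1)}

def play_match(strat_a, strat_b, rounds=200):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = strat_a(hist_b), strat_b(hist_a)
        pa, pb = PAYOFF[(a, b)]
        score_a += pa; score_b += pb
        hist_a.append(a); hist_b.append(b)
    return score_a, score_b

print(play_match(tit_for_tat, tit_for_tat))  # (600, 600): mutual cooperation
print(play_match(tit_for_tat, all_defect))   # (199, 204): TFT loses by at most one sucker payoff
```

TFT never outscores its opponent in a single match; it wins tournaments because its pairings accumulate the mutual-cooperation reward.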
  • 46. SOCIETAL ITERATED GAME THEORY
 Ord/Blair, 2002: what happens when strategies can take into account all past interactions?
 We can express strategies in convenient first-order logic, as it turns out
 Tit-for-Tat: D(c, r, p)
 Tit-for-Two-Tats: D(c, r, p) ∧ D(c, r, b(p))
 Grim: ∃t D(c, r, t)
 Bully: ¬∃t D(c, r, t)
 Spiteful-Bully: ¬∃t D(c, r, t) ∨ ∃s (D(c, r, s) ∧ D(c, r, b(s)) ∧ D(c, r, b(b(s))))
 Vigilante: ∃j D(c, j, p)
 Police: D(c, r, p) ∨ ∃j (D(c, j, p) ∧ ¬∃k D(j, k, b(p)))
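One way to make these formulas concrete is to record the whole history as a set of defection triples (defector, victim, round), so each strategy becomes a predicate over it. A Python sketch; the representation and function names are my own reading of the formulas, with Vigilante implemented per the speaker's gloss ("defect on them if they defected on anyone last round"):

```python
# History of play as a set of (defector, victim, round) triples,
# mirroring the predicate D(c, r, t) from the slide.

def defected(history, c, r, t):
    return (c, r, t) in history

def tit_for_tat(history, me, them, prev):
    # D(c, r, p): defect iff they defected on me last round.
    return defected(history, them, me, prev)

def grim(history, me, them, prev):
    # ∃t D(c, r, t): defect iff they ever defected on me.
    return any(c == them and r == me for (c, r, t) in history)

def vigilante(history, me, them, prev):
    # Defect iff they defected on anyone last round, regardless of victim.
    return any(c == them and t == prev for (c, r, t) in history)

# Round 1: player A defects on player C. In round 2 a Vigilante B
# punishes A, but a Tit-for-Tat B does not (A never defected on B).
history = {("A", "C", 1)}
print(vigilante(history, "B", "A", 1))    # True
print(tit_for_tat(history, "B", "A", 1))  # False
print(grim(history, "C", "A", 1))         # True
```

The peacekeeping strategies differ from TFT only in which triples they quantify over: TFT filters on its own victimhood, Vigilante and Police on anyone's.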
  • 47. EVOLUTION IS A HARSH MISTRESS [ecological tournament chart: Tit-for-Tat, All-Cooperate, Spiteful-Bully]
  • 49. NICENESS AND LOYALTY
 In a society, niceness is more nuanced
 Individually nice: will not defect on someone who has not defected on it
 Meta-individually nice: will not defect on individually nice
 Communally nice: will not defect on someone who has not defected at all
 Meta-communally nice: will not defect on communally nice
 Same applies to forgiveness and retaliation
 Loyalty: will not defect on the same strategy as itself
  • 50. META-PEACEKEEPING
 Peacekeepers don’t always agree
 Police will defect on Vigilantes and vice versa
 Peacekeepers protect non-peacekeeping strategies at their own expense
[tournament chart: Police, All-Cooperate, Spiteful-Bully, Tit-for-Tat]
  • 51. REDUCTIO AD ABSURDUM: ABSOLUTIST
∃t ∃j (D(r, j, t) ⊕ D(c, j, t))
[tournament chart: Tit-for-Tat, All-Cooperate, Spiteful-Bully, Absolutist]
  • 52. ABSOLUTISM UBER ALLES
[tournament chart: Tit-for-Tat, All-Cooperate, Spiteful-Bully, Absolutist]
  • 54. TWO INTERPRETATIONS OF PROBABILITY
 Frequentist: probability is the long-term frequency of events
 Reasoning from absolute probabilities
 What happens if an event only happens once?
 Returns an estimate
 Bayesian: probability is a measure of confidence that an event will occur
 Reasoning from relative probabilities
 Returns a probability distribution over outcomes
 Update beliefs (confidence) as new evidence arrives
P(A|X) = P(X|A) P(A) / P(X)
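Bayes' rule turns into a two-line function once P(X) is expanded by the law of total probability, as in the speaker's notes. A Python sketch with hypothetical numbers (the 0.9/0.1 defection rates below are invented for illustration):

```python
def posterior(prior_a, p_x_given_a, p_x_given_not_a):
    """P(A|X) = P(X|A) * P(A) / P(X), where
    P(X) = P(X|A)P(A) + P(X|~A)P(~A)."""
    p_x = p_x_given_a * prior_a + p_x_given_not_a * (1 - prior_a)
    return p_x_given_a * prior_a / p_x

# Hypothetical: prior belief 0.5 that an opponent is a habitual defector;
# defectors defect 90% of the time, others 10%. We observe one defection.
belief = posterior(0.5, 0.9, 0.1)
print(belief)  # 0.9

# Evidence accumulates: yesterday's posterior is today's prior.
print(posterior(belief, 0.9, 0.1))
```

The second update pushes confidence to 81/82 ≈ 0.988, illustrating how Bayesian and frequentist estimates converge as evidence piles up.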
  • 55. DISTRIBUTIONS
 Probability distribution function: assigns probabilities to outcomes
 Discrete: a finite set of values (enumeration)
 Function also called a probability mass function
 Poisson, binomial, Bernoulli, discrete uniform…
 Continuous: arbitrary-precision values
 Function also called a probability density function
 Exponential, Gaussian (normal), chi-squared, continuous uniform…
 Mixed: both discrete and continuous
 Narrower distribution = greater certainty
E[Z | λ] = λ (Poisson); E[Z | λ] = 1/λ (exponential)
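The expected-value claims are easy to sanity-check by simulation; for instance, the exponential distribution with rate λ has mean 1/λ. A Python sketch using the standard library's `random.expovariate`, which takes the rate parameter directly:

```python
import random

random.seed(42)  # reproducible draws
lam = 2.0
# random.expovariate(lam) draws from the exponential distribution with
# rate lam; the slide says E[Z | lam] = 1/lam, i.e. 0.5 here.
samples = [random.expovariate(lam) for _ in range(100_000)]
mean_est = sum(samples) / len(samples)
# mean_est should land close to 1/lam = 0.5
```

With 100,000 draws, the standard error of the mean is about 0.0016, so the estimate sits well within a couple of hundredths of 0.5.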
  • 56. YOU DON’T KNOW WHAT YOU DON’T KNOW
 Game theory is great when you know the payoffs
 What can you do if you don’t know the payoffs?
 Or what the game tree looks like?
 Well…
 You usually have some educated guesses about who the players are
 You have some idea what your possible actions are, as well as the other players’
 You can look at past interactions and make inferences
 Which of these can be random variables? All of them.
 Deterministic: if all inputs are known, value is known
 Stochastic: even if all inputs are known, still random
  • 57. DON’T WAIT — SIMULATE
 Figure out what distribution to use
 Figure out what parameter you need to estimate
 Figure out a distribution for it, and any parameters
 Observing data tells you what your priors are
 Fixing values for stochastic variables
 Markov Chain Monte Carlo: sampling the posterior distribution thousands of times
  • 58. CONVERGING ON EXPECTED VALUES
 Prerequisites:
 A Markov chain with an equilibrium distribution
 A function f proportional to the density of the distribution you care about
 Choose some initial set of values for all variables (state, S)
 Modify S according to Markov chain state transitions
 If f(S’)/f(S) ≥ 1, S’ is more likely than S, so accept
 Otherwise, accept S’ with probability f(S’)/f(S)
 Repeat
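The accept/reject rule above is the Metropolis algorithm (with a symmetric proposal, so the ratio needs no correction term). A minimal Python sketch of that loop, targeting an unnormalized Gaussian density; the step size and sample counts are illustrative choices of mine:

```python
import math
import random

def metropolis(f, x0, steps, scale=1.0):
    """Sample a density proportional to f using the slide's rule:
    accept S' outright if f(S')/f(S) >= 1, otherwise accept it
    with probability f(S')/f(S)."""
    x, chain = x0, []
    for _ in range(steps):
        proposal = x + random.gauss(0.0, scale)  # symmetric proposal
        ratio = f(proposal) / f(x)
        if ratio >= 1 or random.random() < ratio:
            x = proposal
        chain.append(x)  # on rejection, the old state repeats
    return chain

random.seed(1)  # reproducible
# Unnormalized standard normal density: the 1/sqrt(2*pi) constant cancels
# in the ratio, which is exactly why f only needs to be proportional.
chain = metropolis(lambda x: math.exp(-x * x / 2), 0.0, 50_000)
mean_est = sum(chain) / len(chain)
var_est = sum((x - mean_est) ** 2 for x in chain) / len(chain)
# mean_est should be near 0 and var_est near 1
```

This is the same machinery the Haskell `mcmc` calls on the next slides rely on, just with the conditioning plumbing stripped away.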
  • 59. A GAME WITHOUT PAYOFFS
type Outcome = Measure (Bool, Bool)
type Trust = Double
type Strategy = Trust -> Bool -> Bool -> Measure Bool

tit :: Trust -> Bool -> Bool -> Measure Bool
tit me True _ = conditioned $ bern 0.9
tit me False _ = conditioned $ bern me
  • 60. CHOOSING WHICH HOLE TO FILL IN
play :: Strategy -> Strategy -> (Bool, Bool) -> (Trust, Trust) -> Outcome
play strat_a strat_b (last_a, last_b) (a, b) = do
  a_action <- strat_a a last_b last_a
  b_action <- strat_b b last_a last_b
  return (a_action, b_action)

iterated_game :: Measure (Double, Double)
iterated_game = do
  let a_initial = False
  let b_initial = False
  a <- unconditioned $ uniform 0 1
  b <- unconditioned $ uniform 0 1
  rounds <- replicateM 10 $ return (a, b)
  foldM_ (play tit tit) (a_initial, b_initial) rounds
  return (a, b)
  • 61. LET’S PLAY A GAME
games = [Just (toDyn False), Just (toDyn False), Just (toDyn False), Just (toDyn True),
         Just (toDyn False), Just (toDyn False), Just (toDyn False), Just (toDyn True),
         Just (toDyn False), Just (toDyn True), Just (toDyn False), Just (toDyn False),
         Just (toDyn False), Just (toDyn True), Just (toDyn False), Just (toDyn True),
         Just (toDyn False), Just (toDyn True), Just (toDyn False), Just (toDyn False)]

do l <- mcmc iterated_game games
   return [makeHistogram 30 (Data.Vector.fromList $ map fst (take 5000 l)) "A's paranoia",
           makeHistogram 30 (Data.Vector.fromList $ map snd (take 5000 l)) "B's paranoia"]
  • 62. HOW MUCH TINFOIL IS IN THAT HAT?
  • 63. MORE STRATEGIES
allCooperate :: Trust -> Bool -> Bool -> Measure Bool
allCooperate _ _ _ = conditioned $ bern 0.1

allDefect :: Trust -> Bool -> Bool -> Measure Bool
allDefect _ _ _ = conditioned $ bern 0.9

grimTrigger :: Trust -> Bool -> Bool -> Measure Bool
grimTrigger me True False = conditioned $ bern 0.9
grimTrigger me False False = conditioned $ bern 0.1
grimTrigger me _ True = conditioned $ bern 0.9
  • 64. STRATEGY AS A RANDOM VARIABLE
data SChoice = Tit | GrimTrigger | AllDefect | AllCooperate
  deriving (Eq, Ord, Enum, Typeable, Show)

chooseStrategy :: SChoice -> Strategy
chooseStrategy Tit = tit
chooseStrategy AllDefect = allDefect
chooseStrategy AllCooperate = allCooperate
chooseStrategy GrimTrigger = grimTrigger

strat :: Measure SChoice
strat = unconditioned $ categorical [(AllCooperate, 0.25), (AllDefect, 0.25),
                                     (GrimTrigger, 0.25), (Tit, 0.25)]
  • 65. LET’S PLAY ANOTHER GAME
iterated_game2 :: Measure (SChoice, SChoice)
iterated_game2 = do
  let a_initial = False
  let b_initial = False
  a <- unconditioned $ uniform 0 1
  b <- unconditioned $ uniform 0 1
  na <- strat
  let a_strat = chooseStrategy na
  nb <- strat
  let b_strat = chooseStrategy nb
  rounds <- replicateM 10 $ return (a, b)
  foldM_ (play a_strat b_strat) (a_initial, b_initial) rounds
  return (na, nb)

do l <- mcmc iterated_game2 games
   return [makeDiscrete (map fst (take 1000 l)) "A strategy",
           makeDiscrete (map snd (take 1000 l)) "B strategy"]
  • 67. FUTURE WORK
 Probabilistic SIPD
 Extensive form SIPD with signaling
 And channels with decidable vs. heuristic recognisers
 Coordination. Enough said.
 System 1/System 2 conflict
 Sentiment analysis → payoff data
 Start small: the stroke is the smallest unit of interaction
 Data where information about players is limited
 IP flows
 Anonymity networks
 Signaling game about type: are two actors the same person?

Editor's Notes

  1. This is mostly a talk about game theory, founded by John von Neumann and Oskar Morgenstern in 1944. Game theory is part of econ, which is way more than just macro/micro “where money goes.” Weird that the study of decision-making is called “the dismal science,” though to be fair the more you look at the problem of allocating finite resources, the more hard truths you run up against about physics and human nature. Game theory provides a framework for refining our decision-making models as more information about data’s structure comes in.
  2. “The circus” = social media. I’m largely giving this talk because I’m tired of assholes being better at coordination than people who aren’t assholes. Keith Alexander is consulting for $600K/month on the grounds of some kind of behaviour analysis secret sauce. So, other people are thinking about these problems too.
  3. Keep the Shannon/Weaver model of communication in your head: two endpoints communicating over a possibly noisy channel of finite bandwidth, who have to serialize their messages to the channel and parse incoming messages off the channel. Both serialization and parsing can produce errors. This isn’t really a langsec talk, but we’ll still be talking about boundaries of competence. In a signaling game, how much confidence you can have in the signal you received being the one that was transmitted depends on how reliably you can receive signals in the language of the channel – and how reliably the sender serializes them. We won’t be getting all that deeply into feedback loops, but if you know how they work, keep them in mind. I kinda lied about the only math you need being the ability to compare two numbers; it’ll help later in the talk if you can read first-order logic notation, but it’s not really necessary.
  4. 1: I.e., effects on the environment. 2: So important, they named a class of games after them. 3: The quality of your data is really important here. 4: Langsec won’t be making much of an appearance in this talk, but when all the agents are machines, it’s relevant. Who do you think is going to be driving all those automated exploit generators DARPA is soliciting? People? At first, maybe, but not for long. Drones are expensive and hard to build. More servers are not. And in any case, being able to tell where FLT matters and where it doesn’t is an important distinction. Decidable problems are priceless; for everything else there’s heuristics, and when those inevitably fail, there’s Mastercard. 5: Game theory is the framework we’ll be building up this knowledge around, but we’ll be pulling from all the fields I mentioned earlier.
  5. The four elements at the top are all you need to define a game. Strategies and equilibria are derived from the structure of the game you’re playing.
  6. Behavior strategies and mixed strategies are functionally equivalent as long as the player has perfect recall. (Kuhn’s theorem) So behavior strategies are a bit more like how people act in real life.
  7. First described in 1950 by Merrill Flood and Melvin Dresher Four payoffs: Temptation, for screwing the other guy, Reward, for cooperating, Punishment, for defecting, and Sucker, for being defected on. Because Reward > Punishment, mutual cooperation is better than mutual defection Because Temptation > Reward and Punishment > Sucker, defection is the dominant strategy for both agents It’s a dilemma because mutual cooperation is better than mutual defection, but at the *individual* level, defection is superior to cooperation.
  8. Basically rock-paper-scissors but with only two options. There is no pure strategy that is a best response here, since what you always want is to choose the opposite of what your opponent picked.
  9. Here, the mutually beneficial outcome is also the dominant outcome: there is no conflict between self-interest and mutual benefit. Still, it’s an interesting basis for a signaling game, since there’s still some incentive to screw the other guy.
  10. The classic social cooperation game, originally described by Jean-Jacques Rousseau. Two pure-strategy equilibria: both cooperate or both defect. Cooperating is payoff dominant, defecting is risk dominant.
  11. Chicken is more of an “anti-coordination game” – choosing the same action creates negative externalities, so you want to not coordinate
  12. Proposed by John Maynard Smith and George Price in 1973 in Nature to describe conflict among animals over resources V is the value of the contested resource, C is the cost of getting into a fight Often considered as a signaling game – there’s a round of threatening each other before choosing their moves
  13. Also known as “conflicting interest coordination” One partner wants to go to the opera, the other wants to go to the ball game, but they’d both rather be together than go to different events. They forgot which one to go to, each knows that the other forgot, and they can’t communicate. Where should each go? Two pure strategy equilibria: both opera or both football. But this is unfair, since one person consistently gets a higher payoff than the other. One mixed strategy: go to your preferred event with 60% probability. But this is inefficient, because players miscoordinate 52% of the time, so the expected utility is 1.2, which is worse than if either person always goes to their non-preferred event.
  14. Types of games overlap in various ways Zero-sum: the gains/losses of all players balance out to zero. Matching Pennies is zero-sum; Prisoner’s Dilemma and Stag Hunt are non-zero sum. All zero-sum games are competitive; non-zero-sum games can be competitive or noncompetitive An action is just an action. There’s nothing inherently good or bad about choosing Heads or Tails in Matching Pennies; the morality of snitching in PD depends on your ethical framework around snitching, the morality of going off to hunt rabbits in Stag Hunt depends on whether you agreed to hunt a stag beforehand and how seriously you take keeping your word. As we go on, we’ll look at more complicated games – ones that go on longer, have more players, where players have uncertain information about each other, and even ones where the game being played changes form as the game goes on.
  15. Cournot equilibrium: Antoine Augustin Cournot, 1838. He was talking about businesses, e.g. factories, but it generalises. Nash equilibrium: nobody can do better by changing their strategy. In the Prisoner’s Dilemma, this is clear: any player who wants to cooperate knows that the other guy can defect on him and screw him, so he’s better off defecting. A subgame is a subset of the tree of a game. In subgame perfect equilibrium, all subgames have a Nash equilibrium. Start at the outcomes, work backward, removing branches that involve a player making a non-optimal move. “Trembling hand” – i.e., you might miss and hit the big red button instead
  16. Traditional game theory assumes that all agents are rational. But in the 1960s, Eric Berne looked at irrational games – the sorts of social games that people entice each other into for attention, sympathy, and other kinds of psychological payoffs, while hiding their true motives. Berne drops the assumption that players are driven by the most rational angels of their nature, and looks at the payoffs of ulterior-motive social games as ways for players to satisfy unmet emotional needs. So in effect we’re now considering players to have two sets of preferences that impact their decision-making: one that the rational System 2 uses when making considered decisions, one that the prerational System 1 uses when making quick heuristic decisions.
  17. Humans are social animals. We all have biological drives to interact with other members of our species to some extent or another – and when that drive is demanding to be satisfied, an argument can serve the same purpose as a productive discussion or even a hug, if what a person is fundamentally looking for is external recognition that they exist. “Payoff” comes in the form of neurotransmitter activity. Berne didn’t go into that, and the imaging equipment we need to investigate this directly doesn’t exist yet, but we can black-box it (Skinner-box it?) with behaviorism: each player experiences some consequences from each interaction, as reinforcement or as punishment. Positive reinforcement – a rewarding stimulus (a chocolate, a kiss, &c). Negative reinforcement – removal of an aversive stimulus (eg when someone stops yelling at you). Positive punishment – an aversive stimulus. Negative punishment – removal of a rewarding stimulus. Berne identified stimulus hunger, recognition hunger, and structure hunger. Status hunger is probably a combination of the latter two.
  18. Procedure: a series of complementary transactions toward some physical end. Operation: a set of transactions undertaken for a specific, stated purpose. If you ask explicitly for something, like reassurance or support, and you get it, that’s an operation. Ritual: “a stereotyped series of simple complementary transactions programmed by external social forces” Pastime: an iterated ritual, with state; can turn into status gaming (establishment of a “pecking order”) People spend a *lot* of time on pastimes – that’s why they’re called that. Facebook is largely a pastime for most people. So is Twitter. When different clusters’ pastimes collide, you get fireworks because pastimes have a ritual quality (jargon, signaling certain beliefs, &c) and people don’t know what pre-existing state they’re walking into. Game: “an ongoing series of complementary ulterior transactions progressing to a well-defined predictable outcome.” IOW, the initiator of the game has a goal in mind and isn’t being upfront about it. If you ask for reassurance and then turn that against the person, that’s a game.
  19. Berne’s work is pretty heavily based in Freud; he’s got this parent/child/adult triad of “ego states”, and posits that people fall into authoritarian parent modes or contrarian child modes when they play power games with each other. It’s kind of a just-so story, so we’re not really going to get into it. But we will look at the roles that the context of various mind games establishes for the players. Since games are a series of complementary ulterior transactions, that means there’s turn-taking. Each move is considered to be a stroke, i.e., something that affects the other player in some way. Advantages ~ payoffs. Existential advantage is that sense that events in the world are confirming your beliefs about how the world works, even if you manipulated the events to that end. Emotional payoff here is analogous to positive reinforcement, external psychological advantage is analogous to negative reinforcement. If you win the game, you’re raising the likelihood that you’ll behave that way again, because you’ve reinforced the evidence that playing games works. Internal and external social advantage are about status and limiting other players’ moves. If you signal as “oppressed”, people who prioritize oppression will limit what they do on your behalf.
  20. “Ain’t It Awful” taken to the pathological extreme manifests as things like Munchausen syndrome or M-by-proxy In “Why Don’t You – Yes But”, the initiator really wants reassurance that their problem is not their fault, but they get it manipulatively by challenging people to present solutions they can’t find fault with. Obviously they can nitpick anything to death. “Courtroom” – pick a victim/scapegoat and pick them apart, most effectively in front of a “jury of their peers”
  21. Introduce the idea of changing the game here – the mark thinks it’s one game (the one where if he wins he gets laid at the end), but what he doesn’t know is that he’s playing a different game (the one where if he wins he doesn’t get beaten up but does lose his wallet). Can be played with just a victim and an aggressor, as long as the victim does something that the aggressor can construe as the victim screwing up in some way Confederate lures the victim into provoking the aggressor.
  22. Often about getting the target to embarrass themselves in some way – typically by overreacting and saying something they’ll regret later. (I’m doubtful as to whether the target ever does actually regret it later, but we’ll set that aside for now.) Berne talks about there being an “apology->forgiveness” phase of the game, though trolls really aren’t in it for the forgiveness. So this might be better considered a modification. Note that a troll’s actions revolve around sending signals to some receiver in an attempt to provoke an overreaction. Engaging is therefore a feedback loop providing the troll with more material to feed into its signal generation function. Proceed with caution. And on that note, let’s take a closer look at the class of games that we can use to model interactions involving two-way communication: signaling games.
  23. Get it out of your system now, because you’re going to hear “balls” more often than any other noun in the clips that follow. I counted.
  24. This is the beginning of an extensive form game tree for this game. The unfilled dot in the center is the root. It indicates who makes the first move – in this case player 1. Traditionally the first move is made by “Nature” and is taken to be the type of the player – in a job interview, whether the candidate being interviewed is competent or incompetent; when you buy someone a drink, whether they’re interested in you or not interested in you; when you’re deciding whether to tell someone a secret, whether they’re trustworthy or untrustworthy. But since player 1 has already decided whether he’s going to split or steal, he’s making the first move.
  25. Similar to Prisoner’s Dilemma, except that if you decide to screw each other, you both get screwed just as badly as you would if you cooperated but the other guy defected. Being a sucker isn’t any worse for you – materially, at least – than betting you can screw the other guy and being wrong.
  26. Poll the audience after this segment is over. What do they think Ibrahim will pick? What do they think Nick will pick? Radiolab interviewed both these guys after the show. In the studio, the argument went on for 45 minutes and the audience was booing Nick over and over again. He stuck to his guns the whole time, so in uncompressed time, his signal was fairly unambiguous.
  27. We don’t know whether Nick has actually chosen Split or Steal at this point. He’s signaled unambiguously that he plans to steal, which means that if Ibrahim decides his signal is credible, Ibrahim can only operate on the lower right quadrant of the graph. At this point, Nick’s signal has changed the structure of the game they’re playing: it’s no longer Friend-or-Foe, it’s Ultimatum. <stuff about Ultimatum here> So the risk Nick is taking now is whether Ibrahim will decide that the ultimatum is so insulting that he should punish Nick by forcing them both to go home with nothing, or whether the promise of £6800 after the show is a credible enough incentive that he should cooperate. Takeaway: extensive form helps you see how a game’s structure changes as branches of the decision tree are pruned away
  28. Axelrod’s initial tournaments just played strategies against each other 200x and totaled up points at the end. In ecological (or evolutionary) tournaments, each strategy’s success in the previous round determines how prevalent it is in the current round – and cooperative strategies outcompeted non-cooperative ones. It would be really great if players in the real world abandoned bad strategies as soon as they recognised the strategies weren’t working, but in practice people are actually pretty bad at recognising this. People are unusually invested in the strategies they choose. Confirmation bias, choice-supportive bias, &c. Complex inferences just didn’t work very well – the inferences were usually wrong.
  29. In Axelrod’s IPD, success – i.e., doing the best you can possibly do – requires a strategy that satisfies all these properties. Such strategies also outcompete strategies that don’t satisfy these properties. But can we do better than an eye for an eye and a tooth for a tooth? Certainly in the real world there are plenty of people whose modus operandi is moving from victim to victim, opportunistically defecting whenever they think they can get away with it; and remember Berne’s games. Are there strategies that can incorporate other information to expose social predators?
  30. c is the column player, r is the row player (ie you); p is the last round, b() is a predecessor function TFT: “Defect on them if they defected on me last round.” TFTT: “Defect on them if they defected on me last round and the round before.” Grim: “Defect on them if they ever defected on me in the past.” Bully: “Defect on them if they’ve *never* defected on me in the past.” Spiteful-Bully similar, but also defects if it’s been defected on 3x Vigilante: “Defect on them if they defected on anyone else last round.” Police: “Defect on them if they defected on me last round, or if last round they defected on someone who had just cooperated with everyone.” Vigilante and Police are peacekeeping strategies: they ignore who someone defected on, only care that they did it
  31. All individually nice strategies are communally nice, but not necessarily vice versa. All individually forgiving strategies are communally forgiving, and all communally retaliatory strategies are individually retaliatory. Individually retaliatory: defects on someone who defects on it. Communally retaliatory: defects on someone who defects on anyone. Individually forgiving: stops defecting on someone who stops defecting on it Communally forgiving: stops defecting on someone who stops defecting on everyone TFT is loyal; if it plays another TFT, they’ll cooperate forever. Same for Police, but Vigilante is not loyal – Vigilantes will defect on other Vigilantes. TFT is individually nice, retaliatory and forgiving; Vigilante is communally nice, retaliatory and forgiving.
  32. Absolutist: “Defect on c iff c has ever cooperated with someone when you defected, or vice versa.” Absolutist is loyal: it doesn’t defect on other Absolutists of its own kind. Note that if you put two groups of Absolutists into a population, they’ll defect on each other. It’s also unforgiving: it never stops defecting on someone once it’s started, like Grim. Neither individually nice nor communally nice, since it will defect on All-C (cooperated in the past with a defector) Really only works when there’s no noise in players’ information or actions
  33. The frequentist perspective operates under the assumption that the long-run relative frequency of an event occurring can be known. The Bayesian interpretation is a subjective one, depending entirely on the information available to the agent. For a large enough number of samples – as evidence accumulates – the Bayesian and frequentist interpretations typically converge. But you don’t always have that many samples to work with. Really big data problems can be solved by frequentist analysis, but for medium-sized and really small data, Bayesian analysis performs much better. A is the parameters, X is the evidence. P(A): the prior probability of A – a belief, i.e., a measure of confidence. P(A|X): the posterior probability of A given X – the conditional probability of A, updated with evidence X. P(X|A): the likelihood – the conditional probability of the evidence X given the parameters A. (Avoiding the post hoc ergo propter hoc fallacy, statistically.) P(X) decomposes to P(X|A)P(A) + P(X|~A)P(~A): the probability that X occurs whether A holds or not.
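The decomposition of P(X) is what makes Bayes’ rule computable in practice. A worked numeric example, with made-up numbers purely for illustration:

```python
# Posterior belief in hypothesis A after observing evidence X.
# All probabilities here are invented for the example.

p_a = 0.3              # P(A): prior belief in A
p_x_given_a = 0.8      # P(X|A): likelihood of the evidence if A holds
p_x_given_not_a = 0.2  # P(X|~A): likelihood of the evidence if A doesn't hold

# P(X) = P(X|A)P(A) + P(X|~A)P(~A): X can occur whether A holds or not.
p_x = p_x_given_a * p_a + p_x_given_not_a * (1 - p_a)

# Bayes' rule: P(A|X) = P(X|A)P(A) / P(X)
p_a_given_x = p_x_given_a * p_a / p_x
```

Here the evidence is four times likelier under A than under ~A, so observing X roughly doubles our confidence in A (from 0.3 to about 0.63).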
  34. Probability mass function: gives the probability that a discrete random variable takes some particular value. Poisson is basically the bell curve for discrete outcomes; binomial gives the probability of an event occurring some number of times over N trials, given probability p that it occurs in one trial; Bernoulli is binomial with one trial. The expected value of Z in the Poisson distribution is equal to its parameter, lambda; in the exponential distribution, it’s equal to the inverse of the parameter. Probability density function: characterizes a continuous random variable, whose probability of falling in a range is the integral of its density over that range. All that we see is Z. We have to estimate lambda, and that’s why Bayesian analysis is useful: it gives us tools for updating our beliefs about lambda even though we can’t observe it directly. Figuring out the right distribution to use with your data is important; there are a lot of them, useful in different situations, and that’s outside the scope of this talk.
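The Poisson pmf is simple enough to write down directly, and the two properties above (the pmf sums to 1, and E[Z] = lambda) are easy to check numerically. A small sketch:

```python
import math

def poisson_pmf(k, lam):
    """P(Z = k) for a Poisson random variable with rate lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

lam = 4.0
# Truncating the infinite sum at k = 100 leaves a negligible tail for lam = 4.
total = sum(poisson_pmf(k, lam) for k in range(100))       # should be ~1.0
mean = sum(k * poisson_pmf(k, lam) for k in range(100))    # should be ~lam
```

Given observed samples of Z, the sample mean is the natural point estimate of lambda; Bayesian analysis goes further by keeping a whole distribution of belief over lambda instead of a single number.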
  35. We’re treating “input” here as anything that influences the value of a variable. Deterministic entails decidability.
  36. So you’ve got some data! What are you going to do with it? Questions to ask yourself when modeling: What am I interested in? What does it look like? What influences it? Data conditions the values of random variables: the conditional distribution of Y given X is the probability distribution of Y when X is known to be a particular value. You can keep on assigning distributions to parameters as long as it’s useful, but if you don’t have any strong beliefs about a parameter, this is probably not useful. Pick an average value and let inference update it for you, or use a uniform distribution for it and infer what its value is likely to be – it’s just another prior, after all. Monte Carlo simulation: co-invented by John von Neumann (with Stanislaw Ulam). In ordinary Monte Carlo, variables are independent and identically distributed; sample and average. In MCMC, variables can condition each other, and the conditioning defines the chain. When you combine probabilities, you’re reducing the effective volume of your search space; MCMC helps you narrow the search to the areas where you’re likely to find values that satisfy the data and the conditions.
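To make MCMC concrete: a hand-rolled toy Metropolis sampler (a sketch of the idea, not the machinery the talk actually uses) inferring a Bernoulli parameter under a uniform prior. Each proposed step is accepted or rejected based on how well it explains the data, which is exactly the “narrowing the search” behavior described above.

```python
import math
import random

def log_posterior(p, data):
    """Uniform prior on p in (0,1) plus Bernoulli log-likelihood of the data."""
    if not 0 < p < 1:
        return float("-inf")
    heads = sum(data)
    tails = len(data) - heads
    return heads * math.log(p) + tails * math.log(1 - p)

def metropolis(data, steps=20000, width=0.1, seed=0):
    """Random-walk Metropolis over p; returns samples after burn-in."""
    rng = random.Random(seed)
    p = 0.5  # arbitrary valid starting point
    samples = []
    for _ in range(steps):
        proposal = p + rng.uniform(-width, width)
        delta = log_posterior(proposal, data) - log_posterior(p, data)
        # Accept uphill moves always, downhill moves with probability exp(delta).
        if delta >= 0 or rng.random() < math.exp(delta):
            p = proposal
        samples.append(p)
    return samples[steps // 2:]  # discard the first half as burn-in
```

With data that is 70% ones, the retained samples cluster around 0.7 – the chain has converged to the region the evidence supports.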
  37. With this definition, the payoffs are completely hidden; all we assume is that the players consider some actions to be “cooperating” and others to be “defecting,” and that whether they consider an action to be cooperative or defecting is conditioned on how trusting they are. In this case, a higher value means “more paranoid.” If the other player defects on them (the True case), then the probability distribution of this player defecting is a Bernoulli distribution with p = 0.9 – this parameter could have been a random variable as well, but for this toy example we’re fixing its value. If the other player cooperates, then the probability that this player defects is also a Bernoulli distribution, with p = whatever the player’s paranoia is.
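This strategy is small enough to write as a one-line sampler. A sketch; the function name and interface are mine, not the talk’s:

```python
import random

def defects(paranoia, opponent_defected_last, rng=random):
    """Sample this round's action (True = defect).

    If the opponent defected last round, defect with fixed probability 0.9
    (a parameter we could also have made a random variable); otherwise,
    defect with probability equal to this player's paranoia.
    """
    p = 0.9 if opponent_defected_last else paranoia
    return rng.random() < p
```

A quick simulation confirms the two Bernoulli branches: after a cooperation, a player with paranoia 0.2 defects about 20% of the time; after a defection, about 90%.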
  38. Here, a and b are the two players’ paranoia values; we don’t know what they are, we just know that they’re chosen uniformly from values between 0 and 1, inclusive. When we sample hypothetical games with these players, each game will last 10 rounds. The actions sampled will converge on the strategy we defined on the last slide – defecting based on whether the other player defected the last round, conditioned by how paranoid this player is – and from the values we observe in the samples after Markov chain convergence (hopefully!), we can get a better estimate of how paranoid each player is.
  39. For Grim Trigger, the fact that we’ve defected on a previous round tells us that we should continue to defect on that person. Note that we’re not making this conditional on paranoia.
  40. Probabilistic SIPD: how large a sample do we actually need to infer a player’s strategy? Inference about whether System 1 or System 2 is influencing a player’s actions will require modeling the preferences and strategies of each system separately, and modeling how they interact.