Strategies Without Frontiers

Meredith L. Patterson
BSidesLV
August 5, 2014

 I hate boring problems
 I especially hate solving tiny variations on the same
boring problem over and over again
 The internet is full of the same boring problems over
and over again
 Both in the cloud …
 … and in the circus
 Not my circus, not my monkeys
MOTIVATION

 Information theory
 Probability theory
 Formal language theory (of course)
 Control theory
 First-order logic
 Haskell
ALSO APPEARING IN THIS TALK

 When an unknown agent acts, how do you react?
 Observation of side effects
 Signals the agent sends
 Past interactions with others
 Formal language theory
(if you’re a computer)
 Systematic knowledge about the
structure of interactions and the
incentives involved in them
IT IS PITCH BLACK. YOU ARE LIKELY TO
BE EATEN BY A GRUE.

 Everything You Actually Need to Know About
Classical Game Theory
 in math …
 … and psychology
 Changing the Game
 Extensive form and signaling games
 Multiplayer and long-running games
 Reasoning Under Uncertainty, Over Real Data
OUTLINE

EVERYTHING YOU ACTUALLY
NEED TO KNOW ABOUT
CLASSICAL GAME THEORY

 Players
 Information available at each decision point
 Possible actions at each decision point
 Payoffs for each outcome
 Strategies (pure or mixed)
 Or behaviour, in iterated or turn-taking games
 Equilibria
 Different kinds of games have different kinds of equilibria
WHAT’S IN A GAME?

A NORMAL FORM GAME
            Cooperate   Defect
Cooperate   a, b        c, d
Defect      e, f        g, h

 Pure strategy: fully specified set of moves for every
situation
 Mixed strategy: probability distribution over pure
strategies; play follows a random path through the game tree
 Behaviour strategies: probabilities assigned to the moves
at each information set
STRATEGIES

PRISONER’S DILEMMA
            Cooperate   Defect
Cooperate   -1, -1      -3, 0
Defect      0, -3       -2, -2
d, e > a, b > g, h > c, f

MATCHING PENNIES
        Heads    Tails
Heads   1, -1    -1, 1
Tails   -1, 1    1, -1
a = d = f = g > b = c = e = h

DEADLOCK
            Cooperate   Defect
Cooperate   1, 1        0, 3
Defect      3, 0        2, 2
e > g > a > c and d > h > b > f

STAG HUNT
       Stag    Hare
Stag   2, 2    0, 1
Hare   1, 0    1, 1
a = b > d = e = g = h > c = f

CHICKEN
           Swerve   Straight
Swerve     0, 0     -1, 1
Straight   1, -1    -10, -10
e > a > c > g and d > b > f > h

HAWK/DOVE
        Share       Fight
Share   V/2, V/2    0, V
Fight   V, 0        (V−C)/2, (V−C)/2
e > a > c > g and d > b > f > h
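One consequence of these payoffs, worked as a short derivation. It assumes C > V, which is the case in which the ordering e > a > c > g holds. Neither Share nor Fight is then a best reply to itself, so the symmetric equilibrium is mixed. The symbol p (the probability that the opponent plays Fight) is introduced here for illustration and is not on the slide.

```latex
\begin{aligned}
E[\text{Share}] &= (1-p)\,\frac{V}{2} + p \cdot 0 \\
E[\text{Fight}] &= (1-p)\,V + p\,\frac{V-C}{2} \\
E[\text{Share}] = E[\text{Fight}]
  &\;\Longrightarrow\; (1-p)\,V = 2(1-p)\,V + p\,(V-C)
  \;\Longrightarrow\; p = \frac{V}{C}
\end{aligned}
```

So the costlier the fight relative to the prize, the less often it pays to fight.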

BATTLE OF THE SEXES
           Opera   Football
Opera      3, 2    0, 0
Football   0, 0    2, 3
(a > g and h > b) > c = d = e = f

 Games can be zero-sum or non-zero-sum
 Games can be about conflict or cooperation
 Actions are not inherently morally valenced
 Payoffs determine type of game, strategy
WHAT HAVE WE SEEN SO FAR?

 Cournot equilibrium: each actor’s output maximizes
its profit given the outputs of other actors
 Nash equilibrium: each actor is making the best
decision they can, given what they know about each
other’s decisions
 Subgame perfect equilibrium: eliminates
non-credible threats
 Trembling hand equilibrium: considers the possibility
that a player might make an unintended move
EQUILIBRIUM
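A minimal sketch of the Nash condition for pure strategies, applied to four of the payoff matrices from the earlier slides. The names `Game`, `pureNash` and the list-of-lists encoding are illustrative choices, not something from the talk; the check is simply "neither player can do better by changing only their own move".

```haskell
type Payoff = (Int, Int)   -- (row player's payoff, column player's payoff)
type Game   = [[Payoff]]   -- outer list: row moves; inner list: column moves

-- All (row, column) index pairs where neither player gains by deviating alone.
pureNash :: Game -> [(Int, Int)]
pureNash g =
  [ (i, j)
  | i <- rows, j <- cols
  , all (\i' -> fst (cell i' j) <= fst (cell i j)) rows   -- row player cannot improve
  , all (\j' -> snd (cell i j') <= snd (cell i j)) cols   -- column player cannot improve
  ]
  where
    rows = [0 .. length g - 1]
    cols = case g of
      (r : _) -> [0 .. length r - 1]
      []      -> []
    cell i j = g !! i !! j

-- Payoffs copied from the slides; index 0 is the first move listed, 1 the second.
prisonersDilemma, matchingPennies, stagHunt, chicken :: Game
prisonersDilemma = [[(-1, -1), (-3, 0)], [(0, -3), (-2, -2)]]
matchingPennies  = [[(1, -1), (-1, 1)], [(-1, 1), (1, -1)]]
stagHunt         = [[(2, 2), (0, 1)], [(1, 0), (1, 1)]]
chicken          = [[(0, 0), (-1, 1)], [(1, -1), (-10, -10)]]
```

In GHCi, `pureNash prisonersDilemma` gives `[(1,1)]` (mutual defection), `pureNash stagHunt` gives both coordinated outcomes, and `pureNash matchingPennies` gives `[]`, since that game only has an equilibrium in mixed strategies.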

TRANSACTIONAL
ANALYSIS:
GAMES PEOPLE PLAY

MIND GAMES
“As far as the theory of games is concerned,
the principle which emerges here is that any
social intercourse whatsoever has a biological
advantage over no intercourse at all.”

 Procedures
 Operations
 Rituals
 Pastimes
 (Predatory) Games
TYPES OF INTERACTIONS

 “Hands” or roles = players
 Extensive form; players move in response to each
other
 Advantages
 Existential advantage: confirmation of existing beliefs
 Internal psychological advantage: direct emotional payoff
 External psychological advantage: avoiding a feared
situation
 Internal social advantage: structure/position with respect
to other players
 External social advantage: as above, wrt non-players
BERNE’S GAMES: STRUCTURE

 Kick Me
 Goal: Sympathy
 Find someone to beat on you, then whine about it
 “My misfortunes are better than yours”
 Ain’t It Awful
 Can be a pastime, but also manifests as a game
 Player displays distress; payoff is sympathy and help
 Why Don’t You – Yes, But
 Player claims to want advice. Player doesn’t really want it.
 Goal: Reassurance
BERNE’S GAMES: EXAMPLES

 Now I’ve Got You, You Son Of A Bitch
 Goal: Justification (or just money)
 Three-handed version is the badger game
 Roles
 Victim
 Aggressor
 Confederate
 Moves
 Provocation → Accusation
 Defence → Accusation
 Defence → Punishment
THE BADGER GAME

 “Schlemiel,” in Berne’s glossary
 Moves:
 Provocation → resentment
 (repeat)
 If B responds with anger, A appears justified in more
anger
 If B keeps their cool, A still keeps pushing
TROLLING

 Social media
 Organic responses against predatory games
 Predator Alert Tool
 /r/TumblrInAction “known trolls” wiki
 Those just happen to be ones I know about
 A truly generic reputation system is probably a pipe dream
 Wikipedia
 eBay
 But for these, we have to extend the basic
mathematical model.
OTHER MONKEY GAMEBOARDS

BOTH SPLIT
[Figure: extensive-form game tree for players A and B; both choose Split, payoffs 6800, 6800]

ONE SPLITS, ONE STEALS
[Figure: extensive-form game tree for players A and B; Split/Split pays 6800, 6800; Split/Steal pays 0, 13600; Steal/Split pays 13600, 0]

BOTH STEAL
[Figure: extensive-form game tree for players A and B; Split/Split pays 6800, 6800; Split/Steal pays 0, 13600; Steal/Split pays 13600, 0; Steal/Steal pays 0, 0]

NORMAL FORM
Also known as the Friend-or-Foe game.
        Split   Steal
Split   1, 1    0, 2
Steal   2, 0    0, 0
d = e > a = b > c = f = g = h
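The tie c = f = g = h is what separates this game from the Prisoner's Dilemma: Steal is never worse than Split, but it is not always strictly better. A small sketch of that distinction follows; the function and value names are illustrative, not from the talk.

```haskell
-- Each list holds one move's payoffs to the row player, one entry per opponent move.
strictlyDominates, weaklyDominates :: [Int] -> [Int] -> Bool
strictlyDominates xs ys = and (zipWith (>) xs ys)
weaklyDominates   xs ys = and (zipWith (>=) xs ys) && or (zipWith (>) xs ys)

-- Friend-or-Foe payoffs from the matrix on this slide (opponent plays Split, then Steal).
splitFoF, stealFoF :: [Int]
splitFoF = [1, 0]
stealFoF = [2, 0]

-- Prisoner's Dilemma payoffs, for contrast (opponent cooperates, then defects).
cooperatePD, defectPD :: [Int]
cooperatePD = [-1, -3]
defectPD    = [0, -2]
```

Here `weaklyDominates stealFoF splitFoF` holds while `strictlyDominates stealFoF splitFoF` does not; in the Prisoner's Dilemma, `strictlyDominates defectPD cooperatePD` holds. Against a player who steals, your own choice no longer changes your payoff.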

FIRST MOVE: NICK’S CHOICE
[Figure: game tree with the signals “I’m likely to split” and “I’m likely to steal”; Split/Split pays 6800, 6800; Split/Steal pays 0, 13600; Steal/Split pays 13600, 0; Steal/Steal pays 0, 0]

SECOND MOVE: NICK’S SIGNAL
[Figure: game tree; Split/Split pays 6800, 6800; Split/Steal pays 0, 13600; Steal/Split pays 13600, 0; Steal/Steal pays 0, 0]

THE COMPLETE PATH
[Figure: game tree; Split/Split pays 6800, 6800; Split/Steal pays 0, 13600; Steal/Split pays 13600, 0; Steal/Steal pays 0, 0]

GAMES IN THE
TRANSPARENT SOCIETY

 Strategies now depend on payoff matrix and history
 Axelrod, 1981: how well do these strategies perform
against each other over time?
 “Ecological” tournaments: players abandon bad strategies
 Rapoport: if the only information you have is how
player X interacted with you last time, the best you
can do is Tit-for-Tat
 TFT cannot score higher than its opponent
 Axelrod: “Don’t be envious”
 Against TFT, no one can do better than cooperate
 Axelrod: “Don’t be too clever”
ITERATED GAMES
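A minimal sketch of an iterated match, using the Prisoner's Dilemma payoffs from the earlier slide. The types and names (`Strat`, `match`, `alwaysDefect`, `alwaysCooperate`) are illustrative and not taken from Axelrod's tournament code; the point is only to make "TFT cannot score higher than its opponent" checkable.

```haskell
data Move = C | D deriving (Eq, Show)

-- A strategy sees (own past moves, opponent's past moves), most recent first.
type Strat = [Move] -> [Move] -> Move

titForTat, alwaysDefect, alwaysCooperate :: Strat
titForTat _ (m : _) = m      -- copy the opponent's last move
titForTat _ []      = C      -- open by cooperating
alwaysDefect _ _    = D
alwaysCooperate _ _ = C

-- (first player's payoff, second player's payoff) for one round.
payoff :: Move -> Move -> (Int, Int)
payoff C C = (-1, -1)
payoff C D = (-3, 0)
payoff D C = (0, -3)
payoff D D = (-2, -2)

-- Play n rounds and return the two total scores.
match :: Int -> Strat -> Strat -> (Int, Int)
match n sa sb = go n [] [] (0, 0)
  where
    go k ha hb (ta, tb)
      | k <= 0    = (ta, tb)
      | otherwise =
          let a        = sa ha hb
              b        = sb hb ha
              (pa, pb) = payoff a b
          in go (k - 1) (a : ha) (b : hb) (ta + pa, tb + pb)
```

Over ten rounds, `match 10 titForTat alwaysDefect` is `(-21,-18)`: Tit-for-Tat loses only the first round and then matches its opponent, so it never comes out ahead in a pairing, yet it also never falls far behind.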

 Nice: S is a nice strategy iff it will not defect on
someone who has not defected on it
 Retaliatory: S is a retaliatory strategy iff it will
defect on someone who defects on it
 Forgiving: S is a forgiving strategy iff it will stop
defecting on someone who stops defecting on it
PROPERTIES

 Ord/Blair, 2002: what happens when strategies can
take into account all past interactions?
 We can express strategies in convenient first-order
logic, as it turns out
 Tit-for-Tat: D(c, r, p)
 Tit-for-Two-Tats: D(c, r, p) ∧ D(c, r, b(p))
 Grim: ∃t D(c, r, t)
 Bully: ¬∃t D(c, r, t)
 Spiteful-Bully: ¬∃t D(c, r, t) ∨ ∃s (D(c, r, s) ∧ D(c, r, b(s)) ∧
D(c, r, b(b(s))))
 Vigilante: ∃j D(c, j, p)
 Police: D(c, r, p) ∨ ∃j (D(c, j, p) ∧ ¬∃k D(j, k, b(p)))
SOCIETAL ITERATED GAME THEORY
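A sketch of how some of these formulas could be evaluated against a recorded history. It assumes a reading the slide does not spell out: D(x, y, t) means "x defected on y in round t", c is the current opponent, r is the player applying the rule, p is the previous round, b(t) is the round before t, and a strategy defects exactly when its formula is true. The `History` representation and all names are illustrative.

```haskell
type Player = Int
type Time   = Int

-- Recorded defections: (who defected, on whom, in which round).
type History = [(Player, Player, Time)]

-- D(x, y, t): x defected on y in round t.
d :: History -> Player -> Player -> Time -> Bool
d h x y t = (x, y, t) `elem` h

-- A rule answers "should r defect on c now?"; arguments are history, c, r, p.
type Rule = History -> Player -> Player -> Time -> Bool

titForTat, titForTwoTats, grim, bully, police :: Rule
titForTat     h c r p = d h c r p
titForTwoTats h c r p = d h c r p && d h c r (p - 1)                  -- b(p) is p - 1
grim          h c r _ = or [ True | (x, y, _) <- h, x == c, y == r ]  -- ∃t D(c, r, t)
bully         h c r p = not (grim h c r p)                            -- ¬∃t D(c, r, t)
police        h c r p = d h c r p || or
  [ not (defectedAtAll j (p - 1))        -- ¬∃k D(j, k, b(p))
  | (x, j, t) <- h, x == c, t == p ]     -- ∃j D(c, j, p)
  where
    defectedAtAll who t = or [ True | (x, _, t') <- h, x == who, t' == t ]
```

With `h = [(3, 2, 1)]`, player 3 defected on an innocent player 2 in round 1, so `police h 3 1 1` is `True` even though player 1 was not the victim. With `h = [(2, 3, 0), (3, 2, 1)]`, player 3 was only retaliating, and `police h 3 1 1` is `False`.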

EVOLUTION IS A HARSH MISTRESS
[Figure: chart with series Tit-for-Tat, All-Cooperate, Spiteful-Bully]

PEACEKEEPING
[Figure: chart with series Police, All-Cooperate, Spiteful-Bully]

 In a society, niceness is more nuanced
 Individually nice: will not defect on someone who has not
defected on it
 Meta-individually nice: will not defect on individually nice
 Communally nice: will not defect on someone who has not
defected at all
 Meta-communally nice: will not defect on communally nice
 Same applies to forgiveness and retaliation
 Loyalty: will not defect on the same strategy as itself
NICENESS AND LOYALTY

 Peacekeepers don’t always agree
 Police will defect on Vigilantes and vice versa
 Peacekeepers protect non-peacekeeping strategies
at their own expense
META-PEACEKEEPING
[Figure: chart with series Police, All-Cooperate, Spiteful-Bully, Tit-for-Tat]

REDUCTIO AD ABSURDUM: ABSOLUTIST
∃t ∃j D(r, j, t) ⊕ D(c, j, t)
[Figure: chart with series Tit-for-Tat, All-Cooperate, Spiteful-Bully, Absolutist]

ABSOLUTISM UBER ALLES
[Figure: chart with series Tit-for-Tat, All-Cooperate, Spiteful-Bully, Absolutist]

 Frequentist: probability is the long-term frequency of
events
 Reasoning from absolute probabilities
 What happens if an event only happens once?
 Returns an estimate
 Bayesian: probability is a measure of confidence that
an event will occur
 Reasoning from relative probabilities
 Returns a probability distribution over outcomes
 Update beliefs (confidence) as new evidence arrives
TWO INTERPRETATIONS OF PROBABILITY
$$P(A \mid X) = \frac{P(X \mid A)\,P(A)}{P(X)}$$
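A small sketch of "update beliefs as new evidence arrives", using Bayes' rule with two hypotheses. The scenario and the 0.9 / 0.1 likelihoods are hypothetical numbers chosen for illustration: A is "this agent is a defector type", X is an observed move.

```haskell
-- One Bayesian update: P(A|X) = P(X|A) P(A) / P(X),
-- with P(X) expanded by the law of total probability.
bayes :: Double -> Double -> Double -> Double
bayes prior likeA likeNotA = likeA * prior / evidence
  where
    evidence = likeA * prior + likeNotA * (1 - prior)

-- Assumed likelihoods: a defector type defects 90% of the time, anyone else 10%.
observe :: Double -> Bool -> Double
observe prior defected
  | defected  = bayes prior 0.9 0.1
  | otherwise = bayes prior 0.1 0.9

-- Fold a sequence of observed moves (True = defected) into a final belief.
beliefAfter :: Double -> [Bool] -> Double
beliefAfter = foldl observe
```

Starting from a prior of 0.5, one observed defection moves the belief to about 0.9, and a following cooperation moves it back to about 0.5. Each posterior becomes the prior for the next observation.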

 Probability distribution function: assigns
probabilities to outcomes
 Discrete: a countable set of values (enumeration)
 Function also called a probability mass function
 Poisson, binomial, Bernoulli, discrete uniform…
 Continuous: arbitrary-precision values
 Function also called a probability density function
 Exponential, Gaussian (normal), chi-squared, continuous
uniform…
 Mixed: both discrete and continuous
 Narrower distribution = greater certainty
DISTRIBUTIONS
Poisson: $E[Z \mid \lambda] = \lambda$    Exponential: $E[Z \mid \lambda] = \frac{1}{\lambda}$

 Game theory is great when you know the payoffs
 What can you do if you don’t know the payoffs?
 Or what the game tree looks like?
 Well…
 You usually have some educated guesses about who the
players are
 You have some idea what your possible actions are, as
well as the other players’
 You can look at past interactions and make inferences
 Which of these can be random variables? All of them.
 Deterministic: if all inputs are known, value is known
 Stochastic: even if all inputs are known, still random
YOU DON’T KNOW WHAT YOU DON’T KNOW

 Figure out what distribution to use
 Figure out what parameter you need to estimate
 Figure out a distribution for it, and any parameters
 Observing data updates your priors into a posterior
 Fixing values for stochastic variables
 Markov Chain Monte Carlo: sampling the posterior
distribution thousands of times
DON’T WAIT — SIMULATE

 Prerequisites:
 A Markov chain with an equilibrium distribution
 A function f proportional to the density of the distribution
you care about
 Choose some initial set of values for all variables
(state, S)
 Propose a new state S’ by modifying S according to the
Markov chain’s state transitions
 If f(S’)/f(S) ≥ 1, S’ is at least as likely as S, so accept
 Otherwise, accept S’ with probability f(S’)/f(S)
 Repeat
CONVERGING ON EXPECTED VALUES
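The steps above can be sketched in plain Haskell. This is a minimal Metropolis sampler under stated assumptions, not the sampler the talk's library uses: the proposal is a symmetric uniform step of width ±1, the target f is an unnormalised standard Gaussian, and the random numbers come from a small hand-rolled linear congruential generator (`Gen`, `nextDouble`) so that the block depends only on `base`. All names are illustrative.

```haskell
import Data.Bits (shiftR)
import Data.Word (Word64)

-- Tiny 64-bit linear congruential generator; arithmetic wraps modulo 2^64.
newtype Gen = Gen Word64

-- Next uniform Double in [0, 1), built from the top 53 bits of the state.
nextDouble :: Gen -> (Double, Gen)
nextDouble (Gen s) = (fromIntegral (s' `shiftR` 11) / 9007199254740992, Gen s')
  where
    s' = s * 6364136223846793005 + 1442695040888963407

-- One Metropolis step for an unnormalised density f with a symmetric proposal.
step :: (Double -> Double) -> (Double, Gen) -> (Double, Gen)
step f (s, g0)
  | ratio >= 1 || u2 < ratio = (s', g2)   -- accept S'
  | otherwise                = (s, g2)    -- reject: stay at S
  where
    (u1, g1) = nextDouble g0
    (u2, g2) = nextDouble g1
    s'       = s + (2 * u1 - 1)           -- propose S' uniformly within ±1 of S
    ratio    = f s' / f s

-- The first n states visited after the initial state s0.
chain :: Int -> (Double -> Double) -> Double -> Gen -> [Double]
chain n f s0 g = map fst (take n (drop 1 (iterate (step f) (s0, g))))

-- Target: proportional to the standard normal density.
gaussian :: Double -> Double
gaussian x = exp (negate (x * x) / 2)

mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

variance :: [Double] -> Double
variance xs = mean [ (x - m) ^ (2 :: Int) | x <- xs ]
  where
    m = mean xs
```

For example, `chain 50000 gaussian 0 (Gen 42)` yields samples whose mean and variance should land near 0 and 1. Only the ratio f(S’)/f(S) is ever used, which is why f needs to be proportional to the density rather than equal to it.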

A GAME WITHOUT PAYOFFS
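type Outcome = Measure (Bool, Bool)
type Trust = Double
type Strategy = Trust -> Bool -> Bool -> Measure Bool

-- True means "defect". Arguments: own Trust parameter, opponent's last
-- move, own last move. After a cooperation, `me` is the probability of
-- defecting anyway.
tit :: Trust -> Bool -> Bool -> Measure Bool
tit me True _ = conditioned $ bern 0.9
tit me False _ = conditioned $ bern me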

CHOOSING WHICH HOLE TO FILL IN
play :: Strategy -> Strategy ->
        (Bool, Bool) -> (Trust, Trust) -> Outcome
play strat_a strat_b (last_a, last_b) (a, b) = do
  a_action <- strat_a a last_b last_a
  b_action <- strat_b b last_a last_b
  return (a_action, b_action)

iterated_game :: Measure (Double, Double)
iterated_game = do
  let a_initial = False
  let b_initial = False
  a <- unconditioned $ uniform 0 1
  b <- unconditioned $ uniform 0 1
  rounds <- replicateM 10 $ return (a, b)
  foldM_ (play tit tit) (a_initial, b_initial) rounds
  return (a, b)

LET’S PLAY A GAME
games = [Just (toDyn False), Just (toDyn False),
         Just (toDyn False), Just (toDyn True),
         Just (toDyn False), Just (toDyn False),
         Just (toDyn False), Just (toDyn False),
         Just (toDyn False), Just (toDyn False)]

do
  l <- mcmc iterated_game games
  return [makeHistogram 30 (Data.Vector.fromList $ map fst
                              (take 5000 l)) "A's paranoia",
          makeHistogram 30 (Data.Vector.fromList $ map snd
                              (take 5000 l)) "B's paranoia"]

HOW MUCH TINFOIL IS IN THAT HAT?

MORE STRATEGIES
allCooperate :: Trust -> Bool -> Bool -> Measure Bool
allCooperate _ _ _ = conditioned $ bern 0.1

allDefect :: Trust -> Bool -> Bool -> Measure Bool
allDefect _ _ _ = conditioned $ bern 0.9

grimTrigger :: Trust -> Bool -> Bool -> Measure Bool
grimTrigger me True False = conditioned $ bern 0.9
grimTrigger me False False = conditioned $ bern 0.1
grimTrigger me _ True = conditioned $ bern 0.9

STRATEGY AS A RANDOM VARIABLE
data SChoice = Tit | GrimTrigger | AllDefect | AllCooperate
  deriving (Eq, Ord, Enum, Typeable, Show)

chooseStrategy :: SChoice -> Strategy
chooseStrategy Tit = tit
chooseStrategy AllDefect = allDefect
chooseStrategy AllCooperate = allCooperate
chooseStrategy GrimTrigger = grimTrigger

strat :: Measure SChoice
strat = unconditioned $ categorical [(AllCooperate, 0.25),
                                     (AllDefect, 0.25),
                                     (GrimTrigger, 0.25),
                                     (Tit, 0.25)]

LET’S PLAY ANOTHER GAME
iterated_game2 :: Measure (SChoice, SChoice)
iterated_game2 = do
  let a_initial = False
  let b_initial = False
  a <- unconditioned $ uniform 0 1
  b <- unconditioned $ uniform 0 1
  na <- strat
  let a_strat = chooseStrategy na
  nb <- strat
  let b_strat = chooseStrategy nb
  rounds <- replicateM 10 $ return (a, b)
  foldM_ (play a_strat b_strat) (a_initial, b_initial) rounds
  return (na, nb)

do
  l <- mcmc iterated_game2 games
  return [makeDiscrete (map fst (take 1000 l)) "A strategy",
          makeDiscrete (map snd (take 1000 l)) "B strategy"]
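For a model this small, the posterior over strategy pairs can also be computed exactly by enumeration, which gives a way to sanity-check sampled results. The sketch below is a library-free analogue, not the talk's code: it mirrors the four strategies' defection probabilities, but it assumes a fixed trust value instead of sampling one, takes observations as explicit (A defected, B defected) pairs per round, and uses illustrative names (`pDefect`, `likelihood`, `posterior`, `marginalA`).

```haskell
data SChoice = Tit | GrimTrigger | AllDefect | AllCooperate
  deriving (Eq, Show, Enum, Bounded)

-- P(defect | trust, opponent's last move, own last move); True means "defected".
pDefect :: SChoice -> Double -> Bool -> Bool -> Double
pDefect Tit          _     True  _     = 0.9
pDefect Tit          trust False _     = trust
pDefect AllCooperate _     _     _     = 0.1
pDefect AllDefect    _     _     _     = 0.9
pDefect GrimTrigger  _     _     True  = 0.9
pDefect GrimTrigger  _     opp   False = if opp then 0.9 else 0.1

-- Probability of the observed rounds under a given pair of strategies.
likelihood :: Double -> (SChoice, SChoice) -> [(Bool, Bool)] -> Double
likelihood trust (sa, sb) = go (False, False)
  where
    go _ [] = 1
    go (la, lb) ((a, b) : rest) =
      bern (pDefect sa trust lb la) a
        * bern (pDefect sb trust la lb) b
        * go (a, b) rest
    bern p x = if x then p else 1 - p

-- Posterior over all 16 strategy pairs; the uniform prior cancels out.
posterior :: Double -> [(Bool, Bool)] -> [((SChoice, SChoice), Double)]
posterior trust obs = [ (pair, w / total) | (pair, w) <- weighted ]
  where
    pairs    = [ (a, b) | a <- [minBound ..], b <- [minBound ..] ]
    weighted = [ (pair, likelihood trust pair obs) | pair <- pairs ]
    total    = sum (map snd weighted)

-- Marginal posterior probability that player A uses strategy s.
marginalA :: SChoice -> [((SChoice, SChoice), Double)] -> Double
marginalA s post = sum [ p | ((a, _), p) <- post, a == s ]
```

After five rounds of mutual cooperation, `marginalA AllDefect (posterior 0.5 (replicate 5 (False, False)))` is tiny, while AllCooperate and GrimTrigger remain tied: an untriggered Grim is indistinguishable from a cooperator.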

 Probabilistic SIPD
 Extensive form SIPD with signaling
 And channels with decidable vs. heuristic recognisers
 Coordination. Enough said.
 System 1/System 2 conflict
 Sentiment analysis → payoff data
 Start small: the stroke is the smallest unit of interaction
 Data where information about players is limited
 IP flows
 Anonymity networks
 Signaling game about type: are two actors the same
person?
FUTURE WORK

QUESTIONS?
mlp@upstandinghackers.com
@maradydd
