8/9 RLDM for Prosocial Behavior

Overview
• Why do we behave prosocially?
• Goal-directed, habitual and Pavlovian prosocial behavior
• run through
• RLDM (Reinforcement Learning and Decision Making)
• RLDM w/ prosocial behavior

RLDM
• foresight
• model-based planning: tree search, Bayesian model inversion
• incomplete information, intractability: inference, heuristics
• slow
• history
• model-free learning: actor-critic model on cached action value
• need time to learn for optimization
• fast
• rule
• a priori programmed solutions
• specific, inflexible: generalization
• dominant
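
The contrast between foresight (model-based planning) and history (model-free learning) can be sketched on a toy task. This is a minimal illustration, not the slides' own model: the two-step task, its states, actions, and rewards are all invented here, and the model-free side uses plain Q-learning as a stand-in for the actor-critic on cached action values named above.

```python
import random

# Hypothetical two-step task; every state, action, and reward here is invented.
TRANSITIONS = {
    ("start", "left"): "A",
    ("start", "right"): "B",
    ("A", "go"): "end",
    ("B", "go"): "end",
}
REWARDS = {("A", "go"): 1.0, ("B", "go"): 3.0}


def actions(state):
    """Actions available in a state."""
    return [a for (s, a) in TRANSITIONS if s == state]


def plan_value(state):
    """Model-based: exhaustive tree search over the known model (foresight)."""
    if state == "end":
        return 0.0
    return max(
        REWARDS.get((state, a), 0.0) + plan_value(TRANSITIONS[(state, a)])
        for a in actions(state)
    )


def learn_cached_values(episodes=200, alpha=0.5, seed=0):
    """Model-free: cache action values from sampled experience (history).

    TRANSITIONS and REWARDS act only as the environment being sampled;
    the learner never looks ahead through them.
    """
    rng = random.Random(seed)
    q = {key: 0.0 for key in TRANSITIONS}
    for _ in range(episodes):
        state = "start"
        while state != "end":
            action = rng.choice(actions(state))  # explore uniformly at random
            next_state = TRANSITIONS[(state, action)]
            reward = REWARDS.get((state, action), 0.0)
            future = max(
                (q[(next_state, a)] for a in actions(next_state)), default=0.0
            )
            # Q-learning update toward the sampled one-step target.
            q[(state, action)] += alpha * (reward + future - q[(state, action)])
            state = next_state
    return q


planned = plan_value("start")   # available immediately, but needs the full model
cached = learn_cached_values()  # needs many episodes, then lookup is a dict access
```

The planner gives the right answer on the first call but has to walk the whole tree, while the cached values are cheap to read but only become accurate after repeated experience.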

RLDM
[Diagram: model-based planning, model-free learning, a priori programming]

RLDM
                              | Goal-directed              | Habitual                      | Pavlovian
Algorithm                     | Simulation of consequences | Learning from past experience | Evolution: reflexes + classical conditioning
Mechanism                     | Deliberate                 | Automatic/learned             | Automatic/inborn
Dominating stage              | Beginning                  | Late                          | All
Dependent on working memory   | Y                          | N                             | N
Sensitive to motivation change| Y                          | N                             | Y
Sensitive to consequence      | Y                          | N                             | N
Location                      | DLPFC                      | Striatum                      | Amygdala

RLDM for Prosocial Behavior
• Goal-directed system:
• direct reciprocity
• tit-for-tat
• repeated prisoner’s dilemma: optimal strategy (co-op, copy, defect)
• theory of mind, adjustment against dominant trait and DLPFC
• indirect reciprocity
• reputation is good
• anonymity, external reward attribution is bad
• 5-year-olds share more when observed (sensitive to observer role)
• autism, TMS DLPFC -> less reputation management
• avoid punishments
• ultimatum game: usually share more than dictator game
• lesser punishment -> decreased
• passing false-belief task -> increased
• same individual with increased age -> increased
• tDCS DLPFC -> decreased in dictator, increased in ultimatum
• *moral principles: still a goal!
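
The three strategies listed for the repeated prisoner's dilemma (co-op, copy, defect) can be played against each other in a short simulation. This is a sketch under assumptions: the payoff values (3/3, 0/5, 5/0, 1/1) are the conventional textbook ones and are not taken from the slides, and the function names are made up for illustration.

```python
# Conventional prisoner's dilemma payoffs (an assumption, not from the slides).
# Keys are (move_a, move_b); values are (payoff_a, payoff_b).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def always_cooperate(opponent_history):
    """'co-op': cooperate no matter what."""
    return "C"


def always_defect(opponent_history):
    """'defect': defect no matter what."""
    return "D"


def tit_for_tat(opponent_history):
    """'copy': cooperate first, then repeat the opponent's last move."""
    return opponent_history[-1] if opponent_history else "C"


def play(strategy_a, strategy_b, rounds=10):
    """Play a repeated game and return the total payoff of each player."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(moves_b)  # each player sees only the other's past moves
        move_b = strategy_b(moves_a)
        payoff_a, payoff_b = PAYOFF[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        moves_a.append(move_a)
        moves_b.append(move_b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat))         # (30, 30)
print(play(tit_for_tat, always_defect))       # (9, 14)
print(play(always_cooperate, always_defect))  # (0, 50)
```

Tit-for-tat loses only the first round to a defector and then stops being exploited, whereas unconditional cooperation is exploited in every round.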

• Habitual system:
• frequent rewarding of an action -> goal-directed to habitual
• social heuristics hypothesis
• other-regarding acts in one-shot anonymous games
• intuition shaped by past successes and cultural norms
• repeated prisoner’s dilemma followed by one-shot economic games
• payoff promotes co-op/defect -> incr/decr other-regarding acts
• assumptions of generalization: can habits spill over situations?
• "aid itself is not the main purpose of these acts"
• public goods theory: supposed to get "crowding out"
• equating init endowment in dictator game
• experimenter counterbalancing dictator’s donation
• warm glow
• utility is from the act of giving not the outcome
• devaluation procedure
• automatic
• impairing working memory by load
• mini-dictator games: load -> fair choice
• prisoner’s dilemma: load -> less defect near end
• individual differences in prosocial orientation
• striatum
• dense sensorimotor connection in dorsal for fast actor response
• reward prediction errors in ventral for critic’s updating expected value
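
The actor/critic split described for the striatum can be sketched as a one-state learner: the critic tracks expected reward and emits a reward prediction error, and the actor uses that error to strengthen or weaken the chosen action. This is only an illustrative sketch; the two actions, their payoffs, the learning rates, and the softmax policy are assumptions made here, not details from the slides.

```python
import math
import random


def policy(preferences):
    """Softmax over two actions: probability of choosing 'cooperate'."""
    diff = preferences["defect"] - preferences["cooperate"]
    return 1.0 / (1.0 + math.exp(diff))


def train_actor_critic(trials=500, alpha_critic=0.1, alpha_actor=0.1, seed=0):
    """Train on a hypothetical task where cooperating is always rewarded."""
    rng = random.Random(seed)
    rewards = {"cooperate": 1.0, "defect": 0.0}      # invented payoffs
    preferences = {"cooperate": 0.0, "defect": 0.0}  # actor: action propensities
    value = 0.0                                      # critic: expected reward
    for _ in range(trials):
        action = "cooperate" if rng.random() < policy(preferences) else "defect"
        delta = rewards[action] - value            # reward prediction error
        value += alpha_critic * delta              # critic updates its expectation
        preferences[action] += alpha_actor * delta # actor reinforces or weakens the action
    return preferences, value


preferences, value = train_actor_critic()
print(policy(preferences) > 0.5)  # True: the frequently rewarded action dominates
```

Once the preferences are learned, choosing an action is a lookup that needs no simulation of consequences, which matches the fast, automatic character attributed to the habitual system.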

• Pavlovian system:
• no other explanation for early (<24 months) prosocial behavior
• affective empathy
• emotional contagion
• affective perspective taking
• self-other distinction occurs w/ prosocial behavior
• self-reported measures correlate w/ prosocial behavior
• observing others’ suffering motivates prosocial behavior
• empathy-altruism hypothesis
• reduce internal negative arousal
• help over escape
• sensitive to motivational state: whether or not subject feels empathy
• maladaptive; eclipses other goals
• inflexible
• negative auto-maintenance procedure
• bias toward object of empathy
• specific
• empathy does not carry over contexts -> not really concern for others
• Pavlovian-to-instrumental transfer (PIT)
• appetitive -> invigorate / aversive -> inhibit
• evolutionary origin
• over-generalization of the parental care instinct
• mere childlike features suffice -> alloparenting for co-op breeding
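
The PIT bullet (appetitive -> invigorate / aversive -> inhibit) can be made concrete with a small formula. This is a sketch under stated assumptions: adding a weighted Pavlovian stimulus value to the propensity of the active response is one common modeling choice, and the function name, parameter names, and logistic choice rule are all hypothetical, not taken from the slides.

```python
import math


def go_probability(q_go, q_nogo, stimulus_value, pavlovian_weight=1.0):
    """Probability of acting ('go') given instrumental values and a Pavlovian cue.

    stimulus_value > 0 models an appetitive cue, < 0 an aversive cue.
    The Pavlovian term is added to the 'go' propensity only, so the cue
    biases vigor regardless of what the instrumental values recommend.
    """
    go_weight = q_go + pavlovian_weight * stimulus_value
    return 1.0 / (1.0 + math.exp(-(go_weight - q_nogo)))


# Same instrumental values, different cues:
p_appetitive = go_probability(q_go=0.5, q_nogo=0.5, stimulus_value=1.0)
p_neutral = go_probability(q_go=0.5, q_nogo=0.5, stimulus_value=0.0)
p_aversive = go_probability(q_go=0.5, q_nogo=0.5, stimulus_value=-1.0)
```

With identical instrumental values the neutral cue gives a go probability of 0.5, the appetitive cue pushes it above 0.5, and the aversive cue pushes it below, which is the inflexible bias the Pavlovian bullets describe.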
