1. Overview
• Why do we behave prosocially?
• Goal-directed, habitual and Pavlovian prosocial behavior
• run through
  • RLDM (Reinforcement Learning and Decision Making)
  • RLDM w/ prosocial behavior
3. RLDM
• foresight
  • model-based planning: tree search, Bayesian model inversion
  • incomplete information, intractability: inference, heuristics
  • slow
• history
  • model-free learning: actor-critic model on cached action value
  • need time to learn for optimization
  • fast
• rule
  • a priori programmed solutions
  • specific, inflexible: generalization
  • dominant
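The contrast between the three controllers can be sketched on a toy task. This is a minimal illustration under stated assumptions, not the model from the slides: the two-step tree `TREE`, its payoffs, and the function names are hypothetical, and plain tabular Q-learning stands in for the actor-critic learner to keep the example short.

```python
import random

# Hypothetical two-step task; state names and payoffs are illustrative only.
TREE = {
    "start": {"left": "A", "right": "B"},
    "A": {"left": 1.0, "right": 0.0},  # float entries are terminal rewards
    "B": {"left": 0.0, "right": 4.0},
}

def plan(state):
    """Foresight: exhaustive tree search over a known model (flexible, slow)."""
    best_action, best_value = None, float("-inf")
    for action, outcome in TREE[state].items():
        value = outcome if isinstance(outcome, float) else plan(outcome)[1]
        if value > best_value:
            best_action, best_value = action, value
    return best_action, best_value

def learn_cached_values(episodes=500, alpha=0.5, seed=0):
    """History: cache action values from experience (fast to use, slow to learn)."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in acts} for s, acts in TREE.items()}
    for _ in range(episodes):
        state = "start"
        while True:
            action = rng.choice(sorted(TREE[state]))  # purely random exploration
            outcome = TREE[state][action]
            terminal = isinstance(outcome, float)
            # Terminal step: learn from the reward; otherwise bootstrap
            # from the best cached value of the next state.
            target = outcome if terminal else max(q[outcome].values())
            q[state][action] += alpha * (target - q[state][action])
            if terminal:
                break
            state = outcome
    return q

def reflex(state):
    """Rule: a priori programmed response that ignores payoffs (inflexible)."""
    return "left"

q = learn_cached_values()
print("planner:", plan("start"))
print("cached :", max(q["start"], key=q["start"].get))
print("reflex :", reflex("start"))
```

The planner needs the full model but no experience, the cached learner needs many episodes but then answers with a single lookup, and the reflex answers immediately whether or not its answer pays.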
5. RLDM
                                Goal-directed               Habitual                       Pavlovian
Algorithm                       Simulation of Consequences  Learning from Past Experience  Evolution; Reflexes + Classical Conditioning
Mechanism                       Deliberate                  Automatic/Learned              Automatic/Inborn
Dominating stage                Beginning                   Late                           All
Dependent on working memory     Y                           N                              N
Sensitive to motivation change  Y                           N                              Y
Sensitive to consequence        Y                           N                              N
Location                        DLPFC                       Striatum                       Amygdala
6. RLDM for Prosocial Behavior
• Goal-directed system:
  • direct reciprocity
    • tit-for-tat
    • repeated prisoner’s dilemma: optimal strategy (co-op, copy, defect)
    • theory of mind, adjustment against dominant trait and DLPFC
  • indirect reciprocity
    • reputation is good
    • anonymity, external reward attribution is bad
    • 5-year-olds share more when observed (sensitive to observer role)
    • autism, TMS DLPFC -> less reputation management
  • avoid punishments
    • ultimatum game: usually share more than dictator game
      • lesser punishment -> decreased
      • passing false-belief task -> increased
      • same individual with increased age -> increased
      • tDCS DLPFC -> decreased in dictator, increased in ultimatum
  • *moral principles: still a goal!
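Two of the goal-directed arguments above can be made concrete with a short sketch. Everything here is an assumption for illustration: the payoff matrix uses the conventional prisoner's dilemma values (5, 3, 1, 0), which are not from the slides, and `best_offer` models the responder with a hypothetical fixed rejection threshold `min_acceptable`.

```python
# Conventional illustrative payoffs (assumed): temptation 5, reward 3,
# punishment 1, sucker 0. "C" = cooperate, "D" = defect.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def always_cooperate(opponent_history):
    return "C"

def always_defect(opponent_history):
    return "D"

def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"

def play(strategy_a, strategy_b, rounds=10):
    """Repeated prisoner's dilemma; returns the two total scores."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        payoff_a, payoff_b = PAYOFF[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b

def best_offer(endowment, min_acceptable):
    """Proposer's payoff-maximizing offer when offers below
    min_acceptable are rejected and both players then get nothing.
    min_acceptable=0 corresponds to a dictator game (no punishment)."""
    payoffs = {
        offer: (endowment - offer if offer >= min_acceptable else 0)
        for offer in range(endowment + 1)
    }
    return max(payoffs, key=payoffs.get)

print(play(tit_for_tat, tit_for_tat))       # (30, 30)
print(play(tit_for_tat, always_defect))     # (9, 14)
print(play(always_cooperate, always_defect))  # (0, 50)
print(best_offer(10, 0), best_offer(10, 3))   # 0 3
```

Under these assumptions a purely self-interested planner shares nothing as a dictator but offers exactly the rejection threshold as an ultimatum proposer, which matches the slide's point that the threat of punishment raises sharing.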
7. RLDM for Prosocial Behavior
• Habitual system:
  • frequent rewarding of an action -> goal-directed to habitual
  • social heuristics hypothesis
    • other-regarding acts in one-shot anonymous games
    • intuition shaped by past successes and cultural norms
    • repeated prisoner’s dilemma followed by one-shot economic games
      • payoff promotes co-op/defect -> incr/decr other-regarding acts
    • assumptions of generalization: can habits spill over situations?
  • "aid itself is not the main purpose of these acts"
    • public goods theory: supposed to get "crowding out"
      • equating init endowment in dictator game
      • experimenter counterbalancing dictator’s donation
    • warm glow
      • utility is from the act of giving not the outcome
      • devaluation procedure
  • automatic
    • impairing working memory by load
      • mini-dictator games: load -> fair choice
      • prisoner’s dilemma: load -> less defect near end
    • individual differences in prosocial orientation
  • striatum
    • dense sensorimotor connection in dorsal for fast actor response
    • reward prediction errors in ventral for the critic’s update of expected value
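The actor/critic division of labor mentioned above can be sketched in a few lines. This is a minimal single-state illustration, assuming a softmax actor and hypothetical rewards of 1 for "give" and 0 for "keep"; the function names, learning rates, and trial count are illustrative choices, not values from the slides.

```python
import math
import random

def softmax(prefs):
    """Turn action preferences into choice probabilities."""
    top = max(prefs.values())
    exps = {a: math.exp(p - top) for a, p in prefs.items()}
    total = sum(exps.values())
    return {a: e / total for a, e in exps.items()}

def train_actor_critic(reward, trials=200, alpha_critic=0.1,
                       alpha_actor=0.1, seed=0):
    rng = random.Random(seed)
    value = 0.0                          # critic: expected value of the state
    prefs = {"give": 0.0, "keep": 0.0}   # actor: cached action preferences
    for _ in range(trials):
        probs = softmax(prefs)
        action = "give" if rng.random() < probs["give"] else "keep"
        delta = reward[action] - value        # reward prediction error
        value += alpha_critic * delta         # critic updates expected value
        prefs[action] += alpha_actor * delta  # actor reinforces the action
    return value, prefs

# Hypothetical history in which giving was frequently rewarded.
value, prefs = train_actor_critic({"give": 1.0, "keep": 0.0})
print(prefs["give"] > prefs["keep"])
```

The cached preferences only change when a new prediction error arrives. If the outcome of giving were devalued without further experience, `prefs` would still favor "give", which is the insensitivity to consequence that a devaluation procedure is meant to detect.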
8. RLDM for Prosocial Behavior
• Pavlovian system:
  • no other explanation for early (<24 months) prosocial behavior
  • affective empathy
    • emotional contagion
    • affective perspective taking
    • self-other distinction occurs w/ prosocial behavior
    • self-reported measures correlate w/ prosocial behavior
    • observing others’ suffering motivates prosocial behavior
  • empathy-altruism hypothesis
    • reduce internal negative arousal
    • help over escape
  • sensitive to motivational state: whether or not subject feels empathy
    • maladaptive and eclipses other goals
  • inflexible
    • negative auto-maintenance procedure
    • bias toward object of empathy
  • specific
    • empathy does not carry over contexts -> not really concern for others
  • Pavlovian-to-instrumental transfer (PIT)
    • appetitive -> invigorate / aversive -> inhibit
  • evolutionary origin
    • over-generalization of the parental care instinct
    • mere childlike features suffice -> alloparenting for co-op breeding
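The PIT and negative auto-maintenance points can be illustrated with a toy choice rule. This is only a sketch under an assumed form: the Pavlovian cue value is added to the instrumental value of acting, and a logistic function turns the difference into a probability. The function `p_go`, its parameters, and the numbers are hypothetical and not taken from the slides.

```python
import math

def p_go(q_go, q_nogo, cue_value, pavlovian_weight=1.0):
    """Probability of acting ("go") when a Pavlovian cue biases the choice.

    q_go, q_nogo: learned instrumental values of acting vs. withholding.
    cue_value: > 0 for an appetitive cue, < 0 for an aversive cue.
    The additive bias is an assumed modeling choice for illustration.
    """
    go_weight = q_go + pavlovian_weight * cue_value
    return 1.0 / (1.0 + math.exp(-(go_weight - q_nogo)))

# Equal instrumental values: only the cue moves behavior.
print(p_go(0.5, 0.5, 1.0))    # appetitive cue invigorates (above 0.5)
print(p_go(0.5, 0.5, 0.0))    # neutral cue: 0.5
print(p_go(0.5, 0.5, -1.0))   # aversive cue inhibits (below 0.5)

# Negative auto-maintenance flavor: acting is instrumentally worse
# (q_go < q_nogo), yet the appetitive cue still makes "go" more likely.
print(p_go(0.0, 0.5, 1.0) > 0.5)
```

In this toy form the Pavlovian term does not depend on what the action achieves, so it can push behavior against the instrumentally better option, which is one way to read the "maladaptive" and "inflexible" bullets.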