15. Replication is triangulation
Several studies across many fields estimate that only around 40% of published findings can be replicated reliably. Various funders and communities are promoting ways for independent teams to routinely replicate the findings of others.
These efforts are laudable, but insufficient. If a study is skewed and replications recapitulate that approach, findings will be consistently incorrect or biased. Consider a commonly used assay in which the production of a fluorescent protein is used to monitor cell activity. If the compounds used to manipulate cell activity are also fluorescent, as has happened [1], reliably repeatable results will not yield robust conclusions.
We have both spent much of our careers advocating ways to increase scientific certainty. One of us (M.R.M.) participated in work by UK funding agencies to develop strategies for reproducible science, and helped to craft a manifesto for reproducibility [2].
But replication alone will get us only so far. In some cases, routine replication might
Repeating experiments
is not enough
Verifying results requires disparate lines of evidence — a technique called
triangulation. Marcus R. Munafò and George Davey Smith explain.
Illustration by David Parkins
25 January 2018 | Vol 553 | Nature | 399
(Munafò & Smith, 2018)
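・Observing from a single vantage point inevitably leaves some parts out of view.
・From two vantage points (that is, when several people look from different directions), those parts become visible.
・Replication is an important process for having a third party look at your own experiment.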
17. Around 2011, the wind began to blow
The Bem affair (January)
Feeling the Future: Experimental Evidence for Anomalous Retroactive
Influences on Cognition and Affect
Daryl J. Bem
Cornell University
The term psi denotes anomalous processes of information or energy transfer that are currently unexplained in terms of known physical or biological mechanisms. Two variants of psi are precognition (conscious cognitive awareness) and premonition (affective apprehension) of a future event that could not otherwise be anticipated through any known inferential process. Precognition and premonition are themselves special cases of a more general phenomenon: the anomalous retroactive influence of some
Photograph by Shannon Taggart
Could not be replicated
The Stapel affair (August)
by Joris Buijs
Gordijn, E.H., & Stapel, D.A. (2006). Behavioural effects of automatic interpersonal versus intergroup
social comparison. British Journal of Social Psychology, 45, 717-729.
Joly, J.F., & Stapel, D.A. (2009). Staff, miter, book, share: How attributes of Saint Nicholas induce
normative behavior. European Journal of Social Psychology, 39, 145-153.
Joly, J.F., Stapel, D.A., & Lindenberg, S.M. (2008). Silence and table manners: When environments
activate norms. Personality and Social Psychology Bulletin, 34, 1047-1056.
Lindenberg, S.M., Joly, J.F., & Stapel, D.A. (2011). The norm-activating power of celebrity: The dynamics
of success and influence. Social Psychology Quarterly, 74, 98-120.
Maringer, M., & Stapel, D.A. (2007). Unfinished business: How completeness affects the impact of
emotional states and emotion concepts on social judgments. Journal of Experimental Social
Psychology, 43, 712-718.
Maringer, M., & Stapel, D.A. (2009). Correction or comparison? The effects of prime awareness on social
judgments. European Journal of Social Psychology, 39, 719-733.
Marx, D.M., & Stapel, D.A. (2006). Distinguishing stereotype threat from priming effects: On the role of
the social self and threat-based concerns. Journal of Personality and Social Psychology, 91,
243-254.
Marx, D.M., & Stapel, D.A. (2006). It’s all in the timing: Measuring emotional reactions to stereotype
threat before and after taking a test. European Journal of Social Psychology, 36, 687-698.
Renkema, L.J., Stapel, D.A., Maringer, M., & Van Yperen, N.W. (2008). Terror management and
stereotyping: Why do people stereotype when mortality is salient? Personality and Social
Psychology Bulletin, 34, 553-564.
…
58 papers retracted
18. Students in my graduate course have worked incredibly hard on replication studies
Ojiro, Y., Gobara, A., Nam, G., Sasaki, K., Kishimoto, R., Yamada, Y., & Miura, K. (2015). Two
replications of "Hierarchical encoding makes individuals in a group seem more attractive (2014;
Experiment 4)”. The Quantitative Methods for Psychology, 11, r8-r11.
Kishimoto, R., Sasaki, K., Gobara, A., Ojiro, Y., Nam, G., Miura, K., & Yamada, Y. (2016). When a
silhouette appears male: Observer’s own physical fitness governs social categorization of
sexually ambiguous stimuli. Letters on Evolutionary Behavioral Science, 7, 17-20.
Chaya, K., Xue, Y., Uto, Y., Yao, Q., & Yamada, Y. (2016). Fear of eyes: Triadic relation among
social anxiety, trypophobia, and discomfort for eye cluster. PeerJ, 4:e1942.
Yonemitsu, F., Sung, Y., Naka, K., Yamada, Y., & Marmolejo-Ramos, F. (2017). Does weight lifting
improve visual acuity? A replication of Gonzalo-Fonrodona and Porras (2013). BMC Research
Notes, 10:362.
Nitta, H., Tomita, H., Zhang, Y., Zhou, X., & Yamada, Y. (2018). Disgust and the rubber hand
illusion: A registered replication report of Jalal, Krishnakumar, and Ramachandran (2015).
Cognitive Research: Principles and Implications, 3:15.
19. However...
About 1 year 10 months (660 days)
About 1 year 1 month (397 days)
About 1 year (360 days)
On average they took as long as 525 days (about 1 year 6 months)
Ojiro, Y., Gobara, A., Nam, G., Sasaki, K., Kishimoto, R., Yamada, Y., & Miura, K. (2015). Two
replications of "Hierarchical encoding makes individuals in a group seem more attractive (2014;
Experiment 4)”. The Quantitative Methods for Psychology, 11, r8-r11.
Kishimoto, R., Sasaki, K., Gobara, A., Ojiro, Y., Nam, G., Miura, K., & Yamada, Y. (2016). When a
silhouette appears male: Observer’s own physical fitness governs social categorization of
sexually ambiguous stimuli. Letters on Evolutionary Behavioral Science, 7, 17-20.
Chaya, K., Xue, Y., Uto, Y., Yao, Q., & Yamada, Y. (2016). Fear of eyes: Triadic relation among
social anxiety, trypophobia, and discomfort for eye cluster. PeerJ, 4:e1942.
Yonemitsu, F., Sung, Y., Naka, K., Yamada, Y., & Marmolejo-Ramos, F. (2017). Does weight lifting
improve visual acuity? A replication of Gonzalo-Fonrodona and Porras (2013). BMC Research
Notes, 10:362.
Nitta, H., Tomita, H., Zhang, Y., Zhou, X., & Yamada, Y. (2018). Disgust and the rubber hand
illusion: A registered replication report of Jalal, Krishnakumar, and Ramachandran (2015).
Cognitive Research: Principles and Implications, 3:15.
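About 1 year 4 months (476 days)
About 2 years (734 days)
* Incidentally, there are two more course replication projects, and they are still ongoing...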
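The 525-day average on this slide can be checked directly from the five durations shown. The snippet below is only a small arithmetic sketch; the list order follows the order in which the numbers appear on the slide, and the variable names are illustrative.

```python
from statistics import fmean

# Durations in days, copied from the slide (order as they appear there)
durations_days = [660, 397, 360, 476, 734]

mean_days = fmean(durations_days)
print(f"mean = {mean_days:.1f} days")  # mean = 525.4 days
```

Rounded to whole days this gives the 525 days quoted on the slide.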
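21. Large-scale replications Yamada is taking part in
Many Smiles project
・A thorough test of the facial feedback hypothesis
・The original author, Strack, is also participating
・Being re-reviewed because of COVID-19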
A Multi-Lab Test of the Facial Feedback Hypothesis by The Many Smiles Collaboration
Accepted in-principle at Nature Human Behaviour
Corresponding author: Nicholas A. Coles (colesn@utk.edu), Department of Psychology, University of
Tennessee, Knoxville, USA
*Nicholas Coles [1], David March [2], Fernando Marmolejo Ramos [3], Arinze Nwadiogo [4], Izuchukwu Ndukaihe [4], Asil Ali Özdoğru [5], Balazs Aczel [6], Nandor Hajdu [6], Tamas Nagy [6], Bidisha Som [7], Dana Basnight-Brown [8], Danilo Zambrano [9], Daniela Serrato Alvarez [9], Francesco Foroni [10], Megan Willis [10], Gerit Pfuhl [11], Gwenael Kaminski [12, 13], Tracy Ehrengarth [12], Hans IJzerman [14], Kevin Vezirian [14], Hassan Banaruee [15], Omid Kathin-Zadeh [15], Isabel Suarez [16], Jeff T. Larsen [1], José Antonio Hinojosa Poveda [17], Cristina Villalba-García [17], Krystian Barzykowski [18], Michal Parzuchowski [19], Aneta Marczak [19], Natalia Trujillo [20], Sandra Trujillo [20], Natalie Butcher [21], Daniel Eaves [21], Nikolay Dagaev [22], Elena Gorbunova [22], Niv Reggev [23], Pascal Gygax [24], Sarah Pressman [25], John Hunter [25], Miranda Pinks [25], Ian Waldrop [25], Susana Ruiz-Fernandez [26], Yuki Yamada [27], Ayumi Ikeda [27], Fritz Strack [28], Lowell Gaertner [1], Phoebe Ellsworth [29], Marco Marozzi [30], Marco Tullio Liuzza [31]
[1] Department of Psychology, University of Tennessee, Knoxville, United States of America
[2] Department of Psychology, Florida State University, United States of America
[3] School of Psychology, University of Adelaide, Australia
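22. Large-scale replications Yamada is taking part in
Crowdsourced Replication Initiative
・Originally an event at a German academic conference
・Crowdsourced researchers each independently replicate the analysis of a given study
・About 200 authors
・The field is survey research, so it was seriously tough going for Yamada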
The Crowdsourced Replication Initiative: Investigating
Immigration and Social Policy Preferences using Meta-Science
‹ Executive Report ›
Principal Investigators:
Nate Breznau, University of Bremen breznau.nate@gmail.com
Eike Mark Rinke, University of Leeds E.M.Rinke@leeds.ac.uk
Alexander Wuttke, University of Mannheim alexander.wuttke@uni-mannheim.de
Research Participant Co-Authors:
Muna Adem, Jule Adriaans, Amalia Alvarez-Benjumea, Henrik Andersen, Daniel Auer, Flavio
Azevedo, Oke Bahnsen, Dave Balzer, Paul Bauer, Gerrit Bauer, Markus Baumann, Sharon
Baute, Verena Benoit, Julian Bernauer, Carl Berning, Anna Berthold, Felix S. Bethke, Thomas
Biegert, Katharina Blinzler, Johannes N. Blumenberg, Licia Bobzien, Andrea Bohman, Thijs
Bol, Amie Bostic, Zuzanna Brzozowska, Katharina Burgdorf, Kaspar Burger, Kathrin Busch,
Juan Castillo, Nathan Chan, Pablo Christmann, Roxanne Connelly, Christian Czymara, Elena
Damian, Alejandro Ecker, Achim Edelmann, Maureen A. Eger, Simon Ellerbrock, Anna Forke,
Andrea Forster, Chris Gaasendam, Konstantin Gavras, Vernon Gayle, Theresa Gessler, Timo
Gnambs, Amélie Godefroidt, Alexander Greinert, Max Grömping, Martin Groß, Stefan Gruber,
Tobias Gummer, Andreas Hadjar, Jan Paul Heisig, Sebastian Hellmeier, Stefanie Heyne,
Magdalena Hirsch, Mikael Hjerm, Oshrat Hochman, Jan H. Höffler, Andreas Hövermann,
Sophia Hunger, Christian Hunkler, Nora Huth, Zsofia Ignacz, Laura Jacobs, Jannes Jacobsen,
Bastian Jaeger, Sebastian Jungkunz, Nils Jungmann, Mathias Kauff, Dennis Kleinert, Julia
Klinger, Jan-Philipp Kolb, Marta Kołczyńska, John Kuk, Katharina Kunißen, Dafina Kurti,
Philipp Lersch, Lea-Maria Löbel, Philipp Lutscher, Matthias Mader, Joan Madia, Natalia
Malancu, Luis Maldonado, Helge Marahrens, Nicole Martin, Paul Martinez, Jochen Mayerl,
Oscar J. Mayorga, Patricia McManus, Kyle McWagner, Cecil Meeusen, Daniel Meierrieks,
Jonathan Mellon, Friedolin Merhout, Samuel Merk, Daniel Meyer, Jonathan Mijs, Cristobal
Moya, Marcel Neunhoeffer, Daniel Nüst, Olav Nygård, Fabian Ochsenfeld, Gunnar Otte, Anna
34. ARTICLE
Regulation of REM and Non-REM Sleep by
Periaqueductal GABAergic Neurons
Franz Weber [1,3], Johnny Phong Hoang Do [1], Shinjae Chung [1,3], Kevin T. Beier [2], Mike Bikov [1], Mohammad Saffari Doost [1] & Yang Dan [1]
DOI: 10.1038/s41467-017-02765-w OPEN
NATURE COMMUNICATIONS | (2018)9:354 |DOI: 10.1038/s41467-017-02765-w |www.nature.com/naturecommunications 1
Sample sizes. For optogenetic activation experiments, cell-type-specific ablation
experiments, and in vivo recordings (optrode recordings and calcium imaging), we
continuously increased the number of animals until statistical significance was
reached to support our conclusions. For rabies-mediated and anterograde tracing
experiments, the selection of the sample size was based on numbers reported in
previous studies. For optrode recordings, we first recorded a preliminary data set of
six units from two mice. Based on analysis of this data set and given the success
rate in finding identified GABAergic units, we predicted that about 20 units are
sufficient to statistically support our conclusions.
"We kept increasing the number of animals until we got a significant difference that supported our conclusions."
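Why "adding subjects until significance is reached" is a problem can be illustrated with a small simulation. The sketch below is not the procedure of the quoted paper; it assumes, purely for illustration, data with no true effect, a two-sided z-test with known standard deviation 1, and a check after every added observation from n = 5 up to n = 40. The function names and all parameter values are hypothetical.

```python
import random
from statistics import NormalDist, fmean

def p_value(sample):
    """Two-sided z-test of mean 0, assuming a known standard deviation of 1."""
    z = fmean(sample) * len(sample) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

def simulate(n_sims=2000, n_min=5, n_max=40, alpha=0.05, seed=1):
    """Return (fixed-n, optional-stopping) false positive rates under a true null."""
    rng = random.Random(seed)
    fixed_hits = 0
    optional_hits = 0
    for _ in range(n_sims):
        # The null hypothesis is true: every observation has mean 0.
        data = [rng.gauss(0, 1) for _ in range(n_max)]
        # Fixed design: test once, at the planned sample size.
        if p_value(data) < alpha:
            fixed_hits += 1
        # Optional stopping: test after each added observation and
        # stop (declare success) the first time p drops below alpha.
        if any(p_value(data[:n]) < alpha for n in range(n_min, n_max + 1)):
            optional_hits += 1
    return fixed_hits / n_sims, optional_hits / n_sims

fixed_rate, optional_rate = simulate()
print(f"fixed n:           {fixed_rate:.3f}")
print(f"optional stopping: {optional_rate:.3f}")
```

Because the optional-stopping rule also looks at the final sample size, it can never produce fewer false positives than the fixed design on the same data, and in practice it produces noticeably more than the nominal 5%.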
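35. p-hacking and p-slacking
p-hacking (e.g., John et al., 2012)
Using all sorts of methods to make the p-value as low as possible
p-slacking (e.g., Smits, 2014 on Twitter; Bryan et al., 2019; Chuard et al., 2019)
Using all sorts of methods to make the p-value as high as possible
The original reporters p-hack because they want the result to look significant, and the replicators p-slack so that they can report that "it could not be replicated". Either way, the p-value is being quietly manipulated to suit the researcher's aims.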
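How much room both practices leave can be sketched with a simple calculation. It assumes, as a simplification not taken from the slides, that a researcher tries k statistically independent analyses and reports whichever one suits them; the function names and the power value of 0.8 are illustrative.

```python
def prob_any_significant(alpha: float, k: int) -> float:
    """p-hacking: chance that at least one of k independent tests is
    significant at level alpha when every null hypothesis is true."""
    return 1 - (1 - alpha) ** k

def prob_any_nonsignificant(power: float, k: int) -> float:
    """p-slacking: chance that at least one of k independent tests is
    non-significant when the effect is real and each test has this power."""
    return 1 - power ** k

for k in (1, 3, 5, 10):
    hack = prob_any_significant(0.05, k)
    slack = prob_any_nonsignificant(0.8, k)
    print(f"k={k:2d}  any significant: {hack:.3f}  any non-significant: {slack:.3f}")
```

Under these assumptions both probabilities rise quickly with k, which is why undisclosed flexibility in analysis can serve either goal.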
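37. HARKing (Hypothesizing After the Results are Known): post hoc hypothesis generation
Reworking the hypothesis after seeing the results and presenting it as if that result had been predicted from the start. Doing this would make researchers prophets who correctly foretell every future result. That is, of course, not the case.
(Kerr, 1998; Rubin, 2017)
38. From TODA RADIO #002
HARKing (Hypothesizing After the Results are Known): post hoc hypothesis generation
(Kerr, 1998; Rubin, 2017)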
39. Which Article Should You Write?
There are two possible articles you can write: (a) the
article you planned to write when you designed your
study or (b) the article that makes the most sense now that
you have seen the results. They are rarely the same, and
the correct answer is (b).
"The correct article is the one written to make the most sense now that you have seen the results."
Bem's teaching (Bem, 2004)
Photograph by Shannon Taggart
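40. HARKing (Hypothesizing After the Results are Known): post hoc hypothesis generation
There are said to be three types
(Kerr, 1998; Rubin, 2017)
CHARKing (constructing hypotheses after the results are known)
Generating hypotheses after the results are known. This is pure HARKing in Kerr's sense.
RHARKing (retrieving hypotheses after the results are known)
Searching the prior literature, after the results are known, for hypotheses that look useful.
SHARKing (suppressing hypotheses after the results are known)
Pretending, after the results are known, that inconvenient hypotheses were never considered.
HARKing is particularly hard to detect and is very effective at making the story look elegant. Because no hypothesis has actually been tested, null hypothesis significance testing is itself inappropriate, and the finding does not replicate when others try.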
49. The preregistration revolution
Brian A. Nosek [a,b,1], Charles R. Ebersole [b], Alexander C. DeHaven [a], and David T. Mellor [a]
[a] Center for Open Science, Charlottesville, VA 22903; and [b] Department of Psychology, University of Virginia, Charlottesville, VA 22904
Edited by Richard M. Shiffrin, Indiana University, Bloomington, IN, and approved August 28, 2017 (received for review June 15, 2017)
Progress in science relies in part on generating hypotheses with
existing observations and testing hypotheses with new observations.
This distinction between postdiction and prediction is appreciated
conceptually but is not respected in practice. Mistaking generation of
postdictions with testing of predictions reduces the credibility of
research findings. However, ordinary biases in human reasoning,
overconfidence in post hoc explanations (postdictions) and inflate
the likelihood of believing that there is evidence for a finding when
there is not. Presenting postdictions as predictions can increase
the attractiveness and publishability of findings by falsely reducing
uncertainty. Ultimately, this decreases reproducibility (6–11).
The preregistration revolution (Nosek et al., 2018)
Register what you will do before collecting data, then do exactly that
(preregistration)
51. The preregistration revolution
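The preregistration revolution (Nosek et al., 2018)
This takes care of several QRP problems
Register what you will do before collecting data, then do exactly that
Its deterrent effect on HARKing is especially strong.
Because hypotheses must be registered before the experiment, they cannot be generated after the fact.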
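56. QRPs in preregistration
Preregistration, however, can itself currently be cracked.
PARKing (Preregistering After the Results are Known): post hoc preregistration
That is, preregistering after seeing the results: an attempt to report good results while using the preregistration to make it look as though no QRPs were involved.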
OPINION
published: 26 September 2018
doi: 10.3389/fpsyg.2018.01831
How to Crack Pre-registration:
Toward Transparent and Open
Science
Yuki Yamada*
Faculty of Arts and Science, Kyushu University, Fukuoka, Japan
Keywords: QRP, misconduct in research, academic publishing, preregistration, open science
The reproducibility problem that exists in various academic fields has been discussed in recent
years, and it has been revealed that scientists discreetly engage in several questionable research
86. The Matthew effect: "The rich get richer"
(Bol et al., 2018; Merton, 1968)
[Figure: line plot. Vertical axis: average cumulative funding from NWO and ERC (kEuro), 0 to 350. Horizontal axis: year since early grant competition, 0 to 8. Lines: Ranks −1/−2, Ranks −3/−4, Ranks −5/−6, Ranks +1/+2, Ranks +3/+4, Ranks +5/+6.]
Fig. 3. Accumulation of grant money by early career grant applicants.
Shown is the cumulative amount of funding received in NWO and ERC
competitions (vertical axis) as a function of the number of years elapsed
since the early career grant competition (horizontal axis). This relationship is
shown for different ranks above (green, +) and below (red, −) the early
career funding threshold.
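Researchers who received money early on
Researchers who did not receive money early on
Unless you are ahead from the start, you keep losing.
It is the same as mahjong.
The tiles come to the winners of their own accord.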
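106. I have removed this from the actual talk because it serves no purpose there, but here are my publications related to this talk.
For your reference. All come with open-access links:
・Yamada, Y. (2016). The cognitive psychology of reproducibility in cognitive psychology. Japanese Psychological Review, 59(1), 15-29.
・Yamada, Y. (2018). How to crack pre-registration: Toward transparent and open science. Frontiers in Psychology, 9:1831.
・Yamada, Y. (2018). How to measure the mind: "Giving up freedom to grasp a transparent psychology". Psychology World, 83, 34-35.
・Yamada, Y. (2018). Hacking the reproducibility problem: An inescapable encirclement by the research community. Journal of Human Interface Society, 20(1), 17-22.
・Sasaki, K., Yonemitsu, F., & Yamada, Y. (2019). The goodness of the dominant-hand side: A preregistered direct replication of Casasanto (2009). Japanese Psychological Review, 62(3), 262-271.
・Ikeda, A., Xu, H., Fuji, N., Zhu, S., & Yamada, Y. (2019). Questionable research practices following pre-registration. Japanese Psychological Review, 62(3), 281-295.
・Yamada, Y. (2019). The future is in the hands of a select few: How distortions in researcher evaluation distort the whole field of psychology. Japanese Psychological Review, 62(3), 296-303.
・Yamada, Y. (2019). Publish but perish regardless in Japan. Nature Human Behaviour, 3, 1035.
・Sasaki, K., & Yamada, Y. (2020). Experimental psychologists want to submit papers comfortably too. Japanese Society for Cognitive Psychology Technical Report, COGPSY-TR-007, 1-6.
・Ikeda, K., Yamada, Y., & Takahashi, K. (2020, May 26). Post-publication peer review for real. PsyArXiv. https://doi.org/10.31234/osf.io/sp3j5.
・Sasaki, K., & Yamada, Y. (2020, July 8). The pandemic threatens the Registered Reports system as well as human lives. PsyArXiv. https://doi.org/10.31234/osf.io/6wdaz.
・Yamada, Y. (2020). Micropublishing during and after the COVID-19 era. Collabra: Psychology, 6(1), 36.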