When Crowd Meets Persona: Creating a Large-Scale Open-Domain Persona Dialogue Corpus
Nov. 2022. @HCOMP (WiP)
Won Ik Cho¹*, Yoon Kyung Lee¹*, Seoyeon Bae¹, Jihwan Kim¹,
Sangah Park², Moosung Kim³, Sowon Hahn¹ and Nam Soo Kim¹
Seoul National University¹, DeepNatural AI², Smilegate AI³
Motivation
• Creating dialogue dataset
  ▪ Multiple participants
  ▪ High degree of freedom
• Difficulties of crowdsourcing
  ▪ Researchers, moderators, and crowdworkers
  ▪ Considerate scheduling and conflict resolution required
• Persona dialogue
  ▪ Challenging and time-consuming project
  ▪ What should the task managers keep in mind?
Our study
• Setting
  ▪ Persona participants (actors) talk with user participants (workers)
  ▪ Actors are hired, while workers are crowdsourced
  ▪ User initiates the conversation, but persona leads the role
• Collection
  ▪ Recruiting workers from crowdsourcing platform
  ▪ Chat interface developed by the platform
Our study
• Project flow
Discussion
• Overview
  ▪ RQ1: What should be considered in accommodating the construction of a successful dialogue dataset?
    • The organizer should acknowledge that it differs a lot from usual conversation, and that it is crucial to handle unexpected and unwanted situations
  ▪ RQ2: What is the role of the moderator in large-scale dialogue dataset construction?
    • Resolve conflicts after building rapport with participants
    • Be aware of the points where participants feel uncomfortable, empathizing with and understanding their struggles
    • Handle recruitment and financial support, which affect the atmosphere
  ▪ RQ3: Will such considerations help reach the intended goal of construction?
    • Shown indirectly using survey results, textual analysis, and generative model-based experiments (to be further investigated)
Conclusion
• Dataset
  ▪ https://github.com/smilegate-ai/OPELA
• Acknowledgement
  ▪ Smilegate AI (funding and discussions)
  ▪ DeepNatural AI (crowdsourcing and moderation)
  ▪ Kudos to all our crowdworkers
• Full paper and analyses
  ▪ To be disclosed
Thank you


Editor's Notes

  1. Hi, we are a joint team from Seoul National University, DeepNatural AI, and Smilegate AI in South Korea. Today we are going to present our work-in-progress project on creating a persona dialogue corpus with hired persona actors and crowdsourced users.
  2. Our work starts from an innate difficulty of building a dialogue corpus: two or more participants are necessarily involved in the construction process, and the process has such a high degree of freedom that quality control of the output may not be feasible. Also, much corpus creation work these days is done in cooperation with crowdsourcing companies and their moderators, who recruit the workers and manage their overall workload and compensation. That is, the roles of researchers, moderators, and crowdworkers all differ slightly with respect to the goal and scale, which requires considerate scheduling and conflict resolution. In this light, we arrived at the question of how the construction of a persona dialogue corpus should be managed in practice.
  3. In our study, we let persona participants, namely the actors, talk with user participants, the workers. Actors are hired, while workers are crowdsourced. In every dialogue, the user initiates the conversation, but the persona actor leads the role while they talk. The collection proceeds by recruiting workers from the community of the crowdsourcing platform and using the chat interface developed by the platform to check and manage the progress of each conversation. Freedom of conversation was guaranteed as much as possible, but users who made actors feel uneasy or unsettled were reported and set aside from the project. After the collection was finished, we analyzed the surveys and interviews conducted with the participants and the moderator, and further analyzed the constructed data.
  4. Here we show the overall project flow. First, the researchers create guidelines for the conversation, and the platform and moderator recruit actors and workers based on these guidelines. Each actor plays the persona they chose at the start, and the user initiates the conversation with the persona based on the profile shown to them, but only if they pass the test prepared for user participants. Once started, a conversation lasts over 15 turns, and it is terminated by the actor or the worker when they feel fatigued or bored. Participants complete a survey after each conversation, and the reward is given afterward according to the amount of dialogue.
  5. After the whole collection phase, we briefly answered our research questions. First, in accommodating the construction of a successful persona dialogue dataset, the organizer should acknowledge that it differs a lot from usual conversation and that it is crucial to handle unexpected and unwanted situations, which can be mitigated by an experienced moderator. Looking further into this, the moderator should resolve conflicts after building rapport with participants, so that they can report whatever makes them uncomfortable, while at the same time empathizing with and understanding their struggles. Recruiting participants and managing finances is also a crucial role, in that such conditions can dampen or boost the atmosphere of the project. We also found that the whole process led to a high-quality persona dialogue dataset, which we recently disclosed online, but our work needs further investigation with more thorough experimental criteria and will be presented as a more mature work afterwards.
  6. Our dataset is currently available in the GitHub repository of our funding agency, Smilegate AI. We also thank DeepNatural AI for building the chat interface, recruiting participants from the worker pool, and moderating the whole process. Finally, we thank all our crowdworkers, including actors and users, who created all the dialogues and went through the surveys and interviews. Since our work is in progress, we will soon disclose the full analysis results with our full paper.
  7. Thank you for listening 
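
The project flow described in note 4 can be read as a small set of per-conversation rules. The Python sketch below is only an illustration of those rules, not the platform's actual implementation: the class name, field names, and `REWARD_PER_TURN` are hypothetical, the notes do not state the real pay scheme, and "over 15 turns" is interpreted here as strictly more than 15.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

MIN_TURNS = 15          # from the notes: a conversation lasts over 15 turns
REWARD_PER_TURN = 1.0   # hypothetical unit rate; the real pay scheme is not given


@dataclass
class Conversation:
    """One actor-user dialogue session (illustrative model only)."""
    user_passed_test: bool                      # users must pass a screening test
    turns: List[Tuple[str, str]] = field(default_factory=list)  # (speaker, text)
    ended_by: Optional[str] = None              # "actor" or "user" once terminated
    survey_done: bool = False                   # survey follows each conversation

    def add_turn(self, speaker: str, text: str) -> None:
        if not self.user_passed_test:
            raise PermissionError("user must pass the screening test first")
        if self.ended_by is not None:
            raise RuntimeError("conversation already terminated")
        if not self.turns and speaker != "user":
            raise ValueError("the user initiates the conversation")
        self.turns.append((speaker, text))

    def terminate(self, by: str) -> None:
        # Either side may end the conversation (e.g. when fatigued or bored).
        if by not in ("actor", "user"):
            raise ValueError("only the actor or the user can terminate")
        self.ended_by = by

    def meets_length_target(self) -> bool:
        # Assumption: "over 15 turns" means strictly more than 15.
        return len(self.turns) > MIN_TURNS

    def reward(self) -> float:
        # Assumption: paid after termination and survey, in proportion to turns.
        if self.ended_by is None or not self.survey_done:
            return 0.0
        return REWARD_PER_TURN * len(self.turns)
```

In this sketch the reward stays at zero until the conversation has been terminated and the survey completed, which mirrors the order of steps in the notes; a real deployment would presumably also record moderator reports and persona profiles.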