1. Adventures in Crowdsourcing: Research at UT Austin & Beyond
Matt Lease
School of Information, University of Texas at Austin
ml@ischool.utexas.edu · @mattlease
2. Outline
• Foundations
• Work at UT Austin
• A Few Roadblocks
– Workflow Design
– Sensitive data
– Regulation
– Fraud
– Ethics
3. Amazon Mechanical Turk (MTurk)
• Marketplace for crowd labor (microtasks)
• Created in 2005 (still in “beta”)
• On-demand, scalable, 24/7 global workforce
5. Snow et al. (EMNLP 2008)
• MTurk annotation for 5 NLP tasks
– Affect recognition
– Word similarity
– Recognizing textual entailment
– Event temporal ordering
– Word sense disambiguation
• 22K labels for US $26
• High agreement between consensus labels and gold-standard labels
6. Sorokin & Forsyth (CVPR 2008)
• MTurk for Computer Vision
• 4K labels for US $60
7. Kittur, Chi, & Suh (CHI 2008)
• MTurk for User Studies
• “…make creating believable invalid responses as effortful as completing the task in good faith.”
8. Alonso et al. (SIGIR Forum 2008)
• MTurk for Information Retrieval (IR)
– Judge relevance of search engine results
• Various follow-on studies (design, quality, cost)
9. Social & Behavioral Sciences
• A Guide to Behavioral Experiments on Mechanical Turk
– W. Mason and S. Suri (2010). SSRN online.
• Crowdsourcing for Human Subjects Research
– L. Schmidt (CrowdConf 2010)
• Crowdsourcing Content Analysis for Behavioral Research: Insights from Mechanical Turk
– Conley & Tosti-Kharas (2010). Academy of Management.
• Amazon's Mechanical Turk: A New Source of Inexpensive, Yet High-Quality, Data?
– M. Buhrmester et al. (2011). Perspectives… 6(1):3-5.
11. What about data quality?
• Many CS papers on statistical methods
– Online vs. offline, feature-based vs. content-agnostic
– Worker calibration, noise vs. bias, weighted voting
– Work in my lab by Jung, Kumar, Ryu, & Tang
• Human factors also matter!
– Instructions, design, interface, interaction
– Names, relationship, reputation
– Fair pay, hourly vs. per-task, recognition, advancement
– For contrast with MTurk, consider Kochhar (2010)
• See Lease, HComp ’11
13. Kovashka & Lease, CrowdConf’10
14. Grady & Lease, 2010 (Search Eval.)
15. Noisy Supervised Classification
Kumar and Lease (2011a)
Our 1st study of aggregation (Fall’10)
Simple idea, simulated workers
Highlights concepts & open questions
16. Problem
• Crowd labels tend to be noisy
• Can reduce uncertainty via wisdom of crowds
– Collect & aggregate multiple labels per example (see the sketch below)
• How do we maximize learning (per unit of labeling effort)?
– Label a new example?
– Get another label for an already-labeled example?
See: Sheng, Provost & Ipeirotis, KDD’08
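Aside (not from the slides): the gain from aggregating multiple labels can be quantified. With k independent annotators, each correct with probability p, a majority vote is correct with the binomial tail probability. A minimal Python sketch:

    from math import comb

    def majority_vote_accuracy(p: float, k: int) -> float:
        """Probability that a majority of k independent annotators,
        each correct with probability p, yields the correct label.
        Assumes k is odd, so there are no ties."""
        return sum(comb(k, i) * p**i * (1 - p)**(k - i)
                   for i in range(k // 2 + 1, k + 1))

    print(majority_vote_accuracy(0.7, 1))  # 0.70: one label
    print(majority_vote_accuracy(0.7, 5))  # ~0.84: five labels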
17. Setup
• Task: Binary classification
• Learner: C4.5 decision tree
• Given
– An initial seed set of single-labeled examples (64)
– An unlimited pool of unlabeled examples
• Cost model
– Fixed unit cost for labeling any example
– Unlabeled examples are freely obtained
• Goal: Maximize learning rate (for labeling effort)
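A minimal sketch of this setup, using scikit-learn's DecisionTreeClassifier (criterion="entropy") as a stand-in for C4.5; the data here is synthetic, purely illustrative:

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)

    # Illustrative stand-in: X_pool/y_pool would come from a UCI dataset.
    X_pool = rng.normal(size=(5000, 10))
    y_pool = (X_pool[:, 0] + X_pool[:, 1] > 0).astype(int)

    # Initial seed set: 64 single-labeled examples; the rest form
    # the (effectively unlimited) pool of unlabeled examples.
    seed_idx = rng.choice(len(X_pool), size=64, replace=False)

    # C4.5 stand-in: an entropy-based decision tree.
    learner = DecisionTreeClassifier(criterion="entropy")
    learner.fit(X_pool[seed_idx], y_pool[seed_idx])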
18. Compare 3 methods: SL, MV, & NB
• Single labeling (SL): label a new example
• Multi-Labeling: get another label for an already-labeled example
– Majority Vote (MV): consensus by simple vote
– Naïve Bayes (NB): weight votes by annotator accuracy (sketch below)
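A minimal sketch of the two aggregation rules for binary labels, assuming annotator accuracies are known (as in this study's setup); the NB rule is the log-odds form of the Naïve Bayes posterior under a uniform class prior:

    import math
    from collections import Counter

    def majority_vote(labels):
        """Consensus by simple plurality over {0, 1} labels
        (assumes an odd number of labels, so no ties)."""
        return Counter(labels).most_common(1)[0][0]

    def naive_bayes_vote(labels, accuracies):
        """Weight each vote by its annotator's accuracy p: under a
        uniform prior, a vote for 1 adds log(p/(1-p)) to the class-1
        log-odds and a vote for 0 subtracts it."""
        log_odds = 0.0
        for y, p in zip(labels, accuracies):
            w = math.log(p / (1.0 - p))
            log_odds += w if y == 1 else -w
        return 1 if log_odds > 0 else 0

    labels = [1, 1, 0]
    accuracies = [0.6, 0.6, 0.95]  # third annotator is far more reliable
    print(majority_vote(labels))                 # 1
    print(naive_bayes_vote(labels, accuracies))  # 0: the 0.95 vote dominates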
19. Assumptions
• Example selection: random
– From pool for SL, from seed set for multi-labeling
• Fixed commitment to a single method a priori
• Balanced classes (accuracy, uniform prior)
• Annotator accuracies are known to the system
– In practice, these must be estimated from gold data
(Snow et al. ’08) or via EM (Dawid & Skene ’79); see the sketch below
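A minimal sketch of the EM alternative, using the "one-coin" simplification of Dawid & Skene (1979) for binary labels (my illustration, not the study's code): alternate between inferring a posterior over each example's true label and re-estimating each worker's accuracy.

    import numpy as np

    def one_coin_em(votes, n_iters=50):
        """One-coin simplification of Dawid & Skene (1979), binary labels.
        votes: (items, workers) array with entries in {0, 1, -1};
        -1 marks a missing label. Assumes a uniform class prior (as in
        this study), so the prior cancels in the log-odds.
        Returns (P(true label = 1) per item, accuracy per worker)."""
        n_items, n_workers = votes.shape
        acc = np.full(n_workers, 0.7)   # initial accuracy guess
        post = np.full(n_items, 0.5)    # posterior estimates
        for _ in range(n_iters):
            # E-step: posterior that each item's true label is 1
            for i in range(n_items):
                ll1 = ll0 = 0.0
                for j in range(n_workers):
                    if votes[i, j] == -1:
                        continue
                    p = acc[j]
                    ll1 += np.log(p if votes[i, j] == 1 else 1 - p)
                    ll0 += np.log(p if votes[i, j] == 0 else 1 - p)
                post[i] = 1.0 / (1.0 + np.exp(ll0 - ll1))
            # M-step: accuracy = expected agreement with the soft labels
            for j in range(n_workers):
                seen = votes[:, j] != -1
                if not seen.any():
                    continue
                agree = np.where(votes[seen, j] == 1,
                                 post[seen], 1 - post[seen])
                acc[j] = np.clip(agree.mean(), 1e-3, 1 - 1e-3)
        return post, acc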
20. Simulation
• Each annotator
– Has parameter p (prob. of producing correct label)
– Generates exactly one label
• Uniform distribution of accuracies U(min,max)
• Generative model for simulation
– Pick an example x (with true label y*) at random
– Draw annotator accuracy p ~ U(min,max)
– Generate label y ~ P(y | p, y*)
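A minimal Python sketch of this generative model for binary labels (my illustration of the slide's pseudocode):

    import numpy as np

    rng = np.random.default_rng(0)

    def simulate_label(y_true: int, p_min: float, p_max: float) -> int:
        """One simulated annotator produces one label: draw accuracy
        p ~ U(p_min, p_max), then answer correctly with probability p."""
        p = rng.uniform(p_min, p_max)
        return y_true if rng.random() < p else 1 - y_true

    # e.g. fairly accurate annotators, p ~ U(0.6, 1.0):
    labels = [simulate_label(1, 0.6, 1.0) for _ in range(5)]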
21. Evaluation
• Data: 4 datasets from the UCI ML Repository
(http://archive.ics.uci.edu/ml/datasets.html)
– Mushroom
– Spambase
– Tic-Tac-Toe
– Chess: King-Rook vs. King-Pawn
• Same trends across all 4, so we report first 2
• Random 70 / 30 split of data for seed+pool / test
• Repeat each run 10 times and average results
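A minimal sketch of this evaluation skeleton (synthetic data stands in for a UCI dataset; the real study grows the training set via SL/MV/NB, which is omitted here):

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    def run_once(X, y, seed):
        """One run: random 70/30 split into seed+pool vs. test,
        train the C4.5 stand-in, report test accuracy."""
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.3, random_state=seed)
        model = DecisionTreeClassifier(criterion="entropy").fit(X_tr, y_tr)
        return model.score(X_te, y_te)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 8))       # stand-in for, e.g., Mushroom
    y = (X[:, 0] > 0).astype(int)

    # Repeat each run 10 times and average, as on the slide.
    print(np.mean([run_once(X, y, seed) for seed in range(10)]))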
22. p ~ U(0.6, 1.0)
• Fairly accurate annotators (mean = 0.8)
• Little uncertainty -> little gain from multi-labeling
23. p ~ U(0.4, 0.6)
• Very noisy (mean = 0.5, random coin flip)
• SL and MV learning rates are flat
• NB wins by weighting more accurate workers
24. p ~ U(0.1, 0.7)
• Worsen accuracies further (mean = 0.4)
• NB virtually unchanged
• SL and MV predictions become anti-correlated
– We should actually flip their predictions…
25. Label flipping
• Is NB doing better due to how it uses accuracy, or simply because it uses more information?
• Average accuracy < 50% -> labels are usually wrong
– NB implicitly captures this; SL and MV do not
• Label flipping puts all methods on an even footing (sketch below)
• A simple case of bias vs. noise
– The issue is not whether predictions are correlated or anti-correlated with truth
– The issue is the strength of the correlation
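A minimal sketch of the flipping rule for binary predictions (my illustration):

    def flip_if_anticorrelated(pred: int, est_accuracy: float) -> int:
        """A source estimated to be wrong more often than right is
        anti-correlated with the truth; its output still carries
        signal, so flip it. What matters is the strength of the
        correlation, not its sign."""
        return pred if est_accuracy >= 0.5 else 1 - pred

    # e.g. a vote that is right only 40% of the time becomes a
    # 60%-accurate predictor after flipping.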
26. p ~ U(0.1, 0.7)
[Figure: accuracy (%) vs. number of labels (64–4096) for SL, MV, and NB on the Mushroom and Spambase datasets, shown without and with label flipping.]
27. Summary of study
• Detecting anti-correlated (bad) workers matters more than the aggregation model used
• Open Questions
– When accuracies are estimated (noisy)?
– With actual error distribution (real data)?
– With different learners or tasks (e.g. ranking)?
– With dynamic choice of new example or re-label?
– With active learning example selection?
– With imbalanced classes?
39. Workflow Management
• How should we balance automation vs. human computation? Who does what?
• Who’s the right person for the job?
• Juggling constraints on budget, scheduling, quality, effort…
40. What about sensitive data?
• Not all data can be publicly disclosed
– User data (e.g. AOL query log, Netflix ratings)
– Intellectual property
– Legal confidentiality
• Need to restrict who is in your crowd
– Separate channel (workforce) from technology
– Hot question for adoption at enterprise level
41. What about regulation?
• Wolfson & Lease (ASIS&T’11)
• As usual, technology is ahead of the law
– employment law
– patent inventorship
– data security and the Federal Trade Commission
– copyright ownership
– securities regulation of crowdfunding
• Take-away: don’t panic, but be mindful
– Understand the risks of “just-in-time” compliance
42. What about fraud?
• Some reports of robot “workers” on MTurk
– Artificial Artificial Artificial Intelligence
– Violates terms of service
• Why not just use a CAPTCHA?
44. Requester Fraud on MTurk
“Do not do any HITs that involve: filling in CAPTCHAs; secret shopping; test our web page; test zip code; free trial; click my link; surveys or quizzes (unless the requester is listed with a smiley in the Hall of Fame/Shame); anything that involves sending a text message; or basically anything that asks for any personal information at all—even your zip code. If you feel in your gut it’s not on the level, IT’S NOT. Why? Because they are scams...”
48. Identifying Workers (Uniquely)
• Need for identifiable workers
– Repeated labeling
– Recognizing “Master Workers”
• Today
– Platforms assign IDs intended to be unique
– Problem in practice, esp. with multiple platforms
– Sybil attacks
• Identity value
– If workers are interchangeable, identities are disposable
– If workers are distinguished, identities become valuable
– This reduces some types of attacks, increases others
49. What about ethics?
Fort, Adda, and Cohen (2011)
• “…opportunities for our community to deliberately value ethics above cost savings.”
• Suggests we focus on unpaid games; a narrow solution
Silberman, Irani, and Ross (2010)
• “How should we… conceptualize the role of these people who we ask to power our computing?”
• Power dynamics between parties
• “Abstraction hides detail”
52. Digital Dirty Jobs
• The Googler who Looked at the Worst of the Internet
• Policing the Web’s Lurid Precincts
• Facebook content moderation
• The dirty job of keeping Facebook clean
• Even linguistic annotators report stress & nightmares from reading news articles!
53. What about freedom?
• Vision: empowering worker freedom
– work whenever you want, for whomever you want
• Risk: people being compelled to perform work
– As crowdsourcing grows, greater $$$ at stake
– Digital sweat shops? Digital slaves?
– Prisoners used for gold farming
– We really don’t know much today
– Traction? Human trafficking at MSR Summit ’12
54. Thank You!
Students: Past & Present
– Catherine Grady (iSchool)
– Hyunjoon Jung (iSchool)
– Jorn Klinger (Linguistics)
– Adriana Kovashka (CS)
– Abhimanu Kumar (CS)
– Hohyon Ryu (iSchool)
– Wei Tang (CS)
– Stephen Wolfson (iSchool)
ir.ischool.utexas.edu/crowd
Support
– John P. Commons Fellowship
– Temple Fellowship
Matt Lease - ml@ischool.utexas.edu - @mattlease