Introduction to standard setting (cutscores)

Ever wonder how a cutscore is set on a certification/licensure test? This is a very brief intro to the topic of standard setting, that is, how cutscores (passing points) are set on credentialing exams using scientifically-backed research and rigorous psychometrics. Some approaches include the modified-Angoff, Bookmark, and Contrasting Groups. Visit www.assess.com to learn more.

SETTING CUTSCORES FOR CERTIFICATION EXAMS
Nathan A. Thompson, Ph.D.
Vice President, ASC
Adjunct Faculty, University of Cincinnati

Why are cutscores necessary?
• As Glaser (1963) pointed out, the reason for the existence of many tests is to make decisions about people
  • Mastery: pass/fail on educational content
  • Credentialing: award or withhold a professional credential (certification, certificate, license)
  • Pre-employment: hire or not for a job (or deem eligible as a candidate)
  • University selection: admit or not to a university or program

• From Livingston (1980), discussing the rationale for cutscores:

Why are cutscores necessary?
• What does that mean?
  • Most of what we want to measure lies on a continuum (knowledge, intelligence) and not naturally in “states” (e.g., male/female)
  • So we need to set a cutscore (or cutscores) on the continuum to sort examinees into groups that reflect interpretations and meanings that are useful to us
  • Pass is “qualified” and Fail is “unqualified”

How do we set a cutscore?
• As the Livingston excerpt notes, all cutscores involve a level of subjectivity or arbitrariness
• The higher the stakes of the exam, the more we need to reduce the arbitrariness
• Standard setting methods differ in their level of objectivity
• A more objective method provides an anchor to validity and defensibility

How do we set a cutscore?
Approach                 Example                                       Arbitrariness
Arbitrary round number   70% of items                                  MOST
Quota                    Whatever passes 85% of people (z = -1.0)      MOST
Examinee-based           Borderline, Contrasting Groups                LEAST
Content-based            Angoff, Bookmark                              LEAST
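
To make the quota row concrete, here is a minimal sketch of how a quota cutscore could be computed. It assumes test scores are roughly normally distributed; the function name, the mean of 70, and the standard deviation of 10 are illustrative values, not taken from the slides. Under a normal distribution, z = -1.0 passes about 84% of examinees, which is close to the 85% quoted above.

```python
from statistics import NormalDist


def quota_cutscore(mean, sd, pass_rate):
    """Return (cutscore, z) such that about `pass_rate` of examinees pass.

    Assumes scores are normally distributed with the given mean and sd
    (an assumption made for illustration only).
    """
    # The cutscore sits at the (1 - pass_rate) quantile of the distribution.
    z = NormalDist().inv_cdf(1.0 - pass_rate)
    return mean + z * sd, z


# Hypothetical score distribution: mean 70, standard deviation 10.
cut, z = quota_cutscore(mean=70.0, sd=10.0, pass_rate=0.85)
print(f"z = {z:.2f}, cutscore = {cut:.1f}")
```

Note that nothing in this calculation refers to test content or to what a qualified candidate should know, which is why the table rates the approach as highly arbitrary.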

Examinee-based methods
• Borderline Method
  • Experts familiar with the content AND with all examinees identify those examinees they consider “borderline”
  • The mean or median score of those examinees is the cutscore
• Contrasting Groups Method
  • Experts familiar with the content AND with all examinees sort examinees into Pass and Fail groups (or an external criterion is used)
  • The point where the two score distributions cross is the cutscore
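
The two examinee-based methods can be sketched as follows. This is an illustration under stated assumptions, not a definitive procedure: the scores are made up, the function names are hypothetical, and the crossing point of the two distributions is approximated by the score that minimizes misclassifications rather than by smoothing the distributions and finding their intersection.

```python
from statistics import median


def borderline_cutscore(borderline_scores):
    """Cutscore = median score of the examinees judged 'borderline'."""
    return median(borderline_scores)


def contrasting_groups_cutscore(pass_scores, fail_scores):
    """Approximate the crossing point of the two score distributions.

    Assumption: the crossing point is taken to be the candidate cutscore
    with the fewest misclassified examinees (ties go to the lower score).
    """
    candidates = sorted(set(pass_scores) | set(fail_scores))
    best_cut, best_errors = None, None
    for cut in candidates:
        # Examinees scoring at or above the cut would pass the test.
        false_fails = sum(s < cut for s in pass_scores)
        false_passes = sum(s >= cut for s in fail_scores)
        errors = false_fails + false_passes
        if best_errors is None or errors < best_errors:
            best_cut, best_errors = cut, errors
    return best_cut


# Hypothetical data: expert judgments made WITHOUT seeing the test scores.
borderline = [58, 61, 62, 64, 67]
judged_fail = [48, 52, 55, 58, 60, 63]
judged_pass = [59, 64, 66, 70, 73, 78]

print(borderline_cutscore(borderline))
print(contrasting_groups_cutscore(judged_pass, judged_fail))
```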

Examinee-based methods
• Are conceptually appealing but have two large disadvantages:
  • Require examinees to take the test first, so pass/fail decisions cannot be made immediately after they finish the test
  • Require a way to assign examinees to groups WITHOUT test scores: either experts who are familiar with all examinees or some sort of “gold standard”
    • Example: For a practice test, results on the real test can be used as a gold standard to set the cutscore

Content-based methods
• The Angoff and Bookmark methods require experts to look at items rather than candidates
  • Bookmark: pilot all items, analyze difficulty statistics, order the items by difficulty in a booklet, and ask experts to place a bookmark
  • Angoff: all experts provide a rating from 0 to 100 for each item (the percentage of minimally competent candidates expected to answer it correctly); the average serves as the cutscore
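
The Angoff averaging step can be sketched as below. The ratings matrix is invented for illustration and the function name is hypothetical; in practice panels usually also discuss discrepant ratings and may re-rate, which this sketch omits.

```python
from statistics import mean


def angoff_cutscore(ratings):
    """Compute an Angoff cutscore from a raters-by-items matrix.

    ratings[r][i] is rater r's 0-100 rating for item i.
    Returns (percent_cut, raw_cut): the cutscore as a percentage of items
    and as a number-correct score.
    """
    n_items = len(ratings[0])
    # Average across raters for each item.
    item_means = [mean(rater[i] for rater in ratings) for i in range(n_items)]
    percent_cut = mean(item_means)
    # Each item contributes its expected proportion correct to the raw score.
    raw_cut = sum(item_means) / 100.0
    return percent_cut, raw_cut


# Hypothetical panel: 3 raters, 4 items.
ratings = [
    [60, 70, 80, 50],
    [70, 80, 90, 60],
    [50, 60, 70, 40],
]
percent_cut, raw_cut = angoff_cutscore(ratings)
print(f"{percent_cut:.1f}% of items, or {raw_cut:.2f} of {len(ratings[0])} points")
```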

Content-based methods
• The Angoff method is the most commonly used approach in certification testing and is therefore quite legally defensible
  • Biggest advantage: does not require the test to be administered for data
  • Can use data too, with the Beuk Compromise, to incorporate examinee-based aspects
  • The drawback is that it requires a group of subject matter experts to rate all items, which can take time

Content-based methods
• The Bookmark method has the advantage that a rating is not required for every item from every expert (which takes a lot of time)
• The drawback is that it requires all items to be delivered to a decent-sized sample in order to obtain item difficulty statistics, which might not be feasible
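
The data requirement of the Bookmark method can be illustrated with a small sketch that builds the ordered item booklet from pilot responses. It assumes classical difficulty statistics (proportion correct per item); operational Bookmark studies typically use item response theory instead, and the conversion of bookmark placements into a cutscore is not shown. The response data, item labels, and function name are hypothetical.

```python
def ordered_item_booklet(responses, item_ids):
    """Order items from easiest to hardest using pilot data.

    responses[e][i] is 1 if examinee e answered item i correctly, else 0.
    Returns (order, p_values), where p_values maps item id to proportion
    correct and order lists item ids from easiest to hardest.
    """
    n_examinees = len(responses)
    p_values = {
        item: sum(row[i] for row in responses) / n_examinees
        for i, item in enumerate(item_ids)
    }
    # A higher proportion correct means an easier item, so sort descending.
    order = sorted(item_ids, key=lambda item: p_values[item], reverse=True)
    return order, p_values


# Hypothetical pilot sample: 5 examinees, 3 items.
responses = [
    [1, 0, 1],
    [0, 0, 1],
    [1, 1, 1],
    [1, 0, 1],
    [0, 0, 0],
]
order, p_values = ordered_item_booklet(responses, ["Q1", "Q2", "Q3"])
print(order, p_values)
```

With only five examinees the proportions are very unstable, which is the practical point of the drawback above: the ordering is only trustworthy with a decent-sized sample.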
