RELIABILITY
DEFINITION OF TERMS
• Reliability: consistency in measurement
• Reliability Coefficient: the ratio between true score
variance on a test and the total variance
• Measurement Error: all factors associated with the process
of measuring a variable, other than the variable being
measured
• Random Error: error in measuring a targeted variable,
caused by unpredictable fluctuations and inconsistencies
of other variables in the measurement process
• Systematic Error: error in measuring a variable that is
typically constant or proportionate to what is presumed to
be the true value of the variable being measured
SOURCES OF ERROR
• Test Construction
• Variation among items within a test and between tests
• Test Administration
• Error variance arising during test administration can
alter the testtaker's attention or motivation
• Test Scoring
• Objective tests have well-documented reliability
• Subjective tests can be subject to scoring as a source of
error variance
CLASSICAL TEST THEORY
• It assumes that each person would have a true score
obtained if there were no errors in measurement
• Errors of measurement are random
• True scores will not change with repeated
applications of the same test
• X (observed score) = T (true score) + E (error)
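
The relationship between X = T + E and the reliability coefficient (true score variance over total variance) can be illustrated with a small simulation. This is only a sketch: the normal distributions, the mean of 100, and the standard deviations are assumed values chosen for illustration, and `simulate_ctt` is a hypothetical helper name.

```python
import random
import statistics

def simulate_ctt(n_people=5000, true_sd=10.0, error_sd=5.0, seed=0):
    """Simulate X = T + E and return var(T) / var(X)."""
    rng = random.Random(seed)
    # Assumed true scores: normally distributed around 100.
    true_scores = [rng.gauss(100.0, true_sd) for _ in range(n_people)]
    # Random error: mean zero and independent of the true score.
    observed = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    return statistics.pvariance(true_scores) / statistics.pvariance(observed)

reliability = simulate_ctt()
# With these assumed SDs the theoretical ratio is 100 / (100 + 25) = 0.8;
# the simulated value should land close to that.
print(round(reliability, 2))
```

Increasing `error_sd` while holding `true_sd` fixed lowers the ratio, which is the sense in which more measurement error means lower reliability.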
DOMAIN SAMPLING THEORY
• Uses a limited number of items to represent a larger
and more complicated construct
• Reliability is the ratio of the variance of observed scores
on the shorter test to the variance of the long-run true score
• The greater the number of items, the greater the
reliability
ITEM RESPONSE THEORY (LATENT TRAIT THEORY)
• Models the probability that a person with X ability
will be able to perform at a level of Y
• Discrimination: how an item differentiates among
people with higher or lower levels of whatever is
being measured
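
One common way to model this probability is a logistic item response function. The slide does not name a specific model, so the two-parameter logistic form below is an assumption, as are the parameter names `a` (discrimination) and `b` (difficulty) and the example values.

```python
import math

def p_correct(theta, a=1.0, b=0.0):
    """Two-parameter logistic model (assumed here for illustration).

    theta: the person's ability; a: item discrimination; b: item difficulty.
    Returns the probability of a correct response.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Compare a weakly and a strongly discriminating item for two test takers,
# one below and one above the item's difficulty.
low, high = -1.0, 1.0
gap_weak = p_correct(high, a=0.5) - p_correct(low, a=0.5)
gap_strong = p_correct(high, a=2.0) - p_correct(low, a=2.0)
print(gap_weak < gap_strong)
```

The larger gap for the item with the higher `a` is what discrimination means here: the item separates people with lower and higher levels of the trait more sharply.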
GENERALIZABILITY THEORY
• Universe score
• Facets: aspects of the testing situation, such as the number
of items, the amount of training the raters have had, and the
purpose of the test; the same score is expected only when the
facets are similar
• Generalizability Study: how much impact the
different facets have on the test score; how well
scores generalize across different situations
• Decision Study: the usefulness of test scores in helping
the test user make decisions
TEST-RETEST RELIABILITY
• It is an estimate of reliability obtained by correlating
pairs of scores from the same people on different
administrations
• It is appropriate when the test measures something
that is relatively stable over time
• Coefficient of Stability: the test-retest estimate when the
interval between administrations is greater than six months.
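
A test-retest estimate is a Pearson correlation between the two sets of scores. The sketch below computes it from scratch; the scores in `time1` and `time2` are made-up illustration data, not results from any real test.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores for the same six people on two administrations.
time1 = [12, 15, 9, 20, 17, 11]
time2 = [13, 14, 10, 19, 18, 12]
test_retest_r = pearson_r(time1, time2)
print(round(test_retest_r, 3))
```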
ALTERNATE-FORMS & PARALLEL-FORMS RELIABILITY
• Parallel-Forms Reliability: the extent to which item
sampling and other errors have affected the test
scores on versions of the same test; means and the
variances of observed test scores are equal.
• Alternate-Forms Reliability: an estimate of the extent to
which different forms of the test have been affected by
item sampling error or other error
• Coefficient of Equivalence: the degree of the
relationship between various forms of the test
SPLIT-HALF RELIABILITY
• Obtained by correlating two pairs of scores from
equivalent halves of a single test that was administered once
• Also known as odd-even reliability when the halves are the
odd- and even-numbered items
• Spearman-Brown Formula: estimates what the correlation
between the two halves would be if each half had been the
length of the whole test. It can also be used to estimate the
effect of shortening a test on its reliability and to
determine the number of items needed to attain a
desired level of reliability
$$r_{\text{corrected}} = \frac{2r}{1 + r}$$
- r is the Pearson's r between the two halves
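
The correction above is the special case of doubling the test length. As a sketch, the general Spearman-Brown form with a length factor `n` (not shown on the slide, added here as a labeled extension) covers the lengthening, shortening, and "how many items" uses mentioned in the bullet. The function names are hypothetical.

```python
def spearman_brown(r, n=2.0):
    """Predicted reliability when the test length is multiplied by n.

    With n = 2 this reduces to the split-half correction 2r / (1 + r).
    """
    return n * r / (1.0 + (n - 1.0) * r)

def length_factor_needed(r, target):
    """Factor by which the test must be lengthened to reach `target`."""
    return target * (1.0 - r) / (r * (1.0 - target))

half_r = 0.6                                  # correlation between the two halves
full_test = spearman_brown(half_r)            # corrected split-half estimate
shortened = spearman_brown(full_test, n=0.5)  # effect of halving the test again
factor = length_factor_needed(half_r, 0.75)   # lengthening needed to reach 0.75
print(full_test, shortened, factor)
```

Multiplying `factor` by the current number of items gives the approximate number of items needed for the desired reliability, assuming the added items are comparable to the existing ones.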
SPLIT-HALF RELIABILITY
• Cronbach's Alpha: used when the two halves have unequal
variances.
$$\alpha = \frac{2\left[\sigma_x^2 - \left(\sigma_{y_1}^2 + \sigma_{y_2}^2\right)\right]}{\sigma_x^2}$$
- $\sigma_x^2$: variance of scores on the whole test
- $\sigma_{y_1}^2$, $\sigma_{y_2}^2$: variances of scores on the two separate halves of the test
• Coefficient Omega: used to estimate the extent to
which all items measure the same underlying trait.
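
The two-half version of Cronbach's alpha can be computed directly from half scores. This sketch assumes the standard form in which the two half variances are summed and subtracted from the whole-test variance; the half scores are invented for illustration and `alpha_two_halves` is a hypothetical name.

```python
import statistics

def alpha_two_halves(half1, half2):
    """Alpha from two half-test scores: 2 * (var_x - (var_y1 + var_y2)) / var_x."""
    total = [a + b for a, b in zip(half1, half2)]
    var_total = statistics.pvariance(total)
    # Population variances are used throughout; sample variances work too,
    # as long as the same kind is used for the halves and the whole test.
    var_halves = statistics.pvariance(half1) + statistics.pvariance(half2)
    return 2.0 * (var_total - var_halves) / var_total

# Hypothetical odd-item and even-item half scores for five people.
odd_half = [5, 7, 3, 9, 6]
even_half = [4, 8, 3, 9, 5]
alpha_halves = alpha_two_halves(odd_half, even_half)
print(round(alpha_halves, 3))
```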
INTER-ITEM CONSISTENCY
• The degree of correlation among all items on a scale
• Can be used to measure the degree of homogeneity
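• Kuder-Richardson Formula 20 (KR-20): measures the inter-item
consistency for dichotomous items
$$KR_{20} = \frac{k}{k - 1}\left(1 - \frac{\sum pq}{\sigma^2}\right)$$
- k: number of items; p: proportion of test takers passing an item; q = 1 − p; $\sigma^2$: variance of total test scores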
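
As a sketch of how KR-20 is computed from raw responses, the function below takes a person-by-item table of 0/1 scores and sums p × q over the items. The response table is made-up illustration data and `kr20` is a hypothetical helper name.

```python
import statistics

def kr20(responses):
    """KR-20 for dichotomous items.

    responses: list of rows, one per person, each a list of 0/1 item scores.
    """
    n_people = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    var_total = statistics.pvariance(totals)
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n_people  # proportion passing item j
        sum_pq += p * (1.0 - p)                          # q = 1 - p
    return (k / (k - 1)) * (1.0 - sum_pq / var_total)

# Hypothetical responses of five people to four items.
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
kr20_value = kr20(answers)
print(round(kr20_value, 2))
```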
INTER-ITEM CONSISTENCY
• Coefficient Alpha: measures the inter-item
consistency of nondichotomous items; mean of all
possible split-half correlations
$$\alpha = \frac{N}{N - 1}\left(\frac{S^2 - \sum S_i^2}{S^2}\right)$$
• Average Proportional Distance: focuses on the
difference between item scores. A value of <0.20 is an
indicator of excellent internal consistency
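
Coefficient alpha can be computed from a person-by-item score table using the item variances and the variance of the total scores. The sketch below uses invented ratings on a 1 to 5 scale, and `cronbach_alpha` is a hypothetical helper name.

```python
import statistics

def cronbach_alpha(scores):
    """Coefficient alpha: (N / (N - 1)) * (S^2 - sum of item variances) / S^2.

    scores: list of rows, one per person, each a list of item scores.
    """
    n_items = len(scores[0])
    totals = [sum(row) for row in scores]
    var_total = statistics.pvariance(totals)
    # Population variances throughout, to stay consistent with var_total.
    sum_item_vars = sum(
        statistics.pvariance([row[j] for row in scores]) for j in range(n_items)
    )
    return (n_items / (n_items - 1)) * (var_total - sum_item_vars) / var_total

# Hypothetical ratings of five people on three items.
ratings = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 3],
]
alpha_value = cronbach_alpha(ratings)
print(round(alpha_value, 3))
```

When every item is scored 0 or 1, each item variance equals p × q, so this function returns the same value as the Kuder-Richardson formula for dichotomous items.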
INTER-SCORER RELIABILITY
• The degree of agreement between scorers with
regard to particular measures
• Fleiss Kappa: the proportion of potential agreement that
is actually achieved, after correcting for the agreement
expected by chance
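
A minimal sketch of the Fleiss kappa computation follows, assuming every subject is rated by the same number of raters. The input layout (`counts[i][j]` = number of raters placing subject i in category j) and the example counts are assumptions made for illustration.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa from a subjects-by-categories table of rater counts.

    counts[i][j] is the number of raters who assigned subject i to category j.
    Each row must sum to the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])
    # Share of all assignments that fell in each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each subject, averaged over subjects.
    p_obs = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_subjects
    # Agreement expected by chance.
    p_exp = sum(p * p for p in p_cat)
    return (p_obs - p_exp) / (1.0 - p_exp)

# Hypothetical data: four subjects, three raters, two categories.
rating_counts = [[3, 0], [2, 1], [0, 3], [1, 2]]
kappa = fleiss_kappa(rating_counts)
print(round(kappa, 3))
```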
STANDARD ERROR OF MEASUREMENT
• Provides a measure of precision of an observed test
score.
• The higher the test reliability is, the lower the
standard error of measurement
$$SEM = \sigma\sqrt{1 - r}$$
STANDARD ERROR OF MEASUREMENT
• Standard errors of measurement are used to construct
confidence intervals around specific observed scores,
which indicate the probability that the true score lies
within a given range of scores
• Z-scores are commonly used
• Reporting:
“Given the assessee obtained a score of ____, there are
two out of three chances that the assessee’s true score
would fall between ____ and ___”
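
The reporting statement can be filled in by computing the SEM as the standard deviation times the square root of one minus the reliability, then building a band of z SEMs around the observed score. This is a sketch under assumed values (standard deviation 15, reliability 0.91, observed score 100); `sem` and `score_band` are hypothetical helper names.

```python
import math

def sem(sd, reliability):
    """Standard error of measurement: sd * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)

def score_band(observed, sd, reliability, z=1.0):
    """Band of z standard errors of measurement around an observed score."""
    half_width = z * sem(sd, reliability)
    return observed - half_width, observed + half_width

# Assumed test statistics and observed score, for illustration only.
low, high = score_band(100, sd=15, reliability=0.91)
print(round(low, 1), round(high, 1))
```

With z = 1 the band covers roughly 68% of a normal distribution, which corresponds to the "two out of three chances" wording; z = 1.96 gives the usual 95% band.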
IMPROVING RELIABILITY
• Quality of test items
• Adequate sampling of content domains
• Longer assessment
• Develop a scoring plan
• Ensure validity
