This document discusses the validity and reliability of questionnaires. It defines validity as the ability of a questionnaire to measure what it intends to measure. There are several types of validity discussed, including content validity, face validity, criterion validity (concurrent and predictive), and construct validity. Steps for validating a questionnaire include evaluating face validity and getting expert feedback to establish content validity. Reliability is the ability to get consistent results and is measured through test-retest reliability, internal consistency (split-half), and inter-rater reliability. Establishing both validity and reliability is important for developing a high-quality questionnaire.
2. CONTENTS
Introduction
Steps in questionnaire designing
Validity
Concept of validity
Types of validity
Steps in questionnaire validation
Reliability
Types and measurement of reliability
Conclusion
References
3. INTRODUCTION
Questionnaire: an important and extensively used method of data collection
Advantages of questionnaire
Less expensive
Offers greater anonymity
Disadvantages
Application is limited
Response rate is low
Opportunity to clarify issues is lacking
4. Ideal requisites of a questionnaire:
Should be clear and easy to understand
Layout is easy to read and pleasant to eye
Sequence of questions easy to follow
Should be developed in an interactive style
Sensitive questions must be worded exactly
NOTE: The terms research instrument, measuring instrument, scale and test
in various parts of the seminar all refer to the questionnaire in this
context, and item refers to each question in a questionnaire
7. The concept of validity
Validity is the ability of an instrument to measure what it is intended to
measure.
Degree to which the researcher has measured what he has set out to
measure (Smith, 1991)
Are we measuring what we think we are measuring? (Kerlinger, 1973)
Extent to which an empirical measure adequately reflects the real
meaning of the concept under consideration (Babbie, 1989)
8. Why validity ?
Validity is assessed mainly to answer the following questions:
Is the research investigation providing answers to the research
questions for which it was undertaken?
If so, is it providing these answers using appropriate methods and
procedures?
10. Logical thinking
Justification of each question in relation to objective of study
Easy if questions relate to tangible matters
Difficult in situations where we are measuring attitude,
effectiveness of a program, satisfaction etc
Not everybody’s logic matches, and there is no statistical backing
13. CONTENT VALIDITY
Uses logical reasoning and hence easy to apply
Extent to which a measuring instrument covers a
representative sample of the domain of the aspects measured
Whether items and questions cover the full range of the
issues or problem being measured
14. FACE VALIDITY
The extent to which a measuring instrument appears valid
on its surface
Each question or item on the research instrument must have a
logical link with the objective
15. Face validity is not content validity. Why?
Face validity
Simply addresses whether a measuring instrument looks
valid
Not a validity in technical sense because it does not refer
to what is actually being measured rather what it appears
to measure
It has more to do with rapport and public relations than
with actual validity
16. Other aspects of content validity
Coverage of issue should be balanced
Each aspect should have similar and adequate representation
in questions
17. Problems associated with content validity
Based on subjective logic; no definitive conclusion can be
drawn or consensus reached
Extent to which questions reflect the objectives of the study
may differ. If wordings changed or question substituted,
magnitude of link changes
18. CRITERION VALIDITY
The extent to which a measuring instrument accurately
predicts behaviour or ability in a given area.
The external standard against which the instrument is compared is called
the ‘criterion’
It is of two types:
Predictive validity
Concurrent validity
19. Predictive validity
If the test is used to predict future performance
Eg: Entrance exam: performance on these tests correlates
with later performance in professional college
Eg: Written driving test
Eg: measurement of sugar exposure for caries development
20. Concurrent validity
If the test is used to estimate present performance or a person’s
ability at the present time, without attempting to predict future
outcomes
Professional college exam
Eg: driving test, pilot test
Eg: measurement of DMFT for caries experience
21. Problems in criterion validity
Cannot be used in all circumstances
Especially in the social sciences, where some conditions do not have a
relevant criterion
Eg: for measuring self-esteem, no criterion can be applied
22. CONSTRUCT VALIDITY
Most important type of validity
Assesses the extent to which a measuring instrument
accurately measures a theoretical construct it is designed to
measure
Measured by correlating performance on the test with
performance on a test for which construct validity has been
determined
Eg: a new index for measuring caries can be validated by
comparing its values with a standard index (like DMFT)
23. Another method is to show that scores on the new test differ
across people with different levels of the outcome being
measured
Eg: Establishing the validity of a new caries index by
applying it to different stages of dental caries and calculating
its accuracy
24. Summary of Validity
Type | CONTENT | CRITERION (CONCURRENT) | CRITERION (PREDICTIVE) | CONSTRUCT
What it measures | Whether the test covers a representative sample of the domains to be measured | The ability of the test to estimate present performance | The ability of the test to predict future performance | The extent to which the instrument measures a theoretical construct
How it is accomplished | Ask experts to assess the test to establish that the items are representative of the outcome | Correlate performance on the test with a concurrent behaviour | Correlate performance on the test with a behaviour in future | Correlate performance on the instrument with performance on an established instrument
26. FACE VALIDITY
Evaluate in terms of:
Readability
Layout and style
Clarity of wording
Feasibility
27. CONTENT VALIDITY
Two phases
Researcher: Conceptualization and domain analysis
Specify the full domain of content that is relevant to the issue
Sample specific areas from this domain
Put items/questions in a form that is testable
Experts: Enhancement of content of questionnaire (seven or more experts)
28. How do experts evaluate validity
Method 1: Average Congruency Percentage (ACP)
[Popham, 1978]
Each expert computes the percentage of questions he or she deems
relevant
Take the average across all experts
If the value is 90% or higher, the questionnaire is considered valid
Eg: 2 experts (Expert 1: 100%, Expert 2: 80%)
Then ACP = 90%
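The ACP arithmetic can be sketched in a few lines of Python. This is only an illustration of the averaging step described on this slide; the function name average_congruency_percentage is hypothetical, and the input is assumed to be one percentage per expert.

```python
def average_congruency_percentage(expert_percentages):
    """Mean of the per-expert percentages of items judged relevant."""
    if not expert_percentages:
        raise ValueError("at least one expert percentage is required")
    return sum(expert_percentages) / len(expert_percentages)


# The slide's example: Expert 1 = 100%, Expert 2 = 80%
acp = average_congruency_percentage([100, 80])
print(acp)  # 90.0
```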
29. Method 2: Content validity index [Martuza 1977]
Content validity Index for individual items (I-CVI)
Content Validity Index for the scale (S-CVI)
30. I-CVI
Panel of content experts asked to review the relevance of
each question on a 4-point Likert scale (minimum 3
maximum 10 experts)
1= not relevant
2= somewhat relevant
3= relevant
4= very relevant
Then, for each question, the number of experts giving a score of 3 or 4
is counted (3, 4: relevant; 1, 2: not relevant)
The proportion of experts rating the question relevant is calculated
Eg: If 4/5 experts give score 3 or 4: I-CVI = 0.80
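A minimal sketch of the I-CVI calculation, assuming each expert's rating of one question is stored as an integer from 1 to 4. The function name item_cvi and the sample ratings are illustrative only.

```python
def item_cvi(ratings):
    """I-CVI for one question: share of experts rating it 3 or 4."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in (1, 2, 3, 4) for r in ratings):
        raise ValueError("ratings must be on the 4-point scale (1 to 4)")
    relevant = sum(1 for r in ratings if r >= 3)  # 3 or 4 counts as relevant
    return relevant / len(ratings)


# Five experts, four of whom rate the question 3 or 4
print(item_cvi([4, 3, 4, 2, 3]))  # 0.8
```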
31. Criticisms of I-CVI
Collapses the experts’ multipoint assessment into two categories
(relevant and non-relevant)
Does not give inference on comprehensiveness of whole
questionnaire
Problem of chance agreement. To overcome that, Lynn
proposed:
Five or fewer experts: all must agree (I-CVI = 1.0)
Six or more experts: I-CVI should not be less than 0.78
32. S-CVI
The proportion of items on an instrument that achieved a
rating of 3 or 4 by all the content experts
Two approaches:
S-CVI/UA (universal agreement): proportion of items rated 3 or 4 by
every expert
S-CVI/Ave (average): mean of the I-CVI values across all items
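The two S-CVI approaches can be compared with a short Python sketch. It assumes the usual definitions (S-CVI/UA as the proportion of items rated 3 or 4 by every expert, S-CVI/Ave as the mean of the item-level I-CVIs); the function name scale_cvi and the ratings are made up for illustration.

```python
def scale_cvi(ratings_by_item):
    """Return (S-CVI/UA, S-CVI/Ave) from a list of per-item expert ratings."""
    if not ratings_by_item or any(not item for item in ratings_by_item):
        raise ValueError("every item needs at least one expert rating")
    n_items = len(ratings_by_item)
    # I-CVI per item: share of experts giving a rating of 3 or 4
    i_cvis = [sum(1 for r in item if r >= 3) / len(item)
              for item in ratings_by_item]
    # Universal agreement: items that every expert rated 3 or 4
    agreed = sum(1 for item in ratings_by_item if all(r >= 3 for r in item))
    return agreed / n_items, sum(i_cvis) / n_items


# Four items rated by three experts (hypothetical data)
ratings = [[4, 4, 3], [3, 4, 4], [4, 2, 3], [3, 3, 4]]
s_cvi_ua, s_cvi_ave = scale_cvi(ratings)
print(s_cvi_ua, round(s_cvi_ave, 3))
```

With a single disagreement on one item, S-CVI/UA drops by a full item while S-CVI/Ave drops only slightly, which is why the universal-agreement version is the stricter of the two.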
33. Which would be an effective measure here?
S-CVI/UA or S-CVI/Ave
35. Which to follow?
Report both I-CVI and S-CVI values rather than an unqualified
"CVI"
Report the range of I-CVI values
S-CVI/UA is the most stringent measure of validity, but it is
difficult to satisfy when many experts are validating.
In such situations S-CVI/Ave is used
36. CONSTRUCT VALIDITY
Method: Factor analysis
To examine empirically the interrelationship among items and to
identify clusters of items that share sufficient variation to justify
their existence as a factor or construct to be measured by the
instrument
Various items are gathered into common factors
Common factors are synthesized into fewer factors and then
relation between each item and factor is measured
Unrelated items are eliminated
38. RELIABILITY
Definition: the ability of an instrument to produce
reproducible results
Each time it is used, similar scores should be obtained
A questionnaire is said to be reliable if we get same/similar
answers repeatedly
Though it cannot be calculated exactly, it can be estimated
using correlation coefficients
39. Reliability measured in aspects of:
STABILITY
• Done to ensure that the same results are obtained
when used consecutively two or more times
• Test-retest method is used
INTERNAL CONSISTENCY
• To ensure all subparts of an instrument measure
the same characteristic (homogeneity)
• Split-half method
EQUIVALENCE
• Used when two observers study a single
phenomenon simultaneously
• Inter-rater reliability
40. Test-Retest reliability (for stability)
Test administered twice to the same participant at different
times
Used for things that are stable over time
Easy and straightforward approach
Useful for questionnaires, checklist, rating scales etc
Disadvantages
Practice effect (mainly for tests)
Intervals between administrations that are too short (effect of memory)
Some traits may change with time
41. Statistical calculation
Administration of instrument to a sample on two
different occasions
The two sets of scores are compared using the Pearson
correlation coefficient
42. Correlation coefficient
Measures the degree of relationship between two sets of
scores
Can range from -1 to +1
0 indicates absence of any linear relationship
Correlation coefficient | Strength of relationship
±0.70 to 1.00 | Strong
±0.30 to 0.69 | Moderate
±0.00 to 0.29 | None to weak
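As a rough illustration of the test-retest calculation, the sketch below computes the Pearson correlation between two administrations and labels it using the bands on this slide. The function names and the scores are hypothetical; the cut-offs at 0.3 and 0.7 are an assumption about how values falling between the listed bands should be assigned.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return sxy / math.sqrt(sxx * syy)


def strength(r):
    """Label |r| with the slide's bands (cut-offs at 0.3 and 0.7 assumed)."""
    size = abs(r)
    if size >= 0.7:
        return "strong"
    if size >= 0.3:
        return "moderate"
    return "none to weak"


# Hypothetical total scores of five respondents on two occasions
first = [10, 12, 14, 16, 18]
second = [11, 13, 13, 17, 19]
r = pearson_r(first, second)
print(round(r, 2), strength(r))
```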
43. Split-half reliability (homogeneity)
Split the contents of the questionnaire into two equivalent
halves; either odd/even number or first/second half
Correlate scores of one half with scores of the other
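Formula: $r = \frac{\sum (x - x')(y - y')}{\sqrt{\sum (x - x')^2 \, \sum (y - y')^2}}$
(x and y are the scores on the two halves; x’ and y’ are their means)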
But this r applies only to a half-length test, so to estimate the
reliability of the entire test, use the formula
44. $R' = \frac{2r}{1 + r}$
(r = coefficient of split half, R’ = coefficient of entire
test)
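The split-half procedure and the correction to full test length can be sketched as follows. This is an illustrative example only: it assumes an odd/even split, item scores stored as one row per respondent, and hypothetical function names and data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def full_test_reliability(r):
    """Step the half-test correlation up to full length: R' = 2r / (1 + r)."""
    return 2 * r / (1 + r)


def split_half(scores):
    """Odd/even split: returns (r between halves, R' for the whole test)."""
    odd_half = [sum(row[0::2]) for row in scores]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, ...
    r = pearson_r(odd_half, even_half)
    return r, full_test_reliability(r)


# Hypothetical answers of five respondents to a four-item questionnaire
scores = [[3, 4, 3, 4], [2, 2, 3, 2], [4, 5, 4, 5], [1, 2, 1, 1], [3, 3, 4, 3]]
r_half, r_full = split_half(scores)
print(round(r_half, 2), round(r_full, 2))
```

For any positive r below 1, the corrected value R' is larger than r, reflecting that a longer test is expected to be more reliable than either half alone.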
Cronbach’s alpha:
Another method of calculation using the formula:
$R = \frac{k}{k-1}\left(1 - \frac{\sum \sigma_i^2}{\sigma_y^2}\right)$
k = total number of items in list
$\sigma_i^2$ = variance of individual item i
$\sigma_y^2$ = variance of total test scores
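A minimal sketch of the Cronbach's alpha formula above, using Python's standard statistics module. It assumes one row of item scores per respondent and uses population variances throughout (an assumption; the ratio is the same with sample variances as long as one kind is used consistently). The function name and data are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha from rows of item scores (one row per respondent)."""
    k = len(scores[0])  # number of items
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each individual item across respondents
    item_variances = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of each respondent's total score
    total_variance = pvariance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of five respondents to a four-item questionnaire
scores = [[3, 4, 3, 4], [2, 2, 3, 2], [4, 5, 4, 5], [1, 2, 1, 1], [3, 3, 4, 3]]
print(round(cronbach_alpha(scores), 2))
```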
45. Inter-rater reliability (Equivalence)
Used when a single event is measured simultaneously and
independently by two or more trained observers
$R = \frac{\text{Number of agreements}}{\text{Number of agreements} + \text{Number of disagreements}}$
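The agreement ratio can be computed directly from two observers' paired judgements. The sketch below is illustrative: the function name agreement_ratio and the example codes are hypothetical, and it assumes both observers rated the same cases in the same order.

```python
def agreement_ratio(rater_a, rater_b):
    """Agreements divided by (agreements + disagreements) for paired ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    agreements = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return agreements / len(rater_a)  # every pair is an agreement or a disagreement


# Two observers classifying the same five teeth (hypothetical data)
observer_1 = ["caries", "sound", "caries", "sound", "sound"]
observer_2 = ["caries", "sound", "sound", "sound", "sound"]
print(agreement_ratio(observer_1, observer_2))  # 0.8
```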
46. Summary of Reliability
Type | TEST-RETEST | SPLIT-HALF | INTER-RATER
What it measures | Stability over time | Equivalency of items | Agreement between raters
How it is accomplished | Administer the same test to the same people at two different times | Correlate performance for a group of people on two equivalent halves of the same test | Have multiple researchers apply the same instrument and determine the percentage of agreement between them
47. Conclusion
Validated questionnaire
It is one that has undergone a validation procedure to
show that it accurately measures what it aims to,
regardless of who responds, when they respond, and to
whom they respond or whether it is self-administered, and
whose reliability has also been examined, thereby:
Reducing bias and ambiguities
Better quality of data and credible information
48. In a nutshell . . . .
A questionnaire can be reliable but invalid . . .
But a valid questionnaire is always reliable . . .
50. References
Linda Del Greco, Walop W, Richard H McCarthy. Questionnaire
development: 2. Validity and Reliability. CMAJ. 1987;136:699–700.
Sushil S, Verma N. Questionnaire validation made easy. Eur J Sci Res.
2010;46(2):172–8.
Polit DF, Cheryl Tatano Beck. The Content Validity Index: Are You Sure
You Know What’s Being Reported? Critique and Recommendations. Res
Nurs Health. 2006;29:489–97.
Reliability and Validity Module 6. Cengage Learning; 2010.
51. Rama B Radhakrishna. Tips for Developing and Testing
Questionnaires/Instruments. J Ext. 2007;35(1):710–4.
06Article04.pdf [Internet]. [cited 2015 Apr 7]. Available from:
http://www.uk.sagepub.com/salkind2study/articles/06Article04.pdf
pta_6871_6791004_64131.pdf [Internet]. [cited 2015 Apr 7]. Available
from:
http://cfd.ntunhs.edu.tw/ezfiles/6/1006/attach/33/pta_6871_6791004_6
4131.pdf
Questionnaire designing and validation [Internet]. [cited 2015 Apr 7].
Available from:
http://www.jpma.org.pk/full_article_text.php?article_id=3414
52. Suresh K Sharma. Nursing Research and Statistics. 1st ed. New Delhi:
Elsevier Saunders;
Edward G, Richard Zeller. Reliability and Validity Assessment. New Delhi:
SAGE publication; 1979.
Ranjit Kumar. Research Methodology - A step by step guide for beginners.
3rd ed. New Delhi: SAGE publication; 2012.
Articles from Dr. Joe