2. STEPS IN TEST DEVELOPMENT
• Test Conceptualization
• Test Construction
• Test Tryout
• Item Analysis
• Test Revision
3. STEP 1: TEST CONCEPTUALIZATION
• The process can often be traced to a thought:
• “There ought to be a test designed to measure _____ in
such and such way”
• An emerging phenomenon or pattern of behavior might
serve as the stimulus for test conceptualization
• Pilot Work: the generalized term for preliminary research
surrounding the creation of the test prototype
• Items must be subjected to pilot studies to evaluate whether or not they should be included in the final form of the test
4. STEP 1: TEST CONCEPTUALIZATION
• Criterion-Referenced: based on the amount of knowledge and/or the level of competence; employed in licensing
• Norm-Referenced: based on the performance of a specific group; employed in educational contexts to gauge mastery of material and an existing base of knowledge and skills
5. STEP 2: TEST CONSTRUCTION
• Scaling
• setting rules for assigning numbers in measurement
• the process by which a measuring device is designed and calibrated, and by which numbers (scale values) are assigned to different amounts of the trait, attribute, or characteristic being measured
6. STEP 2: TEST CONSTRUCTION
• Scaling Methods
• Rankings of Experts
• A panel of experts ranks the behavioral indicators and provides a meaningful numerical score
• Method of Equal-Appearing Intervals
• Developed by L. L. Thurstone (1929)
• A large number of true–false statements reflecting positive and negative attitudes is assembled
• The resulting items lie on an interval scale
• Reliability and validity analyses are important to determine the scale's appropriateness and usefulness
• An item with a large standard deviation (i.e., one on which judges disagree) would be dropped
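The judge-rating step above can be sketched in Python. This is a minimal illustration with hypothetical judge ratings, assuming each statement is rated on an 11-point favorability scale, the median rating serves as the item's scale value, and items with a large standard deviation (judge disagreement) are dropped:

```python
import statistics

# Hypothetical judge ratings (1-11 favorability) for three attitude statements.
judge_ratings = {
    "item_a": [2, 3, 2, 3, 2],    # judges agree: low SD, keep
    "item_b": [6, 6, 7, 6, 7],    # judges agree: keep
    "item_c": [1, 6, 11, 3, 9],   # judges disagree: high SD, drop
}

def scale_items(ratings, max_sd=2.0):
    """Return scale values (median judge rating) for items whose spread is acceptable."""
    kept = {}
    for item, rs in ratings.items():
        if statistics.stdev(rs) <= max_sd:
            kept[item] = statistics.median(rs)
    return kept

print(scale_items(judge_ratings))  # item_c is dropped for judge disagreement
```

The `max_sd` cutoff is an arbitrary illustration value; in practice the most ambiguous items are dropped relative to the rest of the pool.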
7. STEP 2: TEST CONSTRUCTION
• Scaling Methods
• Method of Absolute Scaling
• Obtaining a measure of absolute item difficulty based
on results for different age groups of testtakers
• Commonly used in group achievement and aptitude
testing
• Likert Scale
• Consists of ordered responses along a continuum
• Total score is obtained by adding the scores from
individual items
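Likert summation can be sketched as follows; the item names, responses, and the set of reverse-keyed items are hypothetical, assuming a 5-point scale in which negatively worded items are reverse-scored before summing:

```python
# Hypothetical 5-point Likert responses (1 = strongly disagree ... 5 = strongly agree).
responses = {"q1": 4, "q2": 5, "q3": 2, "q4": 1}
reverse_keyed = {"q3", "q4"}  # assumed negatively worded items

def likert_total(responses, reverse_keyed, scale_max=5):
    """Sum item scores, reverse-scoring negatively worded items."""
    total = 0
    for item, score in responses.items():
        if item in reverse_keyed:
            score = (scale_max + 1) - score  # reverse-score: 1<->5, 2<->4
        total += score
    return total

print(likert_total(responses, reverse_keyed))  # 4 + 5 + 4 + 5 = 18
```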
8. STEP 2: TEST CONSTRUCTION
• Scaling Methods (cont’d)
• Guttman Scales
• Respondents who endorse a stronger statement will also endorse the milder ones
• Method of Empirical Keying
• Test items are selected based entirely on how well they discriminate a criterion group from a normative sample
9. STEP 2: TEST CONSTRUCTION
• Scaling Methods (cont’d)
• Method of Rational Scaling
• All scale items correlate positively with each other and
with the total score for each scale
• Method of Paired Comparisons
• Testtakers are presented with pairs of stimuli which they
will be asked to compare
• Categorical Scaling
• Stimuli are placed into one of two or more alternative
categories that differ quantitatively with respect to
some continuum.
10. STEP 2: TEST CONSTRUCTION
• Writing Items
• Define clearly what you want to measure
• Generate an item pool
• Avoid exceptionally long items
• Keep the level of difficulty appropriate for those who will take the test
• Avoid double-barreled items that convey two or more
ideas at the same time
• Consider mixing positively and negatively worded items
11. STEP 2: TEST CONSTRUCTION
• Approaches to Test Construction:
• Rational (Theoretical) Approach
• Reliance on reason and logic over data collection for
statistical analysis
• Empirical Approach
• Reliance on data gathering to identify items that relate to the
construct
• Bootstrap
• Combination of the rational and empirical approaches: items are first written based on a theory, then an empirical approach is used to identify the items most highly related to the construct
12. STEP 2: TEST CONSTRUCTION
• Item Format: form, plan, structure, arrangement, and
layout of individual test items
• Multiple choice
• Matching
• Binary-choice (e.g., true–false)
• Short Answer
13. STEP 2: TEST CONSTRUCTION
• Scoring Models
• Cumulative
• the score is the number of responses that match the key, representing the amount of the construct being measured
• Class/Category
• the placement of an individual into a particular class for description or prediction
• Ipsative
• the indication of how an individual performed on one scale relative to other scales within the same test
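Cumulative scoring can be sketched as a simple match-against-key count; the answer key and the testtaker's responses below are hypothetical:

```python
# Cumulative scoring: the score is the count of responses matching the key.
answer_key = ["b", "d", "a", "c", "b"]   # hypothetical keyed answers
responses  = ["b", "d", "c", "c", "a"]   # one testtaker's answers

score = sum(given == keyed for given, keyed in zip(responses, answer_key))
print(score)  # 3 responses match the key
```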
14. STEP 3: TEST TRYOUT
• The test should be tried out on people who are similar in critical respects to the people for whom the test was designed
n = A × 5 to 10
A = number of items on the questionnaire
n = number of participants
• For validation purposes, there must be at least 20
participants each
• A good test helps in discriminating testtakers
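The A × 5 to 10 rule of thumb above translates directly into code; the item count used here is hypothetical:

```python
# Tryout sample-size rule of thumb: n = A x 5 to 10,
# where A is the number of items on the questionnaire.
def tryout_sample_size(n_items, per_item=5):
    """Suggested number of tryout participants for a questionnaire with n_items items."""
    return n_items * per_item

print(tryout_sample_size(30))      # lower bound: 150 participants
print(tryout_sample_size(30, 10))  # upper bound: 300 participants
```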
15. STEP 4: ITEM ANALYSIS
• Item-Difficulty Index
• Calculation of the proportion of the total number of testtakers who answered the item correctly
• The difficulty of the test can be found by averaging the
item-difficulty indices
• Item-Reliability Index
• Indication of the test’s internal consistency
• Use factor analysis
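The item-difficulty index and the averaging step above can be sketched as follows, with a hypothetical 0/1 score matrix (rows = testtakers, columns = items):

```python
# Item-difficulty index p = proportion of testtakers answering the item correctly.
scores = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
]

n_takers = len(scores)
# One p value per item (column); higher p means an easier item.
p_values = [sum(col) / n_takers for col in zip(*scores)]
print(p_values)                       # difficulty index for each item
print(sum(p_values) / len(p_values))  # average p = overall test difficulty
```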
16. STEP 4: ITEM ANALYSIS
• Item-Validity Index
• Indicates the degree to which a test item measures what the test intends to measure
• Can be calculated from the item-score standard deviation and the correlation between the item score and the criterion score
• Item-Discrimination Index
• How well an item discriminates between high scorers and low scorers
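A common formula for the item-discrimination index is d = (U − L) / n, where U and L are the numbers of testtakers in the upper and lower scoring groups who answered the item correctly and n is the size of each group. A sketch with hypothetical counts:

```python
# Item-discrimination index d = (U - L) / n.
def discrimination_index(upper_correct, lower_correct, group_size):
    """d near 1 favors high scorers; d near 0 means no discrimination;
    a negative d (low scorers outperform high scorers) flags a flawed item."""
    return (upper_correct - lower_correct) / group_size

print(discrimination_index(9, 3, 10))  # d = 0.6: item separates the groups well
print(discrimination_index(5, 5, 10))  # d = 0.0: item does not discriminate
```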
17. STEP 4: ITEM ANALYSIS
• Considerations:
• Guessing
• Item fairness
• Speed Tests
• Qualitative Item Analysis
• Comparison of individual test items with one another and
the test as a whole
18. STEP 4: ITEM ANALYSIS
• “Think Aloud” Test Administration
• Innovative approach to cognitive assessment by
having respondents verbalize thoughts as they occur
• Expert Panels
• Sensitivity Review
• Testtakers could be interviewed
19. STEP 5: TEST REVISION
• Popular culture changes
• Adequacy of test norms
• Changes in reliability or validity
• Theoretical modifications
20. STEP 5: TEST REVISION
• Cross-Validation
• Revalidation of a test on a sample of testtakers other
than those on whom test performance was originally
found to be a valid predictor of some criterion
• Co-validation
• A validation process conducted on two or more tests
using the same sample of testtakers
21. STEP 5: TEST REVISION
• Quality Assurance
• Anchor Protocol
• Produced by a highly authoritative scorer; designed to model scoring and to resolve any discrepancies that go along with it
• Scoring Drift
• Discrepancy between scoring in an anchor protocol and
another protocol
• Evaluate properties of existing tests and guide in revisions
• Determine measurement equivalence across populations
• Development of item banks