3. CHAPTER 1
REVIEW OF PRINCIPLES OF HIGH QUALITY ASSESSMENT
4. CLARITY OF LEARNING TARGETS
- Cognitive Targets
- Skills, Competencies and Abilities Targets
- Products, Outputs and Projects Targets
APPROPRIATENESS OF ASSESSMENT METHODS
- Written-Response Instruments
- Performance Test
- Product Rating Scales
- Oral Questioning
- Observation and Self Reports
PROPERTIES OF ASSESSMENT METHODS
- Validity
- Reliability
- Fairness
- Practicality and Efficiency
- Ethics in Assessment
5. Assessment can be made precise, accurate and dependable only if what is to be achieved is clearly stated and feasible.
6. We consider learning targets involving knowledge, reasoning skills, products and effects.
Learning targets need to be stated in behavioral terms, that is, terms that denote something which can be observed through the behavior of the student.
1. Cognitive Targets
2. Skills, Competencies and Abilities Targets
3. Products, Outputs and Project Targets
7. 1. COGNITIVE TARGETS
As early as the 1950s, Bloom (1954) proposed a hierarchy of educational objectives at the cognitive level. These are:
8. Knowledge
Refers to the acquisition of facts, concepts and theories.
Ex: Knowledge of historical facts, like the date of the EDSA revolution
Ex: Knowledge about the “discovery” of the Philippines by Magellan on March 16, 1521
9. Knowledge
Forms the foundation of all other cognitive objectives, for without knowledge it is not possible to move up to the next higher level of thinking skills in the hierarchy of educational objectives.
10. Comprehension
Refers to the same concept as “understanding”.
It is a step higher than mere acquisition of facts and involves a cognition or awareness of the interrelationships of facts and concepts.
Ex: The Spaniards ceded the Philippines to the Americans in 1898 (knowledge of facts).
In effect, the Philippines declared independence from Spanish rule only to be ruled by yet another foreign power, the Americans (comprehension).
11. APPLICATION
Refers to the transfer of knowledge from one field of study to another, or from one concept to another within the same discipline.
Ex: Pavlov’s classic experiment on dogs showed that animals can be conditioned to respond in a certain way to certain stimuli.
The same principle can be applied in the context of teaching and learning, in behavior modification for school children.
12. ANALYSIS
Refers to the breaking down of a concept or idea into its components and explaining the concept as a composition of these components.
Ex: Poverty in the Philippines, particularly at the barangay level, can be traced back to the low income levels of families in such barangays and the propensity for large households with an average of about 5 children per family.
(Note: Poverty is analyzed in the context of income and number of children.)
13. SYNTHESIS
Refers to the opposite of analysis and entails putting together the components in order to summarize the concept.
Ex: The field of geometry is replete with examples of synthetic lessons. From the relationships among the parts of a triangle, for instance, one can deduce that the sum of the angles of a triangle is 180°.
14. EVALUATION AND REASONING
Refers to valuing and judgment, or placing a “worth” on a concept or principle.
Students make judgments about the value of ideas, items, materials, and more.
Students are expected to bring in all they have learned to make informed and sound evaluations of material.
Key Words for the Evaluation Category: evaluate, appraise, conclude, criticize, critique
Ex: Watch a stage play and write a critique of the actors’ performance.
Evaluate: Are the actors professionals, amateurs, or students?
Criticize: Are the actors capable of dealing with the script’s requirements?
(Be fair to the actors in your assessment of their talents and the ...)
15. 2. SKILLS, COMPETENCIES AND ABILITIES TARGETS
Skills refer to specific activities or tasks that a student can proficiently do, e.g. skills in coloring, language skills.
Skills can be clustered together to form specific competencies, e.g. birthday card making.
Related competencies characterize a student’s ability. (DACUM, 2000)
16. Abilities can be roughly categorized into cognitive, psychomotor and affective abilities.
The ability to work well with others and to be trusted by every classmate (affective ability) is an indication that the student can most likely succeed in work that requires leadership abilities.
Other students are better at doing things alone, like programming and web designing (cognitive ability), and therefore they would be good at highly technical individualized work.
17. 3. PRODUCTS, OUTPUTS AND PROJECTS TARGETS
Tangible and concrete evidence of a student’s ability.
A clear target for products and projects needs to specify the level of workmanship of such projects, e.g. expert level, skilled level or novice level.
18. Once the learning targets are clearly set, it is now necessary to determine an appropriate assessment procedure or method.
19. B. APPROPRIATENESS OF ASSESSMENT METHODS
1. Written-Response Instruments
2. Product Rating Scales
3. Performance Test
4. Oral Questioning
5. Observation and Self Reports
20. 1. WRITTEN-RESPONSE INSTRUMENTS
OBJECTIVE TESTS
a. Multiple Choice
b. True-False
c. Matching or Short Answer
TESTS, ESSAYS, EXAMINATIONS AND CHECKLISTS
21. Objective tests are appropriate for assessing the various levels of the hierarchy of educational objectives.
They require a user to choose or provide a response to a question whose correct answer is predetermined.
Such a question might require a student to:
a. select a solution from a set of choices (multiple choice, true-false, matching)
b. identify an object or position (graphical)
c. supply brief numeric or text responses
22. 1. MULTIPLE CHOICE TEST
Multiple choice tests in particular can be constructed in such a way as to test higher order thinking skills.
What do we mean by higher-level thinking? Benjamin Bloom described six levels of cognitive behavior, listed here from the most complex (Evaluation) at the top to the most basic (Knowledge) at the bottom:
Evaluation
Synthesis
Analysis
Application
Comprehension
Knowledge
23. Students must evaluate multiple pieces of evidence and then apply that evidence to solve a problem: they must select the best action to take given the evidence.
Tim’s second grade teacher is concerned because of the following observations about Tim’s behavior in class:
- Withdraws from peers on the playground and during groupwork
- Often confuses syllables in words (e.g., says “mazagine” instead of “magazine”)
- Often confuses b and d, p and q, etc. when writing or recognizing letters
The teacher has arranged a meeting with Tim’s mother to discuss these concerns. Which of the following statements is best for the teacher to say to Tim’s mother?
a. Tim needs extra practice reading and writing problematic letters and words at home at least 30 minutes per day.
b. Please discuss the importance of schoolwork with Tim so that he will increase his efforts in classwork.
c. These are possible symptoms of dyslexia, so I would like to refer him to a specialist for diagnosis.
d. Please adjust Tim’s diet because he is most likely showing symptoms of ADHD due to food allergies.
Explanation: C is the best answer because the behaviors could be symptoms of dyslexia.
24. Essays, when properly planned, can test the student’s grasp of the higher level cognitive skills, particularly in the areas of application, analysis, synthesis, and judgment.
The questions must be precise and the parameters properly defined.
Ex: Write an essay about the first EDSA revolution.
(Give additional requirements to give focus.)
Focus on the main characters and their respective roles in the revolution.
25. 2. PRODUCT RATING SCALES
A teacher is often tasked to rate products, such as:
1. Book reports
2. Maps
3. Charts
4. Diagrams
5. Notebooks
6. Essays
7. Creative endeavors
26. Purpose
The CAT is often administered to determine a child’s readiness for promotion to a more advanced grade level and may also be used by schools to satisfy state or local testing requirements.
The test report includes a scale score, which is the basic measurement of how a child performs on the assessment.
Scale score: determined by the total number of test items correct or through item-pattern scoring.
27. One of the most frequently used measurement instruments is the checklist.
A performance checklist consists of a list of behaviors that make up a certain type of performance (e.g. using a microscope, typing a letter, solving a mathematics problem and so on).
It is used to determine whether or not an individual behaves in a certain (usually desired) way when asked to complete a particular task.
If a particular behavior is present when an individual is observed, the teacher places a check opposite it on the list.
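The checklist described above can be sketched as a small data structure. This is only an illustration: the class name `PerformanceChecklist` and the four microscope behaviors are assumptions made for the example, not items from the source.

```python
from dataclasses import dataclass, field


@dataclass
class PerformanceChecklist:
    """Behaviors that make up a performance; a check marks each one observed."""

    task: str
    behaviors: list
    observed: set = field(default_factory=set)

    def check(self, behavior):
        """Place a check opposite a behavior that was observed."""
        if behavior not in self.behaviors:
            raise ValueError(f"{behavior!r} is not on the checklist")
        self.observed.add(behavior)

    def report(self):
        """Return (behavior, present?) pairs in checklist order."""
        return [(b, b in self.observed) for b in self.behaviors]


# Hypothetical behaviors for the "using a microscope" task.
checklist = PerformanceChecklist(
    task="Using a microscope",
    behaviors=[
        "Carries the microscope with two hands",
        "Starts with the lowest-power objective",
        "Focuses with the coarse knob first",
        "Cleans the lens after use",
    ],
)

# The teacher observes two of the four behaviors.
checklist.check("Starts with the lowest-power objective")
checklist.check("Focuses with the coarse knob first")

for behavior, present in checklist.report():
    print("[x]" if present else "[ ]", behavior)
```

Rejecting behaviors that are not on the list keeps the record tied to the performance the teacher planned to observe.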
28. The ancient Greeks used oral questioning extensively as an assessment method. Socrates himself, considered the epitome (perfect example of a particular quality) of a teacher, was said to have handled his classes solely on the basis of questioning and oral interactions.
Oral questioning is an appropriate assessment method when the objectives are:
a.) to assess the student’s stock knowledge and/or
b.) to determine the student’s ability to communicate ideas in coherent (logical and consistent) verbal sentences.
Of particular significance are the student’s state of mind and feelings: anxiety and nervousness in making oral presentations could mask the student’s true ability.
29. Observation and self-reports are useful supplementary (additional) assessment methods when used in conjunction (connection) with oral questioning and performance tests.
A Tally Sheet is a device often used by teachers to record the frequency of student behaviors, activities or remarks.
A Self-checklist is a list of several characteristics or activities presented to the subjects of a study.
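A tally sheet is essentially a frequency count, so it can be sketched with a counter. The observation log below is hypothetical, made up only to show the idea.

```python
from collections import Counter

# Hypothetical observation log: one entry each time a behavior is seen.
observation_log = [
    "raises hand",
    "asks a question",
    "raises hand",
    "talks off-task",
    "raises hand",
    "asks a question",
]

# The tally sheet records how often each behavior occurred.
tally_sheet = Counter(observation_log)

for behavior, count in tally_sheet.most_common():
    print(f"{behavior:<18} {'|' * count} ({count})")
```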
30. C. PROPERTIES OF ASSESSMENT METHODS
1. Validity
2. Reliability
3. Fairness
4. Practicality and efficiency
5. Ethics in assessment
31. The quality of the assessment instrument and method used in education is very important, since the evaluation and judgment that the teacher gives on a student are based on the information he obtains using these instruments.
32. 1. VALIDITY
Defined as the instrument’s ability to measure what it purports (intends) to measure.
Also defined as referring to the appropriateness, correctness, meaningfulness and usefulness of the specific conclusions that a teacher reaches regarding the teaching-learning situation.
33. Content Validity
Refers to the content and format of the instrument.
How appropriate is the content?
How comprehensive?
How adequately does the sample of items or questions represent the content to be assessed?
Is the format appropriate?
Does the instrument logically get at the intended variable or factor?
34. Content and Format
- Consistent with the definition of the variable or factor to be measured
1. Do students have adequate experience with the type of task posed by the item?
35. Content and Format
2. Did the teachers cover sufficient material for most students to be able to answer the item correctly?
36. Content and Format
3. Does the item reflect the degree of emphasis received during instruction?
37. Two (2) Forms of Content Validity Table
FORM A: ITEM VALIDITY
CRITERIA                                                    ITEM
                                                            1   2   3   4   5   6
1. Material covered sufficiently.
2. Most students are able to answer item correctly.
3. Students have prior experience with the type of task.
4. Decision: Accept or Reject
FORM B: ENTIRE TEST
KNOWLEDGE/SKILLS AREA    ESTIMATED PERCENT OF INSTRUCTION    PERCENT OF ITEMS COVERED IN TEST
1. Knowledge
2. Comprehension
3. Application
4. Analysis
5. Synthesis
6. Evaluation
38. Two (2) Forms of Content Validity Table
FORM B: ENTIRE TEST
Based on Form B, adjustments in the number of items that relate to a topic can be made accordingly.
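The Form B comparison can be sketched as a short calculation: compute the percent of items per knowledge/skills area and compare it with the estimated percent of instruction. All figures below are hypothetical, and proportional allocation is only one simple assumption about how the adjustment might be made.

```python
# Hypothetical Form B data for a 40-item test.
instruction_percent = {
    "Knowledge": 30, "Comprehension": 20, "Application": 20,
    "Analysis": 10, "Synthesis": 10, "Evaluation": 10,
}
item_counts = {
    "Knowledge": 20, "Comprehension": 8, "Application": 4,
    "Analysis": 4, "Synthesis": 2, "Evaluation": 2,
}


def percent_of_items(counts):
    """Percent of test items that fall in each knowledge/skills area."""
    total = sum(counts.values())
    return {area: 100 * n / total for area, n in counts.items()}


def suggested_item_counts(percent_instruction, total_items):
    """Item counts proportional to the estimated percent of instruction.

    Rounded counts may not always add up to total_items.
    """
    return {
        area: round(total_items * pct / 100)
        for area, pct in percent_instruction.items()
    }


items_percent = percent_of_items(item_counts)
suggested = suggested_item_counts(instruction_percent, sum(item_counts.values()))

for area in instruction_percent:
    print(f"{area:<14} instruction {instruction_percent[area]:>3}%  "
          f"items {items_percent[area]:>5.1f}%  suggested items {suggested[area]}")
```

In this made-up case, Knowledge items take up half the test although Knowledge received an estimated 30 percent of instruction, so some of those items would be replaced by Application and higher-level items.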
39. While content validity is important, two (2) other types of validity also need to be considered:
1. Face Validity
The outward appearance of the test; the lowest form of test validity.
2. Criterion-Related Validity
The test item is judged against a specific criterion, for example by correlating the test with a known valid test.
40. 1. Face Validity
A test can be said to have face validity if it "looks like" it is going to measure what it is supposed to measure.
For instance, if you prepare a test to measure whether students can perform multiplication, and the people you show it to all agree that it looks like a good test of multiplication ability, you have shown the face validity of your test.
41. 2. Criterion-related Validity (more important type)
The test item is judged against a specific criterion.
It can also be measured by correlating the test with a known valid test (as a criterion).
A test also needs to possess construct validity.
A “construct” is another term for a factor, and we already know that a group of variables that correlate highly with each other forms a factor.
42. Construct
Let us say we are conducting a study on success in college. If we find out there is a high correlation between student grades in high-school math classes and their success in college (which can be measured by many possible variables),
43. Construct
we would say there is high criterion-related validity between the intermediate variable (grades in high-school math classes) and the ultimate variable (success in college).
Essentially, the grades students received in high-school math can be used to predict their success in college.
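The "high correlation" in this example is usually expressed as a Pearson correlation coefficient. The sketch below assumes that college GPA stands in for "success in college". Both the grades and the GPA values are invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data for five students.
hs_math_grades = [75, 80, 85, 90, 95]      # intermediate variable
college_gpa = [2.0, 2.5, 2.4, 3.2, 3.6]    # assumed measure of the ultimate variable

r = pearson_r(hs_math_grades, college_gpa)
print(f"criterion-related validity coefficient: r = {r:.2f}")
```

A coefficient close to 1 would support using the high-school math grades to predict the criterion. A value near 0 would not.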
44. 2. RELIABILITY
The reliability of an assessment method refers to its consistency. It is also a term that is synonymous with dependability or stability.
Stability or internal consistency as reliability measures can be estimated in several ways:
a. The Split-half Method (using the Spearman-Brown prophecy formula)
b. The Kuder-Richardson formula
45. a. The Split-half Method
Involves scoring two halves of a test separately for each person and then calculating a correlation coefficient for the two sets of scores.
The coefficient indicates the degree to which the two halves of the test provide the same results and, hence, describes the internal consistency of the test.
Splitting a test to estimate reliability.
Example: a 10-item test split into two (2) subtests
A. First half (items 1-5) vs. second half (items 6-10)
Responses on the first half may differ from those on the second half because of increasing item difficulty and fatigue.
B. Odd items vs. even items
This guarantees that each half will contain an equal number of items from the beginning, middle, and end of the original test.
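The odd-even split can be sketched as follows. The response matrix (five students, ten items, 1 = correct, 0 = wrong) is hypothetical, and the Pearson correlation is assumed as the correlation coefficient between the two half scores.

```python
from math import sqrt

# Hypothetical responses: one row per student, one column per item.
responses = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 1, 1, 0],
    [1, 0, 1, 1, 0, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 1, 0, 0],
]

# Items 1, 3, 5, ... sit at indices 0, 2, 4, ...; items 2, 4, 6, ... at 1, 3, 5, ...
odd_scores = [sum(row[0::2]) for row in responses]
even_scores = [sum(row[1::2]) for row in responses]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r_half = pearson_r(odd_scores, even_scores)
print("odd-item scores: ", odd_scores)
print("even-item scores:", even_scores)
print(f"correlation between halves: {r_half:.4f}")
```

Slicing with a step of two places items from the beginning, middle and end of the test in both halves, which is the guarantee the odd-even split is meant to provide.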
46. The reliability of the test is calculated using the Spearman–Brown prediction formula, also known as the Spearman–Brown prophecy formula.
The method was published independently by Spearman and Brown (1910).
\[ \text{Reliability of test} = \frac{2\, r_{\text{half}}}{1 + r_{\text{half}}} \]
where r_half = reliability of half of the test.
Charles Edward Spearman (Father of the True Score Theory of Reliability)
47. Correlation score between the two halves
Example: five (5) students
Test: 10 items; split-half: odd vs. even
Result: r_half = 0.1336
Spearman–Brown prophecy formula:
\[ R = \frac{2\, r_{\text{half}}}{1 + r_{\text{half}}} = \frac{2 \times 0.1336}{1 + 0.1336} = \frac{0.2672}{1.1336} = 0.2357 \]
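The same arithmetic can be checked with a few lines of code. The function name `spearman_brown` is chosen for this sketch, and the input 0.1336 is the half-test correlation from the example above.

```python
def spearman_brown(r_half):
    """Full-test reliability estimated from the correlation between two halves."""
    if r_half <= -1:
        raise ValueError("r_half must be greater than -1")
    return 2 * r_half / (1 + r_half)


reliability = spearman_brown(0.1336)
print(f"R = {reliability:.4f}")  # prints "R = 0.2357"
```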
48. Reliability
b. The Kuder-Richardson formula is the more frequently employed formula for determining internal consistency, particularly KR20 (more difficult to calculate; requires a computer program) and KR21.
Dr. Frederic Kuder (1903-2000) was one of the premier innovators of vocational assessments.
His 1938 Kuder Preference Record became one of the most-used career guidance instruments in schools and colleges, and was taken by more than a million people worldwide over the course of several decades.
49. Reliability
The Kuder-Richardson formulas:
\[ KR_{20} = \frac{K}{K-1}\left(1 - \frac{\sum pq}{\text{Variance}}\right) \]
where
K = number of items in the test,
p = proportion of students who answered the item correctly,
q = proportion of students who answered the item wrongly = 1 − p,
pq = variance of a single item scored dichotomously (right/wrong).
\[ KR_{21} = \frac{K}{K-1}\left(1 - \frac{M(K-M)}{K \cdot \text{Variance}}\right) \]
where
K = number of items on the test,
M = mean of the test,
Variance = variance of the test scores.
The mean of a set of scores is simply the sum of the scores divided by the number of scores; its variance is given by:
\[ \text{Variance} = \frac{\sum (X - M)^2}{n - 1} \]
where X is an individual score and n is the number of test takers.
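Both Kuder-Richardson formulas can be sketched directly from the definitions above. The response matrix is hypothetical (1 = correct, 0 = wrong), and the variance uses the n − 1 divisor as in the slide. The function names `kr20` and `kr21` are chosen for this example.

```python
from statistics import mean, variance

# Hypothetical responses: one row per student, one column per item.
responses = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 1, 1, 0],
    [1, 0, 1, 1, 0, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 1, 0, 0],
]


def kr20(rows):
    """KR20: uses the proportion correct (p) of every individual item."""
    k = len(rows[0])                      # K, number of items
    n = len(rows)                         # n, number of test takers
    totals = [sum(row) for row in rows]   # total score per student
    sum_pq = 0.0
    for item in range(k):
        p = sum(row[item] for row in rows) / n
        sum_pq += p * (1 - p)             # pq, variance of one item
    return (k / (k - 1)) * (1 - sum_pq / variance(totals))


def kr21(rows):
    """KR21: needs only K, the mean M and the variance of total scores."""
    k = len(rows[0])
    totals = [sum(row) for row in rows]
    m = mean(totals)
    return (k / (k - 1)) * (1 - m * (k - m) / (k * variance(totals)))


print(f"KR20 = {kr20(responses):.4f}")
print(f"KR21 = {kr21(responses):.4f}")
```

`statistics.variance` is the sample variance (divisor n − 1), which matches the variance formula given above. If every student has the same total score the variance is zero and neither coefficient is defined.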
50. Reliability
c. The Test-retest Method of estimating reliability
Reliability of a test may also mean the consistency of test results when the same test is administered at two different time periods.
The estimate of test reliability is then given by the correlation of the two test results.
The test results are affected by the amount of time between the two administrations.
The shorter the interval between the 1st and the 2nd administration of the test to the same set of examinees, the higher the correlation.
The longer the gap between the two tests, the lower the correlation.
51. 3. Fairness
An assessment procedure needs to be fair.
Students need to know exactly what the learning targets are and what method of assessment will be used.
If students do not know what they are supposed to be achieving, then they could get lost in the maze of concepts being discussed in the class.
Likewise, students have to be informed how their progress will be assessed in order to allow them to strategize and optimize their performance.
Assessment has to be viewed as an opportunity to learn rather than an opportunity to weed out poor and slow learners.
Fairness also implies freedom from teacher stereotyping (biases).
Ex: “Boys are better than girls in Math” or “Girls are better than boys in Language.”
52. 4. PRACTICALITY AND EFFICIENCY
Another quality of a good assessment procedure.
Practical in the sense that the teacher should be familiar with it and that it does not require too much time (implementable).
A complex assessment procedure tends to be difficult to score and interpret, resulting in a lot of misdiagnosis or too long a feedback period, which may render the test inefficient.
53. 5. ETHICS IN ASSESSMENT
The term “ethics” refers to questions of right and wrong.
When teachers think about ethics, they need to ask themselves if it is right to assess a specific knowledge or investigate a certain question.
Are there some aspects of the teaching-learning situation that should not be assessed?
54. ETHICS IN ASSESSMENT
Here are some situations in which assessment may not be called for:
- Requiring students to answer a checklist of their sexual fantasies;
- Asking elementary pupils to answer sensitive questions without the consent of their parents;
- Testing the mental abilities of pupils using an instrument whose validity and reliability are unknown.
55. ETHICS IN ASSESSMENT
When a teacher thinks about ethics, the basic question to ask in this regard is:
“Will any physical or psychological harm come to anyone as a result of assessment or testing?”
Naturally, no teacher would want this to happen to any of his/her students.
56. ETHICS IN ASSESSMENT
Ethical (behavior): “conforming to the standards of conduct of a given profession or group” (Webster)
The fundamental responsibility of a teacher, and the most important ethical consideration of all, is to do all in his/her power to ensure that participants in an assessment program are protected from physical or psychological harm, discomfort or danger that may arise due to the testing procedure.
Ex: “A teacher who wishes to test physical endurance may ask students to climb a very steep mountain, thus endangering them physically.”
57. ETHICS IN ASSESSMENT
Test results should be known only by the student concerned and the teacher.
58. Deception (3rd ethical issue in assessment)
There are instances in which it is necessary to conceal the objective of the assessment from the students in order to ensure fair and impartial results.
Teacher’s Special Responsibility
59. Finally, the temptation to assist certain individuals in class during assessment or testing is ever present.
In this case, it is best if the teacher does not administer the test himself if he believes that such a concern may, at a later time, be considered unethical.