Evaluating Orphan and Vulnerable Children Outcomes: Innovative Methodology for, and Results of, Pilot Testing a New Toolkit
JENIFER CHAPMAN Futures Group/MEASURE Evaluation, USA // LISA PARKER Futures Group/MEASURE Evaluation, USA // STANLEY AMADIEGWU Futures Group/MEASURE Evaluation, Nigeria // SHEHU SALIHU Futures Group/MEASURE Evaluation, Nigeria // MWILA KANEMA Futures Group, Zambia
BACKGROUND
Despite high donor investment, the impact of orphan and vulnerable children (OVC) programs is
unclear. Part of the challenge has been a lack of standardized, tested measures and tools for
evaluating OVC program outcomes. To fill this gap, MEASURE
Evaluation produced three questionnaires to enable the collection of actionable data for
programs and comparative assessments of outcomes across interventions and regions. The
questionnaires were developed with strong stakeholder input and aligned with the 2012 PEPFAR
OVC Programming Guidance and the U.S. Government Children in Adversity Action Plan (see
Box 1).
Box 1—MEASURE Evaluation OVC Survey Toolkit
The MEASURE Evaluation OVC survey toolkit includes three questionnaires that measure
the following:
1. Household outcomes and caregiver well-being (administered to a caregiver).
2. Well-being among children aged 0–9 years (administered to a caregiver).
3. Well-being among children aged 10–17 years (administered to a child with
guardian consent and child assent).
The questionnaires were designed to measure changes in child, caregiver and household
well-being that can reasonably be attributed to PEPFAR-funded program interventions.
The questionnaires may be applied in evaluation, situation analysis, or in other research.
We pilot tested the questionnaires in 2013 in Zambia and Nigeria. In Zambia, we pilot tested
the questionnaires under the USAID-funded impact evaluation of savings and internal lending
communities (SILC) on child well-being, led by Futures Group with World Vision and Catholic
Relief Services. In Nigeria, the pilot test was conducted in partnership with the PACT Rapid and
Effective Action Combating HIV/AIDS (REACH) program and the Catholic Diocese of Lafia in
Akwanga, Nasarawa State.
OBJECTIVES
• Test the construct validity of some questions and concepts.
• Pre-test the reliability of scales.
• Determine whether any questions may be duplicative and, in some instances, enable a
choice between different question versions.
• Test the clarity of the question sets as an entirety.
• Assess the reliability of recall periods and, in some cases, compare different recall periods.
• Test field application of the tools.
METHODS
We applied a three-step methodology:
1. Validation of the translation of the questionnaire with data collectors in a training setting.
2. Cognitive interviews with potential respondents.
3. Pilot testing the full questionnaires at the household level.
These studies were approved by Health Media Labs, Inc., in the United States; the Biomedical
Research IRB in Lusaka; and the National Health Research Ethics Committee in Abuja.
1. Validation of Translation
During the data collector training for each pilot test, the trainer led a discussion on each
questionnaire. The purpose was to orient data collectors to the questionnaire, enable them to
seek clarification on measures, and validate the translation. Changes were made by the original
questionnaire translator.
2. Cognitive Interview
Cognitive interviewing is a qualitative research technique used to help design questionnaires
by determining whether respondents understand the questions and are able to produce
expected responses (de Leeuw, Borgers & Smits, 2004). In Zambia, we conducted cognitive
interviews with 12 adults and 16 children purposively sampled from the program beneficiary
list in one ward. In Nigeria, we selected a purposive sample of 12 caregivers and 16 children
from program beneficiary lists. After gaining participant consent, trained data collectors read
each question and recorded the participant's response. The data collector then probed the
participant's understanding of the question.
3. Household Survey Pilot Test of Complete Tools
We pilot tested the complete data collection tools and a few additional measures through
a household survey pre-test to assess respondents' understanding and question flow, and to
determine the time needed to complete a full questionnaire, including recruitment. We pilot
tested the data collection process from start to finish, including informed consent, application
of the Kish Grid (Kish, 1949) for sampling at the household level, and administration of all
survey tools among 21 households in Zambia and 20 households in Nigeria. Households were
purposively sampled from program beneficiary lists. In Nigeria, data collectors returned to 10
of the 20 households the day after the survey to conduct a reliability check of 16 key measures.
After gaining participant consent, trained data collectors administered the questionnaires.
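The Kish Grid step can be sketched in code. The following is a minimal illustration under stated assumptions: the table values and the age-only ranking are hypothetical simplifications of Kish's eight-table scheme, not the grids or procedures used in this pilot.

```python
# Illustrative sketch of Kish grid respondent selection (Kish, 1949).
# The table values below are hypothetical, loosely modeled on the
# eight-table scheme; they are NOT the grids used in this pilot.

# For each table, index = number of eligible members (1-6) minus 1,
# value = rank of the member to interview.
KISH_TABLES = {
    "A":  [1, 1, 1, 1, 1, 1],
    "B1": [1, 1, 1, 1, 2, 2],
    "B2": [1, 1, 2, 2, 2, 2],
    "C":  [1, 1, 2, 2, 3, 3],
    "D":  [1, 2, 2, 3, 4, 4],
    "E1": [1, 2, 3, 3, 3, 5],
    "E2": [1, 2, 3, 4, 5, 5],
    "F":  [1, 2, 3, 4, 5, 6],
}

def select_respondent(members, table_key):
    """Rank eligible members (here simply oldest first) and return
    the member the questionnaire's pre-assigned Kish table selects."""
    ranked = sorted(members, key=lambda m: -m["age"])
    n = min(len(ranked), 6)          # tables cover up to six members
    rank = KISH_TABLES[table_key][n - 1]
    return ranked[rank - 1]

household = [
    {"name": "caregiver", "age": 42},
    {"name": "uncle", "age": 55},
    {"name": "child", "age": 12},
]
chosen = select_respondent(household, "C")  # three eligible members
```

Because each questionnaire carries a pre-assigned table, the selection is reproducible in the field while remaining approximately random across the sample.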
RESULTS
Overall, items were well understood and the process of data collection was efficient.
Data collectors were successful in conducting informed consent, using the Kish Grid, and
administering the survey tools with both adults and children.
Based on results and lessons learned from piloting, we made changes to a few questions. For
example, to improve data quality, we reformulated the questions on access to money to focus
on a respondent's specific experience accessing money. We adapted questions asking whether
household expenditures had increased or decreased by splitting the original questions and
including additional skip patterns. We changed three of four questions on social support based
on a reliability analysis conducted on the full scale from the RAND Medical Outcomes Study.
Finally, we changed various recall periods to improve participant understanding, and expanded
response categories to include frequent responses not originally captured.
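The reliability analysis referred to above is typically an internal-consistency check such as Cronbach's alpha. A minimal sketch of that computation, using made-up binary responses rather than pilot data:

```python
# Cronbach's alpha: internal consistency of a multi-item scale.
# Responses below are invented for illustration, not pilot results.

def cronbach_alpha(items):
    """items: list of per-item response lists (same respondents, same order)."""
    k = len(items)
    n = len(items[0])

    def variance(xs):
        mean = sum(xs) / len(xs)
        return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

    item_var = sum(variance(item) for item in items)
    totals = [sum(item[i] for item in items) for i in range(n)]
    return (k / (k - 1)) * (1 - item_var / variance(totals))

# Four hypothetical social-support items, binary responses, six respondents
responses = [
    [1, 1, 0, 1, 0, 1],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 0, 1, 0, 1],
    [1, 1, 1, 1, 0, 1],
]
alpha = cronbach_alpha(responses)
```

Items whose removal raises alpha, or that contribute little shared variance, are candidates for revision or omission, which is the logic behind trimming three of the four social support questions.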
Of note, Likert scales were not well understood and were replaced in the final tools with binary
response options.
We pilot tested a variety of measures and questions that were omitted from the final tools
because either respondents did not understand the questions or concepts in the questions
well or they demonstrated poor variability. These included an extended asset schedule, two social
capital questions, the Hope Scale (Snyder et al., 1997) and two additional hope questions, two
self-esteem questions from the Rosenberg Self-Esteem Scale (Rosenberg, 1965), the Generalized
Self-Efficacy Scale (Schwarzer & Jerusalem, 1995), a stand-alone self-efficacy question, three
parental self-efficacy questions from the Parental Stress Scale (Berry & Jones, 1995), and the
Strengths and Difficulties Questionnaire for children (Goodman, 1997).
There were very few questionnaire revisions after the Nigeria pilot, suggesting that the
majority of the problematic questions had been corrected prior to this second round of
piloting.
Pre-testing raised concern about how to best address child-headed households, as the
caregiver questionnaire is tailored to an adult respondent, and both child questionnaires
assume a previous interview with the adult caregiver.
CONCLUSIONS
Pilot testing of new tools is critical, yet rarely done. Results from these pilot tests informed
revision of the questionnaires, which are now ready for public use. We highly recommend this
three-pronged approach to pilot testing, with some suggested adaptations to the Cognitive
Interviews (Parker et al., 2014), to colleagues seeking to validate their data collection tools.
REFERENCES
Berry JO, & Jones WH (1995). The Parental Stress Scale: Initial psychometric evidence. Journal of
Social and Personal Relationships, 12, 463–472.
de Leeuw ED, Borgers N, & Smits A. (2004). Pretesting questionnaires for children and
adolescents. In S Presser, et al., (Eds.), Methods for testing and evaluating survey
questionnaires. Hoboken, NJ: Wiley & Sons.
Goodman R (1997) The Strengths and Difficulties Questionnaire: A Research Note. Journal of
Child Psychology and Psychiatry, 38, 581–586.
Kish L (1949). A procedure for objective respondent selection within the household. Journal of
the American Statistical Association 44(247): 380–87.
Parker L, Chapman J, Amadiegwu S, & Salihu S (2014). Using cognitive interviews in Nigeria to pre-test
child, caregiver, and household well-being survey tools for orphan and vulnerable children
programs. AIDS 2014, Melbourne, Australia, July 20–25, 2014.
RAND. Medical Outcomes Study Social Support Survey. Arlington, VA: RAND Corporation.
Available at: http://www.rand.org/health/surveys_tools/mos/mos_mentalhealth.html
(last accessed April 2013).
Rosenberg M (1965). Society and the Adolescent Self-Image. Princeton, NJ: Princeton University
Press.
Schwarzer R, & Jerusalem M (1995). Generalized Self-Efficacy Scale. In J. Weinman, S. Wright, &
M. Johnston (Eds.), Measures in health psychology: A user's portfolio. Causal and control beliefs
(pp. 35–37). Windsor, England: NFER-NELSON.
Snyder CR, et al. (1997). The development and validation of the Children's Hope Scale. Journal of
Pediatric Psychology, 22(3), 399–421.
ACKNOWLEDGEMENTS
This research has been supported by the President’s Emergency Plan for AIDS Relief (PEPFAR)
through the United States Agency for International Development (USAID) under the terms of
MEASURE Evaluation cooperative agreement GHA-A-00-08-00003-00, which is implemented by
the Carolina Population Center, University of North Carolina at Chapel Hill with Futures Group,
ICF International, John Snow, Inc., Management Sciences for Health, and Tulane University.
Views expressed are not necessarily those of PEPFAR, USAID, or the United States government.
CONTACT
Jenifer Chapman, Senior OVC Advisor, MEASURE Evaluation
Email: jchapman@futuresgroup.com
Website: measureevaluation.org