SlideShare a Scribd company logo
1 of 17
A Brief
Introduction to
the 12 Steps of
Evaluation Data
Cleaning
Jennifer Ann Morrow, Ph.D.
University of Tennessee
Importance of Cleaning Data
• As evaluators we need our evaluation data to
be:
– Accurate
– Complete
– High quality
– Reliable
– Unbiased
– Valid
If We Don’t Clean Our Data
• Problems that can occur:
– Inaccurate/biased conclusions
– Increased error
– Reduced credibility
– Reduced generalizability
– Violation of statistical assumptions
1: Create a Data Codebook
• Contains all relevant information for your
evaluation project
• Suggestions for what to include:
– Electronic file names
– Variable names, variable labels, value labels
– Complete list of modified variables
– Citations for instrument sources
– Project diary
2: Create a Data Analysis Plan
• Your analysis plan should list each step you
will take when analyzing your data
• Suggestions for what to include:
– General instructions for data analysts
– List of datasets
– Evaluation questions
– Variables used for each analysis
– Specific analyses and graphics for each
evaluation question
3: Perform Initial Frequencies –
Round 1
• After organizing your codebook and analysis
plan you can now begin to start the data
cleaning process
• Conduct frequency analyses (frequencies,
percentages) for EVERY variable in your
evaluation dataset
• Suggestion:
– request a graphic (bar chart or histogram) for
each variable
4: Check for Coding Mistakes
• Coding errors are any values that are not
within the specified range for your variable
(e.g., you have a rating scale from 1-5 and
you have a value of 9)
• Suggestions:
– Compare all values to what is listed in your
codebook
– In many cases errors are unspecified missing
data values
5: Modify and Create Variables
• It is now time to modify your variables so
they can be used in your planned analyses
• Suggestions:
– Reverse code any variables that need to be
merged with others that are on the opposite
scale
– Recode any variables to match your codebook
– Create new variables (e.g., averages, total
scores) to be used for future analyses
6: Frequencies and Descriptives –
Round 2
• At this step you conduct frequency analyses
on every variable and descriptive analyses
on every continuous variable
• Suggestions:
– Review the following descriptives: mean,
median, mode, standard deviation, skewness,
kurtosis, minimum, and maximum
– Create standardized scores (i.e., Z-scores) for
every continuous variable
7: Search for Outliers
• Review your standardized scores and
histograms to check for outliers
• Outliers are scores that deviate greatly from
the mean (e.g., >/3.29/ standard
deviations) and potentially can create or
cover up statistical significance
• Suggestions:
– delete, transform, or alter (winsorize, trim,
modify) your outliers
8: Assess for Normality
• For many inferential statistics (e.g., analyses
of variance, regressions) your outcome
(dependent) variable should be normally
distributed (i.e., mean=median=mode)
• Suggestions:
– check to see if the values of your skewness and
kurtosis are greater than /2/
– Transform the variable, use a non-parametric
analysis, or modify your alpha level
9: Dealing with Missing Data
• You should always check to see if missing
data is random or non-random (i.e.,
patterns of missing data)
• Evaluation results can be misleading and
less generalizable
• Suggestions:
– Delete cases/variables with missing data,
estimate missing data, conduct analyses with
and without modifying variables
10: Examine Cell Sample Size
• For many of our analyses (e.g., group
difference statistics) we want to have equal
sample sizes in our cells of our design
• Unequal sample sizes lead to lower statistical
power and reduced generalizability
• Suggestions:
– Collapse categories within a variable, use a non-
parametric analysis, or apply a more stringent
alpha level
11: Frequencies and Descriptives –
The Finale
• Your data is now cleaned and ready to be
summarized!
• Conduct a final set of frequencies and
descriptives prior to conducting your
inferential statistics
• Suggestion:
– Use a variety of graphics and visual aids to
showcase your evaluation data for your clients
12: Assumption Testing
• For some inferential statistics (e.g.,
correlational analyses, group difference
analyses) you still need to address a few
additional assumptions in order to conduct
the analysis
• Suggestions:
– Some common assumptions are: homogeneity of
variance, linearity, independence of errors,
multicollinearity, and reliability
Some Helpful Resources
• YouTube videos
– http://www.youtube.com/watch?v=R6Cc5flsbsw
– http://www.youtube.com/watch?
v=5qhLDYr70MM&feature=channel&list=UL
• Websites
– http://clinistat.hk/internetresource.php
– http://pareonline.net/getvn.asp?v=9&n=6
• Software
– http://www.gnu.org/software/pspp/
– http://davidmlane.com/hyperstat/Statistical_analyses.html
Contact Information
Jennifer Ann Morrow, Ph.D.
Associate Professor of Evaluation, Statistics, and Measurement
Department of Educational Psychology and Counseling
University of Tennessee
Knoxville, TN 37996-3452
Email: jamorrow@utk.edu
http://web.utk.edu/~edpsych/eval_assessment/default.html

More Related Content

What's hot

Snowflake Data Governance
Snowflake Data GovernanceSnowflake Data Governance
Snowflake Data Governancessuser538b022
 
Managing Millions of Tests Using Databricks
Managing Millions of Tests Using DatabricksManaging Millions of Tests Using Databricks
Managing Millions of Tests Using DatabricksDatabricks
 
QuerySurge - the automated Data Testing solution
QuerySurge - the automated Data Testing solutionQuerySurge - the automated Data Testing solution
QuerySurge - the automated Data Testing solutionRTTS
 
The Data Driven University - Automating Data Governance and Stewardship in Au...
The Data Driven University - Automating Data Governance and Stewardship in Au...The Data Driven University - Automating Data Governance and Stewardship in Au...
The Data Driven University - Automating Data Governance and Stewardship in Au...Pieter De Leenheer
 
Enterprise Data World: Data Governance - The Four Critical Success Factors
Enterprise Data World: Data Governance - The Four Critical Success FactorsEnterprise Data World: Data Governance - The Four Critical Success Factors
Enterprise Data World: Data Governance - The Four Critical Success FactorsDATAVERSITY
 
Tool Kit: Business Analysis product (artefact) checklist
Tool Kit: Business Analysis product (artefact) checklistTool Kit: Business Analysis product (artefact) checklist
Tool Kit: Business Analysis product (artefact) checklistdesigner DATA
 
Snowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at ScaleSnowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at ScaleAdam Doyle
 
Data science life cycle
Data science life cycleData science life cycle
Data science life cycleManoj Mishra
 
Pressman ch-22-process-and-project-metrics
Pressman ch-22-process-and-project-metricsPressman ch-22-process-and-project-metrics
Pressman ch-22-process-and-project-metricsSeema Kamble
 
Data Modeling Best Practices - Business & Technical Approaches
Data Modeling Best Practices - Business & Technical ApproachesData Modeling Best Practices - Business & Technical Approaches
Data Modeling Best Practices - Business & Technical ApproachesDATAVERSITY
 
A Data Management Maturity Model Case Study
A Data Management Maturity Model Case StudyA Data Management Maturity Model Case Study
A Data Management Maturity Model Case StudyDATAVERSITY
 
Data Quality: A Raising Data Warehousing Concern
Data Quality: A Raising Data Warehousing ConcernData Quality: A Raising Data Warehousing Concern
Data Quality: A Raising Data Warehousing ConcernAmin Chowdhury
 
Data Management Services
Data Management ServicesData Management Services
Data Management ServicesBackOfficePro
 
Test case template
Test case templateTest case template
Test case templatesephalika
 
Software Engineering - chp3- design
Software Engineering - chp3- designSoftware Engineering - chp3- design
Software Engineering - chp3- designLilia Sfaxi
 
Big MDM Part 2: Using a Graph Database for MDM and Relationship Management
Big MDM Part 2: Using a Graph Database for MDM and Relationship ManagementBig MDM Part 2: Using a Graph Database for MDM and Relationship Management
Big MDM Part 2: Using a Graph Database for MDM and Relationship ManagementCaserta
 
Scalability Testing: A Complete Guide
Scalability Testing: A Complete GuideScalability Testing: A Complete Guide
Scalability Testing: A Complete GuideProfessional QA
 
Verification and Validation in Software Engineering SE19
Verification and Validation in Software Engineering SE19Verification and Validation in Software Engineering SE19
Verification and Validation in Software Engineering SE19koolkampus
 
Pressman ch-1-software
Pressman ch-1-softwarePressman ch-1-software
Pressman ch-1-softwareAlenaDion
 

What's hot (20)

Snowflake Data Governance
Snowflake Data GovernanceSnowflake Data Governance
Snowflake Data Governance
 
Managing Millions of Tests Using Databricks
Managing Millions of Tests Using DatabricksManaging Millions of Tests Using Databricks
Managing Millions of Tests Using Databricks
 
QuerySurge - the automated Data Testing solution
QuerySurge - the automated Data Testing solutionQuerySurge - the automated Data Testing solution
QuerySurge - the automated Data Testing solution
 
The Data Driven University - Automating Data Governance and Stewardship in Au...
The Data Driven University - Automating Data Governance and Stewardship in Au...The Data Driven University - Automating Data Governance and Stewardship in Au...
The Data Driven University - Automating Data Governance and Stewardship in Au...
 
Enterprise Data World: Data Governance - The Four Critical Success Factors
Enterprise Data World: Data Governance - The Four Critical Success FactorsEnterprise Data World: Data Governance - The Four Critical Success Factors
Enterprise Data World: Data Governance - The Four Critical Success Factors
 
Tool Kit: Business Analysis product (artefact) checklist
Tool Kit: Business Analysis product (artefact) checklistTool Kit: Business Analysis product (artefact) checklist
Tool Kit: Business Analysis product (artefact) checklist
 
Snowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at ScaleSnowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at Scale
 
Data science life cycle
Data science life cycleData science life cycle
Data science life cycle
 
Pressman ch-22-process-and-project-metrics
Pressman ch-22-process-and-project-metricsPressman ch-22-process-and-project-metrics
Pressman ch-22-process-and-project-metrics
 
Data Modeling Best Practices - Business & Technical Approaches
Data Modeling Best Practices - Business & Technical ApproachesData Modeling Best Practices - Business & Technical Approaches
Data Modeling Best Practices - Business & Technical Approaches
 
A Data Management Maturity Model Case Study
A Data Management Maturity Model Case StudyA Data Management Maturity Model Case Study
A Data Management Maturity Model Case Study
 
Data Quality: A Raising Data Warehousing Concern
Data Quality: A Raising Data Warehousing ConcernData Quality: A Raising Data Warehousing Concern
Data Quality: A Raising Data Warehousing Concern
 
Project Estimation
Project EstimationProject Estimation
Project Estimation
 
Data Management Services
Data Management ServicesData Management Services
Data Management Services
 
Test case template
Test case templateTest case template
Test case template
 
Software Engineering - chp3- design
Software Engineering - chp3- designSoftware Engineering - chp3- design
Software Engineering - chp3- design
 
Big MDM Part 2: Using a Graph Database for MDM and Relationship Management
Big MDM Part 2: Using a Graph Database for MDM and Relationship ManagementBig MDM Part 2: Using a Graph Database for MDM and Relationship Management
Big MDM Part 2: Using a Graph Database for MDM and Relationship Management
 
Scalability Testing: A Complete Guide
Scalability Testing: A Complete GuideScalability Testing: A Complete Guide
Scalability Testing: A Complete Guide
 
Verification and Validation in Software Engineering SE19
Verification and Validation in Software Engineering SE19Verification and Validation in Software Engineering SE19
Verification and Validation in Software Engineering SE19
 
Pressman ch-1-software
Pressman ch-1-softwarePressman ch-1-software
Pressman ch-1-software
 

Similar to Brief Introduction to the 12 Steps of Evaluation Data Cleaning

Mba ii rm unit-4.1 data analysis & presentation a
Mba ii rm unit-4.1 data analysis & presentation aMba ii rm unit-4.1 data analysis & presentation a
Mba ii rm unit-4.1 data analysis & presentation aRai University
 
Workshop on SPSS: Basic to Intermediate Level
Workshop on SPSS: Basic to Intermediate LevelWorkshop on SPSS: Basic to Intermediate Level
Workshop on SPSS: Basic to Intermediate LevelHiram Ting
 
Module 4 data analysis
Module 4 data analysisModule 4 data analysis
Module 4 data analysisILRI-Jmaru
 
Group 1 Report CRISP - DM METHODOLOGY.pptx
Group 1 Report CRISP - DM METHODOLOGY.pptxGroup 1 Report CRISP - DM METHODOLOGY.pptx
Group 1 Report CRISP - DM METHODOLOGY.pptxellamangapis2003
 
Lecture_4_Data_Gathering_and_Analysis.pdf
Lecture_4_Data_Gathering_and_Analysis.pdfLecture_4_Data_Gathering_and_Analysis.pdf
Lecture_4_Data_Gathering_and_Analysis.pdfAbdullahOmar64
 
Data Preparation and Processing
Data Preparation and ProcessingData Preparation and Processing
Data Preparation and ProcessingMehul Gondaliya
 
Final spss hands on training (descriptive analysis) may 24th 2013
Final spss  hands on training (descriptive analysis) may 24th 2013Final spss  hands on training (descriptive analysis) may 24th 2013
Final spss hands on training (descriptive analysis) may 24th 2013Tin Myo Han
 
Modeling and analysis
Modeling and analysisModeling and analysis
Modeling and analysisShwetabh Jaiswal
 
Design of experiments BY Minitab
Design of experiments BY MinitabDesign of experiments BY Minitab
Design of experiments BY MinitabDr-Jitendra Patel
 
Data analysis plan in medicine and nurse.pptx
Data analysis plan in medicine and nurse.pptxData analysis plan in medicine and nurse.pptx
Data analysis plan in medicine and nurse.pptxJuma675663
 
Data Refinement: The missing link between data collection and decisions
Data Refinement: The missing link between data collection and decisionsData Refinement: The missing link between data collection and decisions
Data Refinement: The missing link between data collection and decisionsVivastream
 
Data Analysis and Synthesis & Techniques of System.pptx
Data Analysis and Synthesis & Techniques of System.pptxData Analysis and Synthesis & Techniques of System.pptx
Data Analysis and Synthesis & Techniques of System.pptxTs. Heshalini Rajagopal
 
UNIT 4.pptx
UNIT 4.pptxUNIT 4.pptx
UNIT 4.pptxSreeLatha98
 
Data warehouse 16 data analysis techniques
Data warehouse 16 data analysis techniquesData warehouse 16 data analysis techniques
Data warehouse 16 data analysis techniquesVaibhav Khanna
 

Similar to Brief Introduction to the 12 Steps of Evaluation Data Cleaning (20)

lecture-8.pdf
lecture-8.pdflecture-8.pdf
lecture-8.pdf
 
Mba ii rm unit-4.1 data analysis & presentation a
Mba ii rm unit-4.1 data analysis & presentation aMba ii rm unit-4.1 data analysis & presentation a
Mba ii rm unit-4.1 data analysis & presentation a
 
Workshop on SPSS: Basic to Intermediate Level
Workshop on SPSS: Basic to Intermediate LevelWorkshop on SPSS: Basic to Intermediate Level
Workshop on SPSS: Basic to Intermediate Level
 
Module 4 data analysis
Module 4 data analysisModule 4 data analysis
Module 4 data analysis
 
Group 1 Report CRISP - DM METHODOLOGY.pptx
Group 1 Report CRISP - DM METHODOLOGY.pptxGroup 1 Report CRISP - DM METHODOLOGY.pptx
Group 1 Report CRISP - DM METHODOLOGY.pptx
 
Lecture_4_Data_Gathering_and_Analysis.pdf
Lecture_4_Data_Gathering_and_Analysis.pdfLecture_4_Data_Gathering_and_Analysis.pdf
Lecture_4_Data_Gathering_and_Analysis.pdf
 
Data Preparation and Processing
Data Preparation and ProcessingData Preparation and Processing
Data Preparation and Processing
 
Data mining
Data miningData mining
Data mining
 
Final spss hands on training (descriptive analysis) may 24th 2013
Final spss  hands on training (descriptive analysis) may 24th 2013Final spss  hands on training (descriptive analysis) may 24th 2013
Final spss hands on training (descriptive analysis) may 24th 2013
 
Modeling and analysis
Modeling and analysisModeling and analysis
Modeling and analysis
 
Design of experiments BY Minitab
Design of experiments BY MinitabDesign of experiments BY Minitab
Design of experiments BY Minitab
 
How to Think Like A Statistician
How to Think Like A StatisticianHow to Think Like A Statistician
How to Think Like A Statistician
 
Data analysis plan in medicine and nurse.pptx
Data analysis plan in medicine and nurse.pptxData analysis plan in medicine and nurse.pptx
Data analysis plan in medicine and nurse.pptx
 
Hm306 week 1 ppt 1
Hm306 week 1 ppt 1Hm306 week 1 ppt 1
Hm306 week 1 ppt 1
 
Hm306 week 1 ppt A
Hm306 week 1 ppt AHm306 week 1 ppt A
Hm306 week 1 ppt A
 
Data Analysis
Data AnalysisData Analysis
Data Analysis
 
Data Refinement: The missing link between data collection and decisions
Data Refinement: The missing link between data collection and decisionsData Refinement: The missing link between data collection and decisions
Data Refinement: The missing link between data collection and decisions
 
Data Analysis and Synthesis & Techniques of System.pptx
Data Analysis and Synthesis & Techniques of System.pptxData Analysis and Synthesis & Techniques of System.pptx
Data Analysis and Synthesis & Techniques of System.pptx
 
UNIT 4.pptx
UNIT 4.pptxUNIT 4.pptx
UNIT 4.pptx
 
Data warehouse 16 data analysis techniques
Data warehouse 16 data analysis techniquesData warehouse 16 data analysis techniques
Data warehouse 16 data analysis techniques
 

More from Jennifer Morrow

Using Collaborative and Expressive Writing Activities to Educate First-Year S...
Using Collaborative and Expressive Writing Activities to Educate First-Year S...Using Collaborative and Expressive Writing Activities to Educate First-Year S...
Using Collaborative and Expressive Writing Activities to Educate First-Year S...Jennifer Morrow
 
Preparing to go on the job market: Strategies for academic and non-academic j...
Preparing to go on the job market: Strategies for academic and non-academic j...Preparing to go on the job market: Strategies for academic and non-academic j...
Preparing to go on the job market: Strategies for academic and non-academic j...Jennifer Morrow
 
Collecting Longitudinal Evaluation Data in a College Setting
Collecting Longitudinal Evaluation Data in a College SettingCollecting Longitudinal Evaluation Data in a College Setting
Collecting Longitudinal Evaluation Data in a College SettingJennifer Morrow
 
What is program evaluation lecture 100207 [compatibility mode]
What is program evaluation lecture   100207 [compatibility mode]What is program evaluation lecture   100207 [compatibility mode]
What is program evaluation lecture 100207 [compatibility mode]Jennifer Morrow
 
APA Version 6 Quick Guide
APA Version 6 Quick GuideAPA Version 6 Quick Guide
APA Version 6 Quick GuideJennifer Morrow
 
How to create multiple choice questions
How to create multiple choice questionsHow to create multiple choice questions
How to create multiple choice questionsJennifer Morrow
 

More from Jennifer Morrow (6)

Using Collaborative and Expressive Writing Activities to Educate First-Year S...
Using Collaborative and Expressive Writing Activities to Educate First-Year S...Using Collaborative and Expressive Writing Activities to Educate First-Year S...
Using Collaborative and Expressive Writing Activities to Educate First-Year S...
 
Preparing to go on the job market: Strategies for academic and non-academic j...
Preparing to go on the job market: Strategies for academic and non-academic j...Preparing to go on the job market: Strategies for academic and non-academic j...
Preparing to go on the job market: Strategies for academic and non-academic j...
 
Collecting Longitudinal Evaluation Data in a College Setting
Collecting Longitudinal Evaluation Data in a College SettingCollecting Longitudinal Evaluation Data in a College Setting
Collecting Longitudinal Evaluation Data in a College Setting
 
What is program evaluation lecture 100207 [compatibility mode]
What is program evaluation lecture   100207 [compatibility mode]What is program evaluation lecture   100207 [compatibility mode]
What is program evaluation lecture 100207 [compatibility mode]
 
APA Version 6 Quick Guide
APA Version 6 Quick GuideAPA Version 6 Quick Guide
APA Version 6 Quick Guide
 
How to create multiple choice questions
How to create multiple choice questionsHow to create multiple choice questions
How to create multiple choice questions
 

Recently uploaded

Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxAshokKarra1
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomnelietumpap1
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
 
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxGrade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxChelloAnnAsuncion2
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 
Q4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxQ4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxnelietumpap1
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Celine George
 

Recently uploaded (20)

Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptx
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choom
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxGrade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 
Q4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxQ4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptx
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
 

Brief Introduction to the 12 Steps of Evaluation Data Cleaning

  • 1. A Brief Introduction to the 12 Steps of Evaluation Data Cleaning Jennifer Ann Morrow, Ph.D. University of Tennessee
  • 2. Importance of Cleaning Data • As evaluators we need our evaluation data to be: – Accurate – Complete – High quality – Reliable – Unbiased – Valid
  • 3. If We Don’t Clean Our Data • Problems that can occur: – Inaccurate/biased conclusions – Increased error – Reduced credibility – Reduced generalizability – Violation of statistical assumptions
  • 4. 1: Create a Data Codebook • Contains all relevant information for your evaluation project • Suggestions for what to include: – Electronic file names – Variable names, variable labels, value labels – Complete list of modified variables – Citations for instrument sources – Project diary
  • 5. 2: Create a Data Analysis Plan • Your analysis plan should list each step you will take when analyzing your data • Suggestions for what to include: – General instructions for data analysts – List of datasets – Evaluation questions – Variables used for each analysis – Specific analyses and graphics for each evaluation question
  • 6. 3: Perform Initial Frequencies – Round 1 • After organizing your codebook and analysis plan you can now begin to start the data cleaning process • Conduct frequency analyses (frequencies, percentages) for EVERY variable in your evaluation dataset • Suggestion: – request a graphic (bar chart or histogram) for each variable
  • 7. 4: Check for Coding Mistakes • Coding errors are any values that are not within the specified range for your variable (e.g., you have a rating scale from 1-5 and you have a value of 9) • Suggestions: – Compare all values to what is listed in your codebook – In many cases errors are unspecified missing data values
  • 8. 5: Modify and Create Variables • It is now time to modify your variables so they can be used in your planned analyses • Suggestions: – Reverse code any variables that need to be merged with others that are on the opposite scale – Recode any variables to match your codebook – Create new variables (e.g., averages, total scores) to be used for future analyses
  • 9. 6: Frequencies and Descriptives – Round 2 • At this step you conduct frequency analyses on every variable and descriptive analyses on every continuous variable • Suggestions: – Review the following descriptives: mean, median, mode, standard deviation, skewness, kurtosis, minimum, and maximum – Create standardized scores (i.e., Z-scores) for every continuous variable
  • 10. 7: Search for Outliers • Review your standardized scores and histograms to check for outliers • Outliers are scores that deviate greatly from the mean (e.g., >/3.29/ standard deviations) and potentially can create or cover up statistical significance • Suggestions: – delete, transform, or alter (winsorize, trim, modify) your outliers
  • 11. 8: Assess for Normality • For many inferential statistics (e.g., analyses of variance, regressions) your outcome (dependent) variable should be normally distributed (i.e., mean=median=mode) • Suggestions: – check to see if the values of your skewness and kurtosis are greater than /2/ – Transform the variable, use a non-parametric analysis, or modify your alpha level
  • 12. 9: Dealing with Missing Data • You should always check to see if missing data is random or non-random (i.e., patterns of missing data) • Evaluation results can be misleading and less generalizable • Suggestions: – Delete cases/variables with missing data, estimate missing data, conduct analyses with and without modifying variables
  • 13. 10: Examine Cell Sample Size • For many of our analyses (e.g., group difference statistics) we want to have equal sample sizes in our cells of our design • Unequal sample sizes lead to lower statistical power and reduced generalizability • Suggestions: – Collapse categories within a variable, use a non- parametric analysis, or apply a more stringent alpha level
  • 14. 11: Frequencies and Descriptives – The Finale • Your data is now cleaned and ready to be summarized! • Conduct a final set of frequencies and descriptives prior to conducting your inferential statistics • Suggestion: – Use a variety of graphics and visual aids to showcase your evaluation data for your clients
  • 15. 12: Assumption Testing • For some inferential statistics (e.g., correlational analyses, group difference analyses) you still need to address a few additional assumptions in order to conduct the analysis • Suggestions: – Some common assumptions are: homogeneity of variance, linearity, independence of errors, multicollinearity, and reliability
  • 16. Some Helpful Resources • YouTube videos – http://www.youtube.com/watch?v=R6Cc5flsbsw – http://www.youtube.com/watch? v=5qhLDYr70MM&feature=channel&list=UL • Websites – http://clinistat.hk/internetresource.php – http://pareonline.net/getvn.asp?v=9&n=6 • Software – http://www.gnu.org/software/pspp/ – http://davidmlane.com/hyperstat/Statistical_analyses.html
  • 17. Contact Information Jennifer Ann Morrow, Ph.D. Associate Professor of Evaluation, Statistics, and Measurement Department of Educational Psychology and Counseling University of Tennessee Knoxville, TN 37996-3452 Email: jamorrow@utk.edu http://web.utk.edu/~edpsych/eval_assessment/default.html