The document discusses the importance of data probity, or ensuring the integrity and quality of data. It notes that data harvested today must be able to answer unknown future questions. It advocates for transparency in research through pre-publishing protocols and data, using open licenses, and supporting peer review. The key aspects of data probity discussed are having an identifiable source, transparent methods, publication before analysis, maintaining point data before aggregation, and having a repeatable, auditable trail.
2. www.vperemen.com, CC BY-SA 4.0, via Wikimedia Commons
The screaming need for data
Who is affected?
How are they affected?
What can we do about it?
What might happen in response?
How do we recover afterwards?
Will things ever be the same?
6. Badics, CC BY-SA 3.0, via Wikimedia Commons
The intersection of Policy & Politics
Data, analysis & the evidence illusion
Post-hoc support & plausible deniability
Competing self-interest
Changing circumstance, changing evidence
7. Harvesting longitudinal data is not joyful
Instant answers don’t happen instantly
Longitudinal source data are incoherent
Data probity takes method, practice & time
Esayas Ayele, CC BY-SA 4.0, via Wikimedia Commons
8. CDC Global, CC BY-SA 2.0, via flickr
What we talk about when we talk about probity
Identifiable source
Transparent methods
Publication before analysis
Point data before aggregation
Repeatable, auditable trail
9. Transparency in practice
Pre-publication of research protocol, methods & data
Systematic review
Open licences
No trust without support for peer review & validation
Yakuzakorat, CC BY 4.0, via Wikimedia Commons
10. Photo by Clay Banks on Unsplash
Protocols & ambiguity
Maintain your source
Pick sensible defaults
Make no destructive changes
Document every action
Expect to be audited
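The points above (maintain your source, document every action, expect to be audited) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `AuditLog` class, its field names, and the sample CSV bytes are hypothetical and are not taken from the presentation.

```python
import hashlib
import json
from datetime import datetime, timezone


def checksum(data: bytes) -> str:
    """Return the SHA-256 hex digest of the raw source bytes."""
    return hashlib.sha256(data).hexdigest()


class AuditLog:
    """Append-only record of actions; entries are never edited or removed."""

    def __init__(self):
        self._entries = []

    def record(self, action: str, source_sha256: str, note: str = "") -> dict:
        entry = {
            "step": len(self._entries) + 1,
            "at": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "source_sha256": source_sha256,
            "note": note,
        }
        self._entries.append(entry)
        return entry

    def dump(self) -> str:
        """Serialise the trail so it can be published alongside the data."""
        return json.dumps(self._entries, indent=2)


# Hypothetical raw source, kept exactly as received.
raw = b"unit_id,status\nA1,occupied\nA2,\n"

log = AuditLog()
log.record("received", checksum(raw), "stored unmodified")
log.record("defaulted", checksum(raw), "blank status treated as occupied")
```

Because every entry carries the checksum of the untouched source, an auditor can confirm that later steps refer to the same bytes that were originally received.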
11. Photo by Lubo Minar on Unsplash
Uncertainty & the distant future
Data harvested today must answer unknown questions to unknown problems in an unknown – but different – future environment
12. Poverty is expensive
A legacy of futility risks becoming self-perpetuating
Olga Ernst, CC BY-SA 4.0, via Wikimedia Commons
14. Where are businesses compared to where we think they are?
Does a change in tax rates cause business closure?
How should we measure energy consumption?
Who wins & loses from COVID commute changes?
16. Photo by Sylvie Tittel on Unsplash
Protocol with sensible defaults
1. All units are occupied & pay full rates.
2. When data are ambiguous, refer to 1.
3. Ask for data, even when you know they’ll say no.
4. Never delete anything.
5. Document everything.
6. When in doubt, ask the data source.
7. Accept the weird but keep looking for answers.
8. Ensure the process is public.
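One possible reading of rules 1, 2, 4 and 5 is sketched below: ambiguous fields fall back to "occupied, paying full rates", the source records are left untouched, and each defaulted field is flagged. The `Unit` record, its `rate_fraction` field, and the `apply_defaults` helper are assumptions made for illustration, not the presenter's actual schema.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Unit:
    unit_id: str
    occupied: Optional[bool]        # None means the source was ambiguous
    rate_fraction: Optional[float]  # share of full rates paid; None if unknown
    defaulted: tuple = ()           # names of fields filled in by default


def apply_defaults(unit: Unit) -> Unit:
    """Return a new Unit with defaults applied; the input is not modified."""
    filled = []
    occupied = unit.occupied
    if occupied is None:
        occupied = True             # rule 1: assume the unit is occupied
        filled.append("occupied")
    rate = unit.rate_fraction
    if rate is None:
        rate = 1.0                  # rule 1: assume full rates are paid
        filled.append("rate_fraction")
    return replace(unit, occupied=occupied, rate_fraction=rate,
                   defaulted=tuple(filled))


# Hypothetical source records; the second one is ambiguous.
source = [Unit("A1", True, 0.5), Unit("A2", None, None)]
working = [apply_defaults(u) for u in source]
```

Keeping `source` and `working` as separate collections means nothing is deleted, and the `defaulted` tuple documents exactly where a default replaced a missing value.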
23. Photo by Sylvie Tittel on Unsplash
Sqwyre data probity protocol
1. Instant answers don’t happen instantly
2. Data probity takes method, practice & patience
3. Maintain all source data
4. Pick sensible & transparent defaults
5. Transformations must be documented
6. Make no destructive changes
7. Point data before aggregation or analysis
8. Open licences to encourage use & reuse
9. Collaborate to make the data wanted & useful
10. Be ready to explain & be audited
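Item 7 (point data before aggregation or analysis) can be illustrated with a small sketch. The records, field names, and the `count_by` helper are invented for the example. The idea is that aggregates are derived views, so the same point data can be regrouped later for questions nobody has asked yet.

```python
from collections import Counter

# Hypothetical point records: one entry per business premises.
# A tuple is used so the collection itself cannot be appended to or reordered.
points = (
    {"id": "p1", "area": "N1", "sector": "retail"},
    {"id": "p2", "area": "N1", "sector": "office"},
    {"id": "p3", "area": "E2", "sector": "retail"},
)


def count_by(records, key):
    """Aggregate by one field without touching the underlying records."""
    return Counter(record[key] for record in records)


by_area = count_by(points, "area")      # today's question
by_sector = count_by(points, "sector")  # a later, different question
```

Had only the per-area totals been stored, the per-sector view could not have been recovered.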
24. Hansueli Krapf, CC BY-SA 3.0, via Wikimedia Commons
Know your business