Presented at the Open Data Science meetup London (January 2016). Fully leveraging the potential of the Internet of Things requires devices to exchange information. Unfortunately, most data remains in vendor silos. This talk explains how the life sciences have tackled similar issues, and why closed, vendor-specific systems may miss out.
3. modified, image from http://www.householdappliancesworld.com
health
management
air conditioning
smart heating
communications
security
entertainment
lighting control
weather monitoring
room occupancy
5. ‣ Everything is connected
‣ Big, noisy, often
unstructured data
www.thingslearn.com
Analytics, context integration, machine learning
and predictive modelling for the IoT.
6. 0 clean shirt left
+
washing machine estimates
97% of your last pack of
powder used
+
it’s Wednesday, 23:55
+
the last four Thursdays
had a morning business
meeting
+
the car is parked 20 m from
a shop
+
last retail activity: 8 sec ago
Send immediate text
reminder to pick up
washing powder + send
tweet from @BorisHouse
“need identified” AND
“notification appropriate”
Actionable insight.
From everything.
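The reasoning on this slide can be read as a conjunction of two derived conditions, "need identified" AND "notification appropriate". What follows is only a minimal sketch in Python under stated assumptions: the `Context` fields, the function names, and the thresholds (95%, 50 m, 60 s) are hypothetical and not from the talk; only the example values (0 shirts, 97%, 20 m, 8 sec) come from the slide.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Context:
    """Snapshot of the signals on the slide; all field names are hypothetical."""
    clean_shirts: int
    powder_used_pct: float
    # Folds "it's Wednesday, 23:55" and "the last four Thursdays had a
    # morning business meeting" into a single derived flag (assumption).
    meeting_tomorrow_morning: bool
    car_to_shop_m: float
    secs_since_retail_activity: float


def need_identified(ctx: Context) -> bool:
    # No clean shirt, powder almost used up, and a meeting is expected.
    return (ctx.clean_shirts == 0
            and ctx.powder_used_pct >= 95.0  # assumed threshold
            and ctx.meeting_tomorrow_morning)


def notification_appropriate(ctx: Context) -> bool:
    # The user is close to a shop and was active very recently (assumed limits).
    return ctx.car_to_shop_m <= 50.0 and ctx.secs_since_retail_activity <= 60.0


def decide(ctx: Context) -> List[str]:
    # Act only when both derived conditions hold.
    if need_identified(ctx) and notification_appropriate(ctx):
        return ["text: pick up washing powder", "tweet from @BorisHouse"]
    return []


ctx = Context(
    clean_shirts=0,
    powder_used_pct=97.0,
    meeting_tomorrow_morning=True,
    car_to_shop_m=20.0,
    secs_since_retail_activity=8.0,
)
print(decide(ctx))
```

The point of the sketch is that every input comes from a different device or service, so the rule can only be evaluated if those sources exchange data.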
9. THE CONSUMER IOT FAILS FOR
ITS LACK OF CONNECTEDNESS
Matt Hatton, Machina Research
The BLN IoT ‘14
Internet replaces wire
It’s all about the
connectedness
M2M
consumer
IoT
10. • Computational biologist
• Research group leader
• Lecturer in genome biology
• Advisor at
Who is
@BorisAdryan
was
EXPECTATION MANAGEMENT
11. DNA = storage of a blueprint
RNA = ‘active copy’ of DNA
protein = the building blocks
of cells and tissues
LIFE AS WE KNOW IT
transcription
translation
Gregor Johann Mendel,
exhibited in the Library at the NIMR
12. • Reading DNA information
• Determining “the sequence
of a gene” was a PhD in the
early 1980s
• Data processing was mainly
transcribing the observation
into a research paper
BIOLOGY THEN AND NOW
SEQUENCE INFORMATION
Sanger sequencing
ca. 1980
http://www.eplantscience.com
15. 2
6 ATP
• Signal transduction and
metabolic pathways
• Characterisation of proteins
and substrates that mediate
chemical reactions
• Nobel prize material
BIOLOGY THEN AND NOW
BIOCHEMISTRY
16. • We know about 250k
metabolites
• 100k protein structures
• on the order of 10k
different chemical
reactions
BIOLOGY THEN AND NOW
BIOCHEMISTRY
17. ‣ Everything is connected
‣ Big, noisy, often
unstructured data
‣ We had studied how
biological entities depend
on each other
18. LIFE SCIENCE STRATEGIES
DON’T WORK IN THE IOT
- There is no commonly accepted
- ‘catalogue’ of things,
- ‘ontology’ of things,
- ‘data format’ of things,
- ‘meta data’ for things.
- Most businesses are driven by revenue, not
long-term strategic vision
- Service providers have no need to publish
- Data can be highly personal (cheap excuse)
unless they’re
19. WE HAVE A PROBLEM WITH
KNOWLEDGE REPRESENTATION
22. CURRENT GOVERNMENT
INVESTMENTS INTO GENE
ONTOLOGY
NIH alone spent $44,616,906 on the
ontology structure since 2001
(no data for UK/EU spending)
~100 full-time salaries for experts with
domain-specific knowledge
~40,000 terms
24. META DATA, SHARING AND
DATA REPOSITORIES
founded in Nov. 1999
“But this is a complex and ambitious project, and is one of the biggest challenges that
bioinformatics has yet faced. Major difficulties stem from the detail required to describe the
conditions of an experiment, and the relative and imprecise nature of measurements of
expression levels. The potentially huge volume of data only adds to these difficulties.”
Nature
Feb. 2000
Nov. 2000
Oct. 2002
Wide adoption as
requirement for
publication in
scientific journals
25. META DATA, SHARING AND
DATA REPOSITORIES
cf. IoT 2015
since 2003
Semantic Sensor Network Ontology
http://en.wikipedia.org/wiki/Silo
27. story
measurements
+ meta data
open, public repositories
human
curators
ontology
terms
community
PUBLISH OR PERISH
ok?
journal
informal exchange - no credit!
funders
assessment
The majority of this
infrastructure is paid for by
governments and charities
industry!
29. measurements
+ meta data
storage &
provenance
human
curators
ontology
terms
user
PUBLISH OR YOU’RE NOT DOING IOT
ok?
Maybe the majority of this
infrastructure should be
paid for by governments?
company
cloud
device
registration
privileges
data
added value
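To make "measurements + meta data" annotated with "ontology terms" more concrete, here is a minimal Python sketch of a self-describing reading that passes an "ok?" check before it is published. Everything in it is an assumption for illustration: the `AnnotatedMeasurement` record, the `KNOWN_TERMS` vocabulary, and the term names are hypothetical and not taken from the talk or from any existing ontology.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import Dict, List

# Hypothetical controlled vocabulary; in practice this would come from a
# shared, curated ontology rather than a hard-coded set.
KNOWN_TERMS = {
    "thing:WashingMachine",
    "quantity:DetergentUsed",
    "unit:Percent",
}


@dataclass
class AnnotatedMeasurement:
    device_id: str
    value: float
    timestamp: str  # ISO 8601 string
    terms: Dict[str, str] = field(default_factory=dict)  # role -> ontology term


def unknown_terms(m: AnnotatedMeasurement) -> List[str]:
    # Terms that are not part of the shared vocabulary.
    return sorted(t for t in m.terms.values() if t not in KNOWN_TERMS)


def publish(m: AnnotatedMeasurement) -> str:
    # The "ok?" step: refuse records whose annotations cannot be interpreted.
    bad = unknown_terms(m)
    if bad:
        raise ValueError("unknown ontology terms: " + ", ".join(bad))
    return json.dumps(asdict(m), sort_keys=True)


reading = AnnotatedMeasurement(
    device_id="washer-01",
    value=97.0,
    timestamp="2016-01-13T23:55:00Z",
    terms={
        "thing": "thing:WashingMachine",
        "quantity": "quantity:DetergentUsed",
        "unit": "unit:Percent",
    },
)
print(publish(reading))
```

A consumer that has never seen this device could still interpret the value, because the meaning travels with the measurement instead of living in a vendor's documentation.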
30. 0 clean shirt left
+
washing machine estimates
97% of your last pack of
powder used
+
it’s Wednesday, 23:55
+
the last four Thursdays
had a morning business
meeting
+
the car is parked 20 m from
a shop
+
last retail activity: 8 sec ago
Send immediate text
reminder to pick up
washing powder + send
tweet from @BorisHouse
“need identified” AND
“notification appropriate”
Actionable insight.
From everything.
“indicator of esteem”
3% left and
not pressed
“not home”
“buying”
credit card:
“highly personal device”
~ alive and awake