In IoT systems, data classification must account for the Value, Veracity, Volume, Velocity & Variety of sensor data in its context. Treating all sensor data as equally sensitive is not a cost-efficient approach to security.
In IoT systems, the Security System Levels are determined by Data Classifications
1. Tan Guan Hong
Technology Partner
drtangh@rekanext.com
In the Digital Economy using IoT systems,
Data Classification must be designed in
4th IEEE World Forum on IoT
6 Feb 2018
2. Smart Nation Strategy
Smart City Systems
Smart Citizen Platforms
Digital Government
Put in place the technology and infrastructure (Smart Nation Platform)
Deliver better and anticipatory services to citizens
Empower citizens to co-create useful solutions
Data Sharing across stakeholders
https://www.tech.gov.sg/Programmes-Partnerships/Programmes-Partnerships/Initiatives/Smart-Nation-Sensor-Platform
3. 3
Traditionally, Classified Data is stored in a Secured Data Centre, and the data is extracted through a secured Network to run on other servers.
Data is of the transaction-based and/or event-based type.
ICT Risk Management
• Data Security
• Network Security
Data Classification determines the Security Level used
There is no one-size-fits-all approach
[Diagram labels: Data Centre, Data Centre, Firewall, Data Sharing]
4. 4
[Diagram labels: ICT (Data Centre, Data Centre, Firewall); Sensors & IoT (10,000 Sensors, 10,000 Cameras)]
Sensors generate data with Value, Veracity, Volume, Velocity & Variety.
Sensors operate in the Physical World and are affected by environmental conditions.
Sensor data is analog and streams at 1 to 10,000 data points / sec.
A VGA camera (640x480) @ 10 frames / sec streams out about 3M pixel points / sec.
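The camera figure can be checked with a few lines of arithmetic. This is only a sketch: the function name `pixel_rate` is illustrative, and the fleet total simply multiplies the single-camera rate by the 10,000 cameras shown in the diagram, assuming every camera streams at the same VGA setting.

```python
def pixel_rate(width: int, height: int, frames_per_sec: int) -> int:
    """Pixel points streamed per second by one camera."""
    return width * height * frames_per_sec


# One VGA camera at 10 frames / sec, as on the slide.
vga_rate = pixel_rate(640, 480, 10)

# Assumption: all 10,000 cameras in the diagram stream at the same setting.
fleet_rate = vga_rate * 10_000

print(f"one camera: {vga_rate:,} pixel points / sec")
print(f"10,000 cameras: {fleet_rate:,} pixel points / sec")
```

One camera alone produces a little over 3 million pixel points every second, which is why per-pixel data cannot be handled like transaction records.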
5. 5
Traditional Data Classification Thinking: Any Processed Data should be Classified
Cost-Efficient way: Manage the Security Level through Appropriate Data Classification in System Design
• Move Sensor Data Processing into Unclassified Data Processing Zones (the Sensor Data Acquisition and Signal Processing domains).
• Only when Processed Information is linked to a pre-registered CLASSIFIED Database does the Paired information, taken together, become CLASSIFIED.
6. 6
What is Data? What is Information?
Classification of Information is NOT the same as Classification of Data.
Data needs context to become Information.
G1234567A is just an alphanumeric string and has no meaning at all by itself.
It could just as well be the RFID tag of a cow.
Data Classification
7. 7
NRIC Identity card
Alphanumeric Meta Data
• Name
• Race
• Date of Birth
• Sex
• Country of Birth
• Date of Issue
• Residential Address
• Why should each individual alphanumeric Metadata field be classified?
• By itself, each Metadata field contains only very general data.
• When all the Metadata are linked together, the whole dataset becomes confidential and traceable to an individual.
7 Metadata fields to identify a unique individual
8. 8
Number of combinations for each NRIC Identity card Metadata field
• Name
• Race: 10
• Date of Birth: 30 x 12 x 100 = 36,000
• Sex: 2
• Country of Birth: 60
• Date of Issue: 30 x 12 x 100 = 36,000
• Residential Address: 3,000,000
• To trace a person, there are 10 x 36,000 x 2 x 60 x 36,000 x 3,000,000 possible combinations (about 4.7 x 10^18, even before counting Name). Among all those combinations, there is only one unique person in a 3M population.
• If we use Data Analytics, we can probably reduce the number of combinations that must be searched to trace that individual.
• Hence Data Accuracy and Quality are key.
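The size of that search space can be worked out directly from the per-field counts on the slide. The sketch below only multiplies those counts; the dictionary name `FIELD_COMBINATIONS` is illustrative, and Name is left out because the slide gives no count for it.

```python
import math

# Per-field counts as listed on the slide (Name has no count and is omitted).
FIELD_COMBINATIONS = {
    "Race": 10,
    "Date of Birth": 30 * 12 * 100,
    "Sex": 2,
    "Country of Birth": 60,
    "Date of Issue": 30 * 12 * 100,
    "Residential Address": 3_000_000,
}

total_combinations = math.prod(FIELD_COMBINATIONS.values())
population = 3_000_000

print(f"possible combinations: {total_combinations:.3e}")
print(f"combinations per person: {total_combinations / population:.3e}")
```

The product is far larger than the population, which is why a fully linked set of fields points to exactly one person while any single field does not.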
9. 9
Data Correctness and Data Quality
To determine your house address: send 10 people out on foot to find where your home is. If all other metadata is left out, what is the probability that the address they report is correct?
Even within the same zone, they might come back with different addresses.
Why do we trust the address stated on the NRIC?
Because the Data Collection system already has a Quality control process.
Then Sex: Male / Female. How was this metadata confirmed? From the Birth Certificate information.
How was Sex determined for the Birth Certificate? By a Doctor in the hospital at your birth. So there is a Quality control process in place for the NRIC. But for IoT data, how do we trust the data generated for decision making?
12. 12
Data Classification is about Information with Context
[Diagram: Video Analytics at a Public Road and at a Secured Ammunition Depot]
Video Analytics: 999 AA (Unclassified information)
999 AA + Tan Wei Yi (Name) + G1234567A (NRIC): Classified information
13. 13
Data Classification is about Information with Context
[Diagram: Video Analytics at a Public Road and at a Secured Ammunition Depot]
Video Analytics: 999 AA (Unclassified information)
999 AA + Tan Wei Yi (Name) + G1234567A (NRIC) + Storage Location: Classified information
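The pairing rule from these slides can be sketched in code. This is a minimal illustration under stated assumptions, not the presenter's implementation: `CLASSIFIED_REGISTRY`, `Record`, `from_video_analytics` and `pair_with_registry` are hypothetical names, the registry entry reuses the example values on the slide, and "123 BB" is a made-up plate that is not registered.

```python
from dataclasses import dataclass

# Hypothetical pre-registered CLASSIFIED database: plate number -> (name, NRIC).
# The single entry reuses the example values shown on the slides.
CLASSIFIED_REGISTRY = {"999 AA": ("Tan Wei Yi", "G1234567A")}


@dataclass(frozen=True)
class Record:
    fields: tuple                 # (key, value) pairs carried by this record
    level: str = "UNCLASSIFIED"


def from_video_analytics(plate: str) -> Record:
    """Output of the unclassified processing zone: a plate number on its own."""
    return Record(fields=(("plate", plate),))


def pair_with_registry(record: Record, registry=CLASSIFIED_REGISTRY) -> Record:
    """Pair a record with the registry; only a successful pairing is CLASSIFIED."""
    plate = dict(record.fields)["plate"]
    match = registry.get(plate)
    if match is None:
        return record             # nothing was linked, so the level is unchanged
    name, nric = match
    paired_fields = record.fields + (("name", name), ("nric", nric))
    return Record(fields=paired_fields, level="CLASSIFIED")


seen_on_road = from_video_analytics("999 AA")
paired = pair_with_registry(seen_on_road)
unknown = pair_with_registry(from_video_analytics("123 BB"))  # made-up plate

print(seen_on_road.level, paired.level, unknown.level)
```

The design point is that the classification level is attached to the linked record, not to the raw plate reading, so the sensor processing path can stay in the lower-cost unclassified zone.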
14. 14
IoT Sensor Data Flow
[Diagram labels: Sensor, Video camera, Gateway Box, Sensor Data Processing (x2), Firewall (x2), Data Centre, Data Fusion Applications]
UNCLASSIFIED: Normal Security
CLASSIFIED: Higher Security
Considerations
• Encryption
• Product assurance & Longevity
• Configuration Management
• Vulnerability Management
• Network Management
• Resiliency
15. 15
IoT Sensor Devices:
Two-dimensional (2D) camera: These sensors capture data as frames over time. Using various video analytics algorithms, a 2D camera can provide different kinds of information. For example, from the same image the algorithms can extract (i) people count, (ii) number and colour of cars, (iii) lighting condition, etc. Over time, the processed metadata can yield further insights, such as tracking of (iv) people’s movement, (v) dwell time, etc.
Slow Sensor Data: Temperature, Humidity, Hydrostatic pressure, Strain Gauge, Tilt and Infra-red sensors acquire data at intervals of minutes or hours. These are Quasi-static sensors.
Dynamic (Fast) Sensor Data: An accelerometer provides acceleration readings (in G or m/s²) every few milliseconds or faster. An acoustic sound sensor provides voltage signals over time. When this sensor data is processed in the Frequency Domain using a Fast Fourier Transform, it can provide the Peak Vibration Level at various frequencies.
17. Which sensor data do you trust, and which is faster to process in Real Time?
• High Repeatability, High Accuracy
• High Repeatability, Low Accuracy
• Low Repeatability, High Accuracy
• Low Repeatability, Low Accuracy
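One common way to put numbers on these two properties is to compare repeated readings against a known reference: the offset of the mean describes accuracy and the spread of the readings describes repeatability. The sketch below assumes that interpretation; the function name and both sets of readings are hypothetical, taken against an assumed 25.0 °C reference.

```python
import statistics


def accuracy_and_repeatability(readings, true_value):
    """Return (bias, spread): bias reflects accuracy, spread reflects repeatability."""
    bias = statistics.mean(readings) - true_value   # closer to 0 means more accurate
    spread = statistics.stdev(readings)             # smaller means more repeatable
    return bias, spread


# Hypothetical readings of a 25.0 °C reference.
repeatable_but_inaccurate = [27.0, 27.1, 26.9, 27.0, 27.0]
accurate_but_unrepeatable = [23.0, 27.0, 25.5, 24.5, 25.0]

bias_a, spread_a = accuracy_and_repeatability(repeatable_but_inaccurate, 25.0)
bias_b, spread_b = accuracy_and_repeatability(accurate_but_unrepeatable, 25.0)

print(f"high repeatability, low accuracy: bias={bias_a:+.2f}, spread={spread_a:.2f}")
print(f"low repeatability, high accuracy: bias={bias_b:+.2f}, spread={spread_b:.2f}")
```

A repeatable sensor with a fixed bias can be corrected by calibration, whereas an unrepeatable one needs many readings to be averaged, which costs time in a real-time system.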
19. Understanding the Measurement Principle is important!
[Plot: Actual Temperature, Sampled Temperature, Displayed Temperature]
The actual temperature is fluctuating.
If it is sampled too slowly, the displayed temperature does not appear to change at all!
22. Understanding the Measurement Principle is important!
[Plot: Actual Temperature, Sampled Temperature, Displayed Temperature]
The actual temperature is fluctuating.
If it is sampled too slowly, the displayed temperature does not appear to change at all!
Nyquist criterion: sample at a rate of at least twice the highest frequency in the signal.
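The effect on this slide can be reproduced numerically. The sketch below is an illustration under assumed numbers, not data from the talk: it samples a sine wave that completes one cycle per time unit, first once per cycle (too slow) and then eight times per cycle. The function names and the phase offset are arbitrary choices.

```python
import math


def sample_sine(signal_hz, sample_hz, n_samples, phase=math.pi / 4):
    """Sample sin(2*pi*f*t + phase) at t = k / sample_hz for k = 0..n_samples-1."""
    return [
        math.sin(2 * math.pi * signal_hz * (k / sample_hz) + phase)
        for k in range(n_samples)
    ]


def spread(values):
    """Difference between the largest and smallest sampled value."""
    return max(values) - min(values)


# Sampling at the signal frequency: every sample lands on the same point of the cycle.
too_slow = sample_sine(signal_hz=1.0, sample_hz=1.0, n_samples=8)

# Sampling at 8x the signal frequency, well above the 2x minimum.
fast_enough = sample_sine(signal_hz=1.0, sample_hz=8.0, n_samples=8)

print(f"spread when sampled too slowly: {spread(too_slow):.6f}")
print(f"spread when sampled fast enough: {spread(fast_enough):.6f}")
```

The slow samples are all practically identical, so a display fed by them shows a flat line even though the underlying quantity swings through its full range.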
23. Accuracy of Information depends on:
• Accuracy of the Sensor
• Maintenance & Calibration of the Sensor (a function of Time, Drift and Deterioration)
• Installation of the Sensor
• Use of the Sensor in its context (monitoring & control function)
• Expected functional accuracy for decision making
Video Analytics is the processing of Image Data into Structured Information; its Accuracy and Repeatability hold only in a controlled environment.
The ICT view is that sensor data is stable, repeatable and maintenance-free, while the Electronics view always involves drift, accuracy and noise.
(ICT lives in the Cyber World, while Electronics is deployed into the physical environment, which Mother Nature controls.)
24. 24
Velocity of Sensor Data: Transition from Structured to Unstructured Data
Structured Data (SQL): each record is usually stored in rows
• 1 Data Point / Minute
• 1 Data Point / Sec
• 10 Data Points / Sec
• 100 Data Points / Sec
• 1,000 Data Points / Sec
• 10,000 Data Points / Sec
[Scale: Structured Data at the low rates, Unstructured Data at the high rates]
Sensor & IoT data has Value, Veracity, Volume, Velocity & Variety
How to handle 10,000 SQL Data Points / Sec from just one Sensor?
25. 25
A Microphone Sensor measuring a Voice waveform
[Plot: voice waveform, 10,000 points over 1 second; expanded time scale, 1,000 points over 0.1 second]
26. 26
A Microphone Sensor measuring a Voice waveform
[Plot: voice waveform, 10,000 points over 1 second; expanded time scale, 1,000 points over 0.1 second]
Signal/Data Processing techniques can extract 9 Parameters
In Time Domain:
• Average
• Root Mean Square (RMS)
• Sound Pressure Level (SPL)
• Maximum
• Minimum
In this example, what is data and what is information?
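The five time-domain parameters can be computed from one second of samples as follows. This is a sketch under assumptions: the samples are taken to be sound pressure in pascals, SPL is computed against the standard 20 µPa reference for sound in air, and the input is a synthetic 100 Hz tone rather than a real voice recording. The function name is illustrative.

```python
import math

P_REF = 20e-6  # reference sound pressure in pascals (20 µPa), used for SPL in air


def time_domain_summary(samples_pa):
    """Reduce raw pressure samples to the five time-domain parameters.

    Assumes the samples are not all zero, since SPL is undefined for silence.
    """
    n = len(samples_pa)
    rms = math.sqrt(sum(x * x for x in samples_pa) / n)
    return {
        "average": sum(samples_pa) / n,
        "rms": rms,
        "spl_db": 20 * math.log10(rms / P_REF),
        "maximum": max(samples_pa),
        "minimum": min(samples_pa),
    }


# Synthetic stand-in for the microphone: 1 second at 10,000 points / sec,
# a 100 Hz tone with 1 Pa amplitude (assumed values, not from the talk).
samples = [math.sin(2 * math.pi * 100 * k / 10_000) for k in range(10_000)]
summary = time_domain_summary(samples)

for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Here the 10,000 raw points are the data, and the five numbers are the information: they are what would be stored, compared and acted on.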
27. 27
A Microphone Sensor measuring a Voice waveform
[Plot: voice waveform, 10,000 points over 1 second; expanded time scale, 1,000 points over 0.1 second]
Signal/Data Processing techniques can extract 9 Parameters
In Time Domain:
• Average
• Root Mean Square (RMS)
• Sound Pressure Level (SPL)
• Maximum
• Minimum
In Frequency Domain (Time to Frequency Domain processing via Fast Fourier Transform):
• Peak Amplitude
• Peak Frequency
• Harmonics
• Weighted Amplitude (A-weighting curve)
In this example, what is data and what is information?
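Peak Amplitude and Peak Frequency can be illustrated with a small transform. To stay dependency-free, the sketch below uses a naive discrete Fourier transform instead of a Fast Fourier Transform; it produces the same spectrum, only more slowly, so it is suitable for a short example but not for 10,000 points / sec in production. The function name, the 50 Hz test tone and the sample rate are all assumptions.

```python
import cmath
import math


def dft_peak(samples, sample_rate_hz):
    """Return (peak_frequency_hz, peak_amplitude) using a naive DFT.

    Only bins strictly between DC and the Nyquist bin are searched.
    """
    n = len(samples)
    best_bin, best_amp = 0, 0.0
    for b in range(1, n // 2):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * b * k / n) for k, x in enumerate(samples)
        )
        amp = 2 * abs(coeff) / n   # scale so a pure sine reports its own amplitude
        if amp > best_amp:
            best_bin, best_amp = b, amp
    return best_bin * sample_rate_hz / n, best_amp


# Synthetic signal: 50 Hz tone, amplitude 0.5, sampled at 1,000 points / sec
# for 0.2 seconds (200 points), so the frequency resolution is 5 Hz.
rate = 1000
samples = [0.5 * math.sin(2 * math.pi * 50 * k / rate) for k in range(200)]
peak_hz, peak_amp = dft_peak(samples, rate)

print(f"peak frequency: {peak_hz} Hz, peak amplitude: {peak_amp:.3f}")
```

Two numbers now summarise 200 raw points, which is the kind of reduction that makes fast sensor data manageable downstream.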
28. 28
Design for Data Quality and NOT just Availability of Data alone
Sensor & IoT:
• You could also be sensing unwanted Noise!
• Physical Sensor output can be corrupted by EMI Noise, Humidity, Temperature, Pressure and Vibration (loose connections).
ICT (SQL):
• Data output is taken from a Database, and many people usually trust this data!
• When retrieved from an SQL database, the data is Highly Repeatable and Accurate!
• The System is Auditable and Computers don’t lie! ☺
31. Signal to Noise Ratio
[Diagram: Accelerometer Sensor on Railway Track, Analogue Signals, Digitizer, Digital Data; Electro-Magnetic Interference from Motors, Welding Equipment, etc.]
When a train passes over the Railway track, it generates vibration at 1.0 kHz.
Measured reading = Wanted Sensor Signal (Real Data) + EMI Noise:
1.0 G = 0.9 G + 0.1 G
      = 0.8 G + 0.2 G
What G number are you actually measuring?
32. Signal to Noise Ratio
[Diagram: Accelerometer Sensor on Railway Track, Analogue Signals, Digitizer, Digital Data; Electro-Magnetic Interference from Motors, Welding Equipment, etc.]
When a train passes over the Railway track, it generates vibration at 1.0 kHz.
Measured reading = Wanted Sensor Signal (Real Data) + EMI Noise:
1.0 G = 0.9 G + 0.1 G
      = 0.8 G + 0.2 G
What G number are you actually measuring?
Use a Spectrum Analyzer to check the Signal to Noise Ratio and verify the Quality of the Signal presented to the Digitizer.
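The two splits of the 1.0 G reading correspond to quite different signal-to-noise ratios. The sketch below assumes the G values are amplitudes, so the ratio in decibels is 20·log10(signal / noise); the function name is illustrative.

```python
import math


def snr_db(signal_amplitude: float, noise_amplitude: float) -> float:
    """Signal-to-noise ratio in dB for amplitude quantities."""
    return 20 * math.log10(signal_amplitude / noise_amplitude)


# The two ways a 1.0 G reading is split on the slide.
snr_case_a = snr_db(0.9, 0.1)   # 0.9 G wanted signal, 0.1 G EMI noise
snr_case_b = snr_db(0.8, 0.2)   # 0.8 G wanted signal, 0.2 G EMI noise

print(f"0.9 G + 0.1 G: {snr_case_a:.1f} dB")
print(f"0.8 G + 0.2 G: {snr_case_b:.1f} dB")
```

Both cases report the same 1.0 G after digitizing, so the ratio has to be checked on the analogue side, before the Digitizer, as the slide suggests.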
33. 33
Real Impact of Electro-Magnetic Interference (EMI) on Sensor Information
[Plot: LTA Real Time Strut Force Readings, Load (kN) over time, with Lunch periods marked]
200 kN fluctuating reduction in Load = Weight of 15 Merc E200
35. 35
Using Camera as a Sensor
• Accurate & Reliable Data
• Outdoor Operating Conditions are huge challenges
• One Camera gives many Metadata and is a Contactless Sensor
37. 37
Every Facial Marker measurement has Uncertainty
[Figure: facial markers with Less Measurement Uncertainties compared with higher Measurement Uncertainties; the size of each measurement dot indicates its uncertainty range]
With higher Uncertainties, the recognition is less reliable. It is as if Noise were added onto the original data.
Outdoor Accuracy is affected by Image Quality and Lighting variation.
38. 38
Physical World Sensor data has Statistical Variations, while SQL-extracted data is always consistent.
[Table columns: Physical Object under measurement, Sensor, Processed into Information, No. of Sensing Parameters, Sensing Repeatability]
• Video Analytics: < 20 Facial Sensor Markers
• Temperature sensor (converts µV to T °C): 1 Temperature Reading
• RFID Tag information via RFID Reader: 1 Digital information
Each Sensing point does have reading variations.
Need to know where the possible Statistical Sensing Errors are, and mitigate the risks.
An SQL System usually takes one snapshot reading and stores it in the database: Always Repeatable @ +/- 0 σ