May 22, 2015
Data Science Consulting
Héloïse Nonne
Senior Data Scientist - Manager
Big Data Analytics for connected home
Data analytics for disconnected homes
2
$y_t = \mu + \epsilon_t + \phi_1 y_{t-1} + \dots + \phi_n y_{t-n} - \theta_1 \epsilon_{t-1} - \dots - \theta_n \epsilon_{t-n}$
ARIMA models
(AutoRegressive Integrated Moving Average)
$y_t$ = electric load at time $t$
$\epsilon_t$ = noise at time $t$
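As a minimal sketch of the recursion above, the following Python computes one-step-ahead predictions from given coefficients, estimating past noise terms as prediction residuals. The function name `arma_one_step`, the coefficient values and the toy load series are illustrative assumptions, not taken from the slides; the differencing ("Integrated") step of ARIMA and the fitting of the coefficients are left out.

```python
def arma_one_step(y, mu, phi, theta):
    """One-step-ahead ARMA predictions, with past noise estimated as residuals.

    Follows y_t = mu + eps_t + sum_i phi_i*y_{t-i} - sum_j theta_j*eps_{t-j},
    where the unknown eps_t is replaced by its expectation (zero) when predicting.
    Terms that would reach before the start of the series are skipped.
    """
    preds, eps = [], []
    for t in range(len(y)):
        ar = sum(p * y[t - i] for i, p in enumerate(phi, start=1) if t - i >= 0)
        ma = sum(q * eps[t - j] for j, q in enumerate(theta, start=1) if t - j >= 0)
        pred = mu + ar - ma
        preds.append(pred)
        eps.append(y[t] - pred)  # residual = estimate of the noise at time t
    return preds, eps


# Hypothetical loads (kW) and made-up coefficients, for illustration only.
load = [1.0, 1.2, 1.1, 1.3]
preds, residuals = arma_one_step(load, mu=0.5, phi=[0.5], theta=[0.2])
```

In practice the coefficients would be fitted to historical load data by a statistics library rather than chosen by hand.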
• Very low sampling frequency for local
(household) measurements (less than once per quarter)
• Only aggregated data (sum of individual
loads) for higher frequency
measurements (region, neighborhood)
• Data storage issues
• Computation power
• Limited knowledge at local
level
• Limited predictive power
• Complex sophisticated models
exist but are difficult to tune
• Sun
• Wind
• Cloud cover
• Humidity
• Temperature
Reducing electricity costs: a complete data ecosystem
3
Weather
Energy production
Energy price
Historical data
Actual measurement (real-time)
Forecast
• Appliances and
use
• Heating
• Electricity storage
• Elevators
• Doors / lights
• Network activity
-> current
occupation
• Renewable
energy
• Shutter
orientation
• Anthropological
data
• Building structure
(thermal mass)
Electricity
demand ????
Regional / national scale
Local / neighborhood scale
Anthropological data
• Energy consumption
patterns
Anthropological data
• comfort temperature
• children at school
• activity of occupants
• Weekday / holiday
• Hour of day
Multiple sources of data for multiple models
• Volume
– vast amounts of data
– too large to store and analyse using
traditional technology
• Velocity
– speed at which new data is generated
– speed at which data change
• Variety
– types of data (number, text, images, video)
– types of sources (real-time, static)
• Veracity
– accuracy of data (frequency, errors)
– quality of data (sampling errors, typos)
4
Technology choices depend on the use case
Transaction-oriented
• Write/Read
• Logs
• Transactions
Streaming-oriented
• Compute on the fly
• Reactivity
• Real-time decisions
Computationally intensive
• CPU/GPU bound
• Complex problem to solve
Storage-oriented
• Loads of data
• Analysis
• Algorithms
Hadoop
SQL
interactive
Tez
Mahout
Spark
HBase
Cassandra
HPC
Storm
Kafka
Spark
Hardware
Software
Need
Bank – Stock market
Web logs
In/out
Image recognition
Research on DNA,
…
Energy load management
Industrial processes
Aeronautics
Customers Web journey
Bank – Insurance
Customer management
Records, archiving
5
Anomaly
detection
Load
prediction
Statistics for
reporting on
dashboards
Identification
of
consumption
patterns
Data analytics on energy load
6
• Moving average and thresholds
• Outlier detection
• ARIMA
• Neural networks
• Recurrent neural networks
• Clustering: K-means, DBSCAN
• Self-organizing maps
• Recommendations to reschedule appliances
• Storage of energy (photovoltaic, geothermal, etc.)
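The simplest technique listed above, "moving average and thresholds", can be sketched as follows. This is one possible reading, assuming a trailing window and a threshold of k standard deviations; the function name, the window size, the value of k and the readings are all hypothetical.

```python
from statistics import mean, stdev


def moving_average_anomalies(series, window=4, k=3.0):
    """Flag indices whose value deviates from the trailing moving average
    by more than k standard deviations of that trailing window."""
    flagged = []
    for t in range(window, len(series)):
        past = series[t - window:t]
        avg, sd = mean(past), stdev(past)
        # max(...) guards against a zero threshold when the window is flat
        if abs(series[t] - avg) > k * max(sd, 1e-9):
            flagged.append(t)
    return flagged


# Hypothetical load readings (kW) with one spike at index 6.
readings = [1.0, 1.1, 0.9, 1.0, 1.1, 1.0, 5.0, 1.0, 0.9]
anomalies = moving_average_anomalies(readings)
```

On these readings only the spike at index 6 is flagged. The spike then inflates the window statistics for the next few points, which is a known weakness of this simple scheme.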
Many use cases
• Detect energy poverty (underheating)
• Detect people in distress (illnesses, elderly, heat wave, …)
• Improved safety (fire detection, security, …)
Business Society
Research / knowledge Sustainability
• Building optimization (thermal mass, insulation,
configuration, window orientation)
• Consumption patterns
• Social behaviors
• Optimize use and storage of energy (light
management, appliance use, demand reduction, …)
• Improve comfort in neighborhood
• Reduce waste (energy, water, appliances)
• Scoring and customer segmentation
• Predict the demand in energy
• Predictive maintenance (elevators, HVAC, photovoltaic, …)
• Cost reduction
But remain pragmatic and think about the whole picture
-> predictive maintenance on light bulbs ??!
7
Predictive maintenance
Data
• Shaft speed
• Vibrations (X, Y, Z)
• Sound measurements
• Rail vibrations
• Motor temperature
• Oil buffer
• …
Wear, failure
• Bearing fault
• Door: Shoe deformation
• Unbalance
• Misalignment
• Resonance
• …
Elevator maintenance
predict failure before breakage
Cost reduction and improvement of reliability through predictive maintenance
8
A predictive maintenance management system
• Continuous adaptation of diagnostics
• Build, increase and maintain knowledge
• Handle large quantities of data
• Handle uncertainty in diagnostics
• Assess fault severity
Requirements
• Symptoms are a mix of different causes
• Information is unclear
• Limited frequency resolution
• Missing data
• Noise
Challenges
Data center
Remote management
system
Richer knowledge
multiple
sources
9
Bayesian networks
• Compact representation of entity states or
events as random variables
• Contains knowledge about how states / events are
related
BF Bearing fault
DF Door deformation
WU Weight unbalance
RN Resonance
MA Misalignment
AYX Vibration freq peak on axis A at Y X
TP Temperature > x °C
SP Shaft speed freq peaks
SdB Sound > x dB
MA
RN
SP
SdB
BF
DF
WU
X1X X2X
Y1X Y2X
Z1X Z2X
TP
• Qualitative = dependence relations
• Quantitative = the strengths of the relations
• Mix a priori knowledge with experimental (real-time) data
• Explanatory (human understanding of phenomena vs black-box
models)
• Uncertainty management (assessment of probability of failure)
• Possibility to learn
• Parameters
• Structures (events, entities, causes and effects)
Advantages
Bayesian network
Decision rules for
action
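To illustrate how such a network turns an observed symptom into a fault probability and then into a decision rule, here is a minimal sketch for a single cause-to-symptom edge using Bayes' rule. The edge from weight unbalance (WU) to a vibration peak (X1X), all probabilities and the 0.5 action threshold are hypothetical; a real network would combine several causes and symptoms.

```python
def posterior_fault(prior, p_symptom_given_fault, p_symptom_given_ok):
    """P(fault | symptom observed) for one cause -> symptom edge, by Bayes' rule."""
    evidence = prior * p_symptom_given_fault + (1 - prior) * p_symptom_given_ok
    return prior * p_symptom_given_fault / evidence


# All numbers are hypothetical: prior P(WU), P(X1X peak | WU), P(X1X peak | no WU).
p_wu_given_peak = posterior_fault(
    prior=0.10, p_symptom_given_fault=0.90, p_symptom_given_ok=0.05
)

# Hypothetical decision rule for action, based on the posterior probability.
action = "schedule inspection" if p_wu_given_peak > 0.5 else "keep monitoring"
```

Because the output is a probability rather than a bare label, the threshold can be tuned to the cost of an unnecessary inspection versus the cost of a breakdown.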
10
Absolute need of prior
knowledge from
professionals
Bayesian networks
MA
RN
SP
SdB
BF
DF
WU
X1X X2X
Y1X Y2X
Z1X Z2X
TP
WU
True (failure) 0.60
False 0.40
Experience 10
A priori conditional probability table
Update with new experience
$P_{n+1} = \dfrac{P_n \cdot \text{nb\_experiences} + 1}{\text{nb\_experiences} + 1}$
WU
True (failure) 0.636
False 0.364
Experience 11
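The update from experience 10 to experience 11 can be checked with a short sketch. The function name is hypothetical, and the branch for a new experience in which the failure is not observed (adding 0 instead of 1) is an assumption; the slide only shows the case where it is observed.

```python
def update_probability(p_n, nb_experiences, event_observed=True):
    """Update a probability table entry after one new experience.

    With event_observed=True this is the update rule from the slide:
    P_{n+1} = (P_n * nb_experiences + 1) / (nb_experiences + 1).
    The event_observed=False case (adding 0) is an assumption.
    """
    hit = 1.0 if event_observed else 0.0
    p_next = (p_n * nb_experiences + hit) / (nb_experiences + 1)
    return p_next, nb_experiences + 1


# Values from the WU table: P(True) = 0.60 after 10 experiences.
p_true, n = update_probability(0.60, 10)
p_false = 1.0 - p_true
```

Rounded to three decimals this gives 0.636 and 0.364 with an experience count of 11, matching the updated table.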
One can unlearn (forget outdated past experiences)
by using fading tables
Add a fading factor in front of the oldest experiences
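The slides do not give a formula for fading, so the following is only one possible formulation, stated as an assumption: before each update, the accumulated experience count is multiplied by a fading factor between 0 and 1, so that older experiences weigh less than the newest one. The function name and the factor values are hypothetical.

```python
def update_with_fading(p_n, nb_experiences, fading=0.9, event_observed=True):
    """Update a probability after one new experience, discounting the past.

    Assumed formulation: the past experience count is scaled by `fading`
    (0 < fading <= 1) before the new experience is added with weight 1.
    fading=1.0 means nothing is forgotten.
    """
    faded = fading * nb_experiences  # past experiences count for less
    hit = 1.0 if event_observed else 0.0
    p_next = (p_n * faded + hit) / (faded + 1)
    return p_next, faded + 1


# Hypothetical example: P = 0.60 after 10 experiences, fading factor 0.5.
p_faded, n_faded = update_with_fading(0.60, 10, fading=0.5)
```

A smaller fading factor makes the table react faster to recent observations, at the price of noisier estimates.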
11
The big (data) picture
• Many sources of data: weather, energy production, economic, social, behavioral data, appliances characteristics,
current building occupation, activity, etc.
• Different scales: worldwide, regional, local, individual
• Different times: historical data, year, month, day, hour, real-time
• The system is not going to be perfect at once -> design it for constant improvement
• A single model is useless on its own: each model has its use, and models feed each other with their knowledge and predictions
• Choose the right model and the right technology: according to use case, time cost, energy cost,
pragmatism, realism
• Build models with the professionals who know the problem
-> build on existing knowledge
An efficient system implies close collaboration between
business, researchers, manufacturers, maintainers, owners, users, developers, data
scientists, data managers, optimization specialists, and end-users
12
Quantmetry – Data science specialist
Act
Predict
Analyze
Store
Collect
13
More and more data available
Store everything!
Analyze to better understand strong and weak signals
Forecast what may happen based on past trends
Automate decisions and actions
Quantmetry supports its clients across all layers of the data pyramid and
thereby contributes to their digital transformation through quantitative methods,
for concrete results on their business performance.
• a "pure player" consulting firm in Big Data and data science, whose commercial development began in 2013
• advanced statistical methods, machine learning and Big Data technologies
• 2014: 1.5 million euros in revenue, with strong growth ambitions in France and abroad
• About twenty data scientists / consultants
Quantmetry's activities
14
Business optimization through data
Setting up a Data Lab
Consulting Support Delivery
• Detection and prioritization of data-driven opportunities
• Design of IT architecture blueprints
• Lessons learned and best practices
• Organization and governance blueprint
• Choice of a technology architecture
Change management
Project management
• Scoping, industrialization project
• Methodology (statistical models and algorithms)
• Big Data technologies
• Skills development
• Recruitment
• Governance
Pilot projects
Industrialization
• Data science proof of concept
• Technology pilots
• Industrialization of pilots (API, …)
• Creation of a Big Data architecture and implementation of data flows
Technology watch and experimentation
• Research topics:
– Online learning
– Deep learning and neural networks
– Industrialization
– Semantic analysis
– Energy (time series analysis)
– Smart cities
– Improving user experience
• Player in the Big Data ecosystem: participation in seminars, international conferences, hackathons,
Kaggle competitions, vendor partnerships… Collaborations with research laboratories and schools.
15
• Creation and development of dedicated products built on Big Data technologies
• Research and development in data science
Baseline (logistic regression)
Gradient Boosting
Unstructured data
Feature engineering
Lift = 2
Lift = 6
Selected data science references
16
Lift improvement for acquiring insurance customers as banking customers
Churn detection for a telecom operator
[Bar chart, axis 0 to 40: cancellation page URL, age, group, number of pages viewed…, session duration]
Setting up a Data Lab for an insurer
Behavior analysis for a mutual insurer
Optimization of a pricing tool for a B2B distribution company
Predictive models of energy consumption
Excellence
Altruism
Results
…
and Big Data
Visit our blog
quantmetry-blog.com
www.quantmetry.com

Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptx
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 

Big Data Analytics for connected home

  • 1. Big Data Analytics for connected home. Héloïse Nonne, Senior Data Scientist - Manager. Data Science Consulting. May 22, 2015
  • 2. Data analytics for disconnected homes
      ARIMA models (AutoRegressive Integrated Moving Average):
      $y_t = \mu + \epsilon_t + \phi_1 y_{t-1} + \cdots + \phi_n y_{t-n} - \theta_1 \epsilon_{t-1} - \cdots - \theta_n \epsilon_{t-n}$
      where $y_t$ = electric load at time t and $\epsilon_t$ = noise at time t
      • Very low frequency resolution for local (household) measurements (less than quarterly)
      • Only aggregated data (sum of individual loads) for higher frequency measurements (region, neighborhood)
      • Data storage issues
      • Computation power
      • Limited knowledge at local level
      • Limited predictive power
      • Complex sophisticated models exist but are difficult to tune
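The equation on slide 2 can be made concrete with a short simulation. What follows is a minimal sketch, not the presenter's code: it implements only the autoregressive and moving-average terms written on the slide (the "integrated" differencing step of ARIMA is left out), and the function names, coefficients and noise level are illustrative assumptions.

```python
import random


def simulate_arma(mu, phi, theta, n_steps, noise_std=1.0, seed=0):
    """Simulate y_t = mu + eps_t + sum_i phi_i*y_{t-i} - sum_j theta_j*eps_{t-j}."""
    rng = random.Random(seed)
    y, eps = [], []
    for t in range(n_steps):
        e = rng.gauss(0.0, noise_std)
        # Terms whose lag reaches before the start of the series are skipped.
        ar = sum(p * y[t - i] for i, p in enumerate(phi, start=1) if t - i >= 0)
        ma = sum(q * eps[t - j] for j, q in enumerate(theta, start=1) if t - j >= 0)
        y.append(mu + e + ar - ma)
        eps.append(e)
    return y


def one_step_forecasts(y, mu, phi, theta):
    """Predict each y_t from earlier values, using past forecast errors as noise estimates."""
    preds, resid = [], []
    for t in range(len(y)):
        ar = sum(p * y[t - i] for i, p in enumerate(phi, start=1) if t - i >= 0)
        ma = sum(q * resid[t - j] for j, q in enumerate(theta, start=1) if t - j >= 0)
        pred = mu + ar - ma
        preds.append(pred)
        resid.append(y[t] - pred)  # the forecast error stands in for eps_t
    return preds


# Hypothetical coefficients, chosen only for illustration.
load = simulate_arma(mu=1.0, phi=[0.5], theta=[0.3], n_steps=200)
forecast = one_step_forecasts(load, mu=1.0, phi=[0.5], theta=[0.3])
mse = sum((a - b) ** 2 for a, b in zip(load, forecast)) / len(load)
print(f"one-step mean squared error: {mse:.3f}")
```

In practice the coefficients are unknown and have to be estimated from the measured load, which is where the tuning difficulty mentioned on the slide comes in.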
  • 3. Reducing electricity costs: a complete data ecosystem
      Weather / Energy production / Energy price
      Historical data / Actual measurement (real-time) / Forecast
      Weather:
      • Sun
      • Wind
      • Cloud cover
      • Humidity
      • Temperature
      Building and equipment data:
      • Appliances and use
      • Heating
      • Electricity storage
      • Elevators
      • Doors / lights
      • Network activity -> current occupation
      • Renewable energy
      • Shutter orientation
      • Anthropologic data
      • Building structure (thermal mass)
      Electricity demand?
      Regional / national scale
      Local / neighborhood scale
      Anthropologic data:
      • Energy consumption patterns
      Anthropologic data:
      • Comfort temperature
      • Children at school
      • Activity of occupants
      • Weekday / holiday
      • Hour of day
  • 4. Multiple sources of data for multiple models
      • Volume
        – vast amounts of data
        – too large to store and analyse using traditional technology
      • Velocity
        – speed at which new data is generated
        – speed at which data changes
      • Variety
        – types of data (number, text, images, video)
        – types of sources (real-time, static)
      • Veracity
        – accuracy of data (frequency, errors)
        – quality of data (sampling errors, typos)
  • 5. Technology choices depend on the use case
      Needs:
      • Transaction-oriented: write/read, logs, transactions
      • Streaming-oriented: compute on the fly, reactivity, real-time decisions
      • Computationally intensive: CPU/GPU bound, complex problem to solve
      • Storage-oriented: loads of data, analysis, algorithms
      Technologies shown: Hadoop, SQL interactive, Tez, Mahout, Spark, HBase, Cassandra, HPC, Storm, Kafka, Spark
      Diagram labels: Hardware, Software, Need
      Example use cases: Bank – Stock market; Web logs in/out; Image recognition; Research on DNA, …; Energy load management; Industrial processes; Aeronautics; Customers' web journey; Bank – Insurance customer management; Records, archiving
  • 6. Data analytics on energy load
      • Anomaly detection
        – Moving average and thresholds
        – Outlier detection
      • Load prediction
        – ARIMA
        – Neural networks
        – Recurrent neural networks
      • Identification of consumption patterns
        – Clustering: K-means, DBSCAN
        – Self-organizing maps
      • Statistics for reporting on dashboards
      Combined (+), these analyses support:
      • Recommendations to reschedule appliances
      • Storage of energy (photovoltaic, geothermal, etc.)
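The "moving average and thresholds" approach to anomaly detection on slide 6 can be sketched in a few lines. This is one possible reading of that bullet, not the presenter's method: the trailing window length, the three-sigma threshold and the sample readings are all assumptions made for illustration.

```python
from statistics import mean, stdev


def moving_average_anomalies(values, window=4, n_sigmas=3.0):
    """Return the indices whose value deviates from the trailing moving average
    by more than n_sigmas trailing standard deviations."""
    anomalies = []
    for t in range(window, len(values)):
        history = values[t - window:t]  # the `window` readings just before t
        avg, spread = mean(history), stdev(history)
        if abs(values[t] - avg) > n_sigmas * spread:
            anomalies.append(t)
    return anomalies


# Hypothetical load readings (kW) with one spike at index 6.
load_kw = [1.0, 1.1, 0.9, 1.0, 1.1, 1.0, 5.0, 1.0, 0.9, 1.1]
print(moving_average_anomalies(load_kw))
```

One caveat of this simple scheme: once a spike enters the trailing window it inflates both the average and the spread, so the detector is less sensitive for the next few readings. Robust statistics such as the median, or excluding flagged points from the history, are common ways to reduce that effect.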
  • 7. Many use cases
      • Society
        – Detect precarity (underheating)
        – Detect people in distress (illnesses, elderly, heat wave, …)
        – Improved safety (fire detection, security, …)
      • Research / knowledge
        – Building optimization (thermal mass, insulation, configuration, window orientation)
        – Consumption patterns
        – Social behaviors
      • Sustainability
        – Optimize use and storage of energy (light management, appliance use, demand reduction, …)
        – Improve comfort in neighborhood
        – Reduce waste (energy, water, appliances)
      • Business
        – Scoring and customer segmentation
        – Predict the demand in energy
        – Predictive maintenance (elevators, HVAC, photovoltaic, …)
        – Cost reduction
      But remain pragmatic and think about the whole picture: predictive maintenance on light bulbs?!
  • 8. Predictive maintenance
      Elevator maintenance: predict failure before breakage
      Data:
      • Shaft speed
      • Vibrations (X, Y, Z)
      • Sound measurements
      • Rail vibrations
      • Motor temperature
      • Oil buffer
      • …
      Wear, failure:
      • Bearing fault
      • Door: shoe deformation
      • Unbalance
      • Misalignment
      • Resonance
      • …
      Cost reduction and improvement of reliability through predictive maintenance
  • 9. A predictive maintenance management system
      Requirements:
      • Continuous adaptation of the diagnosis
      • Build, increase and maintain knowledge
      • Handle large quantities of data
      • Handle uncertainty in the diagnosis
      • Assess fault severity
      Challenges:
      • Symptoms are a mix of different causes
      • Information is unclear
      • Limited frequency resolution
      • Missing data
      • Noise
      Diagram: data center, remote management system; richer knowledge from multiple sources
  • 10. Bayesian networks
      • Compact representation of entity states or events as random variables
      • Contains knowledge about how states / events are related
      Network nodes:
      • BF: bearing fault
      • DF: door deformation
      • WU: weight unbalance
      • RN: resonance
      • MA: misalignment
      • AYX: vibration freq peak on axis A at Y X (nodes X1X, X2X, Y1X, Y2X, Z1X, Z2X)
      • TP: temperature > x °C
      • SP: shaft speed freq peaks
      • SdB: sound > x dB
      Bayesian network:
      • Qualitative = dependence relations
      • Quantitative = the strengths of the relations
      Advantages:
      • Mix a priori knowledge with experimental (real-time) data
      • Explanatory (human understanding of phenomena vs black-box models)
      • Uncertainty management (assessment of probability of failure)
      • Possibility to learn
        – Parameters
        – Structures (events, entities, causes and effects)
      Decision rules for action
      Absolute need of prior knowledge from professionals
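To illustrate how such a network turns an observed symptom into fault probabilities, here is a minimal sketch with two causes (BF, WU) and a single vibration-peak symptom, solved by enumerating all cause combinations. The structure is a deliberately reduced version of the slide's network, and every probability below is a made-up placeholder, not a value from the presentation.

```python
from itertools import product

# Hypothetical prior probabilities of each fault (assumed independent).
PRIOR = {"BF": 0.05, "WU": 0.10}

# Hypothetical conditional probability table: P(vibration peak | BF, WU).
P_PEAK = {
    (True, True): 0.95,
    (True, False): 0.80,
    (False, True): 0.60,
    (False, False): 0.05,
}


def posterior(cause):
    """P(cause is present | vibration peak observed), by full enumeration."""
    numerator = evidence = 0.0
    for bf, wu in product([True, False], repeat=2):
        p_bf = PRIOR["BF"] if bf else 1.0 - PRIOR["BF"]
        p_wu = PRIOR["WU"] if wu else 1.0 - PRIOR["WU"]
        joint = p_bf * p_wu * P_PEAK[(bf, wu)]
        evidence += joint
        if {"BF": bf, "WU": wu}[cause]:
            numerator += joint
    return numerator / evidence


for name in ("BF", "WU"):
    print(f"P({name} | peak) = {posterior(name):.3f}")
```

Enumeration is only workable for a handful of nodes. A network the size of the one on the slide would normally rely on a dedicated inference library, but the principle of combining expert priors with observed evidence is the same.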
  • 11. Bayesian networks
      Network nodes: MA, RN, SP, SdB, BF, DF, WU, X1X, X2X, Y1X, Y2X, Z1X, Z2X, TP
      A priori conditional probability table for WU (experience 10):
      • True (failure): 0.60
      • False: 0.40
      Update with new experience:
      $P_{n+1} = \frac{(P_n \cdot \mathrm{nb\_experiences}) + 1}{\mathrm{nb\_experiences} + 1}$
      Updated table for WU (experience 11):
      • True (failure): 0.636
      • False: 0.364
      One can unlearn (forget past, outdated experiences) by using fading tables: add a fading factor in front of the oldest experiences
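The update rule on slide 11 reproduces the slide's numbers when the "+ 1" in the numerator is applied only to the state that was actually observed: (0.60 × 10 + 1) / 11 ≈ 0.636 for True and (0.40 × 10) / 11 ≈ 0.364 for False. The sketch below encodes that reading. The `fading` parameter is an assumption about how a fading factor might be applied (it scales down the weight of all past experiences before the new one is added); the slide does not give a formula for it.

```python
def update_probability(p, n_experiences, observed, fading=1.0):
    """Update a table entry after one new experience.

    p             -- current probability of the state
    n_experiences -- number of experiences behind the current table
    observed      -- True if the new experience showed this state
    fading        -- assumed factor in (0, 1] that down-weights past experiences
    Returns the new probability and the new effective experience count.
    """
    weight = fading * n_experiences
    hit = 1.0 if observed else 0.0
    return (p * weight + hit) / (weight + 1.0), weight + 1.0


# Experience 11 is a failure, as on the slide.
p_true, n_new = update_probability(0.60, 10, observed=True)
p_false, _ = update_probability(0.40, 10, observed=False)
print(round(p_true, 3), round(p_false, 3), n_new)

# With fading, the same observation moves the estimate further.
p_faded, _ = update_probability(0.60, 10, observed=True, fading=0.5)
print(round(p_faded, 3))
```

A fading factor below 1 shrinks the effective number of past experiences, so the table adapts faster to recent behavior at the cost of noisier estimates.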
  • 12. The big (data) picture
      • Many sources of data: weather, energy production, economic, social, behavioral data, appliance characteristics, current building occupation, activity, etc.
      • Different scales: worldwide, regional, local, individual
      • Different times: historical data, year, month, day, hour, real-time
      • The system is not going to be perfect at once -> design it for constant improvement
      • A single model is useless: each model has its use, and models feed each other with their knowledge and predictions
      • Choose the right model and the right technology according to use case, time cost, energy cost, pragmatism, realism
      • Build models with the professionals who know the problem -> build on existing knowledge
      An efficient system implies close collaboration between business, researchers, manufacturers, maintainers, owners, users, developers, data scientists, data managers, optimization specialists, and end-users
  • 13. Quantmetry – Data science specialist
      The data pyramid, from base to top:
      • Collect: more and more data is available
      • Store: store everything!
      • Analyze: analyze to better understand strong and weak signals
      • Predict: forecast what may happen based on past trends
      • Act: automate decisions and actions
      Quantmetry supports its clients across every layer of the data pyramid and thereby contributes to their digital transformation through quantitative methods, for concrete results on their business performance.
      • A "pure player" consulting firm in Big Data and Data science, whose commercial development started in 2013
      • Advanced statistical methods, machine learning and Big Data technologies
      • 2014: 1.5 million euros in revenue, with strong growth ambitions in France and abroad
      • About twenty data scientists / consultants
  • 14. Quantmetry's activities
      Columns: Business optimization through data / Structuring a Data Lab
      Rows: Consulting / Support / Delivery
      Consulting:
      • Identifying and prioritizing data-driven opportunities
      • Designing IT architecture blueprints
      • Lessons learned and best practices
      • Organization and governance model
      • Choosing a technology architecture
      Support (change management, project management):
      • Scoping, industrialization project
      • Methodology (statistical models and algorithms)
      • Big Data technologies
      • Skills development
      • Recruitment
      • Governance
      Delivery (pilot projects, industrialization):
      • Data science proofs of concept
      • Technology pilots
      • Industrialization of pilots (API, …)
      • Building a Big Data architecture and setting up data flows
  • 15. Technology watch and experimentation
      • Research themes:
        – Online learning
        – Deep learning and neural networks
        – Industrialization
        – Semantic analysis
        – Energy (time series analysis)
        – Smart cities
        – Improving the user experience
      • Active player in the Big Data ecosystem: participation in seminars, international conferences, hackathons, Kaggle competitions, software vendor partnerships… Collaborations with research laboratories and schools.
      • Creation and development of specific products built around Big Data technologies
      • Research and development in Data science
  • 16. Selected Data science references
      • Improving lift for winning insurance customers over to banking
        – Chart labels: baseline (logistic regression), gradient boosting, unstructured data, feature engineering; lift = 2, lift = 6
      • Churn detection for a telecom operator
        – Chart (axis 0 to 40) of variables: cancellation page URL, age, group, number of pages viewed…, session duration
      • Setting up a Data Lab for an insurer
      • Behavior analysis for a mutual insurer
      • Optimizing a pricing tool for a B2B distribution company
      • Predictive models of energy consumption
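Slide 16 quotes lift values of 2 and 6 without defining the metric. A common definition, assumed here, is the response rate among the top-scored fraction of customers divided by the overall response rate. The sketch below uses that definition with invented scores and labels; the function name and the 20% cut-off are illustrative choices.

```python
def lift_at(scores, labels, fraction=0.1):
    """Response rate in the top-scored `fraction` divided by the overall response rate."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:n_top]) / n_top
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate


# Hypothetical model scores and outcomes (1 = customer converted).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(lift_at(scores, labels, fraction=0.2))
```

Under this definition, a lift of 6 means that targeting the model's top segment reaches six times as many responders as targeting the same number of customers at random.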
  • 17. Excellence, Altruism, Results … and Big Data. Visit our blog: quantmetry-blog.com. www.quantmetry.com