SlideShare a Scribd company logo
1 of 16
Dato Confidential1
Analyzing Video with GraphLab Create
June 16, 2016
Guy Rapaport, Data Scientist, Dato EMEA
guy@dato.com
Dato Confidential2
Dato: We Intelligent Applications
Dato Confidential
Some of our Customers
3
Dato Confidential4
Business
must be intelligent
Machine learning
applications
• Recommenders
• Fraud detection
• Ad targeting
• Financial models
• Personalized medicine
• Churn prediction
• Smart UX
(video & text)
• Personal assistants
• IoT
• Socials networks
• Log analysis
Last decade:
Data management
Now:
Intelligent apps
?
Last 5 years:
Traditional analytics
Dato Confidential
Dato Confidential
Creating a model pipeline
exploration
data
modeling
- Images
- Text
- Graphs
- Tabular Data
Dato Confidential
Creating a model pipeline
Ingest Transform Model Deploy
Unstructured Data
Dato Confidential
Creating a model pipeline using Dato products
Ingest Transform Model Deploy
Unstructured Data
SFrame Engine
(FREE, open
source)
GraphLab Create
(Scalable Machine
Learning Python
Library,
4K/machine/year)
Predictive Services
(Serving + Load Balancing
+ AB Testing,
10K/machine/year)
Dato Confidential9
$ pip install –U graphlab-create
Dato Confidential10
What will we cover today?
1. Match a movie’s screenplay with its subtitles.
- Now we know who says what and when.
2. Extract frames, then actors’ faces, from the movie.
- We’ll use opencv for video manipulation and face detection.
3. Train a face recognition model over the faces.
- What’s the smallest portion of the movie we can get good
results from?
10
Dato Confidential11
Python vs. Anaconda
• You can download Python for free from python.org .
- Python with its standard library.
• Or, you could download the Anaconda distribution.
- Python + tons of installed packages + package managers.
• It’s the same Python, but Anaconda includes both pip and
also with it’s own package manager, conda.
11
Dato Confidential12
pip vs. conda vs. virtualenv
pip – install Python packages.
conda – install Python packages + any OS packages required
for your package to work (libraries etc).
$ conda install -c menpo opencv3=3.1.0
virtualenv – separate environment (by manipulating the
$PYTHONPATH etc.) so packages won’t break.
You can have multiple Python versions on the same machine,
and use a Python version in different environments.
12
Dato Confidential13
Look Deeper!
1) Building a Face Recognition System with OpenCV in the blink of an Eye
• https://github.com/rragundez/PyData
• Live video from webcam, online analytics
2) Using mxnet for deep feature extraction
• https://github.com/dmlc/mxnet/blob/master/example/notebooks/predict-
with-pretrained-model.ipynb
• mxnet is now integrated into GraphLab!
3) mxnet-face
• https://github.com/tornadomeet/mxnet-face
Dato Confidential
Confidential – Dato internal use only. ©2015 Dato, Inc.
Questions?
“For the purpose of learning the Answer to the
Ultimate Question of Life, The Universe, and Everything,
the supercomputer Deep Thought was specially built.
It takes Deep Thought 7½ million years to compute and check the
answer, which turns out to be 42. Deep Thought points out that
the answer seems meaningless because
the beings who instructed it
never actually knew what the Question was.”
- Douglas Adams, “The Hitchhiker’s Guide to the Galaxy”
Dato Confidential15
Our Machine Learning Specialization
in Coursera
https://www.coursera.org/learn/ml-foundations
Dato Confidential
Confidential – Dato internal use only. ©2015 Dato, Inc.
Thanks!
Install using pip: $ pip install -U graphlab-create
Dato Launcher Download:
https://dato.com/download/
The benchmarks on GitHub:
https://github.com/guy4261/glc_pagerank_benchmark
Coursera Course:
https://www.coursera.org/learn/ml-foundations
Reach out: guy@dato.com

More Related Content

What's hot

Driverless AI - Arno Candel, H2O.ai
Driverless AI - Arno Candel, H2O.aiDriverless AI - Arno Candel, H2O.ai
Driverless AI - Arno Candel, H2O.aiSri Ambati
 
The A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsThe A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsDataPhoenix
 
The More the Merrier: Scaling Model Building Infrastructure at Zendesk
The More the Merrier: Scaling Model Building Infrastructure at ZendeskThe More the Merrier: Scaling Model Building Infrastructure at Zendesk
The More the Merrier: Scaling Model Building Infrastructure at ZendeskDatabricks
 
Real time analytics @ netflix
Real time analytics @ netflixReal time analytics @ netflix
Real time analytics @ netflixCody Rioux
 
Productionizing Machine Learning in Our Health and Wellness Marketplace
Productionizing Machine Learning in Our Health and Wellness MarketplaceProductionizing Machine Learning in Our Health and Wellness Marketplace
Productionizing Machine Learning in Our Health and Wellness MarketplaceDatabricks
 
Machine Learning In Production
Machine Learning In ProductionMachine Learning In Production
Machine Learning In ProductionSamir Bessalah
 
BigML Webcast: September 25, 2013
BigML Webcast:  September 25, 2013BigML Webcast:  September 25, 2013
BigML Webcast: September 25, 2013BigML, Inc
 
Helping data scientists escape the seduction of the sandbox - Krish Swamy, We...
Helping data scientists escape the seduction of the sandbox - Krish Swamy, We...Helping data scientists escape the seduction of the sandbox - Krish Swamy, We...
Helping data scientists escape the seduction of the sandbox - Krish Swamy, We...Sri Ambati
 
DATA @ NFLX (Tableau Conference 2014 Presentation)
DATA @ NFLX (Tableau Conference 2014 Presentation)DATA @ NFLX (Tableau Conference 2014 Presentation)
DATA @ NFLX (Tableau Conference 2014 Presentation)Blake Irvine
 
UX Analytics for Data-driven Product Development
UX Analytics for Data-driven Product DevelopmentUX Analytics for Data-driven Product Development
UX Analytics for Data-driven Product DevelopmentTrieu Nguyen
 
Introduction to Distributed Computing Engines for Data Processing - Simone Ro...
Introduction to Distributed Computing Engines for Data Processing - Simone Ro...Introduction to Distributed Computing Engines for Data Processing - Simone Ro...
Introduction to Distributed Computing Engines for Data Processing - Simone Ro...Data Science Milan
 
BigML Winter 2015 Release Webinar
BigML Winter 2015 Release WebinarBigML Winter 2015 Release Webinar
BigML Winter 2015 Release WebinarBigML, Inc
 
The Past, Present, and Future of Machine Learning APIs
The Past, Present, and Future of Machine Learning APIsThe Past, Present, and Future of Machine Learning APIs
The Past, Present, and Future of Machine Learning APIsBigML, Inc
 
Quoc Le at AI Frontiers : Automated Machine Learning
Quoc Le at AI Frontiers : Automated Machine LearningQuoc Le at AI Frontiers : Automated Machine Learning
Quoc Le at AI Frontiers : Automated Machine LearningAI Frontiers
 
Data Science Salon: Kaggle 1st Place in 30 minutes: Putting AutoML to Work wi...
Data Science Salon: Kaggle 1st Place in 30 minutes: Putting AutoML to Work wi...Data Science Salon: Kaggle 1st Place in 30 minutes: Putting AutoML to Work wi...
Data Science Salon: Kaggle 1st Place in 30 minutes: Putting AutoML to Work wi...Formulatedby
 
2016 Tableau in the Cloud - A Netflix Original (AWS Re:invent)
2016 Tableau in the Cloud - A Netflix Original (AWS Re:invent)2016 Tableau in the Cloud - A Netflix Original (AWS Re:invent)
2016 Tableau in the Cloud - A Netflix Original (AWS Re:invent)Albert Wong
 
Production ready big ml workflows from zero to hero daniel marcous @ waze
Production ready big ml workflows from zero to hero daniel marcous @ wazeProduction ready big ml workflows from zero to hero daniel marcous @ waze
Production ready big ml workflows from zero to hero daniel marcous @ wazeIdo Shilon
 
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Sri Ambati
 
H2O World - Building a Smarter Application - Tom Kraljevic
H2O World - Building a Smarter Application - Tom KraljevicH2O World - Building a Smarter Application - Tom Kraljevic
H2O World - Building a Smarter Application - Tom KraljevicSri Ambati
 

What's hot (20)

Driverless AI - Arno Candel, H2O.ai
Driverless AI - Arno Candel, H2O.aiDriverless AI - Arno Candel, H2O.ai
Driverless AI - Arno Candel, H2O.ai
 
The A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsThe A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOps
 
The More the Merrier: Scaling Model Building Infrastructure at Zendesk
The More the Merrier: Scaling Model Building Infrastructure at ZendeskThe More the Merrier: Scaling Model Building Infrastructure at Zendesk
The More the Merrier: Scaling Model Building Infrastructure at Zendesk
 
Real time analytics @ netflix
Real time analytics @ netflixReal time analytics @ netflix
Real time analytics @ netflix
 
Productionizing Machine Learning in Our Health and Wellness Marketplace
Productionizing Machine Learning in Our Health and Wellness MarketplaceProductionizing Machine Learning in Our Health and Wellness Marketplace
Productionizing Machine Learning in Our Health and Wellness Marketplace
 
Machine Learning In Production
Machine Learning In ProductionMachine Learning In Production
Machine Learning In Production
 
BigML Webcast: September 25, 2013
BigML Webcast:  September 25, 2013BigML Webcast:  September 25, 2013
BigML Webcast: September 25, 2013
 
Helping data scientists escape the seduction of the sandbox - Krish Swamy, We...
Helping data scientists escape the seduction of the sandbox - Krish Swamy, We...Helping data scientists escape the seduction of the sandbox - Krish Swamy, We...
Helping data scientists escape the seduction of the sandbox - Krish Swamy, We...
 
DATA @ NFLX (Tableau Conference 2014 Presentation)
DATA @ NFLX (Tableau Conference 2014 Presentation)DATA @ NFLX (Tableau Conference 2014 Presentation)
DATA @ NFLX (Tableau Conference 2014 Presentation)
 
UX Analytics for Data-driven Product Development
UX Analytics for Data-driven Product DevelopmentUX Analytics for Data-driven Product Development
UX Analytics for Data-driven Product Development
 
Generative models in the arts
Generative models in the artsGenerative models in the arts
Generative models in the arts
 
Introduction to Distributed Computing Engines for Data Processing - Simone Ro...
Introduction to Distributed Computing Engines for Data Processing - Simone Ro...Introduction to Distributed Computing Engines for Data Processing - Simone Ro...
Introduction to Distributed Computing Engines for Data Processing - Simone Ro...
 
BigML Winter 2015 Release Webinar
BigML Winter 2015 Release WebinarBigML Winter 2015 Release Webinar
BigML Winter 2015 Release Webinar
 
The Past, Present, and Future of Machine Learning APIs
The Past, Present, and Future of Machine Learning APIsThe Past, Present, and Future of Machine Learning APIs
The Past, Present, and Future of Machine Learning APIs
 
Quoc Le at AI Frontiers : Automated Machine Learning
Quoc Le at AI Frontiers : Automated Machine LearningQuoc Le at AI Frontiers : Automated Machine Learning
Quoc Le at AI Frontiers : Automated Machine Learning
 
Data Science Salon: Kaggle 1st Place in 30 minutes: Putting AutoML to Work wi...
Data Science Salon: Kaggle 1st Place in 30 minutes: Putting AutoML to Work wi...Data Science Salon: Kaggle 1st Place in 30 minutes: Putting AutoML to Work wi...
Data Science Salon: Kaggle 1st Place in 30 minutes: Putting AutoML to Work wi...
 
2016 Tableau in the Cloud - A Netflix Original (AWS Re:invent)
2016 Tableau in the Cloud - A Netflix Original (AWS Re:invent)2016 Tableau in the Cloud - A Netflix Original (AWS Re:invent)
2016 Tableau in the Cloud - A Netflix Original (AWS Re:invent)
 
Production ready big ml workflows from zero to hero daniel marcous @ waze
Production ready big ml workflows from zero to hero daniel marcous @ wazeProduction ready big ml workflows from zero to hero daniel marcous @ waze
Production ready big ml workflows from zero to hero daniel marcous @ waze
 
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
 
H2O World - Building a Smarter Application - Tom Kraljevic
H2O World - Building a Smarter Application - Tom KraljevicH2O World - Building a Smarter Application - Tom Kraljevic
H2O World - Building a Smarter Application - Tom Kraljevic
 

Viewers also liked

Webinar - Pattern Mining Log Data - Vega (20160426)
Webinar - Pattern Mining Log Data - Vega (20160426)Webinar - Pattern Mining Log Data - Vega (20160426)
Webinar - Pattern Mining Log Data - Vega (20160426)Turi, Inc.
 
Intelligent Applications with Machine Learning Toolkits
Intelligent Applications with Machine Learning ToolkitsIntelligent Applications with Machine Learning Toolkits
Intelligent Applications with Machine Learning ToolkitsTuri, Inc.
 
Machine Learning with GraphLab Create
Machine Learning with GraphLab CreateMachine Learning with GraphLab Create
Machine Learning with GraphLab CreateTuri, Inc.
 
Webinar - Know Your Customer - Arya (20160526)
Webinar - Know Your Customer - Arya (20160526)Webinar - Know Your Customer - Arya (20160526)
Webinar - Know Your Customer - Arya (20160526)Turi, Inc.
 
Pattern Mining: Extracting Value from Log Data
Pattern Mining: Extracting Value from Log DataPattern Mining: Extracting Value from Log Data
Pattern Mining: Extracting Value from Log DataTuri, Inc.
 
Cassandra synergy
Cassandra synergyCassandra synergy
Cassandra synergyniallmilton
 
Webinar - Fraud Detection - Palombo (20160428)
Webinar - Fraud Detection - Palombo (20160428)Webinar - Fraud Detection - Palombo (20160428)
Webinar - Fraud Detection - Palombo (20160428)Turi, Inc.
 
Accenture maximizing-customer-retention
Accenture maximizing-customer-retentionAccenture maximizing-customer-retention
Accenture maximizing-customer-retentionKhellil Khellil
 
T-Mobile: Kiss Churn Goodbye with Data-Driven Campaign Management
T-Mobile: Kiss Churn Goodbye with Data-Driven Campaign ManagementT-Mobile: Kiss Churn Goodbye with Data-Driven Campaign Management
T-Mobile: Kiss Churn Goodbye with Data-Driven Campaign ManagementVivastream
 
The Challenges of Bringing Machine Learning to the Masses
The Challenges of Bringing Machine Learning to the MassesThe Challenges of Bringing Machine Learning to the Masses
The Challenges of Bringing Machine Learning to the MassesAlice Zheng
 
Presentation Churn Management
Presentation Churn ManagementPresentation Churn Management
Presentation Churn Managementfarhanmajeed
 
Churn Analysis in Telecom Industry
Churn Analysis in Telecom IndustryChurn Analysis in Telecom Industry
Churn Analysis in Telecom IndustrySatyam Barsaiyan
 
churn prediction in telecom
churn prediction in telecom churn prediction in telecom
churn prediction in telecom Hong Bui Van
 
Get started with dropbox
Get started with dropboxGet started with dropbox
Get started with dropboxBeverly Solano
 
2012 DuPage Environmental Summit
2012 DuPage Environmental Summit2012 DuPage Environmental Summit
2012 DuPage Environmental SummitNapervilleNCEC
 
Setting up Your LinkedIn Account
Setting up Your LinkedIn AccountSetting up Your LinkedIn Account
Setting up Your LinkedIn AccountNET:101
 
Xstrata Article
Xstrata ArticleXstrata Article
Xstrata ArticleVicki Shaw
 

Viewers also liked (20)

Webinar - Pattern Mining Log Data - Vega (20160426)
Webinar - Pattern Mining Log Data - Vega (20160426)Webinar - Pattern Mining Log Data - Vega (20160426)
Webinar - Pattern Mining Log Data - Vega (20160426)
 
Intelligent Applications with Machine Learning Toolkits
Intelligent Applications with Machine Learning ToolkitsIntelligent Applications with Machine Learning Toolkits
Intelligent Applications with Machine Learning Toolkits
 
Machine Learning with GraphLab Create
Machine Learning with GraphLab CreateMachine Learning with GraphLab Create
Machine Learning with GraphLab Create
 
Webinar - Know Your Customer - Arya (20160526)
Webinar - Know Your Customer - Arya (20160526)Webinar - Know Your Customer - Arya (20160526)
Webinar - Know Your Customer - Arya (20160526)
 
Pattern Mining: Extracting Value from Log Data
Pattern Mining: Extracting Value from Log DataPattern Mining: Extracting Value from Log Data
Pattern Mining: Extracting Value from Log Data
 
Crystal qube™ presentation tpr
Crystal qube™ presentation tprCrystal qube™ presentation tpr
Crystal qube™ presentation tpr
 
Cassandra synergy
Cassandra synergyCassandra synergy
Cassandra synergy
 
Webinar - Fraud Detection - Palombo (20160428)
Webinar - Fraud Detection - Palombo (20160428)Webinar - Fraud Detection - Palombo (20160428)
Webinar - Fraud Detection - Palombo (20160428)
 
Accenture maximizing-customer-retention
Accenture maximizing-customer-retentionAccenture maximizing-customer-retention
Accenture maximizing-customer-retention
 
T-Mobile: Kiss Churn Goodbye with Data-Driven Campaign Management
T-Mobile: Kiss Churn Goodbye with Data-Driven Campaign ManagementT-Mobile: Kiss Churn Goodbye with Data-Driven Campaign Management
T-Mobile: Kiss Churn Goodbye with Data-Driven Campaign Management
 
The Challenges of Bringing Machine Learning to the Masses
The Challenges of Bringing Machine Learning to the MassesThe Challenges of Bringing Machine Learning to the Masses
The Challenges of Bringing Machine Learning to the Masses
 
Presentation Churn Management
Presentation Churn ManagementPresentation Churn Management
Presentation Churn Management
 
Churn Analysis in Telecom Industry
Churn Analysis in Telecom IndustryChurn Analysis in Telecom Industry
Churn Analysis in Telecom Industry
 
churn prediction in telecom
churn prediction in telecom churn prediction in telecom
churn prediction in telecom
 
Haiti
HaitiHaiti
Haiti
 
Get started with dropbox
Get started with dropboxGet started with dropbox
Get started with dropbox
 
2012 DuPage Environmental Summit
2012 DuPage Environmental Summit2012 DuPage Environmental Summit
2012 DuPage Environmental Summit
 
Resume
ResumeResume
Resume
 
Setting up Your LinkedIn Account
Setting up Your LinkedIn AccountSetting up Your LinkedIn Account
Setting up Your LinkedIn Account
 
Xstrata Article
Xstrata ArticleXstrata Article
Xstrata Article
 

Similar to Webinar - Analyzing Video

Use open source software to develop ideas at work
Use open source software to develop ideas at workUse open source software to develop ideas at work
Use open source software to develop ideas at workSammy Fung
 
Jose l ugia 6 wunderkinder, momenta
Jose l ugia  6 wunderkinder, momentaJose l ugia  6 wunderkinder, momenta
Jose l ugia 6 wunderkinder, momentaapps4allru
 
Build your cross-platform service in a week with App Engine
Build your cross-platform service in a week with App EngineBuild your cross-platform service in a week with App Engine
Build your cross-platform service in a week with App EngineJl_Ugia
 
DockerDay2015: Keynote
DockerDay2015: KeynoteDockerDay2015: Keynote
DockerDay2015: KeynoteDocker-Hanoi
 
Machine learning in cybersecutiry
Machine learning in cybersecutiryMachine learning in cybersecutiry
Machine learning in cybersecutiryVishwas N
 
Introduction of python programming
Introduction of python programmingIntroduction of python programming
Introduction of python programmingNitin Kumar Kashyap
 
Netflix Open Source: Building a Distributed and Automated Open Source Program
Netflix Open Source:  Building a Distributed and Automated Open Source ProgramNetflix Open Source:  Building a Distributed and Automated Open Source Program
Netflix Open Source: Building a Distributed and Automated Open Source Programaspyker
 
Building a Distributed & Automated Open Source Program at Netflix
Building a Distributed & Automated Open Source Program at NetflixBuilding a Distributed & Automated Open Source Program at Netflix
Building a Distributed & Automated Open Source Program at NetflixAll Things Open
 
Samsung SDS OpeniT - The possibility of Python
Samsung SDS OpeniT - The possibility of PythonSamsung SDS OpeniT - The possibility of Python
Samsung SDS OpeniT - The possibility of PythonInsuk (Chris) Cho
 
What is Python? An overview of Python for science.
What is Python? An overview of Python for science.What is Python? An overview of Python for science.
What is Python? An overview of Python for science.Nicholas Pringle
 
IOT with Drupal 8 - Webinar Hyderabad Drupal Community
IOT with Drupal 8 -  Webinar Hyderabad Drupal CommunityIOT with Drupal 8 -  Webinar Hyderabad Drupal Community
IOT with Drupal 8 - Webinar Hyderabad Drupal CommunityPrateek Jain
 
Data science tools of the trade
Data science tools of the tradeData science tools of the trade
Data science tools of the tradeFangda Wang
 
Continuous Delivery for Python Developers – PyCon Otto
Continuous Delivery for Python Developers – PyCon OttoContinuous Delivery for Python Developers – PyCon Otto
Continuous Delivery for Python Developers – PyCon OttoPeter Bittner
 
Google cloud Study Jam 2023.pptx
Google cloud Study Jam 2023.pptxGoogle cloud Study Jam 2023.pptx
Google cloud Study Jam 2023.pptxGDSCNiT
 
Cytoscape: Now and Future
Cytoscape: Now and FutureCytoscape: Now and Future
Cytoscape: Now and FutureKeiichiro Ono
 
Next Generation Vulnerability Assessment Using Datadog and Snyk
Next Generation Vulnerability Assessment Using Datadog and SnykNext Generation Vulnerability Assessment Using Datadog and Snyk
Next Generation Vulnerability Assessment Using Datadog and SnykDevOps.com
 
Dublin Unity User Group Meetup Sept 2015
Dublin Unity User Group Meetup Sept 2015Dublin Unity User Group Meetup Sept 2015
Dublin Unity User Group Meetup Sept 2015Dominique Boutin
 
EclipseCon 2016 - OCCIware : one Cloud API to rule them all
EclipseCon 2016 - OCCIware : one Cloud API to rule them allEclipseCon 2016 - OCCIware : one Cloud API to rule them all
EclipseCon 2016 - OCCIware : one Cloud API to rule them allMarc Dutoo
 

Similar to Webinar - Analyzing Video (20)

Use open source software to develop ideas at work
Use open source software to develop ideas at workUse open source software to develop ideas at work
Use open source software to develop ideas at work
 
Jose l ugia 6 wunderkinder, momenta
Jose l ugia  6 wunderkinder, momentaJose l ugia  6 wunderkinder, momenta
Jose l ugia 6 wunderkinder, momenta
 
Build your cross-platform service in a week with App Engine
Build your cross-platform service in a week with App EngineBuild your cross-platform service in a week with App Engine
Build your cross-platform service in a week with App Engine
 
DockerDay2015: Keynote
DockerDay2015: KeynoteDockerDay2015: Keynote
DockerDay2015: Keynote
 
Machine learning in cybersecutiry
Machine learning in cybersecutiryMachine learning in cybersecutiry
Machine learning in cybersecutiry
 
Introduction of python programming
Introduction of python programmingIntroduction of python programming
Introduction of python programming
 
Netflix Open Source: Building a Distributed and Automated Open Source Program
Netflix Open Source:  Building a Distributed and Automated Open Source ProgramNetflix Open Source:  Building a Distributed and Automated Open Source Program
Netflix Open Source: Building a Distributed and Automated Open Source Program
 
Building a Distributed & Automated Open Source Program at Netflix
Building a Distributed & Automated Open Source Program at NetflixBuilding a Distributed & Automated Open Source Program at Netflix
Building a Distributed & Automated Open Source Program at Netflix
 
Samsung SDS OpeniT - The possibility of Python
Samsung SDS OpeniT - The possibility of PythonSamsung SDS OpeniT - The possibility of Python
Samsung SDS OpeniT - The possibility of Python
 
What is Python? An overview of Python for science.
What is Python? An overview of Python for science.What is Python? An overview of Python for science.
What is Python? An overview of Python for science.
 
IOT with Drupal 8 - Webinar Hyderabad Drupal Community
IOT with Drupal 8 -  Webinar Hyderabad Drupal CommunityIOT with Drupal 8 -  Webinar Hyderabad Drupal Community
IOT with Drupal 8 - Webinar Hyderabad Drupal Community
 
Data science tools of the trade
Data science tools of the tradeData science tools of the trade
Data science tools of the trade
 
Continuous Delivery for Python Developers – PyCon Otto
Continuous Delivery for Python Developers – PyCon OttoContinuous Delivery for Python Developers – PyCon Otto
Continuous Delivery for Python Developers – PyCon Otto
 
Google cloud Study Jam 2023.pptx
Google cloud Study Jam 2023.pptxGoogle cloud Study Jam 2023.pptx
Google cloud Study Jam 2023.pptx
 
Supratik_CV_Photo
Supratik_CV_PhotoSupratik_CV_Photo
Supratik_CV_Photo
 
Supratik_CV_Photo
Supratik_CV_PhotoSupratik_CV_Photo
Supratik_CV_Photo
 
Cytoscape: Now and Future
Cytoscape: Now and FutureCytoscape: Now and Future
Cytoscape: Now and Future
 
Next Generation Vulnerability Assessment Using Datadog and Snyk
Next Generation Vulnerability Assessment Using Datadog and SnykNext Generation Vulnerability Assessment Using Datadog and Snyk
Next Generation Vulnerability Assessment Using Datadog and Snyk
 
Dublin Unity User Group Meetup Sept 2015
Dublin Unity User Group Meetup Sept 2015Dublin Unity User Group Meetup Sept 2015
Dublin Unity User Group Meetup Sept 2015
 
EclipseCon 2016 - OCCIware : one Cloud API to rule them all
EclipseCon 2016 - OCCIware : one Cloud API to rule them allEclipseCon 2016 - OCCIware : one Cloud API to rule them all
EclipseCon 2016 - OCCIware : one Cloud API to rule them all
 

More from Turi, Inc.

Webinar - Product Matching - Palombo (20160428)
Webinar - Product Matching - Palombo (20160428)Webinar - Product Matching - Palombo (20160428)
Webinar - Product Matching - Palombo (20160428)Turi, Inc.
 
Text Analysis with Machine Learning
Text Analysis with Machine LearningText Analysis with Machine Learning
Text Analysis with Machine LearningTuri, Inc.
 
Machine Learning in Production with Dato Predictive Services
Machine Learning in Production with Dato Predictive ServicesMachine Learning in Production with Dato Predictive Services
Machine Learning in Production with Dato Predictive ServicesTuri, Inc.
 
Machine Learning in 2016: Live Q&A with Carlos Guestrin
Machine Learning in 2016: Live Q&A with Carlos GuestrinMachine Learning in 2016: Live Q&A with Carlos Guestrin
Machine Learning in 2016: Live Q&A with Carlos GuestrinTuri, Inc.
 
Scalable data structures for data science
Scalable data structures for data scienceScalable data structures for data science
Scalable data structures for data scienceTuri, Inc.
 
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015Turi, Inc.
 
Introduction to Recommender Systems
Introduction to Recommender SystemsIntroduction to Recommender Systems
Introduction to Recommender SystemsTuri, Inc.
 
Machine learning in production
Machine learning in productionMachine learning in production
Machine learning in productionTuri, Inc.
 
Overview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringOverview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringTuri, Inc.
 
Building Personalized Data Products with Dato
Building Personalized Data Products with DatoBuilding Personalized Data Products with Dato
Building Personalized Data Products with DatoTuri, Inc.
 
Getting Started With Dato - August 2015
Getting Started With Dato - August 2015Getting Started With Dato - August 2015
Getting Started With Dato - August 2015Turi, Inc.
 
Towards a Comprehensive Machine Learning Benchmark
Towards a Comprehensive Machine Learning BenchmarkTowards a Comprehensive Machine Learning Benchmark
Towards a Comprehensive Machine Learning BenchmarkTuri, Inc.
 
New Capabilities in the PyData Ecosystem
New Capabilities in the PyData EcosystemNew Capabilities in the PyData Ecosystem
New Capabilities in the PyData EcosystemTuri, Inc.
 
Anomaly Detection Using Isolation Forests
Anomaly Detection Using Isolation ForestsAnomaly Detection Using Isolation Forests
Anomaly Detection Using Isolation ForestsTuri, Inc.
 
Data! Data! Data! I Can't Make Bricks Without Clay!
Data! Data! Data! I Can't Make Bricks Without Clay!Data! Data! Data! I Can't Make Bricks Without Clay!
Data! Data! Data! I Can't Make Bricks Without Clay!Turi, Inc.
 
Declarative Machine Learning: Bring your own Syntax, Algorithm, Data and Infr...
Declarative Machine Learning: Bring your own Syntax, Algorithm, Data and Infr...Declarative Machine Learning: Bring your own Syntax, Algorithm, Data and Infr...
Declarative Machine Learning: Bring your own Syntax, Algorithm, Data and Infr...Turi, Inc.
 
Pandas & Cloudera: Scaling the Python Data Experience
Pandas & Cloudera: Scaling the Python Data ExperiencePandas & Cloudera: Scaling the Python Data Experience
Pandas & Cloudera: Scaling the Python Data ExperienceTuri, Inc.
 
Better {ML} Together: GraphLab Create + Spark
Better {ML} Together: GraphLab Create + Spark Better {ML} Together: GraphLab Create + Spark
Better {ML} Together: GraphLab Create + Spark Turi, Inc.
 

More from Turi, Inc. (20)

Webinar - Product Matching - Palombo (20160428)
Webinar - Product Matching - Palombo (20160428)Webinar - Product Matching - Palombo (20160428)
Webinar - Product Matching - Palombo (20160428)
 
Text Analysis with Machine Learning
Text Analysis with Machine LearningText Analysis with Machine Learning
Text Analysis with Machine Learning
 
Machine Learning in Production with Dato Predictive Services
Machine Learning in Production with Dato Predictive ServicesMachine Learning in Production with Dato Predictive Services
Machine Learning in Production with Dato Predictive Services
 
Machine Learning in 2016: Live Q&A with Carlos Guestrin
Machine Learning in 2016: Live Q&A with Carlos GuestrinMachine Learning in 2016: Live Q&A with Carlos Guestrin
Machine Learning in 2016: Live Q&A with Carlos Guestrin
 
Scalable data structures for data science
Scalable data structures for data scienceScalable data structures for data science
Scalable data structures for data science
 
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
 
Introduction to Recommender Systems
Introduction to Recommender SystemsIntroduction to Recommender Systems
Introduction to Recommender Systems
 
Machine learning in production
Machine learning in productionMachine learning in production
Machine learning in production
 
Overview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringOverview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature Engineering
 
SFrame
SFrameSFrame
SFrame
 
Building Personalized Data Products with Dato
Building Personalized Data Products with DatoBuilding Personalized Data Products with Dato
Building Personalized Data Products with Dato
 
Getting Started With Dato - August 2015
Getting Started With Dato - August 2015Getting Started With Dato - August 2015
Getting Started With Dato - August 2015
 
Towards a Comprehensive Machine Learning Benchmark
Towards a Comprehensive Machine Learning BenchmarkTowards a Comprehensive Machine Learning Benchmark
Towards a Comprehensive Machine Learning Benchmark
 
Dato Keynote
Dato KeynoteDato Keynote
Dato Keynote
 
New Capabilities in the PyData Ecosystem
New Capabilities in the PyData EcosystemNew Capabilities in the PyData Ecosystem
New Capabilities in the PyData Ecosystem
 
Anomaly Detection Using Isolation Forests
Anomaly Detection Using Isolation ForestsAnomaly Detection Using Isolation Forests
Anomaly Detection Using Isolation Forests
 
Data! Data! Data! I Can't Make Bricks Without Clay!
Data! Data! Data! I Can't Make Bricks Without Clay!Data! Data! Data! I Can't Make Bricks Without Clay!
Data! Data! Data! I Can't Make Bricks Without Clay!
 
Declarative Machine Learning: Bring your own Syntax, Algorithm, Data and Infr...
Declarative Machine Learning: Bring your own Syntax, Algorithm, Data and Infr...Declarative Machine Learning: Bring your own Syntax, Algorithm, Data and Infr...
Declarative Machine Learning: Bring your own Syntax, Algorithm, Data and Infr...
 
Pandas & Cloudera: Scaling the Python Data Experience
Pandas & Cloudera: Scaling the Python Data ExperiencePandas & Cloudera: Scaling the Python Data Experience
Pandas & Cloudera: Scaling the Python Data Experience
 
Better {ML} Together: GraphLab Create + Spark
Better {ML} Together: GraphLab Create + Spark Better {ML} Together: GraphLab Create + Spark
Better {ML} Together: GraphLab Create + Spark
 

Recently uploaded

Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 

Recently uploaded (20)

Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 

Webinar - Analyzing Video

  • 1. Dato Confidential1 Analyzing Video with GraphLab Create June 16, 2016 Guy Rapaport, Data Scientist, Dato EMEA guy@dato.com
  • 2. Dato Confidential2 Dato: We Intelligent Applications
  • 3. Dato Confidential Some of our Customers 3
  • 4. Dato Confidential4 Business must be intelligent Machine learning applications • Recommenders • Fraud detection • Ad targeting • Financial models • Personalized medicine • Churn prediction • Smart UX (video & text) • Personal assistants • IoT • Socials networks • Log analysis Last decade: Data management Now: Intelligent apps ? Last 5 years: Traditional analytics
  • 6. Dato Confidential Creating a model pipeline exploration data modeling - Images - Text - Graphs - Tabular Data
  • 7. Dato Confidential Creating a model pipeline Ingest Transform Model Deploy Unstructured Data
  • 8. Dato Confidential Creating a model pipeline using Dato products Ingest Transform Model Deploy Unstructured Data SFrame Engine (FREE, open source) GraphLab Create (Scalable Machine Learning Python Library, 4K/machine/year) Predictive Services (Serving + Load Balancing + AB Testing, 10K/machine/year)
  • 9. Dato Confidential9 $ pip install –U graphlab-create
  • 10. Dato Confidential10 What will we cover today? 1. Match a movie’s screenplay with its subtitles. - Now we know who says what and when. 2. Extract frames, then actors’ faces, from the movie. - We’ll use opencv for video manipulation and face detection. 3. Train a face recognition model over the faces. - What’s the smallest portion of the movie we can get good results from? 10
  • 11. Dato Confidential11 Python vs. Anaconda • You can download Python for free from python.org . - Python with its standard library. • Or, you could download the Anaconda distribution. - Python + tons of installed packages + package managers. • It’s the same Python, but Anaconda includes both pip and also with it’s own package manager, conda. 11
  • 12. Dato Confidential12 pip vs. conda vs. virtualenv pip – install Python packages. conda – install Python packages + any OS packages required for your package to work (libraries etc). $ conda install -c menpo opencv3=3.1.0 virtualenv – separate environment (by manipulating the $PYTHONPATH etc.) so packages won’t break. You can have multiple Python versions on the same machine, and use a Python version in different environments. 12
  • 13. Dato Confidential13 Look Deeper! 1) Building a Face Recognition System with OpenCV in the blink of an Eye • https://github.com/rragundez/PyData • Live video from webcam, online analytics 2) Using mxnet for deep feature extraction • https://github.com/dmlc/mxnet/blob/master/example/notebooks/predict- with-pretrained-model.ipynb • mxnet is now integrated into GraphLab! 3) mxnet-face • https://github.com/tornadomeet/mxnet-face
  • 14. Dato Confidential Confidential – Dato internal use only. ©2015 Dato, Inc. Questions? “For the purpose of learning the Answer to the Ultimate Question of Life, The Universe, and Everything, the supercomputer Deep Thought was specially built. It takes Deep Thought 7½ million years to compute and check the answer, which turns out to be 42. Deep Thought points out that the answer seems meaningless because the beings who instructed it never actually knew what the Question was.” - Douglas Adams, “The Hitchhiker’s Guide to the Galaxy”
  • 15. Dato Confidential15 Our Machine Learning Specialization in Coursera https://www.coursera.org/learn/ml-foundations
  • 16. Dato Confidential Confidential – Dato internal use only. ©2015 Dato, Inc. Thanks! Install using pip: $ pip install -U graphlab-create Dato Launcher Download: https://dato.com/download/ The benchmarks on GitHub: https://github.com/guy4261/glc_pagerank_benchmark Coursera Course: https://www.coursera.org/learn/ml-foundations Reach out: guy@dato.com

Editor's Notes

  1.  The team, the history of the product
  2. Company began 7 years ago in Carnegie Mellon University as an open-source project. Now a company with 50+ employees and a recently opened EMEA office here in Israel. Customers 
  3. Yes, we are selling  (100+ paying customers, brand names)  Intelligent apps are predictive
  4. From analytics (queries over known data) to predictive (discovering the unknown). Supported data types 
  5. # end of corporate slides GLC in a line 
  6. Steps in the model pipeline creation 
  7. From inspiration to production 
  8. The tools that we are making and what are they doing for this pipeline. My goal today is that you’ll install it. 
  9. My goal today.
  10. Check our Coursera course 
  11. Thanks 