1
Utilizing Spark Streaming to analyze real-time sport data feeds
Demonstration
2
4.5 Trillion
Frames per second
60 Frames
Visible to the human eye
3
Camera Tracking Systems
An array of cameras around the field captures the player and ball
positions LIVE
4
So what?
• Cool use case and all, but what's the value?
• Real-time streams from robotic manufacturing (Audi, Ford, BMW, Toyota)
• Real-time traffic analysis for Smart Cities / Theme Parks (Denver, Cincinnati,
London, Disney, Universal)
• Real-time mechanical data from devices (Aircraft - Air France, Windmills – GE)
• And before you discount this whole sports thing
• UK tax office collects 1.3B pounds (~2B USD) in taxes each year from EPL teams
• Greater than the GDP of the bottom 25% of all countries
• 95 billion dollars wagered annually on NFL and college football
• #1 on Forbes 2000 list by a lot…
5
6
7
What version do you need to solve the problem?
8
Flow
9
Raw vs Encoded
150 Mbps at 4K per camera
Stadiums have on average 20-30 cameras
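As a rough sanity check on these figures, the aggregate raw bandwidth can be estimated by multiplying the per-camera bitrate by the camera count. The sketch below assumes every camera streams at the full 150 Mbps at the same time, which the slide does not state; the function names are illustrative.

```python
# Back-of-the-envelope estimate of aggregate camera bandwidth.
# Assumption (not from the slides): all cameras stream at the full rate at once.

MBPS_PER_CAMERA = 150      # per-camera bitrate at 4K, from the slide
CAMERA_COUNTS = (20, 30)   # typical stadium range, from the slide


def aggregate_gbps(cameras: int, mbps_per_camera: float = MBPS_PER_CAMERA) -> float:
    """Return the total bitrate in gigabits per second (1 Gbps = 1000 Mbps)."""
    return cameras * mbps_per_camera / 1000


def gigabytes_per_minute(gbps: float) -> float:
    """Convert a bitrate in Gbps to gigabytes transferred per minute."""
    return gbps * 60 / 8


if __name__ == "__main__":
    for n in CAMERA_COUNTS:
        total = aggregate_gbps(n)
        print(f"{n} cameras: {total:.1f} Gbps, "
              f"about {gigabytes_per_minute(total):.2f} GB per minute")
```

Under that assumption, a stadium sits in the low single-digit Gbps range before any encoding or description step reduces the volume.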
10
From Seen To Described
Gigabytes of video data reduced to KB/MB of description data
Most applications that do the conversion are proprietary,
but the usual suspects are investing in this space
11
Phone home?
Data tends to be JSON or XML
ONVIF standard for security
Messaging vs Web services?
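To make the JSON option concrete, here is a minimal sketch of what one frame message could look like and how it might be parsed. The message layout and every field name (`frame`, `timestamp_ms`, `players`, `id`) are assumptions made for illustration; the slides do not show the actual feed schema.

```python
import json
from dataclasses import dataclass
from typing import List

# Hypothetical frame message; field names are illustrative, not a vendor schema.
SAMPLE_MESSAGE = """
{
  "frame": 1042,
  "timestamp_ms": 41680,
  "players": [
    {"id": "home-7", "x": 31.5, "y": 12.0, "z": 0.0},
    {"id": "away-10", "x": 48.2, "y": 30.4, "z": 0.0}
  ]
}
"""


@dataclass(frozen=True)
class Position:
    """One player's coordinates in a single frame."""
    player_id: str
    frame: int
    x: float
    y: float
    z: float


def parse_frame(raw: str) -> List[Position]:
    """Decode one JSON frame message into a list of player positions."""
    doc = json.loads(raw)
    return [
        Position(p["id"], doc["frame"], float(p["x"]), float(p["y"]), float(p["z"]))
        for p in doc["players"]
    ]


if __name__ == "__main__":
    for position in parse_frame(SAMPLE_MESSAGE):
        print(position)
```

A description message of this kind is a few hundred bytes, which is the point of moving from video to description data before sending anything off site.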
12
Where does it reside?
13
©2015 Talend Inc
14
15
16
Our goal:
aggregate the speed and distance run of each player IN REAL TIME
17
Challenges
• The camera array sends a feed of 25 frames per second
• Each frame captures the x,y,z coordinates of every player
• A live feed of sport data is actually pretty serious Big Data!
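The goal of aggregating speed and distance from a 25 frames-per-second coordinate feed can be sketched as follows. This is plain Python rather than the Talend job shown in the demo, and it assumes coordinates are in metres and that each frame is a list of `(player_id, x, y, z)` tuples; both are illustrative choices, not details from the slides.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

FRAME_RATE_HZ = 25  # frames per second, from the slide

# (player_id, x, y, z); units assumed to be metres
Sample = Tuple[str, float, float, float]


def aggregate_distance(frames: Iterable[List[Sample]]) -> Dict[str, float]:
    """Sum the straight-line distance each player covers between consecutive frames."""
    last_seen: Dict[str, Tuple[float, float, float]] = {}
    totals: Dict[str, float] = defaultdict(float)
    for frame in frames:
        for player_id, x, y, z in frame:
            point = (x, y, z)
            prev = last_seen.get(player_id)
            # The first sighting of a player contributes no distance.
            totals[player_id] += math.dist(prev, point) if prev is not None else 0.0
            last_seen[player_id] = point
    return dict(totals)


def average_speed(distance_m: float, frame_count: int,
                  rate_hz: int = FRAME_RATE_HZ) -> float:
    """Average speed in m/s over frame_count frames (frame_count - 1 intervals)."""
    elapsed_s = (frame_count - 1) / rate_hz
    return distance_m / elapsed_s if elapsed_s > 0 else 0.0


if __name__ == "__main__":
    # One second of data: p1 moves 0.3 m per frame along x, p2 stands still.
    frames = [[("p1", 0.3 * i, 0.0, 0.0), ("p2", 0.0, 0.0, 0.0)] for i in range(26)]
    for player, metres in aggregate_distance(frames).items():
        print(player, round(metres, 2), "m,",
              round(average_speed(metres, len(frames)), 2), "m/s")
```

Keeping only the last position per player means the running totals can be updated frame by frame, which is what makes the calculation suitable for a stream rather than a batch.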
18
Analytics Architecture
Database
Ingestion, Process, Store, Visualize, Deliver
ALL designed in Talend – NO coding
19
Kafka Background
Distributed Streaming Platform
• It lets you publish and subscribe to streams of records. In this respect it is similar to a message queue or enterprise messaging system.
• It lets you store streams of records in a fault-tolerant way.
• It lets you process streams of records as they occur.
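The publish, store and subscribe ideas above can be illustrated with a toy in-memory model. This is not Kafka's client API: `TopicLog` and its methods are hypothetical names, and the model leaves out partitions, replication and persistence, so it says nothing about fault tolerance. It only shows that records are appended to a log and that each subscriber reads from its own offset.

```python
from collections import defaultdict
from typing import Dict, List


class TopicLog:
    """Toy append-only log: a stand-in for one topic, not a Kafka client."""

    def __init__(self) -> None:
        self._records: List[str] = []
        # Next offset to read, tracked separately for each subscriber.
        self._offsets: Dict[str, int] = defaultdict(int)

    def publish(self, record: str) -> int:
        """Append a record and return its offset in the log."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, subscriber: str, max_records: int = 10) -> List[str]:
        """Return unread records for this subscriber and advance its offset."""
        start = self._offsets[subscriber]
        batch = self._records[start:start + max_records]
        self._offsets[subscriber] = start + len(batch)
        return batch


if __name__ == "__main__":
    topic = TopicLog()
    for frame_id in range(3):
        topic.publish(f"frame-{frame_id}")
    print(topic.poll("dashboard"))  # reads all three records
    print(topic.poll("archive"))    # an independent subscriber reads them again
    print(topic.poll("dashboard"))  # nothing new for the first subscriber
```

Because records stay in the log after being read, a second consumer can join later and still see the full stream, which is the main difference from a queue that deletes messages on delivery.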
20
Spark Background
• Fast and general engine for large-scale data processing
• Developed in response to processing limitations with MapReduce
• Up to 10x faster than MapReduce on disk
• Up to 100x faster than MapReduce in memory
• Has a stack of libraries including Spark Streaming & MLlib (machine learning)
• Runs everywhere: on Hadoop or standalone
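Spark Streaming handles a live feed by cutting it into small batches and running an ordinary aggregation on each one. The sketch below mimics that micro-batch idea in plain Python; it does not use the Spark API, and the event layout, the one-second interval and the function names are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (timestamp in seconds, player_id, metres covered since the previous frame)
Event = Tuple[float, str, float]


def micro_batches(events: Iterable[Event], interval_s: float) -> List[List[Event]]:
    """Group events into fixed-length time batches, skipping empty intervals."""
    batches: Dict[int, List[Event]] = defaultdict(list)
    for event in events:
        batches[int(event[0] // interval_s)].append(event)
    return [batches[key] for key in sorted(batches)]


def distance_per_player(batch: List[Event]) -> Dict[str, float]:
    """Per-batch aggregation, analogous to a reduce-by-key step."""
    totals: Dict[str, float] = defaultdict(float)
    for _, player_id, metres in batch:
        totals[player_id] += metres
    return dict(totals)


if __name__ == "__main__":
    events = [(0.0, "p1", 0.25), (0.5, "p1", 0.25), (1.0, "p1", 0.5), (1.5, "p2", 1.0)]
    for index, batch in enumerate(micro_batches(events, interval_s=1.0)):
        print("batch", index, distance_per_player(batch))
```

In a real job the batching and the distribution of work across a cluster are handled by the engine; only the per-batch aggregation logic would be supplied by the job itself.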
21
22
Next Step: From Analysis to Prediction
Team stats
Who is the most likely to score
next?
Which team is going to win?
Individual players stats
Which players need a rest/bench?
Which players are being traded?
(bring in historical data)
23
Free Trial: Talend Big Data Sandbox
• A ready-to-run Docker environment
• A step-by-step expert guide: the cookbook
• Real-world scenarios using Spark, Kafka,
MapReduce & NoSQL
• IoT Analytics
• Real-time Recommendation
• Clickstream Analysis
• Weblogs Analysis
• EDW Offload
www.talend.com/BigDataSandbox
Hit the Easy Button for Hadoop, Spark and Machine Learning
24
The NEW Talend Community
• An active community
• 80,000 visitors/week
• 3M total downloads
• Engaged members
• Individual members & partners
• Active User Groups
• 1,000+ components built by the community
25
Talend Data Masters Awards
• Share your Talend story & win $1,500 for your favorite charity
• Deadline: July 28th
• https://info.talend.com/datamasters2017all.html

Analyze real-time sport data feeds with Spark Streaming

  • 1. Utilizing Spark Streaming for analyzing real-time sport data feeds: Demonstration
  • 2. 4.5 trillion frames per second. 60 frames visible to the human eye.
  • 3. Camera Tracking Systems: An array of cameras around the field captures the player and ball positions LIVE
  • 4. So what?
      • Cool use case and all, but what's the value?
      • Real-time streams from robotic manufacturing (Audi, Ford, BMW, Toyota)
      • Real-time traffic analysis for Smart Cities / Theme Parks (Denver, Cincinnati, London, Disney, Universal)
      • Real-time mechanical data from devices (Aircraft – Air France, Windmills – GE)
      • And before you discount this whole sports thing:
      • UK tax office collects 1.3B pounds (~2B USD) in taxes each year from EPL teams
      • Greater than the GDP of the bottom 25% of all countries
      • 95 billion dollars wagered annually on NFL and college football
      • #1 on Forbes 2000 list by a lot…
  • 7. What version do you need to solve the problem?
  • 9. Raw vs Encoded: 150 Mbps at 4K per camera. Stadiums have on average 20-30 cameras.
  • 10. From Seen To Described: Gigs of video data to KB/MB of description data. Most applications that convert are proprietary, but we are seeing investment in the space by the usual suspects.
  • 11. Phone home? Data tends to be JSON or XML. ONVIF standard for security. Messaging vs web services?
  • 12. Where does it reside?
  • 16. Our goal: aggregate the speed and distance run of each player IN REAL TIME
  • 17. Challenges
      • The camera array sends a feed of 25 frames per second
      • Each frame captures the x,y,z coordinates of every player
      • A live feed of sport data is actually pretty serious Big Data!
  • 18. Analytics Architecture: Database, Ingestion, Process, Store, Visualize, Deliver. ALL designed in Talend – NO coding
  • 19. Kafka Background: Distributed Streaming Platform
      • It lets you publish and subscribe to streams of records. In this respect it is similar to a message queue or enterprise messaging system.
      • It lets you store streams of records in a fault-tolerant way.
      • It lets you process streams of records as they occur.
  • 20. Spark Background
      • Fast and general engine for large-scale data processing
      • Developed in response to processing limitations with MapReduce
      • 10x faster than MapReduce on disk
      • 100x faster than MapReduce in memory
      • Has a stack of libraries including Spark Streaming & MLlib (machine learning)
      • Runs everywhere: on Hadoop or standalone
  • 22. Next Step: From Analysis to Prediction
      • Team stats: Who is most likely to score next? Which team is going to win?
      • Individual player stats: Which players need a rest/bench? Which players are being traded (bring in historical data)?
  • 23. Free Trial: Talend Big Data Sandbox. Hit the Easy Button for Hadoop, Spark and Machine Learning
      • A ready-to-run Docker environment
      • A step-by-step expert guide: the cookbook
      • Real-world scenarios using Spark, Kafka, MapReduce & NoSQL: IoT Analytics, Real-time Recommendation, Clickstream Analysis, Weblogs Analysis, EDW Offload
      • www.talend.com/BigDataSandbox
  • 24. The NEW Talend Community • An active community • 80,000 visitors/week • 3M total downloads • Engaged members • Individual members & partners • Active User Groups • 1,000+ components built by the community
  • 25. Talend Data Masters Awards • Share your Talend story & win $1,500 for your favorite charity • Deadline: July 28th • https://info.talend.com/datamasters2017all.html
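
The goal on slide 16 (aggregate the speed and distance run of each player) can be illustrated with a small plain-Python sketch. It is not the Talend/Spark Streaming job from the demonstration; it only shows the per-player arithmetic such a job would need, using the 25 frames per second stated on slide 17. The frame layout (a dict of player id to x,y,z), the player ids, and the assumption that coordinates are in metres are all hypothetical.

```python
import math
from collections import defaultdict

FRAME_RATE_HZ = 25                    # from slide 17: 25 frames per second
FRAME_INTERVAL_S = 1.0 / FRAME_RATE_HZ


def aggregate_distance(frames):
    """Sum the distance covered by each player across consecutive frames.

    `frames` is a time-ordered iterable of dicts mapping a player id to an
    (x, y, z) tuple. This layout is an assumption made for illustration.
    """
    last_position = {}
    totals = defaultdict(float)
    for frame in frames:
        for player_id, position in frame.items():
            previous = last_position.get(player_id)
            if previous is not None:
                # Straight-line distance between two consecutive samples.
                totals[player_id] += math.dist(previous, position)
            last_position[player_id] = position
    return dict(totals)


def average_speed(total_distance, frame_count):
    """Average speed over the elapsed time spanned by `frame_count` frames."""
    elapsed_s = (frame_count - 1) * FRAME_INTERVAL_S
    return total_distance / elapsed_s if elapsed_s > 0 else 0.0


# Three hypothetical frames (0.08 s of play) for two made-up players.
frames = [
    {"p1": (0.0, 0.0, 0.0), "p2": (10.0, 0.0, 0.0)},
    {"p1": (0.3, 0.4, 0.0), "p2": (10.0, 0.0, 0.0)},
    {"p1": (0.6, 0.8, 0.0), "p2": (10.0, 0.2, 0.0)},
]

totals = aggregate_distance(frames)
speeds = {pid: average_speed(dist, len(frames)) for pid, dist in totals.items()}
for pid in sorted(totals):
    print(f"{pid}: {totals[pid]:.2f} distance units, {speeds[pid]:.2f} units/s")
```

In a streaming setting the same logic would be keyed by player and kept as running state per key, so that each new frame only updates the totals instead of recomputing them.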

Editor's Notes

  1. More often than not, most data people analyze today is volatile: it comes and goes, is analyzed and gone. The idea was that you needed to download Twitter to do anything of value with social analytics, but that’s not true… there’s an API for that. Data analytics is important to every organization, no matter the size, so “big” is different for everyone and that doesn’t… Velocity and variety of the data. Who here is a sports fan? Big fantasy league players here? Big data is an interesting marketing…
  2. The 4.5 trillion frames per second is the FASTEST slow-motion camera to date. It is used to capture the moments leading up to, during and after a chemical reaction… not something we’d need for a goal-line review, but it certainly exemplifies the big data challenge we are presenting. If you were to manually watch this, it would take you hundreds of thousands of years to process… hope you didn’t have plans.
  3. NFL Zebra: RFIDs in jerseys (force impact, speed, concussion rates). NBA: you’d think they could keep the traveling down to a minimum. Goal-line technology.
  4. There is a lot of value in the data that is created behind this… influence it by even a small fraction and we’re talking about millions.
  5. Now we’re going to break this challenge up into two sections, the first will cover all aspects of the image collection and video processing, the second covers the analytics
  6. The first question that needs to be asked when architecting a solution for processing video and image data is: what do I need to solve the problem? A lot of architectural decisions will be made depending on this question. Is the challenge to identify that what I am seeing is a car? Do I need to know what color it is? Or what the model is? Or, in the case of video, can I tell the difference between one car and another? Perhaps I am just getting a general flow of traffic on a highway, or am I trying to identify the market share of one of my competitors by identifying the ratio of my car brands vs theirs within a given area?
  7. Almost all video and image processing pipelines look like this. We’re capturing the raw video format and then compressing / encoding it. Next we process the video to extract relevant metadata and then pass that information further downstream to our analytical process. There are a lot of questions as to where and when to do certain steps, and we’ll walk through them in the following slides.
  8. This makes a very strong argument for processing and handling it as locally as possible to work with that high bandwidth. 18.88 Mbps is available in most urban areas, with even higher speeds for a premium. The FCC recently found that 39% of rural populations lack target levels of speed: 25 Mbps for downloads and 3 Mbps for uploads. This impacts things like smart farming or smart agriculture. Some HD video cameras output uncompressed video, whereas others compress the video using a lossy compression method such as MPEG or H.264. H.265 is also picking up. HEVC was developed with the goal of providing twice the compression efficiency of the previous standard, H.264 / AVC. At an identical level of visual quality, HEVC enables video to be compressed to a file that is about half the size (or half the bit rate) of AVC. When compressed to the same file size or bit rate as AVC, HEVC delivers significantly better visual quality.
  9. NFL stadiums tend to have hundreds to a thousand servers within the stadium devoted to encoding and metadata processing. The usual suspects: Amazon, Google, Microsoft, IBM… just to name a few. While a lot of the camera hardware vendors will provide this processing capability, I did a check and there are some 30+ available APIs out there to handle the video processing. This is likely the most complex and use-case-specific process, and I have yet to find a one-size-fits-all API.
  10. This makes a very strong argument for processing and handling it as locally as possible to work with that high bandwidth. But as discussed, as work continues in codec compression and infrastructure improves upload bandwidth, we might get to the point where this discussion becomes moot. In short, the better we get at lossless compression, the more flexible we can be in this step… Where’s Pied Piper when you need them :) So with that in mind I’d like to show you how you could build a process like this. We’re going to take the Google Vision API for a little spin. I am going to gather you up and we’re going to take a picture that I’ll post on Twitter and pull down using Talend to analyze with the Google Vision API. It will spit out some interesting results and hopefully recognize you all as people and see your faces.
  11. So we just covered how to architect something to handle video processing and discussed some of the trade-offs for locality of service, finishing off with a demo highlighting some of the work cloud-based companies like Google are doing to democratize the video and image metadata gathering process.
  12. So now let’s focus on the analytical side. Where we left off from the video processing architecture was that the video data had been converted into a metadata representation. We’re going to want to work with that in a more general analytical setting.
  13. So going back to our conversation earlier about sports analytics and the gobs of money it brings in, we see coaches, analysts, even the average sports viewer looking for insight into their favorite players and looking for ways to optimize their strategy to improve success. In the case we have here, which is focused on data collected from the EPL, players are often running all over the place, and identifying when they are getting tired can be important intel for both teams. When you have players playing well into their 40s, you want to make sure one of them isn’t going to break a hip or something… The NFL is doing similar fact-finding with regard to force impact analysis. With so much attention on concussion rates and effects, you bet everyone is making sure they keep their 120 million franchise player safe and healthy.
  14. Here’s just an example of what is in the JSON information we receive. While it’s not the 4.5 trillion frames per second…
  15. Consistent growth: 1,500 members in the new Community.Talend.com. INTERNAL ONLY. 3M total downloads of Talend software to date since the company was founded (includes TOS + evals). In 2016, we had 360,000 total downloads, up 14% since 2015 (total downloads include TOS + evals). Engaged members: Members: Our community members are “strategic partners” in solving data challenges, not just Talend challenges. Talend Advocates: Small-to-medium SIs and VARs are some of the greatest Talend champions in the community. They share their technical expertise and, by sharing their knowledge, they get visibility and find new customers. Thought Leaders: We’re about to launch a new Discussion Board about IoT/Smart Cities. By comparison, competitors use their forum for product support only. The health of a community is measured by the engagement, not just growth. User Groups: Not only do we have community members that actively respond to questions on the forum, we also have customers who are creating and managing User Groups around the world (US, UK, Germany, France, Belgium, Switzerland, and India). Our User Groups in Portland, Maine, and Vancouver, Canada were launched by customers, and so were many others. The Community Team is launching one NEW user group per quarter. In 2017, we plan to have new user groups in Chicago, Dallas, Toronto, and Atlanta. Vancouver was launched in Q1. Every day, we have about 400 concurrent online users. Monetization: Both Talend and the Talend partners know how to monetize the community. Talend has been converting open source customers (e.g. Judicial Court of California, Mogo Finance Technology) from Open Studio to the commercial version, Talend Data Integration. And partners who are active on the community are finding new business (some of the most active members are SI partners).
  16. Criteria: creativity and uniqueness of use; scope and complexity of project; business transformation and improvement. Timeline: We are accepting entries until July 28, 2017. Hurry and send your entries now! Winners will be notified in September. Winners will be announced in November. Eligibility requirements: Award winners should be willing to have their story shared publicly on the Talend web site (company logo, video and case study) and promoted on social media and in press announcements.
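
The argument in note 8 for processing video as locally as possible can be made concrete with some back-of-the-envelope arithmetic. The sketch below only combines figures quoted in the slides and notes: 150 Mbps per 4K camera, 20-30 cameras per stadium, and the 18.88 Mbps urban figure. Whether the 150 Mbps refers to raw or encoded video, and whether 18.88 Mbps is an upload or download rate, is not stated in the source, so treat the comparison as a rough assumption rather than a measurement.

```python
# Figures quoted in the slides and notes; how they combine is an assumption.
PER_CAMERA_MBPS = 150                # "150 Mbps at 4K per camera" (slide 9)
CAMERAS_MIN, CAMERAS_MAX = 20, 30    # "20-30 cameras" per stadium (slide 9)
URBAN_LINK_MBPS = 18.88              # urban figure quoted in note 8


def aggregate_mbps(camera_count, per_camera_mbps=PER_CAMERA_MBPS):
    """Total bit rate if every camera streams at the same rate."""
    return camera_count * per_camera_mbps


def links_needed(total_mbps, link_mbps=URBAN_LINK_MBPS):
    """Number of links of the given capacity the total would saturate."""
    return total_mbps / link_mbps


low = aggregate_mbps(CAMERAS_MIN)
high = aggregate_mbps(CAMERAS_MAX)
print(f"Aggregate feed: {low / 1000:.1f} to {high / 1000:.1f} Gbps")
print(f"Equivalent urban links: {links_needed(low):.0f} to {links_needed(high):.0f}")
```

Under these assumptions a single stadium produces roughly 3 to 4.5 Gbps of video, well over a hundred times the quoted urban link, which is why the encoding and metadata extraction are done on site and only the much smaller description data is sent onward.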