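Of All Martial Arts, Only Speed Is Unbeatable:
 Building a Real-Time Classifier and a Real-Time Recommender System from Streaming Data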
Yahoo! Taiwan EC Data Team
Who I am 
▪ Norman Huang (normany@yahoo-inc.com) 
▪ Software & Data Engineer of Yahoo! Taiwan 
▪ Aims to retrieve and deliver data insights via a BI
platform and data mining algorithms.
Who I am 
▪ Jason Lin (jasonysl@yahoo-inc.com) 
▪ Software & Data Engineer of Yahoo! Taiwan 
▪ Responsible for recommendation system
personalization mechanisms; also a cloud
computing developer.
Agenda 
▪ Challenges 
▪ Solution: Pinball 
▪ Q&A 
Challenges
▪ Static content until next batch job.
▪ Batched product recommendation algorithms have become common
features among e-commerce platforms.
[Figure: Processing]
Challenges
▪ Nearly 72% of visitors made their decision on the same day.
▪ A real-time solution is needed to interact with potential buyers.
[Figure: timeline showing data already absorbed into batch views and the most recent several hours of data not yet absorbed]
Solution: Pinball 
Pinball
[Figure: classifiers assign a user to Profile A, Profile B, or Profile C]
Pinball
▪ Real-time classifier
▪ Detects buyers' preferences through streaming data processing
▪ Delivers personalized ads and product recommendations on the fly
▪ Challenges
› How to do it in real time?
› Storm!
› How to determine customers' purchasing desire?
› Buying Intention Detection
Solution: Pinball 
▪ Storm Overview 
▪ Buying Intention (BI) 
▪ Architecture and Design 
Pinball
[Figure: Storm learns from buyers; Pinball asks "Is Potential Buyer?" for each visitor and sends promotions to potential buyers]
Storm Concepts 
▪ Tuple & Streams 
▪ Spouts & Bolts 
▪ Topologies 
Yahoo Confidential & Proprietary 
Tuple & Streams
▪ Tuple: a list of fields (Field 1, Field 2, Field 3, Field 4, Field 5)
▪ Stream: a sequence of tuples (Tuple 1, Tuple 2, Tuple 3, ..., Tuple n)
Spouts & Bolts
[Figure: a spout emits a stream of tuples (T) to a bolt, which emits a stream of tuples in turn]
Topology
▪ Hadoop MapReduce job vs. Storm topology
[Figure: streams flowing from a spout through a chain of bolts (Spout → Bolt → Bolt)]
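The spout → bolt → bolt chain can be sketched with plain Python generators. This is only an illustration of the data flow, not the Storm API; the names `log_spout`, `parse_bolt`, `count_bolt`, and `run_topology`, and the sample log lines, are hypothetical.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def log_spout() -> Iterator[str]:
    """Spout: the source of the stream (here a fixed list of raw log lines)."""
    for line in ["userA,view", "userB,view", "userA,click", "userA,view"]:
        yield line


def parse_bolt(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Bolt 1: turn each raw line into a (userid, type) tuple."""
    for line in lines:
        userid, event_type = line.split(",")
        yield (userid, event_type)


def count_bolt(tuples: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, int]]:
    """Bolt 2: keep a running count of 'view' events per user and emit it."""
    views = Counter()
    for userid, event_type in tuples:
        if event_type == "view":
            views[userid] += 1
            yield (userid, views[userid])


def run_topology() -> List[Tuple[str, int]]:
    """Wire the spout and the two bolts together and drain the stream."""
    return list(count_bolt(parse_bolt(log_spout())))
```

Unlike a MapReduce job, which finishes once its input is consumed, a real topology keeps running on an unbounded stream; the fixed input list here only keeps the sketch finite.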
Storm Concepts
         Computational Primitives   Use Case            High-level Language
Hadoop   Map & Reduce               Batch Processing    Pig
Storm    Spout & Bolt               Stream Processing   Trident
Storm
▪ Nimbus: master node, similar to the Hadoop JobTracker
▪ Zookeeper: coordinates the Storm cluster
▪ Supervisor: runs worker processes
[Figure: cluster diagram with one Nimbus node, three Zookeeper nodes, and five Supervisor nodes]
Buying Intention
▪ Based on our findings:
› The more page views, the higher the chance that a visitor will buy.
› BUT the buying intention value varies from category to category.
How to leverage Storm with Buying Intention (BI)?
Data Flow Diagram 
Adaptive Learning 
Learning & Classifier
▪ Online Binary Classification
› Simple and computationally efficient
› Update rule: BI' = BI + (PV − BI) × γ
▪ e.g.
› assumptions: γ = 0.1, BI = 3
› scenario: a user makes 6 page views before purchasing
• BI' = 3 + (6 − 3) × 0.1
• BI' = 3.3
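The update rule can be written as a one-line function. This is a minimal sketch: the function name `update_bi` and the second purchase in the usage lines are hypothetical, while the first call reproduces the slide's example (γ = 0.1, BI = 3, PV = 6).

```python
def update_bi(bi: float, page_views: float, gamma: float = 0.1) -> float:
    """One online learning step: BI' = BI + (PV - BI) * gamma.

    bi          current buying intention value for a category
    page_views  page views the buyer made before purchasing (PV)
    gamma       learning rate; a larger value reacts faster to new purchases
    """
    return bi + (page_views - bi) * gamma


# Slide example: a user makes 6 page views before purchasing.
bi_after_first = update_bi(3.0, 6, gamma=0.1)   # approximately 3.3

# Hypothetical follow-up: the next buyer purchases after only 2 page views,
# which pulls the value back down slightly (approximately 3.17).
bi_after_second = update_bi(bi_after_first, 2, gamma=0.1)
```

Each purchase moves BI a fraction γ of the way towards the observed page-view count, so the value is an exponentially weighted average that can be maintained per category without storing past events.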
Buying Intention Qualification 
Topology Design
Lambda Architecture
▪ Term coined by Nathan Marz (creator of Apache Storm)
▪ Batch + Real-time processing
› Hybrid batch and real-time processing
› Batch processing is treated as the source of truth, and real-time processing
updates models/insights between batches.
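The split between a batch view and a real-time view can be sketched as follows. This is a simplified, single-process illustration under stated assumptions, not how Pinball is implemented: the class `LambdaPageViews` and its methods are hypothetical, and a real batch job would run on a separate system rather than as an in-memory call.

```python
from collections import Counter


class LambdaPageViews:
    """Page-view counts served from a batch view plus a real-time view."""

    def __init__(self) -> None:
        self.master_log = []             # every event ever received
        self.batch_view = Counter()      # source of truth, rebuilt by each batch job
        self.realtime_view = Counter()   # events not yet absorbed into the batch view

    def on_event(self, userid: str) -> None:
        """Real-time path: record the event and update the real-time view."""
        self.master_log.append(userid)
        self.realtime_view[userid] += 1

    def run_batch_job(self) -> None:
        """Batch path: recompute the batch view from all data, then reset the delta."""
        self.batch_view = Counter(self.master_log)
        self.realtime_view.clear()

    def query(self, userid: str) -> int:
        """Answer a query by merging both views."""
        return self.batch_view[userid] + self.realtime_view[userid]
```

Between batch jobs the real-time view covers the hours of data not yet absorbed, so queries stay current while the batch result remains authoritative.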
Lambda Architecture
[Figure: Lambda Architecture diagram; later builds of the slide label Storm (streaming) and Summingbird]
[REF] http://lambda-architecture.net/
Pinball Demonstration
How to keep it generic and flexible? 
▪ to add more signals 
▪ to add more online learning algorithms 
▪ to add more channels
[Figure: examples of signals, algorithms, and channels: Click, Login, Buy, View, Bounce, Time Spent, Buying Intention, Email, Y! Webpages, Mobile Apps, Messenger, Fraud Detection, Webpage Sequence]
Summary
▪ Scalable to process real-time data
▪ Supports online learning algorithms
▪ Flexible interactions with visitors
▪ Increases user engagement
▪ Increases the conversion rate
▪ Creates synergy by combining the batched recommender with Pinball
Simple Hands-on 
-> Find out the heavy users!
Find out the heavy users!
▪ Keep a count of page views for each user
▪ If the count is greater than 3, the user is a heavy user
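The two rules above can be sketched in a few lines of plain Python, outside of Storm. The function name `heavy_users` and the sample stream are hypothetical; each element of the stream is assumed to be the user ID of one page view.

```python
from collections import defaultdict
from typing import Iterable, List


def heavy_users(page_view_userids: Iterable[str], threshold: int = 3) -> List[str]:
    """Return users in the order they first exceed `threshold` page views."""
    counts = defaultdict(int)   # per-user page-view count
    heavy = []
    for userid in page_view_userids:
        counts[userid] += 1
        # Report a user once, at the moment the count first goes above the threshold.
        if counts[userid] == threshold + 1:
            heavy.append(userid)
    return heavy


sample_stream = ["userA", "userB", "userB", "userB", "userC", "userB", "userA"]
result = heavy_users(sample_stream)   # only userB has more than 3 page views
```

Because the count is updated per event, a user is flagged as soon as the fourth page view arrives rather than after a later batch job.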
Find out the heavy users!
[Figure: User Log Spout → Learning Bolt]
Tuple fields: userid, type, catlv1, catlv2, timestamp
Find out the heavy users!
[Figure: User Log Spout → two Learning Bolts, connected with shuffle grouping (shuffleGroup)]
Tuple fields: userid, type, catlv1, catlv2, timestamp
Tuples arriving at the Learning Bolts: userA, userB, userD, userB, userE, userC, userB, userC (tuples of one user may reach either bolt)
Find out the heavy users!
[Figure: User Log Spout → two Learning Bolts, connected with fields grouping (fieldGroup)]
Tuple fields: userid, type, catlv1, catlv2, timestamp
Tuples arriving at the Learning Bolts: userA, userD, userF, userF, userE, userC, userB, userB, userB, userC (all tuples of one user reach the same bolt)
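The difference between the two groupings matters for per-user counting, and can be simulated in plain Python. This is an illustration of the routing idea only, not Storm's implementation: the functions `shuffle_grouping`, `fields_grouping`, and `tasks_per_user` are hypothetical, and the CRC32 hash is an arbitrary stand-in for whatever hash a real cluster uses.

```python
import random
import zlib

SAMPLE_STREAM = [("userA", "x"), ("userB", "x"), ("userD", "x"), ("userB", "x"),
                 ("userE", "x"), ("userC", "x"), ("userB", "x"), ("userC", "x")]


def shuffle_grouping(tuples, num_tasks, seed=0):
    """Send each tuple to a randomly chosen task, ignoring its contents."""
    rng = random.Random(seed)
    tasks = [[] for _ in range(num_tasks)]
    for t in tuples:
        tasks[rng.randrange(num_tasks)].append(t)
    return tasks


def fields_grouping(tuples, num_tasks, key_index=0):
    """Send each tuple to the task chosen by hashing one field (here userid)."""
    tasks = [[] for _ in range(num_tasks)]
    for t in tuples:
        key = str(t[key_index]).encode("utf-8")
        tasks[zlib.crc32(key) % num_tasks].append(t)
    return tasks


def tasks_per_user(tasks):
    """Map each userid to the set of task indexes that received its tuples."""
    seen = {}
    for task_index, assigned in enumerate(tasks):
        for userid, _payload in assigned:
            seen.setdefault(userid, set()).add(task_index)
    return seen
```

With fields grouping on `userid`, every user maps to exactly one task, so each Learning Bolt can keep a correct local page-view count; with shuffle grouping a user's tuples can be split across tasks and no single task sees the full count.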
Find out the heavy users!
[Figure: User Log Spout → two Learning Bolts (fields grouping, fieldGroup) → Qualification Bolt]
Tuples arriving at the Learning Bolts: userA, userD, userF, userF, userE, userC, userB, userB, userB, userC
Tuples arriving at the Qualification Bolt: (userA, totalPV), (userB, totalPV), (userC, totalPV), (userF, totalPV)
Questions?
Norman: @normanyhuang, www.linkedin.com/in/normany
Jason: @kalijason, www.linkedin.com/pub/jason-lin/67/93/743

More Related Content

What's hot

Hadoop and Neo4j: A Winning Combination for Bioinformatics
Hadoop and Neo4j: A Winning Combination for BioinformaticsHadoop and Neo4j: A Winning Combination for Bioinformatics
Hadoop and Neo4j: A Winning Combination for Bioinformaticsosintegrators
 
Apache Druid Vision and Roadmap
Apache Druid Vision and RoadmapApache Druid Vision and Roadmap
Apache Druid Vision and RoadmapImply
 
Graph Databases in Python (PyCon Canada 2012)
Graph Databases in Python (PyCon Canada 2012)Graph Databases in Python (PyCon Canada 2012)
Graph Databases in Python (PyCon Canada 2012)Javier de la Rosa
 
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...MongoDB
 
Lord of the Bing - Black Hat USA 2010
Lord of the Bing - Black Hat USA 2010Lord of the Bing - Black Hat USA 2010
Lord of the Bing - Black Hat USA 2010Rob Ragan
 
Getting started with Graph Databases & Neo4j
Getting started with Graph Databases & Neo4jGetting started with Graph Databases & Neo4j
Getting started with Graph Databases & Neo4jSuroor Wijdan
 
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...NoSQLmatters
 
Neo4J : Introduction to Graph Database
Neo4J : Introduction to Graph DatabaseNeo4J : Introduction to Graph Database
Neo4J : Introduction to Graph DatabaseMindfire Solutions
 
Caching Search Engine Results over Incremental Indices
Caching Search Engine Results over Incremental IndicesCaching Search Engine Results over Incremental Indices
Caching Search Engine Results over Incremental IndicesRoi Blanco
 
Tenacious Diggity - Skinny Dippin in a Sea of Bing
Tenacious Diggity - Skinny Dippin in a Sea of BingTenacious Diggity - Skinny Dippin in a Sea of Bing
Tenacious Diggity - Skinny Dippin in a Sea of BingRob Ragan
 
Intro to Graphs and Neo4j
Intro to Graphs and Neo4jIntro to Graphs and Neo4j
Intro to Graphs and Neo4jjexp
 
How Graph Databases efficiently store, manage and query connected data at s...
How Graph Databases efficiently  store, manage and query  connected data at s...How Graph Databases efficiently  store, manage and query  connected data at s...
How Graph Databases efficiently store, manage and query connected data at s...jexp
 
Neo4j Fundamentals
Neo4j FundamentalsNeo4j Fundamentals
Neo4j FundamentalsMax De Marzi
 
Building Data Applications with Apache Druid
Building Data Applications with Apache DruidBuilding Data Applications with Apache Druid
Building Data Applications with Apache DruidImply
 
Using MongoDB + Hadoop Together
Using MongoDB + Hadoop TogetherUsing MongoDB + Hadoop Together
Using MongoDB + Hadoop TogetherMongoDB
 
Black Hat 2011 - Pulp Google Hacking: The Next Generation Search Engine Hacki...
Black Hat 2011 - Pulp Google Hacking: The Next Generation Search Engine Hacki...Black Hat 2011 - Pulp Google Hacking: The Next Generation Search Engine Hacki...
Black Hat 2011 - Pulp Google Hacking: The Next Generation Search Engine Hacki...Rob Ragan
 
MongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business InsightsMongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business InsightsMongoDB
 
Introduction to Graph databases and Neo4j (by Stefan Armbruster)
Introduction to Graph databases and Neo4j (by Stefan Armbruster)Introduction to Graph databases and Neo4j (by Stefan Armbruster)
Introduction to Graph databases and Neo4j (by Stefan Armbruster)barcelonajug
 

What's hot (20)

Hadoop and Neo4j: A Winning Combination for Bioinformatics
Hadoop and Neo4j: A Winning Combination for BioinformaticsHadoop and Neo4j: A Winning Combination for Bioinformatics
Hadoop and Neo4j: A Winning Combination for Bioinformatics
 
Apache Druid Vision and Roadmap
Apache Druid Vision and RoadmapApache Druid Vision and Roadmap
Apache Druid Vision and Roadmap
 
Graph Databases in Python (PyCon Canada 2012)
Graph Databases in Python (PyCon Canada 2012)Graph Databases in Python (PyCon Canada 2012)
Graph Databases in Python (PyCon Canada 2012)
 
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...
 
Big Data made easy with a Spark
Big Data made easy with a SparkBig Data made easy with a Spark
Big Data made easy with a Spark
 
Lord of the Bing - Black Hat USA 2010
Lord of the Bing - Black Hat USA 2010Lord of the Bing - Black Hat USA 2010
Lord of the Bing - Black Hat USA 2010
 
Getting started with Graph Databases & Neo4j
Getting started with Graph Databases & Neo4jGetting started with Graph Databases & Neo4j
Getting started with Graph Databases & Neo4j
 
Neo4j in Depth
Neo4j in DepthNeo4j in Depth
Neo4j in Depth
 
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
 
Neo4J : Introduction to Graph Database
Neo4J : Introduction to Graph DatabaseNeo4J : Introduction to Graph Database
Neo4J : Introduction to Graph Database
 
Caching Search Engine Results over Incremental Indices
Caching Search Engine Results over Incremental IndicesCaching Search Engine Results over Incremental Indices
Caching Search Engine Results over Incremental Indices
 
Tenacious Diggity - Skinny Dippin in a Sea of Bing
Tenacious Diggity - Skinny Dippin in a Sea of BingTenacious Diggity - Skinny Dippin in a Sea of Bing
Tenacious Diggity - Skinny Dippin in a Sea of Bing
 
Intro to Graphs and Neo4j
Intro to Graphs and Neo4jIntro to Graphs and Neo4j
Intro to Graphs and Neo4j
 
How Graph Databases efficiently store, manage and query connected data at s...
How Graph Databases efficiently  store, manage and query  connected data at s...How Graph Databases efficiently  store, manage and query  connected data at s...
How Graph Databases efficiently store, manage and query connected data at s...
 
Neo4j Fundamentals
Neo4j FundamentalsNeo4j Fundamentals
Neo4j Fundamentals
 
Building Data Applications with Apache Druid
Building Data Applications with Apache DruidBuilding Data Applications with Apache Druid
Building Data Applications with Apache Druid
 
Using MongoDB + Hadoop Together
Using MongoDB + Hadoop TogetherUsing MongoDB + Hadoop Together
Using MongoDB + Hadoop Together
 
Black Hat 2011 - Pulp Google Hacking: The Next Generation Search Engine Hacki...
Black Hat 2011 - Pulp Google Hacking: The Next Generation Search Engine Hacki...Black Hat 2011 - Pulp Google Hacking: The Next Generation Search Engine Hacki...
Black Hat 2011 - Pulp Google Hacking: The Next Generation Search Engine Hacki...
 
MongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business InsightsMongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business Insights
 
Introduction to Graph databases and Neo4j (by Stefan Armbruster)
Introduction to Graph databases and Neo4j (by Stefan Armbruster)Introduction to Graph databases and Neo4j (by Stefan Armbruster)
Introduction to Graph databases and Neo4j (by Stefan Armbruster)
 

Viewers also liked

李慕約&王向榮/如何備料:資料的抓取、清理以及串接
李慕約&王向榮/如何備料:資料的抓取、清理以及串接李慕約&王向榮/如何備料:資料的抓取、清理以及串接
李慕約&王向榮/如何備料:資料的抓取、清理以及串接台灣資料科學年會
 
一個賭徒的告白:從預測市場看金融交易
一個賭徒的告白:從預測市場看金融交易一個賭徒的告白:從預測市場看金融交易
一個賭徒的告白:從預測市場看金融交易台灣資料科學年會
 
林佳賢/資料視覺化的 20 個小訣竅
林佳賢/資料視覺化的 20 個小訣竅林佳賢/資料視覺化的 20 個小訣竅
林佳賢/資料視覺化的 20 個小訣竅台灣資料科學年會
 
Collaboration with Statistician? 矩陣視覺化於探索式資料分析
Collaboration with Statistician? 矩陣視覺化於探索式資料分析Collaboration with Statistician? 矩陣視覺化於探索式資料分析
Collaboration with Statistician? 矩陣視覺化於探索式資料分析台灣資料科學年會
 
[系列活動] 給工程師的統計學及資料分析 123
[系列活動] 給工程師的統計學及資料分析 123[系列活動] 給工程師的統計學及資料分析 123
[系列活動] 給工程師的統計學及資料分析 123台灣資料科學年會
 
[系列活動] 智慧製造與生產線上的資料科學 (製造資料科學:從預測性思維到處方性決策)
[系列活動] 智慧製造與生產線上的資料科學 (製造資料科學:從預測性思維到處方性決策)[系列活動] 智慧製造與生產線上的資料科學 (製造資料科學:從預測性思維到處方性決策)
[系列活動] 智慧製造與生產線上的資料科學 (製造資料科學:從預測性思維到處方性決策)台灣資料科學年會
 

Viewers also liked (7)

李慕約&王向榮/如何備料:資料的抓取、清理以及串接
李慕約&王向榮/如何備料:資料的抓取、清理以及串接李慕約&王向榮/如何備料:資料的抓取、清理以及串接
李慕約&王向榮/如何備料:資料的抓取、清理以及串接
 
Z > B 的資料科學
Z > B 的資料科學Z > B 的資料科學
Z > B 的資料科學
 
一個賭徒的告白:從預測市場看金融交易
一個賭徒的告白:從預測市場看金融交易一個賭徒的告白:從預測市場看金融交易
一個賭徒的告白:從預測市場看金融交易
 
林佳賢/資料視覺化的 20 個小訣竅
林佳賢/資料視覺化的 20 個小訣竅林佳賢/資料視覺化的 20 個小訣竅
林佳賢/資料視覺化的 20 個小訣竅
 
Collaboration with Statistician? 矩陣視覺化於探索式資料分析
Collaboration with Statistician? 矩陣視覺化於探索式資料分析Collaboration with Statistician? 矩陣視覺化於探索式資料分析
Collaboration with Statistician? 矩陣視覺化於探索式資料分析
 
[系列活動] 給工程師的統計學及資料分析 123
[系列活動] 給工程師的統計學及資料分析 123[系列活動] 給工程師的統計學及資料分析 123
[系列活動] 給工程師的統計學及資料分析 123
 
[系列活動] 智慧製造與生產線上的資料科學 (製造資料科學:從預測性思維到處方性決策)
[系列活動] 智慧製造與生產線上的資料科學 (製造資料科學:從預測性思維到處方性決策)[系列活動] 智慧製造與生產線上的資料科學 (製造資料科學:從預測性思維到處方性決策)
[系列活動] 智慧製造與生產線上的資料科學 (製造資料科學:從預測性思維到處方性決策)
 

Similar to 天下武功唯快不破:利用串流資料實做出即時分類器和即時推薦系統

Avatara: OLAP for Web-scale Analytics Products
Avatara: OLAP for Web-scale Analytics Products Avatara: OLAP for Web-scale Analytics Products
Avatara: OLAP for Web-scale Analytics Products Lili Wu
 
Trending with Purpose
Trending with PurposeTrending with Purpose
Trending with PurposeJason Dixon
 
Schema.org Structured data the What, Why, & How
Schema.org Structured data the What, Why, & HowSchema.org Structured data the What, Why, & How
Schema.org Structured data the What, Why, & HowRichard Wallis
 
From 6 hours to 1 minute... in 2 days! How we managed to stream our (long) Ha...
From 6 hours to 1 minute... in 2 days! How we managed to stream our (long) Ha...From 6 hours to 1 minute... in 2 days! How we managed to stream our (long) Ha...
From 6 hours to 1 minute... in 2 days! How we managed to stream our (long) Ha...Dataconomy Media
 
Big Data Berlin - Criteo
Big Data Berlin - CriteoBig Data Berlin - Criteo
Big Data Berlin - CriteoSofian Djamaa
 
Wireframes: Choose the Right Tool for the Job
Wireframes: Choose the Right Tool for the JobWireframes: Choose the Right Tool for the Job
Wireframes: Choose the Right Tool for the JobCatharine Robertson
 
Graphs in Action: In-depth look at Neo4j in Production
Graphs in Action: In-depth look at Neo4j in ProductionGraphs in Action: In-depth look at Neo4j in Production
Graphs in Action: In-depth look at Neo4j in ProductionNeo4j
 
Architecting a next generation data platform
Architecting a next generation data platformArchitecting a next generation data platform
Architecting a next generation data platformhadooparchbook
 
Learn Like a Human: Taking Machine Learning from Batch to Real-Time
Learn Like a Human: Taking Machine Learning from Batch to Real-TimeLearn Like a Human: Taking Machine Learning from Batch to Real-Time
Learn Like a Human: Taking Machine Learning from Batch to Real-TimeDynamic Yield
 
Semantic search: from document retrieval to virtual assistants
Semantic search: from document retrieval to virtual assistantsSemantic search: from document retrieval to virtual assistants
Semantic search: from document retrieval to virtual assistantsPeter Mika
 
Moving Targets: Harnessing Real-time Value from Data in Motion
Moving Targets: Harnessing Real-time Value from Data in Motion Moving Targets: Harnessing Real-time Value from Data in Motion
Moving Targets: Harnessing Real-time Value from Data in Motion Inside Analysis
 
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...Impetus Technologies
 
Natural born conversion killers - Conversion Jam
Natural born conversion killers - Conversion JamNatural born conversion killers - Conversion Jam
Natural born conversion killers - Conversion JamCraig Sullivan
 
Complex things explained easily
Complex things explained easilyComplex things explained easily
Complex things explained easilyLuca Tumedei
 
H2O World - Solving Customer Churn with Machine Learning - Julian Bharadwaj
H2O World - Solving Customer Churn with Machine Learning - Julian BharadwajH2O World - Solving Customer Churn with Machine Learning - Julian Bharadwaj
H2O World - Solving Customer Churn with Machine Learning - Julian BharadwajSri Ambati
 
IronEdge PowerBI World Tour Presentation
IronEdge PowerBI World Tour PresentationIronEdge PowerBI World Tour Presentation
IronEdge PowerBI World Tour PresentationIronEdge Group
 
Patterns of the Lambda Architecture -- 2015 April - Hadoop Summit, Europe
Patterns of the Lambda Architecture -- 2015 April - Hadoop Summit, EuropePatterns of the Lambda Architecture -- 2015 April - Hadoop Summit, Europe
Patterns of the Lambda Architecture -- 2015 April - Hadoop Summit, EuropeFlip Kromer
 
2012-09-24 Workshop: Wireframe mockups
2012-09-24 Workshop: Wireframe mockups2012-09-24 Workshop: Wireframe mockups
2012-09-24 Workshop: Wireframe mockupsBaltimore Lean Startup
 
iPhone game development - Joash Chee
iPhone game development - Joash CheeiPhone game development - Joash Chee
iPhone game development - Joash Cheejasonong
 

Similar to 天下武功唯快不破:利用串流資料實做出即時分類器和即時推薦系統 (20)

Avatara: OLAP for Web-scale Analytics Products
Avatara: OLAP for Web-scale Analytics Products Avatara: OLAP for Web-scale Analytics Products
Avatara: OLAP for Web-scale Analytics Products
 
Trending with Purpose
Trending with PurposeTrending with Purpose
Trending with Purpose
 
Schema.org Structured data the What, Why, & How
Schema.org Structured data the What, Why, & HowSchema.org Structured data the What, Why, & How
Schema.org Structured data the What, Why, & How
 
From 6 hours to 1 minute... in 2 days! How we managed to stream our (long) Ha...
From 6 hours to 1 minute... in 2 days! How we managed to stream our (long) Ha...From 6 hours to 1 minute... in 2 days! How we managed to stream our (long) Ha...
From 6 hours to 1 minute... in 2 days! How we managed to stream our (long) Ha...
 
Big Data Berlin - Criteo
Big Data Berlin - CriteoBig Data Berlin - Criteo
Big Data Berlin - Criteo
 
NoSQL e Python RuPy 2012
NoSQL e Python RuPy 2012NoSQL e Python RuPy 2012
NoSQL e Python RuPy 2012
 
Wireframes: Choose the Right Tool for the Job
Wireframes: Choose the Right Tool for the JobWireframes: Choose the Right Tool for the Job
Wireframes: Choose the Right Tool for the Job
 
Graphs in Action: In-depth look at Neo4j in Production
Graphs in Action: In-depth look at Neo4j in ProductionGraphs in Action: In-depth look at Neo4j in Production
Graphs in Action: In-depth look at Neo4j in Production
 
Architecting a next generation data platform
Architecting a next generation data platformArchitecting a next generation data platform
Architecting a next generation data platform
 
Learn Like a Human: Taking Machine Learning from Batch to Real-Time
Learn Like a Human: Taking Machine Learning from Batch to Real-TimeLearn Like a Human: Taking Machine Learning from Batch to Real-Time
Learn Like a Human: Taking Machine Learning from Batch to Real-Time
 
Semantic search: from document retrieval to virtual assistants
Semantic search: from document retrieval to virtual assistantsSemantic search: from document retrieval to virtual assistants
Semantic search: from document retrieval to virtual assistants
 
Moving Targets: Harnessing Real-time Value from Data in Motion
Moving Targets: Harnessing Real-time Value from Data in Motion Moving Targets: Harnessing Real-time Value from Data in Motion
Moving Targets: Harnessing Real-time Value from Data in Motion
 
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
 
Natural born conversion killers - Conversion Jam
Natural born conversion killers - Conversion JamNatural born conversion killers - Conversion Jam
Natural born conversion killers - Conversion Jam
 
Complex things explained easily
Complex things explained easilyComplex things explained easily
Complex things explained easily
 
H2O World - Solving Customer Churn with Machine Learning - Julian Bharadwaj
H2O World - Solving Customer Churn with Machine Learning - Julian BharadwajH2O World - Solving Customer Churn with Machine Learning - Julian Bharadwaj
H2O World - Solving Customer Churn with Machine Learning - Julian Bharadwaj
 
IronEdge PowerBI World Tour Presentation
IronEdge PowerBI World Tour PresentationIronEdge PowerBI World Tour Presentation
IronEdge PowerBI World Tour Presentation
 
Patterns of the Lambda Architecture -- 2015 April - Hadoop Summit, Europe
Patterns of the Lambda Architecture -- 2015 April - Hadoop Summit, EuropePatterns of the Lambda Architecture -- 2015 April - Hadoop Summit, Europe
Patterns of the Lambda Architecture -- 2015 April - Hadoop Summit, Europe
 
2012-09-24 Workshop: Wireframe mockups
2012-09-24 Workshop: Wireframe mockups2012-09-24 Workshop: Wireframe mockups
2012-09-24 Workshop: Wireframe mockups
 
iPhone game development - Joash Chee
iPhone game development - Joash CheeiPhone game development - Joash Chee
iPhone game development - Joash Chee
 

More from 台灣資料科學年會

[台灣人工智慧學校] 人工智慧技術發展與應用
[台灣人工智慧學校] 人工智慧技術發展與應用[台灣人工智慧學校] 人工智慧技術發展與應用
[台灣人工智慧學校] 人工智慧技術發展與應用台灣資料科學年會
 
[台灣人工智慧學校] 執行長報告
[台灣人工智慧學校] 執行長報告[台灣人工智慧學校] 執行長報告
[台灣人工智慧學校] 執行長報告台灣資料科學年會
 
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰台灣資料科學年會
 
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機台灣資料科學年會
 
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機台灣資料科學年會
 
[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話
[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話
[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話台灣資料科學年會
 
[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇
[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇
[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇台灣資料科學年會
 
[TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察
[TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察 [TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察
[TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察 台灣資料科學年會
 
[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵
[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵
[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵台灣資料科學年會
 
[台灣人工智慧學校] 從經濟學看人工智慧產業應用
[台灣人工智慧學校] 從經濟學看人工智慧產業應用[台灣人工智慧學校] 從經濟學看人工智慧產業應用
[台灣人工智慧學校] 從經濟學看人工智慧產業應用台灣資料科學年會
 
[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告
[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告
[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告台灣資料科學年會
 
[台中分校] 第一期結業典禮 - 執行長談話
[台中分校] 第一期結業典禮 - 執行長談話[台中分校] 第一期結業典禮 - 執行長談話
[台中分校] 第一期結業典禮 - 執行長談話台灣資料科學年會
 
[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人
[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人
[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人台灣資料科學年會
 
[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維
[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維
[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維台灣資料科學年會
 
[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察
[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察
[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察台灣資料科學年會
 
[TOxAIA新竹分校] 深度學習與Kaggle實戰
[TOxAIA新竹分校] 深度學習與Kaggle實戰[TOxAIA新竹分校] 深度學習與Kaggle實戰
[TOxAIA新竹分校] 深度學習與Kaggle實戰台灣資料科學年會
 
[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT
[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT
[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT台灣資料科學年會
 
[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達
[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達
[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達台灣資料科學年會
 
[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳
[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳
[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳台灣資料科學年會
 

More from 台灣資料科學年會 (20)

[台灣人工智慧學校] 人工智慧技術發展與應用
[台灣人工智慧學校] 人工智慧技術發展與應用[台灣人工智慧學校] 人工智慧技術發展與應用
[台灣人工智慧學校] 人工智慧技術發展與應用
 
[台灣人工智慧學校] 執行長報告
[台灣人工智慧學校] 執行長報告[台灣人工智慧學校] 執行長報告
[台灣人工智慧學校] 執行長報告
 
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰
 
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
 
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
 
[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話
[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話
[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話
 
[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇
[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇
[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇
 
[TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察
[TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察 [TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察
[TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察
 
[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵
[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵
[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵
 
[台灣人工智慧學校] 從經濟學看人工智慧產業應用
[台灣人工智慧學校] 從經濟學看人工智慧產業應用[台灣人工智慧學校] 從經濟學看人工智慧產業應用
[台灣人工智慧學校] 從經濟學看人工智慧產業應用
 
[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告
[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告
[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告
 
台灣人工智慧學校成果發表會
台灣人工智慧學校成果發表會台灣人工智慧學校成果發表會
台灣人工智慧學校成果發表會
 
[台中分校] 第一期結業典禮 - 執行長談話
[台中分校] 第一期結業典禮 - 執行長談話[台中分校] 第一期結業典禮 - 執行長談話
[台中分校] 第一期結業典禮 - 執行長談話
 
[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人
[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人
[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人
 
[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維
[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維
[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維
 
[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察
[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察
[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察
 
[TOxAIA新竹分校] 深度學習與Kaggle實戰
[TOxAIA新竹分校] 深度學習與Kaggle實戰[TOxAIA新竹分校] 深度學習與Kaggle實戰
[TOxAIA新竹分校] 深度學習與Kaggle實戰
 
[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT
[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT
[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT
 
[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達
[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達
[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達
 
[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳
[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳
[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳
 

Speed Is the One Unbeatable Martial Art: Building a Real-Time Classifier and a Real-Time Recommender System from Streaming Data

  • 2. Who I am ▪ Norman Huang (normany@yahoo-inc.com) ▪ Software & Data Engineer at Yahoo! Taiwan ▪ Aims to retrieve and deliver data insights via a BI platform and data mining algorithms.
  • 3. Who I am ▪ Jason Lin (jasonysl@yahoo-inc.com) ▪ Software & Data Engineer at Yahoo! Taiwan ▪ Responsible for the personalization mechanisms of the recommendation system; cloud computing developer.
  • 4. Agenda ▪ Challenges ▪ Solution: Pinball ▪ Q&A
  • 5-6. Challenges ▪ Content stays static until the next batch job. ▪ Batch product recommendation algorithms have become a common feature of e-commerce platforms.
  • 7-8. Challenges ▪ Nearly 72% of visitors made their decision on the same day. ▪ A real-time solution is needed to interact with potential buyers. [Figure: timeline contrasting data already absorbed into batch views with the most recent several hours of data, not yet absorbed]
  • 10. Pinball [Figure: a User, two Classifier nodes, and Profiles A, B, and C]
  • 11-15. Pinball ▪ Real-time classifier ▪ Detects buyers’ preferences through streaming data processing ▪ Delivers personalized ads and product recommendations on the fly ▪ Challenges › How to do it in real time? Storm! › How to determine customers’ purchasing desire? Buying Intention Detection
  • 16. Solution: Pinball ▪ Storm Overview ▪ Buying Intention (BI) ▪ Architecture and Design
  • 17-21. Pinball [Figure, built up over five slides, with the labels Learning, Storm, Buyers, Visitor, "Is Potential Buyer?", and Promotions; on the final slide the Visitor label becomes Buyer]
  • 22. Storm Concepts ▪ Tuples & Streams ▪ Spouts & Bolts ▪ Topologies (Yahoo Confidential & Proprietary)
  • 23. Tuples & Streams ▪ Tuple: a list of named fields (Field 1, Field 2, Field 3, Field 4, Field 5) ▪ Stream: a sequence of tuples (Tuple 1, Tuple 2, Tuple 3, ..., Tuple n)
  • 24. Spouts & Bolts [Figure: a Spout emitting tuples (T) to a Bolt, which emits tuples in turn]
  • 25-26. Topology ▪ Hadoop MapReduce job vs. Storm topology [Figure: a Spout and two Bolts connected by Streams]
  • 27. Storm Concepts ▪ Hadoop: computational primitives Map & Reduce; use case batch processing; high-level language Pig ▪ Storm: computational primitives Spout & Bolt; use case stream processing; high-level language Trident
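The tuple, stream, spout, bolt, and topology vocabulary can be illustrated without Storm itself. The following is a minimal pure-Python sketch and not Storm's API: the names `LogTuple`, `log_spout`, `view_filter_bolt`, `count_bolt`, and `run_topology` are hypothetical, and the two field names are borrowed from the hands-on slides later in the deck. Generators stand in for streams.

```python
from typing import Dict, Iterable, Iterator, List, NamedTuple, Tuple


class LogTuple(NamedTuple):
    """A tuple: a fixed list of named fields (names are illustrative)."""
    userid: str
    type: str


def log_spout(records: Iterable[Tuple[str, str]]) -> Iterator[LogTuple]:
    """Spout: the source of a stream; emits one tuple per input record."""
    for userid, event_type in records:
        yield LogTuple(userid, event_type)


def view_filter_bolt(stream: Iterable[LogTuple]) -> Iterator[LogTuple]:
    """Bolt: consumes a stream and emits a new one (only 'view' events)."""
    for tup in stream:
        if tup.type == "view":
            yield tup


def count_bolt(stream: Iterable[LogTuple]) -> Iterator[Tuple[str, int]]:
    """Bolt: emits (userid, running page-view count) for each tuple."""
    counts: Dict[str, int] = {}
    for tup in stream:
        counts[tup.userid] = counts.get(tup.userid, 0) + 1
        yield tup.userid, counts[tup.userid]


def run_topology(records: Iterable[Tuple[str, str]]) -> List[Tuple[str, int]]:
    """Topology: spout -> filter bolt -> count bolt, wired as a chain."""
    return list(count_bolt(view_filter_bolt(log_spout(records))))


sample_records = [("userA", "view"), ("userB", "click"), ("userA", "view")]
topology_output = run_topology(sample_records)
```

Unlike a MapReduce job, which runs to completion over a fixed input, a real topology keeps running and processes tuples as they arrive; the finite list here is only for demonstration.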
  • 28-30. Storm [Figure: a cluster of one Nimbus node, three Zookeeper nodes, and five Supervisor nodes] ▪ Nimbus: master node, similar to the Hadoop JobTracker ▪ Zookeeper: coordinates the Storm cluster ▪ Supervisors: run worker processes
  • 31. Buying Intention ▪ Based on our findings: › The more pages a visitor views, the more likely the visitor is to buy. › However, the Buying Intention value varies by category. [Figure: two example values, 2 and 6]
  • 32. How to leverage Storm with Buying Intention (BI)?
  • 35. Learning & Classifier ▪ Online binary classification › Simple and computationally efficient ▪ Update rule: $BI' = BI + (PV - BI) \times \gamma$ ▪ Example › Assumptions: $\gamma = 0.1$, $BI = 3$ › Scenario: a user makes 6 page views before purchasing › $BI' = 3 + (6 - 3) \times 0.1 = 3.3$
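The update rule on this slide can be sketched in a few lines of Python. The function names are hypothetical, and the classification step (treating a visitor as a potential buyer once page views reach the current BI value) is an assumption for illustration; the slide states only the update rule.

```python
def update_buying_intention(bi: float, page_views: int, gamma: float = 0.1) -> float:
    """Apply BI' = BI + (PV - BI) * gamma after an observed purchase.

    bi:          current Buying Intention value for a category
    page_views:  page views (PV) the buyer made before purchasing
    gamma:       learning rate; 0.1 is the value used in the slide's example
    """
    return bi + (page_views - bi) * gamma


def is_potential_buyer(page_views: int, bi: float) -> bool:
    """ASSUMPTION: a visitor qualifies once page views reach the BI value."""
    return page_views >= bi


# The slide's example: BI = 3, a buyer made 6 page views, gamma = 0.1.
new_bi = update_buying_intention(3.0, 6, gamma=0.1)
print(round(new_bi, 6))
```

Each purchase moves BI a fraction `gamma` of the way toward the observed page-view count, so the value is an exponentially weighted average that needs only constant memory per category.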
  • 38-41. Lambda Architecture ▪ Term coined by Nathan Marz (creator of Apache Storm) ▪ Batch + real-time processing › Hybrid batch and real-time processing › Batch processing is treated as the source of truth; real-time processing updates models and insights between batches.
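The division of labor between the batch and real-time layers can be sketched as follows. This is a simplified illustration under stated assumptions, not the deck's implementation: the class `LambdaViews` and its method names are hypothetical, the metric (page views per user) is chosen for illustration, and each batch result is assumed to cover every event seen up to the moment it is loaded.

```python
from collections import Counter
from typing import Mapping


class LambdaViews:
    """Serve queries from a batch view plus a real-time view."""

    def __init__(self) -> None:
        self.batch_view: Counter = Counter()      # source of truth
        self.realtime_view: Counter = Counter()   # events since last batch

    def on_event(self, userid: str) -> None:
        """Real-time layer: count an event the batch has not yet absorbed."""
        self.realtime_view[userid] += 1

    def on_batch_complete(self, recomputed: Mapping[str, int]) -> None:
        """Batch layer: replace the batch view and drop the real-time view.

        ASSUMPTION: `recomputed` already includes every event counted in
        the real-time view, so discarding that view loses nothing.
        """
        self.batch_view = Counter(recomputed)
        self.realtime_view.clear()

    def query(self, userid: str) -> int:
        """Merge both views at query time."""
        return self.batch_view[userid] + self.realtime_view[userid]


views = LambdaViews()
views.on_batch_complete({"userA": 5})
views.on_event("userA")
views.on_event("userA")
merged_before_batch = views.query("userA")

views.on_batch_complete({"userA": 7})
merged_after_batch = views.query("userA")
```

Because the batch result overwrites the merged state, any approximation error in the real-time layer lasts only until the next batch run.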
  • 42-45. Lambda Architecture [Figure: Lambda Architecture diagram; later slides add the labels "Storm", "Streaming", and "Summingbird"] [REF] http://lambda-architecture.net/
  • 48. How to keep it generic and flexible? ▪ to add more signals ▪ to add more online learning algorithms ▪ to add more channels
  • 49. How to keep it generic and flexible? ▪ Signals: Click, Login, Buy, View, Bounce, Time Spent ▪ Algorithms: Buying Intention, Fraud Detection, Webpage Sequence ▪ Channels: Email, Y! Webpages, Mobile Apps, Messenger
  • 50. Summary ▪ Scales to process real-time data ▪ Supports online learning algorithms ▪ Allows flexible interactions with visitors ▪ Increases user engagement ▪ Increases the conversion rate ▪ Creates synergy by combining the batch recommender with Pinball
  • 51. Simple Hands-on -> Find out the heavy users!
  • 52. Find out the heavy users! ▪ Keep track of the number of page views for each user ▪ If the count is greater than 3, the user is a heavy user
  • 53. Find out the heavy users! [Figure: User Log Spout feeding a Learning Bolt; tuple fields: userid, type, catlv1, catlv2, timestamp]
  • 54. Find out the heavy users! [Figure: User Log Spout feeding two Learning Bolts with shuffleGroup; tuples shown: (userA, xxxxx), (userB, xxxxx), (userD, xxxxx), (userB, xxxxx), (userE, xxxxx), (userC, xxxxx), (userB, xxxxx), (userC, xxxxx)]
  • 55. Find out the heavy users! [Figure: User Log Spout feeding two Learning Bolts with fieldGroup; tuples shown: (userA, xxxxx), (userD, xxxxx), (userF, xxxxx), (userF, xxxxx), (userE, xxxxx), (userC, xxxxx), (userB, xxxxx), (userB, xxxxx), (userB, xxxxx), (userC, xxxxx)]
  • 56. Find out the heavy users! [Figure: the same fieldGroup topology, with both Learning Bolts feeding a Qualification Bolt; tuples shown at the Qualification Bolt: (userA, totalPV), (userB, totalPV), (userC, totalPV), (userF, totalPV)]
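The hands-on slides contrast shuffle grouping with fields grouping. The following pure-Python sketch, which does not use Storm's API, shows why the choice matters for per-user counting. All function names are hypothetical; round-robin stands in for Storm's random shuffle grouping, and a CRC32 hash stands in for whatever hash a real fields grouping uses.

```python
import zlib
from collections import Counter
from typing import Iterable, List, Set

HEAVY_USER_THRESHOLD = 3  # from the slides: more than 3 page views


def fields_grouping_task(userid: str, num_tasks: int) -> int:
    """Fields grouping: the same userid always maps to the same task."""
    return zlib.crc32(userid.encode("utf-8")) % num_tasks


def find_heavy_users(userids: Iterable[str], num_tasks: int = 2) -> Set[str]:
    """Learning bolts count per user; qualification flags heavy users."""
    learning_bolts: List[Counter] = [Counter() for _ in range(num_tasks)]
    heavy: Set[str] = set()
    for userid in userids:
        task = fields_grouping_task(userid, num_tasks)
        learning_bolts[task][userid] += 1
        # Qualification step: this task's local count is the user's full count.
        if learning_bolts[task][userid] > HEAVY_USER_THRESHOLD:
            heavy.add(userid)
    return heavy


def shuffle_grouping_counts(userids: Iterable[str], num_tasks: int = 2) -> List[Counter]:
    """Shuffle grouping (round-robin stand-in): a user's tuples are scattered."""
    bolts: List[Counter] = [Counter() for _ in range(num_tasks)]
    for i, userid in enumerate(userids):
        bolts[i % num_tasks][userid] += 1
    return bolts


log = ["userB", "userB", "userB", "userB", "userA"]
heavy_users = find_heavy_users(log)
scattered = shuffle_grouping_counts(log)
largest_local_count = max(c["userB"] for c in scattered)
```

With fields grouping on `userid`, each user's complete count lives in exactly one task, so a local threshold check is correct. With shuffle grouping, userB's four page views are split across tasks, and no single task ever sees more than two of them.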
  • 57. Questions? ▪ Norman: @normanyhuang, www.linkedin.com/in/normany ▪ Jason: @kalijason, www.linkedin.com/pub/jason-lin/67/93/743