Detailed design for a robust counter as well as design for a completely on-line multi-armed bandit implementation that uses the new Bayesian Bandit algorithm.
2. The Challenge
• Hadoop is great of processing vats of data
– But sucks for real-time (by design!)
• Storm is great for real-time processing
– But lacks any way to deal with batch processing
• It sounds like there isn’t a solution
– Neither fashionable solution handles everything
4. Hadoop is Not Very Real-time
Unprocessed now
Data
t
Fully Latest full Hadoop job
processed period takes this
long for this
data
5. Need to Plug the Hole in Hadoop
• We have real-time data with limited state
– Exactly what Storm does
– And what Hadoop does not
• Can Storm and Hadoop be combined?
6. Real-time and Long-time together
Blended now
View
view
t
Hadoop works Storm
great back here works
here
7. An Example
• I want to know how many queries I get
– Per second, minute, day, week
• Results should be available
– within <2 seconds 99.9+% of the time
– within 30 seconds almost always
• History should last >3 years
• Should work for 0.001 q/s up to 100,000 q/s
• Failure tolerant, yadda, yadda
9. Counter Bolt Detail
• Input: Labels to count
• Output: Short-term semi-aggregated counts
– (time-window, label, count)
• Non-zero counts emitted if
– event count reaches threshold (typical 100K)
– time since last count reaches threshold (typical 1s)
• Tuples acked when counts emitted
• Double count probability is > 0 but very small
10. Counter Bolt Counterintuitivity
• Counts are emitted for same label, same time
window many times
– these are semi-aggregated
– this is a feature
– tuples can be acked within 1s
– time windows can be much longer than 1s
• No need to send same label to same bolt
– speeds failure recovery
11. Design Flexibility
• Counter can persist short-term transaction log
– counter can recover state on failure
– log is normally burn after write
• Count flush interval can be extended without
extending tuple timeout
– Decreases currency of counts
– System is still real-time at a longer time-scale
• Total bandwidth for log is typically not huge
12. Counter Bolt No-nos
• Cannot accumulate entire period in-memory
– Tuples must be ack’ed much sooner
– State must be persisted before ack’ing
– State can easily grow too large to handle without
disk access
• Cannot persist entire count table at once
– Incremental persistence required
13. Guarantees
• Counter output volume is small-ish
– the greater of k tuples per 100K inputs or k tuple/s
– 1 tuple/s/label/bolt for this exercise
• Persistence layer must provide guarantees
– distributed against node failure
– must have either readable flush or closed-append
– HDFS is distributed, but no guarantees
– MapRfs is distributed, provides both guarantees
14. Failure Modes
• Bolt failure
– buffered tuples will go un’acked
– after timeout, tuples will be resent
– timeout ≈ 10s
– if failure occurs after persistence, before acking, then
double-counting is possible
• Storage (with MapR)
– most failures invisible
– a few continue within 0-2s, some take 10s
– catastrophic cluster restart can take 2-3 min
– logger can buffer this much easily
15. Presentation Layer
• Presentation must
– read recent output of Logger bolt
– read relevant output of Hadoop jobs
– combine semi-aggregated records
• User will see
– counts that increment within 0-2 s of events
– seamless meld of short and long-term data
16. Mobile Network Monitor
Transaction
data
Geo-dispersed
ingest servers Batch aggregation
Retro-analysis
interface
Map
Real-time dashboard
and alerts
16
17. Example 2 – Real-time learning
• My system has to
– learn a response model
and
– select training data
– in real-time
• Data rate up to 100K queries per second
18. Door Number 3
• I have 15 versions of my landing page
• Each visitor is assigned to a version
– Which version?
• A conversion or sale or whatever can happen
– How long to wait?
• Some versions of the landing page are horrible
– Don’t want to give them traffic
19. Real-time Constraints
• Selection must happen in <20 ms almost all
the time
• Training events must be handled in <20 ms
• Failover must happen within 5 seconds
• Client should timeout and back-off
– no need for an answer after 500ms
• State persistence required
20. Rough Design
Selector Query Event Counter
DRPC Spout Timed Join Model
Layer Spout Bolt
Conversion Logger
Logger Model
Detector Bolt
Bolt State
Raw
Logs
21. A Quick Diversion
• You see a coin
– What is the probability of heads?
– Could it be larger or smaller than that?
• I flip the coin and while it is in the air ask again
• I catch the coin and ask again
• I look at the coin (and you don’t) and ask again
• Why does the answer change?
– And did it ever have a single value?
22. A First Conclusion
• Probability as expressed by humans is
subjective and depends on information and
experience
23. A Second Conclusion
• A single number is a bad way to express
uncertain knowledge
• A distribution of values might be better
27. Bayesian Bandit
• Compute distributions based on data
• Sample p1 and p2 from these distributions
• Put a coin in bandit 1 if p1 > p2
• Else, put the coin in bandit 2
28. And it works!
0.12
0.11
0.1
0.09
0.08
0.07
regret
0.06
ε- greedy, ε = 0.05
0.05
0.04 Bayesian Bandit with Gam m a- Norm al
0.03
0.02
0.01
0
0 100 200 300 400 500 600 700 800 900 1000 1100
n
29. The Basic Idea
• We can encode a distribution by sampling
• Sampling allows unification of exploration and
exploitation
• Can be extended to more general response
models