Presented at Snowplow London Meetup, 8 February 2017
Bruce Pannaman, data scientist at Busuu, talked about why they are using Snowplow to validate and enrich data, enable one source of truth across different data sources, cope with peaks and troughs in the data stream, and easily integrate with third-party systems such as Intercom, a customer messaging platform. One of Busuu’s future projects is to load multiple A/B tests into the apps and monitor their results in real time.
2. busuu is the world’s leading social network for language learning
Language courses + Social network
Language courses
• High quality courses in 12 languages
• Beginner to advanced intermediate level
Social network
• Access to native speakers
• Peer to peer text corrections
3. How does busuu work?
• Most important vocabulary
• Key grammar
• Practice with native speakers
• Faster fluency
busuu is a complete self-study and language practice environment
4. busuu 2016
What sort of data do we use?
● Front end tracking data
● Progress data
● Backend db data
● Third party data
6. Problems
“My data says X, why does yours say Y?”
“CloudWatch Alert!”
“Why can’t I find the results of my A/B test till tomorrow again?”
“Oh my god, do we really have to put yet another tracker in?”
12. Snowplow delivery
How do we get Snowplow to deliver the events to everyone and everything that needs them, instead of adding more trackers to the frontend?
18. Plug & Play Integrations
● One source of truth
● Scalability
● Third party systems can be added very quickly
19. Lambda?
● One Lambda function per type of data and per integration
● Parse through each field of enriched data looking for custom schema name
● Relay required data to third party service through REST API or given Python client
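A minimal sketch of the Lambda pattern on this slide, not busuu's actual code. It assumes the function is triggered by a Kinesis stream of Snowplow enriched events (tab-separated lines whose JSON fields carry self-describing `schema`/`data` pairs). The schema name `com.example/lesson_completed`, the endpoint URL, and the helper names `find_payloads`, `post_json` and `handler` are all hypothetical.

```python
import base64
import json
import urllib.request

# Hypothetical values for illustration; neither appears in the talk.
SCHEMA_NAME = "com.example/lesson_completed"
ENDPOINT = "https://api.example.com/events"


def _collect(node, schema_name):
    """Walk nested JSON and return the data of every node whose schema matches."""
    found = []
    if isinstance(node, dict):
        if schema_name in str(node.get("schema", "")) and "data" in node:
            found.append(node["data"])
        for value in node.values():
            found.extend(_collect(value, schema_name))
    elif isinstance(node, list):
        for item in node:
            found.extend(_collect(item, schema_name))
    return found


def find_payloads(enriched_line, schema_name=SCHEMA_NAME):
    """Scan each tab-separated field for JSON that mentions the custom schema."""
    matches = []
    for field in enriched_line.split("\t"):
        if schema_name not in field:
            continue  # cheap substring check before parsing any JSON
        try:
            doc = json.loads(field)
        except ValueError:
            continue  # field mentions the name but is not JSON
        matches.extend(_collect(doc, schema_name))
    return matches


def post_json(payload, url=ENDPOINT):
    """Relay one payload to the (assumed) third-party REST endpoint."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


def handler(event, context, relay=post_json):
    """Lambda entry point: decode Kinesis records and relay matching payloads."""
    sent = 0
    for record in event.get("Records", []):
        line = base64.b64decode(record["kinesis"]["data"]).decode("utf-8")
        for payload in find_payloads(line):
            relay(payload)
            sent += 1
    return sent
```

Keeping one such function per data type and per integration, as the slide suggests, means each integration can be deployed or removed without touching the others. The `relay` parameter is only there so that a vendor's Python client can be swapped in for the plain REST call.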
23. Future projects
● Live A/B test trains & results
● Live machine learning results in app
● Automated alerting on complex company metrics