Distilling insights @ AppsFlyer

Arnon Rotem-Gal-Oz
Chief Data Officer

Data’s hierarchy of needs*
*With apologies to Maslow

What is AppsFlyer?
Mobile Attribution Measurement and Analytics

Architecture diagram (labels recovered from the slide):
• Event types: installs, clicks, in-app, launches
• Kafka
• Secor
• Raw (sequence files)
• DW (Parquet files)
• DM (aggregations)
• Columnar database (Redshift; evaluating Vertica)
• SparkSQL (evaluating Drill, Presto)
• Spark (ETL, ML)
• Aggregations
• SQL
• Vishnu
• Self-serve BI (TBD)
• Latest events
• Latest event exploration
• Scoring
• Blueshift
• Mojito
• Accounts
• Application dashboard

Get the data from the
big data lake

Or locate it somehow in the
big data swamp…

What’s the distance between
two IP addresses?
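
The slide poses this as an open question, and the answer depends on what the feature is meant to capture. As a minimal sketch, here are two candidate definitions for IPv4 addresses: the difference between the addresses read as integers, and the number of leading bits they share. Both function names are illustrative and are not taken from the talk.

```python
import ipaddress


def numeric_distance(a: str, b: str) -> int:
    """Absolute difference between two IPv4 addresses read as 32-bit integers."""
    return abs(int(ipaddress.IPv4Address(a)) - int(ipaddress.IPv4Address(b)))


def common_prefix_length(a: str, b: str) -> int:
    """Number of leading bits the two IPv4 addresses share (0 to 32)."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    # bit_length() of the XOR is the position of the first differing bit,
    # counted from the least significant end.
    return 32 - diff.bit_length()


if __name__ == "__main__":
    # Numerically adjacent addresses can still fall on different sides
    # of a subnet boundary, so the two measures can disagree.
    print(numeric_distance("10.0.0.255", "10.0.1.0"))
    print(common_prefix_length("10.0.0.255", "10.0.1.0"))
```

Neither measure says anything about geography or ownership; whether either one is a useful feature is something data exploration has to answer.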

Big data doesn’t always mean we
need to analyze petabytes of data;
sometimes it means we can find
just the right sample.
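
The slide does not say how the sample is drawn. One generic option, sketched here as an assumption and not as the method used in the talk, is reservoir sampling, which keeps a uniform random sample of fixed size from a stream of unknown length in a single pass. The function name and the `seed` parameter are illustrative.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(items: Iterable[T], k: int, seed: int = 0) -> List[T]:
    """Return up to k items chosen uniformly at random from a single pass."""
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item survives only if the
            # slot falls inside the reservoir, i.e. with probability k/(i+1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    print(reservoir_sample(range(1_000_000), k=5, seed=42))
```

A uniform sample is only one reading of "the right sample"; a stratified or targeted sample may fit the question better.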

Model selection
• Naive Bayes (built in)
• Logistic Regression (built in)
• SVM (built in)
• Decision trees (built in)
• Locality-sensitive hashing
(https://github.com/mrsqueeze/spark-hash)
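
Locality-sensitive hashing is the one entry in the list above that is not built in, so a small illustration may help. The sketch below uses MinHash with banding, one common LSH scheme for set similarity; it is a plain-Python illustration and is not claimed to match the linked library's implementation. All function names are hypothetical.

```python
import random
from typing import List, Set, Tuple

_PRIME = 2_147_483_647  # Mersenne prime 2**31 - 1, used as the hash modulus


def make_hash_params(n: int, seed: int = 0) -> List[Tuple[int, int]]:
    """Draw n random (a, b) pairs for hash functions h(x) = (a*x + b) mod p."""
    rng = random.Random(seed)
    return [(rng.randrange(1, _PRIME), rng.randrange(0, _PRIME)) for _ in range(n)]


def minhash_signature(items: Set[int], params: List[Tuple[int, int]]) -> List[int]:
    """Signature of a non-empty set of integers in [0, p): the minimum of each hash."""
    return [min((a * x + b) % _PRIME for x in items) for a, b in params]


def estimated_jaccard(sig_a: List[int], sig_b: List[int]) -> float:
    """Fraction of matching signature positions, an estimate of Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


def band_keys(signature: List[int], rows_per_band: int) -> List[Tuple[int, ...]]:
    """Split a signature into bands; each band serves as one bucket key."""
    return [
        tuple(signature[i:i + rows_per_band])
        for i in range(0, len(signature), rows_per_band)
    ]


def is_candidate_pair(sig_a: List[int], sig_b: List[int], rows_per_band: int) -> bool:
    """Two sets are candidates if they land in the same bucket for any band."""
    keys_a = band_keys(sig_a, rows_per_band)
    keys_b = band_keys(sig_b, rows_per_band)
    return any(x == y for x, y in zip(keys_a, keys_b))
```

Similar sets agree on many signature positions and so are likely to collide in at least one band, which lets a pipeline compare only candidate pairs instead of every pair.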

Transform from
DataFrames to MLlib
• LabeledPoint
• Vectors.dense
• Row
• Schema
• categoricalFeature
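
The labels on this slide suggest a mapping from schema-carrying rows to a label plus a dense feature vector, with categorical columns encoded as numeric indices. The sketch below shows that shape in plain Python so it runs without a cluster. The column names (`label`, `clicks`, `installs`, `country`) and both helper functions are hypothetical. In PySpark's RDD-based MLlib, the returned pair would typically be wrapped as `LabeledPoint(label, Vectors.dense(features))`.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical schema: one label column, numeric columns, categorical columns.
LABEL = "label"
NUMERIC = ["clicks", "installs"]
CATEGORICAL = ["country"]

Row = Dict[str, object]


def build_category_index(rows: Sequence[Row], column: str) -> Dict[object, int]:
    """Map each distinct value of a categorical column to an index 0..k-1."""
    values = sorted({row[column] for row in rows}, key=str)
    return {value: i for i, value in enumerate(values)}


def to_labeled_point(
    row: Row, indexes: Dict[str, Dict[object, int]]
) -> Tuple[float, List[float]]:
    """Turn one row into (label, dense feature list)."""
    features = [float(row[c]) for c in NUMERIC]
    # Categorical values become their index, stored as a float.
    features += [float(indexes[c][row[c]]) for c in CATEGORICAL]
    return float(row[LABEL]), features


if __name__ == "__main__":
    rows = [
        {"label": 1, "clicks": 12, "installs": 3, "country": "US"},
        {"label": 0, "clicks": 40, "installs": 0, "country": "IL"},
    ]
    indexes = {c: build_category_index(rows, c) for c in CATEGORICAL}
    points = [to_labeled_point(r, indexes) for r in rows]
    # Feature position -> number of categories, the shape that tree
    # learners in MLlib accept as categorical feature info.
    categorical_info = {
        len(NUMERIC) + i: len(indexes[c]) for i, c in enumerate(CATEGORICAL)
    }
    print(points, categorical_info)
```

Encoding a category as a bare index suits tree-based models that are told which features are categorical; linear models such as logistic regression usually need one-hot encoding instead.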

Torture the data enough and it
will confess to anything

Takeaways
• Big data is not just about big data
• Getting insights is a process
• Spark is great but can drive you crazy :)

Summary
• Understand the problem
• Data exploration
• Feature selection (and building)
• (ETLing)
• Model selection
• Model evaluation

We’re hiring…
jobs@appsflyer.com
