Ideas for designing data science products for bots, knowledge gaps, IoT + fairness, combining elements from Apache Spark and Turi's GraphLab products.
Design for X: Exploring Product Design with Apache Spark and GraphLab
1. DESIGN FOR X
exploring data science product design with apache spark + graphlab {create}
@amcasari @Concur
data science summit 2016, san francisco
nasa
2. data science via random walks
senior product mgr + data scientist @ Concur Labs
control systems engineering + robotics + legos
officer in USN
operations research analyst
wandering dirtbag + conservation volunteer
EE + applied math + complex systems
underwater robotics engineer
technology consultant
SAHM
3. INSANELY QUICK INTRO TO +
➤ Concur Accelerator Team
➤ Concur Labs
➤ Incubator (still brewing)
Typical Day at Concur
➤ 850K users log into Concur
➤ 300K expense reports processed
➤ 120K trips booked
➤ 170M trips & expense reports warehoused
How do we encourage a culture of innovation while delivering quality service to our existing 33,000 business clients and 40M users?
4. DESIGN SPRINTS FOR DATA SCIENCEY PROTOTYPES
courtesy google ventures {we iterated…because data}
5. INSANELY QUICK INTRO TO APACHE SPARK
➤ “fast and general engine for large-scale data processing”
➤ advanced cyclic data flow and in-memory computing > runs 10x-100x faster than Hadoop MR
➤ interactive shells in several languages (incl. SQL)
➤ performant + scalable
courtesy databricks
6. ALMOST AS INSANELY QUICK INTRO TO +
➤ graphlab create is based on a python data science library developed + (partly) open-sourced by turi
➤ SFrame <<>> Spark DataFrame | SparkRDD
➤ (yes it works with Open Source SFrame and GLC)
courtesy turi
7. WHAT PROBLEM DO WE WANT TO DATA SCIENCE?
Knowledge Gaps
IOT Networks
Bots
+ Fairness
8. ➤ “We could {build this} {answer this better} if….”
➤ Reciprocal Data Applications
DESIGN FOR KNOWLEDGE GAPS
[diagram: three rda blocks, each labeled "choose your data storage", alongside "the app you really want to make"]
9. ➤ “Can we trust our sensors?”
➤ “Has our network been hacked?”
DESIGN FOR IOT NETWORKS
[diagram: device, device, device; alerts, notifications, monitoring; dashboards; data services]
Anomaly Detection Toolkit
TimeSeries <<>> SFrame
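The "can we trust our sensors?" question can be prototyped before reaching for a toolkit. Below is a minimal sketch, assuming a single device that reports one numeric reading per time step. It is a generic rolling z-score check, not the Anomaly Detection Toolkit's API; the function name, window size, threshold, and sample readings are all hypothetical.

```python
from statistics import mean, stdev


def rolling_zscore_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that sit far outside the trailing window."""
    anomalies = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: any change at all is worth a look.
            if readings[i] != mu:
                anomalies.append(i)
            continue
        if abs(readings[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical temperature readings from one device, with one spike.
temps = [20.1, 20.3, 20.2, 20.4, 20.2, 20.3, 35.0, 20.2, 20.3, 20.1]
flagged = rolling_zscore_anomalies(temps)
print(flagged)
```

The flagged indices are what would feed the alerts, notifications, and dashboards in the diagram. A trailing window keeps the check cheap enough to run per device, at the cost of being slow to recover right after a spike.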
10. ➤ “How do we create a conversational interface?”
….nothing new, just the burning question since Turing, 1950
DESIGN FOR BOTS
what NOT to do….
non-creepy unisex animal mascot
conversational ui
choose or create your framework
choose your data storage
Advanced Deep Learning
Text Analysis Toolkit
Graph Analytics Toolkit
11. ➤ know your biases + limitations
➤ in your data, their data, all the data
➤ in your feature selection
➤ in your algorithm
…..because ethics (these ALL bias your results + communications)
DESIGN FOR FAIRNESS
learn more at data & society’s case studies
open source. reproducible. transparent.
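One concrete way to start on "know your biases" is to measure outcomes per group. Below is a minimal sketch, assuming each record is a (group, outcome) pair with outcome 1 for a positive decision and 0 otherwise. The function names and sample decisions are hypothetical, and comparing selection rates is only one of several fairness definitions, not a complete audit.

```python
from collections import defaultdict


def selection_rates(records):
    """Fraction of positive outcomes for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, outcome in records:
        totals[group] += 1
        positives[group] += outcome
    return {group: positives[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates):
    """Lowest selection rate divided by the highest; 1.0 means parity."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # No group received any positive outcome.
    return min(rates.values()) / highest


# Hypothetical model decisions for two groups.
decisions = [
    ("a", 1), ("a", 1), ("a", 0), ("a", 1),
    ("b", 1), ("b", 0), ("b", 0), ("b", 0),
]
rates = selection_rates(decisions)
ratio = disparate_impact_ratio(rates)
print(rates, round(ratio, 2))
```

A ratio well below 1.0 is a prompt to look closer at the data, the features, and the algorithm, in that order. It does not say where the bias comes from.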
12. {THANKS MUCH}
➤ Concur is hiring!
➤ SAP + SAP Ariba are hiring!
concurlabs.com
github.com/concurlabs
➤ example notebooks will be posted on our github in the future
@amcasari