Intelligent Applications with Machine Learning Toolkits

Presented by Shawn Scully.

  1. Shawn Scully, VP of Customer Success & Applications (scully@dato.com, @backwoodsbrains): Intelligent Applications with Machine Learning Toolkits
  2. Within 5 years, every innovative application will be intelligent.
  3. Intelligent applications create tremendous value, but take a lot of time and specialized skills to build. Examples: recommenders, lead scoring, churn prediction, multi-channel targeting, auto-summarization, fraud detection, intrusion detection, demand forecasting, data matching, failure prediction.
  4. Our mission is to accelerate innovators to create intelligent applications with agile machine learning.
  5. Needs of an Agile ML Platform (GraphLab Create, Dato Predictive Services): rapid development; deploy as a microservice; live serving, monitoring, and model management; iterate with feedback.
  6. A toolkit view of the world
  7. Algorithms vs. toolkits
     Algorithms (e.g. SVD++ w/SGD vs. SVD):
     • PhD students care a lot about these!
     • Many papers focused on “my curve is better than your curve”
     • Not always the most practical…
     Toolkits (e.g. Recommender: item similarity, SVD++, iALS, factorization machine, many more!):
     • Grouped by a common task
     • Focused on meaningful differences in data & problem
     • Practical implementations
  8. Easily create a live machine learning service
     Create a recommender in 5 lines of code (toolkit with auto selection), then deploy in minutes:

         import graphlab as gl

         # Load the interaction data and let the toolkit select a recommender model
         data = gl.SFrame.read_csv('my_data.csv')
         model = gl.recommender.create(data, user_id='user', item_id='movie',
                                       target='rating')
         recommendations = model.recommend(k=5)

         # Deploy the trained model as a live service
         cluster = gl.deploy.load('s3://path')
         cluster.add('servicename', model)
  9. Dato Machine Learning Toolkits
     Applications: recommender, sentiment_analysis, similarity_search, churn_predictor, data_matching, lead_scoring, clickthrough_predictor
     Fundamentals: regression, classifier, nearest_neighbors, clustering, deeplearning, anomaly_detection, pattern_mining, text_analytics, graph_analytics
     Utilities: model_parameter_search, cross_validation, evaluation, comparison, feature_engineering
     50+ models, including factorization machines, convolutional neural nets, label propagation, and topic models, all in one framework!
     https://dato.com/products/create/docs/graphlab.toolkits.html
  10. Toolkit: Recommender
  11. Examples of Recommenders
  12. Recommend
      Goal: Find or recommend similar or related items.
      Value:
      • Increase user engagement
      • Sell more/increase clickthrough
      • Create better user experiences
  13. Recommend - Data + Toolkit
      Toolkit: recommender (graphlab.recommender.create)

      user_id  item_id  item_name
      103      1        'Empire Strikes Back'
      102      2        'Wrath of Khan'
      104      3        'Sleepless in Seattle'
      102      4        'Rambo'
      104      5        'Chocolate'
      103      6        'The Avengers'
      102      1        'Empire Strikes Back'
      104      1        'Empire Strikes Back'
      103      4        'Rambo'
      104      7        'When Harry Met Sally'
      102      2        'Wrath of Khan'
      104      8        'Up'
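
To make the idea behind the item similarity model concrete, here is a minimal sketch in standard-library Python 3 that runs on the (user_id, item_id) pairs from the slide above. It is not GraphLab Create's implementation: the Jaccard scoring, the tie-breaking rule, and the function names are all assumptions chosen for illustration.

```python
from collections import defaultdict

# (user_id, item_id) pairs copied from the slide's table.
interactions = [
    (103, 1), (102, 2), (104, 3), (102, 4), (104, 5), (103, 6),
    (102, 1), (104, 1), (103, 4), (104, 7), (102, 2), (104, 8),
]


def build_item_users(pairs):
    """Map each item to the set of users who interacted with it."""
    item_users = defaultdict(set)
    for user, item in pairs:
        item_users[item].add(user)
    return item_users


def jaccard(a, b):
    """Jaccard similarity between two sets of users."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(pairs, user, k=5):
    """Rank unseen items by their summed similarity to the user's items."""
    item_users = build_item_users(pairs)
    seen = {item for u, item in pairs if u == user}
    scores = {}
    for candidate, users in item_users.items():
        if candidate in seen:
            continue
        scores[candidate] = sum(jaccard(users, item_users[s]) for s in seen)
    # Highest score first; ties are broken by the smaller item_id.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked[:k] if score > 0]


print(recommend(interactions, user=103, k=3))  # [2, 3, 5]
```

User 103 has seen items 1, 4, and 6, so item 2 ('Wrath of Khan') ranks first: it shares user 102 with both 'Empire Strikes Back' and 'Rambo'.
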
  14. Toolkit: Sentiment Analysis & Product Sentiment
  15. Examples of sentiment scoring & summarization
  16. Sentiment Analysis & Product Sentiment
      Goal: Score sentiment of a sentence, document, or aspect.
      Value:
      • Quantitative measures from unstructured text
      • Eliminate the need to read everything
      • Summarize on aspects you care about
  17. Sentiment scoring - Data + Toolkit
      Toolkit: sentiment_analysis (graphlab.sentiment_analysis.create, graphlab.product_sentiment.create)
  18. Toolkit: Similarity Search
  19. Examples of image search & tagging
  20. Image Search & Tagging
      Goal: Find visually similar images.
      Value:
      • Create more intuitive user experiences
      • Learn interesting things like style
      • Reduce manual processes (like tagging)
  21. Image search - Data + Toolkit
      Toolkit: similarity_search (graphlab.data_matching.similarity_search.create)
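
The slides do not say how the toolkit compares images. One common approach, assumed here, is to reduce each image to a numeric feature vector and return the nearest vectors to a query. The sketch below does a brute-force cosine-similarity search in standard-library Python 3. The file names and the three-dimensional vectors are hypothetical, and the code makes no claim about what similarity_search does internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, index, k=2):
    """Return the names of the k indexed vectors closest to the query."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in index.items()]
    # Highest similarity first; ties are broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


# Hypothetical feature vectors; a real system would extract these from images.
image_features = {
    'red_dress.jpg': [0.9, 0.1, 0.0],
    'red_skirt.jpg': [0.8, 0.2, 0.1],
    'blue_jeans.jpg': [0.1, 0.9, 0.2],
}

print(most_similar([1.0, 0.0, 0.0], image_features, k=2))
# ['red_dress.jpg', 'red_skirt.jpg']
```

Brute force compares the query against every indexed vector, which is fine for a demonstration but scales linearly with the size of the collection.
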
  22. Toolkit: Churn Predictor
  23. Churn Prediction
      Goal: Identify users that are likely to stop doing something (e.g. paying for your service, using a product feature, etc.).
      Value:
      • Keep your customers
      • Optimize marketing/customer success spend
      • Identify issues with product or business
  24. Problem setup
      • Period 1: features
      • Period 2: target
      • Period 3: hold-out set
      Goal: a machine learning model that predicts if a user does not appear in Period 2.
      Evaluation: score for (app, user) pairs absent in Period 3.
  25. Data Transformations
      Raw events (app, user, time, etc.) are aggregated over time into unique (app, user) pairs with feature columns (feature 1, feature 2, ...) to generate predictive features.
      Features:
      • time since last use
      • time since first use
      • # unique days user has used app
      • # times user used app in last delta days
      • rolling aggregates
      • etc.
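
The aggregation described on this slide can be sketched for a single (app, user) pair as follows. This is an illustrative standard-library Python 3 example, not the churn_predictor toolkit's feature pipeline: the function name, the feature key names, the 30-day default for delta, and the sample dates are all assumptions. Rolling aggregates are left out to keep the sketch short.

```python
from datetime import date


def usage_features(event_dates, as_of, delta_days=30):
    """Aggregate one (app, user) pair's event dates into churn features."""
    if not event_dates:
        raise ValueError("need at least one event")
    return {
        'days_since_last_use': (as_of - max(event_dates)).days,
        'days_since_first_use': (as_of - min(event_dates)).days,
        'unique_days_used': len(set(event_dates)),
        # Events that fall within the delta_days window ending at as_of.
        'uses_in_last_delta_days': sum(
            1 for d in event_dates if 0 <= (as_of - d).days < delta_days
        ),
    }


# Hypothetical usage dates for one (app, user) pair.
events = [date(2015, 3, 1), date(2015, 3, 1), date(2015, 3, 20), date(2015, 4, 10)]
features = usage_features(events, as_of=date(2015, 4, 15), delta_days=30)
print(features)
```

Running the function once per (app, user) pair yields one row per pair, which matches the "unique pairs" table the slide describes.
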
  26. Predict Churn - Data + Toolkit
      Toolkit: churn_predictor (graphlab.churn_predictor.create)

      user_id  event     datetimestamp
      103      play      '01-01-15'
      102      click     '02-05-15'
      102      visit     '03-06-15'
      102      visit     '03-09-15'
      103      purchase  '03-21-15'
      103      click     '03-22-15'
      102      click     '03-23-15'
      103      click     '04-02-15'
      103      play      '04-01-15'
      103      purchase  '05-02-15'
      103      play      '05-01-15'
      103      play      '05-15-15'
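
The problem setup from the earlier slide can be applied to this event log by picking a boundary date: users seen before the boundary are labeled as churned if they have no events on or after it. The sketch below does this in standard-library Python 3. The boundary of 04-01-15, the MM-DD-YY reading of the timestamps, and the function names are assumptions, and the code does not reflect how graphlab.churn_predictor.create builds its training targets.

```python
from datetime import datetime

# Event log copied from the slide's table.
events = [
    (103, 'play', '01-01-15'), (102, 'click', '02-05-15'),
    (102, 'visit', '03-06-15'), (102, 'visit', '03-09-15'),
    (103, 'purchase', '03-21-15'), (103, 'click', '03-22-15'),
    (102, 'click', '03-23-15'), (103, 'click', '04-02-15'),
    (103, 'play', '04-01-15'), (103, 'purchase', '05-02-15'),
    (103, 'play', '05-01-15'), (103, 'play', '05-15-15'),
]


def parse(stamp):
    """Parse a timestamp, assuming the MM-DD-YY format."""
    return datetime.strptime(stamp, '%m-%d-%y').date()


def churn_labels(rows, boundary):
    """Label users active before `boundary`: True if absent on or after it."""
    before, after = set(), set()
    for user, _event, stamp in rows:
        (before if parse(stamp) < boundary else after).add(user)
    return {user: user not in after for user in sorted(before)}


labels = churn_labels(events, boundary=parse('04-01-15'))
print(labels)  # {102: True, 103: False}
```

User 102's last event is on 03-23-15, so they count as churned under this boundary, while user 103 stays active through 05-15-15.
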
  27. Toolkit: Data Matching
  28. Examples of data matching
      record = {'SSN': None, 'Name': 'Smith, Will', 'Sex': 'Male', 'ZIP': 94701}
  29. Data Matching
      Goal: Identify entities & appropriately link records.
      Value:
      • Deduplicate contacts/records
      • “360 view” of customer across multiple properties
      • Improve data quality
  30. Data matching - Data + Toolkit
      Toolkit: data_matching (graphlab.deduplication.create, graphlab.record_linker.create)
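
As a rough illustration of record linking, the sketch below decides whether two records like the one on the earlier slide refer to the same entity. It uses only the Python 3 standard library (difflib) and a hand-written rule: compare SSNs when both are present, otherwise require the same ZIP and a similar normalized name. The rule, the 0.85 threshold, and the two comparison records are assumptions made for this example; the deduplication and record_linker toolkits are not implemented this way as far as the slides say.

```python
from difflib import SequenceMatcher


def normalize_name(name):
    """Lowercase a name and reorder 'Last, First' into 'first last'."""
    name = name.strip().lower()
    if ',' in name:
        last, first = [part.strip() for part in name.split(',', 1)]
        name = first + ' ' + last
    return ' '.join(name.split())


def is_match(a, b, threshold=0.85):
    """Heuristic check that two records describe the same person."""
    # When both records carry an SSN, it decides the match on its own.
    if a.get('SSN') and b.get('SSN'):
        return a['SSN'] == b['SSN']
    # Compare ZIPs as strings so 94701 and '94701' agree.
    if str(a.get('ZIP')) != str(b.get('ZIP')):
        return False
    ratio = SequenceMatcher(
        None, normalize_name(a['Name']), normalize_name(b['Name'])
    ).ratio()
    return ratio >= threshold


# The first record is from the slide; the other two are hypothetical.
record = {'SSN': None, 'Name': 'Smith, Will', 'Sex': 'Male', 'ZIP': 94701}
candidate = {'SSN': None, 'Name': 'Will Smith', 'Sex': 'Male', 'ZIP': '94701'}
other = {'SSN': None, 'Name': 'Jones, Mary', 'Sex': 'Female', 'ZIP': 94701}

print(is_match(record, candidate))  # True
print(is_match(record, other))      # False
```

Normalizing before comparing is what lets 'Smith, Will' and 'Will Smith' match exactly; the fuzzy ratio then only has to absorb typos.
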
  31. More than 50,000 developers are using Dato
  32. [Image-only slide]
  33. Tools built for innovators: The Agile Machine Learning Platform
  34. Agility to create machine learning services: GraphLab Create
      Application Toolkits:
      • Auto-select the best algorithm
      • Auto-prepare the data for ML
      • Task-oriented methods
      Data Layer for ML:
      • Manipulate all relevant data types
      • Out-of-core design eliminates scale pains
      Robust Enterprise-Grade Algorithms:
      • 50+ best-practice & novel algorithms
      • Robust to real-world data
  35. Dato Predictive Services: agility to deploy as microservices on AWS, on premises, or on YARN
      • Real-time recommendations
      • Online ad scoring & serving
      • Transactional fraud detection
  36. How will you make your enterprise intelligent?
  37. Thanks!
      • Get the software: https://www.dato.com/download/
      • Platform overview: https://dato.com/products/
      • Talk about ML at your company: scully@dato.com
      Toolkits:
      • overview: https://dato.com/products/create/docs/graphlab.toolkits.html
      • recommender: https://dato.com/products/create/docs/graphlab.toolkits.recommender.html
      • churn_predictor: https://dato.com/products/create/docs/graphlab.toolkits.churn_predictor.html
      • similarity_search: https://dato.com/products/create/docs/graphlab.toolkits.data_matching.html#similarity-search-model
      • sentiment_analysis: https://dato.com/products/create/docs/graphlab.toolkits.sentiment_analysis.html
      • data_matching: https://dato.com/products/create/docs/graphlab.toolkits.data_matching.html
