Implementing Artificial Intelligence with Big Data

Presented by Raymond Fu, Big Data Architect, Trace3

  1. Implementing AI with Big Data. SoCal Data Science Conference 2017. Raymond Fu. Los Angeles, CA, 10-22-2017.
  2. Raymond Fu, Data Architect, Trace3. Email: rfu@trace3.com. Twitter: @RaymondxFu
  3. The Future World with AI
  4. What Problem Is AI Solving Today?
     Input → Response
     ● Emails → Is it spam? (0/1)
     ● Images → What is it? (1, …, 100)
     ● Audio → Text
     ● Chinese (你妹) → English (Your Sister)
     ● Text → Audio
  5. “The massive economic value of AI today is driven by supervised learning.” - Andrew Ng
  6. Machine Learning
     Features: each email (rows 1, 2, ..., m) is described by feature columns f1, f2, ..., fn.
     Labels: Is it spam? (0/1)
     Training: rows whose label is known (1, 0, 0, 1, ...).
     Predicting: a row whose label is unknown (?).
  7. Machine Learning
  8. AI vs. Machine Learning vs. Deep Learning
     ● Artificial Intelligence: a machine that thinks, talks, and behaves like a human.
     ● Machine Learning: a computer makes decisions without being explicitly programmed.
     ● Deep Learning: a network of multi-layer, non-linear processing units capable of adapting itself to new data.
  9. “AI problem is a Data Problem. The more data, the merrier.” - Raymond Fu
  10. Machine Learning vs. Statistics
      Machine Learning
      ● Goal: “learning” from data of all sorts
      ● No assumptions about data distributions
      ● Generalization is through training, validation, and test datasets
      ● Tolerant of redundant features; does not promote data reduction prior to learning
      Statistics
      ● Goal: analyzing and summarizing data
      ● Tight assumptions about data distributions
      ● Generalization is pursued using statistical tests on the training dataset
      ● Preferable to use fewer input features
      ● Promotes data reduction as much as possible before modeling
  11. Computing Cluster, GPU, Cloud. Large Scale Data Processing.
  12. Dataset Labeling
      Labeled data is a group of samples with one specific meaning or tag.
      ● Label an image with the objects in it.
      ● Label an X-ray photo with whether or not the patient has a certain disease.
      ● Join datasets that may correlate with each other.
  13. Big Data Engineering
      1. Data Cleansing: Create both better features and better labels.
      2. Self Service Analytics: Give data analysts tools to easily prepare their data.
      3. Data Storage: Build a performant and cost-efficient data storage strategy.
      4. Streaming: Fast data feed + AI = Fast decision making.
  14. AI in Today’s Industry
  15. Questions?
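
The features-and-labels picture on slide 6 (train on emails with known spam labels, then predict the label of an unseen email) can be sketched roughly as follows. This is only an illustration and not code from the talk: the two features, the toy values, and the choice of logistic regression trained by gradient descent are all assumptions made for the example.

```python
import math

# Hypothetical toy data, not from the talk. Each row is one email described by
# two made-up features: [share of "spammy" words, number of links].
TRAIN_FEATURES = [
    [0.8, 3.0], [0.1, 0.0], [0.9, 5.0], [0.2, 1.0], [0.7, 4.0], [0.0, 0.0],
]
# Labels answer "Is it spam?" with 1 (yes) or 0 (no), one per row above.
TRAIN_LABELS = [1, 0, 1, 0, 1, 0]


def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, features):
    """Weighted sum of the features plus a bias term."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def train(features, labels, epochs=2000, learning_rate=0.1):
    """Training: fit weights to labeled rows with per-sample gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            # Difference between the predicted spam probability and the label.
            error = sigmoid(score(weights, bias, x)) - y
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


def predict(weights, bias, features):
    """Predicting: return 1 (spam) or 0 (not spam) for a row with no label."""
    return 1 if sigmoid(score(weights, bias, features)) >= 0.5 else 0


weights, bias = train(TRAIN_FEATURES, TRAIN_LABELS)
new_emails = [[0.85, 4.0], [0.1, 0.5]]  # unseen rows, label unknown ("?")
predictions = [predict(weights, bias, x) for x in new_emails]
print(predictions)
```

The same two-phase shape (fit on labeled rows, then call a predict step on unlabeled rows) carries over to library implementations; the hand-written loop is used here only to keep the sketch dependency-free.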

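Slide 10 notes that machine learning pursues generalization through training, validation, and test datasets. A minimal sketch of such a split is shown here; the function name `split_dataset`, the 60/20/20 proportions, and the fixed seed are assumptions chosen for illustration and are not taken from the talk.

```python
import random


def split_dataset(samples, train_fraction=0.6, validation_fraction=0.2, seed=0):
    """Shuffle samples and cut them into training, validation, and test lists."""
    if train_fraction <= 0 or validation_fraction < 0:
        raise ValueError("train_fraction must be positive, validation_fraction non-negative")
    if train_fraction + validation_fraction >= 1:
        raise ValueError("fractions must leave room for a test set")
    shuffled = list(samples)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = round(len(shuffled) * train_fraction)
    n_validation = round(len(shuffled) * validation_fraction)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]  # whatever remains
    return train, validation, test


samples = list(range(10))  # stand-in for ten labeled examples
train_set, validation_set, test_set = split_dataset(samples)
print(len(train_set), len(validation_set), len(test_set))  # prints 6 2 2
```

The model is fitted on the training set, tuned against the validation set, and scored once on the test set, so the final score reflects data the model never influenced.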