This deck covers the goal of building an NLP text recommender that surfaces relevant answers for customer service agents; the approach taken, including the features behind an ML ranking model and the architecture for serving recommendations and training models; and the system's evolution across versions toward multi-tenancy, dynamic training, and rollbacks.
6. Business Metrics
● Agent Time to Resolution
● Agent Time spent per case
● Case-Article Attach Rate
● # of recommendations served
● MAO, MAU
● Serving Latency
10. Data Prep & Feature Engineering
● Multi-tenant data ingestion pipeline
● Data Cleansing and Sanity checks
● Precompute TDF, Corpus Statistics
● Feature Vectors computation
● 100+ NLP features across different statistical feature categories
● Serving/Training drift checks
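The "precompute corpus statistics, then compute feature vectors" step above can be sketched roughly as follows. This is an illustrative reconstruction, not the talk's actual pipeline: document frequencies and IDF weights are computed offline over the corpus, so that per-request feature vectors only need cheap lookups at serving time. All function names and the smoothing scheme are assumptions.

```python
import math
from collections import Counter

def precompute_idf(corpus: list[list[str]]) -> dict[str, float]:
    """Offline pass: document frequency -> smoothed IDF for every term."""
    n_docs = len(corpus)
    df = Counter(term for doc in corpus for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf_vector(doc: list[str], idf: dict[str, float]) -> dict[str, float]:
    """Online pass: sparse TF-IDF vector; unseen terms get the max IDF."""
    tf = Counter(doc)
    default_idf = max(idf.values(), default=1.0)
    return {t: (c / len(doc)) * idf.get(t, default_idf) for t, c in tf.items()}

# Toy corpus of tokenized support articles (illustrative only).
corpus = [["reset", "password"], ["password", "expired"], ["billing", "issue"]]
idf = precompute_idf(corpus)
vec = tfidf_vector(["reset", "password"], idf)
```

Precomputing the IDF table is what makes feature-vector computation cheap enough for serving-time latency budgets.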
11. Model Training
● Ranking Model
● Auto-tuned hyperparameters
● Automated model comparison
● Metrics
○ AUC
○ F-Measure
○ Precision, Recall
○ Hit Rate @K
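Of the metrics listed, Hit Rate @K is the most recommendation-specific: it asks whether at least one relevant article appears in the top K results. A minimal sketch (case data and IDs are invented for the example):

```python
def hit_rate_at_k(ranked_ids: list[str], relevant_ids: set[str], k: int = 3) -> float:
    """1.0 if any relevant article appears in the top-k recommendations."""
    return 1.0 if any(a in relevant_ids for a in ranked_ids[:k]) else 0.0

# Averaged over evaluation cases, this yields the offline Hit Rate @K.
cases = [
    (["a1", "a2", "a3", "a4"], {"a3"}),  # hit at rank 3
    (["a5", "a6", "a7", "a8"], {"a8"}),  # relevant article outside top 3
]
score = sum(hit_rate_at_k(r, rel, k=3) for r, rel in cases) / len(cases)
```

Unlike AUC or precision/recall, Hit Rate @K directly mirrors the agent experience: only the handful of recommendations actually shown matter.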
13. ML System Evolution
version 0
● Heuristic based answer recommendations POC. First pilot sign up.
● Communities use case: community-selected bestAnswer as the positive label
● Generic model trained on the open-source Stanford SQuAD dataset
version 1
● Ranking model: <question, answer> pairwise probability
● Notebook-based on-demand training
● Statically configured data filtering
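The version-1 ranker's "<question, answer> pairwise probability" can be sketched as a logistic model that scores each candidate pair and sorts answers by predicted relevance. This is an illustrative simplification; the weights, feature names, and article IDs below are made up.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def pair_probability(features: dict[str, float], weights: dict[str, float]) -> float:
    """P(answer is relevant | question), from pairwise features."""
    z = sum(weights.get(name, 0.0) * value for name, value in features.items())
    return sigmoid(z)

# Hypothetical learned weights over a few pairwise features.
weights = {"token_overlap": 2.0, "tfidf_cosine": 3.0, "length_ratio": -0.5}
candidates = {
    "kb-101": {"token_overlap": 0.6, "tfidf_cosine": 0.8, "length_ratio": 1.2},
    "kb-202": {"token_overlap": 0.1, "tfidf_cosine": 0.2, "length_ratio": 0.9},
}
ranked = sorted(candidates,
                key=lambda a: pair_probability(candidates[a], weights),
                reverse=True)
```

Scoring pairs independently keeps serving simple: candidates can be featurized and ranked in one pass per incoming question.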
14. ML System Evolution
version 2
● Dynamically configured training dataset attributes
● Model retraining
● Multilingual Support
● Multitenant Auto-trained models
● Observability
● Trained Model Deployments & Rollbacks
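The jump from version 1's static filters to "dynamically configured training dataset attributes" can be sketched as config-driven row selection: each tenant's training job reads a runtime config describing which records qualify, instead of hard-coded rules. Field and class names here are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class TrainingDataConfig:
    """Per-tenant, runtime-configurable dataset attributes (hypothetical)."""
    min_question_tokens: int = 3
    languages: tuple[str, ...] = ("en",)
    require_attached_article: bool = True

def select_training_rows(rows: list[dict], cfg: TrainingDataConfig) -> list[dict]:
    """Apply the configured filters to raw case data before training."""
    out = []
    for row in rows:
        if len(row["question"].split()) < cfg.min_question_tokens:
            continue
        if row["lang"] not in cfg.languages:
            continue
        if cfg.require_attached_article and not row.get("article_id"):
            continue
        out.append(row)
    return out

rows = [
    {"question": "how do I reset my password", "lang": "en", "article_id": "kb-1"},
    {"question": "help", "lang": "en", "article_id": "kb-2"},            # too short
    {"question": "wie setze ich mein passwort zurück", "lang": "de",
     "article_id": "kb-3"},                                              # non-configured language
]
kept = select_training_rows(rows, TrainingDataConfig())
```

The same mechanism supports the multilingual and multi-tenant bullets: widening `languages` per tenant changes the training set without a code change.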
17. Challenges
● Data
○ Privacy and sharing compliance – GDPR, HIPAA, Accessibility
○ Freshness / Hydration
○ Handling encrypted data at rest and in motion
○ Too sparse, not meeting thresholds
○ Too dense, training performance SLA not met
● Custom, non-standard fields and datatypes
● Building ML Infrastructure along the way
● Training Serving Skew
● Cold start problem
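Training/serving skew, one of the challenges above, is typically caught by comparing feature statistics from the training pipeline against those logged at serving time. A minimal drift check might look like this; the tolerance and the sample values are invented for illustration.

```python
import statistics

def skew_alert(train_values: list[float], serve_values: list[float],
               rel_tol: float = 0.10) -> bool:
    """True if the serving-time mean of a feature drifts more than
    rel_tol (relative) from its training-time mean."""
    train_mean = statistics.fmean(train_values)
    serve_mean = statistics.fmean(serve_values)
    return abs(serve_mean - train_mean) > rel_tol * abs(train_mean)

# Hypothetical values of one feature (e.g. a tfidf_cosine score).
train = [0.42, 0.40, 0.38, 0.41]
serve_ok = [0.43, 0.39, 0.40]
serve_bad = [0.60, 0.58, 0.62]  # e.g. a tokenizer mismatch at serving time
```

Real systems compare full distributions rather than means, but even this coarse check can flag the common failure mode where serving-side feature code silently diverges from the training pipeline.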
18. Takeaways
● Start small, Ship and Iterate
● Prioritize ML infrastructure
● Start with simple interpretable models
● Scale model learning to the size of your data
● Prioritize Observability
● Prioritize Data privacy over model quality