
AutoML in NeurIPS 2018

Presented at NeurIPS 2018 Yomikai @ PFN
https://connpass.com/event/115476/



  1. 1. AutoML NeurIPS 2018 Yomikai @ PFN (2019/01/26) Shotaro Sano
  2. 2. Agenda • AutoML @ NeurIPS 2018 • Paper 1: “Massively Parallel Hyperparameter Tuning” [Li, et al.] • Paper 2: “Neural Architecture Optimization” [Luo, et al.]
  3. 3. What is AutoML? • Hyperparameter Optimization (HPO): automatic tuning of hyperparameters • Neural Architecture Search (NAS): automatic search for NN architectures • Meta Learning: learning how to learn across tasks. “The user simply provides data, and the AutoML system automatically determines the approach that performs best for particular applications.” Hutter et al., 2018, AutoML: Methods, Systems, Challenges
  4. 4. AutoML @ NeurIPS 2018 • Many AutoML-related contributions: HPO, NAS, Meta Learning • Meta Learning was especially prominent • AutoML Meetup @ Google AI • Related venues: – Systems for ML workshop – Meta Learning workshop – NeurIPS 2018 Competition Track
  5. 5. Hyperparameter Optimization @ NeurIPS 2018 • Trend: Bayesian Optimization combined with meta-learning • ~10 papers at the main conference, e.g.: – “Regret bounds for meta Bayesian optimization with an unknown Gaussian process prior” – “Automating Bayesian Optimization with Bayesian Optimization” – etc. • At the Systems for ML workshop: – “Massively Parallel Hyperparameter Tuning” – “Population Based Training as a Service” – etc.
  6. 6. Neural Architecture Search @ NeurIPS 2018 • Applications expanding beyond image classification, e.g. to Semantic Segmentation • 4+ papers at the main conference, e.g.: – “Neural Architecture Optimization” – “Neural Architecture Search with Bayesian Optimization and Optimal Transport” – etc. • An AutoDL competition is planned for 2019
  7. 7. Meta Learning @ NeurIPS 2018 • Keywords: Model-agnostic Meta-Learning, Few-shot Learning, Transfer Learning, etc. • ~20 papers at the main conference, e.g.: – “Bayesian Model-Agnostic Meta Learning” – “Meta-Reinforcement Learning of Structured Exploration Strategies” – etc. • Meta Learning techniques also feed into HPO and NAS
  8. 8. Competition Track: AutoML3 @ NeurIPS 2018 • AutoML3: lifelong AutoML – data arrives as a sequence of tasks – distributions can drift between tasks • Tree-structured Parzen Estimator + LightGBM/XGBoost (a minimal sketch follows below) • (Figure: Train & Test across Task A to Task E)
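For concreteness, the “TPE + LightGBM/XGBoost” combination mentioned above can be sketched with Optuna's TPE sampler driving a LightGBM classifier. This is a minimal illustration under assumed settings (a toy sklearn dataset, arbitrary parameter ranges, simple 3-fold CV), not the actual competition pipeline.

```python
# Sketch: TPE-driven hyperparameter search over a LightGBM classifier.
# Dataset, search ranges, and CV setup are illustrative assumptions.
import lightgbm as lgb
import optuna
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

def objective(trial):
    params = {
        "num_leaves": trial.suggest_int("num_leaves", 8, 256),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "min_child_samples": trial.suggest_int("min_child_samples", 5, 100),
    }
    model = lgb.LGBMClassifier(n_estimators=200, **params)
    # 3-fold CV accuracy is the value TPE tries to maximize.
    return cross_val_score(model, X, y, cv=3, scoring="accuracy").mean()

study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=50)
print(study.best_params, study.best_value)
```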
  9. 9. Today’s Papers • Hyperparameter Optimization: “Massively Parallel Hyperparameter Tuning” [Li, et al.] • Neural Architecture Search: “Neural Architecture Optimization” [Luo, et al.]
  10. 10. Systems for ML Workshop (NeurIPS 2018) Massively Parallel Hyperparameter Tuning
  11. 11. Hyperparameter tuning as blackbox optimization (figure): a Hyperparameter Tuner (Grid Search, Bayesian Optimization, …) searches a space such as LR: 0.00001 ~ 0.1, Dropout: 0.0 ~ 0.5 (sketched below)
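A minimal sketch of the blackbox view above: the tuner only sees a search space (LR: 1e-5 to 0.1, Dropout: 0.0 to 0.5, as on the slide) and a score returned by the objective. Plain random search stands in for the tuner, and train_and_evaluate() is a hypothetical placeholder for a real training run.

```python
# Sketch: hyperparameter tuning as blackbox optimization (random search here).
# train_and_evaluate() is a hypothetical stand-in for "train a model and
# return a validation score"; only the two ranges come from the slide.
import math
import random

SEARCH_SPACE = {
    "lr": (1e-5, 1e-1),       # learning rate, sampled log-uniformly
    "dropout": (0.0, 0.5),    # dropout rate, sampled uniformly
}

def sample_config():
    lo, hi = SEARCH_SPACE["lr"]
    lr = 10 ** random.uniform(math.log10(lo), math.log10(hi))
    dropout = random.uniform(*SEARCH_SPACE["dropout"])
    return {"lr": lr, "dropout": dropout}

def train_and_evaluate(config):
    # Placeholder objective: pretend lr near 1e-3 and dropout near 0.3 are best.
    return -abs(math.log10(config["lr"]) + 3) - abs(config["dropout"] - 0.3)

best = max((sample_config() for _ in range(100)), key=train_and_evaluate)
print(best)
```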
  12. 12. Massively Parallel Hyperparameter Tuning • Hyperparameter tuning for settings with many parallel workers • Related topics: – Optuna – Successive Halving
  13. 13. Related Work: Successive Halving (SHA) • Train many configs with a small resource budget and repeatedly keep only the promising ones • A simple early-stopping-based tuning strategy • The building block of Hyperband [16, Li, et al.]
  14. 14. Related Work: Successive Halving (SHA) (figure): start with N configs, each given a small amount of resource (9 configs at the first rung)
  15. 15. Related Work: Successive Halving (SHA) (figure): keep the top N / η configs and give each η times more resource (η = 3 in the figure: 9 configs → 3)
  16. 16. Related Work: Successive Halving (SHA) (figure): repeat, leaving N / η² configs with η² times the resource (η = 3: 3 configs → 1)
  17. 17. Related Work: Successive Halving (SHA) (figure): continue until only the best-performing config remains with the largest resource budget (a code sketch follows below)
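The SHA loop from the preceding slides fits in a few lines. The evaluate() function below is a hypothetical placeholder for training a config with a given resource (e.g. epochs); the numbers mirror the figures (N = 9, η = 3).

```python
# Sketch: sequential Successive Halving (eta = 3, as in the slides).
import random

def evaluate(config, resource):
    # Placeholder objective: the score gets less noisy as resource grows.
    return config + random.gauss(0, 1.0 / resource)

def successive_halving(configs, min_resource=1, eta=3):
    resource = min_resource
    while len(configs) > 1:
        scores = {c: evaluate(c, resource) for c in configs}
        # Keep the top 1/eta configs and give each eta times more resource.
        k = max(1, len(configs) // eta)
        configs = sorted(configs, key=scores.get, reverse=True)[:k]
        resource *= eta
    return configs[0]

# N = 9 configs -> 3 -> 1, matching the figure.
best = successive_halving(configs=[random.random() for _ in range(9)])
print(best)
```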
  18. 18. Simple and Powerful! (Figure: Successive Halving vs. Random Search; SHA is faster and reaches better configurations)
  19. 19. Related Work: Synchronous SHA (figure): parallelize each rung across workers; the configs of rung 1, rung 2, and rung 3 are distributed over worker 1 and worker 2
  20. 20. Problem with Synchronous SHA • Each rung must finish completely before the next starts, so workers sit idle waiting for the slowest configs • (Figure: synchronous vs. asynchronous worker timelines over rung 1 / rung 2 / rung 3)
  21. 21. Proposed Method: Asynchronous SHA • Promote a config to the next rung as soon as it ranks in the top 1/η of the configs evaluated so far at its rung • If no config is ready for promotion, start a new config at the bottom rung (figure: η = 2)
  22. 22. Proposed Method: Asynchronous SHA • PROS: no synchronization barrier between rungs, so workers never idle (the “Massively Parallel” in the title) • CONS: early promotions can be mis-promotions made from few observations – the fraction of mis-promoted configs shrinks on the order of N^(-1/2) as the number of configs N grows • (A sketch of the promotion rule follows below)
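A rough sketch of the asynchronous promotion rule described above, in the spirit of the paper rather than its actual code: when a worker becomes idle, promote the best not-yet-promoted config from the highest rung where it sits in the top 1/η of results so far; otherwise start a fresh config at the bottom rung. Data structures and names are illustrative assumptions.

```python
# Sketch of an asynchronous SHA "get_job" rule (eta = 2, as in the slide).
# rungs[k] maps config -> score observed at rung k; promoted[k] records
# which configs were already promoted out of rung k.
import random

ETA = 2
rungs = [dict() for _ in range(4)]      # rung 0 uses the smallest budget
promoted = [set() for _ in range(4)]

def get_job():
    """Return (config, rung) for an idle worker, without ever blocking."""
    for k in reversed(range(len(rungs) - 1)):
        finished = sorted(rungs[k], key=rungs[k].get, reverse=True)
        top = finished[: len(finished) // ETA]            # top 1/eta so far
        candidates = [c for c in top if c not in promoted[k]]
        if candidates:
            promoted[k].add(candidates[0])
            return candidates[0], k + 1                   # promote one rung up
    return random.random(), 0                             # else: new config at the bottom

def report(config, rung, score):
    rungs[rung][config] = score
```

A worker loop would call get_job(), train the returned config with the budget associated with that rung, and then call report() with the resulting score; no worker ever waits for a rung to complete.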
  23. 23. Experiments: Single Worker Setting • Despite mis-promotions, Asynchronous SHA performs comparably to Synchronous SHA (figure: Synchronous SHA vs. Asynchronous SHA)
  24. 24. Experiments: Multi Worker Setting • With many workers, Asynchronous SHA finds good configs faster than Synchronous SHA (figure: Synchronous SHA vs. Asynchronous SHA)
  25. 25. Conclusion • An asynchronous extension of Successive Halving that scales hyperparameter tuning to massively parallel settings
  26. 26. NeurIPS 2018 Neural Architecture Optimization
  27. 27. Neural Architecture Search • Automatically searches for neural network architectures • Searched architectures have reached SOTA, e.g. on ImageNet
  28. 28. Components of NAS (figure): search space (Chain-structured Space, Tree-structured Space, Multi-branch Network, Cell Search Space, …), performance estimation (Full Evaluation, Lower Fidelities, Learning Curve Extrapolation, One-shot Architecture Search, …), and search strategy (Reinforcement Learning, Evolutionary Search, Bayesian Optimization, Monte Carlo Tree Search, …)
  29. 29. Neural Architecture Optimization • Performs architecture search by gradient-based optimization in a continuous embedding space • Highlights: – SOTA on CIFAR (+cutout) – related to the Neural Architecture Search work at PFN in 2018
  30. 30. Related Work: NASNet Search Space • [16, Zoph et al.]: neural architecture search with reinforcement learning • Searching in the NASNet space achieved SOTA on ImageNet [17, Zoph et al.] – the search targets a reusable cell that is stacked to form the full network, analogous to ResNet's ResBlock (a toy stacking sketch follows below)
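To illustrate the cell-stacking idea (not the actual NASNet cell), the toy sketch below stacks one repeated cell the way ResNet stacks ResBlocks; in a real NAS setting the internals of the cell are what the search discovers.

```python
# Sketch: a cell-based search space builds the whole network by stacking one
# searched cell. ToyCell is an illustrative stand-in, not the NASNet cell.
import torch.nn as nn

class ToyCell(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.op = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
        )

    def forward(self, x):
        return x + self.op(x)  # residual-style connection around the cell

def build_network(num_cells=6, channels=32, num_classes=10):
    stem = nn.Conv2d(3, channels, kernel_size=3, padding=1)
    cells = [ToyCell(channels) for _ in range(num_cells)]
    head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                         nn.Linear(channels, num_classes))
    return nn.Sequential(stem, *cells, head)
```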
  31. 31. Proposed Method: NAONet • Idea: map discrete architectures into a continuous space and optimize there, instead of searching the discrete space directly • Uses a cell-based search space similar to NASNet
  32. 32. NAONet components (figure): an LSTM Encoder maps an architecture to an Embedding Vector; FC Layers predict Accuracy from the embedding; an LSTM Decoder maps the embedding back to an architecture
  33. 33. Encoder-decoder (figure): the LSTM Encoder and LSTM Decoder are trained so that an architecture can be reconstructed from its Embedding Vector
  34. 34. Multi-task loss (figure): the Embedding is trained jointly on reconstruction (LSTM Decoder) and Accuracy Prediction (FC Layers)
  35. 35. Search step (figure): starting from the embedding of a trained architecture, take a gradient step on the Embedding Vector in the direction that increases the predicted accuracy (FC Layers)
  36. 36. Search step, continued (figure): the updated Embedding Vector corresponds to an architecture with a higher predicted accuracy
  37. 37. Decoding (figure): the optimized Embedding Vector is passed through the LSTM Decoder to obtain a new, concrete architecture (a minimal sketch of this loop follows below)
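A minimal sketch of the encoder / predictor / decoder loop from slides 32 to 37, assuming architectures are represented as short token sequences. The dimensions, the greedy one-shot decoding, and the step size are simplifying assumptions; the actual NAO model is trained with the multi-task loss and an autoregressive decoder.

```python
# Sketch of NAO-style search in a continuous embedding space.
# Architecture "tokens", dimensions, and step size are illustrative assumptions.
import torch
import torch.nn as nn

VOCAB, EMB, HID = 16, 32, 64

token_emb = nn.Embedding(VOCAB, EMB)
encoder = nn.LSTM(EMB, HID, batch_first=True)          # architecture -> embedding
predictor = nn.Sequential(nn.Linear(HID, HID), nn.ReLU(), nn.Linear(HID, 1))
decoder_cell = nn.LSTM(HID, HID, batch_first=True)     # embedding -> architecture
decoder_out = nn.Linear(HID, VOCAB)

def encode(arch_tokens):
    _, (h, _) = encoder(token_emb(arch_tokens))
    return h[-1]                                        # (batch, HID) embedding

def search_step(arch_tokens, step_size=1.0):
    e = encode(arch_tokens).detach().requires_grad_(True)
    acc = predictor(e).sum()                            # predicted accuracy (surrogate)
    acc.backward()
    e_new = e + step_size * e.grad                      # move toward higher predicted accuracy
    # Decode the improved embedding back into an architecture (greedy sketch).
    steps, _ = decoder_cell(e_new.detach().unsqueeze(1).repeat(1, 8, 1))
    return decoder_out(steps).argmax(dim=-1)            # token ids of the new architecture

new_arch = search_step(torch.randint(0, VOCAB, (1, 8)))
print(new_arch)
```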
  38. 38. NAONet: the overall search procedure (figure)
  39. 39. Experiments: SOTA on CIFAR-10 (table: NAONet achieves state-of-the-art test error on CIFAR-10)
  40. 40. Experiments: Transferring CIFAR-10 architectures to CIFAR-100 (table: also achieves SOTA)
  41. 41. Experiments: Transferring PTB architectures to WikiText-2 (table)
  42. 42. Conclusion • Neural Architecture Search via gradient-based optimization in a continuous embedding space • Achieves SOTA on CIFAR
  43. 43. A define-by-run style hyperparameter search framework: • no fat configs or poor control syntax • high modularity • high representation power (see the example below)
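Define-by-run means the search space is constructed by the program as it runs, so conditional and looped parameters need no static configuration file. A small example with Optuna (the framework referenced earlier in the talk); the parameter names and the toy objective are illustrative.

```python
# Define-by-run: the search space is built while the objective runs,
# so conditional / looped parameters need no static config file.
import optuna

def objective(trial):
    n_layers = trial.suggest_int("n_layers", 1, 3)
    units = [trial.suggest_int(f"units_l{i}", 16, 256)
             for i in range(n_layers)]                 # space depends on n_layers
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    # Toy objective standing in for a real training run.
    return sum(units) * lr

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=20)
print(study.best_params)
```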
