12. DEEP LEARNING INSIGHT
Figure: Pedestrian detection recall rate, traditional algorithm vs. deep learning (y-axis 0% to 100%). Categories: overall, passenger channel, indoor public area, sunny day, rainy day, winter, summer.
Figure: Vehicle feature accuracy increased by deep learning, traditional algorithm vs. deep learning (y-axis 70 to 100). Categories: vehicle color, brand, model, sun blade, safe belt, phone calling.
Surveillance cameras
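The pedestrian detection chart reports recall rate. As a reminder of what that metric measures, here is a minimal sketch using the standard definition, recall = true positives / (true positives + false negatives). The function name and the counts are hypothetical and are not taken from the slide.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Fraction of the pedestrians actually present that the detector found."""
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        raise ValueError("recall is undefined when there are no actual positives")
    return true_positives / actual_positives


# Hypothetical counts, for illustration only (not from the slide):
# 90 pedestrians detected, 10 missed.
print(f"recall = {recall(true_positives=90, false_negatives=10):.0%}")
```

Recall says nothing about false alarms, so a detector is usually also judged on precision.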
14. AI detects growth problems in children
Detecting growth-related problems in children
requires calculating their bone age. But it’s an
antiquated process that requires radiologists to
match X-rays with images in a 1950s textbook.
Massachusetts General Hospital, which conducts
the largest hospital-based research program in
the United States, developed an automated
bone-age analyzer built on NVIDIA cuDNN and the
NVIDIA DIGITS DevBox. The system is 99%
accurate and delivers test results in seconds
versus days.
15. Deep Learning for early detection of Age-related Macular Degeneration
– UW developed a deep learning system to
read OCT scans and automatically detect
Age-related Macular Degeneration.
– There were 5.4 Million Scans in 2014
– In under one month of training, the
system is over 90% accurate
"80% of people above 80 have Age-related Macular Degeneration, and it is treatable."
- Aaron Lee, Assistant Professor of Ophthalmology, University of Washington
57. NVIDIA DGX-1
World's first deep learning supercomputer
Designed for deep learning
170 TFLOPS FP16
8x Tesla P100 in a hybrid cube mesh
Accelerates major AI frameworks
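The two headline figures on this slide, 170 TF FP16 and 8 Tesla P100 GPUs, imply a per-GPU throughput. A minimal sketch of that arithmetic follows. The constant and function names are hypothetical, and the calculation assumes the system figure is simply the sum over identical GPUs.

```python
# Figures quoted on the DGX-1 slide.
DGX1_FP16_TFLOPS = 170.0
GPUS_PER_DGX1 = 8


def per_gpu_tflops(system_tflops: float, gpu_count: int) -> float:
    """Split a system-level throughput figure evenly across its GPUs."""
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    return system_tflops / gpu_count


# 170 / 8 = 21.25 TFLOPS of FP16 per Tesla P100.
print(f"{per_gpu_tflops(DGX1_FP16_TFLOPS, GPUS_PER_DGX1):.2f} TFLOPS FP16 per GPU")
```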
58. Strong scaling
One strong node is faster than many weak nodes
Figure: two panels, "VASP performance" and "Caffe AlexNet performance", each titled "Single P100 PCIe Node vs Lots of Weak Nodes". X-axis: # of CPU server nodes. Y-axis: speed-up vs. 1 CPU server node. Series: 2x P100, 4x P100, 8x P100, with reference points at 32 CPU nodes and 64 CPU nodes.
CPU: Dual Socket Intel E5-2680v3 12 cores, 128 GB DDR4 per node, FDR IB
VASP 5.4.1_05Feb16, Si-Huge Dataset. Results for 16 and 32 nodes are estimated, assuming the same scaling as from 4 to 8 nodes.
Caffe AlexNet scaling data: https://software.intel.com/en-us/articles/caffe-training-on-multi-node-distributed-memory-systems-based-on-intel-xeon-processor-e5
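The VASP footnote says the 16- and 32-node points are estimated from the scaling observed between 4 and 8 nodes. The slide does not give the exact method, so the sketch below shows one plausible reading: assume every further doubling of the node count multiplies the speed-up by the same factor as the 4-to-8 doubling did. The function name and the input speed-ups are hypothetical and are not values from the chart.

```python
import math


def extrapolate_speedup(speedup_4: float, speedup_8: float, target_nodes: int) -> float:
    """Estimate the speed-up at target_nodes from measurements at 4 and 8 nodes.

    Assumes each doubling of the node count scales the speed-up by the same
    factor as the measured 4 -> 8 doubling (an assumption, not the slide's
    documented method).
    """
    if target_nodes < 8:
        raise ValueError("only extrapolates beyond the 8-node measurement")
    per_doubling = speedup_8 / speedup_4
    doublings = math.log2(target_nodes / 8)
    return speedup_8 * per_doubling ** doublings


# Hypothetical measurements, for illustration only.
measured_4, measured_8 = 3.2, 5.6
for nodes in (16, 32):
    estimate = extrapolate_speedup(measured_4, measured_8, nodes)
    print(f"{nodes} nodes: estimated speed-up {estimate:.2f}x")
```

Because the per-doubling factor is below 2 whenever scaling is imperfect, the estimated curve flattens as nodes are added, which is the behavior the strong-scaling comparison relies on.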
59. INTRODUCING DGX SATURNV
World’s Most Efficient AI Supercomputer
Fastest AI Supercomputer in TOP500
– 4.9 Petaflops Peak FP64 Performance
– 19.6 Petaflops DL FP16 Performance
– 124 NVIDIA DGX-1 Server Nodes
Most Energy Efficient Supercomputer
– #1 on Green500 List
– 9.5 GFLOPS per Watt
– 2x More Efficient than Xeon Phi System
Rocket for Cancer Moonshot
– CANDLE Development Platform
– Optimized Frameworks
– DGX-1 as Single Common Platform
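The DGX SATURNV figures can be broken down per node and per GPU. A minimal sketch follows, using only numbers from the slides: 124 DGX-1 nodes, 4.9 petaflops FP64, 19.6 petaflops FP16, and the eight Tesla P100 GPUs per DGX-1 stated on the DGX-1 slide. The variable and function names are hypothetical, and the breakdown assumes throughput divides evenly across nodes and GPUs.

```python
# Figures quoted on the slides.
NODES = 124
GPUS_PER_NODE = 8
PEAK_FP64_PFLOPS = 4.9
DL_FP16_PFLOPS = 19.6


def pflops_to_tflops(pflops: float) -> float:
    """Convert petaflops to teraflops."""
    return pflops * 1000.0


total_gpus = NODES * GPUS_PER_NODE
fp16_per_node_tflops = pflops_to_tflops(DL_FP16_PFLOPS) / NODES
fp64_per_gpu_tflops = pflops_to_tflops(PEAK_FP64_PFLOPS) / total_gpus
fp16_to_fp64_ratio = DL_FP16_PFLOPS / PEAK_FP64_PFLOPS

print(f"total GPUs:         {total_gpus}")
print(f"FP16 per node:      {fp16_per_node_tflops:.1f} TFLOPS")
print(f"FP64 per GPU:       {fp64_per_gpu_tflops:.2f} TFLOPS")
print(f"FP16 to FP64 ratio: {fp16_to_fp64_ratio:.1f}")
```

The per-node FP16 figure comes out at about 158 TFLOPS, somewhat below the 170 TF quoted for a single DGX-1. The slides do not explain the difference.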
60. AI PLATFORM TO ACCELERATE CANCER RESEARCH
To speed advances in the fight against cancer, the
Cancer Moonshot initiative unites the Department
of Energy, the National Cancer Institute and other
agencies with researchers at Oak Ridge, Lawrence
Livermore, Argonne, and Los Alamos National
Laboratories. NVIDIA is collaborating with the labs
to help accelerate their AI framework called
CANDLE as a common discovery platform, with
the goal of achieving 10X annual increases in
productivity for cancer researchers.
61. NVIDIA Deep Learning Platform
APPLICATIONS
– Computer vision: object detection, image classification
– Speech and audio: voice recognition, translation
– Behavior: recommendation engines, sentiment analysis
FRAMEWORKS
– Mocha.jl
DEEP LEARNING SDK
– Deep learning: TensorRT
– Math libraries: cuBLAS, cuSPARSE, cuFFT
– GPU interconnect: NCCL
GPU PLATFORM
– Cloud GPU, Server, Tesla P100, Tesla K80/M40/M4, P100/P40/P4, DGX-1, Jetson TX1, DRIVE PX 2