8. 8
RAPIDS ACCELERATOR FOR SPEEDING UP SPARK
DISTRIBUTED SCALE-OUT SPARK APPLICATIONS
APACHE SPARK CORE
●Spark SQL API
●DataFrame API
●Spark Shuffle
RAPIDS Accelerator for Spark
Spark SQL API / DataFrame API path:
if gpu_enabled(operation, data_type)
    call out to RAPIDS
else
    execute standard Spark operation
●JNI bindings: mapping from Java/Scala to C++
●RAPIDS libcudf (C++ libraries)
Spark Shuffle path:
●Custom implementation of Spark Shuffle
●Optimized to use RDMA and GPU-to-GPU direct communication
●JNI bindings: mapping from Java/Scala to C++
●UCX libraries
CUDA
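The per-operation fallback rule on this slide (use the GPU when the operation and data type are supported, otherwise run the standard Spark operation) can be sketched in plain Python. This is only an illustration of the dispatch idea, not the plugin's actual code: the support table, the `GPU_OPS` / `CPU_OPS` maps and the `execute` helper are hypothetical names, and the "GPU" implementations are ordinary Python functions standing in for libcudf calls.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical support table: (operation, data_type) pairs that are
# assumed to have a GPU implementation in this sketch.
GPU_SUPPORTED: Set[Tuple[str, str]] = {
    ("sum", "int"),
    ("sum", "float"),
    ("max", "int"),
}

# Stand-ins for the two backends. In the real stack the GPU side would
# go through JNI bindings to libcudf; here both are plain Python.
GPU_OPS: Dict[str, Callable] = {"sum": sum, "max": max}
CPU_OPS: Dict[str, Callable] = {"sum": sum, "max": max}


def gpu_enabled(operation: str, data_type: str) -> bool:
    """Return True when this (operation, data_type) pair is in the support table."""
    return (operation, data_type) in GPU_SUPPORTED


def execute(operation: str, data_type: str, values: List) -> Tuple[str, object]:
    """Dispatch one operation and report which backend handled it."""
    if gpu_enabled(operation, data_type):
        return "gpu", GPU_OPS[operation](values)   # call out to RAPIDS
    return "cpu", CPU_OPS[operation](values)       # standard Spark operation


if __name__ == "__main__":
    print(execute("sum", "int", [1, 2, 3]))
    print(execute("max", "string", ["a", "b"]))
```

The point of the design is that the decision is made per operation and per data type, so an unsupported combination falls back to the CPU path without the application code changing.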
RAPIDS: open-source software for GPU-accelerating data analytics and machine learning
https://developer.nvidia.com/rapids
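Because the accelerator plugs in underneath the Spark SQL and DataFrame APIs, enabling it is largely a matter of Spark configuration. The sketch below builds a small set of settings and turns them into `spark-submit` arguments. The helper names `rapids_spark_conf` and `to_submit_args` are hypothetical, and the keys shown (`spark.plugins`, `spark.rapids.sql.enabled`, `spark.rapids.sql.explain`) should be checked against the documentation for the plugin release in use. It also assumes the RAPIDS Accelerator jar is already on the classpath.

```python
from typing import Dict, List


def rapids_spark_conf(explain_fallbacks: bool = True) -> Dict[str, str]:
    """Build a minimal, illustrative set of Spark settings for the plugin."""
    conf = {
        # Load the RAPIDS Accelerator SQL plugin.
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        # Turn GPU execution of supported SQL/DataFrame operations on.
        "spark.rapids.sql.enabled": "true",
    }
    if explain_fallbacks:
        # Ask the plugin to log operations that stay on the CPU.
        conf["spark.rapids.sql.explain"] = "NOT_ON_GPU"
    return conf


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render the settings as spark-submit `--conf key=value` arguments."""
    args: List[str] = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(rapids_spark_conf())))
```

The same dictionary could instead be applied through `SparkSession.builder.config(key, value)` in a PySpark application; the application's SQL and DataFrame code itself stays unchanged.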
9. 9
USE CASE
GTC21: Deep-Learning Data-Pipeline Optimization for Network Data Analysis in SK Telecom by Employing Spark Rapids for Custom Data Source
https://www.nvidia.com/en-us/on-demand/session/gtcspring21-s31400/