Solving Various Problems with GPGPU
- 1. Trying the GPU on various small problems
Terumi YAMADA
- 2. Self-introduction
• Terumi Yamada (in training)
• Loves SIMD
• Twitter: telmin_orca
- 3. Contents
• Self-introduction
• Lead-in
• Solving the traveling salesman problem
• Running Aobench
• Summary
- 11. NVIDIA
• GeForce GTX 580
• Fermi
• 512 CUDA cores
• 3GB RAM
• PCIe 2.0
- 12. AMD
• Radeon HD 7970
• GCN
• 2048 stream processors
• 3GB RAM
• PCIe 3.0
- 13. HOST
• Intel Core i7 2600K
• Sandy Bridge
• 8GB RAM
- 16. Solution methods
• Genetic algorithm
• Ant colony optimization
• μ-opt method
• LK (Lin-Kernighan) method
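To make the opt-style local search in the list above concrete, here is a minimal sketch of 2-opt, the simplest member of that family. It is an illustration only: the slides do not say which method or which code was actually benchmarked, and the names `tour_length`, `two_opt`, and `cities` are hypothetical.

```python
import math
import random


def tour_length(points, tour):
    """Total length of the closed tour that visits points in the given order."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def two_opt(points, tour):
    """Reverse tour segments for as long as a reversal shortens the tour."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share a city, nothing to swap
                a, b = points[tour[i]], points[tour[i + 1]]
                c, d = points[tour[j]], points[tour[(j + 1) % n]]
                # Change in length when edges (a,b),(c,d) become (a,c),(b,d).
                delta = (math.dist(a, c) + math.dist(b, d)
                         - math.dist(a, b) - math.dist(c, d))
                if delta < -1e-12:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour


# Hypothetical input: 30 random cities in the unit square.
random.seed(0)
cities = [(random.random(), random.random()) for _ in range(30)]
start = list(range(len(cities)))
best = two_opt(cities, start)
```

Each `delta` depends only on the four cities involved, so the candidate pairs can be evaluated independently of one another. That independence is what makes this kind of search a plausible fit for a GPU.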
- 22. Result
          CPU       NVIDIA    AMD
100,000   152.241   114.02    2472.06
120,000   235.05    168.58    3487.41
140,000   296.395   266.211
160,000   427.161   328.547
- 25. …?
\(^o^)/
- 27. Aobench?
• Ambient occlusion benchmark.
• Created by @syoyo
• A floating-point arithmetic benchmark
- 29. What is expensive?
• Intersection
• Sphere * 3 + Plane = 4
• AO samples: 64 * 64 = 4096
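The intersection test named above can be sketched as follows. This is a Python rendering of the usual ray-sphere test, not the code that was benchmarked. The function name `ray_sphere_intersect` and the per-pixel count are illustrative, and the count takes the 64 * 64 sample figure at face value and ignores primary rays and any subsampling.

```python
import math


def ray_sphere_intersect(origin, direction, center, radius):
    """Return the distance to the nearest hit in front of the ray, or None.

    direction is assumed to be normalized.
    """
    rs = [o - c for o, c in zip(origin, center)]
    b = sum(r * d for r, d in zip(rs, direction))
    c = sum(r * r for r in rs) - radius * radius
    disc = b * b - c              # discriminant of t^2 + 2*b*t + c = 0
    if disc <= 0.0:
        return None               # the ray misses or only grazes the sphere
    t = -b - math.sqrt(disc)      # nearer of the two roots
    return t if t > 0.0 else None


# Intersection tests per pixel for the AO rays alone (assumed counts).
objects = 3 + 1                   # three spheres plus one plane
ao_samples = 64 * 64
tests_per_pixel = objects * ao_samples

hit = ray_sphere_intersect((0.0, 0.0, 0.0), (0.0, 0.0, -1.0), (0.0, 0.0, -3.0), 1.0)
miss = ray_sphere_intersect((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, -3.0), 1.0)
```

Every sample ray runs this test against every object, which is why the intersection routine dominates the running time.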
- 31. Result
                          CPU      NVIDIA   AMD
256 * 256     64 * 64     6.30     0.057    0.061
512 * 512     64 * 64     24.58    0.213    0.131
1024 * 1024   64 * 64     96.735   0.831    0.4462
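The ratios implied by this table are easier to read than the raw figures. The sketch below assumes the values are elapsed times, so lower is better. The slide gives no unit, and the names `results` and `speedups` are illustrative.

```python
# Values copied from the Result slide; interpreting them as elapsed times
# is an assumption, since the slide gives no unit.
results = {
    "256 * 256": {"CPU": 6.30, "NVIDIA": 0.057, "AMD": 0.061},
    "512 * 512": {"CPU": 24.58, "NVIDIA": 0.213, "AMD": 0.131},
    "1024 * 1024": {"CPU": 96.735, "NVIDIA": 0.831, "AMD": 0.4462},
}

# CPU time divided by GPU time, per image size and per GPU.
speedups = {
    size: {gpu: row["CPU"] / row[gpu] for gpu in ("NVIDIA", "AMD")}
    for size, row in results.items()
}

for size, row in speedups.items():
    print(f"{size:>12}: NVIDIA x{row['NVIDIA']:.1f}, AMD x{row['AMD']:.1f}")
```

Under that reading, the NVIDIA ratio stays between roughly 110x and 116x across all three sizes, while the AMD ratio grows from about 103x to about 217x as the image gets larger.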
- 32. (ASCII art of a puzzled character, captioned "I have no idea what is going on.")
- 33. Device type: Unknown
???
Max resource 2D width/height: 16384/16384
Total GPU memory size: 3072 MB
Total CPU cached space size: 508 MB
Total CPU uncached space size: 1788 MB
GPU engine clock: 925 MHz
GPU memory clock: 1375 MHz
Number of timing loops: 100
[ 16 bytes] CPU->GPU= 800.000 KB/sec, GPU->CPU 533.333 KB/sec
[ 32 bytes] CPU->GPU= 1.600 MB/sec, GPU->CPU 1.067 MB/sec
[ 64 bytes] CPU->GPU= 2.133 MB/sec, GPU->CPU 2.133 MB/sec
[ 128 bytes] CPU->GPU= 2.560 MB/sec, GPU->CPU 4.267 MB/sec
[ 256 bytes] CPU->GPU= 8.533 MB/sec, GPU->CPU 8.533 MB/sec
[ 512 bytes] CPU->GPU= 17.067 MB/sec, GPU->CPU 25.600 MB/sec
[ 1024 bytes] CPU->GPU= 51.200 MB/sec, GPU->CPU 34.133 MB/sec
[ 2048 bytes] CPU->GPU= 102.400 MB/sec, GPU->CPU 68.267 MB/sec
[ 4096 bytes] CPU->GPU= 204.800 MB/sec, GPU->CPU 204.800 MB/sec
[ 8192 bytes] CPU->GPU= 409.600 MB/sec, GPU->CPU 409.600 MB/sec
[ 16384 bytes] CPU->GPU= 409.600 MB/sec, GPU->CPU 819.200 MB/sec
[ 32768 bytes] CPU->GPU= 1.638 GB/sec, GPU->CPU 1.638 GB/sec
[ 65536 bytes] CPU->GPU= 2.185 GB/sec, GPU->CPU 3.277 GB/sec
...
[ 4194304 bytes] CPU->GPU= 6.658 GB/sec, GPU->CPU 4.033 GB/sec
[ 8388608 bytes] CPU->GPU= 6.658 GB/sec, GPU->CPU 3.884 GB/sec
[ 16777216 bytes] CPU->GPU= 6.684 GB/sec, GPU->CPU 3.233 GB/sec
[ 33554432 bytes] CPU->GPU= 6.697 GB/sec, GPU->CPU 2.993 GB/sec
[ 67108864 bytes] CPU->GPU= 6.697 GB/sec, GPU->CPU 2.870 GB/sec
[ 134217728 bytes] CPU->GPU= 6.704 GB/sec, GPU->CPU 2.789 GB/sec
[ 268435456 bytes] CPU->GPU= 6.699 GB/sec, GPU->CPU 2.767 GB/sec
[ 536870912 bytes] CPU->GPU= 6.705 GB/sec, GPU->CPU 2.797 GB/sec
[1073741824 bytes] CPU->GPU= 6.705 GB/sec, GPU->CPU 2.771 GB/sec
calResAllocRemote2D() returned an error when trying to allocate 1874853888 bytes (uncached)!
Peak CPU->GPU Bandwidth = 6.705 GB/sec [data size = 536870912 bytes]
Peak GPU->CPU Bandwidth = 4.369 GB/sec [data size = 131072 bytes]
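One way to read the small-transfer rows of this log is a "fixed overhead plus size over peak bandwidth" cost model. The sketch below is an interpretation rather than anything stated on the slide. It assumes decimal units (1 KB = 1000 bytes) and treats the 16-byte copy as pure overhead. The names `overhead` and `effective_bandwidth` are illustrative.

```python
# Figures read from the log (CPU->GPU direction, decimal units assumed).
small_bytes = 16
small_rate = 800.0e3     # 800.000 KB/sec reported for 16-byte transfers
peak_rate = 6.705e9      # reported peak CPU->GPU bandwidth

# If a 16-byte copy is almost pure overhead, its duration estimates the
# fixed cost of one transfer.
overhead = small_bytes / small_rate   # seconds per transfer


def effective_bandwidth(nbytes, overhead=overhead, peak=peak_rate):
    """Bytes per second under a 'fixed overhead + nbytes / peak' cost model."""
    return nbytes / (overhead + nbytes / peak)


for nbytes in (16, 4096, 65536, 4194304):
    print(f"{nbytes:>8} bytes: {effective_bandwidth(nbytes) / 1e6:10.3f} MB/sec")
```

With these inputs the fixed cost comes out at about 20 microseconds per transfer. The model then predicts about 2.2 GB/sec at 65536 bytes, against 2.185 GB/sec in the log. This suggests that per-transfer overhead, not the bus, limits the small copies.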
- 34. ????
GeForce GTX 580
Quick Mode
Host to Device Bandwidth, 1 Device(s), Paged memory, direct access
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 5561.7
Device to Host Bandwidth, 1 Device(s), Paged memory, direct access
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 5466.2
Device to Device Bandwidth, 1 Device(s)
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 138261.9
- 37. !
          NVIDIA     AMD
1000      0.355824   1.70634
          0.16186    0.7224
10000     3.54601    14.1305
          1.697      6.1982
100000    35.4747    128.583
          16.213     58.0289
- 39. • For GPGPU, go with the GeForce GTX 580
• The Radeon HD 7970 is…
• A slow starter, with a bomb in its leg
• If the kernel gets larger…