Understanding the New C++11 Standard and Parallel Computing with the AMP Library

Dev306

王建興
qing.chwang at gmail.com

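About the Speaker
Current position
CTO, 聖藍科技
Interests
Web application system development
Translations
Thinking in Java 4th Edition, Traditional Chinese edition
Thinking in Java 2nd Edition, Traditional Chinese edition
Essential C++, Traditional Chinese edition
Column
Programmer column, iThome 電腦報
Contact: qing at gmail.com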

Agenda
The new C++ standard: C++11
Parallel computing with C++ AMP

C++11
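C++11, formerly known as C++0x, is ISO/IEC 14882:2011
C++11 is the current official standard of the C++ programming language
It replaces the second edition of the standard, ISO/IEC 14882:2003
It was approved in August 2011, hence the name C++11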

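Goals of C++11
Make C++ a better language for systems programming and library building
Use C++ directly rather than providing specialized development languages for particular application domains
Make C++ easier to teach and learn
Increase uniformity, strengthen safety, and provide facilities for novices
How to achieve this
Provide higher-level, more abstract facilities while maintaining or even improving performance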

*http://www.stroustrup.com/C++11FAQ.html#aims

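Improvements in C++11: Core Language

Run-time performance improvements, for example:
rvalue references
Changes to the definition of POD
Build-time performance improvements, for example:
extern template
Usability enhancements, for example:
Type inference
Range-based for loop
Functionality improvements, for example:
static assertions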

C++11: from VC10 to VC11 (1/3)

C++11: from VC10 to VC11 (2/3)

C++11: from VC10 to VC11 (3/3)

Concurrency: from VC10 to VC11

C++11 features covered today (1/2)
Right angle brackets
extern template
enum class
Changes to the definition of POD
Local and unnamed types as template arguments
Range-based for statement

C++11 features covered today (2/2)
auto Keyword
decltype Type Specifier
Lambda Expressions
rvalue references
static_assert
nullptr

Right Angle Brackets
list<vector<string>> lvs;

In C++98 this is a syntax error
The compiler treats >> as the right-shift operator >>
C++11 compilers understand this syntax

extern template
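Avoids redundant template instantiations by explicitly stating that the template is instantiated elsewhere
The compiler need not instantiate it again; the instantiated code is located at link time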
#include "MyVector.h"
extern template class MyVector<int>; // Suppresses implicit instantiation below --
// MyVector<int> will be explicitly instantiated elsewhere
void foo(MyVector<int>& v)
{
// use the vector in here
}

-------------------------------------------

#include "MyVector.h"
template class MyVector<int>; // Make MyVector available to clients (e.g., of the shared library)

*http://www.stroustrup.com/C++11FAQ.html#extern-templates
*http://zevoid.blogspot.tw/2012/04/c11-extern-template.html
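The two-file arrangement above can be tried out in a single file. The following is only a sketch: the MyVector class and the fill() and extern_template_demo() helpers are hypothetical stand-ins, and the explicit instantiation definition would normally sit in a different .cpp file.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the MyVector declared in "MyVector.h".
template <typename T>
class MyVector {
public:
    void push_back(const T& value) { data_.push_back(value); }
    std::size_t size() const { return data_.size(); }
private:
    std::vector<T> data_;
};

// Explicit instantiation declaration: no implicit instantiation is required here.
extern template class MyVector<int>;

// Client code that uses MyVector<int>.
std::size_t fill(MyVector<int>& v, int count) {
    for (int i = 0; i < count; ++i) v.push_back(i);
    return v.size();
}

// Explicit instantiation definition. It normally lives in exactly one other
// .cpp file; it follows the declaration here only so the sketch links alone.
template class MyVector<int>;

// Hypothetical helper that exercises the client code.
void extern_template_demo() {
    MyVector<int> v;
    assert(fill(v, 3) == 3);
}
```

Calling extern_template_demo() from main() runs the check.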

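Problems with Traditional C++ enums
Traditional enums convert implicitly to int, which causes errors when an enumeration is not meant to act as an integer
The enumerators of a traditional enum are visible throughout the enclosing scope, which easily causes name clashes
The underlying type of an enum cannot be specified, which leads to confusion and compatibility problems
Forward declaration is not possible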

http://www.stroustrup.com/C++11FAQ.html#enum

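enum class (a higher-level enum)
Scoped and strongly typed enums
Not supported in VC10; supported in VC11
enum Alert { green, yellow, election, red }; // traditional enum

enum class Color { red, blue }; // scoped and strongly typed enum

enum class TrafficLight { red, yellow, green };

Alert a = 7; // error (as ever in C++)
Color c = 7; // error: no int->Color conversion
int a2 = red; // ok: Alert->int conversion
int a3 = Alert::red; // error in C++98; ok in C++11
int a4 = blue; // error: blue not in scope
int a5 = Color::blue; // error: no Color->int conversion

Color a6 = Color::blue; // ok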
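A minimal sketch of the lines above that do compile. The to_int() and enum_class_demo() helpers are hypothetical names added for illustration; the enumerations are the ones from the slide.

```cpp
#include <cassert>

enum Alert { green, yellow, election, red };     // traditional enum
enum class Color { red, blue };                  // scoped and strongly typed
enum class TrafficLight { red, yellow, green };  // no clash with Color::red

// A scoped enumerator converts to int only through an explicit cast.
int to_int(Color c) { return static_cast<int>(c); }

void enum_class_demo() {
    int a2 = red;            // unqualified red is Alert::red, converted implicitly
    Color c = Color::blue;   // scoped enumerators must be qualified
    assert(a2 == 3);
    assert(to_int(c) == 1);
    assert(c != Color::red);
    assert(static_cast<int>(TrafficLight::green) == 2);
}
```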

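Specifying the Underlying Type of an enum
This also determines the storage size of the enum
Previously, the size was implementation-defined
The underlying type must be an integral type; for enum class the default is int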
enum class Color : char { red, blue };

enum class TrafficLight { red, yellow, green };

enum E { E1 = 1, E2 = 2, Ebig = 0xFFFFFFF0U }; // how big is an E?

enum EE : unsigned long { EE1 = 1, EE2 = 2, EEbig = 0xFFFFFFF0U };
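The effect of the underlying type can be checked at compile time with <type_traits>. This is a sketch; underlying_types_match() is a hypothetical helper.

```cpp
#include <cassert>
#include <type_traits>

enum class Color : char { red, blue };
enum class TrafficLight { red, yellow, green };
enum EE : unsigned long { EE1 = 1, EE2 = 2, EEbig = 0xFFFFFFF0U };

static_assert(sizeof(Color) == sizeof(char), "Color is stored as a char");
static_assert(std::is_same<std::underlying_type<TrafficLight>::type, int>::value,
              "enum class defaults to int");
static_assert(std::is_same<std::underlying_type<EE>::type, unsigned long>::value,
              "EE is stored as unsigned long");

// Hypothetical helper: the same facts expressed through sizeof at run time.
bool underlying_types_match() {
    return sizeof(EE) == sizeof(unsigned long) && sizeof(TrafficLight) == sizeof(int);
}
```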

Forward Declaration of enums
C++11 allows an enum to be forward-declared when its underlying type is known

enum class Color_code : char;
void foobar(Color_code* p);
// ...
enum class Color_code : char { red, yellow, green, blue };

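The Significance of enum class
Makes enumerations higher-level, more abstract, and more intuitive
Even Java, a language that came after C++, had no dedicated enum type at first; one was added in J2SE 5.0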

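Changes to the Definition of POD
POD (Plain Old Data)
Types that meet this definition have an object layout compatible with C
In C++98, POD means data like a C struct
memcpy() can be used on it
memset() can be used to initialize it
struct S { int a; }; // S is a POD
struct SS { int a; SS(int aa) : a(aa) { } }; // SS is not a POD
struct SSS { virtual void f(); /* ... */ }; // SSS is not a POD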

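POD in C++11
struct S { int a; }; // S is a POD
struct SS { int a; SS(int aa) : a(aa) { } }; // SS is standard-layout, but strictly still not a POD
struct SSS { virtual void f(); /* ... */ }; // SSS is not a POD

In C++11, SS is a standard-layout, trivially copyable type, so memcpy() on it is fine
The constructor does not affect the layout; it only removes the trivial default constructor, which is why SS is strictly not a POD
SSS is not a POD, because it contains a virtual table pointer
C++11 defines a POD as a type that is both trivial and standard-layout
Not supported in VC10; supported in VC11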

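Trivial types or structs
A trivial default constructor; this can use the defaulted-constructor syntax
For example, SomeConstructor() = default;
A trivial copy constructor, which may use the default syntax
A trivial copy assignment operator, which may use the default syntax
A trivial destructor, which must not be virtual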

*http://zh.wikipedia.org/zh-tw/C%2B%2B11

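Standard-layout types or structs
Has only non-static data members that are themselves of standard-layout type
Has the same access control for all non-static data members
Has no virtual functions and no virtual base classes
Has only standard-layout base classes
Has no base class of the same type as the first defined non-static data member
Either has no base classes with non-static data members, or has no non-static data members in the most derived class and at most one base class with non-static data members. In essence, only one class in the hierarchy may have non-static data members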

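Criteria for POD
A class/struct/union is considered a POD if it meets three conditions
It is a trivial type or struct
It is a standard-layout type or struct
All of its non-static data members and base classes are PODs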
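The criteria above can be checked with the C++11 type traits. This is a sketch: is_pod_like() is a hypothetical helper that spells out "trivial and standard-layout", and SSS::f() is given an empty body here only so the example links.

```cpp
#include <cassert>
#include <type_traits>

struct S { int a; };
struct SS { int a; SS(int aa) : a(aa) { } };
struct SSS { virtual void f() { } };

// Hypothetical helper: C++11 POD means trivial and standard-layout.
template <typename T>
constexpr bool is_pod_like() {
    return std::is_trivial<T>::value && std::is_standard_layout<T>::value;
}

static_assert(is_pod_like<S>(), "S is a POD");
static_assert(std::is_standard_layout<SS>::value, "SS keeps a C-compatible layout");
static_assert(std::is_trivially_copyable<SS>::value, "memcpy() on SS is fine");
static_assert(!std::is_trivial<SS>::value, "SS has no trivial default constructor");
static_assert(!is_pod_like<SS>(), "so SS is strictly not a POD");
static_assert(!std::is_standard_layout<SSS>::value, "SSS has a virtual function");
static_assert(!is_pod_like<SSS>(), "SSS is not a POD");
```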

Local and Unnamed Types as Template Arguments

In C++98, local and unnamed types could not be used as template arguments
C++11 now supports this
Better consistency
void f(vector<X>& v)
{
struct Less {
bool operator()(const X& a, const X& b)
{ return a.v<b.v; }
};
sort(v.begin(), v.end(), Less()); // C++98: error: Less is local
// C++11: ok
}
*http://www.stroustrup.com/C++11FAQ.html#local-types
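A self-contained variant of the example above, as a sketch. The struct X with a member v is assumed from the slide's usage; sort_by_v() and local_types_demo() are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct X { int v; };

// Sorts by X::v using a comparator type that is local to the function.
void sort_by_v(std::vector<X>& items) {
    struct Less {
        bool operator()(const X& a, const X& b) const { return a.v < b.v; }
    };
    std::sort(items.begin(), items.end(), Less());  // C++11: ok, Less is local
}

void local_types_demo() {
    std::vector<X> items;
    X a = {3}, b = {1}, c = {2};
    items.push_back(a);
    items.push_back(b);
    items.push_back(c);
    sort_by_v(items);
    assert(items[0].v == 1 && items[1].v == 2 && items[2].v == 3);
}
```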

Values of unnamed types can also be passed as template arguments

template<typename T> void foo(T const& t){}
enum X { x };
enum { y };

int main()
{
foo(x); // C++98: ok; C++11: ok
foo(y); // C++98: error; C++11: ok
enum Z { z };
foo(z); // C++98: error; C++11: ok
}

auto Keyword (1/3)
The auto keyword deduces the type of a declared variable from its initializer expression (at compile time, of course)

auto declarator initializer;

int j = 0;
auto k = 0; // k is deduced as int

*http://msdn.microsoft.com/en-us/library/dd293667(v=VS.100).aspx

auto Keyword (2/3)

map<int,list<string>>::iterator i = m.begin(); // explicit type
auto i = m.begin(); // the same declaration written with auto

auto x = 1, *y = &x, **z = &y; // Resolves to int.
auto a(2.01), *b (&a); // Resolves to double.
auto c = 'a', *d(&c); // Resolves to char.
auto m = 1, &n = m; // Resolves to int.
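The deductions listed above can be verified with std::is_same. A sketch; count_strings() and auto_demo() are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <string>
#include <type_traits>

// auto spares us from writing map<int,list<string>>::const_iterator.
std::size_t count_strings(const std::map<int, std::list<std::string>>& m) {
    std::size_t total = 0;
    for (auto it = m.begin(); it != m.end(); ++it) total += it->second.size();
    return total;
}

void auto_demo() {
    auto k = 0;
    auto x = 1, *y = &x, **z = &y;
    auto a(2.01), *b(&a);
    auto c = 'a';
    static_assert(std::is_same<decltype(k), int>::value, "k is int");
    static_assert(std::is_same<decltype(y), int*>::value, "y is int*");
    static_assert(std::is_same<decltype(z), int**>::value, "z is int**");
    static_assert(std::is_same<decltype(b), double*>::value, "b is double*");
    static_assert(std::is_same<decltype(c), char>::value, "c is char");
    assert(k == 0 && c == 'a');
    assert(**z == 1 && *b == a);
}
```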

auto Keyword (3/3)
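Important restrictions on the auto keyword
It must always be used with an initializer
It cannot be used to declare an array, a function's return type (unless a trailing return type follows), or a function or template parameter
Except for static members, auto cannot be used to declare data members in a class/struct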

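Incorrect uses of auto (in C++11)

auto a; // error: no initializer
auto ary[10]; // error: array
auto ary2[] = { 1, 2, 3 }; // error: array
auto foo(); // error: return type without a trailing return type
void bar(auto a); // error: function parameter
struct A
{
    auto a; // error: non-static data member
};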

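When to use auto
When the data type may vary with the compiler or target platform
e.g. the return type of strlen()
When the data type is too complex to write out conveniently
e.g. map<int,list<string>>::iterator i
When assigning a lambda to a variable (see Lambda Expressions later)
When specifying trailing return types (see decltype later)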

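auto vs. auto
The /Zc:auto[-] compiler option tells the compiler how to interpret the auto keyword in variable declarations
With /Zc:auto, the compiler deduces the exact type of the declared variable from its initializer expression
With /Zc:auto-, the compiler declares the variable with the automatic storage class
This exists for compatibility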

*http://msdn.microsoft.com/en-us/library/dd293615.aspx

decltype Type Specifier (1/2)
decltype is a type specifier
decltype determines a type from the given expression
decltype( expression )
decltype differs from typeid in that it yields the type itself from the expression, not type information

decltype Type Specifier (2/2)
int var;
const int&& fx();
struct A { double x; };
const A* a = new A();

decltype(fx()); // const int&&
decltype(var); // int
decltype(a->x); // double (the declared type of the member access)
decltype((a->x)); // const double& (an lvalue expression rather than a member access)
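The four results above can be confirmed with std::is_same. In this sketch decltype_demo() is a hypothetical helper, and the pointer a is left null because decltype never evaluates its operand.

```cpp
#include <cassert>
#include <type_traits>

int var;
const int&& fx();        // declared only; decltype never calls it
struct A { double x; };

bool decltype_demo() {
    const A* a = nullptr;  // never dereferenced: the operands are unevaluated
    static_assert(std::is_same<decltype(fx()), const int&&>::value, "function call");
    static_assert(std::is_same<decltype(var), int>::value, "plain variable");
    static_assert(std::is_same<decltype(a->x), double>::value, "member access");
    static_assert(std::is_same<decltype((a->x)), const double&>::value,
                  "parenthesized lvalue expression");
    return a == nullptr;
}
```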

When to use decltype
To express a data type conveniently
decltype(mstr.begin()->second.get_allocator()) under_alloc;

Unlike decltype, auto deduces the type from an initializer, so an initialization has to take place
decltype only deduces the type of the expression's result, without actually evaluating the expression

trailing-return-type

auto function_name( parameters ) -> decltype( expression )
{
    function_body;
}
template<typename T, typename U>
auto myFunc( T& t, U& u ) -> decltype( t + u ) {
    return t + u;
}
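A short usage sketch for myFunc. Because the parameters are lvalue references, the arguments must be named variables; trailing_return_demo() is a hypothetical helper.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

template <typename T, typename U>
auto myFunc(T& t, U& u) -> decltype(t + u) {
    return t + u;
}

void trailing_return_demo() {
    int i = 1;
    double d = 2.5;
    auto sum = myFunc(i, d);
    static_assert(std::is_same<decltype(sum), double>::value, "int + double is double");
    assert(sum == 3.5);

    std::string head = "abc";
    const char* tail = "def";
    assert(myFunc(head, tail) == "abcdef");  // std::string + const char* is std::string
}
```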

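Type Inference Enhancements in C++11
Both auto and decltype infer types at compile time
At compile time the compiler can determine the exact final type from a variable's initializer or from an expression
Compile-time work does not affect run-time performance
This is C++'s consistent philosophy
Type inference is not there for type declarations that are already obvious
C++ programmers have long suffered from the complex type declarations in template libraries
What exactly those types are is not that important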

Range-based for
The range-based for statement iterates over a set of elements in a loop
It works with all standard containers, std::string, initializer lists, arrays, and any class for which begin() and end() can be defined
void f(vector<double>& v)
{
    for (auto x : v) cout << x << '\n';
    for (auto& x : v) ++x;
}

for (const auto x : { 1,2,3,5,8,13,21,34 }) cout << x << '\n';

*http://www.stroustrup.com/C++11FAQ.html#aims
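A sketch of the same three loops, rewritten to return values instead of printing. sum_of(), increment_all(), and sum_of_list() are hypothetical helper names.

```cpp
#include <cassert>
#include <initializer_list>
#include <vector>

// Reads each element by value.
double sum_of(const std::vector<double>& v) {
    double total = 0;
    for (auto x : v) total += x;
    return total;
}

// auto& binds to the element itself, so the container is modified.
void increment_all(std::vector<double>& v) {
    for (auto& x : v) ++x;
}

// Iterates directly over an initializer list.
int sum_of_list() {
    int total = 0;
    for (const auto x : { 1, 2, 3, 5, 8, 13, 21, 34 }) total += x;
    return total;
}
```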

The Significance of Range-based for
Provides higher-level loop semantics
Syntactic sugar
Even Java added a foreach syntax in J2SE 5.0

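Lambda Expressions (1/4)
Essentially an anonymous function
[]() mutable throw() -> return-type
{
    //function body
}
A lambda expression lets a function be defined where it is used, and the lambda can use variables from outside itself
More convenient and simpler than function objects or function pointers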

Lambda Expressions(2/4)
1.capture clause
2.parameter list
3.mutable specification
4.exception specification
5.return type
6.lambda body

Lambda Expressions (3/4)
capture clause

*http://heresy.spaces.live.com/blog/cns!E0070FB8ECF9015F!10575.entry
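Since the capture-clause slide carries no text here, the following sketch illustrates the common capture forms. capture_demo() and the lambda variable names are hypothetical.

```cpp
#include <cassert>

void capture_demo() {
    int x = 4;
    int y = 5;

    auto by_value = [=] { return x + y; };         // copies x and y when created
    auto by_reference = [&] { return x + y; };     // refers to the live x and y
    auto mixed = [x, &y] { return x * 10 + y; };   // x by value, y by reference
    auto counter = [x]() mutable { return ++x; };  // mutable: may modify its own copy

    x = 40;
    y = 50;

    assert(by_value() == 9);       // still sees 4 and 5
    assert(by_reference() == 90);  // sees 40 and 50
    assert(mixed() == 90);         // 4 * 10 + 50
    assert(counter() == 5);
    assert(counter() == 6);        // the copy persists between calls
    assert(x == 40);               // the original x is untouched by counter
}
```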

Lambda Expressions (4/4)
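parameter list
No default arguments
No variable-length argument lists
No unnamed parameters
The parameter list may be omitted when there are no parameters
int main()
{
    int x = 4;
    int y = 5;
    int z = [=] { return x + y; }(); // the trailing () calls the lambda
}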

Using a function object

struct FunctionObj {
    void operator()(int n) const {
        // operate on n
    }
};
…
for_each(v.begin(), v.end(), FunctionObj());

Using a lambda expression

for_each(v.begin(), v.end(), [] (int n) {
    // operate on n
});
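The two styles can be compared side by side in a sketch that sums a vector. Accumulate, sum_with_function_object(), and sum_with_lambda() are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Function object: the state it needs is passed in through the constructor.
struct Accumulate {
    explicit Accumulate(int& total) : total_(total) { }
    void operator()(int n) const { total_ += n; }
private:
    int& total_;
};

int sum_with_function_object(const std::vector<int>& v) {
    int total = 0;
    std::for_each(v.begin(), v.end(), Accumulate(total));
    return total;
}

// Lambda: the same logic, defined at the point of use, capturing total by reference.
int sum_with_lambda(const std::vector<int>& v) {
    int total = 0;
    std::for_each(v.begin(), v.end(), [&total](int n) { total += n; });
    return total;
}
```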

rvalue references
One of the main reasons rvalue references were created is to provide more efficient move semantics
rvalue reference syntax

type-id && cast-expression

*http://msdn.microsoft.com/en-us/library/dd293668(v=VS.100).aspx

lvalue & rvalue (1/2)
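Every expression is either an lvalue or an rvalue
An lvalue denotes an object that persists beyond a single expression
e.g. ++x
An rvalue denotes a temporary whose lifetime ends when the evaluation of the expression ends
e.g. x++
Anything whose address can be taken is an lvalue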

*http://blogs.msdn.com/b/vcblog/archive/2009/02/03/rvalue-references-c-0x-features-in-vc10-part-2.aspx

lvalue & rvalue (2/2)

string one("cute");
const string two("fluffy");
string three() { return "kittens"; }
const string four() { return "are an essential part of a healthy diet"; }

one; // modifiable lvalue
two; // const lvalue
three(); // modifiable rvalue
four(); // const rvalue

The Copying Problem in C++
Every execution of the + operator creates a temporary object
string s0("my mother told me that");
string s1("cute");
string s2("fluffy");
string s3("kittens");
string s4("are an essential part of a healthy diet");

string dest = s0 + " " + s1 + " " + s2 + " " + s3 + " " + s4;
Eight temporary objects are created!

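Where is the problem?
Because s0 is an lvalue that must not be modified, computing s0 + " " has to create a temporary
But when (s0 + " ") + s1 is computed next, s1 can be appended directly to that temporary; there is no need to create a second temporary and discard the first
This is the core idea of move semantics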

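Implementing move semantics (1/2)
In C++11, each call to the + operator still produces a separate temporary object
But the second call to the + operator (for example in (s0 + " ") + s1) takes over the memory allocated by the previous temporary
Only the second temporary's pointer is changed; nothing is reallocated or copied
The first temporary's pointer is set to null, so that its destructor does not release the memory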

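Implementing move semantics (2/2)
So if we can detect that we are dealing with a non-const rvalue, we can take over its memory directly
It is about to be destroyed anyway, and nobody cares about it
When constructing an object from an rvalue, or assigning an rvalue to another object, moving means taking over its memory
For example, moving matters a great deal when a vector grows its storage
The question, then, is how to detect this!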

rvalue reference
C++11 introduces a new kind of reference, the rvalue reference, written Type&& and const Type&&
rvalue references and lvalue references are distinct types
But semantically both are references

How the two kinds of reference differ in function overloading

Type& binds only to non-const lvalues
const Type& binds to anything
Type&& binds only to non-const rvalues
const Type&& binds to const and non-const rvalues
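The C++11 binding rules can be tabulated at compile time with std::is_convertible. This is a sketch: binds() is a hypothetical helper, and the From argument encodes the value category (T& for an lvalue, plain T for an rvalue).

```cpp
#include <cassert>
#include <string>
#include <type_traits>

using std::string;

// Hypothetical helper: can a reference of type To bind to an expression of type From?
template <typename From, typename To>
constexpr bool binds() {
    return std::is_convertible<From, To>::value;
}

// Type& binds only to non-const lvalues.
static_assert(binds<string&, string&>(), "non-const lvalue");
static_assert(!binds<const string&, string&>(), "const lvalue");
static_assert(!binds<string, string&>(), "non-const rvalue");
static_assert(!binds<const string, string&>(), "const rvalue");

// const Type& binds to anything.
static_assert(binds<string&, const string&>(), "non-const lvalue");
static_assert(binds<const string&, const string&>(), "const lvalue");
static_assert(binds<string, const string&>(), "non-const rvalue");
static_assert(binds<const string, const string&>(), "const rvalue");

// Type&& binds only to non-const rvalues.
static_assert(!binds<string&, string&&>(), "non-const lvalue");
static_assert(!binds<const string&, string&&>(), "const lvalue");
static_assert(binds<string, string&&>(), "non-const rvalue");
static_assert(!binds<const string, string&&>(), "const rvalue");

// const Type&& binds to rvalues, const or not.
static_assert(!binds<string&, const string&&>(), "non-const lvalue");
static_assert(!binds<const string&, const string&&>(), "const lvalue");
static_assert(binds<string, const string&&>(), "non-const rvalue");
static_assert(binds<const string, const string&&>(), "const rvalue");
```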

Overload resolution with rvalue references
void purr(const string& s) {
cout << "purr(const string&): " << s << endl;
}
void purr(string&& s) {
cout << "purr(string&&): " << s << endl;
}
string strange() {
return "strange()";
}
const string charm() {
return "charm()";
}

int main() {
string up("up");
const string down("down");

purr(up); // purr(const string&): up
purr(down); // purr(const string&): down
purr(strange()); // purr(string&&): strange()
purr(charm()); // purr(const string&): charm()
}

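Providing move semantics
The traditional copy constructor
Simple(const Simple&);
The C++11 move constructor
Simple(Simple&&);
VC10 and VC11 do not generate a default move constructor; standard C++11 generates one implicitly only when the class declares no copy operations, move operations, or destructor

*Example taken from http://www.codeproject.com/KB/cpp/cpp10.aspx#RValues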

When the move constructor is called

Simple GetSimple()
{
Simple sObj(10);
return sObj;
}

Implementing the move constructor (1/2)
class Simple
{
    // The resource (char* rather than void*, because delete[] on a void* is undefined)
    char* Memory;

public:

    Simple() { Memory = nullptr; }

    // The MOVE-CONSTRUCTOR
    Simple(Simple&& sObj)
    {
        // Take ownership
        Memory = sObj.Memory;

        // Detach ownership
        sObj.Memory = nullptr;
    }

Implementing the move constructor (2/2)

    Simple(int nBytes)
    {
        Memory = new char[nBytes];
    }

    ~Simple()
    {
        if(Memory != nullptr)
            delete[] Memory;
    }
};

move assignment operator
The traditional copy assignment operator
void operator=(const Simple&);
The C++11 move assignment operator
void operator=(Simple&&);

Implementing the move assignment operator
class Simple
{
    ...
    void operator = (Simple&& sOther)
    {
        // Guard against self-assignment
        if (this == &sOther) return;

        // De-allocate current
        delete[] Memory;

        // Take other's memory content
        Memory = sOther.Memory;

        // Since we have taken the temporary's resource.
        // Detach it!
        sOther.Memory = nullptr;
    }
};
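Putting the move constructor and move assignment operator together, a complete class might look like the sketch below. Buffer, make_buffer(), and move_demo() are hypothetical names modeled on Simple; the assignment operator returns a reference here, which is the conventional signature rather than the void one shown on the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical class modeled on Simple. Declaring the move operations
// implicitly disables copying, so a Buffer can only be moved.
class Buffer {
public:
    explicit Buffer(std::size_t nBytes) : memory_(new char[nBytes]), size_(nBytes) { }
    ~Buffer() { delete[] memory_; }  // delete[] on a null pointer is a no-op

    // Move constructor: take ownership, then detach the source.
    Buffer(Buffer&& other) : memory_(other.memory_), size_(other.size_) {
        other.memory_ = nullptr;
        other.size_ = 0;
    }

    // Move assignment: release our memory, take the other's, detach the source.
    Buffer& operator=(Buffer&& other) {
        if (this != &other) {
            delete[] memory_;
            memory_ = other.memory_;
            size_ = other.size_;
            other.memory_ = nullptr;
            other.size_ = 0;
        }
        return *this;
    }

    bool empty() const { return memory_ == nullptr; }
    std::size_t size() const { return size_; }

private:
    char* memory_;
    std::size_t size_;
};

// Returning a local object by value uses the move constructor (or elides it).
Buffer make_buffer(std::size_t n) {
    Buffer b(n);
    return b;
}

void move_demo() {
    Buffer a(10);
    Buffer b(std::move(a));  // std::move turns the lvalue a into an rvalue
    assert(a.empty() && b.size() == 10);

    Buffer c(1);
    c = std::move(b);
    assert(b.empty() && c.size() == 10);

    Buffer d = make_buffer(4);
    assert(d.size() == 4);
}
```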

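What rvalue references bring
They reduce extra run-time memory allocations and the overhead of redundant data copies
This helps run-time performance
Especially for library classes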

static_assert Declaration (1/2)
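Checks at compile time whether a specified condition holds
static_assert(
    constant-expression,
    string-literal
);
constant-expression: if it evaluates to 0 (false), compilation fails and the string-literal is displayed; otherwise there is no effect

static_assert Declaration (2/2)
Usage: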

static_assert(
sizeof(void *)==4,
"64-bit code generation is not supported.");

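static_assert compared with traditional approaches
Previously there were two ways to check
Conditional compilation + the #error directive (at the preprocessing stage)
The assert macro (at run time)
#error cannot check template parameters, because templates are instantiated at compile time, after preprocessing
static_assert checks at compile time, so it can check template parameters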
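A sketch of a template-parameter check, which #error cannot express. Counter is a hypothetical class used only for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class template that only accepts integral types.
template <typename T>
class Counter {
    static_assert(std::is_integral<T>::value,
                  "Counter<T> requires an integral type");
public:
    Counter() : value_(0) { }
    T next() { return ++value_; }
private:
    T value_;
};

// Counter<double> bad;  // would fail to compile with the message above
```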

nullptr Keyword
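The nullptr keyword denotes a null pointer
It resolves the confusion between 0 and the null pointer
It provides a more intuitive, less error-prone way to express a null pointer
void f(int){ printf( "int\n" ); }
void f(char*){ printf( "char*\n" ); }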

f( NULL ); // int, compile error in GCC
f( 0 ); // int
f( (char*)0 ); // char*
f( (void*)0 ); // compile error

*http://heresy.spaces.live.com/blog/cns!E0070FB8ECF9015F!10891.entry
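A sketch of how nullptr settles the overload question. The two f() overloads mirror the slide but return a string instead of printing; nullptr_demo() is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

std::string f(int) { return "int"; }
std::string f(char*) { return "char*"; }

void nullptr_demo() {
    assert(f(0) == "int");                        // 0 is an int first
    assert(f(static_cast<char*>(0)) == "char*");  // explicit cast needed
    assert(f(nullptr) == "char*");                // nullptr never converts to int

    std::nullptr_t np = nullptr;                  // nullptr has its own type
    assert(f(np) == "char*");
}
```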

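Summary of the directions C++11 improves in
Better compile-time and link-time performance
Better run-time performance
Higher-level, more abstract ways of programming
Simpler, less error-prone ways of programming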

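What is C++ AMP?
AMP: Accelerated Massive Parallelism
Runs computations on one or more accelerators
Today, the accelerator is in practice the GPU
In the future there will be other kinds of accelerators
With C++ AMP, applications can be written entirely in C++ and gain speedups from parallel computing
Speedups of tens of times or more are possible
AMP is essentially a library
Ships with Visual Studio 2012
The specification is open; other platforms or compilers can implement it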

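Design principles of C++ AMP
Based on C++, not C
Able to go mainstream
The simpler the programming model, the better
As few changes as possible
Portable
Runs on hardware from any vendor
General and future-proof
Works with different accelerators, e.g. accelerators in the cloud
Open
Open specification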

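CPU vs. GPU

CPU: lower memory bandwidth; GPU: higher memory bandwidth
CPU: higher power consumption; GPU: lower power consumption
CPU: medium level of parallelism; GPU: high level of parallelism
CPU: deeper instruction pipelines; GPU: shallower instruction pipelines
CPU: random access; GPU: sequential access
CPU: supports general-purpose code; GPU: supports data-parallel code
CPU: general-purpose programming; GPU: special-purpose programming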

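GPGPU
General-Purpose computing on Graphics Processing Units
Also written GP²U
Traditionally, GPUs handle graphics-related computation
Because modern GPUs have powerful parallel processing capability, they can also be used to process general-purpose data
Especially for single instruction, multiple data (SIMD) workloads
Currently,
OpenCL is an existing GPGPU computing language
Nvidia's CUDA is an existing proprietary framework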

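Heterogeneous Computing

Heterogeneous computing means composing a computation from compute units with different instruction sets and architectures within the same system
Common compute units include
CPU, GPU, DSP, ASIC, FPGA … and so on
Different kinds of compute units have different strengths
Heterogeneous computing tries to combine those strengths
For example, GPUs excel at parallel computation
A system that combines CPU and GPU computation is a heterogeneous computing system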

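How many threads a GPU can hold: examples
NVIDIA GTX 690
16 multiprocessors
192 CUDA cores per multiprocessor
Up to 2048 threads per multiprocessor
Up to 32,768 threads resident at once
NVIDIA GTX 560 SE
9 multiprocessors
32 CUDA cores per multiprocessor
Up to 1536 threads per multiprocessor
Up to 13,824 threads resident at once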
*http://www.infoq.com/cn/articles/cpp_amp_computing_on_GPU
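The totals above are simply the multiprocessor count multiplied by the per-multiprocessor thread limit, as this sketch shows. resident_threads() is a hypothetical helper, and the figures are the ones quoted on the slide.

```cpp
// Hypothetical helper: total resident threads = multiprocessors * threads per multiprocessor.
constexpr int resident_threads(int multiprocessors, int max_threads_per_multiprocessor) {
    return multiprocessors * max_threads_per_multiprocessor;
}

static_assert(resident_threads(16, 2048) == 32768, "GTX 690 figures from the slide");
static_assert(resident_threads(9, 1536) == 13824, "GTX 560 SE figures from the slide");
```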

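C++ AMP as a library (1/2)

Ships with Visual Studio 2012
An STL-like library
Available after #include <amp.h>
Namespace: concurrency
New classes
array, array_view
extent, index
accelerator, accelerator_view

C++ AMP as a library (2/2)

New function
parallel_for_each()
New keyword (or rather, a new use of an existing name)
restrict
Tells the compiler to check that the code can run on the GPU (DirectX)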

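parallel_for_each
The entry point of the library
Parameters
The number of threads required
The function or lambda to execute in each thread (must be restrict(amp))
Sends the work to the accelerator
Returns immediately, without stalling or waiting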

Hello AMP: Adding Arrays
CPU version:
void AddArrays(int n, int * pA, int * pB, int * pSum)
{
    for (int i=0; i<n; i++)
    {
        pSum[i] = pA[i] + pB[i];
    }
}

C++ AMP version:
#include <amp.h>
using namespace concurrency;

void AddArrays(int n, int * pA, int * pB, int * pSum)
{
    array_view<int,1> a(n, pA);
    array_view<int,1> b(n, pB);
    array_view<int,1> sum(n, pSum);

    parallel_for_each(
        sum.extent,
        [=](index<1> i) restrict(amp)
        {
            sum[i] = a[i] + b[i];
        }
    );
}

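Basic Elements of C++ AMP Programming
void AddArrays(int n, int * pA, int * pB, int * pSum)
{
    array_view<int,1> a(n, pA);
    array_view<int,1> b(n, pB);
    array_view<int,1> sum(n, pSum);

    parallel_for_each(
        sum.extent,
        [=](index<1> i) restrict(amp)
        {
            sum[i] = a[i] + b[i];
        }
    );
}

array_view: wraps data in a form that can be operated on by the accelerator
parallel_for_each: executes the function or lambda once on each thread
extent: the number and shape of the threads that execute the lambda
index: the ID of the thread executing the lambda, used to index the data
restrict(amp): marks code that is subject to the amp restrictions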
extent<N>: the size of an N-dimensional space

array_view<T, N>
Views data that already exists on the CPU or GPU
Element type T, with N dimensions
An extent must be specified
Rectangular
Accessible anywhere (synchronized automatically)

vector<int> v(10);
extent<2> e(2,5);
array_view<int,2> a(e, v);
//above two lines can also be written
//array_view<int,2> a(2,5,v);
index<2> i(1,3);

int o = a[i]; // or a[i] = 16;
//or int o = a(1, 3);
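To see which element a[i] refers to, the row-major mapping can be sketched in plain C++ without amp.h. Extent2, Index2, and View2 below are hypothetical stand-ins, not the concurrency:: classes; they only mimic how index<2>(1,3) over extent<2>(2,5) selects v[1 * 5 + 3].

```cpp
#include <cassert>
#include <vector>

// Hypothetical plain-C++ stand-ins for extent<2>, index<2> and array_view<int,2>.
struct Extent2 {
    int rows;
    int cols;
    int size() const { return rows * cols; }
};

struct Index2 {
    int row;
    int col;
};

class View2 {
public:
    View2(Extent2 e, std::vector<int>& data) : extent_(e), data_(data) {
        assert(static_cast<int>(data.size()) >= e.size());
    }
    // Row-major mapping: the last dimension varies fastest.
    int& operator[](Index2 i) { return data_[i.row * extent_.cols + i.col]; }
    int& operator()(int row, int col) { return data_[row * extent_.cols + col]; }
private:
    Extent2 extent_;
    std::vector<int>& data_;
};

void view_demo() {
    std::vector<int> v(10);
    for (int k = 0; k < 10; ++k) v[k] = k;

    Extent2 e = {2, 5};
    View2 a(e, v);
    Index2 i = {1, 3};

    assert(a[i] == 8);   // element 1 * 5 + 3 of v
    a(1, 3) = 16;        // writes go through to the underlying vector
    assert(v[8] == 16);
}
```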

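Restrictions of restrict(amp)
Can only call other functions that are also restrict(amp)
All functions must be inlinable
Can only use amp-supported types
int, unsigned int, float, double, bool
structs and arrays composed of the above types
Restrictions on pointers and references
A lambda cannot capture by reference or capture pointers
References and single-indirection pointers are allowed only as local variables and function arguments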

array<T, N>
A multi-dimensional array with element type T and N dimensions
Storage that resides on a specific accelerator
Captured by reference
Copies are specified explicitly
An interface similar to array_view<T, N>

vector<int> v(8 * 12);
extent<2> e(8,12);
accelerator acc = …
array<int,2> a(e,acc.default_view);
copy_async(v.begin(), v.end(), a);

parallel_for_each(e, [&](index<2> idx) restrict(amp)
{
    a[idx] += 1;
});
copy(a, v.begin());

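The asynchronous nature of parallel_for_each
It returns immediately after the call
If the data being processed is accessed through an array_view before the computation has finished, the access blocks until the computation completes
array_view's synchronize_async() function can be used to handle completion of the computation
completion_future synchronize_async() const;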

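Writing C++ AMP programs is simple and intuitive
Steps to write a C++ AMP program
Create array_view objects
Call parallel_for_each
Obtain the results through the array_view objects
The details of CPU-GPU communication are all wrapped by the library
Memory allocation and deallocation
Data synchronization
Planning and management of GPU threads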

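C++ AMP debugging support in Visual Studio 2012

Breakpoints
Debugger type
Inspecting variable values and the call stack
GPU thread state
Inspecting parallel computation data

Setting breakpoints
Execution can stop at CPU or GPU breakpoints
GPU breakpoints are supported only on Windows 8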

*Kate Gregory, “C++ Accelerated Massive Parallelism in Visual C++ 2012”

Inspecting variable values and the call stack

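Requirements for developing/running C++ AMP
Development
Visual Studio 2012
A graphics card that supports DirectX 11
Windows 8 operating system
Running
Windows 7/Windows 8
A graphics card that supports DirectX 11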

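On the performance of C++ AMP programs
The higher-level and more abstract a library's encapsulation, the more room there is to improve performance
A program built on C++ AMP is itself structurally simple
Performance depends on the implementation of the C++ AMP library
And, of course, on the underlying hardware
Once experts optimize the library, applications get faster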

http://www.microsoft.com/taiwan/techdays2012/
http://www.microsoft.com/learning/zh/tw/

http://social.technet.microsoft.com/Forums/zh-tw/categories/
http://social.msdn.microsoft.com/Forums/zh-tw/categories/
