A New "Sparkitecture" for modernizing your data warehouse
1. A New “Sparkitecture” for Modernizing Your Data Warehouse
Ranga Nathan (Big Data Solutions Product Management)
Jack Gudenkauf (Big Data Professional Services Architect)
June 9, 2016
2. 40% of enterprises struggle to identify, integrate, and manage Big Data with existing technology
Barriers:
• More systems to manage
• More complexity to integrate
• More data to identify
3. Hadoop and the data lake
[Diagram: three apps on top of a Hadoop Data Lake]
Vision:
• Data-centric foundation for all data and apps
• Elastic data management & compute platform for all data
• Single platform for all analytical workloads
Reality:
• Data swamps due to lack of oversight and data governance
• Dearth of skilled resources to extract value from the data
• Sub-optimal performance with traditional architectures
• Cannot scale to handle multi-tenant workload complexity
Data Ponds
4. Conventional Wisdom Regarding Deploying a Data Lake Infrastructure
Use Cases:
ProLiant DL380
Apollo 4530
Apollo 4200
− Traditional Hadoop architecture
− Batch workloads with predictable growth
− Lowers Big Data costs for larger deployments
− Match compute to workload
− Large internal storage
− Ideal for large data volumes and batch workloads where density or cost per GB is key
Density-Optimized
Traditional
Data Lakes & Hubs
• Ingestion of multiple types and sources of data
• Aggregation, Transformation and Visualization
• Batch, Interactive, Real-time workloads
Data Warehouse Modernization
• Data Staging & landing zone
• Migration of operational data stores
• Active archiving
• Batch workloads
5. A Big Data Journey…
ETL Offload Archival
Deep Learning
Event Processing
In Memory Analytics
6. HPE Elastic Platform for Analytics
Flexible Convergence for Big Data Workloads
• Low Latency Compute: Event Processing (Moonshot M710P)
• Big Memory Compute: In Memory Analytics (Apollo xl170r w 512G memory)
• Archival Storage (Apollo 4200 w 6TB HDD)
• High Latency Compute: ETL Offload and Archival (Apollo xl170 w 256G memory)
• HPC Compute: Deep Learning (Apollo xl190r w GPUs)
• HDFS Storage (Apollo 4200 w 3TB HDD)
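The pairing of workloads to building blocks on this slide can be written down as a small lookup table. This is only an illustrative sketch: the dictionary layout and the function name `pick_compute_block` are assumptions, and the strings are copied from the slide rather than from any HPE sizing tool.

```python
from typing import Tuple

# Workload -> (compute class, node type), as listed on this slide.
# The lookup structure itself is a hypothetical illustration.
COMPUTE_BLOCKS = {
    "event processing": ("Low Latency Compute", "Moonshot M710P"),
    "in memory analytics": ("Big Memory Compute", "Apollo xl170r w 512G memory"),
    "etl offload and archival": ("High Latency Compute", "Apollo xl170 w 256G memory"),
    "deep learning": ("HPC Compute", "Apollo xl190r w GPUs"),
}

# Storage tier -> node type, as listed on this slide.
STORAGE_BLOCKS = {
    "archival": "Apollo 4200 w 6TB HDD",
    "hdfs": "Apollo 4200 w 3TB HDD",
}


def pick_compute_block(workload: str) -> Tuple[str, str]:
    """Return (compute class, node type) for a workload named on the slide."""
    key = workload.strip().lower()
    if key not in COMPUTE_BLOCKS:
        raise KeyError(f"no building block listed for workload: {workload!r}")
    return COMPUTE_BLOCKS[key]
```

For example, `pick_compute_block("Deep Learning")` returns the HPC compute class and the GPU-equipped Apollo node named above.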
7. HPE Elastic Platform for Analytics
Innovation delivering unique value to customers and the open source community
Unique Value
HPE Workload- and Density-Optimized (WDO) Solution
Data Consolidation
− Shared storage pool for multiple Big Data environments
Maximum Elasticity
− Dynamic cluster provisioning from compute pools without repartitioning data
Flexible Scalability
− Scale compute and storage independently
Breakthrough Economics
− Workload optimized components for better density, cost and power
[Diagram labels: Ethernet, HPE Apollo 4xx0, HPE Moonshot or HPE Apollo]
8. Building blocks for the HPE Elastic Platform for Analytics
HP Apollo 2000 System
A density optimized compute platform that offers double the density of traditional 1U servers and high memory/core ratios
Workload-optimized compute nodes for Spark, Hive/Tez, MapReduce, YARN, Vertica SQL on Hadoop, and other analytics and batch workloads
HP Apollo 4200 Scalable System
A cost-effective industry standard storage server purpose built for big data with converged infrastructure that offers high density energy-efficient storage
Workload-optimized storage nodes for HDFS, Kudu, and building a multi-temperate data lake environment. Ideal for EDW Offload workloads, and a foundation storage block for workload consolidation
HP Moonshot System
A complete server system engineered for specific workloads and delivered in a dense, energy-efficient package
Workload-optimized compute nodes for HBase, Kafka and other low-latency, streaming workloads. Ideal for highest density + lowest power requirements
HP ProLiant DL300 System
The industry’s most popular server balances the latest compute and memory technologies with internal storage and flash options, coupled with industry-leading management and serviceability
Balanced worker node for batch and single-function use cases. Scalable storage node, with smaller fault domain than Apollo 4200, for workload consolidation use cases.
9. Yes, But Does It Perform?
Compute Nodes, Storage Nodes (HDFS): Hyperscale Hadoop
Ethernet w/o RoCE
Storage Nodes (HDFS): Conventional Hadoop
MapReduce Read: 9.2 GB/sec
4 worker nodes, 4 storage nodes
MapReduce Write: 7.4 GB/sec
Moonshot with 45 M710
Read: 4.9 GB/sec
Write: 3.4 GB/sec
10. Comparing Configurations – Single Subject Data Mart Use Case
Normalized on CPU and list price
Balanced
18 Worker Nodes (Symmetric)
ProLiant DL380 Gen9
BDO
6 Worker Node Blocks (18 nodes)
Apollo 4530
Same performance (SpecInt)
7% higher $/SpecInt
WDO
16 Compute nodes + 4 Storage Nodes
Apollo 2000/4200
6% better performance (SpecInt)
2% lower $/SpecInt
[Rack diagrams of the three configurations; labels include HP 5900 Series (JG510A) and HP FlexFabric 5930 Series (JG726A) switches, ProLiant DL360 Gen9, Apollo 2000 System, Apollo 4200 Gen9, and Apollo 4500 Gen9, with 900 GB 10K SAS drives]
Normalized on: list price, SpecInt
11. Hyperscale Price/Performance
Compared with Conventional Cluster
[Rack diagrams of the two clusters; labels include HP 5900 Series (JG336A, JG510A) and HP FlexFabric 5930 Series (JG726A) switches, ProLiant DL380p Gen8 with 4.0 TB 7.2K SATA drives, ProLiant DL360 Gen9 with 1.2 TB 10K SAS drives, Apollo 4200 Gen9 with 4.0 TB 7.2K SATA drives, and Apollo 2000 System with 1.0 TB 7.2K SATA drives]
Chart values by benchmark (paired in slide order):
HadoopAggregation: 14%
HadoopJoin: 31%
HadoopScan: 22%
HadoopSort: 14%
HadoopTerasort: 13%
HadoopWordcount: 18%
HadoopGzip: 16%
HadoopGunzip: 15%
hperead: 5%
HPeWrite: 7%
ScalaSparkAggregat…: 27%
ScalaSparkJoin: 2x
ScalaSparkScan: -3%
ScalaSparkBayes: 70%
ScalaSparkPagerank: 6x
Normalized on list price and storage capacity
Legend: ProLiant DL380; Apollo 2000 + Apollo 4200
12. Hyperscale vs. Conventional Scale-out
Independent scaling of Compute and Storage
Hot Data *
− 2.8x the compute
− 97% of the storage capacity
− 4x the memory
Balanced *
− 1.6x the compute
− 1.5x the storage capacity
− 2.5x the memory
Cold Data *
− 0.9x the compute
− 2.1x the storage capacity
− 1.5x the memory
* Compared with balanced, conventional full rack cluster
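The multipliers on this slide can be applied to a baseline rack to see what each profile would yield. The sketch below is a minimal illustration: the `Capacity` type, the `scale` function, and any baseline figures passed in are hypothetical, and only the multipliers come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capacity:
    compute: float   # e.g. aggregate SpecInt (unit is an assumption)
    storage: float   # e.g. usable TB (unit is an assumption)
    memory: float    # e.g. total GB (unit is an assumption)


# Multipliers from the slide, relative to a balanced,
# conventional full rack cluster.
PROFILES = {
    "hot": Capacity(compute=2.8, storage=0.97, memory=4.0),
    "balanced": Capacity(compute=1.6, storage=1.5, memory=2.5),
    "cold": Capacity(compute=0.9, storage=2.1, memory=1.5),
}


def scale(baseline: Capacity, profile: str) -> Capacity:
    """Scale a conventional-rack baseline by one of the slide's profiles."""
    m = PROFILES[profile]
    return Capacity(
        compute=baseline.compute * m.compute,
        storage=baseline.storage * m.storage,
        memory=baseline.memory * m.memory,
    )
```

Calling `scale(Capacity(100, 200, 10), "cold")` with a made-up baseline shows the trade the slide describes: slightly less compute in exchange for roughly double the storage.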
13. Hyperscale benefits for Big Data
YARN node labels feature (JIRA YARN-796)
– Contributed the node labels concept to Hadoop (Apache Hadoop 2.6 trunk)
– Allows scheduling of YARN containers to specific pools of nodes
– Combined with Hyperscale approach, compute nodes can be dynamically assigned because no data needs to be repartitioned
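As a rough sketch of how compute nodes might be moved between pools for the two time windows shown on this slide (data prep from 12am to 6am, predictive analytics from 6am to 12am), the snippet below builds a `yarn rmadmin -replaceLabelsOnNode` command string. The label names, host names, and helper functions are hypothetical; the slide does not say how the reassignment is actually automated, and the labels would first have to be registered with `yarn rmadmin -addToClusterNodeLabels`.

```python
from typing import List

# Hypothetical label names; the slide only names the two workloads.
DATA_PREP = "data_prep"
PREDICTIVE = "predictive_analytics"


def label_for_hour(hour: int) -> str:
    """Return the node label for an hour of day (0-23), per the slide's windows."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    # 12am-6am is the data prep window; the rest of the day is predictive analytics.
    return DATA_PREP if hour < 6 else PREDICTIVE


def relabel_command(nodes: List[str], hour: int) -> str:
    """Build the rmadmin command that assigns every given node to the active label."""
    label = label_for_hour(hour)
    mapping = " ".join(f"{node}={label}" for node in nodes)
    return f'yarn rmadmin -replaceLabelsOnNode "{mapping}"'
```

Because the data stays on the storage nodes, only the label assignment of the compute nodes changes; nothing in this sketch touches HDFS.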
Hadoop Cluster 1: Hive/Tez, Vertica SQL on Hadoop, Spark
Data prep (12am – 6am)
Predictive analytics (6am – 12am)
Hadoop Cluster 1 (High Power)
[Diagram: two storage nodes and a pool of compute nodes]
14. A Modern “Sparkitecture” for Real-time Analytics (SMACK)
[Rack diagrams; labels include HP 5900 Series (JG510A) switches, ProLiant DL360 Gen9 with 900 GB 10K SAS drives, and Apollo 4200 Gen9]
Spark: Spark SQL, Batch, ML, Streaming, Graph
Data stores and formats: Cassandra, Mongo, HDFS, Kafka, ORC, Parquet, Oracle, Vertica, HANA, Vora
15. Vertica Cluster
MPP Columnar DW
Vertica
DW
Vertica
DW(s)
Vertica Cluster
Multi-DC DR
Vertica
DW
Vertica
DW(s)
Kafka (Mirrored)
Dual Ingestion Pipeline
Vertica
Spark
Vertica
Kafka
Connector
HDFS /AVRO/Topic/YYYY/MM/DD/HH
Semi-Structured Data
HDFS /ORC/Topic/YYYY/MM/DD
Structured Data
Non-relational
Hadoop Cluster (Hive)
Hortonworks, Cloudera, AWS
Spark Parallel
Streaming
Transformations
Near real-time
Transform/ReShape
Mapping to Vertica SOT
Scala, Java, Python, SQL
Hadoop™ Spark™
Cluster
Spark
Vertica
Loader
Web
Mobile
IoT
MySQL
Applications
Centralized Data Hub
Kafka™ Cluster
Data Center Boundary
Parallel Streaming Transformation Loader
Confluent™ REST Proxy
JSON/AVRO Messages
Kafka Connect
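The HDFS layouts shown on this slide (an AVRO path partitioned by topic down to the hour, and an ORC path partitioned by topic down to the day) can be generated from a topic name and an event timestamp. This is a minimal sketch; the function names and the example topic are assumptions, and the slide does not say whether the pipeline uses event time or ingestion time.

```python
from datetime import datetime


def avro_path(topic: str, ts: datetime) -> str:
    """Hourly landing path for semi-structured data: /AVRO/Topic/YYYY/MM/DD/HH."""
    return f"/AVRO/{topic}/{ts:%Y/%m/%d/%H}"


def orc_path(topic: str, ts: datetime) -> str:
    """Daily path for structured data: /ORC/Topic/YYYY/MM/DD."""
    return f"/ORC/{topic}/{ts:%Y/%m/%d}"
```

With a hypothetical topic `clicks` and a timestamp of 2 pm on June 9, 2016, `avro_path` yields an hour-level directory and `orc_path` the matching day-level directory, so each hourly AVRO partition rolls up into exactly one daily ORC partition.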
16. Infrastructure trends affecting Big Data architecture
Workload optimization
• Low-power SoCs and other accelerators giving rise to workload-optimized servers
Faster network fabric
• Dramatic increase in fabric speeds
Multi-temperate storage
• Enterprise adoption of tiering and accelerated (NVMe, flash, etc.) storage
Container-based apps
• Running multiple container apps while hosting a common resource manager (YARN)
Big Data
17. Elastic Platform for Analytics long term view
Evolve to support multiple compute and storage blocks
Low Cost Nodes
SSD Nodes, Disk Nodes, Archive Nodes
Multi-temperate, density-optimized storage using HDFS tiering, NoSQL stores, and object stores
GPU Nodes, FPGA Nodes, Big Memory Nodes
Density- and workload-optimized compute nodes to accelerate various big data software
18. Delivering a Scalable Data Lake for the Enterprise
• Unlock the most value and performance from Hadoop
• Scale without compromising data security, reliability, and ROI
• Enterprise-Grade, Trusted, and Proven HPE solution
Optimize the Hadoop Data Lake for More Business Value
• Elastic Platform for Analytics (Workload and Density Optimized)
• High-Performing Analytics Engines for Hadoop
• Consulting & Implementation Services for Hadoop
• Data Security for Hadoop