A New “Sparkitecture” for Modernizing Your Data Warehouse
Ranga Nathan (Big Data Solutions Product Management)
Jack Gudenkauf (Big Data Professional Services Architect)
June 9, 2016
40% of enterprises struggle to identify, integrate, and manage Big Data with existing technology
Barriers: more data to identify, more complexity to integrate, more systems to manage
Hadoop and the data lake
[Figure: multiple apps on top of the Hadoop Data Lake]
Vision:
• Data-centric foundation for all data and apps
• Elastic data management & compute platform for all data
• Single platform for all analytical workloads
Reality:
• Data swamps due to lack of oversight and data governance
• Dearth of skilled resources to extract value from the data
• Sub-optimal performance with traditional architectures
• Cannot scale to handle multi-tenant workload complexity
Data Ponds
Conventional Wisdom Regarding Deploying a Data Lake Infrastructure
ProLiant DL380, Apollo 4530, Apollo 4200
Traditional / Density-Optimized
− Traditional Hadoop architecture
− Batch workloads with predictable growth
− Lowers Big Data costs for larger deployments
− Match compute to workload
− Large internal storage
− Ideal for large data volumes and batch workloads where density or cost per GB is key
Use Cases:
Data Lakes & Hubs
• Ingestion of multiple types and sources of data
• Aggregation, Transformation and Visualization
• Batch, Interactive, Real-time workloads
Data Warehouse Modernization
• Data staging & landing zone
• Migration of operational data stores
• Active archiving
• Batch workloads
A Big Data Journey…
ETL Offload Archival
Deep Learning
Event Processing
In Memory Analytics
HPE Elastic Platform for Analytics
Flexible Convergence for Big Data Workloads
Low Latency Compute: Event Processing (Moonshot M710P)
Big Memory Compute: In Memory Analytics (Apollo xl170r w 512G memory)
Archival Storage: Apollo 4200 w 6TB HDD
High Latency Compute: ETL Offload and Archival (Apollo xl170 w 256G memory)
HPC Compute: Deep Learning (Apollo xl190r w GPUs)
HDFS Storage: Apollo 4200 w 3TB HDD
HPE Elastic Platform for Analytics
Innovation delivering unique value to customers and the open source community
HPE Workload- and Density-Optimized (WDO) Solution
Unique Value:
Data Consolidation
− Shared storage pool for multiple Big Data environments
Maximum Elasticity
− Dynamic cluster provisioning from compute pools without repartitioning data
Flexible Scalability
− Scale compute and storage independently
Breakthrough Economics
− Workload-optimized components for better density, cost and power
[Figure: HPE Moonshot or HPE Apollo, and HPE Apollo 4xx0, connected over Ethernet]
Building blocks for the HPE Elastic Platform for Analytics
HP Apollo 2000 System
A density-optimized compute platform that offers double the density of traditional 1U servers and high memory/core ratios
− Workload-optimized compute nodes for Spark, Hive/Tez, MapReduce, YARN, Vertica SQL on Hadoop, and other analytics and batch workloads
HP Apollo 4200 Scalable System
A cost-effective, industry-standard storage server purpose-built for big data, with converged infrastructure that offers high-density, energy-efficient storage
− Workload-optimized storage nodes for HDFS, Kudu, and building a multi-temperature data lake environment. Ideal for EDW offload workloads, and a foundation storage block for workload consolidation
HP Moonshot System
A complete server system engineered for specific workloads and delivered in a dense, energy-efficient package
− Workload-optimized compute nodes for HBase, Kafka and other low-latency, streaming workloads. Ideal for highest density + lowest power requirements
HP ProLiant DL300 System
The industry’s most popular server balances the latest compute and memory technologies with internal storage and flash options, coupled with industry-leading management and serviceability
− Balanced worker node for batch and single-function use cases. Scalable storage node, with smaller fault domain than Apollo 4200, for workload consolidation use cases
Yes, But Does It Perform?
Hyperscale Hadoop: compute nodes and storage nodes (HDFS), Ethernet w/o RoCE
Conventional Hadoop: storage nodes (HDFS)
MapReduce Read: 9.2 GB/sec
4 worker nodes, 4 storage nodes
MapReduce Write: 7.4 GB/sec
Moonshot with 45 M710
Read: 4.9 GB/sec
Write: 3.4 GB/sec
Comparing Configurations – Single Subject Data Mart Use Case
Normalized on CPU and list price
Balanced
18 Worker Nodes (Symmetric)
ProLiant DL380 Gen9
BDO
6 Worker Node Blocks (18 nodes)
Apollo 4530
Same performance (SpecInt)
7% higher $/SpecInt
WDO
16 Compute nodes + 4 Storage Nodes
Apollo 2000/4200
6% better performance (SpecInt)
2% lower $/SpecInt
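The two figures per configuration can be combined into a rough relative cost. The sketch below assumes that $/SpecInt is simply list price divided by SpecInt, so relative list price is the product of the relative performance and the relative $/SpecInt. The slide does not state this formula; the function name and the derived numbers are illustrative only.

```python
def implied_relative_cost(perf_ratio: float, cost_per_perf_ratio: float) -> float:
    """Relative list price implied by relative performance and relative $/SpecInt.

    Assumes $/SpecInt = list price / SpecInt, which the slide does not spell out.
    """
    return perf_ratio * cost_per_perf_ratio


# (relative SpecInt, relative $/SpecInt), both versus the Balanced DL380 cluster.
configs = {
    "Balanced": (1.00, 1.00),  # baseline
    "BDO": (1.00, 1.07),       # same performance, 7% higher $/SpecInt
    "WDO": (1.06, 0.98),       # 6% better performance, 2% lower $/SpecInt
}

relative_cost = {
    name: round(implied_relative_cost(perf, cost_per_perf), 4)
    for name, (perf, cost_per_perf) in configs.items()
}

for name, cost in relative_cost.items():
    print(f"{name}: {cost:.4f}x baseline list price")
```

Under that assumption, WDO would cost slightly more in absolute terms while delivering more SpecInt per dollar.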
[Figure: rack diagrams for the compared configurations, showing ProLiant DL360 Gen9, Apollo 2000, Apollo 4200 Gen9, and Apollo 4500 Gen9 servers with HP 5900 and HP FlexFabric 5930 series switches]
Normalized on: list price, SpecInt
Hyperscale Price/Performance
Compared with Conventional Cluster
[Figure: rack diagrams showing ProLiant DL380p Gen8 servers, and Apollo 2000 with Apollo 4200 Gen9 servers, with ProLiant DL360 Gen9 servers and HP 5900 and HP FlexFabric 5930 series switches]
[Chart: benchmark results, normalized on list price and storage capacity]
Series: ProLiant DL380; Apollo 2000 + Apollo 4200
Benchmarks: HadoopAggregation, HadoopJoin, HadoopScan, HadoopSort, HadoopTerasort, HadoopWordcount, HadoopGzip, HadoopGunzip, hperead, HPeWrite, ScalaSparkAggregat…, ScalaSparkJoin, ScalaSparkScan, ScalaSparkBayes, ScalaSparkPagerank
Data labels (in extraction order): 14%, 31%, 22%, 14%, 13%, 18%, 16%, 15%, 5%, 7%, 27%, 2x, -3%, 70%, 6x
HOT Data BALANCED COLD Data
Independent scaling of Compute and Storage
HOT Data *
− 2.8x compute
− 97% of the storage capacity
− 4x the memory
Balanced *
− 1.6x compute
− 1.5x the storage capacity
− 2.5x the memory
Cold Data *
− 0.9x of the compute
− 2.1x the storage capacity
− 1.5x the memory
Hyperscale vs. Conventional Scale-out
* Compared with balanced, conventional full rack cluster
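To make the multipliers concrete, the following minimal sketch applies the three profiles to a baseline rack. The baseline figures (400 cores, 1000 TB, 4096 GB) are hypothetical placeholders, not numbers from the deck, and the class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    compute: float  # multiplier versus the conventional full-rack cluster
    storage: float
    memory: float


@dataclass(frozen=True)
class Rack:
    cores: float
    storage_tb: float
    memory_gb: float


# Multipliers as listed on the slide.
PROFILES = {
    "hot": Profile(compute=2.8, storage=0.97, memory=4.0),
    "balanced": Profile(compute=1.6, storage=1.5, memory=2.5),
    "cold": Profile(compute=0.9, storage=2.1, memory=1.5),
}


def scale(baseline: Rack, profile: Profile) -> Rack:
    """Apply a profile's multipliers to a baseline rack."""
    return Rack(
        cores=baseline.cores * profile.compute,
        storage_tb=baseline.storage_tb * profile.storage,
        memory_gb=baseline.memory_gb * profile.memory,
    )


# Hypothetical conventional rack; replace with real figures before drawing conclusions.
baseline = Rack(cores=400, storage_tb=1000, memory_gb=4096)
scaled = {name: scale(baseline, profile) for name, profile in PROFILES.items()}

for name, rack in scaled.items():
    print(f"{name}: {rack.cores:.0f} cores, {rack.storage_tb:.0f} TB, {rack.memory_gb:.0f} GB")
```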
Hyperscale benefits for Big Data
YARN node labels feature (JIRA YARN-796)
– Contributed the node labels concept to Apache Hadoop (2.6 trunk)
– Allows scheduling of YARN containers onto specific pools of nodes
– Combined with Hyperscale approach, compute nodes can be dynamically assigned because no data needs to be repartitioned
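As a rough illustration of time-based reassignment, the sketch below picks a label from the schedule in the slide's diagram (data prep 12am to 6am, predictive analytics 6am to 12am) and builds the argument list for `yarn rmadmin -replaceLabelsOnNode`. The label names and hostnames are hypothetical, the `node=label` syntax is the form documented for Hadoop 2.7 and later and should be checked against your version, and the labels are assumed to have been registered already with `yarn rmadmin -addToClusterNodeLabels`. The code only builds the command; it does not run it.

```python
from datetime import time

# Hypothetical label names; they must already exist in the cluster.
DATA_PREP_LABEL = "data_prep"
PREDICTIVE_LABEL = "predictive"


def label_for(now: time) -> str:
    """Return the node label for the current time of day (12am-6am is data prep)."""
    return DATA_PREP_LABEL if now < time(6, 0) else PREDICTIVE_LABEL


def replace_labels_command(nodes: list[str], label: str) -> list[str]:
    """Build the argv that would move every node in the pool to one label."""
    mapping = " ".join(f"{node}={label}" for node in nodes)
    return ["yarn", "rmadmin", "-replaceLabelsOnNode", mapping]


# Hypothetical compute pool; storage nodes keep their data and are not touched.
compute_pool = ["compute01.example.com", "compute02.example.com"]
command = replace_labels_command(compute_pool, label_for(time(3, 0)))
print(" ".join(command))
```

Because the data stays on the storage nodes, relabeling changes only where containers are scheduled, which is the point the slide makes about avoiding repartitioning.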
[Figure: Hadoop Cluster 1 (High Power), a pool of compute nodes over storage nodes, running Hive/Tez, Vertica SQL on Hadoop, and Spark; data prep (12am – 6am), predictive analytics (6am – 12am)]
A Modern “Sparkitecture” for Real-time Analytics (SMACK)
Data sources and formats: Cassandra, Mongo, HDFS, Kafka, ORC, Oracle, Vertica, Parquet, HANA, Vora
Spark: Spark SQL, Batch, ML, Streaming, Graph
Vertica Cluster: MPP Columnar DW (Vertica DW, Vertica DW(s))
Vertica Cluster: Multi-DC DR (Vertica DW, Vertica DW(s))
Kafka (Mirrored)
Dual Ingestion Pipeline
Vertica
Spark
Vertica
Kafka
Connector
HDFS
/AVRO/Topic/YYYY/MM/DD/HH
Semi-Structured
Data
HDFS
/ORC/Topic/YYYY/MM/DD
Structured Data
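The two HDFS layouts on the slide, hourly for Avro (/AVRO/Topic/YYYY/MM/DD/HH) and daily for ORC (/ORC/Topic/YYYY/MM/DD), can be sketched as path builders. The function names and the topic name below are hypothetical, and using UTC timestamps is an assumption; the deck does not say which time zone the partitions use.

```python
from datetime import datetime, timezone


def avro_path(topic: str, ts: datetime) -> str:
    """Hourly partition for semi-structured Avro data."""
    return f"/AVRO/{topic}/{ts:%Y/%m/%d/%H}"


def orc_path(topic: str, ts: datetime) -> str:
    """Daily partition for structured ORC data."""
    return f"/ORC/{topic}/{ts:%Y/%m/%d}"


# Hypothetical topic and event time.
event_time = datetime(2016, 6, 9, 14, 30, tzinfo=timezone.utc)
print(avro_path("clicks", event_time))  # /AVRO/clicks/2016/06/09/14
print(orc_path("clicks", event_time))   # /ORC/clicks/2016/06/09
```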
Non-relational
Hadoop Cluster (Hive)
Hortonworks, Cloudera, AWS
Spark Parallel
Streaming
Transformations
Near real-time
Transform/ReShape
Mapping to Vertica SOT
Scala, Java, Python, SQL
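The transform/reshape step can be pictured as flattening a nested JSON message into a flat row whose columns match a target table. This is a plain-Python sketch of that idea, not the presenters' Spark job: the message fields, the column list, and the function names are all hypothetical, and a real pipeline would run equivalent logic inside Spark transformations.

```python
import json


def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into one level, joining keys with underscores."""
    row = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            row.update(flatten(value, prefix=f"{name}_"))
        else:
            row[name] = value
    return row


def to_row(message: bytes, columns: tuple[str, ...]) -> tuple:
    """Map one JSON message onto the target column order; missing fields become None."""
    record = flatten(json.loads(message))
    return tuple(record.get(column) for column in columns)


# Hypothetical target columns and message.
COLUMNS = ("event_id", "user_id", "device_os")
message = b'{"event_id": 7, "user": {"id": 42}, "device": {"os": "ios"}}'
row = to_row(message, COLUMNS)
print(row)
```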
Hadoop™ Spark™
Cluster
Spark
Vertica
Loader
Web
Mobile
IoT
MySQL
Applications
Centralized Data Hub
Kafka™ Cluster
Data Center Boundary
Parallel Streaming Transformation Loader
Confluent™ REST Proxy
JSON/AVRO Messages
Kafka Connect
Infrastructure trends affecting Big Data architecture
Workload optimization
− Low-power SoCs and other accelerators giving rise to workload-optimized servers
Faster network fabric
− Dramatic increase in fabric speeds
Multi-temperature storage
− Enterprise adoption of tiering and accelerated (NVMe, flash, etc.) storage
Container-based apps
− Running multiple container apps under a common resource manager (YARN)
Big Data
Elastic Platform for Analytics long term view
Evolve to support multiple compute and storage blocks
Low Cost Nodes
SSD Nodes, Disk Nodes, Archive Nodes
Multi-temperature, density-optimized storage using HDFS tiering, NoSQL stores, and object stores
GPU Nodes, FPGA Nodes, Big Memory Nodes
Density- and workload-optimized compute nodes to accelerate various big data software
Delivering a Scalable Data Lake for the Enterprise
• Unlock the most value and performance from Hadoop
• Scale without compromising data security, reliability, and ROI
• Enterprise-Grade, Trusted, and Proven HPE solution
Optimize the Hadoop Data Lake for More Business Value
Elastic Platform for Analytics (Workload and Density Optimized)
High-Performing Analytics Engines for Hadoop
Consulting & Implementation Services for Hadoop
Data Security for Hadoop
Thank you
19

More Related Content

What's hot

Cloudy with a Chance of Hadoop - Real World Considerations
Cloudy with a Chance of Hadoop - Real World ConsiderationsCloudy with a Chance of Hadoop - Real World Considerations
Cloudy with a Chance of Hadoop - Real World ConsiderationsDataWorks Summit/Hadoop Summit
 
End-to-End Security and Auditing in a Big Data as a Service Deployment
End-to-End Security and Auditing in a Big Data as a Service DeploymentEnd-to-End Security and Auditing in a Big Data as a Service Deployment
End-to-End Security and Auditing in a Big Data as a Service DeploymentDataWorks Summit/Hadoop Summit
 
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesScaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesDataWorks Summit
 
Disaster Recovery and Cloud Migration for your Apache Hive Warehouse
Disaster Recovery and Cloud Migration for your Apache Hive WarehouseDisaster Recovery and Cloud Migration for your Apache Hive Warehouse
Disaster Recovery and Cloud Migration for your Apache Hive WarehouseDataWorks Summit
 
Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?DataWorks Summit/Hadoop Summit
 
Apache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community UpdateApache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community UpdateDataWorks Summit
 
Big Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the ExpertsBig Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the ExpertsDataWorks Summit/Hadoop Summit
 
Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...DataWorks Summit
 
Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)
Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)
Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)Cedric CARBONE
 
Hortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks Technical Workshop - Operational Best Practices WorkshopHortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks Technical Workshop - Operational Best Practices WorkshopHortonworks
 
Non-Stop Hadoop for Hortonworks
Non-Stop Hadoop for Hortonworks Non-Stop Hadoop for Hortonworks
Non-Stop Hadoop for Hortonworks Hortonworks
 
Hadoop from Hive with Stinger to Tez
Hadoop from Hive with Stinger to TezHadoop from Hive with Stinger to Tez
Hadoop from Hive with Stinger to TezJan Pieter Posthuma
 

A New "Sparkitecture" for modernizing your data warehouse

  • 1. A New “Sparkitecture” for Modernizing Your Data Warehouse. Ranga Nathan (Big Data Solutions Product Management), Jack Gudenkauf (Big Data Professional Services Architect). June 9, 2016
  • 2. 40% of enterprises struggle to identify, integrate, and manage Big Data with existing technology. Barriers: more data to identify, more complexity to integrate, more systems to manage.
  • 3. Hadoop and the data lake
      Vision:
      • Data-centric foundation for all data and apps
      • Elastic data management & compute platform for all data
      • Single platform for all analytical workloads
      Reality:
      • Data swamps due to lack of oversight and data governance
      • Dearth of skilled resources to extract value from the data
      • Sub-optimal performance with traditional architectures
      • Cannot scale to handle multi-tenant workload complexity
      [Diagram labels: App, App, App; Hadoop Data Lake; Data Ponds]
  • 4. Conventional Wisdom Regarding Deploying a Data Lake Infrastructure
      Use cases:
      Data Lakes & Hubs
      • Ingestion of multiple types and sources of data
      • Aggregation, Transformation and Visualization
      • Batch, Interactive, Real-time workloads
      Data Warehouse Modernization
      • Data Staging & landing zone
      • Migration of operational data stores
      • Active archiving
      • Batch workloads
      Platforms shown (Traditional and Density-Optimized): ProLiant DL380, Apollo 4530, Apollo 4200
      − Traditional Hadoop architecture
      − Batch workloads with predictable growth
      − Lowers Big Data costs for larger deployments
      − Match compute to workload
      − Large internal storage
      − Ideal for large data volumes and batch workloads where density or cost per GB is key
  • 5. A Big Data Journey… Stages shown: ETL Offload, Archival, Deep Learning, Event Processing, In-Memory Analytics
  • 6. HPE Elastic Platform for Analytics: Flexible Convergence for Big Data Workloads
      − Low Latency Compute (Event Processing): Moonshot M710P
      − Big Memory Compute (In-Memory Analytics): Apollo xl170r with 512G memory
      − Archival Storage: Apollo 4200 with 6TB HDD
      − High Latency Compute (ETL Offload and Archival): Apollo xl170 with 256G memory
      − HPC Compute (Deep Learning): Apollo xl190r with GPUs
      − HDFS Storage: Apollo 4200 with 3TB HDD
  • 7. HPE Elastic Platform for Analytics: HPE Workload- and Density-Optimized (WDO) Solution. Innovation delivering unique value to customers and the open source community
      Unique value:
      − Data Consolidation: shared storage pool for multiple Big Data environments
      − Maximum Elasticity: dynamic cluster provisioning from compute pools without repartitioning data
      − Flexible Scalability: scale compute and storage independently
      − Breakthrough Economics: workload-optimized components for better density, cost and power
      [Diagram labels: Ethernet; HPE Apollo 4xx0; HPE Moonshot or HPE Apollo]
  • 8. Building blocks for the HPE Elastic Platform for Analytics
      − HP Apollo 2000 System: a density-optimized compute platform that offers double the density of traditional 1U servers and high memory/core ratios. Workload-optimized compute nodes for Spark, Hive/Tez, MapReduce, YARN, Vertica SQL on Hadoop, and other analytics and batch workloads.
      − HP Moonshot System: a complete server system engineered for specific workloads and delivered in a dense, energy-efficient package. Workload-optimized compute nodes for HBase, Kafka and other low-latency, streaming workloads. Ideal for highest density + lowest power requirements.
      − HP Apollo 4200 Scalable System: a cost-effective, industry-standard storage server purpose-built for big data, with converged infrastructure that offers high-density, energy-efficient storage. Workload-optimized storage nodes for HDFS, Kudu, and building a multi-temperate data lake environment. Ideal for EDW Offload workloads, and a foundation storage block for workload consolidation.
      − HP ProLiant DL300 System: the industry’s most popular server balances the latest compute and memory technologies with internal storage and flash options, coupled with industry-leading management and serviceability. Balanced worker node for batch and single-function use cases. Scalable storage node, with smaller fault domain than Apollo 4200, for workload consolidation use cases.
  • 9. Yes, But Does It Perform?
      [Diagram: Hyperscale Hadoop (compute nodes and HDFS storage nodes, Ethernet w/o RoCE) alongside Conventional Hadoop (storage nodes with HDFS)]
      − 4 worker nodes, 4 storage nodes: MapReduce read 9.2 GB/s, MapReduce write 7.4 GB/s
      − Moonshot with 45 M710: read 4.9 GB/s, write 3.4 GB/s
  • 10. Comparing Configurations – Single Subject Data Mart Use Case. Normalized on CPU and list price (SpecInt)
      − Balanced: 18 worker nodes (symmetric), ProLiant DL380 Gen9
      − BDO: 6 worker node blocks (18 nodes), Apollo 4530. Same performance (SpecInt), 7% higher $/SpecInt
      − WDO: 16 compute nodes + 4 storage nodes, Apollo 2000/4200. 6% better performance (SpecInt), 2% lower $/SpecInt
      [Rack diagrams: HP 5900 Series (JG510A) and HP FlexFabric 5930 Series (JG726A) switches, ProLiant DL360 Gen9, Apollo 4200 Gen9, Apollo 4500 Gen9, and Apollo 2000 System with SAS 900 GB 10K drives]
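The percentages on this slide can be turned into implied relative list prices. The sketch below assumes that $/SpecInt is simply list price divided by SpecInt performance and that the Balanced DL380 configuration is the 1.00 baseline; neither assumption is stated in the deck, and the function name is illustrative only.

```python
def implied_relative_price(perf_ratio: float, cost_per_perf_ratio: float) -> float:
    """Relative list price implied by relative performance and relative $/SpecInt.

    Assumption: cost_per_perf = price / perf, hence price = perf * cost_per_perf.
    """
    return perf_ratio * cost_per_perf_ratio


# (performance ratio, $/SpecInt ratio) relative to the Balanced baseline, per the slide.
configs = {
    "Balanced": (1.00, 1.00),
    "BDO": (1.00, 1.07),  # same performance, 7% higher $/SpecInt
    "WDO": (1.06, 0.98),  # 6% better performance, 2% lower $/SpecInt
}

relative_prices = {
    name: round(implied_relative_price(perf, cost), 4)
    for name, (perf, cost) in configs.items()
}

for name, price in relative_prices.items():
    print(f"{name}: {price:.4f}x baseline list price")
```

Under these assumptions the WDO configuration would carry roughly a 3.9% higher list price than the baseline while delivering 6% more SpecInt, which is where the 2% lower $/SpecInt comes from.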
  • 11. Hyperscale Price/Performance Compared with Conventional Cluster. Normalized on list price and storage capacity. Compared: ProLiant DL380 versus Apollo 2000 + Apollo 4200
      − Chart benchmark labels, in extracted order: HadoopAggregation, HadoopJoin, HadoopScan, HadoopSort, HadoopTerasort, HadoopWordcount, HadoopGzip, HadoopGunzip, hperead, HPeWrite, ScalaSparkAggregat…, ScalaSparkJoin, ScalaSparkScan, ScalaSparkBayes, ScalaSparkPagerank
      − Chart values, in extracted order: 14%, 31%, 22%, 14%, 13%, 18%, 16%, 15%, 5%, 7%, 27%, 2x, -3%, 70%, 6x
      [Rack diagrams: HP 5900 Series (JG336A, JG510A) and HP FlexFabric 5930 Series (JG726A) switches; ProLiant DL380p Gen8 with SATA 7.2K 4.0 TB drives; ProLiant DL360 Gen9 with SAS 1.2 TB 10K drives; Apollo 4200 Gen9 with SATA 7.2K 4.0 TB drives; Apollo 2000 System with SATA 1.0 TB 7.2K drives]
  • 12. Hyperscale vs. Conventional Scale-out: independent scaling of compute and storage
      − Hot Data *: 2.8x compute, 97% of the storage capacity, 4x the memory
      − Balanced *: 1.6x compute, 1.5x the storage capacity, 2.5x the memory
      − Cold Data *: 0.9x the compute, 2.1x the storage capacity, 1.5x the memory
      * Compared with balanced, conventional full rack cluster
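One way to read these multipliers is as a compute-to-storage ratio relative to the conventional full-rack cluster. The sketch below only divides the slide's own numbers, treating "97% of the storage capacity" as 0.97x; the dictionary layout and function name are illustrative assumptions.

```python
# Multipliers relative to a balanced, conventional full-rack cluster (from the slide).
profiles = {
    "hot": {"compute": 2.8, "storage": 0.97, "memory": 4.0},
    "balanced": {"compute": 1.6, "storage": 1.5, "memory": 2.5},
    "cold": {"compute": 0.9, "storage": 2.1, "memory": 1.5},
}


def compute_to_storage(profile: dict) -> float:
    """Compute multiplier divided by storage multiplier, rounded to 2 decimals."""
    return round(profile["compute"] / profile["storage"], 2)


ratios = {name: compute_to_storage(p) for name, p in profiles.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x compute per unit of storage")
```

The spread between the hot and cold ratios is a rough measure of how far the two resources can be scaled apart within one rack.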
  • 13. Hyperscale benefits for Big Data
      Hadoop Labels feature (JIRA YARN-796)
      – Contributed node labels concepts into Hadoop (Apache 2.6 trunk)
      – Allows scheduling of YARN containers to specific pools of nodes
      – Combined with the Hyperscale approach, compute nodes can be dynamically assigned because no data needs to be repartitioned
      [Diagram: Hadoop Cluster 1 and Hadoop Cluster 1 (High Power) with Hive/Tez, Vertica SQL on Hadoop, and Spark; data prep (12am – 6am) and predictive analytics (6am – 12am); storage nodes and a pool of compute nodes]
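The time-of-day reassignment in this slide can be sketched as a small helper that picks a node label per schedule window and builds the matching relabel command. The label names and hostnames below are hypothetical, the command string is only printed and never executed, and the exact `yarn rmadmin` syntax should be checked against the documentation for the Hadoop version in use.

```python
from datetime import time

# Hypothetical label names for the two workloads shown on the slide.
DATA_PREP = "dataprep"
ANALYTICS = "analytics"


def label_for(now: time) -> str:
    """Pick the pool label for a time of day.

    Per the slide: data prep runs 12am-6am, predictive analytics 6am-12am.
    """
    return DATA_PREP if now < time(6, 0) else ANALYTICS


def relabel_command(nodes: list[str], label: str) -> str:
    """Build a `yarn rmadmin -replaceLabelsOnNode` invocation as a string."""
    mapping = " ".join(f"{node}={label}" for node in nodes)
    return f'yarn rmadmin -replaceLabelsOnNode "{mapping}"'


# Hypothetical compute-node hostnames.
compute_nodes = ["compute01", "compute02"]

print(relabel_command(compute_nodes, label_for(time(3, 30))))
print(relabel_command(compute_nodes, label_for(time(14, 0))))
```

Because the storage nodes are separate, only the label mapping changes between windows; no HDFS data moves.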
  • 14. A Modern “Sparkitecture” for Real-time Analytics (SMACK)
      [Diagram labels: Cassandra, Mongo, HDFS, Kafka, ORC, Oracle, Vertica, HANA Vora, Parquet; Spark with Spark SQL, Batch, ML, Streaming, Graph]
      [Rack diagrams: HP 5900 Series Switch JG510A, ProLiant DL360 Gen9 with SAS 900 GB 10K drives, Apollo 4200 Gen9]
  • 15. [Architecture diagram with numbered steps 1 to 11; labels as extracted]
      − Applications: Web, Mobile, IoT, MySQL
      − Confluent™ REST Proxy (JSON/AVRO messages); Kafka Connect
      − Centralized Data Hub: Kafka™ Cluster; Kafka (Mirrored); Data Center Boundary
      − Hadoop™ Spark™ Cluster: Spark parallel streaming transformations; near real-time transform/reshape; mapping to Vertica SOT; Scala, Java, Python, SQL; Parallel Streaming Transformation Loader; Spark Vertica Loader
      − Non-relational Hadoop Cluster (Hive): Hortonworks, Cloudera, AWS
      − HDFS /AVRO/Topic/YYYY/MM/DD/HH: semi-structured data
      − HDFS /ORC/Topic/YYYY/MM/DD: structured data
      − Dual Ingestion Pipeline: Vertica, Spark, Vertica Kafka Connector
      − Vertica Cluster (MPP Columnar DW): Vertica DW, Vertica DW(s)
      − Vertica Cluster (Multi-DC DR): Vertica DW, Vertica DW(s)
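The two HDFS directory layouts named on this slide (hourly for AVRO, daily for ORC) can be sketched as path builders. This is a minimal illustration: the topic name `clicks` is hypothetical, and the slide does not say whether the date comes from event time or ingestion time, so the timestamp argument is an assumption.

```python
from datetime import datetime


def avro_path(topic: str, ts: datetime) -> str:
    """Hourly directory for semi-structured data: /AVRO/<topic>/YYYY/MM/DD/HH."""
    return f"/AVRO/{topic}/{ts:%Y/%m/%d/%H}"


def orc_path(topic: str, ts: datetime) -> str:
    """Daily directory for structured data: /ORC/<topic>/YYYY/MM/DD."""
    return f"/ORC/{topic}/{ts:%Y/%m/%d}"


# Hypothetical topic and timestamp.
event_time = datetime(2016, 6, 9, 14, 5)
print(avro_path("clicks", event_time))
print(orc_path("clicks", event_time))
```

Zero-padded date components keep the directories in lexical order, which lets a job select a time range by path prefix.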
  • 16. Infrastructure trends affecting Big Data architecture
      − Workload optimization: low-power SoCs and other accelerators giving rise to workload-optimized servers
      − Faster network fabric: dramatic increase in fabric speeds
      − Multi-temperate storage: enterprise adoption of tiering with accelerated (NVMe, flash, etc.) storage
      − Container-based apps: running multiple container apps while hosting a common resource management layer (YARN)
  • 17. Elastic Platform for Analytics: long-term view. Evolve to support multiple compute and storage blocks
      − Low Cost Nodes, SSD Nodes, Disk Nodes, Archive Nodes: multi-temperate, density-optimized storage using HDFS tiering, NoSQL stores and object stores
      − GPU Nodes, FPGA Nodes, Big Memory Nodes: density- and workload-optimized compute nodes to accelerate various big data software
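HDFS tiering of the kind mentioned here is commonly driven by storage policies. The sketch below maps data age to a policy name and builds the corresponding command string; the age thresholds and the path are hypothetical, the mapping of SSD, disk, and archive nodes to `ALL_SSD`, `HOT`, and `COLD` is an assumption rather than something the deck specifies, and the command is printed, not run.

```python
def storage_policy(age_days: int) -> str:
    """Map data age to an HDFS storage policy name (thresholds are hypothetical)."""
    if age_days < 7:
        return "ALL_SSD"  # assumed to land on SSD nodes
    if age_days < 90:
        return "HOT"  # assumed to land on disk nodes
    return "COLD"  # assumed to land on archive nodes


def set_policy_command(path: str, age_days: int) -> str:
    """Build an `hdfs storagepolicies -setStoragePolicy` invocation as a string."""
    policy = storage_policy(age_days)
    return f"hdfs storagepolicies -setStoragePolicy -path {path} -policy {policy}"


# Hypothetical path and ages.
for age in (1, 30, 365):
    print(set_policy_command("/ORC/clicks/2016/06/09", age))
```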
  • 18. Delivering a Scalable Data Lake for the Enterprise
      • Unlock the most value and performance from Hadoop
      • Scale without compromising data security, reliability, and ROI
      • Enterprise-grade, trusted, and proven HPE solution
      Optimize the Hadoop Data Lake for More Business Value:
      − Elastic Platform for Analytics (Workload and Density Optimized)
      − High-Performing Analytics Engines for Hadoop
      − Consulting & Implementation Services for Hadoop
      − Data Security for Hadoop