Unlocking Big Data
Infrastructure Efficiency
with Storage Disaggregation
Anjaneya “Reddy” Chagam
Chief SDS Architect, Data Center Group, Intel Corporation
Intel Confidential
Agenda
•  Data Growth Challenges
•  Need for Storage Disaggregation
•  Hadoop Over Ceph (Block)
•  Summary
2
Intel Confidential 3
Challenges for Cloud Service Providers
Tier-2 cloud service providers (CSPs) must meet the demands of fast data growth while driving differentiation and value-added services.
•  Petabyte-scale data footprints are common.
•  A >35-percent annual rate of storage growth is expected.1
•  Nearly continuous acquisition of storage is needed.
•  Inefficiencies of storage acquisition are magnified over time.
1 IDC. “Extracting Value from Chaos.” Sponsored by EMC Corporation. June 2011. emc.com/collateral/analyst-reports/idc-extracting-value-from-chaos-ar.pdf.
Intel Confidential 4
Challenges with Scaling Apache Hadoop* Storage
Native Hadoop storage and compute can’t be scaled independently; both storage and compute resources are bound to Hadoop nodes.
•  Excess compute capacity: when more storage is needed, IT ends up with more compute than it needs.
•  Inefficient resource allocation and IT spending result.
•  These inefficiencies are highly consequential for large firms such as tier-2 CSPs.
Intel Confidential 5
Challenges with Scaling Apache Hadoop* Storage
Native Hadoop storage can be used only for Hadoop workloads, so additional storage is needed for non-big-data workloads.
•  Greater investments are required for other workloads: higher IT costs.
•  Multiple storage environments are needed: low storage-capacity utilization across workloads.
•  No multi-tenancy support in Hadoop: decreased operational agility.
•  Lack of a central, unified storage technology: data from other storage environments and applications must be replicated to the Hadoop cluster on a regular basis, producing unsustainable “data islands” that increase total cost of ownership (TCO) and reduce decision agility.
Intel Confidential 6
Solution: Apache Hadoop* with Ceph*
Use Ceph instead of local, direct-attached hard drives for back-end storage:
•  Disaggregate Hadoop storage and compute.
•  Ceph is open source and scalable, and provides storage for all data types.
Optimize performance with Intel® technologies:
•  Intel® Xeon® processors and Intel network solutions
•  Intel® Cache Acceleration Software (Intel® CAS)
•  Intel® Solid-State Drives (SSDs) using high-speed Non-Volatile Memory Express* (NVMe*)
Results:
•  Compute and storage scale separately.
•  Unified storage for all enterprise needs.
•  Increased organizational agility.
•  More efficient use of IT resources.
Intel Confidential 7
Advantages of Ceph* Storage vs. Local Storage
•  Open source; free if self-supported.
•  Supports all data types: file, block, and object data.
•  Provides one centralized, standardized, and scalable storage solution for all enterprise needs.
•  Supports many different workloads and applications.
•  Works on commodity hardware.
Intel Confidential 8
Apache Hadoop* with Ceph* Storage: Logical Architecture
HDFS + YARN, supporting SQL, in-memory, MapReduce, NoSQL, stream, search, and custom workloads
Deployment Options
•  Hadoop Services: Virtual, Container or Bare Metal
•  Storage Integration: Ceph Block, File or Object
•  Data Protection: HDFS and/or Ceph replication or Erasure Codes
•  Tiering: HDFS and/or Ceph tiering
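For the Ceph Block integration option, each Hadoop DataNode typically consumes RBD images as local block devices and lists their mount points among its HDFS data directories. A minimal sketch, assuming a pool named hadoop, one 6 TiB image per data directory, and the kernel RBD client; the pool, image, and mount-point names here are illustrative, with the placement-group counts taken from the test ceph.conf shown later:
# Create one RBD image, map it on the DataNode, and hand the mount point to HDFS (run as root).
ceph osd pool create hadoop 4800 4800        # pg/pgp counts from the test ceph.conf; run once
rbd create hadoop/blkdev23_0 --size 6291456  # 6 TiB image; rbd sizes are given in MiB
rbd map hadoop/blkdev23_0                    # exposes the image as /dev/rbd0 on this host
mkfs.xfs /dev/rbd0
mkdir -p /hadoop/blkdev23_0
mount /dev/rbd0 /hadoop/blkdev23_0
# Finally, add /hadoop/blkdev23_0 to dfs.datanode.data.dir (comma-separated list) and restart the DataNode.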
Intel Confidential
Apache Hadoop* with Ceph* on QCT Platform*
Physical architecture (QCT Solution Center*): Hadoop nodes, Ceph monitors, and 7 Ceph OSD* nodes, interconnected by dedicated networking.
Intel Confidential 10
QCT Test Lab Environment (Cloudera Hadoop 5.7.0 & Ceph Jewel 10.2.1/FileStore)
Hadoop cluster (diagram summarized):
•  RMS32 (management): Cloudera management roles (AP, ES, HM, SM).
•  Hadoop24 (Name Node): NN plus SNN, B, JHS, RM, G, and S roles; boots from HDD (boot and CDH).
•  Hadoop11-14 and Hadoop21-23 (Data Nodes): DN, NM, G, and S roles; each Data Node mounts RBD-backed block devices as /blkdev<Host#>_{0..11}, 6 TB each (110 RBD volumes shown); boot from HDD (boot and CDH).
•  Hadoop public networks: 10.10.241.0/24 and 10.10.242.0/24 on p10p1; Hadoop private/cluster network: 10.10.150.0/24 on p10p2.
Ceph cluster (diagram summarized):
•  StarbaseMON41-43: Ceph monitors (MON), SSD boot (and mon) drives, bonded NICs (p255p1+p255p2).
•  Starbase51-57: 7 OSD nodes; each runs 24 OSDs on 6 TB HDDs, with two NVMe SSDs (nvme0n1, nvme1n1) shared between Intel CAS caching and Ceph journals, plus an SSD boot drive; bonded NICs (p255p1+p255p2).
•  Ceph private/cluster networks: 10.10.100.0/24 and 10.10.200.0/24 on p2p1.
NOTE: BMC management network is not shown. HDFS replication 1, Ceph replication 2.
*Other names and brands may be claimed as the property of others.
Intel Confidential 11
Intel CAS and Ceph Journal Configuration
Per OSD node, the two NVMe devices are split so that each holds the Ceph journals for half of the HDDs and the Intel CAS cache for the other half: one NVMe holds the Ceph journals for HDD13-24 and the CAS cache for HDD1-12, while the other holds the journals for HDD1-12 and the cache for HDD13-24 (the original diagram contrasts the read and write I/O paths through these devices).
•  Ceph Journal[1-24]: 20 GB each, 480 GB in total
•  Intel CAS[1-4]: 880 GB each, ~3,520 GB in total
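The sizing above works out to twelve 20 GiB journal partitions plus two ~880 GiB cache partitions on each of the two NVMe devices. A minimal partitioning sketch for one device, assuming sgdisk on a GPT-labeled 2 TB P3700 (the device name and the use of sgdisk are assumptions; partition numbers mirror the CAS/journal mapping in the backup slides):
# Carve 12 x 20 GiB Ceph journal partitions and two Intel CAS cache partitions on one NVMe SSD.
DEV=/dev/nvme0n1
for i in $(seq 1 12); do
  sgdisk --new=${i}:0:+20G "$DEV"            # journal partition $i, 20 GiB
done
sgdisk --new=13:0:+880G "$DEV"               # CAS cache partition 1 (~880 GiB)
sgdisk --new=14:0:0 "$DEV"                   # CAS cache partition 2, remainder of the device
sgdisk --print "$DEV"                        # verify the resulting layout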
Validated Solution: Apache Hadoop* with Ceph* Storage
A highly performant proof-of-concept (POC) has been built by Intel and QCT.2
Disaggregate storage and compute in Hadoop by using Ceph storage instead of direct-attached storage (DAS).
Optimize performance with Intel® CAS and Intel® SSDs using NVMe*:
•  Resolve input/output (I/O) bottlenecks.
•  Provide better customer service-level-agreement (SLA) support.
•  Provide up to a 60-percent I/O performance improvement.2
HDFS replication 1, Ceph replication 2.
2 Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit intel.com/performance. For more information, see Legal Notices and Disclaimers.
Intel Confidential 13
Benefits of the Apache Hadoop* with Ceph* Solution
•  Independent scaling of storage and compute
•  Multi-protocol storage support
•  Resources can be used for any workload
•  Enhanced organizational agility
•  Decreased capital expenditures (CapEx)
•  No loss in performance
Intel Confidential 14
Find Out More
To learn more about Intel® CAS and request a trial copy, visit: intel.com/content/www/us/en/software/intel-cache-acceleration-software-performance.html
To find the Intel® SSD that’s right for you, visit: intel.com/go/ssd
To learn about QCT QxStor* Red Hat* Ceph* Storage Edition, visit: qct.io/solution/software-defined-infrastructure/storage-virtualization/qxstor-red-hat-ceph-storage-edition-p365c225c226c230
Thank you!
BACKUP SLIDES
Intel Confidential 17
Legal Notices and Disclaimers
1 IDC. “Extracting Value from Chaos.” Sponsored by EMC Corporation. June 2011.
emc.com/collateral/analyst-reports/idc-extracting-value-from-chaos-ar.pdf.
2 Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as
SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors
may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases,
including the performance of that product when combined with other products.
Configurations:
•  Ceph* storage nodes, each server: 16 Intel® Xeon® processor E5-2680 v3, 128 GB RAM, twenty-four 6 TB Seagate Enterprise* hard drives, and two 2 TB
Intel® Solid-State Drive (SSD) DC P3700 NVMe* drives with 10 gigabit Ethernet (GbE) Intel® Ethernet Converged Network Adapter X540-T2 network cards,
20 GbE public network, and 40 GbE private Ceph network.
•  Apache Hadoop* data nodes, each server: 16 Intel Xeon processor E5-2620 v3 single socket, 128 GB RAM, with 10 GbE Intel Ethernet Converged Network
Adapter X540-T2 network cards, bonded.
The difference between the Intel® Cache Acceleration Software (Intel® CAS) configuration and the baseline is that in the baseline Intel CAS is not caching and runs in pass-through mode; the change is software only, with no hardware changes needed. The tests used were TeraGen*, TeraSort*, TeraValidate*, and DFSIO*, which are industry-standard Hadoop performance tests. For more complete information, visit intel.com/performance.
Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced website and confirm
whether referenced data are accurate.
Optimization Notice: Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel
microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability,
functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are
intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the
applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice Revision #20110804
Intel Confidential 18
Legal Notices and Disclaimers
Intel technologies’ features and benefits depend on system configuration and may require hardware, software or service activation. Performance varies depending
on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.
Intel, the Intel logo, Intel. Experience What’s Inside, the Intel. Experience What’s Inside logo, and Xeon are trademarks of Intel Corporation in the U.S. and/or other
countries.
QCT, the QCT logo, Quanta, and the Quanta logo are trademarks or registered trademarks of Quanta Computer Inc.
Copyright © 2016 Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Intel’s Role in Storage
•  Advance the industry: open source and standards.
•  Build an open ecosystem: Intel® Storage Builders, 73+ partners.
•  End-user solutions: cloud and enterprise.
Intel technology leadership:
•  Storage-optimized platforms: Intel® Xeon® E5-2600 v4 platform, Intel® Xeon® processor D-1500 platform, Intel® Converged Network Adapters 10/40GbE, Intel® SSDs for DC & Cloud.
•  Storage-optimized software: Intel® Intelligent Storage Acceleration Library, Storage Performance Development Kit, Intel® Cache Acceleration Software.
•  SSD and non-volatile memory: interfaces (SATA, NVMe, PCIe); form factors (2.5”, M.2, U.2, PCIe AIC); new technologies (3D NAND, Intel® Optane™).
•  Cloud and enterprise partner storage solution architectures and next-generation solution architectures: Intel solution architects have deep expertise in Ceph for low-cost and high-performance usage, helping customers enable a modern storage infrastructure.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
Intel Confidential 20
NVM Express
Intel Confidential
3D XPoint™ and Ceph
First 3D XPoint™ use cases for BlueStore:
•  BlueStore backend, RocksDB backend, RocksDB WAL
Two methods for accessing PMEM devices:
•  Raw PMEM block device (libpmemblk)
•  DAX-enabled file system (mmap + libpmemlib)
(The original diagram shows BlueStore with RocksDB and BlueFS layered over PMEMDevices for data and metadata, accessed either through the libpmemblk/libpmemlib APIs or through mmap load/store on files in a DAX-enabled file system.)
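Of the two PMEM access methods above, the DAX path is the one that needs file-system preparation before BlueFS/RocksDB files can be mmap'ed with direct load/store access. A minimal sketch, assuming the kernel exposes a persistent-memory region as /dev/pmem0 and that XFS is used; the device name and mount point are assumptions:
# Put a DAX-capable file system on a PMEM device and mount it with the dax option
# so that mmap'ed files bypass the page cache (direct load/store).
mkfs.xfs /dev/pmem0
mkdir -p /mnt/pmem
mount -o dax /dev/pmem0 /mnt/pmem
mount | grep /mnt/pmem                       # confirm the dax mount option is active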
Intel Confidential
3D NAND: a Cost-Effective Ceph Solution
An enterprise-class, highly reliable, feature-rich, and cost-effective all-flash-array (AFA) solution:
•  NVMe SSD is today’s SSD, and 3D NAND or TLC SSD is today’s HDD: NVMe as journal, high-capacity SATA SSD or 3D NAND SSD as data store.
•  Provides high performance and high capacity in a more cost-effective solution, with special software optimization on the FileStore and BlueStore backends.
•  1M 4K random-read IOPS delivered by 5 Ceph nodes.
•  Cost effective: it would take on the order of 1,000 HDD-based Ceph nodes (10K HDDs) to deliver the same throughput.
•  High capacity: 100 TB in 5 nodes.
Example node configurations (diagram summarized):
•  Ceph node with one P3700 M.2 800 GB NVMe (3D XPoint™ class, journal) and four S3510 1.6 TB SATA SSDs (NAND data store).
•  Ceph node with P3700 & 3D XPoint™ SSDs (journal) and five P3520 4 TB NVMe 3D NAND SSDs (data store).
Intel Confidential
Test Setup (Linux OS)
/etc/sysctl.conf
vm.swappiness=10
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.core.netdev_max_backlog = 250000
/etc/security/limits.conf
* soft nofile 65536
* hard nofile 1048576
* soft nproc 65536
* hard nproc unlimited
* hard memlock unlimited
CPU Profile
echo performance > /sys/devices/system/cpu/cpu{0..n}/cpufreq/scaling_governor
Huge Pages
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
Network
ifconfig <eth> mtu 9000
ifconfig <eth> txqueuelen 1000
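On each node these settings are typically applied once after editing the files above; a minimal sketch, assuming bash run as root and that the NIC name matches the cluster interface in the test-lab diagram (the interface name is an assumption):
# Reload kernel tunables, pin the CPU governor, disable transparent huge pages, and enable jumbo frames.
sysctl -p /etc/sysctl.conf
for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
  echo performance > "$g"
done
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
ifconfig p10p2 mtu 9000                      # cluster-facing NIC; replace with the actual interface
ifconfig p10p2 txqueuelen 1000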
Intel Confidential
Test Setup (Ceph)
[global]
fsid = f1739148-3847-424d-b262-45d5b950fa3b
mon_initial_members = starbasemon41, starbasemon42, starbasemon43
mon_host = 10.10.241.41,10.10.241.42,10.10.242.43
auth_client_required = none
auth_cluster_required = none
auth_service_required = none
filestore_xattr_use_omap = true
osd_pool_default_size = 3 # Write an object 3 times (number of replicas).
osd_pool_default_min_size = 3 # Require all 3 copies to be available for I/O.
osd_pool_default_pg_num = 4800
osd_pool_default_pgp_num = 4800
public_network = 10.10.241.0/24, 10.10.242.0/24
cluster_network = 10.10.100.0/24, 10.10.200.0/24
debug_lockdep = 0/0
debug_context = 0/0
debug_crush = 0/0
debug_buffer = 0/0
debug_timer = 0/0
debug_filer = 0/0
debug_objecter = 0/0
debug_rados = 0/0
debug_rbd = 0/0
debug_ms = 0/0
debug_monc = 0/0
debug_tp = 0/0
debug_auth = 0/0
debug_finisher = 0/0
debug_heartbeatmap = 0/0
debug_perfcounter = 0/0
debug_asok = 0/0
debug_throttle = 0/0
debug_mon = 0/0
debug_paxos = 0/0
debug_rgw = 0/0
perf = true
mutex_perf_counter = true
throttler_perf_counter = false
rbd_cache = false
log_file = /var/log/ceph/$name.log
log_to_syslog = false
mon_compact_on_trim = false
osd_pg_bits = 8
osd_pgp_bits = 8
mon_pg_warn_max_object_skew = 100000
mon_pg_warn_min_per_osd = 0
mon_pg_warn_max_per_osd = 32768
Intel Confidential
Test Setup (Ceph)
[mon]
mon_host = starbasemon41, starbasemon42, starbasemon43
mon_data = /var/lib/ceph/mon/$cluster-$id
mon_max_pool_pg_num = 166496
mon_osd_max_split_count = 10000
mon_pg_warn_max_per_osd = 10000
[mon.a]
host = starbasemon41
mon_addr = 192.168.241.41:6789
[mon.b]
host = starbasemon42
mon_addr = 192.168.241.42:6789
[mon.c]
host = starbasemon43
mon_addr = 192.168.242.43:6789
[osd]
osd_mount_options_xfs = rw,noatime,inode64,logbsize=256k,delaylog
osd_mkfs_options_xfs = -f -i size=2048
osd_op_threads = 32
filestore_queue_max_ops = 5000
filestore_queue_committing_max_ops = 5000
journal_max_write_entries = 1000
journal_queue_max_ops = 3000
objecter_inflight_ops = 102400
filestore_wbthrottle_enable = false
filestore_queue_max_bytes = 1048576000
filestore_queue_committing_max_bytes = 1048576000
journal_max_write_bytes = 1048576000
journal_queue_max_bytes = 1048576000
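These FileStore and journal tunables live in ceph.conf, but they can also be pushed to running OSDs for quick experiments. A minimal sketch using ceph tell/injectargs with the values listed above; settings injected this way do not survive a daemon restart, so ceph.conf remains the source of truth:
# Push a few of the queue/journal tunables to all running OSDs without restarting them.
ceph tell osd.* injectargs '--filestore_queue_max_ops 5000 --filestore_queue_committing_max_ops 5000'
ceph tell osd.* injectargs '--journal_max_write_entries 1000 --journal_queue_max_ops 3000'
# Verify on one daemon (run on that OSD's host, which has access to its admin socket):
ceph daemon osd.0 config get filestore_queue_max_ops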
Intel Confidential
Test Setup (Hadoop)
Parameter | Value | Comment
yarn.nodemanager.resource.memory-mb (Container Memory) | 80.52 GiB | Default: amount of physical memory, in MiB, that can be allocated for containers. NOTE: In a different document, it recommends
yarn.nodemanager.resource.cpu-vcores (Container Virtual CPU Cores) | 48 | Default: number of virtual CPU cores that can be allocated for containers.
yarn.scheduler.maximum-allocation-mb (Container Memory Maximum) | 12 GiB | The largest amount of physical memory, in MiB, that can be requested for a container.
yarn.scheduler.maximum-allocation-vcores (Container Virtual CPU Cores Maximum) | 48 | Default: the largest number of virtual CPU cores that can be requested for a container.
yarn.scheduler.minimum-allocation-vcores (Container Virtual CPU Cores Minimum) | 2 | The smallest number of virtual CPU cores that can be requested for a container. If using the Capacity or FIFO scheduler (or any scheduler, prior to CDH 5), virtual core requests will be rounded up to the nearest multiple of this number.
mapreduce.job.split.metainfo.maxsize (JobTracker MetaInfo Maxsize) | 1000000000 | The maximum permissible size of the split metainfo file. The JobTracker won't attempt to read split metainfo files bigger than the configured value. No limit if set to -1.
mapreduce.task.io.sort.mb (I/O Sort Memory Buffer) | 400 MiB | To enable a larger block size without spills.
yarn.scheduler.minimum-allocation-mb | 2 GiB | Default: minimum container size.
mapreduce.map.memory.mb | 1 GiB | Memory required for each map container; may need to be increased for some applications.
mapreduce.reduce.memory.mb | 1.5 GiB | Memory required for each reduce container; may need to be increased for some applications.
mapreduce.map.cpu.vcores | 1 | Default: number of vcores required for each map container.
mapreduce.reduce.cpu.vcores | 1 | Default: number of vcores required for each reduce container.
mapreduce.job.heap.memory-mb.ratio | 0.8 (default) | This sets the Java heap size to 800/1200 MiB for mapreduce.{map|reduce}.memory.mb = 1/1.5 GiB.
Intel Confidential
Test Setup (Hadoop)
Parameter | Value | Comment
dfs.blocksize | 128 MiB | Default.
dfs.replication | 1 | Default block replication. The number of replications to make when the file is created. The default value is used if a replication number is not specified.
Java Heap Size of NameNode in Bytes | 4127 MiB | Default: maximum size in bytes for the Java process heap memory. Passed to Java -Xmx.
Java Heap Size of Secondary NameNode in Bytes | 4127 MiB | Default: maximum size in bytes for the Java process heap memory. Passed to Java -Xmx.

Parameter | Value | Comment
Memory overcommit validation threshold | 0.9 | Threshold used when validating the allocation of RAM on a host. 0 means all of the memory is reserved for the system; 1 means none is reserved. Values can range from 0 to 1.
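The legal notices list TeraGen*, TeraSort*, TeraValidate*, and DFSIO* as the tests run against this configuration. A minimal sketch of how such runs are commonly launched from a gateway node; the jar paths, data sizes, and file counts are assumptions, and DFSIO option names vary slightly between Hadoop releases:
# Generate, sort, and validate 1 TB of TeraSort data, then run a DFSIO write test.
EXAMPLES=/opt/cloudera/parcels/CDH/jars/hadoop-mapreduce-examples.jar             # path is an assumption
JOBCLIENT=/opt/cloudera/parcels/CDH/jars/hadoop-mapreduce-client-jobclient-tests.jar
hadoop jar "$EXAMPLES" teragen 10000000000 /benchmarks/teragen                    # 10^10 100-byte rows = 1 TB
hadoop jar "$EXAMPLES" terasort /benchmarks/teragen /benchmarks/terasort
hadoop jar "$EXAMPLES" teravalidate /benchmarks/terasort /benchmarks/teravalidate
hadoop jar "$JOBCLIENT" TestDFSIO -write -nrFiles 100 -size 1GB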
Intel Confidential
Test Setup (CAS NVMe, Journal NVMe)
NVMe0n1:
•  Ceph journals for the first 12 HDDs: /dev/nvme0n1p1 - /dev/nvme0n1p12, each partition 20 GiB.
•  Intel CAS for HDDs 13-24 (e.g. /dev/sdo - /dev/sdz) uses the rest of the free space, split evenly into two cache partitions:
   cache 1 /dev/nvme0n1p13 Running wo
     core 1 /dev/sdo1 -> /dev/intelcas1-1
     core 2 /dev/sdp1 -> /dev/intelcas1-2
     core 3 /dev/sdq1 -> /dev/intelcas1-3
     core 4 /dev/sdr1 -> /dev/intelcas1-4
     core 5 /dev/sds1 -> /dev/intelcas1-5
     core 6 /dev/sdt1 -> /dev/intelcas1-6
   cache 2 /dev/nvme0n1p14 Running wo
     core 1 /dev/sdu1 -> /dev/intelcas2-1
     core 2 /dev/sdv1 -> /dev/intelcas2-2
     core 3 /dev/sdw1 -> /dev/intelcas2-3
     core 4 /dev/sdx1 -> /dev/intelcas2-4
     core 5 /dev/sdy1 -> /dev/intelcas2-5
     core 6 /dev/sdz1 -> /dev/intelcas2-6
NVMe1n1:
•  Ceph journals for the remaining 12 HDDs: /dev/nvme1n1p1 - /dev/nvme1n1p12, each partition 20 GiB.
•  Intel CAS for HDDs 1-12 (e.g. /dev/sdc - /dev/sdn) uses the rest of the free space, split evenly into two cache partitions:
   cache 1 /dev/nvme1n1p13 Running wo
     core 1 /dev/sdc1 -> /dev/intelcas1-1
     core 2 /dev/sdd1 -> /dev/intelcas1-2
     core 3 /dev/sde1 -> /dev/intelcas1-3
     core 4 /dev/sdf1 -> /dev/intelcas1-4
     core 5 /dev/sdg1 -> /dev/intelcas1-5
     core 6 /dev/sdh1 -> /dev/intelcas1-6
   cache 2 /dev/nvme1n1p14 Running wo
     core 1 /dev/sdi1 -> /dev/intelcas2-1
     core 2 /dev/sdj1 -> /dev/intelcas2-2
     core 3 /dev/sdk1 -> /dev/intelcas2-3
     core 4 /dev/sdl1 -> /dev/intelcas2-4
     core 5 /dev/sdm1 -> /dev/intelcas2-5
     core 6 /dev/sdn1 -> /dev/intelcas2-6
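The cache/core mapping above is the kind of layout the casadm utility produces. A minimal sketch for the first cache instance on nvme0n1, assuming the "wo" mode and device names from the table; exact flag names can differ between Intel CAS releases, so treat this as illustrative:
# Start cache instance 1 on the first CAS partition in write-only mode, then attach six HDD
# partitions as cores; the accelerated devices appear as /dev/intelcas1-<core number>.
casadm -S -i 1 -d /dev/nvme0n1p13 -c wo
for core in /dev/sd{o,p,q,r,s,t}1; do
  casadm -A -i 1 -d "$core"
done
casadm -L                                    # list caches, cores, and the exported intelcas devices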
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 

Red Hat Storage Day New York - Intel Unlocking Big Data Infrastructure Efficiency with Storage Disaggregation

  • 10. QCT Test Lab Environment (Cloudera Hadoop 5.7.0 & Ceph Jewel 10.2.1/FileStore). [Physical architecture diagram, summarized:]
•  Hadoop side: one management node (RMS32), one name node (Hadoop24), and seven data nodes (Hadoop11-14, Hadoop21-23). Each data node mounts 6 TB RBD block devices (blkdev<Host#>_{0..11}; 110 RBD volumes in total) instead of local data disks.
•  Ceph side: three monitors (StarbaseMON41-43) and seven OSD nodes (Starbase51-57), each with twenty-four 6 TB HDDs (OSD 1-24), two NVMe drives (Intel CAS cache and Ceph journal), and SSD boot drives.
•  Networking: 10.10.241.0/24 and 10.10.242.0/24 public networks; 10.10.100.0/24, 10.10.150.0/24, and 10.10.200.0/24 private/cluster networks over bonded links.
NOTE: BMC management network is not shown. HDFS replication 1, Ceph replication 2.
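The RBD volumes above take the place of local data disks on each Hadoop data node. A minimal provisioning sketch is shown below, assuming a pool named "hadoop" and an image named "blkdev11_0"; the pool/image names, sizes, and mount points are illustrative, not taken from the deck.

    # Create a pool (PG counts follow the osd_pool_default_* values in the ceph.conf later in this deck).
    ceph osd pool create hadoop 4800 4800
    # Create, map, format, and mount one 6 TB image; repeat per blkdev<Host#>_<N>.
    rbd create hadoop/blkdev11_0 --size 6T --image-feature layering   # size-suffix support varies by Ceph release
    rbd map hadoop/blkdev11_0                                         # udev also exposes /dev/rbd/hadoop/blkdev11_0
    mkfs.xfs /dev/rbd/hadoop/blkdev11_0
    mkdir -p /hadoop/blkdev11_0
    mount -o noatime /dev/rbd/hadoop/blkdev11_0 /hadoop/blkdev11_0
    # The mount point is then added to dfs.datanode.data.dir on that data node.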
  • 11. Intel CAS and Ceph Journal Configuration. [Per-OSD-node diagram.] Each OSD node has 24 HDDs (HDD1-HDD24) and two NVMe drives, and the journal and cache roles are crossed between them: one NVMe holds the Ceph journals for HDD13-24 plus the Intel CAS cache for HDD1-12, while the other holds the journals for HDD1-12 plus the cache for HDD13-24.
•  Ceph Journal[1-24]: 20 GB each, 480 GB in total
•  Intel CAS[1-4]: 880 GB each, ~3,520 GB in total
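A sketch of how such a layout might be carved out with sgdisk follows; the device names match the deck, but the tooling and partition names are assumptions rather than the documented procedure.

    # 12 x 20 GiB journal partitions on each NVMe drive (p1-p12)...
    for part in $(seq 1 12); do
        sgdisk --new=${part}:0:+20G --change-name=${part}:"ceph journal" /dev/nvme0n1
        sgdisk --new=${part}:0:+20G --change-name=${part}:"ceph journal" /dev/nvme1n1
    done
    # ...then two ~880 GiB cache partitions per drive (p13-p14) for Intel CAS.
    for dev in /dev/nvme0n1 /dev/nvme1n1; do
        sgdisk --new=13:0:+880G ${dev}
        sgdisk --new=14:0:+880G ${dev}
    done
    # With FileStore (Jewel), each OSD is then prepared on a CAS-exposed device with its NVMe journal, e.g.:
    # ceph-disk prepare /dev/intelcas1-1 /dev/nvme0n1p1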
  • 12. Validated Solution: Apache Hadoop* with Ceph* Storage. A highly performant proof-of-concept (POC) has been built by Intel and QCT.2
•  Disaggregate storage and compute in Hadoop by using Ceph storage instead of direct-attached storage (DAS).
•  Optimize performance with Intel® CAS and Intel® SSDs using NVMe*: resolve input/output (I/O) bottlenecks, provide better customer service-level-agreement (SLA) support, and deliver up to a 60-percent I/O performance improvement.2
HDFS replication 1, Ceph replication 2.
2 Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit intel.com/performance. For more information, see Legal Notices and Disclaimers.
  • 13. Benefits of the Apache Hadoop* with Ceph* Solution: multi-protocol storage support; independent scaling of storage and compute; enhanced organizational agility; decreased capital expenditures (CapEx); no loss in performance; resources can be used for any workload.
  • 14. Find Out More.
•  To learn more about Intel® CAS and request a trial copy, visit: intel.com/content/www/us/en/software/intel-cache-acceleration-software-performance.html
•  To find the Intel® SSD that’s right for you, visit: intel.com/go/ssd
•  To learn about QCT QxStor* Red Hat* Ceph* Storage Edition, visit: qct.io/solution/software-defined-infrastructure/storage-virtualization/qxstor-red-hat-ceph-storage-edition-p365c225c226c230
  • 17. Legal Notices and Disclaimers.
1 IDC. “Extracting Value from Chaos.” Sponsored by EMC Corporation. June 2011. emc.com/collateral/analyst-reports/idc-extracting-value-from-chaos-ar.pdf.
2 Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
Configurations:
•  Ceph* storage nodes, each server: 16 Intel® Xeon® processor E5-2680 v3, 128 GB RAM, twenty-four 6 TB Seagate Enterprise* hard drives, and two 2 TB Intel® Solid-State Drive (SSD) DC P3700 NVMe* drives with 10 gigabit Ethernet (GbE) Intel® Ethernet Converged Network Adapter X540-T2 network cards, 20 GbE public network, and 40 GbE private Ceph network.
•  Apache Hadoop* data nodes, each server: 16 Intel Xeon processor E5-2620 v3 single socket, 128 GB RAM, with 10 GbE Intel Ethernet Converged Network Adapter X540-T2 network cards, bonded.
The difference between the version with Intel® Cache Acceleration Software (Intel® CAS) and the baseline is that the Intel CAS version is not caching and is in pass-through mode, so software only; no hardware changes are needed. The tests used were TeraGen*, TeraSort*, TeraValidate*, and DFSIO*, which are the industry-standard Hadoop performance tests. For more complete information, visit intel.com/performance.
Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced website and confirm whether referenced data are accurate.
Optimization Notice: Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice Revision #20110804
  • 18. Legal Notices and Disclaimers (continued).
Intel technologies’ features and benefits depend on system configuration and may require hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.
Intel, the Intel logo, Intel. Experience What’s Inside, the Intel. Experience What’s Inside logo, and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries. QCT, the QCT logo, Quanta, and the Quanta logo are trademarks or registered trademarks of Quanta Computer Inc.
Copyright © 2016 Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
  • 19. Intel’s role in storage.
•  Advance the industry: open source & standards.
•  Build an open ecosystem: Intel® Storage Builders (73+ partners); cloud and enterprise partner storage solution architectures; next-generation solution architectures. Intel solution architects have deep expertise on Ceph for low-cost and high-performance usage, helping customers enable a modern storage infrastructure.
•  Intel technology leadership:
   •  Storage-optimized platforms: Intel® Xeon® E5-2600 v4 platform, Intel® Xeon® processor D-1500 platform, Intel® Converged Network Adapters 10/40GbE, Intel® SSDs for DC & Cloud.
   •  Storage-optimized software: Intel® Intelligent Storage Acceleration Library, Storage Performance Development Kit, Intel® Cache Acceleration Software.
   •  SSD & non-volatile memory: interfaces: SATA, NVMe PCIe; form factors: 2.5”, M.2, U.2, PCIe AIC; new technologies: 3D NAND, Intel® Optane™.
End-user solutions: cloud, enterprise. (Standard performance-test disclaimer applies; see Legal Notices and Disclaimers.)
  • 21. 3D XPoint™ and Ceph BlueStore.
§  First 3D XPoint use cases for BlueStore: the BlueStore data backend, the RocksDB backend, and the RocksDB write-ahead log (WAL).
§  Two methods for accessing PMEM devices: a raw PMEM block device (libpmemblk), and a DAX-enabled file system (mmap + libpmemlib).
[Diagram: RocksDB and BlueFS on top of PMEMDevice instances for data and metadata, reached either through the libpmemblk/libpmemlib APIs or through mmap load/store on files in a DAX-enabled file system.]
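As a rough illustration of the second access path (not a documented step from this deck), a persistent-memory region can be exposed as a DAX-capable block device and mounted so that applications mmap files and issue loads/stores directly; the device and mount names are assumptions, and ndctl mode names vary by version.

    # Create an fsdax namespace on a persistent-memory region (older ndctl releases call this mode "memory").
    ndctl create-namespace --mode=fsdax --region=region0      # yields e.g. /dev/pmem0
    mkfs.xfs /dev/pmem0
    mount -o dax /dev/pmem0 /mnt/pmem
    # Files under /mnt/pmem can now be mmap()ed for direct load/store access, bypassing the page cache;
    # the raw-blockdev alternative above uses libpmemblk against the device instead of a file system.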
  • 22. 3D NAND: a cost-effective Ceph solution. An enterprise-class, highly reliable, feature-rich, and cost-effective all-flash-array (AFA) solution.
§  NVMe SSD is today’s SSD, and 3D NAND or TLC SSD is today’s HDD: NVMe as the journal, high-capacity SATA SSD or 3D NAND SSD as the data store.
§  High performance, high capacity, and a more cost-effective solution: 1M 4K random-read IOPS delivered by 5 Ceph nodes; roughly 1,000 HDD-based Ceph nodes (10K HDDs) would be needed to deliver the same throughput; 100 TB of capacity in 5 nodes.
§  Achieved with special software optimization on the FileStore and BlueStore backends.
[Diagram: a Ceph node with four Intel® SSD DC S3510 1.6 TB SATA drives and a P3700 M.2 800 GB journal, and a Ceph node with five P3520 4 TB drives plus P3700 and 3D XPoint™ SSDs; tiers labeled NVMe 3D XPoint™, NVMe 3D NAND, and SATA/NVMe NAND.]
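For context on how a 4K random-read IOPS figure like the one above is typically measured against RBD, a generic fio invocation using its rbd engine is sketched below; the pool name, image name, and job parameters are illustrative and are not the configuration behind the quoted number.

    # Assumes an existing pool called rbdbench; create a throwaway image to benchmark.
    rbd create rbdbench/bench0 --size 100G
    # 4 KiB random reads through librbd; scale iodepth/numjobs and client count to saturate the cluster.
    fio --name=rand4k --ioengine=rbd --clientname=admin --pool=rbdbench --rbdname=bench0 \
        --rw=randread --bs=4k --iodepth=32 --numjobs=8 --direct=1 \
        --time_based --runtime=300 --group_reporting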
  • 23. Test Setup (Linux OS).
/etc/sysctl.conf:
    vm.swappiness = 10
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216
    net.core.netdev_max_backlog = 250000
/etc/security/limits.conf:
    * soft nofile 65536
    * hard nofile 1048576
    * soft nproc 65536
    * hard nproc unlimited
    * hard memlock unlimited
CPU profile:
    echo performance > /sys/devices/system/cpu/cpu{0..n}/cpufreq/scaling_governor
Transparent huge pages:
    echo never > /sys/kernel/mm/transparent_hugepage/defrag
    echo never > /sys/kernel/mm/transparent_hugepage/enabled
Network:
    ifconfig <eth> mtu 9000
    ifconfig <eth> txqueuelen 1000
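A quick way to apply and spot-check these settings (not part of the deck): the sysctl and limits entries persist across reboots, while the governor and transparent-huge-page echoes do not and are usually re-applied from an init script or tuned profile.

    sysctl -p /etc/sysctl.conf                                   # load the sysctl values without a reboot
    ulimit -n                                                    # after re-login, should report the new nofile limit
    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor    # expect "performance"
    cat /sys/kernel/mm/transparent_hugepage/enabled              # expect "[never]" to be selected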
  • 24. Test Setup (Ceph): ceph.conf [global] section.
[global]
fsid = f1739148-3847-424d-b262-45d5b950fa3b
mon_initial_members = starbasemon41, starbasemon42, starbasemon43
mon_host = 10.10.241.41,10.10.241.42,10.10.242.43
auth_client_required = none
auth_cluster_required = none
auth_service_required = none
filestore_xattr_use_omap = true
osd_pool_default_size = 3        # Default number of object replicas per pool.
osd_pool_default_min_size = 3    # Minimum number of replicas required to serve I/O.
osd_pool_default_pg_num = 4800
osd_pool_default_pgp_num = 4800
public_network = 10.10.241.0/24, 10.10.242.0/24
cluster_network = 10.10.100.0/24, 10.10.200.0/24
debug_lockdep = 0/0
debug_context = 0/0
debug_crush = 0/0
debug_buffer = 0/0
debug_timer = 0/0
debug_filer = 0/0
debug_objecter = 0/0
debug_rados = 0/0
debug_rbd = 0/0
debug_ms = 0/0
debug_monc = 0/0
debug_tp = 0/0
debug_auth = 0/0
debug_finisher = 0/0
debug_heartbeatmap = 0/0
debug_perfcounter = 0/0
debug_asok = 0/0
debug_throttle = 0/0
debug_mon = 0/0
debug_paxos = 0/0
debug_rgw = 0/0
perf = true
mutex_perf_counter = true
throttler_perf_counter = false
rbd_cache = false
log_file = /var/log/ceph/$name.log
log_to_syslog = false
mon_compact_on_trim = false
osd_pg_bits = 8
osd_pgp_bits = 8
mon_pg_warn_max_object_skew = 100000
mon_pg_warn_min_per_osd = 0
mon_pg_warn_max_per_osd = 32768
  • 25. Test Setup (Ceph): ceph.conf [mon] and [osd] sections.
[mon]
mon_host = starbasemon41, starbasemon42, starbasemon43
mon_data = /var/lib/ceph/mon/$cluster-$id
mon_max_pool_pg_num = 166496
mon_osd_max_split_count = 10000
mon_pg_warn_max_per_osd = 10000
[mon.a]
host = starbasemon41
mon_addr = 192.168.241.41:6789
[mon.b]
host = starbasemon42
mon_addr = 192.168.241.42:6789
[mon.c]
host = starbasemon43
mon_addr = 192.168.242.43:6789
[osd]
osd_mount_options_xfs = rw,noatime,inode64,logbsize=256k,delaylog
osd_mkfs_options_xfs = -f -i size=2048
osd_op_threads = 32
filestore_queue_max_ops = 5000
filestore_queue_committing_max_ops = 5000
journal_max_write_entries = 1000
journal_queue_max_ops = 3000
objecter_inflight_ops = 102400
filestore_wbthrottle_enable = false
filestore_queue_max_bytes = 1048576000
filestore_queue_committing_max_bytes = 1048576000
journal_max_write_bytes = 1048576000
journal_queue_max_bytes = 1048576000
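These [osd] throttles normally take effect on OSD restart; for quick experiments on a running Jewel cluster they can also be injected and verified at runtime, as sketched below (an illustration, not a documented step from this deck).

    # Push a setting to all running OSDs without restarting them.
    ceph tell osd.* injectargs '--filestore_queue_max_ops 5000'
    # Confirm the value an OSD is actually using (run on the host where osd.0's admin socket lives).
    ceph daemon osd.0 config show | grep filestore_queue_max_ops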
  • 26. Test Setup (Hadoop): YARN and MapReduce parameters.
•  Container Memory, yarn.nodemanager.resource.memory-mb = 80.52 GiB. Default: amount of physical memory, in MiB, that can be allocated for containers. NOTE: In a different document, it recommends
•  Container Virtual CPU Cores, yarn.nodemanager.resource.cpu-vcores = 48. Default: number of virtual CPU cores that can be allocated for containers.
•  Container Memory Maximum, yarn.scheduler.maximum-allocation-mb = 12 GiB. The largest amount of physical memory, in MiB, that can be requested for a container.
•  Container Virtual CPU Cores Maximum, yarn.scheduler.maximum-allocation-vcores = 48. Default: the largest number of virtual CPU cores that can be requested for a container.
•  Container Virtual CPU Cores Minimum, yarn.scheduler.minimum-allocation-vcores = 2. The smallest number of virtual CPU cores that can be requested for a container. If using the Capacity or FIFO scheduler (or any scheduler, prior to CDH 5), virtual core requests are rounded up to the nearest multiple of this number.
•  JobTracker MetaInfo Maxsize, mapreduce.job.split.metainfo.maxsize = 1000000000. The maximum permissible size of the split metainfo file; the JobTracker won’t attempt to read split metainfo files bigger than the configured value. No limit if set to -1.
•  I/O Sort Memory Buffer, mapreduce.task.io.sort.mb = 400 MiB. Enables a larger block size without spills.
•  yarn.scheduler.minimum-allocation-mb = 2 GiB. Default: minimum container size.
•  mapreduce.map.memory.mb = 1 GiB. Memory required for each map container; may need to be increased for some applications.
•  mapreduce.reduce.memory.mb = 1.5 GiB. Memory required for each reduce container; may need to be increased for some applications.
•  mapreduce.map.cpu.vcores = 1. Default: number of vcores required for each map container.
•  mapreduce.reduce.cpu.vcores = 1. Default: number of vcores required for each reduce container.
•  mapreduce.job.heap.memory-mb.ratio = 0.8 (default). Sets the Java heap size to 800/1200 MiB for mapreduce.{map|reduce}.memory.mb = 1/1.5 GiB.
  • 27. Test Setup (Hadoop): HDFS and host parameters.
•  dfs.blocksize = 128 MiB (default).
•  dfs.replication = 1. Default block replication; the number of replicas to make when the file is created. The default value is used if a replication number is not specified.
•  Java Heap Size of NameNode in Bytes = 4127 MiB. Default: maximum size in bytes for the Java process heap memory; passed to Java -Xmx.
•  Java Heap Size of Secondary NameNode in Bytes = 4127 MiB. Default: maximum size in bytes for the Java process heap memory; passed to Java -Xmx.
•  Memory overcommit validation threshold = 0.9. Threshold used when validating the allocation of RAM on a host: 0 means all of the memory is reserved for the system, 1 means none is reserved. Values can range from 0 to 1.
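The benchmarks cited in the legal notices (TeraGen*, TeraSort*, TeraValidate*) run against this configuration roughly as follows; the examples-jar path shown is the usual CDH parcel location and, like the data size and HDFS paths, is an assumption rather than the documented command line.

    EXAMPLES=/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar
    # 10 billion 100-byte rows = ~1 TB of input data (size is illustrative).
    hadoop jar $EXAMPLES teragen 10000000000 /bench/teragen
    hadoop jar $EXAMPLES terasort /bench/teragen /bench/terasort
    hadoop jar $EXAMPLES teravalidate /bench/terasort /bench/teravalidate
    # Spot-check that the HDFS settings from the table above are in effect.
    hdfs getconf -confKey dfs.blocksize
    hdfs getconf -confKey dfs.replication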
  • 28. Test Setup (CAS NVMe, Journal NVMe).
NVMe0n1:
    Ceph journals for the first 12 HDDs: /dev/nvme0n1p1 - /dev/nvme0n1p12, 20 GiB per partition.
    Intel CAS for HDDs 13-24 (/dev/sdo - /dev/sdz) uses the remaining free space, split evenly into two cache partitions:
        cache 1  /dev/nvme0n1p13  (running, wo mode); cores /dev/sdo1 - /dev/sdt1 exposed as /dev/intelcas1-1 - /dev/intelcas1-6
        cache 2  /dev/nvme0n1p14  (running, wo mode); cores /dev/sdu1 - /dev/sdz1 exposed as /dev/intelcas2-1 - /dev/intelcas2-6
NVMe1n1:
    Ceph journals for the remaining 12 HDDs: /dev/nvme1n1p1 - /dev/nvme1n1p12, 20 GiB per partition.
    Intel CAS for HDDs 1-12 (/dev/sdc - /dev/sdn) uses the remaining free space, split evenly into two cache partitions:
        cache 1  /dev/nvme1n1p13  (running, wo mode); cores /dev/sdc1 - /dev/sdh1 exposed as /dev/intelcas1-1 - /dev/intelcas1-6
        cache 2  /dev/nvme1n1p14  (running, wo mode); cores /dev/sdi1 - /dev/sdn1 exposed as /dev/intelcas2-1 - /dev/intelcas2-6
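The cache instances above are typically brought up with Intel CAS’s casadm utility, roughly as sketched below for the first NVMe drive (the second follows the same pattern with its own cache IDs); flag syntax varies between CAS releases, so treat this as an outline rather than the documented procedure.

    # Start two cache instances in write-only (wo) mode on the spare NVMe partitions.
    casadm -S -i 1 -d /dev/nvme0n1p13 -c wo
    casadm -S -i 2 -d /dev/nvme0n1p14 -c wo
    # Attach six HDD partitions as cores to each cache; the accelerated devices appear as
    # /dev/intelcas1-1 ... /dev/intelcas1-6 and /dev/intelcas2-1 ... /dev/intelcas2-6,
    # and these are the devices the FileStore OSDs are built on.
    for d in o p q r s t; do casadm -A -i 1 -d /dev/sd${d}1; done
    for d in u v w x y z; do casadm -A -i 2 -d /dev/sd${d}1; done
    casadm -L    # list caches and cores to confirm the mapping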