Practice of large Hadoop cluster in China Mobile
Speaker: Duan Yunfeng, Pan Yuxuan
China Mobile Communications Corporation(CMCC)
2018-6
Big Data in China Mobile
Data in China Mobile
• 900 million customers
• 3 million base stations
• 200 million IoT connections
• 100PB data generated per day
Hadoop in China Mobile
• One large centralized cluster, the CBA cluster, with 1,600 nodes
• Several smaller compute clusters, one in each province
• 1+N architecture
• 15,000 Hadoop nodes in all
Outline
01 Introduction of the CBA Cluster
02 Experience in Construction
03 Future Works
About the CBA Project
Centralized Business Analysis (CBA): a BI system based on an enterprise data warehouse, serving the Group Company, branch companies and subsidiary companies.
Data sources:
• 2/3/4G and WLAN logs
• Detailed data traffic
• Customer information
• Service records
• Web crawler
• …
Applications:
• Finance, Security, Tourism, Traffic, Advertisement, Healthcare
• On-line behavior analysis
• Internet opinion analysis
• Customer portraits
Brief History of the CBA Project
Stage 1 (2016.8)
• 600 nodes in total
• Max Hadoop cluster: 400 nodes
Stage 2 (2017.10)
• 2,400 nodes in total
• Max Hadoop cluster: 1,600 nodes
Stage 3 (2018.12)
• 21,000 nodes in total
• Max Hadoop cluster: 14,000 nodes
Current CBA Cluster
Largest Hadoop cluster: 1,600+ nodes
380 million HDFS files, total capacity: 62.38 PB
2 PB of input data per day, 20,000 jobs per day
14 million files written into HDFS by Flume per day
[Architecture diagram] Components and node counts, spanning the branch companies (B DMZ / B CORE) and the Group Company (H DMZ / H CORE): FTP (20), FTP (4), Gateway, Flume (90), Flume (18), Hadoop Cluster (1584+17), HBase 1 (222+7), HBase 2 (222+3), Crawler (30), Kafka (6), Spark Streaming (14), Application (10 / 20 / 20 / 25), Business System
BigData Platform
Hadoop 2.8.2
Hive 1.2.2
Spark 2.2.0
HBase 1.2.6
Flume 1.6.0
Ambari(HControl) 2.1.1
Outline
01 Introduction of the CBA Cluster
02 Experience in Construction
03 Future Works
Challenge
• Large amount of data, massive data types
• Build the whole cluster from nothing in a few months
Deployment -> Test -> Production
Tuning areas: Ambari Tuning, LDAP Tuning, HDFS Tuning, Flume Tuning, Application Tuning, Operation Management
Highlights
Data collection: filtering, normalization and encryption
• SQL-based Flume Interceptor
• Easily extended to various data sources
HDFS tuning
• Use NameNode Federation to scale the namespace horizontally
• Use FairCallQueue to reduce RPC latency
Cluster deployment and maintenance
• Ambari tuning
• Cluster maintenance with AI
Flume Background
Data flows from the Gateway cluster to the Collector cluster. Operations Flume performs before sending data to HDFS:
• Decompression
• Filtering by certain fields
• Normalization
• Encryption
Problems:
• Performance: 50 MB/s per node, 400 nodes needed in all
• Unstable, GC overhead
• Logic hard-coded in the Flume Interceptor
Flume: SQL-based Interceptor
Before: filtering, normalization and encryption were each done in a separate interceptor, and every interceptor repeated its own deserialize / process / serialize steps (Source -> Filter Interceptor -> Normalization Interceptor -> Encryption Interceptor -> Channel).
After: implement all of the logic in one SQL-based interceptor (Source -> SQL Interceptor -> Channel).
• Use UDFs to implement the specific logic
• Use Hive to parse the SQL and get the query execution plan
• Execute the Operator code (FilterOperator -> SelectOperator -> SinkOperator) inside the Flume interceptor
• Override SinkOperator to convert the record object into a Flume record
Example SQL:
Select
  c1_2, c3, c4_7,
  sm4(normalizePhoneNum(c8), strtolong(c20)),
  c9_19, strtodate(c20),
  strtodate(c21), c22_200
from event
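For context, a minimal sketch of the standard Flume Interceptor contract such a SQL-based interceptor would plug into. The `SqlInterceptor` class name, the `sql` config key and the commented-out plan execution are illustrative assumptions; only the `org.apache.flume.interceptor.Interceptor` API shown here is Flume's real interface.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.interceptor.Interceptor;

/**
 * Skeleton of a SQL-driven Flume interceptor. The real CBA implementation
 * compiles the configured SQL with Hive and runs the resulting operator
 * pipeline; that part is only indicated by comments here.
 */
public class SqlInterceptor implements Interceptor {

  private final String sql; // SQL configured for this interceptor

  private SqlInterceptor(String sql) {
    this.sql = sql;
  }

  @Override
  public void initialize() {
    // Compile this.sql once and keep the operator pipeline:
    // FilterOperator -> SelectOperator -> custom SinkOperator.
  }

  @Override
  public Event intercept(Event event) {
    byte[] body = event.getBody();
    // Deserialize the record, push it through the compiled operators,
    // then serialize the result back into the event. Returning null
    // drops the event, which is how filtering is expressed in Flume.
    event.setBody(body); // placeholder: unchanged pass-through
    return event;
  }

  @Override
  public List<Event> intercept(List<Event> events) {
    List<Event> out = new ArrayList<>(events.size());
    for (Event e : events) {
      Event intercepted = intercept(e);
      if (intercepted != null) { // null means the record was filtered out
        out.add(intercepted);
      }
    }
    return out;
  }

  @Override
  public void close() {
    // Release Hive/operator resources.
  }

  /** Builder used by Flume to create the interceptor from the agent config. */
  public static class Builder implements Interceptor.Builder {
    private String sql;

    @Override
    public void configure(Context context) {
      sql = context.getString("sql"); // hypothetical config key
    }

    @Override
    public Interceptor build() {
      return new SqlInterceptor(sql);
    }
  }
}
```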
Flume—SerDe
Serialization/Deserialization
• Use LazySimpleSerDe from Hive
• Merge fields during serialization
Not all of the columns need to be normalized or encrypted, so adjacent unprocessed fields are merged and carried through as a single field instead of being deserialized and serialized one by one.
• Reduces CPU time and the number of Java objects created
• 2x performance improvement
Flume—tuning
• Use ConcurrentLinkedDeque instead of LinkedBlockingDeque in MemoryChannel
• Reduce memory consumption by adjusting the channel capacity
agent.channels.c1.capacity=2000000
• Adjust the MemoryChannel keep-alive parameter to tolerate HDFS performance variation
agent.channels.c1.keep-alive=24
• Reduce the number of open HDFS files by adjusting HDFSEventSink hdfs.idleTimeout
agent.sinks.s1.hdfs.idleTimeout=600
• Improve JVM performance by adding the option -XX:+UseLargePages
Performance improvement: 50 MB/s -> 790 MB/s per node
Flume nodes reduced: 400 nodes -> 90 nodes
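Pulled together, the tuned parameters above would sit in a single agent configuration roughly like the sketch below. The agent, source, channel and sink names, the Avro source and the HDFS path are illustrative assumptions; only the capacity, keep-alive and idleTimeout values come from the slide.

```properties
# Illustrative Flume agent layout; only the tuned values come from the slide.
agent.sources  = r1
agent.channels = c1
agent.sinks    = s1

# Assumed Avro source receiving records from the gateway collectors
agent.sources.r1.type = avro
agent.sources.r1.bind = 0.0.0.0
agent.sources.r1.port = 4141
agent.sources.r1.channels = c1
# Wire in the SQL interceptor sketched earlier (class name is hypothetical)
agent.sources.r1.interceptors = i1
agent.sources.r1.interceptors.i1.type = com.example.SqlInterceptor$Builder
agent.sources.r1.interceptors.i1.sql = select c1_2, c3 from event

# MemoryChannel: larger capacity, longer keep-alive to ride out slow HDFS
agent.channels.c1.type = memory
agent.channels.c1.capacity = 2000000
agent.channels.c1.keep-alive = 24

# HDFS sink: close idle files after 600 s to cap the number of open files
agent.sinks.s1.type = hdfs
agent.sinks.s1.channel = c1
agent.sinks.s1.hdfs.path = /data/flume/%Y%m%d
agent.sinks.s1.hdfs.idleTimeout = 600

# JVM option from the slide, set in flume-env.sh:
#   JAVA_OPTS="$JAVA_OPTS -XX:+UseLargePages"
```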
Challenge of HDFS
Too many files in the NameNode
• More than 300 million files in the namespace
• NN memory usage over 150 GB, with 180 GB configured
NN RPC performance becomes the bottleneck
• Processes 30 million RPC calls per hour
• RPCs accumulate in the call queue; RPC response time over 10 s
NameNode HA failures
• Deadlock during HA failover under high concurrency (code bug)
• Had to restart everything when the active NN failed
Too many HDFS failures
HDFS (NameNode) repeatedly went down, HA did not work, and downtime exceeded 2 hours each time:
• NameNode JVM GC timeouts
• Too many HDFS audit logs filling the disk
• Network failures
• RPC pressure too high
• ......
Optimize the NameSpace
• Scale the namespace: Federation grew from 2 to 5 NameSpaces
  Before: NS1 (Hive/Flume/Apps), NS2 (Apps)
  After: NS1 (Hive), NS2 (Flume), NS3 (Apps), NS4 (Apps), NS5 (Apps)
• YARN log files exceeded 100 million
  Introduced the Ambari Logsearch tool to manage YARN logs; no need to keep these logs for a long time
• NameNode memory usage: 160 GB -> 90 GB
FairCallQueue
Challenge:
• Most RPC calls come from batch-job users
• Flume tasks require low-latency access to HDFS
• Flume tasks need higher priority when accessing HDFS
Solution: use FairCallQueue (https://issues.apache.org/jira/browse/HADOOP-9640)
• RPC calls are placed into multiple priority queues (queue0–queue3) and a multiplexer takes calls from them according to priority
• Massive, latency-insensitive batch-job RPCs -> low-priority queues
• Few, latency-sensitive RPCs -> high-priority queue
• Latency of RPCs from Flume: more than 10 s -> less than 0.5 s
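Enabling FairCallQueue is a NameNode-side configuration change. A minimal sketch, assuming 8020 is the NameNode RPC port; the `ipc.<port>.callqueue.impl` key follows the HADOOP-9640 / FairCallQueue documentation for Hadoop 2.x, but exact scheduler key names vary between releases, so verify against your version.

```xml
<!-- core-site.xml on the NameNode; 8020 is the assumed NameNode RPC port -->
<property>
  <name>ipc.8020.callqueue.impl</name>
  <value>org.apache.hadoop.ipc.FairCallQueue</value>
</property>
<!-- The number of priority levels and the decay-based scheduler are tuned via
     the other ipc.8020.* properties; key names differ across Hadoop 2.x releases. -->
```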
NameNode GC algorithm
GC algorithm in the NameNode JVM options: CMS GC -> G1GC
Before (CMS):
-XX:+UseConcMarkSweepGC
-XX:ParallelGCThreads=8
-XX:CMSInitiatingOccupancyFraction=70
-XX:+UseCMSInitiatingOccupancyOnly
After (G1):
-XX:+UseG1GC
-XX:ParallelGCThreads=20
-XX:ConcGCThreads=20
-XX:MaxGCPauseMillis=5000
• GC time reduced from 15 ms to 2 ms
• Long GC pauses (Concurrent Mode Failure) no longer occur
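In practice these flags go into the NameNode's JVM options, typically via HADOOP_NAMENODE_OPTS in hadoop-env.sh. A sketch; the 180 GB heap follows the configured NN heap mentioned earlier, and the exact option set is the slide's G1 list.

```sh
# hadoop-env.sh: switch the NameNode from CMS to G1 (heap size is illustrative)
export HADOOP_NAMENODE_OPTS="-Xms180g -Xmx180g \
  -XX:+UseG1GC -XX:ParallelGCThreads=20 -XX:ConcGCThreads=20 \
  -XX:MaxGCPauseMillis=5000 ${HADOOP_NAMENODE_OPTS}"
```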
Block Placement
Problem:
• The number of nodes on each rack may differ
• Nodes on smaller racks receive more block replicas
• Those nodes run out of space sooner and carry a higher load
Analysis:
• The default block placement policy is used in the cluster
• Placement of the 1st replica is balanced
• Placement of the 2nd/3rd replicas is imbalanced when the racks themselves are unbalanced
WeightedRackBlockPlacementPolicy
Solution (https://issues.apache.org/jira/browse/HDFS-13279):
• Implement a new BlockPlacementPolicy: WeightedRackBlockPlacementPolicy
• Calculate the probability of block placement on each rack according to the number of nodes on that rack
• Calculate the weight of each rack from that probability: the higher the probability, the lower the weight
• Adjust the block placement according to the weights
Results (closer to 1 is better):
Total Nodes | Racks | Nodes per Normal Rack | Nodes per Small Rack | Without WeightedRackBlockPlacement | With WeightedRackBlockPlacement
35   | 3  | 15 | 5 | 1.334 | 1.103
50   | 4  | 15 | 5 | 1.189 | 1.023
65   | 5  | 15 | 5 | 1.114 | 1.035
140  | 10 | 15 | 5 | 1.087 | 1.014
95   | 4  | 30 | 5 | 1.247 | 1.030
155  | 4  | 50 | 5 | 1.288 | 1.030
455  | 10 | 50 | 5 | 1.108 | 1.014
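As an illustration of the weighting idea only (not the HDFS-13279 code), the toy sketch below picks a remote rack with probability proportional to its node count, so that on average each node receives the same number of replicas regardless of rack size. A real custom policy is plugged into HDFS via dfs.block.replicator.classname in hdfs-site.xml.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

/**
 * Toy illustration of weighted rack selection: racks are chosen with a
 * probability proportional to their node count, evening out per-node replica
 * counts between small and large racks. This is only a sketch of the idea
 * behind WeightedRackBlockPlacementPolicy, not the actual implementation.
 */
public class WeightedRackChooser {

  private final Map<String, Integer> nodesPerRack = new LinkedHashMap<>();
  private final Random random = new Random();

  public void addRack(String rack, int nodeCount) {
    nodesPerRack.put(rack, nodeCount);
  }

  /** Pick a rack, excluding the rack that already holds the first replica. */
  public String chooseRemoteRack(String excludedRack) {
    int totalWeight = 0;
    for (Map.Entry<String, Integer> e : nodesPerRack.entrySet()) {
      if (!e.getKey().equals(excludedRack)) {
        totalWeight += e.getValue();        // weight = node count of the rack
      }
    }
    int pick = random.nextInt(totalWeight); // uniform in [0, totalWeight)
    for (Map.Entry<String, Integer> e : nodesPerRack.entrySet()) {
      if (e.getKey().equals(excludedRack)) {
        continue;
      }
      pick -= e.getValue();
      if (pick < 0) {
        return e.getKey();                  // landed inside this rack's weight range
      }
    }
    throw new IllegalStateException("no rack available");
  }

  public static void main(String[] args) {
    WeightedRackChooser chooser = new WeightedRackChooser();
    chooser.addRack("rack1", 15);
    chooser.addRack("rack2", 15);
    chooser.addRack("rack3", 5); // small rack: chosen ~3x less often than the big racks
    System.out.println(chooser.chooseRemoteRack("rack1"));
  }
}
```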
Other tuning
• Set the MR property mapreduce.fileoutputcommitter.algorithm.version=2, which reduces RPC calls to the NN (https://issues.apache.org/jira/browse/MAPREDUCE-4815)
  For certain big jobs, RPC calls are reduced by 40%
• Use Tez instead of MR in Hive: hive.execution.engine=tez
• Merge tasks in the ETL process to reduce HDFS access
Operation Management
NameSpace quota
• Set the namespace quota at the root of each NS to 300 million
Estimate application RPC production
• Count the RPCs generated by an application in the DEV environment before it enters production
• Based on the hdfs-audit log
Limit HEAVY RPCs
• Heavy RPC: recursive operations (delete, getContentSummary) on a huge directory
• Implemented in Ranger
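For the RPC estimation step, the hdfs-audit log can simply be tallied per operation. A minimal sketch: the log path is an assumption, and the `cmd=` parsing follows the default tab-separated FSNamesystem audit-log format.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

/** Count hdfs-audit log entries per command, e.g. to estimate an app's RPC profile in DEV. */
public class AuditRpcCounter {
  public static void main(String[] args) throws IOException {
    String path = args.length > 0 ? args[0] : "/var/log/hadoop/hdfs/hdfs-audit.log"; // assumed path
    Map<String, Long> perCmd = new TreeMap<>();
    try (Stream<String> lines = Files.lines(Paths.get(path))) {
      lines.forEach(line -> {
        // Audit lines look like: allowed=true  ugi=...  ip=...  cmd=open  src=...  (tab-separated)
        int i = line.indexOf("cmd=");
        if (i < 0) {
          return;
        }
        int end = line.indexOf('\t', i);
        if (end < 0) {
          end = line.indexOf(' ', i);
        }
        String cmd = line.substring(i + 4, end < 0 ? line.length() : end);
        perCmd.merge(cmd, 1L, Long::sum);
      });
    }
    perCmd.forEach((cmd, count) -> System.out.println(cmd + "\t" + count));
  }
}
```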
Ambari tuning
Challenge: with the cluster grown to 1,600 nodes, the Ambari service became very slow
• Parameter tuning
• Apply patches
• Code improvements
New features supported:
• NameNode Federation deployment
• High availability of the Ambari Server
LDAP Tuning for a Kerberized Cluster
Problem: too many connections
• Connections to the LDAP server exceeded 7,000; latency on the LDAP server node exceeded 8 s
Solutions
• Use NSCD to cache user information locally
• Support multiple LDAP servers in Hadoop: HDFS group lookups go through a MultiLdapGroupsMapping that load-balances across LDAP Server1:389 ... LDAP ServerN:389
<property>
<name>hadoop.security.group.mapping.ldap.url</name>
<value>ldap1:389,ldap2:389,ldap3:389</value>
</property>
Connections on the LDAP server: 7,000 -> 700
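A rough sketch of what such a MultiLdapGroupsMapping could look like, assuming it delegates to one stock LdapGroupsMapping per server and fails over between them. The class below is illustrative, not CMCC's actual implementation; only the GroupMappingServiceProvider contract and the hadoop.security.group.mapping.ldap.url key are Hadoop's real API.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configurable;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.GroupMappingServiceProvider;
import org.apache.hadoop.security.LdapGroupsMapping;

/**
 * Illustrative group mapping that spreads lookups over several LDAP servers.
 * Configure hadoop.security.group.mapping=<this class> and
 * hadoop.security.group.mapping.ldap.url=ldap1:389,ldap2:389,...
 */
public class MultiLdapGroupsMapping implements GroupMappingServiceProvider, Configurable {

  private Configuration conf;
  private final List<LdapGroupsMapping> delegates = new ArrayList<>();
  private int next = 0;

  @Override
  public void setConf(Configuration conf) {
    this.conf = conf;
    delegates.clear();
    for (String url : conf.getTrimmedStrings("hadoop.security.group.mapping.ldap.url")) {
      Configuration perServer = new Configuration(conf);
      perServer.set("hadoop.security.group.mapping.ldap.url", url); // one server per delegate
      LdapGroupsMapping mapping = new LdapGroupsMapping();
      mapping.setConf(perServer);
      delegates.add(mapping);
    }
  }

  @Override
  public Configuration getConf() {
    return conf;
  }

  @Override
  public List<String> getGroups(String user) throws IOException {
    IOException last = null;
    // Round-robin start point, then fail over to the remaining servers.
    for (int i = 0; i < delegates.size(); i++) {
      LdapGroupsMapping mapping = delegates.get((next + i) % delegates.size());
      try {
        List<String> groups = mapping.getGroups(user);
        next = (next + 1) % delegates.size();
        return groups;
      } catch (IOException e) {
        last = e;
      }
    }
    throw last != null ? last : new IOException("no LDAP server configured");
  }

  @Override
  public void cacheGroupsRefresh() throws IOException {
    // No local cache is maintained here; nothing to refresh.
  }

  @Override
  public void cacheGroupsAdd(List<String> groups) throws IOException {
    // No-op: this provider does not maintain its own cache.
  }
}
```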
HSmart
An intelligent operation management tool that helps users optimize the Hadoop cluster and the jobs running on it.
Cluster Health Inspection
• Score the cluster status
• Suggestions for problems
Cluster Resource Prediction
• 35 selected cluster metrics
• Predict future resource consumption with an LSTM algorithm
Job Tuning
• Collect and analyze job logs, counters and metrics
• Provide tuning suggestions for jobs
• Inspired by Dr. Elephant @ LinkedIn
Outline
01 Introduction of the CBA Cluster
02 Experience in Construction
03 Future Works
Challenges in the Future
Cluster scale
• Growing very fast: 1,600 -> 14,000+ nodes
• HDFS Federation limits? YARN cluster limits? Ambari limits?
Multiple sub-clusters
• Single sub-cluster: 3,000 to 5,000 nodes
• RouterBasedFederation / YarnFederation across sub-clusters, each sub-cluster running its own Ambari, YARN and HDFS Federation (NS1–NS4)
Balance among NameSpaces
• Data is currently divided among NSs by business
• Large load differences among NSs
• Different load types on NameNodes: some NameNodes have more files but fewer RPC requests
• A balancer to move data between NSs
Summary
Challenges in the construction of a large Hadoop cluster
• Flume, HDFS, Ambari, LDAP
• Not only follow the community, but also add our own work
The NameNode is the most difficult bottleneck
• Namespace size, RPC performance
• Extend NSs, parameter tuning, FairCallQueue, operation management
Large cluster maintenance
• Introduce AI into cluster maintenance
Challenge in the future: 1,600 nodes -> 14,000 nodes
Thank you!
Contact e-mail: panyuxuan@cmss.chinamobile.com
Appendix
Code example: compile a SQL statement and get the operator list from the Hive Driver.
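The original appendix code was only an image, so here is a rough reconstruction of the idea as a hedged sketch, assuming Hive 1.2.x (Driver.compile, QueryPlan.getRootTasks, MapWork.getAliasToWork); exact class and method names should be checked against your Hive version, and the queried table is assumed to exist in the metastore.

```java
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.ql.Driver;
import org.apache.hadoop.hive.ql.QueryPlan;
import org.apache.hadoop.hive.ql.exec.Operator;
import org.apache.hadoop.hive.ql.exec.Task;
import org.apache.hadoop.hive.ql.plan.MapredWork;
import org.apache.hadoop.hive.ql.session.SessionState;

/** Sketch: compile a SQL statement with the Hive Driver and walk the operator tree. */
public class CompileSqlExample {

  public static void main(String[] args) throws Exception {
    HiveConf conf = new HiveConf();
    // Keep the plan as a MapReduce task so the operator tree is easy to reach.
    conf.set("hive.fetch.task.conversion", "none");
    SessionState.start(conf);

    Driver driver = new Driver(conf);
    // The 'event' table is assumed to exist in the metastore.
    int rc = driver.compile("select c3 from event where c1 = '1'");
    if (rc != 0) {
      throw new IllegalStateException("compile failed, rc=" + rc);
    }

    QueryPlan plan = driver.getPlan();
    for (Task<?> task : plan.getRootTasks()) {
      if (task.getWork() instanceof MapredWork) {
        MapredWork work = (MapredWork) task.getWork();
        // One root operator (TableScan) per table alias in the map work.
        for (Operator<?> root : work.getMapWork().getAliasToWork().values()) {
          printOperators(root, 0);
        }
      }
    }
  }

  /** Depth-first walk over the operator pipeline (TableScan -> Filter -> Select -> ...). */
  private static void printOperators(Operator<?> op, int depth) {
    StringBuilder indent = new StringBuilder();
    for (int i = 0; i < depth; i++) {
      indent.append("  ");
    }
    System.out.println(indent + op.getClass().getSimpleName());
    if (op.getChildOperators() != null) {
      for (Operator<?> child : op.getChildOperators()) {
        printOperators(child, depth + 1);
      }
    }
  }
}
```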

More Related Content

What's hot

Postgresql 12 streaming replication hol
Postgresql 12 streaming replication holPostgresql 12 streaming replication hol
Postgresql 12 streaming replication holVijay Kumar N
 
Hyper-V を Windows PowerShell から管理する
Hyper-V を Windows PowerShell から管理するHyper-V を Windows PowerShell から管理する
Hyper-V を Windows PowerShell から管理するjunichi anno
 
Message Queue 가용성, 신뢰성을 위한 RabbitMQ Server, Client 구성
Message Queue 가용성, 신뢰성을 위한 RabbitMQ Server, Client 구성Message Queue 가용성, 신뢰성을 위한 RabbitMQ Server, Client 구성
Message Queue 가용성, 신뢰성을 위한 RabbitMQ Server, Client 구성Yoonjeong Kwon
 
MongoDB: How it Works
MongoDB: How it WorksMongoDB: How it Works
MongoDB: How it WorksMike Dirolf
 
a Secure Public Cache for YARN Application Resources
a Secure Public Cache for YARN Application Resourcesa Secure Public Cache for YARN Application Resources
a Secure Public Cache for YARN Application ResourcesDataWorks Summit
 
Linux の hugepage の開発動向
Linux の hugepage の開発動向Linux の hugepage の開発動向
Linux の hugepage の開発動向Naoya Horiguchi
 
MariaDB Galera Cluster - Simple, Transparent, Highly Available
MariaDB Galera Cluster - Simple, Transparent, Highly AvailableMariaDB Galera Cluster - Simple, Transparent, Highly Available
MariaDB Galera Cluster - Simple, Transparent, Highly AvailableMariaDB Corporation
 
Web hdfs and httpfs
Web hdfs and httpfsWeb hdfs and httpfs
Web hdfs and httpfswchevreuil
 
検証環境をGoBGPで極力仮想化してみた
検証環境をGoBGPで極力仮想化してみた検証環境をGoBGPで極力仮想化してみた
検証環境をGoBGPで極力仮想化してみたToshiya Mabuchi
 
MySQL Load Balancers - Maxscale, ProxySQL, HAProxy, MySQL Router & nginx - A ...
MySQL Load Balancers - Maxscale, ProxySQL, HAProxy, MySQL Router & nginx - A ...MySQL Load Balancers - Maxscale, ProxySQL, HAProxy, MySQL Router & nginx - A ...
MySQL Load Balancers - Maxscale, ProxySQL, HAProxy, MySQL Router & nginx - A ...Severalnines
 
Sizing Your MongoDB Cluster
Sizing Your MongoDB ClusterSizing Your MongoDB Cluster
Sizing Your MongoDB ClusterMongoDB
 
Dockerイメージ管理の内部構造
Dockerイメージ管理の内部構造Dockerイメージ管理の内部構造
Dockerイメージ管理の内部構造Etsuji Nakai
 
03 spark rdd operations
03 spark rdd operations03 spark rdd operations
03 spark rdd operationsVenkat Datla
 
HDFSネームノードのHAについて #hcj13w
HDFSネームノードのHAについて #hcj13wHDFSネームノードのHAについて #hcj13w
HDFSネームノードのHAについて #hcj13wCloudera Japan
 
Hadoop and Enterprise Data Warehouse
Hadoop and Enterprise Data WarehouseHadoop and Enterprise Data Warehouse
Hadoop and Enterprise Data WarehouseDataWorks Summit
 
NY Meetup: Scaling MariaDB with Maxscale
NY Meetup: Scaling MariaDB with MaxscaleNY Meetup: Scaling MariaDB with Maxscale
NY Meetup: Scaling MariaDB with MaxscaleWagner Bianchi
 
OpenStack High Availability
OpenStack High AvailabilityOpenStack High Availability
OpenStack High AvailabilityJakub Pavlik
 
Intro ProxySQL
Intro ProxySQLIntro ProxySQL
Intro ProxySQLI Goo Lee
 

What's hot (20)

Postgresql 12 streaming replication hol
Postgresql 12 streaming replication holPostgresql 12 streaming replication hol
Postgresql 12 streaming replication hol
 
Hyper-V を Windows PowerShell から管理する
Hyper-V を Windows PowerShell から管理するHyper-V を Windows PowerShell から管理する
Hyper-V を Windows PowerShell から管理する
 
Message Queue 가용성, 신뢰성을 위한 RabbitMQ Server, Client 구성
Message Queue 가용성, 신뢰성을 위한 RabbitMQ Server, Client 구성Message Queue 가용성, 신뢰성을 위한 RabbitMQ Server, Client 구성
Message Queue 가용성, 신뢰성을 위한 RabbitMQ Server, Client 구성
 
MongoDB: How it Works
MongoDB: How it WorksMongoDB: How it Works
MongoDB: How it Works
 
a Secure Public Cache for YARN Application Resources
a Secure Public Cache for YARN Application Resourcesa Secure Public Cache for YARN Application Resources
a Secure Public Cache for YARN Application Resources
 
Hadoop入門
Hadoop入門Hadoop入門
Hadoop入門
 
Linux の hugepage の開発動向
Linux の hugepage の開発動向Linux の hugepage の開発動向
Linux の hugepage の開発動向
 
MariaDB Galera Cluster - Simple, Transparent, Highly Available
MariaDB Galera Cluster - Simple, Transparent, Highly AvailableMariaDB Galera Cluster - Simple, Transparent, Highly Available
MariaDB Galera Cluster - Simple, Transparent, Highly Available
 
Web hdfs and httpfs
Web hdfs and httpfsWeb hdfs and httpfs
Web hdfs and httpfs
 
Introduction to SLURM
 Introduction to SLURM Introduction to SLURM
Introduction to SLURM
 
検証環境をGoBGPで極力仮想化してみた
検証環境をGoBGPで極力仮想化してみた検証環境をGoBGPで極力仮想化してみた
検証環境をGoBGPで極力仮想化してみた
 
MySQL Load Balancers - Maxscale, ProxySQL, HAProxy, MySQL Router & nginx - A ...
MySQL Load Balancers - Maxscale, ProxySQL, HAProxy, MySQL Router & nginx - A ...MySQL Load Balancers - Maxscale, ProxySQL, HAProxy, MySQL Router & nginx - A ...
MySQL Load Balancers - Maxscale, ProxySQL, HAProxy, MySQL Router & nginx - A ...
 
Sizing Your MongoDB Cluster
Sizing Your MongoDB ClusterSizing Your MongoDB Cluster
Sizing Your MongoDB Cluster
 
Dockerイメージ管理の内部構造
Dockerイメージ管理の内部構造Dockerイメージ管理の内部構造
Dockerイメージ管理の内部構造
 
03 spark rdd operations
03 spark rdd operations03 spark rdd operations
03 spark rdd operations
 
HDFSネームノードのHAについて #hcj13w
HDFSネームノードのHAについて #hcj13wHDFSネームノードのHAについて #hcj13w
HDFSネームノードのHAについて #hcj13w
 
Hadoop and Enterprise Data Warehouse
Hadoop and Enterprise Data WarehouseHadoop and Enterprise Data Warehouse
Hadoop and Enterprise Data Warehouse
 
NY Meetup: Scaling MariaDB with Maxscale
NY Meetup: Scaling MariaDB with MaxscaleNY Meetup: Scaling MariaDB with Maxscale
NY Meetup: Scaling MariaDB with Maxscale
 
OpenStack High Availability
OpenStack High AvailabilityOpenStack High Availability
OpenStack High Availability
 
Intro ProxySQL
Intro ProxySQLIntro ProxySQL
Intro ProxySQL
 

Similar to Practice of large Hadoop cluster in China Mobile

FreeSWITCH as a Microservice
FreeSWITCH as a MicroserviceFreeSWITCH as a Microservice
FreeSWITCH as a MicroserviceEvan McGee
 
Building high performance microservices in finance with Apache Thrift
Building high performance microservices in finance with Apache ThriftBuilding high performance microservices in finance with Apache Thrift
Building high performance microservices in finance with Apache ThriftRX-M Enterprises LLC
 
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...DataWorks Summit/Hadoop Summit
 
Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application  Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application Apache Apex
 
Scaling paypal workloads with oracle rac ss
Scaling paypal workloads with oracle rac ssScaling paypal workloads with oracle rac ss
Scaling paypal workloads with oracle rac ssAnil Nair
 
How to run a bank on Apache CloudStack
How to run a bank on Apache CloudStackHow to run a bank on Apache CloudStack
How to run a bank on Apache CloudStackgjdevos
 
NGINX Plus R20 Webinar
NGINX Plus R20 WebinarNGINX Plus R20 Webinar
NGINX Plus R20 WebinarNGINX, Inc.
 
Sunx4450 Intel7460 GigaSpaces XAP Platform Benchmark
Sunx4450 Intel7460 GigaSpaces XAP Platform BenchmarkSunx4450 Intel7460 GigaSpaces XAP Platform Benchmark
Sunx4450 Intel7460 GigaSpaces XAP Platform BenchmarkShay Hassidim
 
Scale Your Load Balancer from 0 to 1 million TPS on Azure
Scale Your Load Balancer from 0 to 1 million TPS on AzureScale Your Load Balancer from 0 to 1 million TPS on Azure
Scale Your Load Balancer from 0 to 1 million TPS on AzureAvi Networks
 
Arquitetura Hibrida - Integrando seu Data Center com a Nuvem da AWS
Arquitetura Hibrida - Integrando seu Data Center com a Nuvem da AWSArquitetura Hibrida - Integrando seu Data Center com a Nuvem da AWS
Arquitetura Hibrida - Integrando seu Data Center com a Nuvem da AWSAmazon Web Services LATAM
 
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022HostedbyConfluent
 
Everything You Need to Know About Sharding
Everything You Need to Know About ShardingEverything You Need to Know About Sharding
Everything You Need to Know About ShardingMongoDB
 
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)Spark Summit
 
Improving HDFS Availability with Hadoop RPC Quality of Service
Improving HDFS Availability with Hadoop RPC Quality of ServiceImproving HDFS Availability with Hadoop RPC Quality of Service
Improving HDFS Availability with Hadoop RPC Quality of ServiceMing Ma
 
Strata Singapore: Gearpump Real time DAG-Processing with Akka at Scale
Strata Singapore: GearpumpReal time DAG-Processing with Akka at ScaleStrata Singapore: GearpumpReal time DAG-Processing with Akka at Scale
Strata Singapore: Gearpump Real time DAG-Processing with Akka at ScaleSean Zhong
 
IRATI: an open source RINA implementation for Linux/OS
IRATI: an open source RINA implementation for Linux/OSIRATI: an open source RINA implementation for Linux/OS
IRATI: an open source RINA implementation for Linux/OSICT PRISTINE
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaCloudera, Inc.
 
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice MachineSpark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice MachineData Con LA
 
Multi-Layer DDoS Mitigation Strategies
Multi-Layer DDoS Mitigation StrategiesMulti-Layer DDoS Mitigation Strategies
Multi-Layer DDoS Mitigation StrategiesSagi Brody
 
Example of One of my Desgins for Cyber &Networking Solutions for Customers ...
Example of One  of my Desgins  for Cyber &Networking Solutions for Customers ...Example of One  of my Desgins  for Cyber &Networking Solutions for Customers ...
Example of One of my Desgins for Cyber &Networking Solutions for Customers ...chen sheffer
 

Similar to Practice of large Hadoop cluster in China Mobile (20)

FreeSWITCH as a Microservice
FreeSWITCH as a MicroserviceFreeSWITCH as a Microservice
FreeSWITCH as a Microservice
 
Building high performance microservices in finance with Apache Thrift
Building high performance microservices in finance with Apache ThriftBuilding high performance microservices in finance with Apache Thrift
Building high performance microservices in finance with Apache Thrift
 
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
 
Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application  Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application
 
Scaling paypal workloads with oracle rac ss
Scaling paypal workloads with oracle rac ssScaling paypal workloads with oracle rac ss
Scaling paypal workloads with oracle rac ss
 
How to run a bank on Apache CloudStack
How to run a bank on Apache CloudStackHow to run a bank on Apache CloudStack
How to run a bank on Apache CloudStack
 
NGINX Plus R20 Webinar
NGINX Plus R20 WebinarNGINX Plus R20 Webinar
NGINX Plus R20 Webinar
 
Sunx4450 Intel7460 GigaSpaces XAP Platform Benchmark
Sunx4450 Intel7460 GigaSpaces XAP Platform BenchmarkSunx4450 Intel7460 GigaSpaces XAP Platform Benchmark
Sunx4450 Intel7460 GigaSpaces XAP Platform Benchmark
 
Scale Your Load Balancer from 0 to 1 million TPS on Azure
Scale Your Load Balancer from 0 to 1 million TPS on AzureScale Your Load Balancer from 0 to 1 million TPS on Azure
Scale Your Load Balancer from 0 to 1 million TPS on Azure
 
Arquitetura Hibrida - Integrando seu Data Center com a Nuvem da AWS
Arquitetura Hibrida - Integrando seu Data Center com a Nuvem da AWSArquitetura Hibrida - Integrando seu Data Center com a Nuvem da AWS
Arquitetura Hibrida - Integrando seu Data Center com a Nuvem da AWS
 
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
 
Everything You Need to Know About Sharding
Everything You Need to Know About ShardingEverything You Need to Know About Sharding
Everything You Need to Know About Sharding
 
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
 
Improving HDFS Availability with Hadoop RPC Quality of Service
Improving HDFS Availability with Hadoop RPC Quality of ServiceImproving HDFS Availability with Hadoop RPC Quality of Service
Improving HDFS Availability with Hadoop RPC Quality of Service
 
Strata Singapore: Gearpump Real time DAG-Processing with Akka at Scale
Strata Singapore: GearpumpReal time DAG-Processing with Akka at ScaleStrata Singapore: GearpumpReal time DAG-Processing with Akka at Scale
Strata Singapore: Gearpump Real time DAG-Processing with Akka at Scale
 
IRATI: an open source RINA implementation for Linux/OS
IRATI: an open source RINA implementation for Linux/OSIRATI: an open source RINA implementation for Linux/OS
IRATI: an open source RINA implementation for Linux/OS
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
 
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice MachineSpark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
 
Multi-Layer DDoS Mitigation Strategies
Multi-Layer DDoS Mitigation StrategiesMulti-Layer DDoS Mitigation Strategies
Multi-Layer DDoS Mitigation Strategies
 
Example of One of my Desgins for Cyber &Networking Solutions for Customers ...
Example of One  of my Desgins  for Cyber &Networking Solutions for Customers ...Example of One  of my Desgins  for Cyber &Networking Solutions for Customers ...
Example of One of my Desgins for Cyber &Networking Solutions for Customers ...
 

More from DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureDataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudDataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiDataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerDataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

More from DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Recently uploaded

All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFMichael Gough
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Nikki Chapple
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...Karmanjay Verma
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentMahmoud Rabie
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Karmanjay Verma
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sectoritnewsafrica
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfAarwolf Industries LLC
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 

Recently uploaded (20)

All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDF
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career Development
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdf
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 

Practice of large Hadoop cluster in China Mobile

  • 1. Practice of large Hadoop cluster in China Mobile Speaker: Duan Yunfeng, Pan Yuxuan China Mobile Communications Corporation(CMCC) 2018-6
  • 2. Big Data in China Mobile Data in China Mobile • 900 million customers • 3 million base stations • 200 million IoT connections • 100PB data generated per day Hadoop in China Mobile • A big centralized control cluster, CBA cluster, 1600 nodes • Several small compute cluster in each province • 1+N architecture • 15, 000 Hadoop nodes in all
  • 3. Outline 02 Experience in Construction 01 Introduction of CBA ClusterC 目 录 ONTENTS 03 Future works
  • 4. About CBA Project Data Source: • 2/3/4G, WLAN log • Detailed data traffic • Customer information • Service Record • Web Crawler • … Applications: • In Finance, Security, Tourism, Traffic, Advertisement, Healthcare • On-line behavior analysis • Internet Opinion analysis • Customer portrait Centralized Business Analysis (CBA) BI system based on Enterprise data warehouse for Group Company, branch companies and subsidiary companies
  • 5. Brief History of CBA Project Stage 1 • 2016.8 • 600 nodes totally • Max Hadoop: 400 nodes Stage 2 • 2017.10 • 2,400 nodes totally • Max Hadoop: 1600 nodes Stage 3 • 2018.12 • 21,000 nodes totally • Max Hadoop: 14,000 nodes … … … … … …
  • 6. Current CBA Cluster FTP (20) Flume (90) Hadoop Cluster (1584+17) HBase 2 (222+3) Application (10) Application (20) Application (20) Business System HBase 1 (222+7) Gateway Crawler (30) Application (25) Flume (18) Kafka (6) Spark streaming (14) H CORE Branch Companies H DMZ B DMZ B CORE FTP (4) Gateway Largest Hadoop Cluster: 1600+ nodes 380 million HDFS files,total capacity: 62.38 PB 2PB input data per day, 20000 jobs per day 14 million files into HDFS by Flume per day Group Company
  • 7. BigData Platform Hadoop 2.8.2 Hive 1.2.2 Spark 2.2.0 HBase 1.2.6 Flume 1.6.0 Ambari(HControl) 2.1.1
  • 8. Outline 02 Experience in Construction 01 Introduction of CBA ClusterC 目 录 ONTENTS 03 Future works
  • 9. Challenge • Large amount of data, massive data types • Build the whole cluster in several months from nothing Deployment Test Product Ambari Tuning LDAP Tuning HDFS Tuning Flume Tuning Operation Mangement Application Tuning
  • 10. Highlights data collector, filtering, normalization and encryption • SQL Based Flume Interceptor • easy extended for various data sources HDFS Turning • using NN Federation to scale NS horizontally • using fair callqueue to reduce RPC latency cluster deployment and maintenance • Ambari turning • cluster maintenance with AI
  • 11. Flume Background Operations Flume do before sending data to HDFS: Decompression Filter by certain fields Normalization Encryption Problem: • Performance 50MB/s per node, need 400 nodes in all • Unstable, GC overhead • Hard code in Interceptor of Flume Gateway Cluster Collect or Cluster
  • 12. Flume: SQL based Interceptor Filtering, Normalization, Encryption in each Interceptor Implement logic in one SQL based Interceptor Use UDF to implement certain logic Use Hive to parse SQL and get Query Execution Plan Select c1_2, c3, c4_7, sm4(normalizePhoneNum(c8), strtolong(c20)), c9_19, strtodate(c20), strtodate(c21), c22_200 from event FilterOperat or SelectOperat or SinkOperator Execute Operator code in Flume interceptor Overwrite SinkOperator, convert record object to flume record Source Deser ialize Process Seriali ze SQL Inteceptor Channel Source Deser ialize Process Seriali ze Filter Inteceptor Deser ialize Process Seriali ze Encryption Inteceptor Deser ialize Process Seriali ze Normalization Inteceptor Channel
  • 13. Flume—SerDe  Serialization/Deserialization • Use LazySimpleSerDe From Hive • Merge fields in Serialization before after Not all of the column need to be normalized or encrypted. Merge unprocessed fields Reduce CPU spent and JAVA Objects created 2X Performance improvement Deserialize Deserialize Serialize Serialize
  • 14. Flume—tuning • Use ConcurrentLinkedDeque instead of LinkedBlockingDeque in MemoryChannel • Reduce memory consume by adjusting Channel Capacity agent.channels.c1.capacity=2000000 • Adjust MemoryChannel Keep-alive parameter to handle HDFS performance variation agent.channels.c1.keep-alive=24 • Reduce HDFS files to write by adjusting HDFSEventSink hdfs.idleTimeout agent.sinks.s1.hdfs.idleTimeout=600 • Improve JVM performance by adding option -XX:+UseLargePages Performance improvement: 50MB/s ->790MB/s per node Flume nodes reduced: 400 nodes -> 90 nodes
  • 15. Challenge of HDFS Too many files in NameNode  More than 300 million files in Namespace.  NN memory usage over 150G, configured 180G NN RPC performance becomes the bottleneck  Process 30 million RPC calls per hour  RPC accumulates in callQueue, RPC response over 10s Namenode-HA failure  Dead lock when HA in high concurrency situations(code bug)  Have to restart all when active NN fails
  • 16. Too many HDFS failures HDFS(namenode) always goes down, HA does not work, downtime over 2 hours each time • Namenode JVM GC overtime • Too many HDFS-Audit log and disk is full • Network failure • RPC pressure too high ......
  • 17. Optimize the NameSpace NS1 Hive/Flume/App s NS2 Apps NS1 Hive NS2 Flume NS3 Apps NS4 Apps NS5 Apps • Scale the NameSpace Federation with 2 NS -> 5 NS • YARN log files over 100 million Introduce Ambari Logsearch tool to manage YARN log, no need to save these logs for a long time. • NameNode memory usage : 160G -> 90G
  • 18. FairCallQueue Challenge: • Most RPC call from batch job users • Flume task requires low latency of HDFS • Flume Task needs higher priority of accessing HDFS queue0 queue1 queue2 queue3 rpc Priority= 1 FairCallQueue Multi plexe r rpc take • Use FairCallQueue • Massive, not sensitive to latency, batch job RPC -> Low priority RPC queue • Few, sensitive to latency RPC -> High priority queue • Latency of RPC from Flume: More than 10s ->less than 0.5shttps://issues.apache.org/jira/browse/HADOOP-9640
  • 19. Namenode GC algorithm -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC - XX:CMSInitiatingOccupancyFraction=7 0 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseG1GC -XX:ParallelGCThreads=20 -XX:ConcGCThreads=20 -XX:MaxGCPauseMillis=5000 GC algorithm in Namenode JVM options: CMS GC -> G1GC • GC time reduce from 15ms to 2ms • Long time GC suspend: Concurrent Mode Failure no longer occur
  • 20. Block Placement rack1 rack2 rack3 Problem: • Number of Nodes on each Rack may be different • Nodes on smaller rack have received more block replicas • Run out of space , load is higher rack4 Analysis: • Default block placement used in cluster • Placement of the 1st replica is balanced • Placement of the 2nd/3rdb block replica is imbalanced when racks are not balanced
  • 21. WeightedRackBlockPlacementPolicy TotalNode s Racks Nodes Per NormalRack Nodes Per SmallRack Without WeightedRack BlockPlacemen t With WeightedRack BlockPlacement (better close to 1) 35 3 15 5 1.334 1.103 50 4 15 5 1.189 1.023 65 5 15 5 1.114 1.035 140 10 15 5 1.087 1.014 95 4 30 5 1.247 1.030 155 4 50 5 1.288 1.030 455 10 50 5 1.108 1.014 Solution: • Implement new BlockPlacementPolicy: WeightedRackBlockPlacementPolic y • Calculate the probability of block placement on each rack, according to number of nodes on each rack • Calculate the weight of each according to the probability. Higher probability with lower weight • Adjust the block placement by https://issues.apache.org/jira/browse/HDFS-13279
  • 22. Other tuning • Set MR property mapreduce.fileoutputcommitter.algorithm.version=2, which reduce rpc calls to the NN For certain big job, RPC calls are reduced by 40% • Use TEZ instead of MR in Hive. hive.execution.engine=tez • Merge tasks in ETL process, reduce HDFS access https://issues.apache.org/jira/browse/MAPREDUCE-4815
  • 23. Operation Management NameSpace Quota • Set NameSpace Quota of the root of each NS to 300 million Estimate application RPC producing • Count RPC generated by one Application in DEV environment before it enters production. • According to hdfs-audit Limit the HEAVY RPC • Heavy RPC: Recursive operation(delete, getContentSummary) to a huge directory • Implements in Ranger
  • 24. Ambari tuning Challenge: Nodes in Cluster up to 1600, Ambari service becomes very slow Parameters tuning Apply patches Code Improvement New features support: • Support NameNode Federation deployment • High Available of Ambari Server
  • 25. LDAP Tuning for Kerberized Cluster Solutions • Use NSCD to cache user information on local • Support multi LDAP Server in Hadoop Connections on LDAP Server: 7000 -> 700 HDFS groupsHDFS groups LDAP Server1:389 LDAP Server2: 389 MultiLdap GroupsMapping LDAP Server N :389 负载均衡 <property> <name>hadoop.security.group.mapping.ldap.url</name> <value>ldap1:389,ldap2:389,ldap3:389</value> </property> Too many connections Connections to LDAP Server over 7000, latency on LDAP Server node over 8s Load balance
  • 26. HSmart An intelligent Operation management tool helps user to optimize Hadoop cluster and jobs running on the cluster.  Cluster Health Inspection • Score the cluster status • Suggestion for problems  Cluster Resource Prediction  Job Tuning • 35 selected cluster metrics • Predict future resource consumption by LSTM Algorithm • Collect and analyze job log, counters and metrics • Provide tuning suggestion for jobs • Referred to Dr. Elephant@LinkedIn
  • 27. Outline 02 Experience in Construction 01 Introduction of CBA ClusterC 目 录 ONTENTS 03 Future works
  • 28. Challenge in Future  Cluster Scale Growing very fast: 1600 ->14000+ nodes HDFS-Federation limitation? Yarn cluster limitation? Ambari limitation?  Multi Sub-clusters  Single Sub-cluster: 3000 to 5000 nodes  RouterBasedFederation/ YarnFederation  Balance among Namespaces Data divided to NSs by business currently Large load difference among NSs Different Load type on Namenode: Some Namenode has more files but less RPC requests  Balancer to move data from NSs 5000? 4000? 3000? Sub Cluster HDFS Federation YARN NS1 NS2 NS3 NS4 Sub Cluster HDFS Federation YARN NS1 NS2 NS3 NS4 Sub Cluster HDFS Federation YARN NS1 NS2 NS3 NS4 RouterBasedFederatio n YarnFederation balance r balancerAmbari Ambari Ambari
  • 29. Summary Challenge in construction of Hadoop Cluster • Flume、HDFS、Ambari、LDAP • not only follow the community,but also add self work Namenode is the most difficult bottleneck • NS space、RPC performance • Extend NSs、parameter tuning、FairCallQueue、Operation Management Large cluster maintenance • Introduce AI into cluster maintenance Challenge in future: 1600 nodes -> 14000 nodes
  • 30. Thank you! Contact e-mail: panyuxuan@cmss.chinamobile.com
  • 31. Appendix
Code example: compile a SQL statement and get the operator list from the Hive Driver.
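The original slide shows the code as an image, so here is a minimal sketch of the same idea against Hive 1.2-era APIs: start a session, compile a statement with the Hive Driver, and walk the root tasks of the resulting QueryPlan. The example query and the event table are placeholders (the table and any UDFs must already be registered in the session), and the way the operator chain is pulled out of each task's work object differs between Hive versions, so treat this as illustrative rather than our exact implementation.

    import org.apache.hadoop.hive.conf.HiveConf;
    import org.apache.hadoop.hive.ql.Driver;
    import org.apache.hadoop.hive.ql.QueryPlan;
    import org.apache.hadoop.hive.ql.exec.Task;
    import org.apache.hadoop.hive.ql.session.SessionState;

    public class CompileSqlExample {
      public static void main(String[] args) throws Exception {
        HiveConf conf = new HiveConf();
        SessionState.start(conf);                 // a Hive session is required for compilation

        Driver driver = new Driver(conf);
        // Compile only; nothing is submitted to the cluster.
        int rc = driver.compile("select c1, c3 from event where c2 = '1'");
        if (rc != 0) {
          throw new IllegalStateException("compile failed, return code " + rc);
        }

        QueryPlan plan = driver.getPlan();
        // The FilterOperator / SelectOperator chain hangs off each task's work object.
        for (Task<?> task : plan.getRootTasks()) {
          System.out.println(task.getClass().getSimpleName() + " -> " + task.getWork());
        }
      }
    }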

Editor's Notes

  1. Good morning, ladies and gentlemen. My name is Yuxuan Pan, a software engineer working on Big Data, Hadoop and data management at China Mobile Communications Corporation (hereinafter referred to as CMCC). Today I’m very delighted to have the opportunity to make this presentation and share the practice of a large Hadoop cluster in CMCC.
  2. CMCC is now the leading mobile communications corporation in China, and we have nearly 900 million active users in total, 3 million base stations, and 200 million Internet of Things (IoT) connections. All of these generate 100 petabytes of data per day. In CMCC, the Big Data Platform is a "One plus N" architecture. N refers to the distributed Hadoop clusters of branch companies, built for each province’s unique requirements. One means that we have built a centralized Hadoop cluster on a big scale, with more than 1,600 nodes, and we keep collecting data from dozens of distributed clusters to analyze for our business. In total we now have about 15 thousand Hadoop nodes in our Big Data Platform.
  3. I’ve divided my presentation into three parts. First, I will introduce the Centralized Business Analysis cluster of CMCC. Second, I’ll share the experience we learned from the construction of a large Hadoop cluster. In the final part, I’ll discuss the future work and the challenges for the CBA cluster.
  4. The CBA system is a BI system based on the enterprise data warehouse for the Group Company, branch companies and subsidiary companies. Here is the architecture of this system. The data sources shown below contain massive data types, including both structured and unstructured data: 2/3/4G and WLAN logs, detailed data traffic, customer information, service records, web crawler data, and so on. We standardize the data model and build a universal collection system. After collection, the data is loaded into the master data warehouse for further analysis. The analysis platform runs on x86 physical machines, and the upper layer of data analysis is called the open application platform. In this layer we have various applications in different fields, such as finance, security, tourism, traffic, advertisement and healthcare. For instance, we can analyze users’ online behavior, Internet opinion, and customer portraits from the collected data. Moreover, we offer a call record inquiry service for all CMCC customers.
  5. You can see the brief history of the Centralized Business Analysis project on this page. The project started in 2015, and the first stage was finished in August 2016, with 600 physical nodes in total and at most 400 nodes for Hadoop. That was not a large cluster and could only handle the 2/3G log data source. After stage 2, finished in October 2017, the total grew from 600 to 2,400 nodes, and the biggest Hadoop cluster reached 1,600 nodes. I’m going to speak mostly about the cluster in stage 2, because we are still working on stage 3, which aims at 21,000 nodes in total and a maximum of 14,000 Hadoop nodes across two datacenters.
  6. Here are some statistics about the system: the total HDFS capacity is over 62 PB with about 380 million HDFS files. Meanwhile, the cluster takes in 2 petabytes of data every day and runs around 20,000 jobs on YARN. We use Flume to collect data from the gateway in each province, and it writes 14 million files into HDFS every day. The graph below shows the components and deployment of the project. There are 90 Flume nodes for data collection, and the large Hadoop cluster used for data analysis has more than 1,600 nodes. The HBase cluster used for customer Internet detail record inquiries has more than 200 nodes, and we run two active-active HBase clusters for disaster recovery based on HBase replication.
  7. The biggest challenges we face are the large amount of data and the massive number of data types. We had to build the whole cluster in several months from nothing! So from the deployment stage and the test stage to the production stage, we took a fair amount of detours and hit several turning points.
  8. This page highlights what we‘ve done in this project. We concentrate on the data collector with a SQL-based Flume interceptor, HDFS tuning, and cluster deployment and maintenance.
  9. Let's talk about Flume first. We need Flume to do some operations before sending data to HDFS, mainly for security reasons. The data from the gateway is compressed with gzip, so Flume has to decompress it in memory first, and then each line has to be filtered by certain fields based on the different data models. The next step is field normalization, which means we apply rules to different fields to standardize the data. The last important action is encryption. Why do we need encryption here? The reason is that the original data contains some privacy fields, and the most critical one is the phone number, so we must encrypt this private data before transferring it to HDFS, and of course all of these operations must be done in memory. We chose Flume to do all of these things, but several problems occurred when we hard-coded all the processing logic in the Flume interceptor. Performance is the biggest challenge: throughput is only 50 megabytes per second per node, so we would need 400 nodes in all to satisfy the data ingestion requirements. The other problem is that the Flume agent becomes unstable with GC overhead because of the in-memory data processing.
  10. Previously, we did different things in different interceptors: one for filtering, one for normalization and one for encryption, chained in series. That is actually the traditional way to use Flume interceptors. But in the standard Flume interface, the input and output of an interceptor are Events, and an Event is just a byte array, so every interceptor needs to deserialize the Event to a String first, process the string, and then serialize it back to an Event. As a result, each record is serialized and deserialized three times along the whole processing chain, which costs a lot of CPU and produces many intermediate objects that cause GC overhead. Now let’s talk about the SQL-based interceptor. In my opinion, SQL is the best way to process structured data, and we can use UDFs, that is user-defined functions, to implement specific logic, so all the processing logic can be expressed in one SQL statement. We use the Hive SQL engine to parse the SQL and get the query execution plan. Hive produces the operator chain shown below: the FilterOperator processes the WHERE clause of the SQL for data filtering, the SelectOperator processes the SELECT clause including expressions and function calls, and the SinkOperator is responsible for converting the record object to a Flume Event. As we know, Hive can compile SQL into MapReduce jobs, but in this case we only use the Hive parser to get the operator list, which can be executed locally, so there is no dependency on HDFS or MapReduce.
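To make the shape of this concrete, here is a stripped-down sketch of such an interceptor. Only the Flume Interceptor and Builder interfaces are real; the OperatorChain type, its process method, the compile helper and the "sql" config key are placeholders standing in for our Hive-derived operator pipeline.

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.flume.Context;
    import org.apache.flume.Event;
    import org.apache.flume.event.EventBuilder;
    import org.apache.flume.interceptor.Interceptor;

    public class SqlInterceptor implements Interceptor {

      /** Placeholder for the FilterOperator -> SelectOperator -> SinkOperator chain built from the SQL. */
      interface OperatorChain {
        byte[] process(byte[] record);   // returns null when the record is filtered out
      }

      private final OperatorChain chain;

      SqlInterceptor(OperatorChain chain) {
        this.chain = chain;
      }

      @Override public void initialize() { }

      @Override
      public Event intercept(Event event) {
        byte[] out = chain.process(event.getBody());
        return out == null ? null : EventBuilder.withBody(out);   // null drops the event
      }

      @Override
      public List<Event> intercept(List<Event> events) {
        List<Event> result = new ArrayList<>(events.size());
        for (Event e : events) {
          Event processed = intercept(e);
          if (processed != null) {
            result.add(processed);
          }
        }
        return result;
      }

      @Override public void close() { }

      /** Flume builds interceptors through this factory; the SQL would come from the agent config. */
      public static class Builder implements Interceptor.Builder {
        private OperatorChain chain;

        @Override
        public void configure(Context context) {
          String sql = context.getString("sql");   // "sql" is a hypothetical config key
          this.chain = compile(sql);               // hypothetical: parse with Hive, build operators
        }

        @Override
        public Interceptor build() {
          return new SqlInterceptor(chain);
        }

        private OperatorChain compile(String sql) {
          // Placeholder: in the real implementation Hive's parser produces the operator list here.
          return record -> record;
        }
      }
    }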
  11. Now let’s look at serialization and deserialization in Flume. In the past, we converted the byte array to a String directly and used String.split to get column values, which spent more than 80% of the CPU time on serialization and deserialization. After we introduced the Hive-based interceptor, we naturally started to use LazySimpleSerDe from Hive. LazySimpleSerDe has several advantages: instead of treating all columns as Strings, it outputs typed columns and creates objects lazily, which gives better performance. But LazySimpleSerDe alone is not enough for us. Most of our records have about 200 columns, and not all of them need to be normalized or encrypted, so we also developed a new SerDe based on LazySimpleSerDe that merges the unprocessed fields. Look at the graph below: to process column 7 and column 12, we only need to locate the seventh and the twelfth columns by delimiter order. Columns 1 to 6 and columns 8 to 11 are treated as a single merged field, and serialization and deserialization are applied only to columns 7 and 12. With this we roughly doubled the processing speed.
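The merged-field idea itself can be shown with a toy example in plain Java (this is only an illustration, not our SerDe, which is built on top of Hive's LazySimpleSerDe): locate the delimiters of the columns that need work, rewrite just those, and copy everything in between through as opaque blocks.

    import java.util.function.UnaryOperator;

    public class MergedColumnDemo {

      /**
       * Rewrites only the given 1-based columns (sorted ascending) of a delimited record.
       * Untouched spans are copied through as whole blocks, without per-column objects.
       */
      static String rewriteColumns(String line, char delim, int[] targets,
                                   UnaryOperator<String> transform) {
        StringBuilder out = new StringBuilder(line.length());
        int col = 1;          // current 1-based column number
        int fieldStart = 0;   // start index of the current column
        int copiedUpTo = 0;   // everything before this index is already in 'out'
        int t = 0;            // position in the 'targets' array
        for (int i = 0; i <= line.length(); i++) {
          if (i == line.length() || line.charAt(i) == delim) {
            if (t < targets.length && col == targets[t]) {
              out.append(line, copiedUpTo, fieldStart);                    // flush the merged block
              out.append(transform.apply(line.substring(fieldStart, i)));  // rewrite this column only
              copiedUpTo = i;
              t++;
            }
            fieldStart = i + 1;
            col++;
          }
        }
        out.append(line, copiedUpTo, line.length());                       // trailing merged block
        return out.toString();
      }

      public static void main(String[] args) {
        String record = "a|b|c|d|e|f|13800000000|h|i|j|k|20180601";
        // "Encrypt" column 7 and "normalize" column 12; here we just mark them for demonstration.
        System.out.println(rewriteColumns(record, '|', new int[] {7, 12}, f -> "<" + f + ">"));
      }
    }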
  12. Here are some other tunings for Flume. First, we use ConcurrentLinkedDeque instead of LinkedBlockingDeque in Flume’s MemoryChannel. LinkedBlockingDeque has a lock inside, which limits the throughput of Flume, while ConcurrentLinkedDeque has no lock and is based on compare-and-set (CAS). This adjustment improves throughput by 2 to 3 times in our scenario. The second tuning is the Flume channel capacity, which is the number of Events that can be stored in the memory channel. In general, the throughput of the Flume sink should be bigger than that of the Flume source to prevent a data backlog in the memory channel. Usually the channel is empty, so if a network fluctuation happens, the channel smooths out the input and output. But the channel capacity is not a case of "the bigger, the better": when the channel size grows beyond 10 gigabytes it causes heavy GC overhead. In our experience the channel should stay under 5 gigabytes, so we set the parameter to 2 million in our case. Two other parameters also need to be adjusted: the memory channel keep-alive and the HDFS sink idleTimeout. The keep-alive makes the source wait for a moment when the channel is full instead of reporting errors immediately; we changed the value from 3 seconds to 24 seconds in our case, keeping it below the upstream retransmission timeout. The hdfs idleTimeout makes Flume close a file if there is no input during the whole idle period, but the default of 60 seconds produced many small files in HDFS, so after several rounds of tests we adjusted the value to 600 seconds, which reduced small files by 70%. We also improved JVM performance by adding the option -XX:+UseLargePages to speed up memory addressing for a large JVM heap. Through all of these tunings, the throughput of the Flume data collector improved significantly from 50 megabytes per second to 790 megabytes per second, and the number of physical machines used for data collection dropped from 400 to 90. The Flume tunings saved us a lot of cost.
  13. Next, let‘s talk about the challenges of HDFS. The biggest challenge is the stability of the NameNode. We all know that as the number of HDFS files grows, the memory consumed by the NameNode grows as well. We have 300 million files in one namespace, and NameNode memory usage is over 150 gigabytes. A large heap causes long stop-the-world GC pauses, and so many files also bring more RPC calls from application clients, so NameNode RPC performance becomes the bottleneck of the whole system. One NameNode has to process 30 million RPC calls per hour, and when the cluster is busy, the RPC response time exceeds 10 seconds because calls accumulate in the call queue. The worst thing is that the heavy RPC load can finally cause the active NameNode to shut down. Although we have a high-availability NameNode setup, a code bug causes a deadlock during NameNode failover in high-concurrency situations, so the standby NameNode cannot promote itself to the active state. In the end, we had to restart all NameNodes whenever the active NameNode failed.
  14. The NameNode kept going down under heavy pressure, and each restart takes about 2 hours, because a starting NameNode first loads the filesystem metadata into memory and then processes the block reports from thousands of DataNodes. These actions are time-consuming in a large HDFS cluster, and more than 2 hours of downtime in a production cluster is unacceptable. So HDFS stability was a serious problem we needed to solve. We use this picture of Jackie Chan to express our feeling.
  15. In the past, we used HDFS Federation with 2 NameSpaces: NS1 for Hive/Flume/Apps and NS2 only for Apps. That division could no longer handle the load, so we scaled the NameSpaces from 2 to 5: one for Hive, one for Flume and three for Apps. In this way the pressure on each NameSpace is reduced, and a failed NameSpace does not affect the others. We also found over 100 million YARN log files in HDFS. Previously we stored YARN logs in HDFS for ten days, which occupied a large amount of HDFS resources, so we introduced a log tool in Apache Ambari called Logsearch and now write YARN logs to Apache Solr instead of HDFS. This way we don‘t need to keep YARN logs for a long time and can use Logsearch for further analysis of the logs. By splitting the NameSpaces and removing the YARN logs from HDFS, the memory used by the NameNode is nearly halved, from 160 gigabytes to 90 gigabytes.
  16. The next thing I want to introduce is FairCallQueue. Let‘s revisit the RPC call queue in the old versions of Hadoop first. It is a single linear queue: when the queue is congested with RPC calls, the NameNode is overloaded and fails to respond, RPC latency becomes very high, and that has a significant impact on data collection and may cause Hadoop job failures. Our cluster is a multi-tenant one, and we found that most RPC calls come from batch-job users, while Flume tasks require low HDFS latency to avoid losing data, so the Flume user needs higher priority when accessing HDFS. We use FairCallQueue, which has four queues with different priorities. Flume tasks, which issue few RPC calls, are placed in the high-priority queue; batch jobs, which issue massive numbers of RPC calls, are placed in the low-priority queue. With the help of FairCallQueue, the RPC latency seen by Flume decreases from 10 seconds to half a second. If you are interested in FairCallQueue, please refer to the JIRA link below for more details.
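For reference, FairCallQueue is enabled per NameNode RPC port in core-site.xml. A minimal snippet is shown below, in the same property style used on slide 25; 8020 stands for our NameNode RPC port, and the additional ipc.<port>.* keys that control the number of priority levels and client back-off have names that vary between Hadoop releases, so check the documentation for your version.

    <!-- core-site.xml on the NameNode; 8020 is the NameNode RPC port in this example -->
    <property>
      <name>ipc.8020.callqueue.impl</name>
      <value>org.apache.hadoop.ipc.FairCallQueue</value>
    </property>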
  17. In the past, we used the Concurrent Mark Sweep (CMS) collector for the NameNode. Most of the time it worked well, but as the memory usage of the NameNode grew to almost 200 gigabytes, full GC time increased and so did the CPU cost. Sometimes we hit a concurrent mode failure during full GC: object allocation was faster than garbage collection, so the old generation of the JVM heap filled up before the CMS collection finished. This is called a Concurrent Mode Failure in CMS GC. To avoid this kind of failure and shorten GC pauses, we compared CMS with the newer G1 collector. G1 performs better than CMS at avoiding long GC pauses. Taking the new generation as an example, you can see GC time drop from 15 milliseconds to 2 milliseconds with JDK 8.
  18. There is another problem we met, about block placement. As is well known, each block in HDFS has three replicas for fault tolerance, and ideally all blocks are balanced across racks with the help of the rack-awareness policy. The problem we met is that the number of nodes on each rack may differ, so nodes on the smaller racks receive more blocks, which leads to them running out of space and carrying a heavy load. We analyzed this and found that with the default block placement policy, placement of the first replica is balanced, but placement of the second and third replicas becomes imbalanced when the racks are not balanced.
  19. To solve the imbalance of block placement across racks, we implemented a new BlockPlacementPolicy called WeightedRackBlockPlacementPolicy. This policy calculates the probability of block placement on each rack according to the number of nodes on that rack, then calculates a weight for each rack according to the probability, where a higher probability corresponds to a lower weight. Finally, it adjusts the block placement by rack weight. As a result, block placement is much more balanced across racks in our cluster. If you are interested in this, please refer to the JIRA link here.
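The weighting itself is simple arithmetic. The toy sketch below shows one plausible way to turn per-rack node counts into selection weights and pick a rack by weighted random choice; it only illustrates the idea described on the slide and is not the code attached to HDFS-13279.

    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.Random;

    public class WeightedRackChooser {

      /** Weight each rack by its share of the cluster's nodes, so small racks are picked less often. */
      static Map<String, Double> rackWeights(Map<String, Integer> nodesPerRack) {
        int total = nodesPerRack.values().stream().mapToInt(Integer::intValue).sum();
        Map<String, Double> weights = new LinkedHashMap<>();
        nodesPerRack.forEach((rack, nodes) -> weights.put(rack, nodes / (double) total));
        return weights;
      }

      /** Weighted random selection of a rack. */
      static String chooseRack(Map<String, Double> weights, Random rnd) {
        double r = rnd.nextDouble();
        double acc = 0.0;
        String last = null;
        for (Map.Entry<String, Double> e : weights.entrySet()) {
          acc += e.getValue();
          last = e.getKey();
          if (r < acc) {
            return last;
          }
        }
        return last;   // guard against floating-point rounding
      }

      public static void main(String[] args) {
        // Two normal racks with 15 nodes each and one small rack with 5 nodes, as in the table on slide 21.
        Map<String, Integer> racks = new LinkedHashMap<>();
        racks.put("/rack1", 15);
        racks.put("/rack2", 15);
        racks.put("/rack3", 5);
        Map<String, Double> weights = rackWeights(racks);

        Random rnd = new Random(42);
        int[] hits = new int[3];
        for (int i = 0; i < 100_000; i++) {
          String rack = chooseRack(weights, rnd);
          hits["/rack1".equals(rack) ? 0 : "/rack2".equals(rack) ? 1 : 2]++;
        }
        System.out.printf("rack1=%d rack2=%d rack3=%d%n", hits[0], hits[1], hits[2]);
      }
    }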
  20. We have some other tunings for Hadoop. First, we changed the algorithm version of the MapReduce FileOutputCommitter from the default value 1 to 2. This helps a lot when a big job generates many files to commit, and there are many such jobs in our cluster. In the default algorithm version 1, the commit is single-threaded and waits until all tasks of a job have completed before starting, which means such a job needs a lot of extra rename operations after all tasks finish and suffers from this performance issue. Version 2 changes the behavior of the job commit: the rename is done in the commit of each task, and the job commit only needs to rename a directory to the job output. The new algorithm parallelizes the work and reduces the output commit time for large jobs; performance increased by 40%, as shown in the JIRA link below. For Hive jobs, we changed the execution engine from MR to Tez. Tez can be thought of as a more flexible and powerful successor to the MapReduce framework; it is based on directed acyclic graphs (DAGs) and performs better than MR. We also merge tasks in the ETL process to reduce HDFS access.
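For reference, these are the two property names involved; whether you set them cluster-wide or per job and session is up to you. The snippet follows the property style already used on slide 25.

    <!-- mapred-site.xml, or per job via -D on the command line -->
    <property>
      <name>mapreduce.fileoutputcommitter.algorithm.version</name>
      <value>2</value>
    </property>

    <!-- hive-site.xml, or per session with: set hive.execution.engine=tez; -->
    <property>
      <name>hive.execution.engine</name>
      <value>tez</value>
    </property>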
  21. Besides tuning parameters and making code improvements, we also need some operation management for HDFS. First of all, it is better to set a NameSpace quota on each HDFS NameSpace. If we don’t, NameNode memory usage will grow along with the file count and eventually cause a NameNode shutdown, so we limit the quota on the root path of each NameSpace to keep the NameNode from running out of memory; in our cluster each NameSpace quota is 300 million. Second, we developed models to estimate how many RPC calls an application produces. With these models, which are based on the hdfs-audit log, we can count the RPC calls generated by an application in the development environment before pushing it to production. Another piece of operation management is limiting heavy RPC. What is a heavy RPC? A recursive operation such as delete or getContentSummary on a huge directory. This kind of RPC call locks the FSNamesystem for a long time and increases NameNode RPC latency, so we stay vigilant about such operations and implement the restriction in Apache Ranger.
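A toy version of the audit-based counting looks like the sketch below: it simply tallies cmd= per ugi= from hdfs-audit.log lines. The exact audit line layout can vary with version and security settings, so the parsing is illustrative only.

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    import java.util.stream.Stream;

    public class AuditRpcCounter {
      // Typical audit line: "... FSNamesystem.audit: allowed=true ugi=appuser (auth:KERBEROS) ip=/10.0.0.1 cmd=getfileinfo src=/warehouse/t dst=null ..."
      private static final Pattern P = Pattern.compile("ugi=(\\S+).*?cmd=(\\S+)");

      public static void main(String[] args) throws IOException {
        Map<String, Long> callsPerUserAndCmd = new HashMap<>();
        try (Stream<String> lines = Files.lines(Paths.get(args[0]))) {   // path to hdfs-audit.log
          lines.forEach(line -> {
            Matcher m = P.matcher(line);
            if (m.find()) {
              callsPerUserAndCmd.merge(m.group(1) + " " + m.group(2), 1L, Long::sum);
            }
          });
        }
        // Print the 20 heaviest user/command combinations.
        callsPerUserAndCmd.entrySet().stream()
            .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
            .limit(20)
            .forEach(e -> System.out.println(e.getValue() + "\t" + e.getKey()));
      }
    }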
  22. Now let’s talk about Ambari. We use Apache Ambari for cluster deployment and management, and we improved it with new features such as support for NameNode Federation deployment and high availability of the Ambari Server. Here is an example of 2 federated NameSpaces deployed with Ambari. The biggest challenge we faced is that the Ambari web UI becomes very slow when the cluster grows to 1,600 nodes, which gives a poor user experience to the cluster operation and maintenance engineers. So we did some performance tuning, as shown below. Besides tuning parameters, we referred to the community and worked on several patches to improve performance. In the end, the web page response time of the Ambari server dropped to a level that satisfies our requirements.
  23. The next tuning I want to talk about is LDAP, which is short for Lightweight Directory Access Protocol. Why do we need LDAP in Hadoop? First of all, our cluster enables Kerberos for authentication. To run MR jobs in a Kerberized cluster, a local Linux user must exist on every NodeManager before the MR tasks run, which is inconvenient in a multi-tenant cluster where users are frequently added or deleted. LDAP helps us simplify user management, and Hadoop supports it by switching the group mapping implementation to LdapGroupsMapping. But with massive numbers of users running jobs in the cluster, LDAP performance suffers a lot: the number of connections to the LDAP server exceeded 7,000, and the latency on the LDAP server node exceeded 8 seconds. After analyzing LDAP, we found that every user and group lookup sets up a connection to the LDAP server. We have two solutions for this. First, we use NSCD, the name service cache daemon, a cache service that caches LDAP users locally, so apart from cache misses there is no need to set up a connection to the LDAP server every time. The other improvement is that we support multiple LDAP servers in Hadoop, distributing connections across different servers with a round-robin policy for load balancing. With the help of NSCD and MultiLdapGroupsMapping, the connections to the server dropped from 7,000 to 700, and it performs well for massive numbers of users in our cluster.
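Our MultiLdapGroupsMapping is not published, but its shape can be sketched as a GroupMappingServiceProvider that round-robins over several stock LdapGroupsMapping instances, one per server URL. The class below is a simplified illustration (no failover or retry handling), assuming hadoop.security.group.mapping.ldap.url carries a comma-separated server list as on slide 25.

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.atomic.AtomicInteger;

    import org.apache.hadoop.conf.Configurable;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.security.GroupMappingServiceProvider;
    import org.apache.hadoop.security.LdapGroupsMapping;

    public class MultiLdapGroupsMapping implements GroupMappingServiceProvider, Configurable {

      private final List<LdapGroupsMapping> delegates = new ArrayList<>();
      private final AtomicInteger next = new AtomicInteger();
      private Configuration conf;

      @Override
      public void setConf(Configuration conf) {
        this.conf = conf;
        // hadoop.security.group.mapping.ldap.url holds a comma-separated server list in this setup.
        for (String url : conf.getTrimmedStrings(LdapGroupsMapping.LDAP_URL_KEY)) {
          Configuration c = new Configuration(conf);
          c.set(LdapGroupsMapping.LDAP_URL_KEY, url);   // each delegate talks to a single server
          LdapGroupsMapping mapping = new LdapGroupsMapping();
          mapping.setConf(c);
          delegates.add(mapping);
        }
      }

      @Override
      public Configuration getConf() {
        return conf;
      }

      @Override
      public List<String> getGroups(String user) throws IOException {
        // Round-robin across servers to spread the connection load.
        int i = Math.floorMod(next.getAndIncrement(), delegates.size());
        return delegates.get(i).getGroups(user);
      }

      @Override
      public void cacheGroupsRefresh() throws IOException {
        for (LdapGroupsMapping m : delegates) {
          m.cacheGroupsRefresh();
        }
      }

      @Override
      public void cacheGroupsAdd(List<String> groups) throws IOException {
        for (LdapGroupsMapping m : delegates) {
          m.cacheGroupsAdd(groups);
        }
      }
    }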
  24. After building the large Hadoop cluster, we also need a large amount of human resources for cluster operation and maintenance. We developed a tool called HSmart, an intelligent operation management tool that helps us maintain and optimize the Hadoop cluster. There is a knowledge base inside the tool so we can quickly find a solution when a cluster failure occurs, and the tool can automatically analyze Hadoop jobs and give suggestions for optimizing them. With the Cluster Health Inspection feature, we configure the points we want to check, such as weak passwords and disk/CPU/memory usage. After scanning the cluster with the configured policy, we get a health score for the cluster as well as suggestions for the problems found, which helps us inspect the cluster automatically. The next feature is Cluster Resource Prediction, which predicts future resource consumption from historical resource usage using the LSTM algorithm (LSTM is short for Long Short-Term Memory, a deep learning algorithm). With the prediction we can spot problems and take measures in time. The last example is Job Tuning: HSmart collects and analyzes job logs, counters and metrics, then provides tuning suggestions for jobs in the cluster, such as mapper memory usage and mapper GC. This is based on Dr. Elephant from LinkedIn.
  25. The last part of today‘s topic is the future work for this project. The number of nodes will grow from 1,600 to 14,000 by the end of this year, which is really fast, so we will face the limits of HDFS Federation, of the YARN cluster, of Ambari and so on. It is impossible to handle this scale with one whole big cluster, so we will deploy multiple sub-clusters instead, each with 3,000 to 5,000 nodes. We will then use RouterBasedFederation to maintain the different HDFS sub-clusters and YarnFederation to maintain the different YARN sub-clusters. Data will be placed in different NameSpaces, and we also need to develop a new balancer to move data between NameSpaces.
  26. I’d like to finish with a short summary of today‘s presentation. First of all, the challenges we faced were in the construction of a large Hadoop cluster, mainly around Flume, HDFS, Ambari and LDAP, and we not only followed the community but also added our own work. Meanwhile, the NameNode is the most difficult bottleneck because of RPC performance. To respond to these challenges, we extended the NameSpaces from 2 to 5, tuned parameters, used FairCallQueue to reduce data collection latency, and added some operation management. For large cluster maintenance, we introduced AI into the cluster maintenance tool, which helps us reduce the human resource investment. In the future there may be more challenges, but we are going to take them on, as always, and I think we will have more practices to share.
  27. That's all for my presentation today. Thank you for listening. If you have any questions, I’d be glad to answer them now.
  28. Here is a code example for parsing SQL and getting the operator list from the Hive Driver. It’s very simple.