SlideShare a Scribd company logo
1 of 26
Grid Operations



Hadoop Performance at LinkedIn
Allen Wittenauer
Grid Computing Architect


©2012 LinkedIn Corporation. All Rights Reserved.
©2012 LinkedIn Corporation. All Rights Reserved.
“I have never seen a Hadoop cluster that was
             legitimately CPU bound.”
                -- Milind Bhandarkar
                -- Milind Bhandarkar
                -- Milind Bhandarkar



©2012 LinkedIn Corporation. All Rights Reserved.
X5650 - 6 Core @ 2.67 MHz




©2012 LinkedIn Corporation. All Rights Reserved.
X5650 - 6 Core @ 2.67 MHz




©2012 LinkedIn Corporation. All Rights Reserved.
“I have only seen one Hadoop cluster that was
            legitimately CPU bound.”
               -- Milind Bhandarkar
               -- Milind Bhandarkar
               -- Milind Bhandarkar



©2012 LinkedIn Corporation. All Rights Reserved.
Why do we have such high CPU usage?




©2012 LinkedIn Corporation. All Rights Reserved.
We do a lot of Graph Theory.




©2012 LinkedIn Corporation. All Rights Reserved.
Ticket to Ride




   Ticket To Ride is a registered trademark of Days of Wonder


    ©2012 LinkedIn Corporation. All Rights Reserved.             GRID OPERATIONS
Social Graph




©2012 LinkedIn Corporation. All Rights Reserved.   GRID OPERATIONS
2nd Degree Connection




©2012 LinkedIn Corporation. All Rights Reserved.   GRID OPERATIONS
We under-commit our memory.




©2012 LinkedIn Corporation. All Rights Reserved.
Our Hadoop Software Needs... The Plan...

  Tasks
     – 2 GB of RAM = 1 GB of JVM Heap, .5-1GB for non-heap
     – (Typically) 1 Super Active Threads


  TaskTracker
     – 1.5 GB of RAM = 1 GB of JVM Heap, .5GB for non-heap
     – 1-4 Super Active Threads


  DataNode
     – 1.5 GB of RAM = 1 GB of JVM Heap, .5GB for non-heap
     – 1-4 Super Active Threads


  RAM: 3GB + (task count * 2GB) + OS needs
  Threads: 8 + (task count) + OS needs


©2012 LinkedIn Corporation. All Rights Reserved.             GRID OPERATIONS
Our Hadoop Software Needs... The Reality

  Task Counts
     – Westmere (5650): 6
       Cores+HT = 12
       Tasks
     – Sandy Bridge
       (2640): 6 Cores+HT
       = 14 Tasks


  Most of our tasks
   leave at most .5
   GB free
     – = combined -> very
       large buffer & cache




©2012 LinkedIn Corporation. All Rights Reserved.   GRID OPERATIONS
We don’t have as many disks per node.




©2012 LinkedIn Corporation. All Rights Reserved.
Typical Hadoop Node Out in the Wild

  Most user’s don’t know their actual
   needs
     – Vendor advice... play it safe!


  Significantly more memory
     – “For the future!”
     – Badly written code
  Significantly more disk
     – “Hadoop is IO intensive!”
     – “Greater task locality!”


  Greater performance...but is it worth
   the cost...



©2012 LinkedIn Corporation. All Rights Reserved.   GRID OPERATIONS
What Happens With Fewer Disks?

  Physical footprint requirements are smaller
  Linux buffers & caches are more efficient
     – More per disk
     – Fewer to manage
  Spindle count DOES matter... but the price/perf isn’t there for our
   workflows.
     – From a few years ago & based on store.sun.com prices (so not “real”)...

     Nodes/Cores                         RAM/Bus      Disks   Time In Minutes   HW Cost*
             3/24                           16/half    8          254.98         $37827
             3/24                           24/full    8          244.50         $38817
             3/24                           16/half    4          257.38         $21456
             3/24                           24/full    4          246.82         $22986
             6/48                           16/half    4          126.98         $42912

©2012 LinkedIn Corporation. All Rights Reserved.                                    GRID OPERATIONS
LinkedIn Node Configuration

  No RAID controller
     – More cost for negative perf when doing
       JBOD


  6 Drives
     – Still fits in 1U w/SATA drives
     – ~same perf as 8 drives


  Less metal = cheaper cost




©2012 LinkedIn Corporation. All Rights Reserved.   GRID OPERATIONS
Rack Level View

  If we assume we can use 40u in a rack then:
     – More CPUs
     – Just as many HDs
     – More Network
     – Potentially more RAM




©2012 LinkedIn Corporation. All Rights Reserved.   GRID OPERATIONS
We care about file system tuning.




©2012 LinkedIn Corporation. All Rights Reserved.
LinkedIn Hadoop Disk/File Systems

  noatime Enabled

  writeback Enabled

  Each Disk (except root) Partitions:
     – Swap
     – MapReduce Spill Space
     – HDFS


  Delayed Commits
     – Why write once when you can do ganged writes more efficiently?




©2012 LinkedIn Corporation. All Rights Reserved.                        GRID OPERATIONS
We care about job tuning.




©2012 LinkedIn Corporation. All Rights Reserved.
LinkedIn Job Tuning Guidelines

  All jobs get reviewed prior to going to production.

  Task times should be between 5-15 minutes.

  Jobs should have less than 10,000 tasks.

  Jobs should be smart about # of files and the size of those files
   generated.




©2012 LinkedIn Corporation. All Rights Reserved.                  GRID OPERATIONS
... and the result?




©2012 LinkedIn Corporation. All Rights Reserved.
Why is LinkedIn Running so Hot?

  We do a lot of non-MapReduce work.

  RAM buffers and caches allow us to offset a lot of disk IO.

  We audit our jobs.

  As a result, our CPUs are actually busy.




©2012 LinkedIn Corporation. All Rights Reserved.                 GRID OPERATIONS
©2012 LinkedIn Corporation. All Rights Reserved.   BUSINESS OPERATIONS

More Related Content

What's hot

How to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop ClusterHow to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop ClusterAltoros
 
Hadoop 101
Hadoop 101Hadoop 101
Hadoop 101EMC
 
Moving from C#/.NET to Hadoop/MongoDB
Moving from C#/.NET to Hadoop/MongoDBMoving from C#/.NET to Hadoop/MongoDB
Moving from C#/.NET to Hadoop/MongoDBMongoDB
 
White paper hadoop performancetuning
White paper hadoop performancetuningWhite paper hadoop performancetuning
White paper hadoop performancetuningAnil Reddy
 
Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationDataWorks Summit
 
Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)Kathleen Ting
 
Introduction to hadoop administration jk
Introduction to hadoop administration   jkIntroduction to hadoop administration   jk
Introduction to hadoop administration jkEdureka!
 
Apache Spark Introduction @ University College London
Apache Spark Introduction @ University College LondonApache Spark Introduction @ University College London
Apache Spark Introduction @ University College LondonVitthal Gogate
 
Hadoop Operations at LinkedIn
Hadoop Operations at LinkedInHadoop Operations at LinkedIn
Hadoop Operations at LinkedInDataWorks Summit
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryCloudera, Inc.
 
The Hadoop Ecosystem
The Hadoop EcosystemThe Hadoop Ecosystem
The Hadoop EcosystemJ Singh
 
Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)mundlapudi
 
Big Data Performance and Capacity Management
Big Data Performance and Capacity ManagementBig Data Performance and Capacity Management
Big Data Performance and Capacity Managementrightsize
 
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsBig Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsEsther Kundin
 
Treasure Data on The YARN - Hadoop Conference Japan 2014
Treasure Data on The YARN - Hadoop Conference Japan 2014Treasure Data on The YARN - Hadoop Conference Japan 2014
Treasure Data on The YARN - Hadoop Conference Japan 2014Ryu Kobayashi
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to HadoopRan Ziv
 
Hello OpenStack, Meet Hadoop
Hello OpenStack, Meet HadoopHello OpenStack, Meet Hadoop
Hello OpenStack, Meet HadoopDataWorks Summit
 
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3tcloudcomputing-tw
 

What's hot (20)

How to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop ClusterHow to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop Cluster
 
Hadoop 101
Hadoop 101Hadoop 101
Hadoop 101
 
Moving from C#/.NET to Hadoop/MongoDB
Moving from C#/.NET to Hadoop/MongoDBMoving from C#/.NET to Hadoop/MongoDB
Moving from C#/.NET to Hadoop/MongoDB
 
White paper hadoop performancetuning
White paper hadoop performancetuningWhite paper hadoop performancetuning
White paper hadoop performancetuning
 
Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux Configuration
 
Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)
 
Introduction to hadoop administration jk
Introduction to hadoop administration   jkIntroduction to hadoop administration   jk
Introduction to hadoop administration jk
 
Apache Spark Introduction @ University College London
Apache Spark Introduction @ University College LondonApache Spark Introduction @ University College London
Apache Spark Introduction @ University College London
 
Hadoop Operations at LinkedIn
Hadoop Operations at LinkedInHadoop Operations at LinkedIn
Hadoop Operations at LinkedIn
 
Tune hadoop
Tune hadoopTune hadoop
Tune hadoop
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery
 
The Hadoop Ecosystem
The Hadoop EcosystemThe Hadoop Ecosystem
The Hadoop Ecosystem
 
Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)
 
Pptx present
Pptx presentPptx present
Pptx present
 
Big Data Performance and Capacity Management
Big Data Performance and Capacity ManagementBig Data Performance and Capacity Management
Big Data Performance and Capacity Management
 
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsBig Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
 
Treasure Data on The YARN - Hadoop Conference Japan 2014
Treasure Data on The YARN - Hadoop Conference Japan 2014Treasure Data on The YARN - Hadoop Conference Japan 2014
Treasure Data on The YARN - Hadoop Conference Japan 2014
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Hello OpenStack, Meet Hadoop
Hello OpenStack, Meet HadoopHello OpenStack, Meet Hadoop
Hello OpenStack, Meet Hadoop
 
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
 

Similar to Hadoop Performance at LinkedIn

Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...
Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...
Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...Sematext Group, Inc.
 
Kafka at half the price with JBOD setup
Kafka at half the price with JBOD setupKafka at half the price with JBOD setup
Kafka at half the price with JBOD setupDong Lin
 
BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...
BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...
BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...Big Data Montreal
 
Bigdata and Hadoop with Docker
Bigdata and Hadoop with DockerBigdata and Hadoop with Docker
Bigdata and Hadoop with Dockerharidasnss
 
Right time Vs real time
Right time Vs real timeRight time Vs real time
Right time Vs real timeMurphy Choy
 
Turbocharging php applications with zend server (workshop)
Turbocharging php applications with zend server (workshop)Turbocharging php applications with zend server (workshop)
Turbocharging php applications with zend server (workshop)Eric Ritchie
 
Light-weighted HDFS disaster recovery
Light-weighted HDFS disaster recoveryLight-weighted HDFS disaster recovery
Light-weighted HDFS disaster recoveryDataWorks Summit
 
Data lake – On Premise VS Cloud
Data lake – On Premise VS CloudData lake – On Premise VS Cloud
Data lake – On Premise VS CloudIdan Tohami
 
Complex Ephemeral Caching With Redis: Jeff Pollard
Complex Ephemeral Caching With Redis: Jeff PollardComplex Ephemeral Caching With Redis: Jeff Pollard
Complex Ephemeral Caching With Redis: Jeff PollardRedis Labs
 
Turbocharging php applications with zend server
Turbocharging php applications with zend serverTurbocharging php applications with zend server
Turbocharging php applications with zend serverEric Ritchie
 
Unleashing the power of Scrum and Kanban together - Best of Both Worlds!!
Unleashing the power of Scrum and Kanban together - Best of Both Worlds!!Unleashing the power of Scrum and Kanban together - Best of Both Worlds!!
Unleashing the power of Scrum and Kanban together - Best of Both Worlds!!Nitin Ramrakhyani
 
Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?
Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?  Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?
Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You? EMC
 
MySQL Performance Best Practices
MySQL Performance Best PracticesMySQL Performance Best Practices
MySQL Performance Best PracticesOlivier DASINI
 
An Active and Hybrid Storage System for Data-intensive Applications
An Active and Hybrid Storage System for Data-intensive ApplicationsAn Active and Hybrid Storage System for Data-intensive Applications
An Active and Hybrid Storage System for Data-intensive ApplicationsXiao Qin
 
Making Sense of Big data with Hadoop
Making Sense of Big data with HadoopMaking Sense of Big data with Hadoop
Making Sense of Big data with HadoopGwen (Chen) Shapira
 

Similar to Hadoop Performance at LinkedIn (20)

Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...
Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...
Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...
 
Kafka at half the price with JBOD setup
Kafka at half the price with JBOD setupKafka at half the price with JBOD setup
Kafka at half the price with JBOD setup
 
BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...
BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...
BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal...
 
Bigdata and Hadoop with Docker
Bigdata and Hadoop with DockerBigdata and Hadoop with Docker
Bigdata and Hadoop with Docker
 
Right time Vs real time
Right time Vs real timeRight time Vs real time
Right time Vs real time
 
HugNov14
HugNov14HugNov14
HugNov14
 
Turbocharging php applications with zend server (workshop)
Turbocharging php applications with zend server (workshop)Turbocharging php applications with zend server (workshop)
Turbocharging php applications with zend server (workshop)
 
Light-weighted HDFS disaster recovery
Light-weighted HDFS disaster recoveryLight-weighted HDFS disaster recovery
Light-weighted HDFS disaster recovery
 
Data lake – On Premise VS Cloud
Data lake – On Premise VS CloudData lake – On Premise VS Cloud
Data lake – On Premise VS Cloud
 
Intoduction to OrientDB
Intoduction to OrientDBIntoduction to OrientDB
Intoduction to OrientDB
 
Complex Ephemeral Caching With Redis: Jeff Pollard
Complex Ephemeral Caching With Redis: Jeff PollardComplex Ephemeral Caching With Redis: Jeff Pollard
Complex Ephemeral Caching With Redis: Jeff Pollard
 
OpenStack Days Krakow
OpenStack Days KrakowOpenStack Days Krakow
OpenStack Days Krakow
 
Turbocharging php applications with zend server
Turbocharging php applications with zend serverTurbocharging php applications with zend server
Turbocharging php applications with zend server
 
Unleashing the power of Scrum and Kanban together - Best of Both Worlds!!
Unleashing the power of Scrum and Kanban together - Best of Both Worlds!!Unleashing the power of Scrum and Kanban together - Best of Both Worlds!!
Unleashing the power of Scrum and Kanban together - Best of Both Worlds!!
 
Performance tuning PHP on IBMi
Performance tuning PHP on IBMiPerformance tuning PHP on IBMi
Performance tuning PHP on IBMi
 
Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?
Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?  Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?
Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?
 
Serverless Go at BuzzBird
Serverless Go at BuzzBirdServerless Go at BuzzBird
Serverless Go at BuzzBird
 
MySQL Performance Best Practices
MySQL Performance Best PracticesMySQL Performance Best Practices
MySQL Performance Best Practices
 
An Active and Hybrid Storage System for Data-intensive Applications
An Active and Hybrid Storage System for Data-intensive ApplicationsAn Active and Hybrid Storage System for Data-intensive Applications
An Active and Hybrid Storage System for Data-intensive Applications
 
Making Sense of Big data with Hadoop
Making Sense of Big data with HadoopMaking Sense of Big data with Hadoop
Making Sense of Big data with Hadoop
 

More from Allen Wittenauer

2019-09-10: Testing Contributions at Scale
2019-09-10: Testing Contributions at Scale2019-09-10: Testing Contributions at Scale
2019-09-10: Testing Contributions at ScaleAllen Wittenauer
 
2018-08-23 Apache Yetus: Precommit
2018-08-23 Apache Yetus: Precommit2018-08-23 Apache Yetus: Precommit
2018-08-23 Apache Yetus: PrecommitAllen Wittenauer
 
Apache Yetus: Intro to Precommit for HBase Contributors
Apache Yetus: Intro to Precommit for HBase ContributorsApache Yetus: Intro to Precommit for HBase Contributors
Apache Yetus: Intro to Precommit for HBase ContributorsAllen Wittenauer
 
Apache Yetus: Helping Solve the Last Mile Problem
Apache Yetus: Helping Solve the Last Mile ProblemApache Yetus: Helping Solve the Last Mile Problem
Apache Yetus: Helping Solve the Last Mile ProblemAllen Wittenauer
 
Apache Hadoop Shell Rewrite
Apache Hadoop Shell RewriteApache Hadoop Shell Rewrite
Apache Hadoop Shell RewriteAllen Wittenauer
 
Let's Talk Operations! (Hadoop Summit 2014)
Let's Talk Operations! (Hadoop Summit 2014)Let's Talk Operations! (Hadoop Summit 2014)
Let's Talk Operations! (Hadoop Summit 2014)Allen Wittenauer
 
Apache Hadoop for System Administrators
Apache Hadoop for System AdministratorsApache Hadoop for System Administrators
Apache Hadoop for System AdministratorsAllen Wittenauer
 

More from Allen Wittenauer (8)

2019-09-10: Testing Contributions at Scale
2019-09-10: Testing Contributions at Scale2019-09-10: Testing Contributions at Scale
2019-09-10: Testing Contributions at Scale
 
2018-08-23 Apache Yetus: Precommit
2018-08-23 Apache Yetus: Precommit2018-08-23 Apache Yetus: Precommit
2018-08-23 Apache Yetus: Precommit
 
Apache Yetus: Intro to Precommit for HBase Contributors
Apache Yetus: Intro to Precommit for HBase ContributorsApache Yetus: Intro to Precommit for HBase Contributors
Apache Yetus: Intro to Precommit for HBase Contributors
 
Apache Yetus: Helping Solve the Last Mile Problem
Apache Yetus: Helping Solve the Last Mile ProblemApache Yetus: Helping Solve the Last Mile Problem
Apache Yetus: Helping Solve the Last Mile Problem
 
Apache Hadoop Shell Rewrite
Apache Hadoop Shell RewriteApache Hadoop Shell Rewrite
Apache Hadoop Shell Rewrite
 
Let's Talk Operations! (Hadoop Summit 2014)
Let's Talk Operations! (Hadoop Summit 2014)Let's Talk Operations! (Hadoop Summit 2014)
Let's Talk Operations! (Hadoop Summit 2014)
 
Apache Hadoop for System Administrators
Apache Hadoop for System AdministratorsApache Hadoop for System Administrators
Apache Hadoop for System Administrators
 
Hadoop 24/7
Hadoop 24/7Hadoop 24/7
Hadoop 24/7
 

Recently uploaded

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 

Recently uploaded (20)

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

Hadoop Performance at LinkedIn

  • 1. Grid Operations Hadoop Performance at LinkedIn Allen Wittenauer Grid Computing Architect ©2012 LinkedIn Corporation. All Rights Reserved.
  • 2. ©2012 LinkedIn Corporation. All Rights Reserved.
  • 3. “I have never seen a Hadoop cluster that was legitimately CPU bound.” -- Milind Bhandarkar -- Milind Bhandarkar -- Milind Bhandarkar ©2012 LinkedIn Corporation. All Rights Reserved.
  • 4. X5650 - 6 Core @ 2.67 MHz ©2012 LinkedIn Corporation. All Rights Reserved.
  • 5. X5650 - 6 Core @ 2.67 MHz ©2012 LinkedIn Corporation. All Rights Reserved.
  • 6. “I have only seen one Hadoop cluster that was legitimately CPU bound.” -- Milind Bhandarkar -- Milind Bhandarkar -- Milind Bhandarkar ©2012 LinkedIn Corporation. All Rights Reserved.
  • 7. Why do we have such high CPU usage? ©2012 LinkedIn Corporation. All Rights Reserved.
  • 8. We do a lot of Graph Theory. ©2012 LinkedIn Corporation. All Rights Reserved.
  • 9. Ticket to Ride  Ticket To Ride is a registered trademark of Days of Wonder ©2012 LinkedIn Corporation. All Rights Reserved. GRID OPERATIONS
  • 10. Social Graph ©2012 LinkedIn Corporation. All Rights Reserved. GRID OPERATIONS
  • 11. 2nd Degree Connection ©2012 LinkedIn Corporation. All Rights Reserved. GRID OPERATIONS
  • 12. We under-commit our memory. ©2012 LinkedIn Corporation. All Rights Reserved.
  • 13. Our Hadoop Software Needs... The Plan...  Tasks – 2 GB of RAM = 1 GB of JVM Heap, .5-1GB for non-heap – (Typically) 1 Super Active Threads  TaskTracker – 1.5 GB of RAM = 1 GB of JVM Heap, .5GB for non-heap – 1-4 Super Active Threads  DataNode – 1.5 GB of RAM = 1 GB of JVM Heap, .5GB for non-heap – 1-4 Super Active Threads  RAM: 3GB + (task count * 2GB) + OS needs  Threads: 8 + (task count) + OS needs ©2012 LinkedIn Corporation. All Rights Reserved. GRID OPERATIONS
  • 14. Our Hadoop Software Needs... The Reality  Task Counts – Westmere (5650): 6 Cores+HT = 12 Tasks – Sandy Bridge (2640): 6 Cores+HT = 14 Tasks  Most of our tasks leave at most .5 GB free – = combined -> very large buffer & cache ©2012 LinkedIn Corporation. All Rights Reserved. GRID OPERATIONS
  • 15. We don’t have as many disks per node. ©2012 LinkedIn Corporation. All Rights Reserved.
  • 16. Typical Hadoop Node Out in the Wild  Most user’s don’t know their actual needs – Vendor advice... play it safe!  Significantly more memory – “For the future!” – Badly written code  Significantly more disk – “Hadoop is IO intensive!” – “Greater task locality!”  Greater performance...but is it worth the cost... ©2012 LinkedIn Corporation. All Rights Reserved. GRID OPERATIONS
  • 17. What Happens With Fewer Disks?  Physical footprint requirements are smaller  Linux buffers & caches are more efficient – More per disk – Fewer to manage  Spindle count DOES matter... but the price/perf isn’t there for our workflows. – From a few years ago & based on store.sun.com prices (so not “real”)... Nodes/Cores RAM/Bus Disks Time In Minutes HW Cost* 3/24 16/half 8 254.98 $37827 3/24 24/full 8 244.50 $38817 3/24 16/half 4 257.38 $21456 3/24 24/full 4 246.82 $22986 6/48 16/half 4 126.98 $42912 ©2012 LinkedIn Corporation. All Rights Reserved. GRID OPERATIONS
  • 18. LinkedIn Node Configuration  No RAID controller – More cost for negative perf when doing JBOD  6 Drives – Still fits in 1U w/SATA drives – ~same perf as 8 drives  Less metal = cheaper cost ©2012 LinkedIn Corporation. All Rights Reserved. GRID OPERATIONS
  • 19. Rack Level View  If we assume we can use 40u in a rack then: – More CPUs – Just as many HDs – More Network – Potentially more RAM ©2012 LinkedIn Corporation. All Rights Reserved. GRID OPERATIONS
  • 20. We care about file system tuning. ©2012 LinkedIn Corporation. All Rights Reserved.
  • 21. LinkedIn Hadoop Disk/File Systems  noatime Enabled  writeback Enabled  Each Disk (except root) Partitions: – Swap – MapReduce Spill Space – HDFS  Delayed Commits – Why write once when you can do ganged writes more efficiently? ©2012 LinkedIn Corporation. All Rights Reserved. GRID OPERATIONS
  • 22. We care about job tuning. ©2012 LinkedIn Corporation. All Rights Reserved.
  • 23. LinkedIn Job Tuning Guidelines  All jobs get reviewed prior to going to production.  Task times should be between 5-15 minutes.  Jobs should have less than 10,000 tasks.  Jobs should be smart about # of files and the size of those files generated. ©2012 LinkedIn Corporation. All Rights Reserved. GRID OPERATIONS
  • 24. ... and the result? ©2012 LinkedIn Corporation. All Rights Reserved.
  • 25. Why is LinkedIn Running so Hot?  We do a lot of non-MapReduce work.  RAM buffers and caches allow us to offset a lot of disk IO.  We audit our jobs.  As a result, our CPUs are actually busy. ©2012 LinkedIn Corporation. All Rights Reserved. GRID OPERATIONS
  • 26. ©2012 LinkedIn Corporation. All Rights Reserved. BUSINESS OPERATIONS