IBM

64 Followers

There is nothing like the satisfaction of seeing a huge cluster of servers churning through massive amounts of data quickly and efficiently. As a Big Data performance engineer, my playground is massive amounts of memory, CPU, and data disks (including NVMe drives), fast network connectivity, and the open source Hadoop stack. I love Spark and Solr. I measure analytic workloads of all kinds: Tesla trip data, crime data, tweets, credit card transactions, and retail data that is larger than Amazon.com!

spark troubleshooting tpcds sql thrift server ibm iop tuning performance monitoring


Spark 2.x Troubleshooting Guide

8 years ago • 51089 Views