Kannappan Sirchabesan

8 Followers

Problem solver. Python/Hadoop coder. I have done end-to-end work in Big Data spanning development, administration, and data science. I have set up Hadoop clusters, built ETL pipelines in MapReduce and Spark, and worked on data science problems, using technologies including Spark, Hive, Pig, HBase, and R. I work with Big Data every day and use Hadoop's MapReduce features to solve big data problems and extract useful information from them. I have done expert work in search quality, analyzing millions of queries searched by users every day. Here are some data science problems I have worked on so far: 1) Understand the relationships between users wh...

Presentations

Netezza Architecture and Administration

Netezza Deep Dives

Notes from Coursera Deep Learning courses by Andrew Ng

Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for large-scale data analytics

Developing Real-Time Data Pipelines with Apache Kafka

Scala - The Simple Parts, SFScala presentation

Pragmatic Real-World Scala (short version)

Scala Data Pipelines @ Spotify

Recommender Systems (Machine Learning Summer School 2014 @ CMU)

Hive tuning

Spark SQL Deep Dive @ Melbourne Spark Meetup

Spark Summit East 2015 Advanced Devops Student Slides

DTCC '14 Spark Runtime Internals

Tuning and Debugging in Apache Spark

Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San Jose 2015

Why Scala Is Taking Over the Big Data World

Storm at Twitter

Collaborative Filtering with Spark

DataFu @ ApacheCon 2014

Tiny Batches, in the wine: Shiny New Bits in Spark Streaming