Personal Information
Organization / Workplace
London, United Kingdom
Occupation
Data Science and Big Data
Industry
Technology / Software / Internet
About
Problem solver. Python/Hadoop coder. I have done end-to-end work in Big Data spanning development, administration and data science.
I have set up Hadoop clusters, built ETL pipelines by writing MapReduce/Spark code, and worked on data science problems. I have used a variety of technologies, including Spark, Hive, Pig, HBase and R.
I work with Big Data every day and use Hadoop's MapReduce features to solve big data problems and extract useful information. I have done expert work in search quality, analyzing the millions of queries users search every day.
Here are some of the data science problems I have worked on so far:
1) Understand the relationships between users wh...
Tags
newbie
pycon
python
programming
pycon2010
Presentations (2)
Documents (1)
Likes (24)
Netezza Architecture and Administration
Braja Krishna Das
•
7 years ago
Netezza Deep Dives
Rush Shah
•
7 years ago
Notes from Coursera Deep Learning courses by Andrew Ng
Tess Ferrandez
•
6 years ago
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for large-scale data analytics
Databricks
•
8 years ago
Developing Real-Time Data Pipelines with Apache Kafka
Joe Stein
•
8 years ago
Scala - The Simple Parts, SFScala presentation
Martin Odersky
•
9 years ago
Pragmatic Real-World Scala (short version)
Jonas Bonér
•
15 years ago
Scala Data Pipelines @ Spotify
Neville Li
•
8 years ago
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Xavier Amatriain
•
9 years ago
Hive tuning
Michael Zhang
•
10 years ago
Spark SQL Deep Dive @ Melbourne Spark Meetup
Databricks
•
8 years ago
Spark Summit East 2015 Advanced Devops Student Slides
Databricks
•
9 years ago
DTCC '14 Spark Runtime Internals
Cheng Lian
•
10 years ago
Tuning and Debugging in Apache Spark
Databricks
•
9 years ago
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San Jose 2015
Databricks
•
9 years ago
Why Scala Is Taking Over the Big Data World
Dean Wampler
•
9 years ago
storm at twitter
Krishna Gade
•
10 years ago
Collaborative Filtering with Spark
Chris Johnson
•
9 years ago
DataFu @ ApacheCon 2014
William Vaughan
•
10 years ago
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
Paco Nathan
•
9 years ago
Hadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Cloudera, Inc.
•
12 years ago
HBase schema design Big Data TechCon Boston
amansk
•
11 years ago
HBaseCon 2012 | HBase Schema Design - Ian Varley, Salesforce
Cloudera, Inc.
•
11 years ago
The 21 Coolest Internet Of Things Gadgets
Bernard Marr
•
9 years ago