Personal Information
Organization / Workplace
Hyderabad Area, India
Occupation
Big Data Engineer
Industry
Technology / Software / Internet
About
I am a passionate Big Data Engineer who loves building large-scale, real-time data processing pipelines.
Currently working as a Big Data Engineer, developing large-scale, real-time data processing pipelines in the Microsoft Azure cloud using Cloudera CDH tools, including Hadoop, Spark, Kudu, Kafka, Flume, HDFS, and YARN, along with Azure SQL Data Warehouse, Azure SQL Database, etc.
Prior to this, worked on a data warehouse migration from Oracle to the Hadoop ecosystem on Azure, using Spark for ETL, Oozie as the scheduler, and Hive as the target data warehouse with data stored in Parquet format; Sqoop was used to load historical data. Responsible for designing and implementing a scalable and robust platform.
Prior to that, worked as Data Wareho...
Likes (27)
11 Principles of Applied Analytics
Georgian
•
7 years ago
Upgrade Without the Headache: Best Practices for Upgrading Hadoop in Production
Cloudera, Inc.
•
9 years ago
Fast Data Analytics with Spark and Python
Benjamin Bengfort
•
9 years ago
Intro to Spark development
Spark Summit
•
8 years ago
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San Jose 2015
Databricks
•
9 years ago
Spark rdd part 2
Kiran Krishna
•
7 years ago
IBM Spark Meetup - RDD & Spark Basics
Satya Narayan
•
8 years ago
Custom Applications with Spark's RDD: Spark Summit East talk by Tejas Patil
Spark Summit
•
7 years ago
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Cloudera, Inc.
•
8 years ago
(BDT309) Data Science & Best Practices for Apache Spark on Amazon EMR
Amazon Web Services
•
8 years ago
Real time Analytics with Apache Kafka and Apache Spark
Rahul Jain
•
9 years ago
Trends for Big Data and Apache Spark in 2017 by Matei Zaharia
Spark Summit
•
7 years ago
Deep Dive: Memory Management in Apache Spark
Databricks
•
7 years ago
Install Apache Hadoop for Development/Production
IMC Institute
•
7 years ago
Apache Spark & Hadoop : Train-the-trainer
IMC Institute
•
7 years ago
Apache Spark Architecture
Alexey Grishchenko
•
8 years ago
Scala for dummies
Javier Santos Paniego
•
9 years ago
Introduction to spark
Javier Santos Paniego
•
8 years ago
Distributed computing with spark
Javier Santos Paniego
•
9 years ago
Spark after Dark by Chris Fregly of Databricks
Data Con LA
•
9 years ago
Apache Spark Introduction and Resilient Distributed Dataset basics and deep dive
Sachin Aggarwal
•
8 years ago
Not Your Father's Database by Vida Ha
Spark Summit
•
8 years ago
IndexedRDD: Efficient Fine-Grained Updates for RDDs (Ankur Dave, UC Berkeley)
Spark Summit
•
8 years ago
Real-Time Event & Stream Processing on MS Azure
Khalid Salama
•
7 years ago
Enterprise Cloud Data Platforms - with Microsoft Azure
Khalid Salama
•
7 years ago
Large scale ETL with Hadoop
OReillyStrata
•
11 years ago