Big Data is the buzzword today, and many current applications and products want to leverage it, but there are a lot of challenges involved. The presentation covers those challenges, the big data platform, and a high-level architecture to support it.
1. Big Data as PaaS in Enterprises
-Pankaj Khattar
2. Products need distributed programming for scalable solutions
in order to:
Improve time efficiency
Provide fault tolerance
Be offered as SaaS
Decouple time- and resource-consuming tasks from the main execution
They require a unified deployment platform that provides all the Big Data
capabilities with the latest stable ecosystem, but without maintenance and
security effort for the product teams
Scenario
5. • Create a uniform big data platform for all Products/Applications
• A separate team manages the new platform
• Products/Applications provide the platform with just a job package and
data
• The job package contains the scripts, code, commands, etc.
• The platform stores the data and executes the commands
• The platform creates the final data sets
• The data sets are returned to the Products/Applications
• Products/Applications don't have to manage the platform and can
concentrate on the computing code only
• The platform is used as a service
PaaS – Big Data Platform
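The flow on this slide (a product hands over a job package plus data, the platform stores the data, runs the commands, and returns the final data set) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `JobPackage`, `BigDataPlatform`, and `submit` are hypothetical and do not come from the presentation, and in-memory Python lists stand in for a real cluster and its storage.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Rows = List[dict]


@dataclass
class JobPackage:
    """Hypothetical job package: what a product hands to the platform."""
    product: str                            # owning product/application
    commands: List[Callable[[Rows], Rows]]  # steps executed in order


class BigDataPlatform:
    """Hypothetical platform facade; a dict stands in for cluster storage."""

    def __init__(self) -> None:
        self._storage: Dict[str, Rows] = {}

    def submit(self, package: JobPackage, data: Rows) -> Rows:
        # The platform stores the data on behalf of the product...
        self._storage[package.product] = list(data)
        result = self._storage[package.product]
        # ...executes the commands from the job package in order...
        for command in package.commands:
            result = command(result)
        # ...and returns the final data set to the product.
        return result


# The product team writes only the computing code:
def drop_incomplete(rows: Rows) -> Rows:
    return [r for r in rows if r.get("city")]


def count_by_city(rows: Rows) -> Rows:
    counts: Dict[str, int] = {}
    for r in rows:
        counts[r["city"]] = counts.get(r["city"], 0) + 1
    return [{"city": c, "count": n} for c, n in sorted(counts.items())]


platform = BigDataPlatform()
package = JobPackage(product="crm", commands=[drop_incomplete, count_by_city])
final = platform.submit(
    package,
    [{"city": "Pune"}, {"city": None}, {"city": "Delhi"}, {"city": "Pune"}],
)
print(final)
```

The product code never touches storage or scheduling; it only supplies the commands and the input data, which is the separation of concerns the slide argues for.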
6. Ecosystem (Tools)
PaaS – Big Data Platform
Proposed View - Create the Cloud
Cloud Data Warehouse
Hadoop Cluster
Geocoding CRM ETL DI/DQ
Platform as a Service
Job Execution
Resource Configuration & Management
Multi-tenancy Security
7. The platform can have multiple clusters
Platinum
Production usage
For SaaS-based applications
Large number of machines with similar configurations
Tools/ecosystem supported based on requirements
Gold
Development and testing environment
Medium number of machines with similar configurations
Tools/ecosystem supported based on requirements
Silver
Small-scale/POC usage
Need-based usage
Low-end cluster with a limited number of machines
All tools/ecosystem supported
PaaS - Clusters
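One way to picture the three cluster tiers is as a lookup from the kind of usage to the cluster that serves it. The sketch below is an assumption-laden illustration: the `Tier` enum, the usage keys, and `select_cluster` are hypothetical names, and the presentation does not say how jobs are actually routed to clusters.

```python
from enum import Enum


class Tier(Enum):
    PLATINUM = "platinum"  # production usage, SaaS-based applications
    GOLD = "gold"          # development and testing environment
    SILVER = "silver"      # small-scale/POC, need-based usage


# Hypothetical mapping from usage type to the tier described on the slide.
TIER_FOR_USAGE = {
    "production": Tier.PLATINUM,
    "development": Tier.GOLD,
    "testing": Tier.GOLD,
    "poc": Tier.SILVER,
}


def select_cluster(usage: str) -> Tier:
    """Return the cluster tier for a usage type; reject unknown usages."""
    try:
        return TIER_FOR_USAGE[usage.lower()]
    except KeyError:
        raise ValueError(f"unknown usage type: {usage!r}") from None


print(select_cluster("Production").value)
print(select_cluster("poc").value)
```

Keeping the mapping in one table means a new usage type or tier can be added without changing the selection logic.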