This Big Data case study outlines the Hadoop infrastructure deployment for a Fortune 100 media and telecommunications company.
Hadoop adoption in this company had grown organically across multiple teams, starting with “science projects” and lab initiatives that quickly expanded. Going forward, the options they considered for their Big Data deployment included expanding their on-premises infrastructure or moving to a Hadoop-as-a-Service cloud offering.
Fortunately, they realized that there is a third option: providing the benefits of Hadoop-as-a-Service with on-premises infrastructure. They selected the BlueData EPIC software platform to virtualize their Hadoop infrastructure and provide on-demand access to virtual Hadoop clusters in a secure, multi-tenant model.
Learn more about this case study in the blog post at: http://www.bluedata.com/blog/2015/05/big-data-case-study-hadoop-infrastructure
Big Data Case Study: Fortune 100 Telco
1. Big Data Case Study: Fortune 100 Media / Telco Company
2. Fortune 100 Media / Telco Company
Business Goal
• Big Data analytics to improve customer experience
• Provide daily insights to internal and external teams
• Sandbox environment to support ad-hoc analysis
• Isolated environments for external content providers
Key Challenges
• Limited IT resources and skill sets in Hadoop and Spark
• Administrative overhead managing existing Big Data environments
• Onboarding multiple internal and external user groups
Big Data Case Study Example
3. Big Data Infrastructure = Complex and Expensive
• Management complexity
• Duplication of data
• Cluster sprawl
• < 30% utilization
[Diagram labels: IT; Data Scientists and Developers; Other Internal Teams; External Content Provider (two)]
4. Going Forward: Two Options Considered
Expand on-premises Hadoop infrastructure
• Ongoing management of physical servers
• Multi-tenancy required for external providers
• Significant IT overhead
Move to AWS Elastic MapReduce
• Hadoop-as-a-Service offers simplicity and agility
• Internal security policies are a barrier
• Ongoing TCO of AWS cloud services
• Data is on-premises, difficult to copy or move
5. Option 1: Expand On-Premises Infrastructure
[Diagram, three layers: Big Data applications and users; Big Data infrastructure; existing data]
• Users: Marketing, Sales Support, Data Scientists, Developers, External Content Providers (two)
• Hue console (Hadoop jobs) and BI tool(s)
• Custom web app ($$) for security, access control, and onboarding
• Advanced administration: groups, queues, schedulers
• User administration (AD/LDAP)
• Hadoop cluster (~15 nodes), converted to production from pilot
• New physical nodes ($$) to increase performance and capacity
• New physical nodes ($) for BI/ETL tools
• Dev/test cluster on new physical nodes ($)
• Utilization < 20%
• Existing data (NFS, database, other): physical data copy/duplication
6. A third option: Hadoop-as-a-Service on-premises
• Infrastructure software platform (BlueData) for Hadoop and Spark
Self-service, on-demand virtual clusters
• Amazon EMR-like experience
• Agility and speed for data scientists
• IT infrastructure efficiency, higher utilization
Secure and multi-tenant architecture
• Eliminate complexities and pitfalls of multiple isolated physical clusters
• Stronger isolation and greater flexibility, no data duplication
Solution and Benefits
7. Option 3: Deploy BlueData EPIC Software Platform
[Diagram, three layers: Big Data applications and users; Big Data infrastructure; existing data]
• Users: Data Scientists and Developers, Other Internal Teams, External Content Providers (two)
• Web UI – multi-tenant, role-based access control
• User administration (AD/LDAP)
• Tenant 1 (Internal Team), Tenant 2 (Content Provider), Tenant 3 (Content Provider): each with a virtual Hadoop cluster, Hue console + BI tools
• EPIC Platform ($)
• Hadoop cluster (~15 nodes), converted to production from pilot
• New physical nodes ($), performance optimized (CPU & memory)
• In-place access to existing data (NFS, Gluster, other)
11. • Significantly lower costs (~70%) – less hardware required for dev/test cluster and BI / analytical tools
• Reduced administrative overhead – simpler user management and administration, eliminated data copying
• Speed and self-service – on-demand provisioning of virtual Hadoop and Spark clusters
• Higher utilization – consolidation ratio of 8:1 between virtual and physical servers
Big Data Case Study – Example Benefits
12. Big Data Infrastructure Made Easy
EPIC Platform
• ElasticPlane™: Self-service, multi-tenant clusters
• DataTap™: In-place access to enterprise data stores
• IOBoost™: Extreme performance and scalability
[Diagram labels: storage (Gluster, HDFS, Swift, NFS); utilization > 90%; simplified management; no duplication of data; no cluster sprawl; users: Data Scientists and Developers, Other Internal Teams, External Content Providers (two)]