Driving Business Value with Hadoop:
MapR Customer Experiences
Carl Olofson
Research Vice President
Agenda
• The Promise and Challenge of Hadoop
• Choosing a Distributor
• MapR’s Key Differentiators
• IDC’s MapR Business Value Study
• Key Takeaways
• Conclusions/Recommendations
© IDC Visit us at IDC.com and follow us on Twitter: @IDC
The Promise and Challenge of Hadoop
Why Hadoop?
• Can collect any amount of data of any kind
• Open source and cheap to deploy
• Serves a variety of purposes
Challenges
• Involves many Apache projects
• Coordination of software is complex
• Management of clusters requires special expertise
The Problem of Hadoop Sprawl
• Hadoop is usually implemented, initially, as discrete, limited projects.
• As these projects proliferate, they consume more and more resources.
• Eventually, they require centralized management.
• As long as they remain discrete configurations, management is complex and resource consumption is excessive.
• Needed: a manageable Hadoop data platform that supports many projects in a single system with shared resources, embracing multi-tenancy.
Choosing a Distributor
• Choose a distributor whose software and service delivery best meet your needs.
• A distributor offers:
– Coordinated versions of related Apache software, delivered for immediate use
– Software packaged in practical combinations for various business purposes
– Expertise, guidance, and support as part of the value proposition
• There are several leading distributors; competition breeds excellence.
MapR’s Key Differentiators
• Software Components
– MapR-FS
– MapR-DB
– MapR Streams
• MapR Converged Data Platform
• Zeta Architecture
IDC’s MapR Business Value Study
IDC conducted research that explores the value and benefits of Apache Hadoop with MapR, based on interviews with MapR customers.
The project included nine qualitative and quantitative interviews with MapR customers.
Based on its analysis, IDC has created a model that expresses the value and costs for these organizations of using Hadoop with MapR.
These results will inform the study IDC is developing for MapR.
Firmographics of Interviewed Organizations

Firmographics                        Average   Median   Range
Number of employees                  10,410    500      8 to 65,000
Number of MapR users                 762       150      0 to 3,500
Number of MapR applications          11        4        1 to 50
Number of PBs in MapR environment    3.90      1.10     0.02 to 20

Countries: United States
Industries: Financial Services, Security, Professional Services, Cloud Services Provider, Advertising (Big Data)
N=9 interviewed organizations
Apache Hadoop for MapR
Executive Summary
Interviewed organizations reported that Hadoop with MapR provides the performance, efficiency, and scalability they need to use data analytics in a cost- and resource-effective way to drive their businesses and operations.
Hadoop with MapR has enabled data scientists and application developers to do their jobs more effectively and productively by providing more timely and superior analytical outputs. On average, interviewed organizations reported that data scientists and analysts improved their productivity by 31% and application developers by 39% with MapR.
Interviewed organizations reported leveraging Hadoop with MapR to drive their businesses. All interviewed organizations indicated earning more revenue, and six attributed a specific amount of revenue, an average of $26.7 million per year, to better serving their customers with improved analytical capabilities or to improving their analytics-based products and services.
MapR provides its customers with a cost-effective and efficient Big Data platform. On average, interviewed organizations put the infrastructure and IT staff time cost of deploying and running MapR at 42% lower than if they had tried to do it by themselves.
IDC calculates that business benefits achieved by interviewed organizations are worth a discounted average of $19.4 million over three years, which results in a return on investment (ROI) of 382% and a payback period of eight months.
Quotes
Why MapR?
“We tested it and MapR was faster by 15% to 20%. The other reason that we went with MapR was security – MapR allows us to do really granular security, and that’s just not possible with some of the other offerings.”
“We needed a specific functionality - the ability to process TBs of data efficiently and cost effectively. The MAPR version of Hadoop is faster and more reliable and has a great support infrastructure around it. . . The most significant benefits for us of MapR have been 4 nines in uptime and the ability to provide insight to our customers through the ability to get meaningful data from the data that we’re collecting.”
“We get dramatically better utilization out of our systems with MapR. Our testing showed a three to five times performance increase when we ran on MapR compared to other Hadoop, and compared to before it’s insane, it’s probably like probably 10 to 20 times faster.”
Quotes
“We support over 150 data analysts, about 30 of those are what most people call data scientists.
The data scientists have absolutely saved time with MapR - productivity wise for them because
they have faster access to the data now and they can direct probably 50% more time to
other activities, plus they’re avoiding hiring. On top of the 50% saved, we are probably also
avoiding 10 hires."
“MapR has given our data scientists the ability to do more stuff. They can do
their work much faster – for example, if they had to develop a risk model, and
compare the eight hours it used to take compared with now running in an hour,
it makes them more productive – about 20% on average."
“Internal business processes have become more efficient with MapR. We have reporting that’s
available every 15 minutes to 30 minutes depending on what report you’re looking at. And before there
was not an efficient reconciliation process. The reconciliation process, if it occurred, would take 1 to 2
months to figure out. And the reconciliation process now takes a day. . . I would say there are
five people over saving 24 days collectively out of a month.”
Data-Related Staff Benefits of MapR
Quotes
“MapR absolutely has impacted our revenue. Our product is a superior product
because of MapR Hadoop, and that means more customers and more revenue. . .
We’ve added hundreds of customers and millions of dollars of revenue. I attribute
that to having Hadoop with MapR being the core of our platform."
“The quality of our outputs has increased dramatically with MapR – for example,
we're able to process everything and don't have to do sampling. So we can provide our
customers proper and accurate data. We can also do more test iterations to provide a
better quality product, and that's helped us get more customers. . . MapR has supported
business growth, dramatic growth of a product campaign we're running.”
“In our case, the choice of MapR was pretty simple. The alternative was that we just wouldn't do
it with anything else, because we couldn't. It actually wouldn't have worked – we couldn't scale up
enough to use a different solution. We wouldn't be in business.”
Business Benefits of MapR
Quotes
“We haven't had any downtime for a couple of years with MapR. Before, we were experiencing
downtime about quarterly, and these incidents had an impact on revenue because it affects client
retention, especially in the security space. When a client finds out that someone hacked into your
network, it makes for a very unhappy client. We lost clients probably 20-30% of the time
when we had downtime.”
“If we had built this out ourselves using relational databases, it would have
been more expensive than MapR because of phenomenal storage costs.
And then we'd have to pay for licenses – overall initial costs would have been
a number of times more expensive than MapR.”
“Multi-tenancy with MapR is helping us save money, and we can tell a
customer now about our multi-tenant capabilities and they will then not request
a dedicated environment. And that saves us money. We don’t have to get new
MapR servers, new MapR licenses just for their dedicated environment.
We are able to leverage the existing cluster.”
Reliability and Cost Benefits of MapR
Annual Average Benefits per Organization (million $)
• Business Productivity Benefits: $2.98M
• Increased Revenue: $3.26M
• IT Staff Productivity Benefits: $2.02M
• Risk Mitigation - User Productivity Benefits: $51,000
Total: $8.31 Million
Annual Average Benefits per 100 TBs
• Business Productivity Benefits: $74,500
• Increased Revenue: $81,500
• IT Staff Productivity Benefits: $50,600
• Risk Mitigation - User Productivity Benefits: $1,300
Total: $207,900
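The category values on the two benefit charts can be checked against the stated totals. The following is a minimal sketch in Python; the variable names are illustrative, and the pairing of each value with a category assumes the order in which labels and values appear on the slides.

```python
# Annual average benefits per 100 TB, in dollars, as labeled on the chart.
# Assumption: values pair with categories in the order shown on the slide.
benefits_per_100_tb = {
    "Business Productivity Benefits": 74_500,
    "Increased Revenue": 81_500,
    "IT Staff Productivity Benefits": 50_600,
    "Risk Mitigation - User Productivity Benefits": 1_300,
}

# Annual average benefits per organization, in thousands of dollars.
benefits_per_org_thousands = {
    "Business Productivity Benefits": 2_980,
    "Increased Revenue": 3_260,
    "IT Staff Productivity Benefits": 2_020,
    "Risk Mitigation - User Productivity Benefits": 51,
}

total_per_100_tb = sum(benefits_per_100_tb.values())
total_per_org_millions = sum(benefits_per_org_thousands.values()) / 1000

print(f"Total per 100 TB: ${total_per_100_tb:,}")
print(f"Total per organization: ${total_per_org_millions:.2f} million")
```

Both sums agree with the totals printed on the charts ($207,900 and $8.31 million).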
Cost Benefit Analysis
[Chart: Investment, Benefits, and Cumulative Net Benefit in million $ for Initial, Year 1, Year 2, and Year 3. Data labels as extracted: $2.91M; $0.26M, $0.59M, $0.59M; $0; $4.28M, $9.54M, $9.69M. Callout: $20.6 Million.]
Three-Year ROI Analysis     Average per Organization   Average per 100 TBs
Benefit (discounted)        $19.4 Million              $486,100
Investment (discounted)     $4.0 Million               $100,800
Net Present Value           $15.4 Million              $385,300
ROI (NPV/Investment)        382%                       382%
Payback (Months)            8.2                        8.2
Discount Factor             12%                        12%
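The ROI row follows from the benefit and investment rows using the definition given in the table (NPV divided by investment). A minimal sketch in Python, with an illustrative function name that is not part of the IDC study:

```python
def roi_percent(benefit: float, investment: float) -> float:
    """ROI as defined in the table: net present value divided by investment."""
    npv = benefit - investment
    return npv / investment * 100

# Discounted three-year figures per 100 TB, in dollars, from the table.
per_100_tb = roi_percent(486_100, 100_800)

# The per-organization figures are rounded to $0.1 million in the table,
# so the recomputed ROI lands a few points away from the stated 382%.
per_org = roi_percent(19.4e6, 4.0e6)

print(f"ROI per 100 TB: {per_100_tb:.0f}%")
print(f"ROI per organization (rounded inputs): {per_org:.0f}%")
```

The per-100-TB inputs reproduce the stated 382%; the rounded per-organization inputs give roughly 385%, which is consistent with rounding in the published dollar amounts.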
Key Takeaways
• Users found MapR’s distro to be much easier to work with than their prior configuration of Apache Hadoop, resulting in dramatic savings in staff time spent setting up, configuring, and managing the Hadoop clusters.
• MapR’s software for optimizing file I/O, database management, and cluster management delivered significant performance improvements.
• The rational way that the MapR Hadoop system is configured, as a converged platform, led to more efficient business processes and better business outcomes.
Conclusions/Recommendations
Conclusions
• There are multiple major Hadoop distributions, each with its own merits.
• This study examined only one of these: MapR.
• The clear outcome of this study is demonstrable benefits to customers derived from the use of MapR.
Recommendations
• Business application of Hadoop requires the use of a managed distribution of Hadoop.
• Users have several leading options in this regard.
• MapR deserves to be on the list of those receiving serious consideration.
© 2016 MapR Technologies
Driving Better Business Benefits with Hadoop
What Contributed to the ROI with MapR
Higher productivity derived from:
• Higher efficiency
– Greater scalability
– Higher performance
– Better utilization
• Greater reliability
• Multi-tenancy
• Real-time data access
Life Without a Converged Platform
[Diagram labels: Sources, Apps, Batch Loads, Streaming, Real-time]
• Open Source Analytics (Hadoop, Spark)
• Operational Cluster (HBase, Other NoSQL)
• Streaming Cluster (TIBCO, IBM, Kafka)
• Enterprise Storage (system of record)
A Modern Big Data Architecture
Sources
• Relational, SaaS, mainframe
• Documents, emails
• Log files, clickstreams
• Sensors
• Blogs, tweets, link data
• Data warehouses, data marts
MapR Converged Data Platform
• Enterprise Storage (MapR-FS): emails, blogs, tweets, log files, unstructured data
• Database (MapR-DB): relational, time series, structured data
• Event Streaming (MapR Streams): event data, change data, IoT data
Platform capabilities
• Agile, self-service data exploration
• ETL into operational reporting formats (e.g., Parquet)
• Multi-tenancy: job/data placement control, volumes
• Access controls: file, table, column, column family, doc, sub-doc levels
• Auditing: compliance, analyze user accesses
• Snapshots: track data lineage and history
• Table Replication: global multi-master, business continuity
Higher Efficiency
MapR No-NameNode Architecture for Scale
[Diagram: DataNodes coordinated by a single NameNode holding metadata A to F, compared with nodes on which metadata A to F is distributed and replicated]
No special configuration
Metadata is persisted to disk
Trillions of files (> 5000x advantage)
HDFS-Based Scale Issues
Limited to 50–200 million files in a single NameNode (in 128 MB chunks)
Federation allows more files, but adds new single points of failure
Federation plus Standby NameNodes lead to complex configuration
5 to 20 NameNodes required for 1 billion files
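The "5 to 20 NameNodes" figure follows directly from the per-NameNode file limit quoted on the slide. A minimal sketch of that arithmetic in Python; the function name is illustrative, and real sizing would also depend on heap size and block counts, which are not modeled here.

```python
def namenodes_needed(total_files: int, files_per_namenode: int) -> int:
    """Minimum NameNode count if each one tracks at most files_per_namenode files."""
    # Ceiling division using integers only.
    return (total_files + files_per_namenode - 1) // files_per_namenode

ONE_BILLION = 1_000_000_000

# Upper end of the quoted range: 200 million files per NameNode.
fewest = namenodes_needed(ONE_BILLION, 200_000_000)
# Lower end of the quoted range: 50 million files per NameNode.
most = namenodes_needed(ONE_BILLION, 50_000_000)

print(f"{fewest} to {most} NameNodes for 1 billion files")
```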
The Architectural Key to MapR Scale and Speed
Blocks are inside chunks; chunks are inside containers.
• Containers - 32 GB
– High scale to trillions of files
– Default size self adjusts
• Chunks - 256 MB
– File sharding for parallelism
– Default size can adjust by directory
• Blocks - 8 KB
– Raw device I/O for random reads/writes
– Small size advantage for snapshot and mirroring deltas
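To give a feel for the three size tiers, the ratios between them can be worked out from the default sizes quoted above. This is a rough sketch in Python that assumes binary units (1 KB = 1024 bytes) and treats the tiers as simple size ratios; it does not describe how MapR-FS actually allocates storage, and the helper name is hypothetical.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

BLOCK_SIZE = 8 * KB        # unit of raw device I/O
CHUNK_SIZE = 256 * MB      # unit of file sharding
CONTAINER_SIZE = 32 * GB   # unit that holds chunks

blocks_per_chunk = CHUNK_SIZE // BLOCK_SIZE
chunk_sized_units_per_container = CONTAINER_SIZE // CHUNK_SIZE

def chunks_for_file(file_size: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Number of chunks a file of file_size bytes is sharded into (ceiling division)."""
    return (file_size + chunk_size - 1) // chunk_size

print(blocks_per_chunk)                 # 8 KB blocks in one 256 MB chunk
print(chunk_sized_units_per_container)  # 256 MB units in 32 GB
print(chunks_for_file(1 * GB))          # shards available for parallel reads of a 1 GB file
```

A 1 GB file splits into four default-size chunks, which is what allows it to be read in parallel, while a change inside one chunk only touches the affected 8 KB blocks when a snapshot or mirror delta is computed.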
Disk I/O Throughput Performance (by Samsung)
I/O throughput using upcoming performance enhancements on high performance flash
Samsung Flash Memory Summit 2015 Keynote: https://www.youtube.com/watch?v=fOT63zR7PvU#t=26m45s
Optimized Resource Consumption
Conventional stack: every layer contends for more CPU and memory
• HBase (excessive writes)
• Java Virtual Machine
• HDFS (append-only)
• Java Virtual Machine
• Kafka (separate cluster)
• Java Virtual Machine
• Linux File System (general purpose, slower than MapR-FS, leaves HA up to other engines)
• Storage Hardware
MapR stack: efficient architecture frees up resources; shared HA, DR, and I/O systems
• MapR-FS + MapR-DB + MapR Streams
• Storage Hardware
• Fast, efficient, direct I/O
Changes called out in the diagram
• Replace with speed, connectivity, HA/DR
• Replace with less I/O and RAM consumption
• Replace with full read-write
• Eliminate layer (two layers)
Data/Job Placement Control for Resource Management
Example topology for (optionally) dedicating specific nodes to specific workloads in a single MapR cluster:
• Operational workload on largest servers with SSDs
• Production analytics on standard servers
• Self-service data exploration on standard servers
• Archived data on lower power, high disk density nodes
Containerized Enterprise Using Docker, Mesos & Myriad
• Unified, shared platform for enterprise apps & data processing
• Unified application ecosystem
• Shared, persistent, high performance storage for all apps
• Multi-tenant, with choice of:
– YARN + Services per tenant
– Single, shared YARN for all tenants
[Diagram labels: Mesos; Myriad; YARN running Spark, Hive, MapRed; Tenant #1, Tenant #2]
Greater Reliability
High Availability (HA) Everywhere
• No NameNode architecture: easy HA at massive scale
• MapReduce/YARN HA: jobs are not impacted by failures
• NFS HA: fast, resilient NFS access
• Instant recovery: replicas available within seconds of a node failure
• Rolling upgrades: upgrade the software with no downtime
• HA is built in: no special configuration to enable HA
MapR Mirroring for Disaster Recovery
• Flexible
– Choose the volumes/directories to mirror
– Scheduled/incremental to set low RPO
– Promotable mirrors to set low RTO
• Fast
– No performance impact
– Block-level (8 KB) deltas
• Safe
– Point-in-time consistency, checksums
• Easy
– Takes less than two minutes to configure!
[Diagram labels: Production (Datacenter 1); WAN; Production, Research (Datacenter 2); WAN; EC2]
MapR-DB Table Replication for Disaster Recovery
Multi-master (aka, active/active) replication
Active Read/Write
End Users
• Reduced risk of data loss
• Application failover
• Faster data access
Snapshots for Online Consistent Backups
• Point-in-time recovery
• Consistent
• Efficient
• Fast
Multi-Tenancy
Multi-Tenancy
Efficient single cluster with:
• Isolation
• Quotas
• Security and delegation
• Reporting
Real-Time Data Access
MapR Platform Services: Open API Architecture Assures Interoperability, Avoids Lock-in
• HDFS API
• POSIX NFS
• SQL, HBase API
• JSON API
• Kafka API
MapR NFS: Unique Advantage: Direct Integration with Enterprise
• Real-time applications
• NFS for file-based applications
• Hadoop APIs for Hadoop applications
• ODBC & JDBC for SQL-based applications
• Mission critical and SLA dependent applications
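For a file-based application, NFS access means the cluster can be treated like any mounted directory, so ordinary file calls are enough. The sketch below, in Python, illustrates that idea only; the helper name and the mount path mentioned in the comments are assumptions, and a temporary directory stands in for the mount so the example runs without a cluster.

```python
import tempfile
from pathlib import Path

def append_event(mount_root: Path, relative_path: str, line: str) -> Path:
    """Append one line to a file under mount_root using plain POSIX-style file I/O."""
    target = mount_root / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("a", encoding="utf-8") as handle:
        handle.write(line + "\n")
    return target

# In a real deployment mount_root would be the NFS mount point of the cluster,
# for example a hypothetical Path("/mapr/my.cluster.com"). A temporary
# directory is used here as a stand-in.
with tempfile.TemporaryDirectory() as stand_in:
    written = append_event(Path(stand_in), "logs/app.log", "order=42 status=ok")
    contents = written.read_text(encoding="utf-8")
    print(contents, end="")
```

Nothing in the function is specific to Hadoop, which is the point of the slide: existing file-based tools need no Hadoop client library to read or write data on the platform.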
Thank You!
@mapr
dalekim@mapr.com
Engage with us!
maprtech
mapr-technologies
https://www.mapr.com/get-started-with-mapr
https://www.mapr.com/training
https://www.mapr.com/ebooks/big-data-all-stars/
Q & A

More Related Content

What's hot

Big Data Scotland 2017
Big Data Scotland 2017Big Data Scotland 2017
Big Data Scotland 2017Ray Bugg
 
Hadoop for Finance - sample chapter
Hadoop for Finance - sample chapterHadoop for Finance - sample chapter
Hadoop for Finance - sample chapterRajiv Tiwari
 
Optimize your cloud strategy for machine learning and analytics
Optimize your cloud strategy for machine learning and analyticsOptimize your cloud strategy for machine learning and analytics
Optimize your cloud strategy for machine learning and analyticsCloudera, Inc.
 
Unlocking data science in the enterprise - with Oracle and Cloudera
Unlocking data science in the enterprise - with Oracle and ClouderaUnlocking data science in the enterprise - with Oracle and Cloudera
Unlocking data science in the enterprise - with Oracle and ClouderaCloudera, Inc.
 
The Power of your Data Achieved - Next Gen Modernization
The Power of your Data Achieved - Next Gen ModernizationThe Power of your Data Achieved - Next Gen Modernization
The Power of your Data Achieved - Next Gen ModernizationHortonworks
 
Telco Big Data Workshop Sample
Telco Big Data Workshop SampleTelco Big Data Workshop Sample
Telco Big Data Workshop SampleAlan Quayle
 
Actian forrester- hortonworks
Actian   forrester- hortonworksActian   forrester- hortonworks
Actian forrester- hortonworksHortonworks
 
Extending BI with Big Data Analytics
Extending BI with Big Data AnalyticsExtending BI with Big Data Analytics
Extending BI with Big Data AnalyticsDatameer
 
Hadoop 2.0: YARN to Further Optimize Data Processing
Hadoop 2.0: YARN to Further Optimize Data ProcessingHadoop 2.0: YARN to Further Optimize Data Processing
Hadoop 2.0: YARN to Further Optimize Data ProcessingHortonworks
 
Hadoop 2.0 - Solving the Data Quality Challenge
Hadoop 2.0 - Solving the Data Quality ChallengeHadoop 2.0 - Solving the Data Quality Challenge
Hadoop 2.0 - Solving the Data Quality ChallengeInside Analysis
 
AWS Initiate Day Dublin 2019 – Big Data Meets AI
AWS Initiate Day Dublin 2019 – Big Data Meets AIAWS Initiate Day Dublin 2019 – Big Data Meets AI
AWS Initiate Day Dublin 2019 – Big Data Meets AIAmazon Web Services
 
What is big data - Architectures and Practical Use Cases
What is big data - Architectures and Practical Use CasesWhat is big data - Architectures and Practical Use Cases
What is big data - Architectures and Practical Use CasesTony Pearson
 
Record manager 8.0 presentation
Record manager 8.0  presentationRecord manager 8.0  presentation
Record manager 8.0 presentationAndrey Karpov
 
Architecting for Big Data - Gartner Innovation Peer Forum Sept 2011
Architecting for Big Data - Gartner Innovation Peer Forum Sept 2011Architecting for Big Data - Gartner Innovation Peer Forum Sept 2011
Architecting for Big Data - Gartner Innovation Peer Forum Sept 2011Jonathan Seidman
 
6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoopDr. Wilfred Lin (Ph.D.)
 
Frank Chen at AI Frontiers: Startups and AI
Frank Chen at AI Frontiers: Startups and AIFrank Chen at AI Frontiers: Startups and AI
Frank Chen at AI Frontiers: Startups and AIAI Frontiers
 
Big Data/Hadoop Option Analysis
Big Data/Hadoop Option AnalysisBig Data/Hadoop Option Analysis
Big Data/Hadoop Option Analysiszafarali1981
 
Informatica Becomes Part of the Business Data Lake Ecosystem
Informatica Becomes Part of the Business Data Lake EcosystemInformatica Becomes Part of the Business Data Lake Ecosystem
Informatica Becomes Part of the Business Data Lake EcosystemCapgemini
 
Bitkom Cray presentation - on HPC affecting big data analytics in FS
Bitkom Cray presentation - on HPC affecting big data analytics in FSBitkom Cray presentation - on HPC affecting big data analytics in FS
Bitkom Cray presentation - on HPC affecting big data analytics in FSPhilip Filleul
 

What's hot (20)

Big Data Scotland 2017
Big Data Scotland 2017Big Data Scotland 2017
Big Data Scotland 2017
 
Hadoop for Finance - sample chapter
Hadoop for Finance - sample chapterHadoop for Finance - sample chapter
Hadoop for Finance - sample chapter
 
Optimize your cloud strategy for machine learning and analytics
Optimize your cloud strategy for machine learning and analyticsOptimize your cloud strategy for machine learning and analytics
Optimize your cloud strategy for machine learning and analytics
 
Unlocking data science in the enterprise - with Oracle and Cloudera
Unlocking data science in the enterprise - with Oracle and ClouderaUnlocking data science in the enterprise - with Oracle and Cloudera
Unlocking data science in the enterprise - with Oracle and Cloudera
 
The Power of your Data Achieved - Next Gen Modernization
The Power of your Data Achieved - Next Gen ModernizationThe Power of your Data Achieved - Next Gen Modernization
The Power of your Data Achieved - Next Gen Modernization
 
Telco Big Data Workshop Sample
Telco Big Data Workshop SampleTelco Big Data Workshop Sample
Telco Big Data Workshop Sample
 
Haven 2 0
Haven 2 0 Haven 2 0
Haven 2 0
 
Actian forrester- hortonworks
Actian   forrester- hortonworksActian   forrester- hortonworks
Actian forrester- hortonworks
 
Extending BI with Big Data Analytics
Extending BI with Big Data AnalyticsExtending BI with Big Data Analytics
Extending BI with Big Data Analytics
 
Hadoop 2.0: YARN to Further Optimize Data Processing
Hadoop 2.0: YARN to Further Optimize Data ProcessingHadoop 2.0: YARN to Further Optimize Data Processing
Hadoop 2.0: YARN to Further Optimize Data Processing
 
Hadoop 2.0 - Solving the Data Quality Challenge
Hadoop 2.0 - Solving the Data Quality ChallengeHadoop 2.0 - Solving the Data Quality Challenge
Hadoop 2.0 - Solving the Data Quality Challenge
 
AWS Initiate Day Dublin 2019 – Big Data Meets AI
AWS Initiate Day Dublin 2019 – Big Data Meets AIAWS Initiate Day Dublin 2019 – Big Data Meets AI
AWS Initiate Day Dublin 2019 – Big Data Meets AI
 
What is big data - Architectures and Practical Use Cases
What is big data - Architectures and Practical Use CasesWhat is big data - Architectures and Practical Use Cases
What is big data - Architectures and Practical Use Cases
 
Record manager 8.0 presentation
Record manager 8.0  presentationRecord manager 8.0  presentation
Record manager 8.0 presentation
 
Architecting for Big Data - Gartner Innovation Peer Forum Sept 2011
Architecting for Big Data - Gartner Innovation Peer Forum Sept 2011Architecting for Big Data - Gartner Innovation Peer Forum Sept 2011
Architecting for Big Data - Gartner Innovation Peer Forum Sept 2011
 
6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop
 
Frank Chen at AI Frontiers: Startups and AI
Frank Chen at AI Frontiers: Startups and AIFrank Chen at AI Frontiers: Startups and AI
Frank Chen at AI Frontiers: Startups and AI
 
Big Data/Hadoop Option Analysis
Big Data/Hadoop Option AnalysisBig Data/Hadoop Option Analysis
Big Data/Hadoop Option Analysis
 
Informatica Becomes Part of the Business Data Lake Ecosystem
Informatica Becomes Part of the Business Data Lake EcosystemInformatica Becomes Part of the Business Data Lake Ecosystem
Informatica Becomes Part of the Business Data Lake Ecosystem
 
Bitkom Cray presentation - on HPC affecting big data analytics in FS
Bitkom Cray presentation - on HPC affecting big data analytics in FSBitkom Cray presentation - on HPC affecting big data analytics in FS
Bitkom Cray presentation - on HPC affecting big data analytics in FS
 

Similar to Driving Business Benefits with Hadoop

Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014MapR Technologies
 
BIG Data & Hadoop Applications in Finance
BIG Data & Hadoop Applications in FinanceBIG Data & Hadoop Applications in Finance
BIG Data & Hadoop Applications in FinanceSkillspeed
 
Powering the "As it Happens" Business
Powering the "As it Happens" BusinessPowering the "As it Happens" Business
Powering the "As it Happens" BusinessMapR Technologies
 
How to Succeed in the Cloud (Financially)
How to Succeed in the Cloud (Financially)How to Succeed in the Cloud (Financially)
How to Succeed in the Cloud (Financially)Rand Group
 
Steve Jenkins - Business Opportunities for Big Data in the Enterprise
Steve Jenkins - Business Opportunities for Big Data in the Enterprise Steve Jenkins - Business Opportunities for Big Data in the Enterprise
Steve Jenkins - Business Opportunities for Big Data in the Enterprise WeAreEsynergy
 
R and Big Data using Revolution R Enterprise with Hadoop
R and Big Data using Revolution R Enterprise with HadoopR and Big Data using Revolution R Enterprise with Hadoop
R and Big Data using Revolution R Enterprise with HadoopRevolution Analytics
 
Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Present...
Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Present...Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Present...
Big Data Hadoop Briefing Hosted by Cisco, WWT and MapR: MapR Overview Present...ervogler
 
Game Changed – How Hadoop is Reinventing Enterprise Thinking
Game Changed – How Hadoop is Reinventing Enterprise ThinkingGame Changed – How Hadoop is Reinventing Enterprise Thinking
Game Changed – How Hadoop is Reinventing Enterprise ThinkingInside Analysis
 
BIG Data & Hadoop Applications in Logistics
BIG Data & Hadoop Applications in LogisticsBIG Data & Hadoop Applications in Logistics
BIG Data & Hadoop Applications in LogisticsSkillspeed
 
Hadoop Demo eConvergence
Hadoop Demo eConvergenceHadoop Demo eConvergence
Hadoop Demo eConvergencekvnnrao
 
mapr_case_study_experian
mapr_case_study_experianmapr_case_study_experian
mapr_case_study_experianErni Susanti
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...Hortonworks
 
Making Hadoop Ready for the Enterprise
Making Hadoop Ready for the Enterprise Making Hadoop Ready for the Enterprise
Making Hadoop Ready for the Enterprise DataWorks Summit
 
BIG Data & Hadoop Applications in E-Commerce
BIG Data & Hadoop Applications in E-CommerceBIG Data & Hadoop Applications in E-Commerce
BIG Data & Hadoop Applications in E-CommerceSkillspeed
 
Getting started with Hadoop on the Cloud with Bluemix
Getting started with Hadoop on the Cloud with BluemixGetting started with Hadoop on the Cloud with Bluemix
Getting started with Hadoop on the Cloud with BluemixNicolas Morales
 
Growth hacking in the age of Data
Growth hacking in the age of DataGrowth hacking in the age of Data
Growth hacking in the age of DataDaniel Saito
 
SIMPosium presentation_Bardess Qlik
SIMPosium presentation_Bardess QlikSIMPosium presentation_Bardess Qlik
SIMPosium presentation_Bardess QlikBardess Group
 
BIG Data & Hadoop Applications in Social Media
BIG Data & Hadoop Applications in Social MediaBIG Data & Hadoop Applications in Social Media
BIG Data & Hadoop Applications in Social MediaSkillspeed
 
Xactly: How to Build a Successful Converged Data Platform with Hadoop, Spark,...
Xactly: How to Build a Successful Converged Data Platform with Hadoop, Spark,...Xactly: How to Build a Successful Converged Data Platform with Hadoop, Spark,...
Xactly: How to Build a Successful Converged Data Platform with Hadoop, Spark,...MapR Technologies
 

Driving Business Benefits with Hadoop

  • 1. Driving Business Value with Hadoop: MapR Customer Experiences. Carl Olofson, Research Vice President
  • 2. Agenda
      • The Promise and Challenge of Hadoop
      • Choosing a Distributor
      • MapR’s Key Differentiators
      • IDC’s MapR Business Value Study
      • Key Takeaways
      • Conclusions/Recommendations
      © IDC. Visit us at IDC.com and follow us on Twitter: @IDC
  • 3. Agenda: The Promise and Challenge of Hadoop; Choosing a Distributor; MapR’s Key Differentiators; IDC’s MapR Business Value Study; Key Takeaways; Conclusions/Recommendations
  • 4. The Promise and Challenge of Hadoop
      Why Hadoop?
      • Can collect any amount of data of any kind
      • Open source and cheap to deploy
      • Serves a variety of purposes
      Challenges
      • Involves many Apache projects
      • Coordination of software is complex
      • Management of clusters requires special expertise
  • 5. The Problem of Hadoop Sprawl
      • Hadoop is usually implemented, initially, as discrete, limited projects.
      • As these projects proliferate, they consume more and more resources.
      • Eventually, they require centralized management.
      • As long as they remain discrete configurations, management is complex and resource consumption is excessive.
      • Needed: a manageable Hadoop data platform that supports many projects in a single system with shared resources, embracing multi-tenancy.
  • 6. Choosing a Distributor
      • Choose a distributor whose software and service delivery best meet your needs.
      • A distributor offers:
          • Coordinated versions of related Apache software, delivered for immediate use
          • Software bundled in practical combinations for various business purposes
          • Expertise, guidance, and support as part of the value proposition
      • There are several leading distributors; competition breeds excellence.
  • 7. MapR’s Key Differentiators
      • Software Components
          • MapR-FS
          • MapR-DB
          • MapR Streams
      • MapR Converged Data Platform
      • Zeta Architecture
  • 8. IDC’s MapR Business Value Study
      • IDC conducted research that explores the value and benefits of Apache Hadoop for MapR, based on interviews with MapR customers.
      • The project included nine qualitative and quantitative interviews with MapR customers.
      • Based on its analysis, IDC has created a model that expresses the value and costs for these organizations of using Hadoop with MapR.
      • These results will inform the study IDC is developing for MapR.
  • 9. Firmographics of Interviewed Organizations (Apache Hadoop for MapR)
      Firmographics | Average | Median | Range
      Number of employees | 10,410 | 500 | 8 to 65,000
      Number of MapR users | 762 | 150 | 0 to 3,500
      Number of MapR applications | 11 | 4 | 1 to 50
      Number of PBs in MapR environment | 3.90 | 1.10 | 0.02 to 20
      Countries: United States
      Industries: Financial Services, Security, Professional Services, Cloud Services Provider, Advertising (Big Data)
      N = 9 interviewed organizations
  • 10. Executive Summary
      • Interviewed organizations reported that Hadoop with MapR provides the performance, efficiency, and scalability they need to use data analytics in a cost- and resource-effective way to drive their businesses and operations.
      • Hadoop with MapR has enabled data scientists and application developers to do their jobs more effectively and productively by providing more timely and superior analytical outputs. On average, interviewed organizations reported that data scientists and analysts improved their productivity by 31% and application developers by 39% with MapR.
      • Interviewed organizations reported leveraging Hadoop with MapR to drive their businesses. All interviewed organizations indicated earning more revenue, and six attributed a specific amount of revenue (an average of $26.7 million per year) to better serving their customers with improved analytical capabilities or to improving their analytics-based products and services.
      • MapR provides its customers with a cost-effective and efficient Big Data platform. On average, interviewed organizations put the infrastructure and IT staff time cost of deploying and running MapR at 42% lower than if they had tried to do it by themselves.
      • IDC calculates that business benefits achieved by interviewed organizations are worth a discounted average of $19.4 million over three years, which results in a return on investment (ROI) of 382% and a payback period of eight months.
  • 11. Quotes: Why MapR?
      • “We tested it and MapR was faster by 15% to 20%. The other reason that we went with MapR was security – MapR allows us to do really granular security, and that’s just not possible with some of the other offerings.”
      • “We needed a specific functionality - the ability to process TBs of data efficiently and cost effectively. The MapR version of Hadoop is faster and more reliable and has a great support infrastructure around it. . . The most significant benefits for us of MapR have been 4 nines in uptime and the ability to provide insight to our customers through the ability to get meaningful data from the data that we’re collecting.”
      • “We get dramatically better utilization out of our systems with MapR. Our testing showed a three to five times performance increase when we ran on MapR compared to other Hadoop, and compared to before it’s insane, it’s probably like probably 10 to 20 times faster.”
  • 12. Quotes: Data-Related Staff Benefits of MapR
      • “We support over 150 data analysts, about 30 of those are what most people call data scientists. The data scientists have absolutely saved time with MapR - productivity wise for them because they have faster access to the data now and they can direct probably 50% more time to other activities, plus they’re avoiding hiring. On top of the 50% saved, we are probably also avoiding 10 hires.”
      • “MapR has given our data scientists the ability to do more stuff. They can do their work much faster – for example, if they had to develop a risk model, and compare the eight hours it used to take compared with now running in an hour, it makes them more productive – about 20% on average.”
      • “Internal business processes have become more efficient with MapR. We have reporting that’s available every 15 minutes to 30 minutes depending on what report you’re looking at. And before there was not an efficient reconciliation process. The reconciliation process, if it occurred, would take 1 to 2 months to figure out. And the reconciliation process now takes a day. . . I would say there are five people over saving 24 days collectively out of a month.”
  • 13. Quotes: Business Benefits of MapR
      • “MapR absolutely has impacted our revenue. Our product is a superior product because of MapR Hadoop, and that means more customers and more revenue. . . We’ve added hundreds of customers and millions of dollars of revenue. I attribute that to having Hadoop with MapR being the core of our platform.”
      • “The quality of our outputs has increased dramatically with MapR – for example, we're able to process everything and don't have to do sampling. So we can provide our customers proper and accurate data. We can also do more test iterations to provide a better quality product, and that's helped us get more customers. . . MapR has supported business growth, dramatic growth of a product campaign we're running.”
      • “In our case, the choice of MapR was pretty simple. The alternative was that we just wouldn't do it with anything else, because we couldn't. It actually wouldn't have worked – we couldn't scale up enough to use a different solution. We wouldn't be in business.”
  • 14. Quotes: Reliability and Cost Benefits of MapR
      • “We haven't had any downtime for a couple of years with MapR. Before, we were experiencing downtime about quarterly, and these incidents had an impact on revenue because it affects client retention, especially in the security space. When a client finds out that someone hacked into your network, it makes for a very unhappy client. We lost clients probably 20-30% of the time when we had downtime.”
      • “If we had built this out ourselves using relational databases, it would have been more expensive than MapR because of phenomenal storage costs. And then we'd have to pay for licenses – overall initial costs would have been a number of times more expensive than MapR.”
      • “Multi-tenancy with MapR is helping us save money, and we can tell a customer now about our multi-tenant capabilities and they will then not request a dedicated environment. And that saves us money. We don’t have to get new MapR servers, new MapR licenses just for their dedicated environment. We are able to leverage the existing cluster.”
  • 15. Annual Average Benefits per Organization (Apache Hadoop for MapR; bar chart, in $ millions)
      • Business Productivity Benefits: $2.98M
      • Increased Revenue: $3.26M
      • IT Staff Productivity Benefits: $2.02M
      • Risk Mitigation - User Productivity Benefits: $51,000
      • Total: $8.31 Million
  • 16. Annual Average Benefits per 100 TBs (Apache Hadoop for MapR; bar chart)
      • Business Productivity Benefits: $74,500
      • Increased Revenue: $81,500
      • IT Staff Productivity Benefits: $50,600
      • Risk Mitigation - User Productivity Benefits: $1,300
      • Total: $207,900
  • 17. Cost Benefit Analysis (Apache Hadoop for MapR; chart, in $ millions)
      Period | Investment | Benefits
      Initial | $2.91M | $0
      Year 1 | $0.26M | $4.28M
      Year 2 | $0.59M | $9.54M
      Year 3 | $0.59M | $9.69M
      Cumulative Net Benefit: $20.6 Million
  • 18. Three-Year ROI Analysis (Apache Hadoop for MapR)
      Measure | Average per Organization | Average per 100 TBs
      Benefit (discounted) | $19.4 Million | $486,100
      Investment (discounted) | $4.0 Million | $100,800
      Net Present Value | $15.4 Million | $385,300
      ROI (NPV/Investment) | 382% | 382%
      Payback (Months) | 8.2 | 8.2
      Discount Factor | 12% | 12%
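The ROI row in the table above is defined as NPV divided by investment, with NPV being discounted benefit minus discounted investment. A minimal sketch of that arithmetic follows, using only the figures printed on the slide; the function name `roi_summary` is a hypothetical helper, not part of the IDC model. Note that the per-organization inputs are rounded to $0.1 million on the slide, so recomputing from them gives roughly 385% rather than the reported 382%; the per-100-TB figures reproduce 382%.

```python
def roi_summary(benefit, investment):
    """Return (net present value, ROI in percent) from discounted totals.

    Hypothetical helper: NPV = benefit - investment, ROI = NPV / investment.
    """
    npv = benefit - investment
    roi_pct = npv / investment * 100
    return npv, roi_pct


# Figures as printed on the slide (already discounted at 12%).
per_org = roi_summary(19_400_000, 4_000_000)
per_100tb = roi_summary(486_100, 100_800)

print(f"Per organization: NPV ${per_org[0]:,.0f}, ROI {per_org[1]:.0f}%")
print(f"Per 100 TBs:      NPV ${per_100tb[0]:,.0f}, ROI {per_100tb[1]:.0f}%")
```

The difference between the two ROI results is consistent with rounding of the per-organization dollar figures, though the slide does not state this explicitly.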
  • 19. Key Takeaways
      • Users found MapR’s distribution much easier to work with than their prior configuration of Apache Hadoop, resulting in dramatic savings in staff time spent setting up, configuring, and managing Hadoop clusters.
      • MapR’s software for optimizing file I/O, database management, and cluster management delivered significant performance improvements.
      • The rational way that the MapR Hadoop system is configured, as a converged platform, led to more efficient business processes and better business outcomes.
  • 20. Conclusions/Recommendations
      Conclusions
      • There are multiple major Hadoop distributions, each with its own merits.
      • This study examined only one of these: MapR.
      • The clear outcome of this study is demonstrable benefits to customers derived from the use of MapR.
      Recommendations
      • Business application of Hadoop requires the use of a managed distribution of Hadoop.
      • Users have several leading options in this regard.
      • MapR deserves to be on the list of those receiving serious consideration.
  • 21. © 2016 MapR Technologies. Driving Better Business Benefits with Hadoop
  • 22. What Contributed to the ROI with MapR
      Higher productivity derived from:
      • Higher efficiency
          – Greater scalability
          – Higher performance
          – Better utilization
      • Greater reliability
      • Multi-tenancy
      • Real-time data access
  • 23. Life Without a Converged Platform. Diagram labels: Sources; Apps; Batch Loads; Streaming; Real-time; Open Source Analytics (Hadoop, Spark); Operational Cluster (HBase, Other NoSQL); Streaming Cluster (TIBCO, IBM, Kafka); Enterprise Storage (system of record).
  • 25. A Modern Big Data Architecture
      Sources: relational, SaaS, mainframe; documents, emails; log files, clickstreams; sensors; blogs, tweets, link data; data warehouses, data marts
      MapR Converged Data Platform:
      • Enterprise Storage (MapR-FS): emails, blogs, tweets, log files, unstructured data
      • Database (MapR-DB): relational, time series, structured data
      • Event Streaming (MapR Streams): event data, change data, IoT data
      Capabilities:
      • Agile, self-service data exploration
      • ETL into operational reporting formats (e.g., Parquet)
      • Multi-tenancy: job/data placement control, volumes
      • Access controls: file, table, column, column family, doc, sub-doc levels
      • Auditing: compliance, analyze user accesses
      • Snapshots: track data lineage and history
      • Table Replication: global multi-master, business continuity
  • 26. Higher Efficiency
  • 27. MapR No-NameNode Architecture for Scale. Diagram labels: DataNode (repeated), NameNode, metadata blocks A through F.
      • No special configuration
      • Metadata is persisted to disk
      • Trillions of files (> 5000x advantage)
  • 28. HDFS-Based Scale Issues
      • Limited to 50 to 200 million files in a single NameNode (in 128 MB chunks)
      • Federation allows more files, but adds new single points of failure
      • Federation plus Standby NameNodes leads to complex configuration
      • 5 to 20 NameNodes required for 1 billion files
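The "5 to 20 NameNodes" figure on this slide is consistent with dividing 1 billion files by the stated per-NameNode limits. A minimal sketch of that arithmetic, assuming a plain ceiling division that ignores Standby NameNodes and any capacity headroom; `namenodes_needed` is a hypothetical helper, not a Hadoop API:

```python
def namenodes_needed(total_files: int, files_per_namenode: int) -> int:
    """Smallest number of NameNodes whose combined file limit covers total_files."""
    # Integer ceiling division, so no floating-point rounding is involved.
    return -(-total_files // files_per_namenode)


TOTAL_FILES = 1_000_000_000  # the 1 billion files quoted on the slide

# Upper and lower per-NameNode limits quoted on the slide.
for limit in (200_000_000, 50_000_000):
    count = namenodes_needed(TOTAL_FILES, limit)
    print(f"{limit:,} files per NameNode: {count} NameNodes")
```

The upper limit (200 million files) yields the low end of the range and the lower limit (50 million files) yields the high end.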
  • 29. The Architectural Key to MapR Scale and Speed
      Blocks are inside chunks; chunks are inside containers.
      • Containers - 32 GB
          – High scale to trillions of files
          – Default size self adjusts
      • Chunks - 256 MB
          – File sharding for parallelism
          – Default size can adjust by directory
      • Blocks - 8 KB
          – Raw device I/O for random reads/writes
          – Small size advantage for snapshot and mirroring deltas
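The three sizes on this slide nest into each other, and the ratios are easy to work out. A minimal sketch, assuming the default sizes quoted on the slide, binary units (1 KB = 1024 bytes), and no metadata overhead; the constant and function names are illustrative and not part of any MapR API:

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

# Default sizes quoted on the slide.
BLOCK_SIZE = 8 * KB
CHUNK_SIZE = 256 * MB
CONTAINER_SIZE = 32 * GB

# How many units of each level fit into the level above it.
BLOCKS_PER_CHUNK = CHUNK_SIZE // BLOCK_SIZE
CHUNKS_PER_CONTAINER = CONTAINER_SIZE // CHUNK_SIZE


def chunks_for_file(file_bytes: int) -> int:
    """Number of default-size chunks a file of file_bytes is sharded into."""
    # Integer ceiling division: a partial final chunk still counts as one chunk.
    return -(-file_bytes // CHUNK_SIZE)


print(f"Blocks per chunk: {BLOCKS_PER_CHUNK:,}")
print(f"Chunks per container: {CHUNKS_PER_CONTAINER}")
print(f"Chunks for a 1 GB file: {chunks_for_file(1 * GB)}")
```

Because the slide says the container and chunk defaults can adjust, these ratios describe the defaults only.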
  • 30. Disk I/O Throughput Performance (by Samsung). Samsung Flash Memory Summit 2015 Keynote: https://www.youtube.com/watch?v=fOT63zR7PvU#t=26m45s. I/O throughput using upcoming performance enhancements on high performance flash.
  • 31. Optimized Resource Consumption
      Layered stack, where every layer contends for more CPU and memory:
      • Storage Hardware
      • Linux File System (general purpose, slower than MapR-FS, leaves HA up to other engines)
      • Java Virtual Machine
      • HDFS (append-only)
      • Java Virtual Machine
      • HBase (excessive writes)
      • Java Virtual Machine
      • Kafka (separate cluster)
      MapR stack, where an efficient architecture frees up resources, with shared HA, DR, and I/O systems:
      • Storage Hardware
      • MapR-FS + MapR-DB + MapR Streams
      Diagram annotations: Replace with speed, connectivity, HA/DR; Replace with less I/O and RAM consumption; Eliminate layer (twice); Replace with full read-write; Fast, efficient, direct I/O.
  • 32. Data/Job Placement Control for Resource Management
      Example topology for (optionally) dedicating specific nodes to specific workloads in a single MapR cluster:
      • Operational workload on largest servers with SSDs
      • Production analytics on standard servers
      • Self-service data exploration on standard servers
      • Archived data on lower power, high disk density nodes
  • 33. Containerized Enterprise Using Docker, Mesos & Myriad
      • Unified, shared platform for enterprise apps & data processing
      • Unified application ecosystem
      • Shared, persistent, high performance storage for all apps
      • Multi-tenant, with choice of:
          – YARN + Services per tenant
          – Single, shared YARN for all tenants
      Diagram labels: Mesos; Myriad; YARN, Spark, Hive, MapRed (shown twice); Tenant #1; Tenant #2.
  • 34. Greater Reliability
  • 35. High Availability (HA) Everywhere
      Feature                  | Benefit
      No NameNode architecture | Easy HA at massive scale
      MapReduce/YARN HA        | Jobs are not impacted by failures
      NFS HA                   | Fast, resilient NFS access
      Instant recovery         | Replicas available within seconds of a node failure
      Rolling upgrades         | Upgrade the software with no downtime
      HA is built in           | No special configuration to enable HA
  • 36. MapR Mirroring for Disaster Recovery
      • Flexible
          – Choose the volumes/directories to mirror
          – Scheduled/incremental to set low RPO
          – Promotable mirrors to set low RTO
      • Fast
          – No performance impact
          – Block-level (8 KB) deltas
      • Safe
          – Point-in-time consistency, checksums
      • Easy
          – Takes less than two minutes to configure!
      Diagram labels: Production; Research; Datacenter 1; Datacenter 2; WAN; EC2.
  • 37. MapR-DB Table Replication for Disaster Recovery
      Multi-master (aka active/active) replication. Diagram labels: Active Read/Write; End Users.
      • Reduced risk of data loss
      • Application failover
      • Faster data access
  • 38. Snapshots for Online Consistent Backups
      • Point-in-time recovery
      • Consistent
      • Efficient
      • Fast
  • 39. Multi-Tenancy
  • 40. Multi-Tenancy
      Efficient single cluster with:
      • Isolation
      • Quotas
      • Security and delegation
      • Reporting
  • 41. Real-Time Data Access
  • 42. MapR Platform Services: Open API Architecture Assures Interoperability, Avoids Lock-in. Diagram labels: HDFS API; POSIX NFS; SQL, HBase API; JSON API; Kafka API.
  • 43. MapR NFS: Unique Advantage: Direct Integration with Enterprise
      • Real-time applications
      • NFS for file-based applications
      • Hadoop APIs for Hadoop applications
      • ODBC & JDBC for SQL-based applications
      • Mission critical and SLA dependent applications
  • 45. Thank You! Engage with us!
      • @mapr
      • dalekim@mapr.com
      • maprtech
      • mapr-technologies
      • https://www.mapr.com/get-started-with-mapr
      • https://www.mapr.com/training
      • https://www.mapr.com/ebooks/big-data-all-stars/
  • 46. Q & A