MAKING BIG DATA COME ALIVE
Adding Hadoop to Your Analytics Mix:
Challenges and Strategies
Madina Kassengaliyeva
July 23, 2015
2
Madina Kassengaliyeva
Director, Client Services, Think Big
Madina Kassengaliyeva is responsible for ensuring successful
delivery of Think Big’s service engagements. Madina has led
strategy, engineering and data science engagements in a variety
of areas, including recommendation engines, customer
interactions optimization, marketing analytics and compliance.
Madina holds an MBA from the University of Chicago and a BA in
International Studies from American University.
Presenters
© 2015 Think Big, a Teradata Company 8/3/2015
Paul Barsch
Director, Services Marketing, Think Big
Paul Barsch directs marketing programs for Think Big, a Teradata
Company. Paul has been in IT for 15+ years in a variety of roles for
Teradata, HP Enterprise Services and KPMG Consulting.
3
Housekeeping
Use the widget bar below to…
Get valuable resources & complete exit survey
Ask Questions to the Presenters
Request online technical help
Go social….
…and follow the conversation
4
• Hadoop Adoption Path
• Key Challenges – Data,
Organization, Capabilities
• Ideas for Solutions
Agenda
5
Common Hadoop Adoption Path
1. Address Immediate Needs
• Hadoop used to relieve a technology pain point
• Reduce data warehouse costs
• Speed up ETL
• The only users are in technology teams
2. Establish a Data Repository
• More and more data gets added to Hadoop as a result of Phase 1
• Greater data variety, more raw data, deeper history
• Initial data transfer, security, and governance practices are established
• Still perceived as largely a technology platform
3. Initial Analytics Exploration
• Limited number of people or teams conduct POCs using Hadoop
• Analytics techniques not available on traditional platforms are applied
• Early wins indicate promising business impact and excitement builds
4. Integrate Hadoop into the Analytics Capabilities
• Multiple teams use Hadoop as part of the analytics infrastructure
• Techniques, methods, best practices and access patterns get codified
• Business begins to capture consistent value
Transition from Phase 3 to Phase 4 is when key challenges emerge
6
Hadoop Adoption – Critical Point
7
Key Challenges
Data
• Impact of schema on read
• Consistent taxonomies and reference data
• Architecture - access patterns and flows
Organization
• Skills, roles and responsibilities
• Lack of common vocabulary
• Knowledge capture and sharing
Capabilities
• Foundational capabilities at the whim of changing business priorities
• Future that’s hard to envision is hard to build
8
Organization – Key Challenges
• Skills, roles and responsibilities
o Significant skills gaps between what’s currently available and what is
needed
o Both business and technology do analytics and often engineering, blurring
lines of responsibility or ownership
o “Throw over the wall” doesn’t work
• Lack of common vocabulary
o Every BU (and every leader) has its own understanding of the same
words
o This is rarely discussed
• Knowledge capture and sharing
o Multiple teams work with the same data and similar techniques
o Organization silos do not naturally support broad knowledge transfer
9
• Cross-BU committee to guide
organizational change, define
common vocabulary, defend the
effort to executive leadership and
share success
• Thorough, honest skills assessments to
identify gaps, training needs,
augmentation needs, map to roles
and responsibilities
• Documented tools requirements
based on current and projected skills
• Collaboration architecture
• Plug into existing knowledge transfer
practices and tools and allow for
informal information exchange based
on data access privileges
Organization – Ideas for Solutions
10
Organization – Key Functions
• Strategy
• Data Management & Governance
• Architecture
• Tools
• Market Research
• Roadmap Planning
• Value Realization
• Future Data Sources
• Services
• Support
• Visualization & Reporting
• Data SMEs
• Core Platform Development
• Testing
• Operations
• Core Platform Management
• Metrics Tracking & Reporting
• Platform Integration
• Program Management
• Roadmap Execution
• Cross Group Coordination
• Financial Management
• Small Project Prioritization
• Communication & Change Management
• Application Development
• Analytic Sandbox
• Data Science
• Integration, Interfaces & Ingestion
• Training
• Incident Management
• Config, Change, Release Management
• Problem Management
• Help Desk
• Knowledge Management
• Technology Governance
• Data Quality & Metrics
• Access Controls
• Data Governance
• Metadata Management
11
• Foundational capabilities at the whim of changing business priorities
o Lack of consensus on what the foundational capabilities are
o Let’s be honest, the “Top Project” changes often and the resources go with it
o Foundational capabilities do not immediately impact the bottom line
• Future that’s hard to envision is hard to build
o Lack of shared vision
o Clarity needed at multiple levels – strategy, operational details, day to day
Capabilities – Key Challenges
12
• Consolidate ownership in a team that has
organizational influence and includes
representatives from the business, the
infrastructure, architecture, data, and
analytics
• Back to vocabulary – agree on what
capabilities mean for your business unit and
your technology partners
• Roadmaps are useful – visual representations
of high-level goals against a time line that
should define your projects
• Dedicate resources to capabilities and
protect them
• Check in with your roadmap – does it still
reflect your vision?
Capabilities – Ideas for Solutions
Photo courtesy of Flickr. Creative Commons.
By E.Bass.
13
Capabilities Pyramid
14
Capabilities: Roadmap Example
• Analytics standardized methods, code, tools, team roles
• Operations standardized processes, tools, team roles
• Skills and roles matrix
• Data Ingestion, Transfer, Structuring, and Governance approach
• Unified Model Management
• Integrated Data Science
• Variables based on single source structured data
• Variable selection in Hadoop
• Integration with existing scoring engine
• Batch data processing in Hadoop
• Integration
• Cross-channel and intraday variables generation
• Batch scoring in Hadoop
• Natural language processing to analyze text and voice
• Initial real-time scoring
• Execution Methodology and project management
• Data and Models
• Organization and Management
• Analytics Knowledge Management
• Scoring Architectural and Analytical design
• Data Lifecycle Management
• Real-time scoring design
• Statistical and machine-learning-based modeling
• Data Exploration of unstructured data components (e.g. URL, chat text)
• Data Exploration of structured data components (e.g. page views,
• Cross-channel variables, variables from unstructured data + intraday variables
15
• Impact of schema on read
o Hadoop supports a variety of data structures, which simplifies data ingestion and allows data users to define preferred schemas
o This shifts the burden of defining the schema to the data users
• Consistent taxonomies and reference data
o Meaningful data analysis requires known and consistent taxonomy
o New taxonomies can get created by individual teams
o Reference data changes
• Architecture - access patterns and flows
o Data flows across platforms, regular updates, physical and virtual constraints
o Decisions on what should be done where
Data – Key Challenges
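To make the schema-on-read point concrete, here is a minimal sketch in plain Python (not Hadoop itself). The event records, field names, and the two team schemas are hypothetical and were not part of the presentation. The same raw data is stored untouched, and each reader applies its own schema when it reads.

```python
import json

# Hypothetical raw events, stored exactly as they arrived (no schema enforced on write).
RAW_EVENTS = [
    '{"user": "u1", "page": "/home", "ts": "2015-07-23T10:00:00", "ms": "120"}',
    '{"user": "u2", "page": "/cart", "ts": "2015-07-23T10:05:00"}',
    '{"user": "u1", "page": "/cart", "ts": "2015-07-23T10:07:00", "ms": "340"}',
]


def read_with_schema(lines, schema):
    """Apply a reader-defined schema: field name -> (converter, default)."""
    rows = []
    for line in lines:
        record = json.loads(line)
        row = {}
        for field, (convert, default) in schema.items():
            value = record.get(field)
            # Missing fields fall back to the reader's own default.
            row[field] = convert(value) if value is not None else default
        rows.append(row)
    return rows


# Two hypothetical teams read the same raw data with different schemas.
MARKETING_SCHEMA = {"user": (str, None), "page": (str, None)}
PERFORMANCE_SCHEMA = {"page": (str, None), "ms": (int, 0)}

marketing_rows = read_with_schema(RAW_EVENTS, MARKETING_SCHEMA)
performance_rows = read_with_schema(RAW_EVENTS, PERFORMANCE_SCHEMA)
```

Note that the second schema silently turns a missing `ms` value into 0. Another team could just as reasonably drop the record or keep it as missing, which is how the burden of schema decisions, and the risk of inconsistent answers, moves to the data users.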
16
• Big issue with lots of opinions – see Data Lake
et al.
• Test and define common data manipulation
patterns for different use cases –
aggregations, reductions, basic statistical
derivations
• Centralize the responsibility for data
governance, data architecture, taxonomy,
and maintenance
• Establish knowledge sharing for data post-
analytics
Data – Ideas for Solutions
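As one illustration of a "common data manipulation pattern", the sketch below defines a single shared group-and-summarize helper that several teams could reuse instead of each writing their own. It is plain Python with hypothetical record fields and function names. A real deployment would more likely express the same pattern in whatever query or processing engine the organization has standardized on.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical page-view records; field names are illustrative only.
PAGE_VIEWS = [
    {"channel": "web", "seconds": 30},
    {"channel": "web", "seconds": 50},
    {"channel": "mobile", "seconds": 20},
    {"channel": "mobile", "seconds": 40},
    {"channel": "mobile", "seconds": 60},
]


def group_values(records, key, value):
    """Reduction step: collect the values of one field per group key."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record[value])
    return dict(groups)


def summarize(records, key, value):
    """Aggregation step: count, total, mean and population std dev per group."""
    summary = {}
    for group, values in group_values(records, key, value).items():
        summary[group] = {
            "count": len(values),
            "total": sum(values),
            "mean": mean(values),
            "stdev": pstdev(values),
        }
    return summary


summary = summarize(PAGE_VIEWS, "channel", "seconds")
```

Agreeing on one tested helper like this also fixes small but consequential choices, such as population versus sample standard deviation, so that teams working on the same data report the same numbers.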
Photo courtesy of Flickr. Creative Commons.
By Renzo Ferrante
17
• Data management,
knowledge, architecture, and
processing assurance
• Investment justification,
research, knowledge sharing
• Data aggregation and
enhancement
Client Example – Centralized Data Group
Data Source 1
Data Source 2
Data Source 3
Data Source 3
Business
Group
Product
Group
Central Tech
Group
18
Conclusions
Data
Organization
Capabilities
• Centralize data management
• Knowledge of data = knowledge of business
• Technology is not enough – need the right
people and processes
• Executive commitment is key
• Tough conversations can yield much better
alignment
• Dedicate and protect resources to build
capabilities
19
• 100% Big Data Focus
• Founded in 2010 with 100+ engagements across 70 clients
• Unlock value of big data with data science and data
engineering services
• Proven vendor-neutral open source integration expertise
• Agile team-based development methodology
• Think Big Academy for skills and organizational development
• Global delivery model
Who is Think Big?
20
Questions
and Answers
Thank You!

Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
ELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptxELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptx
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 

Adding Hadoop to Your Analytics Mix?

  • 1. MAKING BIG DATA COME ALIVE Adding Hadoop to Your Analytics Mix: Challenges and Strategies Madina Kassengaliyeva July 23, 2015
  • 2. 2 Madina Kassengaliyeva Director, Client Services, Think Big Madina Kassengaliyeva is responsible for ensuring successful delivery of Think Big’s service engagements. Madina has led strategy, engineering and data science engagements in a variety of areas, including recommendation engines, customer interactions optimization, marketing analytics and compliance. Madina holds an MBA from the University of Chicago and a BA in International Studies from American University. Presenters © 2015 Think Big, a Teradata Company 8/3/2015 Paul Barsch Director, Services Marketing, Think Big Paul Barsch directs marketing programs for Think Big, a Teradata Company. Paul has been in IT for 15+ years in a variety of roles for Teradata, HP Enterprise Services and KPMG Consulting.
  • 3. 3 Housekeeping Use the widget bar below to… Get valuable resources & complete exit survey Ask Questions to the Presenters Request online technical help Go social…. …and follow the conversation © 2015 Think Big, a Teradata Company 8/3/2015
  • 4. 4 • Hadoop Adoption Path • Key Challenges – Data, Organization, Capabilities • Ideas for Solutions Agenda
  • 5. 5 Common Hadoop Adoption Path © 2015 Think Big, a Teradata Company 8/3/2015 1. Address Immediate Needs 2. Establish a Data Repository 3. Initial Analytics Exploration 4. Integrate Hadoop into the Analytics Capabilities • Hadoop used to relieve a technology pain point • Reduce data warehouse costs • Speed up ETL • The only users are in technology teams • More and more data gets added to Hadoop as a result of Phase 1 • Greater data variety, more raw data, deeper history • Initial data transfer, security, and governance practices are established • Still perceived as largely a technology platform • Limited number of people or teams conduct POCs using Hadoop • Analytics techniques not available on traditional platforms are applied • Early wins indicate promising business impact and excitement builds • Multiple teams use Hadoop as part of the analytics infrastructure • Techniques, methods, best practices and access patterns get codified • Business begins to capture consistent value Transition from Phase 3 to Phase 4 is when key challenges emerge
  • 6. 6 Hadoop Adoption – Critical Point © 2015 Think Big, a Teradata Company 8/3/2015
  • 7. 7 Key Challenges © 2015 Think Big, a Teradata Company 8/3/2015 Data Organization Capabilities • Impact of schema on read • Consistent taxonomies and reference data • Architecture - access patterns and flows • Skills, roles and responsibilities • Lack of common vocabulary • Knowledge capture and sharing • Foundational capabilities at the whim of changing business priorities • Future that’s hard to envision is hard to build
  • 8. 8 Organization – Key Challenges © 2015 Think Big, a Teradata Company 8/3/2015 • Skills, roles and responsibilities o Significant skills gaps between what’s currently available and what is needed o Both business and technology do analytics and often engineering, blurring lines of responsibility or ownership o “Throw over the wall” doesn’t work • Lack of common vocabulary o Every BU (and every leader) has its own understanding of the same words o This is rarely discussed • Knowledge capture and sharing o Multiple teams work with the same data and similar techniques o Organization silos do not naturally support broad knowledge transfer
  • 9. 9 • Cross-BU committee to guide organizational change, define common vocabulary, defend the effort to executive leadership and share success • Thorough, honest skills assessments to identify gaps, training needs, augmentation needs, map to roles and responsibilities • Documented tools requirements based on current and projected skills • Collaboration architecture • Plug into existing knowledge transfer practices and tools and allow for informal information exchange based on data access privileges Organization – Ideas for Solutions © 2015 Think Big, a Teradata Company 8/3/2015
  • 10. 10 Organization – Key Functions © 2015 Think Big, a Teradata Company 8/3/2015 Strategy Data Management & Governance Architecture Tools Market Research Roadmap Planning Value Realization Future Data Sources Services Support Visualization & Reporting Data SME’s Core Platform Development Testing Operations Core Platform Management Metrics Tracking & Reporting Platform Integration Program Management Roadmap Execution Cross Group Coordination Financial Management Small Project Prioritization Communication & Change Management Application Development Analytic Sandbox Data Science Integration, Interfaces & Ingestion Training Incident Management Config, Change, Release Management Problem Management Help Desk Knowledge Management Technology Governance Data Quality & Metrics Access Controls Data Governance Metadata Management
  • 11. 11 • Foundational capabilities at the whim of changing business priorities • Lack of consensus on what are foundational capabilities • Let’s be honest, the “Top Project” changes often and the resources go with it • Foundational capabilities do not immediately impact the bottom line • Future that’s hard to envision is hard to build • Lack of shared vision • Clarity needed at multiple levels – strategy, operational details, day to day Capabilities – Key Challenges © 2015 Think Big, a Teradata Company 8/3/2015
  • 12. 12 • Consolidate ownership in a team that has organizational influence and includes representatives from the business, the infrastructure, architecture, data, and analytics • Back to vocabulary – agree on what capabilities mean for your business unit and your technology partners • Roadmaps are useful – visual representations of high-level goals against a time line that should define your projects • Dedicate resource to capabilities and protect them • Check in with your roadmap – does it still reflect your vision? Capabilities – Ideas for Solutions © 2015 Think Big, a Teradata Company 8/3/2015 Photo courtesy of Flickr. Creative Commons. By E.Bass.
  • 13. 13 Capabilities Pyramid © 2015 Think Big, a Teradata Company 8/3/2015
  • 14. 14 Capabilities: Roadmap Example © 2015 Think Big, a Teradata Company 8/3/2015 Analytics standardized methods, code, tools, team roles Operations standardized processes, tools, team roles Skills and roles matrix Data Ingestion, Transfer, Structuring, and Governance approach Unified Model Management Integrated Data Science Variables based on single source structured data Variable selection in Hadoop Integration with existing scoring engine Batch data processing in Hadoop Integration Cross-channel and intraday variables generation Batch scoring in Hadoop Natural language processing to analyze text and voice Initial real-time scoring Execution Methodology and project management Data and Models Organization and Management Analytics Knowledge Management Scoring Architectural and Analytical design Data Lifecycle Management Real-time scoring design Statistical and machine-learning-based modeling Data Exploration of unstructured data components (e.g. URL, chat text) Data Exploration of structured data components (e.g. page views, Cross-channel variables, variables from unstructured data + intraday variables
  • 15. 15 • Impact of schema on read • Hadoop supports a variety of data structures, which simplifies data ingestion and allows data users to define preferred schemas • This shifts the burden of defining the schema to the data users • Consistent taxonomies and reference data • Meaningful data analysis requires known and consistent taxonomy • New taxonomies can get created by individual teams • Reference data changes • Architecture - access patterns and flows • Data flows across platforms, regular updates, physical and virtual constraints • Decisions on what should be done where Data – Key Challenges © 2015 Think Big, a Teradata Company 8/3/2015
  • 16. 16 • Big issue with lots of opinions – see Data Lake et al. • Test and define common data manipulation patterns for different use cases – aggregations, reductions, basic statistical derivations • Centralize the responsibility for data governance, data architecture, taxonomy, and maintenance • Establish knowledge sharing for data post-analytics Data – Ideas for Solutions © 2015 Think Big, a Teradata Company 8/3/2015 Photo courtesy of Flickr. Creative Commons. By Renzo Ferrante
  • 17. 17 • Data management, knowledge, architecture, and processing assurance • Investment justification, research, knowledge sharing • Data aggregation and enhancement Client Example – Centralized Data Group © 2015 Think Big, a Teradata Company 8/3/2015 Data Source 1 Data Source 2 Data Source 3 Data Source 3 Business Group Product Group Central Tech Group
  • 18. 18 Conclusions © 2015 Think Big, a Teradata Company 8/3/2015 Data Organization Capabilities • Centralize data management • Knowledge of data = knowledge of business • Technology is not enough – need the right people and processes • Executive commitment is key • Tough conversations can yield much better alignment • Dedicate and protect resources to build capabilities
  • 19. 19 • 100% Big Data Focus • Founded in 2010 with 100+ engagements across 70 clients • Unlock value of big data with data science and data engineering services • Proven vendor-neutral open source integration expertise • Agile team-based development methodology • Think Big Academy for skills and organizational development • Global delivery model Who is Think Big?
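
The schema-on-read point on slide 15 (the burden of defining the schema shifts to the data users) can be illustrated with a minimal sketch. This is not from the presentation: the sample events, the field names, and the `read_with_schema` helper are all hypothetical, and plain Python stands in for a Hadoop query engine.

```python
import json

# Raw events are stored as-is; nothing enforces a schema at write time.
# (Hypothetical sample data, not from the presentation.)
RAW_EVENTS = [
    '{"user": "u1", "url": "/home", "ts": "2015-07-23T10:00:00", "amount": "12.50"}',
    '{"user": "u2", "url": "/cart", "ts": "2015-07-23T10:05:00"}',
]


def read_with_schema(lines, schema):
    """Apply a reader-defined schema at read time.

    `schema` maps field name -> (caster, default). Fields missing from a
    raw record fall back to the default chosen by the reader.
    """
    for line in lines:
        record = json.loads(line)
        yield {
            field: caster(record[field]) if field in record else default
            for field, (caster, default) in schema.items()
        }


# Two teams read the same raw data with different schemas.
marketing_schema = {"user": (str, None), "url": (str, None)}
finance_schema = {"user": (str, None), "amount": (float, 0.0)}

marketing_view = list(read_with_schema(RAW_EVENTS, marketing_schema))
finance_view = list(read_with_schema(RAW_EVENTS, finance_schema))
```

Each team decides for itself which fields exist, how they are typed, and what a missing value means. That flexibility is also how inconsistent taxonomies can arise when those decisions are not shared across teams.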