Grab some coffee and enjoy the pre-show banter before the top of the hour!
HOT Technologies of 2014
HOST: Eric Kavanagh
THIS YEAR is…
Data Science
• Considered a highly specialized field
• Perceived as an expensive position to fill given the required skill set
• Typically involves, among other things, data preparation for advanced analytics
THE LINE UP
ANALYST: John Myers, Research Director, Enterprise Management Associates
ANALYST: Robin Bloor, Chief Analyst, The Bloor Group
GUEST: Chuck Yarbrough, Director of Big Data Product Marketing, Pentaho
GUEST: Mark Kromer, Big Data Analytics Product Manager, Pentaho
INTRODUCING John Myers
Today’s Presenters 
John Myers, Research Director, EMA 
John has over 10 years of experience working in areas related to business analytics 
in professional services consulting and product development roles. Additionally, John 
helps organizations solve their business analytics problems, whether they relate to 
operational platforms – such as customer care or billing – or applied analytical 
applications – such as revenue assurance or fraud management. 
Slide 8 © 2014 Enterprise Management Associates, Inc.
How are companies using Data Science? 
Data Science Defined 
Data Science is the study of the generalizable extraction 
of business or domain knowledge from data. It 
incorporates varying elements and builds on techniques and 
theories from many fields, including signal processing, 
mathematics, probability models, machine learning, 
statistical learning, computer programming, data engineering, 
pattern recognition and learning, visualization, 
uncertainty modeling, data warehousing, and high 
performance computing. Data Science is not 
restricted to Big Data, although the fact that data is 
increasing in load, complexity, and structure 
makes Big Data an important aspect of Data Science. 
Vision of a “Data Scientist” 
Few and far between… 
Who’s really performing Data Science… 
Many more Business Analysts… 
EMA Hybrid Data Ecosystem 
Empowering Data Scientists AND Business Analysts to perform Data Science 
INTRODUCING Robin Bloor
The Data Science Dance 
Robin Bloor, Ph.D.
Take Note! 
You can know more about a business from its data than by any other means
The Driving Force of Insight 
[Diagram labels: Hindsight, Oversight, Insight, Foresight, and Optimization?]
What is a Data Scientist? 
• Project manager 
• Qualified statistician 
• Domain Business expert 
• Experienced data architect 
• Software engineer 
(IT’S A TEAM)
A Process, Not an Activity 
• Data Analytics is a multi-disciplinary end-to-end process 
• Until recently it was a walled garden. But the walls were torn down by: 
  • Data availability 
  • Scalable technology 
  • Open source tools
The Impact of Machine Learning 
Machine learning and processing power (parallelism) will CHANGE the data analysis process 
Machine learning AUTOMATES “data science” to some degree
The Data Analysis Budget 
• Data Analysis is BUSINESS R&D 
• The focus is on business process 
• The outcome of successful R&D is a CHANGED PROCESS 
• Think of manufacturing for a useful example
INTRODUCING Chuck Yarbrough & Mark Kromer
DATA SCIENCE PACK 
© 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide 26 +1 (866) 660-7555 
CHUCK YARBROUGH 
DIRECTOR, BIG DATA PRODUCT MARKETING 
@CYARBROUGH 
MARK KROMER 
BIG DATA ANALYTICS PRODUCT MANAGER 
@KROMERBIGDATA 
JUNE 18, 2014 
Pentaho’s Hot Topic
The strength of Pentaho lies in the power of combination 
• Data integration + Business analytics 
• Big data + Any data 
• The IT department + Lines of business 
Any data. Any environment. Any analytics.
OUR VISION 
The New Reality: 
Powerful yet simplified analytics for all users 
[Diagram labels: Billing, Social Media, Location, Customer, Web, Network, Analytics; Existing & New Data Infrastructure & Processes]
ANY Analytics 
• Reports 
• Dashboards 
• Visualizations 
• Discovery 
• Predictive 
• Any role 
ANY Environment 
• Data warehouses 
• Data marts 
• Stack vendors 
• Cloud 
• Embedded 
ANY Data 
• Relational 
• Operational 
• Big Data 
• Data sources not yet anticipated
Pentaho 5.0 Architected for the Future 
Simplified analytics experience for all users 
• Simplified Analytics Experience 
• Blended Big Data 
• Enterprise Big Data Integration 
A Spectrum of Big Data Use Cases 
WHAT THE MARKET IS DEPLOYING TODAY AND PLANNING FOR TOMORROW 
[Chart: axes are Use Case Complexity and Business Impact; scale labels are Entry, Advanced, Optimize, Transform]
• Data Warehouse Optimization 
• Streamlined Data Refinery 
• Big Data Exploration 
• Customer 360 Degree View 
• Harnessing Machine & Sensor Data 
• Next Generation Applications 
• Internal Big Data as a Service 
• On-Demand Big Data Blending 
• Big Data Predictive Analytics 
• Monetize My Data
Pentaho Data Science Pack 
OPERATIONALIZE R AND WEKA, OFFLOAD DATA PREPARATION 
• Allow Data Scientists to focus on analysis 
• Use familiar tools (R, Weka) 
• Leverage a graphical ETL tool to manage data preparation 
• Blend Big Data Sources Easily 
• Provide access to data with governance 
• Operationalize the analytic workflow 
• Enable IT to partner with Data Scientists 
What’s in the Pack? 
TOOLS FAMILIAR TO THE DATA SCIENTIST 
• R SCRIPT EXECUTOR 
  • Provides access to 5,500+ advanced algorithms 
• WEKA FORECASTING 
  • Machine learning, time series analysis 
• WEKA SCORING 
  • Calculates probability values for better predictions 
[Diagram labels: Analytic Data Flows, PDI, R/Weka]
LEVERAGING THE DATA SCIENCE PACK 
Providing a more complete view for customers 
“…we are now helping clients blend a 360-degree view of all equipment data sources for early prediction of potential machinery failure.” 
“There was a gap in the market … people like myself were piecing together solutions to help with the data preparation, cleansing and orchestration of analytic data sets. The Pentaho Data Science Pack fills that gap to operationalize the data integration process for advanced and predictive analytics.” 
Ken Krooner, President at ESRG
“USING WEKA WITH PDI, WE ARE NOW HELPING CLIENTS HAVE A 360-DEGREE VIEW OF ALL EQUIPMENT DATA SOURCES TO ENABLE CAPABILITIES TO PREDICT EARLY PREDICTION OF POTENTIAL MACHINERY FAILURE.” 
• Provide remote and onboard analytics for maritime fleets and ships 
• Weka with PDI, to help clients blend a 360-degree view 
[Diagram labels: Fleet Data via Satellite; Business User (COO): Reporting on Operations and Efficiency; End Users: Dashboards and Reports on Machine Performance; PDI; Data Marts; Business Analytics Server; Data Scientist: Data Mining and Predictive; Data Governance; Local Machine and Server Data; Cross Department Operations Data; PDI]
Predictive View of the Customer 
LEVERAGE BLENDED BIG DATA & DATA SCIENCE TO SEIZE OPPORTUNITIES 
Key Considerations 
• Requires data scientists and PhDs - expensive resources 
• Data prep for predictive modeling can be labor-intensive 
• Tech fit: Various data stores, Distributed Weka, Enterprise R 
What is it? 
• Brings multi-source data together for an on-demand analytic view across customer touch points 
• Applies predictive models to data as part of the integration process, to optimize customer-facing decisions 
Why Do It? 
• Recommend profitable decisions for front line teams 
• Automate and scale optimal customer interactions 
• Boost upsell, reduce churn
Thank You 
JOIN THE CONVERSATION. YOU CAN FIND US ON: 
blog.pentaho.com 
@Pentaho 
Facebook.com/Pentaho 
Pentaho Business Analytics
THANK YOU! 
The Archive Trifecta: 
• Inside Analysis: www.insideanalysis.com 
• SlideShare: www.slideshare.net/InsideAnalysis 
• YouTube: www.youtube.com/user/BloorGroup

The Ultimate Toolkit – Equipping the Data Scientist

  • 1. Grab some coffee and enjoy the pre-show banter before the top of the hour!
  • 5. Data Science • Considered a highly specialized field • Perceived as an expensive position to fill given the required skill set • Typically involves, among other things, data preparation for advanced analytics
  • 6. THE LINE UP ANALYST: John Myers, Research Director, Enterprise Management Associates ANALYST: Robin Bloor, Chief Analyst, The Bloor Group GUEST: Chuck Yarbrough, Director of Big Data Product Marketing, Pentaho GUEST: Mark Kromer, Big Data Analytics Product Manager, Pentaho
  • 8. Today’s Presenters John Myers, Research Director, EMA John has over 10 years of experience working in areas related to business analytics in professional services consulting and product development roles. Additionally, John helps organizations solve their business analytics problems, whether they relate to operational platforms – such as customer care or billing – or applied analytical applications – such as revenue assurance or fraud management. Slide 8 © 2014 Enterprise Management Associates, Inc.
  • 9. How are companies using Data Science?
  • 10. Data Science Defined Data Science is the study of the generalizable extraction of business or domain knowledge from data. It incorporates varying elements and builds on techniques and theories from many fields, including signal processing, mathematics, probability models, machine learning, statistical learning, computer programming, data engineering, pattern recognition and learning, visualization, uncertainty modeling, data warehousing, and high performance computing. Data Science is not restricted to Big Data, although the fact that data is increasing in load, complexity and structure makes Big Data an important aspect of Data Science.
  • 11. Vision of a “Data Scientist”
  • 12. Few and far between…
  • 13. Who’s really performing Data Science…
  • 14. Many more Business Analysts…
  • 15. EMA Hybrid Data Ecosystem
  • 16. Empowering Data Scientists AND Business Analysts to perform Data Science
  • 18. The Data Science Dance Robin Bloor, Ph.D.
  • 19. Take Note! You can know more about a business from its data than by any other means
  • 20. The Driving Force of Insight and OPTIMIZATION? Foresight INSIGHT Hindsight Oversight
  • 21. What is a Data Scientist? • Project manager • Qualified statistician • Domain business expert • Experienced data architect • Software engineer (IT’S A TEAM)
  • 22. A Process, Not an Activity • Data Analytics is a multi-disciplinary end-to-end process • Until recently it was a walled garden. But the walls were torn down by: data availability, scalable technology, and open source tools
  • 23. The Impact of Machine Learning Machine learning and processing power (parallelism) will CHANGE the data analysis process Machine learning AUTOMATES “data science” to some degree
  • 24. The Data Analysis Budget • Data Analysis is BUSINESS R&D • The focus is on business process • The outcome of successful R&D is a CHANGED PROCESS • Think of manufacturing for a useful example
  • 26. DATA SCIENCE PACK • CHUCK YARBROUGH, DIRECTOR, BIG DATA PRODUCT MARKETING, @CYARBROUGH • MARK KROMER, BIG DATA ANALYTICS PRODUCT MANAGER, @KROMERBIGDATA • JUNE 18, 2014 • Pentaho’s Hot Topic © 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-7555
  • 27. The strength of Pentaho lies in the power of combination • Data integration + Business analytics • Big data + Any data • The IT department + Lines of business • Any data. Any environment. Any analytics.
  • 28. OUR VISION The New Reality: Powerful yet simplified analytics for all users Billing Social Media Location Customer Web Network Analytics ANY Analytics • Reports • Dashboards • Visualizations • Discovery • Predictive • Any role Existing & New Data Infrastructure & Processes ANY Environment • Data warehouses • Data marts • Stack vendors • Cloud • Embedded ANY Data • Relational • Operational • Big Data • Data sources not yet anticipated
  • 29. Pentaho 5.0 Architected for the Future Simplified analytics experience for all users Simplified Analytics Experience Blended Big Data Enterprise Big Data Integration
  • 30. A Spectrum of Big Data Use Cases WHAT THE MARKET IS DEPLOYING TODAY AND PLANNING FOR TOMORROW • Chart axes: Use Case Complexity, Business Impact • Chart labels: Entry, Advanced, Optimize, Transform • Data Warehouse Optimization • Streamlined Data Refinery • Big Data Exploration • Customer 360 Degree View • Harnessing Machine & Sensor Data • Next Generation Applications • Internal Big Data as a Service • On-Demand Big Data Blending • Big Data Predictive Analytics • Monetize My Data
  • 31. Pentaho Data Science Pack OPERATIONALIZE R AND WEKA, OFFLOAD DATA PREPARATION • Allow Data Scientists to focus on analysis • Use familiar tools (R, Weka) • Leverage a graphical ETL tool to manage data preparation • Blend Big Data Sources Easily • Provide access to data with governance • Operationalize the analytic workflow • Enable IT to partner with Data Scientists
  • 32. What’s in the Pack? TOOLS FAMILIAR TO THE DATA SCIENTIST • R SCRIPT EXECUTOR: Provides access to 5,500+ advanced algorithms • WEKA FORECASTING: Machine learning, time series analysis • WEKA SCORING: Calculates probability values for better predictions • Diagram labels: Analytic Data Flows, PDI, R/Weka
  • 33. LEVERAGING THE DATA SCIENCE PACK Providing a more complete view for customers “…we are now helping clients blend a 360-degree view of all equipment data sources for early prediction of potential machinery failure.” “There was a gap in the market… people like myself were piecing together solutions to help with the data preparation, cleansing and orchestration of analytic data sets. The Pentaho Data Science Pack fills that gap to operationalize the data integration process for advanced and predictive analytics.” Ken Krooner, President at ESRG
  • 34. “USING WEKA WITH PDI, WE ARE NOW HELPING CLIENTS HAVE A 360-DEGREE VIEW OF ALL EQUIPMENT DATA SOURCES TO ENABLE CAPABILITIES TO PREDICT EARLY PREDICTION OF POTENTIAL MACHINERY FAILURE.” Fleet Data via Satellite Business User (COO) Reporting on Operations and Efficiency End Users Dashboards and Reports on Machine Performance PDI Data Marts Business Analytics Server Data Scientist Data Mining and Predictive Data Governance Local Machine and Server Data Cross Department Operations Data PDI • Provide remote and onboard analytics for maritime fleets and ships • Weka with PDI, to help clients blend a 360-degree view
  • 35. Predictive View of the Customer LEVERAGE BLENDED BIG DATA & DATA SCIENCE TO SEIZE OPPORTUNITIES Key Considerations • Requires data scientists and PhDs - expensive resources • Data prep for predictive modeling can be labor-intensive • Tech fit: Various data stores, Distributed Weka, Enterprise R What is it? • Brings multi-source data together for an on-demand analytic view across customer touch points • Applies predictive models to data as part of the integration process – to optimize customer-facing decisions Why Do It? • Recommend profitable decisions for front line teams • Automate and scale optimal customer interactions • Boost upsell, reduce churn
  • 36. Thank You JOIN THE CONVERSATION. YOU CAN FIND US ON: blog.pentaho.com @Pentaho Facebook.com/Pentaho Pentaho Business Analytics
  • 38. THANK YOU! The Archive Trifecta: • Inside Analysis www.insideanalysis.com • SlideShare www.slideshare.net/InsideAnalysis • YouTube www.youtube.com/user/BloorGroup