@joe_Caserta#DataSummit
@joe_Caserta
Architecting Data For The Modern Enterprise
Presented by
Joe Caserta
May 17, 2017
Data Summit 2017
New York City
#DataSummit
About Joe Caserta
Launched Big Data practice
Co-author, with Ralph Kimball, The Data
Warehouse ETL Toolkit (Wiley)
Data Analysis, Data Warehousing and Business
Intelligence since 1996
Began consulting in database programming and data modeling
25+ years hands-on experience building database solutions
Founded Caserta Concepts in NYC
Web log analytics solution published in Intelligent
Enterprise magazine
Launched Data Science, Data Interaction and Cloud
practices
Laser focus on extending Data Analytics with Big Data
solutions
1986
2004
1996
2009
2001
2013
2012
2014
Dedicated to Data Governance Techniques on Big
Data (Innovation)
Awarded Top 20 Big Data Companies 2016
Top 20 Most Powerful
Big Data consulting firms
Launched Big Data Warehousing (BDW) Meetup NYC:
2,000+ Members
2016 Awarded Fastest Growing Big Data Companies
2016
Established best practices for big data ecosystem
implementations
About Caserta Concepts
– Consulting Data Innovation
– Award-winning company
– Internationally recognized work force
– Strategy, Architecture, Implementation, Governance
– Innovation Partner
– Strategic Consulting
– Advanced Architecture
– Build & Deploy
- Leader in Enterprise Data Solutions
– Big Data Analytics
– Data Warehousing
– Business Intelligence
Data Science
Cloud Computing
Data Governance
Why is Data so Important?
1500s
Printing Press
1840s
Penny Post
1850s
Telegraph
1850s
Rural Free Post
1890s
Telephone
1900s
Radio
1950s
TV
1970s
PCs
1980s
Internet
1990s
Web
2000s
Social Media, Mobile, Big Data, Cloud
Every 60 Seconds
98,000+ Tweets
695,000 Status Updates
11 Million instant messages
698,445 Google Searches
168 million+ emails sent
1,829 TB of data created
217 new mobile web users
Understanding the Customer
Awareness Consideration Purchase Service
Loyalty
Expansion
PR
Radio
TV
Print
Outdoor
Word of Mouth
Direct Mail
Customer Service
Physical Touchpoints
Digital Touchpoints
Search
Paid Content
email
Website/
Landing Pages
Social Media
Community
Chat
Social Media
Call Center
Offers
Mailings
Survey
Loyalty Programs
email
Agents
Partners
Ads
Website
Mobile
3rd Party Sites
Offers
Web self-service
Life As We Know It
Business: “I need to analyze some new data”
• IT collects requirements
• Creates normalized and/or dimensional data models
• Profiles and conforms the data
• Builds sophisticated ETL programs and quality standards
• Loads it into data models
• Builds a BI semantic layer
• Creates dashboards and reports
IT: “You can access your data in 3-6 months to see if it has value!”
– Onboarding new data is difficult!
– Rigid Structures and Data Governance
– Disconnected/removed from business
The Problem: Shadow IT = Data Sprawl
• There is one application for every 5-10 employees generating copies of
the same files leading to massive amounts of duplicate idle data strewn
all across the enterprise. - Michael Vizard, ITBusinessEdge.com
• Employees spend 35% of their work time searching for information...
finding what they seek 50% of the time or less.
- “The High Cost of Not Finding Information,” IDC
The New Data Paradigm
OLD WAY:
• Structure Data → Ingest Data → Analyze Data
• Fully Governed
• Monolith
NEW WAY:
• Ingest Data → Analyze Data → Structure Data
• Just Enough Governance
• Dynamic
RECIPE:
• Data Officer & Data Organization
• Enterprise Data Lake
• Corporate Data Pyramid
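The "new way" ordering (ingest, then analyze, then structure) can be illustrated with a minimal Python sketch. The sample events, the field names, and the 50% threshold are hypothetical and are not taken from the talk; the point is only that a schema is derived from what the data actually contains, after it has landed.

```python
import json
from collections import Counter

# Hypothetical raw events, landed exactly as received (ingest comes first).
RAW_EVENTS = [
    '{"user": "u1", "channel": "web", "amount": 20.0}',
    '{"user": "u2", "channel": "mobile"}',
    '{"user": "u1", "channel": "web", "amount": 5.5, "coupon": "SPRING"}',
]


def ingest(lines):
    """Ingest: parse each record without imposing a schema; keep every field."""
    return [json.loads(line) for line in lines]


def analyze(records):
    """Analyze: count how often each field actually occurs."""
    return Counter(field for rec in records for field in rec)


def structure(records, field_counts, min_share=0.5):
    """Structure: keep fields present in at least min_share of the records."""
    keep = sorted(
        f for f, n in field_counts.items() if n / len(records) >= min_share
    )
    # Missing values become None so every row has the same columns.
    return keep, [{f: rec.get(f) for f in keep} for rec in records]


records = ingest(RAW_EVENTS)
counts = analyze(records)
columns, table = structure(records, counts)
print(columns)
print(table)
```

In the "old way" the column list would have been fixed before the first record arrived, so a field such as the rarely seen `coupon` would have required a model change before it could be loaded at all.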
Big Data Analysis: The Ecosystem of the Future
Cloud-based Data Lake
Ingest, Persist, Analyze, Deploy
Data Integration, Identity Resolution, Data Quality
Discovery, Exploration, Machine Learning, Model Development
Reports / Dashboards, Applications, APIs
Structured Data, Unstructured Data
SQL, NoSQL, Object Store
Find, Share, Collaborate
Data Engineer, Data Scientist, Business Analyst, App Developer
Business Value
Provides innovative, industry-leading technologies that can be applied rapidly to the business without having to manage compatibility and data complexity.
Technical Value
Provides an open framework that reduces the number of integration points and testing environments needed to deliver business solutions.
Corporate Data Pyramid (CDP)
Columns: Usage Pattern, Data Governance
Ingest Raw Data
Organize, Define, Complete
Munging, Blending, Machine Learning
Arbitrary/Ad-hoc Queries and Reporting
Metadata, ILM, Security
Data Quality and Monitoring
Data Catalog
Data Integration
Fully Governed (trusted)
The Data Lake on the Cloud
Cloud Component | AWS | Google | Microsoft
Scalable distributed storage | S3 | GCS | Azure Storage
Pluggable fit-for-purpose processing | EMR | Dataproc | HDInsight
Compute services | EC2 | GCE | VMs
Consistent extensible framework | Spark | Spark | Spark
Dimensional MPP data warehouse | Redshift | BigQuery | Azure SQL Data Warehouse
Data streaming | Kinesis | Pub/Sub | Azure Stream
Common interface | Jupyter | Datalab | Azure Notebook
• Remove barriers between data ingestion and analysis
• Democratize data with Just Enough Data Governance (JEDG)
Which Cloud?
The Clouds Coalesce
Percent of organizations with AWS as primary that also use GCP
Percent of organizations with AWS as primary that also use Azure
Percent of organizations with GCP as primary that also use AWS
41%
32%
31%
Source: Clutch, 2016
Why Spark?
• Development is identical whether local or distributed
• Beautiful high-level APIs
• Full universe of Python modules
• Open source and free
• Blazing fast!
Spark has become our default processing engine for data engineering and data science
Analytics Development Lifecycle
• Data science is performed in ephemeral workspaces
• The work products of data science are promoted from “insights” to real applications
• Rigorous data governance is applied
• Processes must be hardened, repeatable, and performant
Big Data Warehouse
Data Science Workspace
Data Lake – Integrated Sandbox
Landing Area – Source Data in “Full Fidelity”
New Data
New Insights
Governance
Refinery
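The promotion step in this lifecycle can be sketched as a simple gate: a work product leaves the workspace only when every governance check passes. This is a minimal illustration in Python, not the process used by the speaker; the `WorkProduct` fields, the check names, and the 60-second performance threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class WorkProduct:
    """A hypothetical data science work product awaiting promotion."""
    name: str
    owner: str = ""
    has_tests: bool = False
    documented: bool = False
    runtime_seconds: float = 0.0


# Hypothetical governance checks; a real organization would define its own.
CHECKS = {
    "has an accountable owner": lambda p: bool(p.owner),
    "is repeatable (automated tests exist)": lambda p: p.has_tests,
    "is documented": lambda p: p.documented,
    "is performant (runs in under 60 s)": lambda p: p.runtime_seconds < 60,
}


def failed_checks(product):
    """Return the label of every governance check the product fails."""
    return [label for label, check in CHECKS.items() if not check(product)]


def promote(product):
    """Promote only when no check fails; otherwise stay in the workspace."""
    if failed_checks(product):
        return "data science workspace"
    return "big data warehouse"


insight = WorkProduct("churn_model", owner="analytics", has_tests=True,
                      documented=True, runtime_seconds=12.0)
draft = WorkProduct("adhoc_notebook")

print(insight.name, "->", promote(insight))
print(draft.name, "->", promote(draft), failed_checks(draft))
```

Keeping the checks in a single table makes the governance rules explicit and easy to extend, which fits the idea that exploratory work stays lightly governed until it is promoted.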
Unexpected Reaction to Change
Moving the Status Quo
Forces for Change
• Global economics
• Intensity of competition
• Reduce costs
• Move to cross-functional teams
• New executive leadership
• Speed of technical change
• Social trends and changes
Forces Resisting Change
• Period of time in present role
• Status & perks of office/dept under threat
• No apparent reasons for proposed changes
• Lack of understanding of proposed changes
• Fear of inability to cope with new technology
• Concern over job security
Status Quo
Source: http://www.change-management-coach.com/force-field-analysis.html
Introducing the Chief Data Officer
• Evangelize a data vision for the organization
• Support & enforce data governance policies via outreach, training & tools
• Monitor and enforce data quality in collaboration with data owners
• Monitor and enforce data security along with Legal/Security/Compliance
• Work with IT to develop/maintain an enterprise repository of strategic data
• Set standards for analytical reporting and generate data insights
• Provide a single point of accountability for data initiatives and issues
• Innovate ways to use existing data
• Enrich and augment data by combining internal and external sources
• Support efficient and agile analytics through training and templates
The CDO: The Whole Brain Challenge
Front
Back
Analytics Oriented
• Data Science
• Research
Process Oriented
• Data Governance
• Compliance
Operations Oriented
• Shared Services
• Data Engineering
Revenue Oriented
• Revenue Goals
• Monetizing Data
Chief Data Organization (Oversight)
Vertical Business Area
[Sales/Finance/Marketing/Operations/Customer Svc]
Product Owner
SCRUM Master
Agile Development Team
Business Subject Matter Expertise
Data Librarian/Data Stewardship
Data Science/ Statistical Skills
Data Engineering / Architecture
Presentation/ BI Report Development Skills
Data Quality Assurance
DevOps
IT Organization
(Oversight)
Enterprise Data Architect
Solution Engineers
Data Integration Practice
User Experience Practice
QA Practice
Operations Practice
Advanced Analytics
Business Analysts
Data Analysts
Data Scientists
Statisticians
Data Engineers
Planning Organization
Project Managers
Data Organization
Data Gov Coordinator
Data Librarians
Data Stewards
Agile Data Teams
Caution: Assembly Required
• Some of the most hopeful tools are brand new or in incubation
• Enterprise big data implementations typically combine products with custom-built components
The Buildout
Data Integration & Quality, Data Catalog & Governance, Emerging Solutions
People, processes, and business commitment are still critical!
What the Future Holds
• DevOps for Analytics
• Search-Based BI (NLP)
• Artificial Intelligence (AI)
• Virtual Reality BI (VR)
• Virtual Assistant BI (Voice)
• Reporting/Predictions Converge
• Citizen Data Scientists Emerge
Joe Caserta
President, Caserta Concepts
joe@casertaconcepts.com
@joe_caserta
Thank You!

More Related Content

What's hot

Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...
Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...
Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...Caserta
 
The Emerging Role of the Data Lake
The Emerging Role of the Data LakeThe Emerging Role of the Data Lake
The Emerging Role of the Data LakeCaserta
 
Intro to Data Science on Hadoop
Intro to Data Science on HadoopIntro to Data Science on Hadoop
Intro to Data Science on HadoopCaserta
 
Making Big Data Easy for Everyone
Making Big Data Easy for EveryoneMaking Big Data Easy for Everyone
Making Big Data Easy for EveryoneCaserta
 
Moving Past Infrastructure Limitations
Moving Past Infrastructure LimitationsMoving Past Infrastructure Limitations
Moving Past Infrastructure LimitationsCaserta
 
General Data Protection Regulation - BDW Meetup, October 11th, 2017
General Data Protection Regulation - BDW Meetup, October 11th, 2017General Data Protection Regulation - BDW Meetup, October 11th, 2017
General Data Protection Regulation - BDW Meetup, October 11th, 2017Caserta
 
Mastering Customer Data on Apache Spark
Mastering Customer Data on Apache SparkMastering Customer Data on Apache Spark
Mastering Customer Data on Apache SparkCaserta
 
Journey to Cloud Analytics
Journey to Cloud Analytics Journey to Cloud Analytics
Journey to Cloud Analytics Datavail
 
Big Data Analytics on the Cloud
Big Data Analytics on the CloudBig Data Analytics on the Cloud
Big Data Analytics on the CloudCaserta
 
Benefits of the Azure Cloud
Benefits of the Azure CloudBenefits of the Azure Cloud
Benefits of the Azure CloudCaserta
 
DataOps: Nine steps to transform your data science impact Strata London May 18
DataOps: Nine steps to transform your data science impact  Strata London May 18DataOps: Nine steps to transform your data science impact  Strata London May 18
DataOps: Nine steps to transform your data science impact Strata London May 18Harvinder Atwal
 
Balancing Data Governance and Innovation
Balancing Data Governance and InnovationBalancing Data Governance and Innovation
Balancing Data Governance and InnovationCaserta
 
Setting Up the Data Lake
Setting Up the Data LakeSetting Up the Data Lake
Setting Up the Data LakeCaserta
 
What Data Do You Have and Where is It?
What Data Do You Have and Where is It? What Data Do You Have and Where is It?
What Data Do You Have and Where is It? Caserta
 
Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty
Agile Leadership: Guiding DataOps Teams Through Rapid Change and UncertaintyAgile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty
Agile Leadership: Guiding DataOps Teams Through Rapid Change and UncertaintyTamrMarketing
 
Balancing Data Governance and Innovation
Balancing Data Governance and InnovationBalancing Data Governance and Innovation
Balancing Data Governance and InnovationCaserta
 
Smart Data Webinar: Choosing the Right Data Management Architecture for Cogni...
Smart Data Webinar: Choosing the Right Data Management Architecture for Cogni...Smart Data Webinar: Choosing the Right Data Management Architecture for Cogni...
Smart Data Webinar: Choosing the Right Data Management Architecture for Cogni...DATAVERSITY
 
Big Data's Impact on the Enterprise
Big Data's Impact on the EnterpriseBig Data's Impact on the Enterprise
Big Data's Impact on the EnterpriseCaserta
 
Defining and Applying Data Governance in Today’s Business Environment
Defining and Applying Data Governance in Today’s Business EnvironmentDefining and Applying Data Governance in Today’s Business Environment
Defining and Applying Data Governance in Today’s Business EnvironmentCaserta
 
Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...
Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...
Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...StampedeCon
 

What's hot (20)

Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...
Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...
Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...
 
The Emerging Role of the Data Lake
The Emerging Role of the Data LakeThe Emerging Role of the Data Lake
The Emerging Role of the Data Lake
 
Intro to Data Science on Hadoop
Intro to Data Science on HadoopIntro to Data Science on Hadoop
Intro to Data Science on Hadoop
 
Making Big Data Easy for Everyone
Making Big Data Easy for EveryoneMaking Big Data Easy for Everyone
Making Big Data Easy for Everyone
 
Moving Past Infrastructure Limitations
Moving Past Infrastructure LimitationsMoving Past Infrastructure Limitations
Moving Past Infrastructure Limitations
 
General Data Protection Regulation - BDW Meetup, October 11th, 2017
General Data Protection Regulation - BDW Meetup, October 11th, 2017General Data Protection Regulation - BDW Meetup, October 11th, 2017
General Data Protection Regulation - BDW Meetup, October 11th, 2017
 
Mastering Customer Data on Apache Spark
Mastering Customer Data on Apache SparkMastering Customer Data on Apache Spark
Mastering Customer Data on Apache Spark
 
Journey to Cloud Analytics
Journey to Cloud Analytics Journey to Cloud Analytics
Journey to Cloud Analytics
 
Big Data Analytics on the Cloud
Big Data Analytics on the CloudBig Data Analytics on the Cloud
Big Data Analytics on the Cloud
 
Benefits of the Azure Cloud
Benefits of the Azure CloudBenefits of the Azure Cloud
Benefits of the Azure Cloud
 
DataOps: Nine steps to transform your data science impact Strata London May 18
DataOps: Nine steps to transform your data science impact  Strata London May 18DataOps: Nine steps to transform your data science impact  Strata London May 18
DataOps: Nine steps to transform your data science impact Strata London May 18
 
Balancing Data Governance and Innovation
Balancing Data Governance and InnovationBalancing Data Governance and Innovation
Balancing Data Governance and Innovation
 
Setting Up the Data Lake
Setting Up the Data LakeSetting Up the Data Lake
Setting Up the Data Lake
 
What Data Do You Have and Where is It?
What Data Do You Have and Where is It? What Data Do You Have and Where is It?
What Data Do You Have and Where is It?
 
Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty
Agile Leadership: Guiding DataOps Teams Through Rapid Change and UncertaintyAgile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty
Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty
 
Balancing Data Governance and Innovation
Balancing Data Governance and InnovationBalancing Data Governance and Innovation
Balancing Data Governance and Innovation
 
Smart Data Webinar: Choosing the Right Data Management Architecture for Cogni...
Smart Data Webinar: Choosing the Right Data Management Architecture for Cogni...Smart Data Webinar: Choosing the Right Data Management Architecture for Cogni...
Smart Data Webinar: Choosing the Right Data Management Architecture for Cogni...
 
Big Data's Impact on the Enterprise
Big Data's Impact on the EnterpriseBig Data's Impact on the Enterprise
Big Data's Impact on the Enterprise
 
Defining and Applying Data Governance in Today’s Business Environment
Defining and Applying Data Governance in Today’s Business EnvironmentDefining and Applying Data Governance in Today’s Business Environment
Defining and Applying Data Governance in Today’s Business Environment
 
Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...
Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...
Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...
 

Viewers also liked

Cloud Computing System models for Distributed and cloud computing & Performan...
Cloud Computing System models for Distributed and cloud computing & Performan...Cloud Computing System models for Distributed and cloud computing & Performan...
Cloud Computing System models for Distributed and cloud computing & Performan...hrmalik20
 
TOON Stephen Galsworthy
TOON Stephen GalsworthyTOON Stephen Galsworthy
TOON Stephen GalsworthyBigDataExpo
 
Why choose VMware vCloud Suite Standard over vSOM
Why choose VMware vCloud Suite Standard over vSOMWhy choose VMware vCloud Suite Standard over vSOM
Why choose VMware vCloud Suite Standard over vSOMAnil Gupta (AJ) - vExpert
 
Technical Radar (Chinese version) 2014-06
Technical Radar (Chinese version) 2014-06Technical Radar (Chinese version) 2014-06
Technical Radar (Chinese version) 2014-06Freyr Lin
 
Architecting Security and Governance Across Multi Accounts
Architecting Security and Governance Across Multi AccountsArchitecting Security and Governance Across Multi Accounts
Architecting Security and Governance Across Multi AccountsAmazon Web Services
 
BVBA SOSIS van Jeroen Meus kent rustige start
BVBA SOSIS van Jeroen Meus kent rustige startBVBA SOSIS van Jeroen Meus kent rustige start
BVBA SOSIS van Jeroen Meus kent rustige startThierry Debels
 
Poor mans spy vs spy using open source tools to detect attackers
Poor mans spy vs spy using open source tools to detect attackersPoor mans spy vs spy using open source tools to detect attackers
Poor mans spy vs spy using open source tools to detect attackersDerek Banks
 
Agile Operations Keynote: Redefine the Role of IT Operations With Digital Tra...
Agile Operations Keynote: Redefine the Role of IT Operations With Digital Tra...Agile Operations Keynote: Redefine the Role of IT Operations With Digital Tra...
Agile Operations Keynote: Redefine the Role of IT Operations With Digital Tra...CA Technologies
 
Next Generation Data Center Strategies
Next Generation Data Center StrategiesNext Generation Data Center Strategies
Next Generation Data Center StrategiesVenkat Nambiyur
 
Boston Devops Meetup June 22nd
Boston Devops Meetup June 22ndBoston Devops Meetup June 22nd
Boston Devops Meetup June 22ndmdilawari
 
Praktiline pilvekonverents - IT haldust hõlbustavad uuendused
Praktiline pilvekonverents - IT haldust hõlbustavad uuendusedPraktiline pilvekonverents - IT haldust hõlbustavad uuendused
Praktiline pilvekonverents - IT haldust hõlbustavad uuendusedPrimend
 
Building an ai with raspberry pi
Building an ai with raspberry piBuilding an ai with raspberry pi
Building an ai with raspberry piHaesung Lee
 
Build_Buy_StreamAnalytix_WhitePaper
Build_Buy_StreamAnalytix_WhitePaperBuild_Buy_StreamAnalytix_WhitePaper
Build_Buy_StreamAnalytix_WhitePaperJane Roberts
 
Giovanni Lanzani GoDataDriven
Giovanni Lanzani GoDataDrivenGiovanni Lanzani GoDataDriven
Giovanni Lanzani GoDataDrivenBigDataExpo
 
Disruptive Data Science - How Data Science and Big Data are Transforming Busi...
Disruptive Data Science - How Data Science and Big Data are Transforming Busi...Disruptive Data Science - How Data Science and Big Data are Transforming Busi...
Disruptive Data Science - How Data Science and Big Data are Transforming Busi...EMC
 
2017-10-03 Session aOS - Back from Ignite - MS Experiences
2017-10-03 Session aOS - Back from Ignite - MS Experiences2017-10-03 Session aOS - Back from Ignite - MS Experiences
2017-10-03 Session aOS - Back from Ignite - MS ExperiencesPatrick Guimonet
 
IBM Software Day 2013. Smarter analytics and big data. building the next gene...
IBM Software Day 2013. Smarter analytics and big data. building the next gene...IBM Software Day 2013. Smarter analytics and big data. building the next gene...
IBM Software Day 2013. Smarter analytics and big data. building the next gene...IBM (Middle East and Africa)
 
從系統思考看 DevOps:以 microservices 為例 (DevOps: a system dynamics perspective)
從系統思考看 DevOps:以 microservices 為例 (DevOps: a system dynamics perspective)從系統思考看 DevOps:以 microservices 為例 (DevOps: a system dynamics perspective)
從系統思考看 DevOps:以 microservices 為例 (DevOps: a system dynamics perspective)William Yeh
 
DFW meetup Cognitive services - parashar - feb 22
DFW meetup Cognitive services -  parashar - feb 22DFW meetup Cognitive services -  parashar - feb 22
DFW meetup Cognitive services - parashar - feb 22Parashar Shah
 

Viewers also liked (20)

Cloud Computing System models for Distributed and cloud computing & Performan...
Cloud Computing System models for Distributed and cloud computing & Performan...Cloud Computing System models for Distributed and cloud computing & Performan...
Cloud Computing System models for Distributed and cloud computing & Performan...
 
TOON Stephen Galsworthy
TOON Stephen GalsworthyTOON Stephen Galsworthy
TOON Stephen Galsworthy
 
Why choose VMware vCloud Suite Standard over vSOM
Why choose VMware vCloud Suite Standard over vSOMWhy choose VMware vCloud Suite Standard over vSOM
Why choose VMware vCloud Suite Standard over vSOM
 
Technical Radar (Chinese version) 2014-06
Technical Radar (Chinese version) 2014-06Technical Radar (Chinese version) 2014-06
Technical Radar (Chinese version) 2014-06
 
Architecting Security and Governance Across Multi Accounts
Architecting Security and Governance Across Multi AccountsArchitecting Security and Governance Across Multi Accounts
Architecting Security and Governance Across Multi Accounts
 
BVBA SOSIS van Jeroen Meus kent rustige start
BVBA SOSIS van Jeroen Meus kent rustige startBVBA SOSIS van Jeroen Meus kent rustige start
BVBA SOSIS van Jeroen Meus kent rustige start
 
Poor mans spy vs spy using open source tools to detect attackers
Poor mans spy vs spy using open source tools to detect attackersPoor mans spy vs spy using open source tools to detect attackers
Poor mans spy vs spy using open source tools to detect attackers
 
Agile Operations Keynote: Redefine the Role of IT Operations With Digital Tra...
Agile Operations Keynote: Redefine the Role of IT Operations With Digital Tra...Agile Operations Keynote: Redefine the Role of IT Operations With Digital Tra...
Agile Operations Keynote: Redefine the Role of IT Operations With Digital Tra...
 
Next Generation Data Center Strategies
Next Generation Data Center StrategiesNext Generation Data Center Strategies
Next Generation Data Center Strategies
 
Sudan tanıtımı
Sudan tanıtımıSudan tanıtımı
Sudan tanıtımı
 
Boston Devops Meetup June 22nd
Boston Devops Meetup June 22ndBoston Devops Meetup June 22nd
Boston Devops Meetup June 22nd
 
Praktiline pilvekonverents - IT haldust hõlbustavad uuendused
Praktiline pilvekonverents - IT haldust hõlbustavad uuendusedPraktiline pilvekonverents - IT haldust hõlbustavad uuendused
Praktiline pilvekonverents - IT haldust hõlbustavad uuendused
 
Building an ai with raspberry pi
Building an ai with raspberry piBuilding an ai with raspberry pi
Building an ai with raspberry pi
 
Build_Buy_StreamAnalytix_WhitePaper
Build_Buy_StreamAnalytix_WhitePaperBuild_Buy_StreamAnalytix_WhitePaper
Build_Buy_StreamAnalytix_WhitePaper
 
Giovanni Lanzani GoDataDriven
Giovanni Lanzani GoDataDrivenGiovanni Lanzani GoDataDriven
Giovanni Lanzani GoDataDriven
 
Disruptive Data Science - How Data Science and Big Data are Transforming Busi...
Disruptive Data Science - How Data Science and Big Data are Transforming Busi...Disruptive Data Science - How Data Science and Big Data are Transforming Busi...
Disruptive Data Science - How Data Science and Big Data are Transforming Busi...
 
2017-10-03 Session aOS - Back from Ignite - MS Experiences
2017-10-03 Session aOS - Back from Ignite - MS Experiences2017-10-03 Session aOS - Back from Ignite - MS Experiences
2017-10-03 Session aOS - Back from Ignite - MS Experiences
 
IBM Software Day 2013. Smarter analytics and big data. building the next gene...
IBM Software Day 2013. Smarter analytics and big data. building the next gene...IBM Software Day 2013. Smarter analytics and big data. building the next gene...
IBM Software Day 2013. Smarter analytics and big data. building the next gene...
 
從系統思考看 DevOps:以 microservices 為例 (DevOps: a system dynamics perspective)
從系統思考看 DevOps:以 microservices 為例 (DevOps: a system dynamics perspective)從系統思考看 DevOps:以 microservices 為例 (DevOps: a system dynamics perspective)
從系統思考看 DevOps:以 microservices 為例 (DevOps: a system dynamics perspective)
 
DFW meetup Cognitive services - parashar - feb 22
DFW meetup Cognitive services -  parashar - feb 22DFW meetup Cognitive services -  parashar - feb 22
DFW meetup Cognitive services - parashar - feb 22
 

Similar to Architecting Data For The Modern Enterprise - Data Summit 2017, Closing Keynote

The Data Lake - Balancing Data Governance and Innovation
The Data Lake - Balancing Data Governance and Innovation The Data Lake - Balancing Data Governance and Innovation
The Data Lake - Balancing Data Governance and Innovation Caserta
 
Building the Artificially Intelligent Enterprise
Building the Artificially Intelligent EnterpriseBuilding the Artificially Intelligent Enterprise
Building the Artificially Intelligent EnterpriseDatabricks
 
Architecting for Big Data: Trends, Tips, and Deployment Options
Architecting for Big Data: Trends, Tips, and Deployment OptionsArchitecting for Big Data: Trends, Tips, and Deployment Options
Architecting for Big Data: Trends, Tips, and Deployment OptionsCaserta
 
BAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, SydneyBAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, SydneySai Paravastu
 
Big Data Analytics with Microsoft
Big Data Analytics with MicrosoftBig Data Analytics with Microsoft
Big Data Analytics with MicrosoftCaserta
 
Introduction to Data Science (Data Summit, 2017)
Introduction to Data Science (Data Summit, 2017)Introduction to Data Science (Data Summit, 2017)
Introduction to Data Science (Data Summit, 2017)Caserta
 
Analytics in a Day Virtual Workshop
Analytics in a Day Virtual WorkshopAnalytics in a Day Virtual Workshop
Analytics in a Day Virtual WorkshopCCG
 
Big Data: Setting Up the Big Data Lake
Big Data: Setting Up the Big Data LakeBig Data: Setting Up the Big Data Lake
Big Data: Setting Up the Big Data LakeCaserta
 
Data-Ed Slides: Data Architecture Strategies - Constructing Your Data Garden
Data-Ed Slides: Data Architecture Strategies - Constructing Your Data GardenData-Ed Slides: Data Architecture Strategies - Constructing Your Data Garden
Data-Ed Slides: Data Architecture Strategies - Constructing Your Data GardenDATAVERSITY
 
DataEd Slides: Unlock Business Value Using Reference and Master Data Manageme...
DataEd Slides: Unlock Business Value Using Reference and Master Data Manageme...DataEd Slides: Unlock Business Value Using Reference and Master Data Manageme...
DataEd Slides: Unlock Business Value Using Reference and Master Data Manageme...DATAVERSITY
 
Data-Ed Slides: Data Modeling Strategies - Getting Your Data Ready for the Ca...
Data-Ed Slides: Data Modeling Strategies - Getting Your Data Ready for the Ca...Data-Ed Slides: Data Modeling Strategies - Getting Your Data Ready for the Ca...
Data-Ed Slides: Data Modeling Strategies - Getting Your Data Ready for the Ca...DATAVERSITY
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceCaserta
 
Self-Service Data Analysis, Data Wrangling, Data Munging, and Data Modeling –...
Self-Service Data Analysis, Data Wrangling, Data Munging, and Data Modeling –...Self-Service Data Analysis, Data Wrangling, Data Munging, and Data Modeling –...
Self-Service Data Analysis, Data Wrangling, Data Munging, and Data Modeling –...DATAVERSITY
 
Data-Ed Online Webinar: Data Architecture Requirements
Data-Ed Online Webinar: Data Architecture RequirementsData-Ed Online Webinar: Data Architecture Requirements
Data-Ed Online Webinar: Data Architecture RequirementsDATAVERSITY
 
Accelerate Self-Service Analytics with Data Virtualization and Visualization
Accelerate Self-Service Analytics with Data Virtualization and VisualizationAccelerate Self-Service Analytics with Data Virtualization and Visualization
Accelerate Self-Service Analytics with Data Virtualization and VisualizationDenodo
 
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...Dataconomy Media
 
Accelerate Self-Service Analytics with Virtualization and Visualisation (Thai)
Accelerate Self-Service Analytics with Virtualization and Visualisation (Thai)Accelerate Self-Service Analytics with Virtualization and Visualisation (Thai)
Accelerate Self-Service Analytics with Virtualization and Visualisation (Thai)Denodo
 
Incorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic ArchitectureIncorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic ArchitectureCaserta
 
SPS Vancouver 2018 - What is CDM and CDS
SPS Vancouver 2018 - What is CDM and CDSSPS Vancouver 2018 - What is CDM and CDS
SPS Vancouver 2018 - What is CDM and CDSNicolas Georgeault
 
Data-Ed Webinar: Data Architecture Requirements
Data-Ed Webinar: Data Architecture RequirementsData-Ed Webinar: Data Architecture Requirements
Data-Ed Webinar: Data Architecture RequirementsDATAVERSITY
 

Architecting Data For The Modern Enterprise - Data Summit 2017, Closing Keynote

  • 1. @joe_Caserta#DataSummit @joe_Caserta Architecting Data For The Modern Enterprise Presented by Joe Caserta May 17, 2017 Data Summit 2017 New York City #DataSummit
  • 3. About Joe Caserta Launched Big Data practice Co-author, with Ralph Kimball, The Data Warehouse ETL Toolkit (Wiley) Data Analysis, Data Warehousing and Business Intelligence since 1996 Began consulting database programming and data modeling 25+ years hands-on experience building database solutions Founded Caserta Concepts in NYC Web log analytics solution published in Intelligent Enterprise magazine Launched Data Science, Data Interaction and Cloud practices Laser focus on extending Data Analytics with Big Data solutions 1986 2004 1996 2009 2001 2013 2012 2014 Dedicated to Data Governance Techniques on Big Data (Innovation) Awarded Top 20 Big Data Companies 2016 Top 20 Most Powerful Big Data consulting firms Launched Big Data Warehousing (BDW) Meetup NYC: 2,000+ Members 2016 Awarded Fastest Growing Big Data Companies 2016 Established best practices for big data ecosystem implementations
  • 4. About Caserta Concepts – Consulting Data Innovation – Award-winning company – Internationally recognized work force – Strategy, Architecture, Implementation, Governance – Innovation Partner – Strategic Consulting – Advanced Architecture – Build & Deploy – Leader in Enterprise Data Solutions – Big Data Analytics – Data Warehousing – Business Intelligence Data Science Cloud Computing Data Governance
  • 5. Why is Data so Important? 1500s Printing Press 1840s Penny Post 1850s Telegraph 1850s Rural Free Post 1890s Telephone 1900s Radio 1950s TV 1970s PCs 1980s Internet 1990s Web 2000s Social Media, Mobile, Big Data, Cloud 98,000+ Tweets 695,000 Status Updates 11 Million instant messages 698,445 Google Searches 168 million+ emails sent 1,829 TB of data created 217 new mobile web users Every 60 Seconds
  • 6. Understanding the Customer Awareness Consideration Purchase Service Loyalty Expansion PR Radio TV Print Outdoor Word of Mouth Direct Mail Customer Service Physical Touchpoints Digital Touchpoints Search Paid Content email Website/ Landing Pages Social Media Community Chat Social Media Call Center Offers Mailings Survey Loyalty Programs email Agents Partners Ads Website Mobile 3rd Party Sites Offers Web self-service
  • 7. Life As We Know It Business: “I need to analyze some new data” • IT collects requirements • Creates normalized and/or dimensional data models • Profiles and conforms the data • Sophisticated ETL programs and quality standards • Loads it into data models • Builds a BI semantic layer • Creates dashboards and reports IT: “You can access your data in 3-6 months to see if it has value!” – Onboarding new data is difficult! – Rigid Structures and Data Governance – Disconnected/removed from business
  • 8. The Problem: Shadow IT = Data Sprawl • There is one application for every 5-10 employees generating copies of the same files leading to massive amounts of duplicate idle data strewn all across the enterprise. - Michael Vizard, ITBusinessEdge.com • Employees spend 35% of their work time searching for information... finding what they seek 50% of the time or less. - “The High Cost of Not Finding Information,” IDC
  • 10. The New Data Paradigm OLD WAY: • Structure Data → Ingest Data → Analyze Data • Fully Governed • Monolith NEW WAY: • Ingest Data → Analyze Data → Structure Data • Just Enough Governance • Dynamic RECIPE: • Data Officer & Data Organization • Enterprise Data Lake • Corporate Data Pyramid
  • 11. Business Value Cloud-based Data Lake Big Data Analysis: The Ecosystem of the future Analyze Persist Deploy Ingest Data Integration Identity Resolution Data Quality Discovery Exploration Machine Learning Models Development Reports / Dashboards Applications APIs Structured Data Unstructured Data SQL, NoSQL, Object Store Find Share Collaborate Data Engineer Data Scientist Business Analyst App Developer Provides innovative and industry leading technologies to rapidly be applied to the business without having to manage compatibility and data complexity. Technical Value Provides an open framework to reduce the number of integration points and testing environments to deliver business solutions. or
  • 12. Ingest Raw Data Organize, Define, Complete Munging, Blending Machine Learning Data Quality and Monitoring Metadata, ILM, Security Data Catalog Data Integration Fully Governed (trusted) Arbitrary/Ad-hoc Queries and Reporting Usage Pattern Data Governance Metadata, ILM, Security Corporate Data Pyramid (CDP)
  • 13. The Data Lake on the Cloud • Remove barriers between data ingestion and analysis • Democratize data with Just Enough Data Governance (JEDG)
      Cloud Component | AWS | Google | Microsoft
      Scalable distributed storage | S3 | GCS | Azure Storage
      Pluggable fit-for-purpose processing | EMR | DataProc | HDInsight
      Compute Services | EC2 | GCE | VMs
      Consistent extensible framework | Spark | Spark | Spark
      Dimensional MPP Data Warehouse | Redshift | BigQuery | Azure SQL Data Warehouse
      Data Streaming | Kinesis | PubSub | Azure Stream
      Common Interface | Jupyter | DataLab | Azure Notebook
  • 15. The Clouds Coalesce Percent of organizations with AWS as primary, also uses GCP Percent of organizations with AWS as primary, also uses Azure Percent of organizations with GCP as primary, also uses AWS 41% 32% 31% Source: Clutch, 2016
  • 16. Why Spark? • Development local or distributed is identical • Beautiful high-level APIs • Full universe of Python modules • Open source and free • Blazing fast! Spark has become our default processing engine for data engineering & science
  • 17. Analytics Development Lifecycle • Data Science is performed in the ephemeral workspaces • The work products of data science are promoted from “insights” to real applications. • Rigorous Data Governance applied • Processes must be hardened, repeatable, and performant Big Data Warehouse Data Science Workspace Data Lake – Integrated Sandbox Landing Area – Source Data in “Full Fidelity” New Data New Insights Governance Refinery
  • 19. Global economics Intensity of competition Reduce costs Move to cross-functional teams New executive leadership Speed of technical change Social trends and changes Period of time in present role Status & perks of office/dept under threat No apparent reasons for proposed changes Lack of understanding of proposed changes Fear of inability to cope with new technology Concern over job security Forces for Change Forces Resisting Change Status Quo Moving the Status Quo http://www.change-management-coach.com/force-field-analysis.html
  • 20. Introducing the Chief Data Officer • Evangelize a data vision for the organization • Support & enforce data governance policies via outreach, training & tools • Monitor and enforce data quality in collaboration with data owners • Monitor and enforce data security along with Legal/Security/Compliance • Work with IT to develop/maintain an enterprise repository of strategic data • Set standards for analytical reporting and generate data insights • Provide a single point of accountability for data initiatives and issues • Innovate ways to use existing data • Enrich and augment data by combining internal and external sources • Support efficient and agile analytics through training and templates
  • 21. The CDO: The Whole Brain Challenge Front Back Analytics Oriented • Data Science • Research Process Oriented • Data Governance • Compliance Operations Oriented • Shared Services • Data Engineering Revenue Oriented • Revenue Goals • Monetizing Data
  • 22. Chief Data Organization (Oversight) Vertical Business Area [Sales/Finance/Marketing/Operations/Customer Svc] Product Owner SCRUM Master Agile Development Team Business Subject Matter Expertise Data Librarian/Data Stewardship Data Science/ Statistical Skills Data Engineering / Architecture Presentation/ BI Report Development Skills Data Quality Assurance DevOps IT Organization (Oversight) Enterprise Data Architect Solution Engineers Data Integration Practice User Experience Practice QA Practice Operations Practice Advanced Analytics Business Analysts Data Analysts Data Scientists Statisticians Data Engineers Planning Organization Project Managers Data Organization Data Gov Coordinator Data Librarians Data Stewards Agile Data Teams
  • 23. Caution: Assembly Required • Some of the most hopeful tools are brand new or in incubation • Enterprise big data implementations typically combine products with custom built components The Buildout People, Processes and Business commitment are still critical! Data Integration & Quality Data Catalog & Governance Emerging Solutions
  • 24. What the Future Holds • DevOps for Analytics • Search-Based BI (NLP) • Artificial Intelligence (AI) • Virtual Reality BI (VR) • Virtual Assistant BI (Voice) • Reporting/Predictions Converge • Citizen Data Scientists Emerge
  • 25. Joe Caserta President, Caserta Concepts joe@casertaconcepts.com @joe_caserta Thank You!
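
The "new way" on the New Data Paradigm slide (ingest, then analyze, then structure) can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not the approach used in the talk: the sample events, the function names `ingest`, `analyze`, and `structure`, and the `min_share` threshold are all hypothetical.

```python
import json
from collections import Counter

# Hypothetical raw events, landed exactly as received (no upfront schema).
RAW_EVENTS = [
    '{"user": "a1", "channel": "web", "amount": 20.0}',
    '{"user": "b2", "channel": "mobile"}',
    '{"user": "a1", "channel": "web", "amount": 5.5, "coupon": "SPRING"}',
]


def ingest(lines):
    """Step 1, ingest: parse each record and keep every field as-is."""
    return [json.loads(line) for line in lines]


def analyze(records):
    """Step 2, analyze: profile which fields occur and how often."""
    return Counter(field for record in records for field in record)


def structure(records, field_counts, min_share=0.5):
    """Step 3, structure: keep fields present in at least min_share of records.

    The threshold rule is an assumption made for this sketch.
    """
    threshold = min_share * len(records)
    columns = sorted(f for f, n in field_counts.items() if n >= threshold)
    rows = [tuple(record.get(c) for c in columns) for record in records]
    return columns, rows


records = ingest(RAW_EVENTS)
field_counts = analyze(records)
columns, rows = structure(records, field_counts)
print(columns)
print(rows[1])
```

The point of the ordering is that the column list is derived from what the data actually contains, rather than being fixed before the first record arrives.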

Editor's Notes

  1. Capture, Analyze, influence, and maximize every touchpoint online and offline
  2. Ask DG effectiveness questions.
  3. Recent article - Oct 21, 2015
  4. 80% of all businesses are doing something
  5. The paradigm shift is in the way we onboard and process data. Formerly, we structured data before we would ingest and analyze it. Now, we ingest and analyze data, and then structure it. This allows immediate access for both analysts and data scientists. Streamlines the path to the cash register. We have also moved from fixed capacity to on-demand infrastructure. Large datasets and new datasets are being added at a rapid rate. They could grow or shrink on demand; many of the providers are startups. This minimizes the cost of operation. From Monolith to Ecosystem: no one set of tools will solve everything. Use a diverse set of technologies, and let them evolve over time. Solve for this using a combination of three concepts: Cloud Computing, Data Lake, and the Polyglot Warehouse.
  6. Data has a different audience and usage pattern at each tier. All tiers work cohesively to comprise the Big Data Ecosystem. All tiers are governed. Only the top tier is fully governed. When to use late binding: decide when to structure on a case-by-case basis. 7 components of governance: Org, Metadata, Security, DQ, Business Integration, MDM, ILM. Organization: this is the ‘people’ part. Establishing Enterprise Data Council, Data Stewards, etc. Metadata: definitions, lineage (where does this data come from), business definitions, technical metadata. Privacy/Security: identify and control sensitive data, regulatory compliance. Data Quality and Monitoring: data must be complete and correct. Measure, improve, certify. Business Process Integration: policies around data frequency, source availability, etc. Master Data Management: ensure consistent business-critical data, i.e. Members, Providers, Agents, etc. Information Lifecycle Management (ILM): data retention, purge schedule, storage/archiving.
  7. https://clutch.co/cloud/resources/amazon-web-services-vs-google-cloud-platform-vs-microsoft-azure
  8. “Big Box” tools vs ROI? Prohibitively expensive → limited by licensing $$$. Typically limited to the scalability of a single server.
  9. Cascading, Zementis
  10. I’ve been doing it this way for 15 years. It works, don’t mess with it! People must learn: Evolution is inevitable. Evolve or die.
  11. Kurt Lewin’s Force Field analysis
  12. Data Governance Data Insight Generate Revenue Reduce Risk
  13. Over the course of my 30-year career, more change has occurred in the last three years than in the previous 27 combined. This has been the most disruptive period in data science that I’ve seen.
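
Note 6 says that all tiers are governed but only the top tier is fully governed, and it lists seven governance components. A minimal Python sketch of that idea follows. The tier names come from the Analytics Development Lifecycle slide, but the assignment of components to the lower tiers is an assumption made purely for illustration, as are the names `TIER_REQUIREMENTS`, `is_fully_governed`, and `missing_controls`.

```python
# The seven governance components named in note 6.
COMPONENTS = frozenset({
    "organization", "metadata", "security", "data_quality",
    "business_integration", "mdm", "ilm",
})

# ASSUMPTION: which components apply to each lower tier is illustrative only.
# The notes state just that every tier is governed and the top tier fully so.
TIER_REQUIREMENTS = {
    "landing_area": frozenset({"metadata", "security", "ilm"}),
    "data_lake": frozenset({"metadata", "security", "ilm", "organization"}),
    "data_science_workspace": frozenset(
        {"metadata", "security", "ilm", "organization", "data_quality"}
    ),
    "big_data_warehouse": COMPONENTS,  # top tier: fully governed
}


def is_fully_governed(tier):
    """A tier is fully governed when it requires all seven components."""
    return TIER_REQUIREMENTS[tier] == COMPONENTS


def missing_controls(tier, controls_in_place):
    """Return the required components a dataset still lacks for the given tier."""
    return sorted(TIER_REQUIREMENTS[tier] - set(controls_in_place))


print(is_fully_governed("big_data_warehouse"))
print(missing_controls("landing_area", ["metadata"]))
```

A check like `missing_controls` could be run before promoting a dataset to a higher tier, which is one way to make "just enough governance" concrete without blocking ingestion.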