© Cloudera, Inc. All rights reserved.
The Future of Data Warehousing: ETL Will Never Be the Same
Ralph Kimball | Founder, Kimball Group
Manish Vipani | Vice President and Chief Architect, Kaiser Permanente
Hadoop’s impact on data warehousing
• Traditional DBMS stack exploded into separate layers
• Data layer: HDFS files, not curated relational tables
• Metadata layer: open extensible HCatalog, not vendor system tables
• Query layer: cottage industry of query engines, not vendor specific SQL
• Schema on Read
• Allow the query layer to decide how to consume the data
• Materialize the view later (e.g., into Parquet files) for high performance
Integration goes far beyond relational tables
• Conformed dimensions remain the glue holding together Hadoop applications
(even if you have never heard of conformed dimensions!)
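A minimal Python sketch of the schema-on-read idea above: the raw bytes are landed once, untouched, and each reader applies its own schema at query time. The sample records, both schemas, and the `read_with_schema` helper are hypothetical illustrations, not part of the original talk.

```python
import csv
import io

# Raw landed data: stored exactly as received, with no schema enforced at write time.
RAW_LINES = """\
1001,2015-03-01,login,kp.org
1002,2015-03-01,refill,mobile
1001,2015-03-02,refill,kp.org
"""

# Two hypothetical read-time schemas over the same raw bytes.
# Each entry is (field name, cast function); a name of None means "skip this field".
EVENT_SCHEMA = [("member_id", int), ("event_date", str), ("event", str), ("channel", str)]
CHANNEL_SCHEMA = [(None, None), (None, None), (None, None), ("channel", str)]


def read_with_schema(raw_text, schema):
    """Apply a schema while reading; the raw text itself is never modified."""
    rows = []
    for fields in csv.reader(io.StringIO(raw_text)):
        row = {}
        for value, (name, cast) in zip(fields, schema):
            if name is not None:
                row[name] = cast(value)
        rows.append(row)
    return rows


# Two consumers, two views, one copy of the data.
events = read_with_schema(RAW_LINES, EVENT_SCHEMA)
channels = read_with_schema(RAW_LINES, CHANNEL_SCHEMA)
```

Materializing a view later (for example into Parquet files) would amount to persisting the output of one such reader once it proves useful.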
The logical architecture hasn’t changed
• Original Sources → ETL Step → Exposed Presentation Data → BI Application
• BUT, the physical architecture of the back room now looks very different
Old back room
• Slow transfer from sources
• Physical transformations required
• Cleaning, normalization required
• Mandated RDBMS table targets
• Metadata limited to system tables
• Presentation layer vendor mandated
• Single focus: RDBMS SQL only
New back room
• Purpose built for high transfer rates
• Physical transformations optional
• Cleaning, normalization discouraged
• Table targets optional or deferred
• Extensible metadata via HCatalog
• Presentation layer open ended
• Before or after any transformations
• Analytic client specific
• Multiple simultaneous personalities
The old and new back rooms
Old back room
• Off limits except to ETL staff
• “we aren’t ready”
• “the data must be cleaned”
• “data governance trumps”
• “end users not trusted”
• Traditional IT control
New back room
• Doors open to
• Qualified analytic users
• Automated processes
• Experiments, model building
• Clients other than SQL
• Open data marketplace
The biggest change to the back room
The Landing Zone
at Kaiser Permanente
Implementing the new ETL approach in the real world.
A unified data repository for secure and trusted data.
Landing Zone
Landing Zone – Home to secure and organized data
• A self-service data platform hosting both raw and prepared data sets for quick business consumption, to drive advanced business insights and decisions.
• Allows seamless data access for authorized users across enterprise business functions.
• Data is organized by information domain in the Raw Zone and by use case in the Refined Zone.
• Perimeter security, with data encrypted at rest.
• Kerberized, with integration into the Identity and Access Management system.
Parts of Landing Zone
• Raw Zone -> Exact replica of source data.
• Refined Zone -> Transformed prepared data sets organized by use cases.
• User Defined Space -> Secure and common access to raw and trusted data.
• Master Data, Metadata, Internal Reference Data, Industry Reference Data, etc…
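The Raw Zone / Refined Zone split can be sketched in Python as follows. This is only an illustration under stated assumptions: the `landing_zone` dict is an in-memory stand-in for what would be HDFS directories, and the domain, source, and use-case names are hypothetical.

```python
import copy
from collections import defaultdict

# Hypothetical in-memory stand-in for the Landing Zone; real zones would live on HDFS.
landing_zone = {"raw": {}, "refined": {}}


def land_raw(domain, source_name, records):
    """Raw Zone: keep an exact, untransformed copy of the source, filed by domain."""
    landing_zone["raw"].setdefault(domain, {})[source_name] = copy.deepcopy(records)


def refine_visits_per_member(domain, source_name, use_case):
    """Refined Zone: derive a prepared data set for one use case.

    The raw copy is only read, never modified.
    """
    counts = defaultdict(int)
    for record in landing_zone["raw"][domain][source_name]:
        counts[record["member_id"]] += 1
    landing_zone["refined"][use_case] = dict(counts)
    return landing_zone["refined"][use_case]


source = [
    {"member_id": "m1", "page": "home"},
    {"member_id": "m2", "page": "rx"},
    {"member_id": "m1", "page": "rx"},
]
land_raw("online_services", "web_log", source)
summary = refine_visits_per_member("online_services", "web_log", "member_utilization")
```

Because the raw copy stays intact, a refined data set can be rebuilt or replaced at any time without going back to the source system.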
Landing Zone
[Figure: Landing Zone architecture. Labels: Source Data; Replicate; Data Load (Extract-Load, Copy); HDFS holding the Raw Zone, Refined Zone, and User Defined Space; Master Data, Metadata, Internal Reference Data, Industry Reference Data, Usage Data; Data Registry (Tags & Catalog); access tools SQL, Java, Pig, Hive, Python, Impala; Data Selection; Exploratory Intelligence (Analyze, Mine, Refine, Discover); Data Extract to DW/DM; Perimeter Security; Access Authentication; Role Based Access Control; All Data Encrypted @ Rest.]
Landing Zone: a self-service data platform hosting both raw and prepared data sets for quick business consumption.
Data Security:
• Deployed on a secured network with traffic monitoring.
• Data is encrypted at rest.
• Role-based access and authorization.
Data Organization:
• Exact replica of source data, organized by information domain, in the Raw Zone.
• Data organized by use case in the Refined Zone (transformed, prepared data sets).
• Separate area allocated to track master data, metadata, internal reference data, and industry-specific reference data sets.
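Role-based access to the zones can be illustrated with a small Python check. The role names and grants below are invented for the example and say nothing about Kaiser Permanente's actual policy; a real cluster would delegate this decision to its authorization service rather than to a dictionary.

```python
# Hypothetical role-to-zone grants, expressed as (zone, action) pairs.
ROLE_GRANTS = {
    "etl_service": {("raw", "read"), ("raw", "write"), ("refined", "read"), ("refined", "write")},
    "analyst": {("refined", "read"), ("user_space", "read"), ("user_space", "write")},
    "data_scientist": {("raw", "read"), ("refined", "read"),
                       ("user_space", "read"), ("user_space", "write")},
}


def is_allowed(roles, zone, action):
    """Return True if any of the user's roles grants the action on the zone."""
    return any((zone, action) in ROLE_GRANTS.get(role, set()) for role in roles)


# An analyst may read refined data but not the raw replica in this sketch.
analyst_reads_refined = is_allowed(["analyst"], "refined", "read")
analyst_reads_raw = is_allowed(["analyst"], "raw", "read")
```

Unknown roles grant nothing, so the check denies by default.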
The ETL Revolution Poses
Significant Challenges
Some old, some new
Old challenges we’ve seen before
• Big data world is furiously implementing stovepipes
• Good news is the excitement of new data sources and analyses
• Bad news is that integration is being ignored; the fix is to start over
• New departments not seen with traditional data warehousing
• Not on anyone’s radar → rolling their own systems
• Unusual business user profiles, latency demands, security lapses
• Big speed bumps when replacing old systems with new
• Users don’t want to switch
• New results don’t match old results
• Legacy hardware and software absurdly expensive, doesn’t scale reasonably
New challenges needing inventive approaches
• Traditional BI decision makers joined by
• Data scientists
• Roll their own ETL, hardware, OSs, programming languages
• Take results to senior management directly
• Don’t stick around for documentation, rollout, user support, maintenance
• Predictive models and modelers
• Constantly changing schemas
• Tricky integration, e.g., joining relational tables to HBase
• Automatic daemons
• Enormous, bursty demand for computing resources
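The "tricky integration" point (joining relational tables to HBase) can be sketched as a join on a shared, conformed key. In this Python illustration a list of dicts stands in for the relational table and a plain dict stands in for an HBase-style key-value table; no HBase client API is used, and all table, column, and key names are hypothetical.

```python
# Relational side: rows from a hypothetical member dimension table.
member_dim = [
    {"member_key": "M001", "region": "NorCal"},
    {"member_key": "M002", "region": "SoCal"},
]

# Key-value side: a dict standing in for an HBase-style table, keyed by row key,
# with "family:qualifier" column names and string values.
device_kv = {
    "M001": {"d:steps": "8200", "d:device": "fitbit"},
    "M003": {"d:steps": "4100", "d:device": "fitbit"},
}


def join_on_conformed_key(dim_rows, kv_store, key="member_key"):
    """Inner-join relational rows to key-value rows that share the same key."""
    joined = []
    for row in dim_rows:
        cells = kv_store.get(row[key])
        if cells is None:
            continue  # no matching row key: dropped, as in an inner join
        merged = dict(row)
        # Strip the column family prefix ("d:") to get plain attribute names.
        merged.update({col.split(":", 1)[1]: val for col, val in cells.items()})
        joined.append(merged)
    return joined


result = join_on_conformed_key(member_dim, device_kv)
```

The join only works because both sides use the same member key with the same meaning, which is the role a conformed dimension plays.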
Kaiser Permanente’s Pragmatic
Response to the Challenges
Pain Points:
• Lack of a transient store for users and of structural flexibility, due to slow adaptation to change.
• Lack of ability to do analytics and hypothesis testing of new data from disparate systems.
Successes:
• More than 10 proven use cases with some early adopters.
Landing Zone use cases
Problem
• Lack of insight into the factors influencing members’ adoption and utilization of online services.
• Lack of data integration and correlation due to disparate systems.
• Lack of a 360° view of member service utilization, and of dashboards.
Resolution
• Summarized and aggregated data sets in the Landing Zone support improved decision making.
• Faster and complete access to data at scale for metrics reporting and analytics.
• Reduced data collection & metric reporting time from 3 weeks to 10 hours.
• Ease of building “decision-centric” dashboards (8 in 3 months).
Online Member Services – “kp.org”
Landing Zone use cases cont…
Problem
• Commercial large-scale data warehouse (Teradata) repository is expensive at scale, grows
exponentially, and processes large volumes of queries/month.
• Continuing workload tuning efforts are slow to yield expected results.
Resolution
• Replicate data from Teradata into Landing Zone.
• Rewrite and tune queries, eliminating semantically equivalent ones, to achieve better performance.
Moving Traditional Data Warehouse Workload to Landing Zone
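One way to start finding redundant queries in a migrated workload is to normalize the query text and group identical results. The Python sketch below is deliberately crude: it only catches queries that differ in whitespace, letter case, or a trailing semicolon, and lowercasing would also alter string literals. True semantic equivalence needs a SQL parser or optimizer, and nothing here reflects the method actually used at Kaiser Permanente. The sample queries are hypothetical.

```python
import re
from collections import defaultdict


def normalize_sql(query):
    """Textual normalization only: collapse whitespace, drop a trailing ';', lowercase."""
    text = re.sub(r"\s+", " ", query.strip())
    text = text.rstrip(";").strip()
    return text.lower()


def group_equivalent(queries):
    """Group queries whose normalized text is identical; keep groups of two or more."""
    groups = defaultdict(list)
    for query in queries:
        groups[normalize_sql(query)].append(query)
    return {key: members for key, members in groups.items() if len(members) > 1}


workload = [
    "SELECT member_id, COUNT(*) FROM visits GROUP BY member_id;",
    "select member_id,  count(*)\nfrom visits\ngroup by member_id",
    "SELECT region FROM members",
]
duplicates = group_equivalent(workload)
```

Each group in `duplicates` is a candidate for replacing several queries with one shared, tuned query or a materialized result.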
Problem
• Lack of a platform to collect and correlate structured and unstructured data from consumer-facing health monitoring devices (e.g., Fitbit, glucometers).
• Clinicians cannot track members’ health or weight goals, and see usage patterns.
Resolution
• Ingest transactional data and device logs into landing zone and create analytics workspace.
• Enable clinicians to generate aggregated data for tracking member adherence and build
dashboards using native tools.
Digital Services Dashboard – “Interchange”
Landing Zone use cases cont…
Problem
• Sequential, fragmented processes have limited ability to enrich data sources to increase accuracy.
• Lack of clinical and analytical views increases lead time to analysis and produces inconsistent results.
Resolution
• Ingest data from fragmented systems into the Landing Zone.
• Create program-wide clinical and analytical views, with refresh time cut from 18 hours to 7 hours.
Common Clinical and Analytical Views
Problem
• Current Medicare reporting solution does not maintain history and requires significant effort to
recreate prior reports and perform trend analysis.
• Externally hosted CIMP systems are cost-prohibitive and difficult to scale.
Resolution
• Replicate data from 30+ source systems into Landing Zone providing access to data internally.
• Rebuild reports with improved performance so that they run within a reasonable time at scale.
• Proved versatility of platform to handle data at scale and created equivalent reports.
Consumer Information Management Platform – CIMP 2.0
Architectural Wrap-Up
What does all this mean?
Kaiser Permanente is a work in progress
with impressive early results,
and insights for moving forward
• Be the single source of all Kaiser’s data as well as external data leveraged by Kaiser applications,
processes, and for Kaiser decision making.
• “Learn and adapt” model provides common capabilities across rich data set, with increased agility in
provisioning new data sets.
• Enabling data profiling / tagging, semantic search, descriptive, predictive and prescriptive analytics to
drive advanced business insights and decisions.
The Back Room Landing Zone has become
a Vibrant Marketplace
• Replaces the quiet ETL back room
• Challenging (exciting) new service role for IT
• Open for business
• Data scientists → A/B testing → experimentation → prototyping
• Simultaneous ETL pipelines → aggregates, high-performance Parquet files, uploads to EDW
• Simultaneous SQL and non-SQL clients
• Immediate access
• Don’t wait for physical transformation → schema-on-read
• Purpose built for extreme I/O performance
Thank you
Ralph Kimball, ralphcollector@gmail.com
Manish Vipani, manish.x.vipani@kp.org

More Related Content

What's hot

Data Architecture Strategies: Data Architecture for Digital Transformation
Data Architecture Strategies: Data Architecture for Digital TransformationData Architecture Strategies: Data Architecture for Digital Transformation
Data Architecture Strategies: Data Architecture for Digital TransformationDATAVERSITY
 
Data Architecture for Data Governance
Data Architecture for Data GovernanceData Architecture for Data Governance
Data Architecture for Data GovernanceDATAVERSITY
 
Data Governance Takes a Village (So Why is Everyone Hiding?)
Data Governance Takes a Village (So Why is Everyone Hiding?)Data Governance Takes a Village (So Why is Everyone Hiding?)
Data Governance Takes a Village (So Why is Everyone Hiding?)DATAVERSITY
 
How to Build & Sustain a Data Governance Operating Model
How to Build & Sustain a Data Governance Operating Model How to Build & Sustain a Data Governance Operating Model
How to Build & Sustain a Data Governance Operating Model DATUM LLC
 
Data quality architecture
Data quality architectureData quality architecture
Data quality architectureanicewick
 
Data Management, Metadata Management, and Data Governance – Working Together
Data Management, Metadata Management, and Data Governance – Working TogetherData Management, Metadata Management, and Data Governance – Working Together
Data Management, Metadata Management, and Data Governance – Working TogetherDATAVERSITY
 
Data Quality Management: Cleaner Data, Better Reporting
Data Quality Management: Cleaner Data, Better ReportingData Quality Management: Cleaner Data, Better Reporting
Data Quality Management: Cleaner Data, Better Reportingaccenture
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?DATAVERSITY
 
Data Governance and Data Science to Improve Data Quality
Data Governance and Data Science to Improve Data QualityData Governance and Data Science to Improve Data Quality
Data Governance and Data Science to Improve Data QualityDATAVERSITY
 
Keeping the Pulse of Your Data – Why You Need Data Observability to Improve D...
Keeping the Pulse of Your Data – Why You Need Data Observability to Improve D...Keeping the Pulse of Your Data – Why You Need Data Observability to Improve D...
Keeping the Pulse of Your Data – Why You Need Data Observability to Improve D...DATAVERSITY
 
Enterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data ArchitectureEnterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data ArchitectureDATAVERSITY
 
How to Build Data Governance Programs That Last: A Business-First Approach
How to Build Data Governance Programs That Last: A Business-First ApproachHow to Build Data Governance Programs That Last: A Business-First Approach
How to Build Data Governance Programs That Last: A Business-First ApproachPrecisely
 
DAS Slides: Building a Data Strategy – Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy – Practical Steps for Aligning with Busi...DAS Slides: Building a Data Strategy – Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy – Practical Steps for Aligning with Busi...DATAVERSITY
 
Data Governance Best Practices, Assessments, and Roadmaps
Data Governance Best Practices, Assessments, and RoadmapsData Governance Best Practices, Assessments, and Roadmaps
Data Governance Best Practices, Assessments, and RoadmapsDATAVERSITY
 
Building a strong Data Management capability with TOGAF and ArchiMate
Building a strong Data Management capability with TOGAF and ArchiMateBuilding a strong Data Management capability with TOGAF and ArchiMate
Building a strong Data Management capability with TOGAF and ArchiMateBas van Gils
 
Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Denodo
 
Data Governance Best Practices
Data Governance Best PracticesData Governance Best Practices
Data Governance Best PracticesBoris Otto
 
MDM for Customer data with Talend
MDM for Customer data with Talend MDM for Customer data with Talend
MDM for Customer data with Talend Jean-Michel Franco
 

What's hot (20)

Data Architecture Strategies: Data Architecture for Digital Transformation
Data Architecture Strategies: Data Architecture for Digital TransformationData Architecture Strategies: Data Architecture for Digital Transformation
Data Architecture Strategies: Data Architecture for Digital Transformation
 
Data Architecture for Data Governance
Data Architecture for Data GovernanceData Architecture for Data Governance
Data Architecture for Data Governance
 
Data Governance Takes a Village (So Why is Everyone Hiding?)
Data Governance Takes a Village (So Why is Everyone Hiding?)Data Governance Takes a Village (So Why is Everyone Hiding?)
Data Governance Takes a Village (So Why is Everyone Hiding?)
 
How to Build & Sustain a Data Governance Operating Model
How to Build & Sustain a Data Governance Operating Model How to Build & Sustain a Data Governance Operating Model
How to Build & Sustain a Data Governance Operating Model
 
Data Mesh
Data MeshData Mesh
Data Mesh
 
Data quality architecture
Data quality architectureData quality architecture
Data quality architecture
 
Data Management, Metadata Management, and Data Governance – Working Together
Data Management, Metadata Management, and Data Governance – Working TogetherData Management, Metadata Management, and Data Governance – Working Together
Data Management, Metadata Management, and Data Governance – Working Together
 
Data Quality Management: Cleaner Data, Better Reporting
Data Quality Management: Cleaner Data, Better ReportingData Quality Management: Cleaner Data, Better Reporting
Data Quality Management: Cleaner Data, Better Reporting
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?
 
Data Strategy
Data StrategyData Strategy
Data Strategy
 
Data Governance and Data Science to Improve Data Quality
Data Governance and Data Science to Improve Data QualityData Governance and Data Science to Improve Data Quality
Data Governance and Data Science to Improve Data Quality
 
Keeping the Pulse of Your Data – Why You Need Data Observability to Improve D...
Keeping the Pulse of Your Data – Why You Need Data Observability to Improve D...Keeping the Pulse of Your Data – Why You Need Data Observability to Improve D...
Keeping the Pulse of Your Data – Why You Need Data Observability to Improve D...
 
Enterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data ArchitectureEnterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data Architecture
 
How to Build Data Governance Programs That Last: A Business-First Approach
How to Build Data Governance Programs That Last: A Business-First ApproachHow to Build Data Governance Programs That Last: A Business-First Approach
How to Build Data Governance Programs That Last: A Business-First Approach
 
DAS Slides: Building a Data Strategy – Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy – Practical Steps for Aligning with Busi...DAS Slides: Building a Data Strategy – Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy – Practical Steps for Aligning with Busi...
 
Data Governance Best Practices, Assessments, and Roadmaps
Data Governance Best Practices, Assessments, and RoadmapsData Governance Best Practices, Assessments, and Roadmaps
Data Governance Best Practices, Assessments, and Roadmaps
 
Building a strong Data Management capability with TOGAF and ArchiMate
Building a strong Data Management capability with TOGAF and ArchiMateBuilding a strong Data Management capability with TOGAF and ArchiMate
Building a strong Data Management capability with TOGAF and ArchiMate
 
Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)
 
Data Governance Best Practices
Data Governance Best PracticesData Governance Best Practices
Data Governance Best Practices
 
MDM for Customer data with Talend
MDM for Customer data with Talend MDM for Customer data with Talend
MDM for Customer data with Talend
 

Viewers also liked

Creating a Modern Data Architecture
Creating a Modern Data ArchitectureCreating a Modern Data Architecture
Creating a Modern Data ArchitectureZaloni
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSINGKing Julian
 
Data Warehousing and Data Mining
Data Warehousing and Data MiningData Warehousing and Data Mining
Data Warehousing and Data Miningidnats
 
Achieving Business Value by Fusing Hadoop and Corporate Data
Achieving Business Value by Fusing Hadoop and Corporate DataAchieving Business Value by Fusing Hadoop and Corporate Data
Achieving Business Value by Fusing Hadoop and Corporate DataInside Analysis
 
To Study E T L ( Extract, Transform, Load) Tools Specially S Q L Server I...
To Study  E T L ( Extract, Transform, Load) Tools Specially  S Q L  Server  I...To Study  E T L ( Extract, Transform, Load) Tools Specially  S Q L  Server  I...
To Study E T L ( Extract, Transform, Load) Tools Specially S Q L Server I...Shahzad
 
Ods, edf, eav & global types
Ods, edf, eav & global typesOds, edf, eav & global types
Ods, edf, eav & global typesSTIinnsbruck
 
Computer science __engineering(4)
Computer science __engineering(4)Computer science __engineering(4)
Computer science __engineering(4)vasanthak2k
 
White paper making an-operational_data_store_(ods)_the_center_of_your_data_...
White paper   making an-operational_data_store_(ods)_the_center_of_your_data_...White paper   making an-operational_data_store_(ods)_the_center_of_your_data_...
White paper making an-operational_data_store_(ods)_the_center_of_your_data_...Eric Javier Espino Man
 
Data Mining: Concepts and Techniques — Chapter 2 —
Data Mining:  Concepts and Techniques — Chapter 2 —Data Mining:  Concepts and Techniques — Chapter 2 —
Data Mining: Concepts and Techniques — Chapter 2 —Salah Amean
 
Chapter - 4 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 4 Data Mining Concepts and Techniques 2nd Ed slides Han & KamberChapter - 4 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 4 Data Mining Concepts and Techniques 2nd Ed slides Han & Kambererror007
 
Data Warehouse Back to Basics: Dimensional Modeling
Data Warehouse Back to Basics: Dimensional ModelingData Warehouse Back to Basics: Dimensional Modeling
Data Warehouse Back to Basics: Dimensional ModelingDunn Solutions Group
 
Dw & etl concepts
Dw & etl conceptsDw & etl concepts
Dw & etl conceptsjeshocarme
 
Agile Data Warehousing: Using SDDM to Build a Virtualized ODS
Agile Data Warehousing: Using SDDM to Build a Virtualized ODSAgile Data Warehousing: Using SDDM to Build a Virtualized ODS
Agile Data Warehousing: Using SDDM to Build a Virtualized ODSKent Graziano
 

Viewers also liked (20)

Creating a Modern Data Architecture
Creating a Modern Data ArchitectureCreating a Modern Data Architecture
Creating a Modern Data Architecture
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
DATA WAREHOUSING AND DATA MINING
DATA WAREHOUSING AND DATA MININGDATA WAREHOUSING AND DATA MINING
DATA WAREHOUSING AND DATA MINING
 
Data Warehousing and Data Mining
Data Warehousing and Data MiningData Warehousing and Data Mining
Data Warehousing and Data Mining
 
Hpc 4 5
Hpc 4 5Hpc 4 5
Hpc 4 5
 
Achieving Business Value by Fusing Hadoop and Corporate Data
Achieving Business Value by Fusing Hadoop and Corporate DataAchieving Business Value by Fusing Hadoop and Corporate Data
Achieving Business Value by Fusing Hadoop and Corporate Data
 
H0114857
H0114857H0114857
H0114857
 
To Study E T L ( Extract, Transform, Load) Tools Specially S Q L Server I...
To Study  E T L ( Extract, Transform, Load) Tools Specially  S Q L  Server  I...To Study  E T L ( Extract, Transform, Load) Tools Specially  S Q L  Server  I...
To Study E T L ( Extract, Transform, Load) Tools Specially S Q L Server I...
 
Ods, edf, eav & global types
Ods, edf, eav & global typesOds, edf, eav & global types
Ods, edf, eav & global types
 
SAP HORTONWORKS
SAP HORTONWORKSSAP HORTONWORKS
SAP HORTONWORKS
 
Computer science __engineering(4)
Computer science __engineering(4)Computer science __engineering(4)
Computer science __engineering(4)
 
Periyar msc
Periyar mscPeriyar msc
Periyar msc
 
White paper making an-operational_data_store_(ods)_the_center_of_your_data_...
White paper   making an-operational_data_store_(ods)_the_center_of_your_data_...White paper   making an-operational_data_store_(ods)_the_center_of_your_data_...
White paper making an-operational_data_store_(ods)_the_center_of_your_data_...
 
Apresentação ODS
Apresentação ODSApresentação ODS
Apresentação ODS
 
Data Mining: Concepts and Techniques — Chapter 2 —
Data Mining:  Concepts and Techniques — Chapter 2 —Data Mining:  Concepts and Techniques — Chapter 2 —
Data Mining: Concepts and Techniques — Chapter 2 —
 
Chapter - 4 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 4 Data Mining Concepts and Techniques 2nd Ed slides Han & KamberChapter - 4 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 4 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
 
Data Warehouse Back to Basics: Dimensional Modeling
Data Warehouse Back to Basics: Dimensional ModelingData Warehouse Back to Basics: Dimensional Modeling
Data Warehouse Back to Basics: Dimensional Modeling
 
ETL QA
ETL QAETL QA
ETL QA
 
Dw & etl concepts
Dw & etl conceptsDw & etl concepts
Dw & etl concepts
 
Agile Data Warehousing: Using SDDM to Build a Virtualized ODS
Agile Data Warehousing: Using SDDM to Build a Virtualized ODSAgile Data Warehousing: Using SDDM to Build a Virtualized ODS
Agile Data Warehousing: Using SDDM to Build a Virtualized ODS
 

Similar to The Future of Data Warehousing is Here

The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationDATAVERSITY
 
Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Precisely
 
Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Cloudera, Inc.
 
Data Warehouse Optimization
Data Warehouse OptimizationData Warehouse Optimization
Data Warehouse OptimizationCloudera, Inc.
 
Harness the power of Data in a Big Data Lake
Harness the power of Data in a Big Data LakeHarness the power of Data in a Big Data Lake
Harness the power of Data in a Big Data LakeSaurabh K. Gupta
 
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...DATAVERSITY
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...Cloudera, Inc.
 
Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Cloudera, Inc.
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduCloudera, Inc.
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
Achieve data democracy in data lake with data integration
Achieve data democracy in data lake with data integration Achieve data democracy in data lake with data integration
Achieve data democracy in data lake with data integration Saurabh K. Gupta
 
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesHadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesCloudera, Inc.
 
Using Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeUsing Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeDATAVERSITY
 
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...MapR Technologies
 
Big Data/Cloudera from Excelerate Systems
Big Data/Cloudera from Excelerate SystemsBig Data/Cloudera from Excelerate Systems
Big Data/Cloudera from Excelerate SystemsDavid Bennett
 
Oracle Database Appliance X5-2
Oracle Database Appliance X5-2 Oracle Database Appliance X5-2
Oracle Database Appliance X5-2 Yasir El Nimr
 
Chapter 11 Enterprise Resource Planning System
Chapter 11 Enterprise Resource Planning SystemChapter 11 Enterprise Resource Planning System
Chapter 11 Enterprise Resource Planning SystemMuhammad Azmy
 
PayPal Decision Management Architecture
PayPal Decision Management ArchitecturePayPal Decision Management Architecture
PayPal Decision Management ArchitecturePradeep Ballal
 

Similar to The Future of Data Warehousing is Here (20)

The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data Integration
 
Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?
 
Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18
 
Data Warehouse Optimization
Data Warehouse OptimizationData Warehouse Optimization
Data Warehouse Optimization
 
Harness the power of Data in a Big Data Lake
Harness the power of Data in a Big Data LakeHarness the power of Data in a Big Data Lake
Harness the power of Data in a Big Data Lake
 
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
The Future of Data Warehousing is Here

  • 1. The Future of Data Warehousing: ETL Will Never be the Same
      Ralph Kimball | Founder, Kimball Group
      Manish Vipani | Vice President and Chief Architect, Kaiser Permanente
      © Cloudera, Inc. All rights reserved.
  • 2. Hadoop’s impact on data warehousing
      • Traditional DBMS stack exploded into separate layers
      • Data layer: HDFS files, not curated relational tables
      • Metadata layer: open extensible HCatalog, not vendor system tables
      • Query layer: cottage industry of query engines, not vendor specific SQL
      • Schema on Read
      • Allow the query layer to decide how to consume the data
      • Materialize the view later (e.g., into Parquet files) for high performance
      • Integration goes far beyond relational tables
      • Conformed dimensions remain the glue holding together Hadoop applications (even if you have never heard of conformed dimensions!)
  • 3. The logical architecture hasn’t changed
      • Original Sources → ETL Step → Exposed Presentation Data → BI Application
      • BUT, the physical architecture of the back room now looks very different
  • 4. The old and new back rooms
      Old back room
      • Slow transfer from sources
      • Physical transformations required
      • Cleaning, normalization required
      • Mandated RDBMS table targets
      • Metadata limited to system tables
      • Presentation layer vendor mandated
      • Single focus: RDBMS SQL only
      New back room
      • Purpose built for high transfer rates
      • Physical transformations optional
      • Cleaning, normalization discouraged
      • Table targets optional or deferred
      • Extensible metadata via HCatalog
      • Presentation layer open ended
      • Before or after any transformations
      • Analytic client specific
      • Multiple simultaneous personalities
  • 5. The biggest change to the back room
      Old back room
      • Off limits except to ETL staff
      • “we aren’t ready”
      • “the data must be cleaned”
      • “data governance trumps”
      • “end users not trusted”
      • Traditional IT control
      New back room
      • Doors open to
      • Qualified analytic users
      • Automated processes
      • Experiments, model building
      • Clients other than SQL
      • Open data marketplace
  • 6. The Landing Zone at Kaiser Permanente
      Implementing the new ETL approach in the real world. A unified data repository for secure and trusted data.
  • 7. Landing Zone – Home to secure and organized data
      • A self-service data platform hosting both the raw and prepared data sets for quick business consumption to drive advanced business insights and decisions.
      • Allows seamless data access for authorized users across enterprise business functions.
      • Data is organized by domains/use cases in the Raw and Refined Zones.
      • Perimeter security with data encrypted at rest.
      • Kerberized, with integration to the Identity and Access Management system.
      Parts of Landing Zone
      • Raw Zone -> Exact replica of source data.
      • Refined Zone -> Transformed, prepared data sets organized by use cases.
      • User Defined Space -> Secure and common access to raw and trusted data.
      • Master Data, Metadata, Internal Reference Data, Industry Reference Data, etc.
  • 8. Landing Zone – A Self Service Data Platform hosting both the raw and prepared data sets for quick business consumption.
      [Diagram: Landing Zone architecture. Source data is replicated (extract-load copy) into HDFS, which holds the Raw Zone, Refined Zone, User Defined Space, usage data, master data, metadata, internal reference data and industry reference data, with a data registry (tags & catalog). Access goes through authentication, role-based access control and perimeter security; all data is encrypted at rest. Access tools shown: SQL, Java, Pig, Hive, Python, Impala. Exploratory intelligence stages: Discover, Refine, Mine, Analyze. Data extract feeds the DW/DM.]
      • Data Security –
          • Deployed on secured network with traffic monitoring.
          • Data is encrypted at rest.
          • Role based access and authorization.
      • Data Organization –
          • Exact replica of source data organized by information domains in Raw Zone.
          • Data organized by use cases in the Refined Zone (transformed prepared data sets).
          • Separate area allocated to track master data, metadata, internal reference data & industry specific reference data sets.
  • 9. The ETL Revolution Poses Significant Challenges
      Some old, some new
  • 10. Old challenges we’ve seen before
      • Big data world is furiously implementing stovepipes
      • Good news is the excitement of new data sources and analyses
      • Bad news is ignoring integration, the fix is to start over
      • New departments not seen with traditional data warehousing
      • Not on anyone’s radar → rolling their own systems
      • Unusual business user profiles, latency demands, security lapses
      • Big speed bumps when replacing old systems with new
      • Users don’t want to switch
      • New results don’t match old results
      • Legacy hardware and software absurdly expensive, doesn’t scale reasonably
  • 11. New challenges needing inventive approaches
      • Traditional BI decision makers joined by
      • Data scientists
      • Roll their own ETL, hardware, OSs, programming languages
      • Take results to senior management directly
      • Don’t stick around for documentation, rollout, user support, maintenance
      • Predictive models and modelers
      • Constantly changing schemas
      • Tricky integration, e.g., joining relational tables to HBase
      • Automatic daemons
      • Enormous, bursty demand for computing resources
  • 12. Kaiser Permanente’s Pragmatic Response to the Challenges
      Pain Points:
      • Lack of user transient store and structural flexibility due to slow adaptation to changes.
      • Lack of ability to do analytics and hypothesis testing of new data from disparate systems.
      Successes:
      • 10+ proven use cases with some early adopters.
  • 13. Landing Zone use cases
      Online Member Services – “kp.org”
      Problem
      • Lack of insight into the factors influencing members’ adoption and utilization of online services.
      • Lack of data integration and correlation due to disparate systems.
      • Lack of a 360° member service utilization view and dashboards.
      Resolution
      • Summarized and aggregated data sets in the landing zone help improve decision making.
      • Faster and complete access to data at scale for metrics reporting and analytics.
      • Reduced data collection & metric reporting time from 3 weeks to 10 hours.
      • Ease of building “decision-centric” dashboards (8 in 3 months).
  • 14. Landing Zone use cases cont…
      Moving Traditional Data Warehouse Workload to Landing Zone
      Problem
      • Commercial large-scale data warehouse (Teradata) repository is expensive at scale, grows exponentially, and processes large volumes of queries/month.
      • Continuing workload tuning efforts are slow to yield expected results.
      Resolution
      • Replicate data from Teradata into Landing Zone.
      • Rewrite and tune queries to eliminate semantically equivalent queries to achieve better performance.
      Digital Services Dashboard – “Interchange”
      Problem
      • Lack of platform to collect and correlate structured and unstructured data from consumer facing health monitoring devices, e.g., Fitbit, Glucometer, etc.
      • Clinicians cannot track members’ health or weight goals, and see usage patterns.
      Resolution
      • Ingest transactional data and device logs into landing zone and create analytics workspace.
      • Enable clinicians to generate aggregated data for tracking member adherence and build dashboards using native tools.
  • 15. Landing Zone use cases cont…
      Common Clinical and Analytical Views
      Problem
      • Sequential and fragmented processes with limited ability to enrich data sources to increase accuracy.
      • Lack of clinical and analytical views increases lead time to analysis and produces inconsistent results.
      Resolution
      • Ingest data from fragmented systems into the Landing Zone.
      • Created program-wide clinical and analytical views, with refresh time reduced from 18 hours to 7 hours.
      Consumer Information Management Platform – CIMP 2.0
      Problem
      • Current Medicare reporting solution does not maintain history and requires significant effort to recreate prior reports and perform trend analysis.
      • Externally hosted CIMP systems are cost-prohibitive and difficult to scale.
      Resolution
      • Replicate data from 30+ source systems into Landing Zone, providing access to data internally.
      • Rebuild reports with improved performance that run within reasonable time at scale.
      • Proved versatility of platform to handle data at scale and created equivalent reports.
  • 16. Architectural Wrap-Up
      What does all this mean?
  • 17. Kaiser Permanente is a work in progress with impressive early results, and insights for moving forward
      • Be the single source of all Kaiser’s data as well as external data leveraged by Kaiser applications, processes, and for Kaiser decision making.
      • “Learn and adapt” model provides common capabilities across rich data set, with increased agility in provisioning new data sets.
      • Enabling data profiling / tagging, semantic search, descriptive, predictive and prescriptive analytics to drive advanced business insights and decisions.
  • 18. The Back Room Landing Zone has become a Vibrant Marketplace
      • Replaces the quiet ETL back room
      • Challenging (exciting) new service role for IT
      • Open for business
      • Data scientists → A/B testing → experimentation → prototyping
      • Simultaneous ETL pipelines → aggregates, high-performance Parquet files, uploads to EDW
      • Simultaneous SQL and non-SQL clients
      • Immediate access
      • Don’t wait for physical transformation → schema-on-read
      • Purpose built for extreme I/O performance
  • 19. Thank you
      Ralph Kimball, ralphcollector@gmail.com
      Manish Vipani, manish.x.vipani@kp.org