SlideShare a Scribd company logo
1 of 34
Download to read offline
Data Pipelines in the
Enterprise and Comparison
Presented by: William McKnight
“#1 Global Influencer in Data Warehousing” Onalytica
President, McKnight Consulting Group Inc 5000
@williammcknight
www.mcknightcg.com
(214) 514-1444
Proprietary + Confidential
Powering Data
Experiences
to Drive Growth
Joel McKelvey,
Looker Product Management
Google Cloud
Proprietary + Confidential
1
https://www.forrester.com/report/InsightsDriven+Businesses+Set+The+Pace+For+Global+Growth/-/E-RES130848
“Insights-driven businesses
harness and implement digital
insights strategically and at scale
to drive growth and create
differentiating experiences,
products, and services.”
7x
Faster growth than
global GDP
30%
Growth or more using
advanced analytics in a
transformational way
2.3x
More likely to succeed
during disruption
Proprietary + Confidential
*Source: https://emtemp.gcom.cloud/ngw/globalassets/en/information-technology/documents/trends/gartner-2019-cio-agenda-key-takeaways.pdf
Rebalance Your Technology Portfolio Toward
Digital Transformation
Gartner: Digital-fueled growth is the top investment priority for technology leaders.*
Percent of respondents
increasing investment
Percent of respondents
decreasing investment
Cyber/information security 40%
1%
Cloud services or solutions (Saas, Paa5, etc.) 33%
2%
Core system improvements/transformation 31%
10%
How to implement product-centric delivery (by percentage of respondents)
Business Intelligence or data analytics solution 45%
1%
Digital
Transformation
Proprietary + Confidential
Governed metrics | Best-in-class APIs | In-database | Git version-control | Security | Cloud
Integrated Insights
Sales reps enter discussions
equipped with more context
and usage data embedded
within Salesforce.
Data-driven Workflows
Reduce customer churn
with automated email
campaigns if customer
health drops
Custom Applications
Maintain optimal inventory levels
and pricing with merchandising
and supply chain management
application
Modern BI & Analytics
Self-service analytics for
install operations, sales
pipeline management,
and customer operations
SQL In Results Back
Proprietary + Confidential
‘API-first’ extensibility
A Modern Approach
Semantic modeling layer
In-database architecture
Built on the cloud
strategy of your choice
Proprietary + Confidential
1 in 2
customers integrate
insights/experiences beyond
Looker
2000+
Customers
5000+
Developers
Empower People with the Smarter Use of Data
Proprietary + Confidential
Thank you
Data Pipelines in the
Enterprise and Comparison
Presented by: William McKnight
“#1 Global Influencer in Data Warehousing” Onalytica
President, McKnight Consulting Group Inc 5000
@williammcknight
www.mcknightcg.com
(214) 514-1444
Data Integration is about Leveraging Data
• Enterprises and organizations from every industry and scale are working to leverage data to
achieve their strategic objectives.
• To be able to leverage data, it must be efficiently accessed, combined, governed and effectively
managed.
• While the pain caused by a lack of mature data integration and management may be related to
people or process, we find that enterprises have not adopted the right product set or are stuck in
outdated tools
– Often because the technical debt that has accumulated from years of workarounds and gap-fixing existing
processes is too “expensive” to simply rip and replace with more capable modern tools.
• These modern data integration and management platforms are evolving with emerging
technologies.
Data Integration Market Capabilities
• Cloud Native
• Intelligently driven automation suggests and generates new data pipelines between source and
target without manually mapping or design
• Data Orchestration
– Managing the ebb and flow of data throughout the ecosystem,
– Determining what datum is analyzed/stored at the origin,
– Determining what data is moved upstream and at what granularity and state
– Moving, integrating and updating data, metadata and master data and machine learning models as evolution
happens at the core
• Trust created through transparency and understanding
– Modern data management and integration platforms give organizations holistic views of their enterprise data and
deep, thorough lineage paths to show how critical data traces back to a trusted, primary source
• Able to dynamically, and even automatically, scale to meet the increasing complexity and
concurrency demands of query executions.
3
Data Integration Platforms
Capabilities for Cloud Data Integration
Comprehensive Native Connectivity
• Native connectivity to all types of enterprise data and SaaS solutions
– multi-cloud or on-premises
– the associated functionality with on-boarding a new data source
• Leverage technology specific data access API’s when available instead of using generic protocols
(e.g. JDBC, ODBC)
– APIs should allow for better use of data fabric and orchestration, as well as updating of sources and Edge ML
scoring models
• Includes the ability to mass ingest the data, regardless of data type or ingestion method (batch,
real-time, streaming, change data capture)
• Able to parse the data structure automatically and adapt to any structure change in the source
on-the-fly
6
Multi-Latency Data Ingestion
• The selection of batch, real-time, streaming, change data capture, or a combination, in a source-
target mapping, is based on both source and target dynamics.
– Sources may or may not be able to yield data on a real-time, change data basis.
– Occasionally, the slower flow of data is preferable and fast data may be an unnecessary technical challenge.
• Such decisions are often made after project commencement and the enterprise needs to be able
to adjust the ingestion modality with agility.
7
Data Integration Patterns
• Data integration must support various integration patterns (ETL, ELT and streaming)
• Should be a visual code-free development environment
• Data Integration pipelines should include event-based and bulk processing in batch, real-time,
and stream processing where the entire data pipeline is processed as a stream that can be
infused with data at rest
• Leverage templates and wizards, and in some cases AI, to guide data integration development
8
Data Quality and Data Governance
• Enterprise data governance programs need an outlet to define and enforce the data quality rules
per corporate policies and industry regulations
• Built-in ability to cleanse, enrich and standardize any data from multi-cloud and on-premises
sources for all use cases and workloads
• Capabilities to discover, classify and document data elements and, based on the data profile,
automatically build the applicable data quality rules based on data governance policies
• Data marketplace, from where syndicated data (internal and external) can be easily sourced into
the enterprise
– Access to a wide variety of governed data, in an Amazon-like shopping experience, democratizes data such that it
is easily accessible, secure and trusted
9
Data Cataloging and Metadata Management
• Cataloging data is a recommended first step to discovering and understanding your enterprise data so that you
can extract the most value for your data-driven initiatives
• Able to connect and scan metadata for all types of databases, SaaS apps, ETL tools, BI tools, and more, to
provide complete and detailed data lineage
• Have the ability to gather and leverage sensor metadata - for example, to understand if the change in a data
stream is due to a sensor being replaced or is leading to a failure mode
• Automatically classify, tag, and enrich metadata using AI and machine learning
– Identify data domains, similar data sets and data relationships which may be of interest
• Some data management platforms automatically build data pipelines when data consumers browse for data in
the catalog and select a destination
– All a user needs to do is search and find the data
• A data catalog should be able to recommend data sets and rank the search results in terms of usefulness,
making it much easier to ingest data into a cloud data warehouse or data lake
10
Enterprise Trust at Scale
• Certify to industry standards such as SOC, SOC2, HIPAA, ISO, Privacy Shield, regional fencing
requirements and other certifications
• Detect sensitive data and protect against breaches
• Fit DevOps standards
– High availability and recovery
– Ease of deployment
– Maintenance and other DevOps standards
11
Artificial Intelligence and Automation
• Use of and commitment to AI and machine learning to automate much of the work as relates to data
integration, data quality, data cataloging, and data governance
• Ability to rapidly build, deploy and maintain data is now dependent on the degree of intelligence and AI/ML
automation
• AI/ML should be used extensively for data discovery, similarity, matching, classification, schema inference, and
lineage inference, business term association, and much more
• A data pipeline should be self-integrating, self-healing, and self-tuning with anomaly detection
• It should leverage natural language processing for generating rules from text data
12
Ecosystem and Multi-cloud
• Should be hybrid and operable on the three major cloud service
provider platforms (Microsoft Azure, Amazon Web Services,
Google Cloud Platform), as well on-premises
• Financial strength, company strategic partnerships and
implementation partnerships
13
Vendor Analysis
AWS Glue
• Serverless, low-code ETL Platform
• Generates Python or Scala
– Code can be edited in AWS Console, an IDE or notebook
• Billing when resources are running
• Coupled with Glue Data Catalog central metadata repository
• Uses data “crawlers” that automatically detect and infer schemas from many data sources, data
types, and across various types of partitions
– Stores them centrally
• Supports Streaming ETL
– Consumes data from streaming data engines, like Amazon Kinesis and Apache Kafka, cleans
and transforms the data in-flight, and continuously loads the stream into Amazon S3 data
lakes, data warehouses, or other data sinks
15
Azure Data Factory
• Allows you to create workflows for orchestrating data movement and transforming data between
Azure resources
• Creates and schedules pipelines that ingest data from disparate data stores and use ETL-like
processes that transform data visually with data flows or by using compute services such as Azure
HDInsight Hadoop, Azure Databricks, and Azure SQL Database
• The most basic ADF function is the Copy Data pipeline, which has the broadest reach in terms of
source and target variety
– Wrangling Data Flows which is a code-free data exploration, preparation, and validation
service that allows citizen data integrators to build pipelines for their specific uses using
Power Query syntax
• Strong suit is data replication
16
Cloudera
• Broad set of enterprise data cloud services
• CDP provides Replication Manager, a service for copying and migrating data as needed between
environments within CDP and from legacy Cloudera distributions CDH and HDP
• Cloudera inherited NiFi from the Hortonworks acquisition where it was Hortonworks Data Flow
• Nifi is a general-purpose enterprise data integration tool that is fast, easy and secure
• NiFi is often found in a stack with Kafka
• Cloudera Shared Data Experience (SDX) provides seamless and transparent data access and
governance policies for all workloads
• Flink, which Cloudera also provides the engineering for, is a specialist tool for advanced low-
latency stateful stream processing and streaming analytics
17
Fivetran
• Designed as a low setup, no configuration, and no maintenance
data pipeline to lift data from operational sources and deliver it
to a modern cloud data warehouse
• Well-suited for popular cloud applications like Salesforce, Stripe,
and Marketa
• Their transformations are SQL-based rules written by business
users and set to run on a schedule or as new data becomes
available
18
Google
• Alooma Data pipeline tool
• Google Cloud Data Fusion is the foundation of Google Data Integration and offers 200+ adapters
that range from RDBMS, Messaging (legacy and modern), mainframes, applications, SaaS
(SFDC/SAP HANA etc.), Big Data and NoSQL systems
– Data Fusion offers connectivity to 3rd party cloud systems such as AWS S3
• Cloud Data Fusion supports batch, real-time streaming and change data capture for Batch and
Streaming integrations
• Google provides a Data Governance platform for structured data
• Cloud Data Fusion has a built-in metadata repository and some basic metadata management
capabilities for data integration
• Dataflow offers integration with Privacy APIs (in Cloud DLP), and real-time predictions using
Video, NLP, and Image APIs in Cloud AI Platform
19
IBM
• IBM Information Server for IBM Cloud Pak for Data, which includes
capabilities for integration, quality, and governance
• IBM InfoSphere Information Server is available as a managed service on any
cloud or self-managed for customer infrastructure
• You pick the latency style
• There is a shared catalog called the Knowledge Catalog
• Machine learning capabilities are used in automatic generation of reusable
data integration templates with “recipes”, schema propagation and assisted
data discovery, classification and PII data identification
20
Informatica
• The Intelligent Data Platform includes native connectivity, data ingestion, data integration, data
quality, data governance, data cataloging, data privacy and master data management
• Commitment to artificial intelligence is with its AI, CLAIRE engine, permeating all products: such
as automation and pipeline augmentation, metadata discovery, and lineage
• High performant and scalable native connectivity to a broad set of data (on- premises and cloud)
for all types of data (structured, semi-structured, streaming) with advanced security and
encryption
• Cloud Mass Ingestion ingest data from a variety of data
• Codeless, template and wizard-based development environment
• Informatica Enterprise Data Catalog can scan third party catalogs to integrate their metadata
content with the rest of the enterprise metadata
21
Matillion
• Has a migrate function to move or clone business logic and resources, change
data capture, queue messaging, and pub-sub
• Dynamic ETL (such as job variables and iterators) plus data orchestration and
staging mechanisms (such as functions to make real-time adjustments if jobs
run long or fail)
• Automation in pipeline building
• There’s also an entry-level tool called Matillion Data Loader that can be easily
leveraged by citizen data integrators and business analysts
22
Talend
• Talend is an entire data ecosystem, cloud-independent play
• High on governed analytics, crowdsourced knowledge, cloud migrations,
compliance and privacy (GDPR, HIPAA, CCPA, et cetera), data auditing, and
360-degree data hubs (Customer, Product, Employee 360°)
• Embedded machine learning throughout the Talend Data Fabric with
suggested assistance, advanced pattern discovery, automated delivery, data
scoring, and supervised machine learning
• Also Stitch data prep tool
23
Key Takeaways & Application to the Enterprise
• In the heterogenous vendor category, we have 4 vendors: Cloudera, IBM, Informatica, and Talend
• Cloudera, with NiFi, Flink and Sqoop, presents a compelling enterprise option for data fabrics and pipelines and
will grow as these environments proliferate
• Specialist providers are appropriate for a limited scope – either from an associated technology standpoint
within a single cloud service provider or, in the case of FiveTran and Matillion, from a project scope standpoint
• Google’s data management and integration capabilities are high end market and would be a strong option for
any GCP project
• AWS (Glue) is a great option for AWS projects
• Oracle data integration is a strong lock for Oracle projects
• It is fully expected that enterprise environments would have a heterogenous vendor as well as a data
preparation vendor like FiveTran, Matillion or Stitch from Talend
Get the Report
cutt.ly/dataint
Data Pipelines in the
Enterprise and Comparison
Presented by: William McKnight
“#1 Global Influencer in Data Warehousing” Onalytica
President, McKnight Consulting Group Inc 5000
@williammcknight
www.mcknightcg.com
(214) 514-1444

More Related Content

What's hot

Slides: How AI Makes Analytics More Human
Slides: How AI Makes Analytics More HumanSlides: How AI Makes Analytics More Human
Slides: How AI Makes Analytics More HumanDATAVERSITY
 
ADV Slides: The Data Needed to Evolve an Enterprise Artificial Intelligence S...
ADV Slides: The Data Needed to Evolve an Enterprise Artificial Intelligence S...ADV Slides: The Data Needed to Evolve an Enterprise Artificial Intelligence S...
ADV Slides: The Data Needed to Evolve an Enterprise Artificial Intelligence S...DATAVERSITY
 
Enterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data ArchitectureEnterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data ArchitectureDATAVERSITY
 
ADV Slides: What Happened of Note in 1H 2020 in Enterprise Advanced Analytics
ADV Slides: What Happened of Note in 1H 2020 in Enterprise Advanced AnalyticsADV Slides: What Happened of Note in 1H 2020 in Enterprise Advanced Analytics
ADV Slides: What Happened of Note in 1H 2020 in Enterprise Advanced AnalyticsDATAVERSITY
 
Data-Ed Webinar: Data Modeling Fundamentals
Data-Ed Webinar: Data Modeling FundamentalsData-Ed Webinar: Data Modeling Fundamentals
Data-Ed Webinar: Data Modeling FundamentalsDATAVERSITY
 
Platforming the Major Analytic Use Cases for Modern Engineering
Platforming the Major Analytic Use Cases for Modern EngineeringPlatforming the Major Analytic Use Cases for Modern Engineering
Platforming the Major Analytic Use Cases for Modern EngineeringDATAVERSITY
 
Data-Ed Online: Unlock Business Value through Document & Content Management
Data-Ed Online: Unlock Business Value through Document & Content ManagementData-Ed Online: Unlock Business Value through Document & Content Management
Data-Ed Online: Unlock Business Value through Document & Content ManagementDATAVERSITY
 
DataEd Slides: Data Modeling is Fundamental
DataEd Slides:  Data Modeling is FundamentalDataEd Slides:  Data Modeling is Fundamental
DataEd Slides: Data Modeling is FundamentalDATAVERSITY
 
Unlocking the Value of Your Data Lake
Unlocking the Value of Your Data LakeUnlocking the Value of Your Data Lake
Unlocking the Value of Your Data LakeDATAVERSITY
 
Data-Ed Online Webinar: Business Value from MDM
Data-Ed Online Webinar: Business Value from MDMData-Ed Online Webinar: Business Value from MDM
Data-Ed Online Webinar: Business Value from MDMDATAVERSITY
 
ADV Slides: Comparing the Enterprise Analytic Solutions
ADV Slides: Comparing the Enterprise Analytic SolutionsADV Slides: Comparing the Enterprise Analytic Solutions
ADV Slides: Comparing the Enterprise Analytic SolutionsDATAVERSITY
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
Slides: Why You Need End-to-End Data Quality to Build Trust in Kafka
Slides: Why You Need End-to-End Data Quality to Build Trust in KafkaSlides: Why You Need End-to-End Data Quality to Build Trust in Kafka
Slides: Why You Need End-to-End Data Quality to Build Trust in KafkaDATAVERSITY
 
Building an Effective Data & Analytics Operating Model A Data Modernization G...
Building an Effective Data & Analytics Operating Model A Data Modernization G...Building an Effective Data & Analytics Operating Model A Data Modernization G...
Building an Effective Data & Analytics Operating Model A Data Modernization G...Mark Hewitt
 
The Need to Know for Information Architects: Big Data to Big Information
The Need to Know for Information Architects: Big Data to Big InformationThe Need to Know for Information Architects: Big Data to Big Information
The Need to Know for Information Architects: Big Data to Big InformationDATAVERSITY
 
DataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data ArchitectureDataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data ArchitectureDATAVERSITY
 
Five Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceFive Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceDATAVERSITY
 
Enterprise Data Architecture Deliverables
Enterprise Data Architecture DeliverablesEnterprise Data Architecture Deliverables
Enterprise Data Architecture DeliverablesLars E Martinsson
 
Cloud and Analytics -- 2020 sparksummit
Cloud and Analytics -- 2020 sparksummitCloud and Analytics -- 2020 sparksummit
Cloud and Analytics -- 2020 sparksummitMing Yuan
 
ADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture MaturityADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture MaturityDATAVERSITY
 

What's hot (20)

Slides: How AI Makes Analytics More Human
Slides: How AI Makes Analytics More HumanSlides: How AI Makes Analytics More Human
Slides: How AI Makes Analytics More Human
 
ADV Slides: The Data Needed to Evolve an Enterprise Artificial Intelligence S...
ADV Slides: The Data Needed to Evolve an Enterprise Artificial Intelligence S...ADV Slides: The Data Needed to Evolve an Enterprise Artificial Intelligence S...
ADV Slides: The Data Needed to Evolve an Enterprise Artificial Intelligence S...
 
Enterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data ArchitectureEnterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data Architecture
 
ADV Slides: What Happened of Note in 1H 2020 in Enterprise Advanced Analytics
ADV Slides: What Happened of Note in 1H 2020 in Enterprise Advanced AnalyticsADV Slides: What Happened of Note in 1H 2020 in Enterprise Advanced Analytics
ADV Slides: What Happened of Note in 1H 2020 in Enterprise Advanced Analytics
 
Data-Ed Webinar: Data Modeling Fundamentals
Data-Ed Webinar: Data Modeling FundamentalsData-Ed Webinar: Data Modeling Fundamentals
Data-Ed Webinar: Data Modeling Fundamentals
 
Platforming the Major Analytic Use Cases for Modern Engineering
Platforming the Major Analytic Use Cases for Modern EngineeringPlatforming the Major Analytic Use Cases for Modern Engineering
Platforming the Major Analytic Use Cases for Modern Engineering
 
Data-Ed Online: Unlock Business Value through Document & Content Management
Data-Ed Online: Unlock Business Value through Document & Content ManagementData-Ed Online: Unlock Business Value through Document & Content Management
Data-Ed Online: Unlock Business Value through Document & Content Management
 
DataEd Slides: Data Modeling is Fundamental
DataEd Slides:  Data Modeling is FundamentalDataEd Slides:  Data Modeling is Fundamental
DataEd Slides: Data Modeling is Fundamental
 
Unlocking the Value of Your Data Lake
Unlocking the Value of Your Data LakeUnlocking the Value of Your Data Lake
Unlocking the Value of Your Data Lake
 
Data-Ed Online Webinar: Business Value from MDM
Data-Ed Online Webinar: Business Value from MDMData-Ed Online Webinar: Business Value from MDM
Data-Ed Online Webinar: Business Value from MDM
 
ADV Slides: Comparing the Enterprise Analytic Solutions
ADV Slides: Comparing the Enterprise Analytic SolutionsADV Slides: Comparing the Enterprise Analytic Solutions
ADV Slides: Comparing the Enterprise Analytic Solutions
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
 
Slides: Why You Need End-to-End Data Quality to Build Trust in Kafka
Slides: Why You Need End-to-End Data Quality to Build Trust in KafkaSlides: Why You Need End-to-End Data Quality to Build Trust in Kafka
Slides: Why You Need End-to-End Data Quality to Build Trust in Kafka
 
Building an Effective Data & Analytics Operating Model A Data Modernization G...
Building an Effective Data & Analytics Operating Model A Data Modernization G...Building an Effective Data & Analytics Operating Model A Data Modernization G...
Building an Effective Data & Analytics Operating Model A Data Modernization G...
 
The Need to Know for Information Architects: Big Data to Big Information
The Need to Know for Information Architects: Big Data to Big InformationThe Need to Know for Information Architects: Big Data to Big Information
The Need to Know for Information Architects: Big Data to Big Information
 
DataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data ArchitectureDataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data Architecture
 
Five Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceFive Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data Governance
 
Enterprise Data Architecture Deliverables
Enterprise Data Architecture DeliverablesEnterprise Data Architecture Deliverables
Enterprise Data Architecture Deliverables
 
Cloud and Analytics -- 2020 sparksummit
Cloud and Analytics -- 2020 sparksummitCloud and Analytics -- 2020 sparksummit
Cloud and Analytics -- 2020 sparksummit
 
ADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture MaturityADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture Maturity
 

Similar to ADV Slides: Data Pipelines in the Enterprise and Comparison

The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationDATAVERSITY
 
Increasing Agility Through Data Virtualization
Increasing Agility Through Data VirtualizationIncreasing Agility Through Data Virtualization
Increasing Agility Through Data VirtualizationDenodo
 
Data Virtualization for Compliance – Creating a Controlled Data Environment
Data Virtualization for Compliance – Creating a Controlled Data EnvironmentData Virtualization for Compliance – Creating a Controlled Data Environment
Data Virtualization for Compliance – Creating a Controlled Data EnvironmentDenodo
 
How a Logical Data Fabric Enhances the Customer 360 View
How a Logical Data Fabric Enhances the Customer 360 ViewHow a Logical Data Fabric Enhances the Customer 360 View
How a Logical Data Fabric Enhances the Customer 360 ViewDenodo
 
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Nathan Bijnens
 
Digital intelligence satish bhatia
Digital intelligence satish bhatiaDigital intelligence satish bhatia
Digital intelligence satish bhatiaSatish Bhatia
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricNathan Bijnens
 
Govern and Protect Your End User Information
Govern and Protect Your End User InformationGovern and Protect Your End User Information
Govern and Protect Your End User InformationDenodo
 
data_blending
data_blendingdata_blending
data_blendingsubit1615
 
Ensuring Data Quality and Lineage in Cloud Migration - Dan Power
Ensuring Data Quality and Lineage in Cloud Migration - Dan PowerEnsuring Data Quality and Lineage in Cloud Migration - Dan Power
Ensuring Data Quality and Lineage in Cloud Migration - Dan PowerMolly Alexander
 
Active Governance Across the Delta Lake with Alation
Active Governance Across the Delta Lake with AlationActive Governance Across the Delta Lake with Alation
Active Governance Across the Delta Lake with AlationDatabricks
 
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...DATAVERSITY
 
Klarna Tech Talk - Mind the Data!
Klarna Tech Talk - Mind the Data!Klarna Tech Talk - Mind the Data!
Klarna Tech Talk - Mind the Data!Jeffrey T. Pollock
 
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...Denodo
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise AnalyticsDATAVERSITY
 
Foundational Strategies for Trusted Data: Getting Your Data to the Cloud
Foundational Strategies for Trusted Data: Getting Your Data to the CloudFoundational Strategies for Trusted Data: Getting Your Data to the Cloud
Foundational Strategies for Trusted Data: Getting Your Data to the CloudPrecisely
 
Deliveinrg explainable AI
Deliveinrg explainable AIDeliveinrg explainable AI
Deliveinrg explainable AIGary Allemann
 
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Denodo
 
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaCloudera, Inc.
 

Similar to ADV Slides: Data Pipelines in the Enterprise and Comparison (20)

The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data Integration
 
Increasing Agility Through Data Virtualization
Increasing Agility Through Data VirtualizationIncreasing Agility Through Data Virtualization
Increasing Agility Through Data Virtualization
 
Data Virtualization for Compliance – Creating a Controlled Data Environment
Data Virtualization for Compliance – Creating a Controlled Data EnvironmentData Virtualization for Compliance – Creating a Controlled Data Environment
Data Virtualization for Compliance – Creating a Controlled Data Environment
 
How a Logical Data Fabric Enhances the Customer 360 View
How a Logical Data Fabric Enhances the Customer 360 ViewHow a Logical Data Fabric Enhances the Customer 360 View
How a Logical Data Fabric Enhances the Customer 360 View
 
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)
 
Data Mesh
Data MeshData Mesh
Data Mesh
 
Digital intelligence satish bhatia
Digital intelligence satish bhatiaDigital intelligence satish bhatia
Digital intelligence satish bhatia
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft Fabric
 
Govern and Protect Your End User Information
Govern and Protect Your End User InformationGovern and Protect Your End User Information
Govern and Protect Your End User Information
 
data_blending
data_blendingdata_blending
data_blending
 
Ensuring Data Quality and Lineage in Cloud Migration - Dan Power
Ensuring Data Quality and Lineage in Cloud Migration - Dan PowerEnsuring Data Quality and Lineage in Cloud Migration - Dan Power
Ensuring Data Quality and Lineage in Cloud Migration - Dan Power
 
Active Governance Across the Delta Lake with Alation
Active Governance Across the Delta Lake with AlationActive Governance Across the Delta Lake with Alation
Active Governance Across the Delta Lake with Alation
 
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
 
Klarna Tech Talk - Mind the Data!
Klarna Tech Talk - Mind the Data!Klarna Tech Talk - Mind the Data!
Klarna Tech Talk - Mind the Data!
 
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics
 
Foundational Strategies for Trusted Data: Getting Your Data to the Cloud
Foundational Strategies for Trusted Data: Getting Your Data to the CloudFoundational Strategies for Trusted Data: Getting Your Data to the Cloud
Foundational Strategies for Trusted Data: Getting Your Data to the Cloud
 
Deliveinrg explainable AI
Deliveinrg explainable AIDeliveinrg explainable AI
Deliveinrg explainable AI
 
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
 
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
 

More from DATAVERSITY

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...DATAVERSITY
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceDATAVERSITY
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data LiteracyDATAVERSITY
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsDATAVERSITY
 
Make Data Work for You
Make Data Work for YouMake Data Work for You
Make Data Work for YouDATAVERSITY
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?DATAVERSITY
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?DATAVERSITY
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling FundamentalsDATAVERSITY
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectDATAVERSITY
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at ScaleDATAVERSITY
 
Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?DATAVERSITY
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...DATAVERSITY
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?DATAVERSITY
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsDATAVERSITY
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayDATAVERSITY
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise AnalyticsDATAVERSITY
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best PracticesDATAVERSITY
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?DATAVERSITY
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best PracticesDATAVERSITY
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageDATAVERSITY
 

More from DATAVERSITY (20)

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and Governance
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data Literacy
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business Goals
 
Make Data Work for You
Make Data Work for YouMake Data Work for You
Make Data Work for You
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling Fundamentals
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic Project
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
 
Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and Forwards
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement Today
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best Practices
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best Practices
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive Advantage
 

Recently uploaded

20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...Suhani Kapoor
 
Spark3's new memory model/management
Spark3's new memory model/managementSpark3's new memory model/management
Spark3's new memory model/managementakshesh doshi
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改atducpo
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 

Recently uploaded (20)

Russian Call Girls Dwarka Sector 15 💓 Delhi 9999965857 @Sabina Modi VVIP MODE...
Russian Call Girls Dwarka Sector 15 💓 Delhi 9999965857 @Sabina Modi VVIP MODE...Russian Call Girls Dwarka Sector 15 💓 Delhi 9999965857 @Sabina Modi VVIP MODE...
Russian Call Girls Dwarka Sector 15 💓 Delhi 9999965857 @Sabina Modi VVIP MODE...
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
 
Spark3's new memory model/management
Spark3's new memory model/managementSpark3's new memory model/management
Spark3's new memory model/management
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 

ADV Slides: Data Pipelines in the Enterprise and Comparison

  • 1. Data Pipelines in the Enterprise and Comparison Presented by: William McKnight “#1 Global Influencer in Data Warehousing” Onalytica President, McKnight Consulting Group Inc 5000 @williammcknight www.mcknightcg.com (214) 514-1444
  • 2. Proprietary + Confidential Powering Data Experiences to Drive Growth Joel McKelvey, Looker Product Management Google Cloud
  • 3. Proprietary + Confidential 1 https://www.forrester.com/report/InsightsDriven+Businesses+Set+The+Pace+For+Global+Growth/-/E-RES130848 “Insights-driven businesses harness and implement digital insights strategically and at scale to drive growth and create differentiating experiences, products, and services.” 7x Faster growth than global GDP 30% Growth or more using advanced analytics in a transformational way 2.3x More likely to succeed during disruption
  • 4. Proprietary + Confidential *Source: https://emtemp.gcom.cloud/ngw/globalassets/en/information-technology/documents/trends/gartner-2019-cio-agenda-key-takeaways.pdf Rebalance Your Technology Portfolio Toward Digital Transformation Gartner: Digital-fueled growth is the top investment priority for technology leaders.* Percent of respondents increasing investment Percent of respondents decreasing investment Cyber/information security 40% 1% Cloud services or solutions (Saas, Paa5, etc.) 33% 2% Core system improvements/transformation 31% 10% How to implement product-centric delivery (by percentage of respondents) Business Intelligence or data analytics solution 45% 1% Digital Transformation
  • 5. Proprietary + Confidential Governed metrics | Best-in-class APIs | In-database | Git version-control | Security | Cloud Integrated Insights Sales reps enter discussions equipped with more context and usage data embedded within Salesforce. Data-driven Workflows Reduce customer churn with automated email campaigns if customer health drops Custom Applications Maintain optimal inventory levels and pricing with merchandising and supply chain management application Modern BI & Analytics Self-service analytics for install operations, sales pipeline management, and customer operations SQL In Results Back
  • 6. Proprietary + Confidential ‘API-first’ extensibility A Modern Approach Semantic modeling layer In-database architecture Built on the cloud strategy of your choice
  • 7. Proprietary + Confidential 1 in 2 customers integrate insights/experiences beyond Looker 2000+ Customers 5000+ Developers Empower People with the Smarter Use of Data
  • 9. Data Pipelines in the Enterprise and Comparison Presented by: William McKnight “#1 Global Influencer in Data Warehousing” Onalytica President, McKnight Consulting Group Inc 5000 @williammcknight www.mcknightcg.com (214) 514-1444
  • 10. Data Integration is about Leveraging Data • Enterprises and organizations from every industry and scale are working to leverage data to achieve their strategic objectives. • To be able to leverage data, it must be efficiently accessed, combined, governed and effectively managed. • While the pain caused by a lack of mature data integration and management may be related to people or process, we find that enterprises have not adopted the right product set or are stuck in outdated tools – Often because the technical debt that has accumulated from years of workarounds and gap-fixing existing processes is too “expensive” to simply rip and replace with more capable modern tools. • These modern data integration and management platforms are evolving with emerging technologies.
  • 11. Data Integration Market Capabilities • Cloud Native • Intelligently driven automation suggests and generates new data pipelines between source and target without manually mapping or design • Data Orchestration – Managing the ebb and flow of data throughout the ecosystem, – Determining what datum is analyzed/stored at the origin, – Determining what data is moved upstream and at what granularity and state – Moving, integrating and updating data, metadata and master data and machine learning models as evolution happens at the core • Trust created through transparency and understanding – Modern data management and integration platforms give organizations holistic views of their enterprise data and deep, thorough lineage paths to show how critical data traces back to a trusted, primary source • Able to dynamically, and even automatically, scale to meet the increasing complexity and concurrency demands of query executions. 3
  • 13. Capabilities for Cloud Data Integration
  • 14. Comprehensive Native Connectivity • Native connectivity to all types of enterprise data and SaaS solutions – multi-cloud or on-premises – the associated functionality with on-boarding a new data source • Leverage technology specific data access API’s when available instead of using generic protocols (e.g. JDBC, ODBC) – APIs should allow for better use of data fabric and orchestration, as well as updating of sources and Edge ML scoring models • Includes the ability to mass ingest the data, regardless of data type or ingestion method (batch, real-time, streaming, change data capture) • Able to parse the data structure automatically and adapt to any structure change in the source on-the-fly 6
  • 15. Multi-Latency Data Ingestion • The selection of batch, real-time, streaming, change data capture, or a combination, in a source- target mapping, is based on both source and target dynamics. – Sources may or may not be able to yield data on a real-time, change data basis. – Occasionally, the slower flow of data is preferable and fast data may be an unnecessary technical challenge. • Such decisions are often made after project commencement and the enterprise needs to be able to adjust the ingestion modality with agility. 7
  • 16. Data Integration Patterns • Data integration must support various integration patterns (ETL, ELT and streaming) • Should be a visual code-free development environment • Data Integration pipelines should include event-based and bulk processing in batch, real-time, and stream processing where the entire data pipeline is processed as a stream that can be infused with data at rest • Leverage templates and wizards, and in some cases AI, to guide data integration development 8
  • 17. Data Quality and Data Governance • Enterprise data governance programs need an outlet to define and enforce the data quality rules per corporate policies and industry regulations • Built-in ability to cleanse, enrich and standardize any data from multi-cloud and on-premises sources for all use cases and workloads • Capabilities to discover, classify and document data elements and, based on the data profile, automatically build the applicable data quality rules based on data governance policies • Data marketplace, from where syndicated data (internal and external) can be easily sourced into the enterprise – Access to a wide variety of governed data, in an Amazon-like shopping experience, democratizes data such that it is easily accessible, secure and trusted 9
  • 18. Data Cataloging and Metadata Management • Cataloging data is a recommended first step to discovering and understanding your enterprise data so that you can extract the most value for your data-driven initiatives • Able to connect and scan metadata for all types of databases, SaaS apps, ETL tools, BI tools, and more, to provide complete and detailed data lineage • Have the ability to gather and leverage sensor metadata - for example, to understand if the change in a data stream is due to a sensor being replaced or is leading to a failure mode • Automatically classify, tag, and enrich metadata using AI and machine learning – Identify data domains, similar data sets and data relationships which may be of interest • Some data management platforms automatically build data pipelines when data consumers browse for data in the catalog and select a destination – All a user needs to do is search and find the data • A data catalog should be able to recommend data sets and rank the search results in terms of usefulness, making it much easier to ingest data into a cloud data warehouse or data lake 10
  • 19. Enterprise Trust at Scale • Certify to industry standards such as SOC, SOC2, HIPAA, ISO, Privacy Shield, regional fencing requirements and other certifications • Detect sensitive data and protect against breaches • Fit DevOps standards – High availability and recovery – Ease of deployment – Maintenance and other DevOps standards 11
  • 20. Artificial Intelligence and Automation • Use of and commitment to AI and machine learning to automate much of the work as relates to data integration, data quality, data cataloging, and data governance • Ability to rapidly build, deploy and maintain data is now dependent on the degree of intelligence and AI/ML automation • AI/ML should be used extensively for data discovery, similarity, matching, classification, schema inference, and lineage inference, business term association, and much more • A data pipeline should be self-integrating, self-healing, and self-tuning with anomaly detection • It should leverage natural language processing for generating rules from text data 12
  • 21. Ecosystem and Multi-cloud • Should be hybrid and operable on the three major cloud service provider platforms (Microsoft Azure, Amazon Web Services, Google Cloud Platform), as well on-premises • Financial strength, company strategic partnerships and implementation partnerships 13
  • 23. AWS Glue • Serverless, low-code ETL Platform • Generates Python or Scala – Code can be edited in AWS Console, an IDE or notebook • Billing when resources are running • Coupled with Glue Data Catalog central metadata repository • Uses data “crawlers” that automatically detect and infer schemas from many data sources, data types, and across various types of partitions – Stores them centrally • Supports Streaming ETL – Consumes data from streaming data engines, like Amazon Kinesis and Apache Kafka, cleans and transforms the data in-flight, and continuously loads the stream into Amazon S3 data lakes, data warehouses, or other data sinks 15
  • 24. Azure Data Factory • Allows you to create workflows for orchestrating data movement and transforming data between Azure resources • Creates and schedules pipelines that ingest data from disparate data stores and use ETL-like processes that transform data visually with data flows or by using compute services such as Azure HDInsight Hadoop, Azure Databricks, and Azure SQL Database • The most basic ADF function is the Copy Data pipeline, which has the broadest reach in terms of source and target variety – Wrangling Data Flows which is a code-free data exploration, preparation, and validation service that allows citizen data integrators to build pipelines for their specific uses using Power Query syntax • Strong suit is data replication 16
  • 25. Cloudera • Broad set of enterprise data cloud services • CDP provides Replication Manager, a service for copying and migrating data as needed between environments within CDP and from legacy Cloudera distributions CDH and HDP • Cloudera inherited NiFi from the Hortonworks acquisition where it was Hortonworks Data Flow • Nifi is a general-purpose enterprise data integration tool that is fast, easy and secure • NiFi is often found in a stack with Kafka • Cloudera Shared Data Experience (SDX) provides seamless and transparent data access and governance policies for all workloads • Flink, which Cloudera also provides the engineering for, is a specialist tool for advanced low- latency stateful stream processing and streaming analytics 17
  • 26. Fivetran • Designed as a low setup, no configuration, and no maintenance data pipeline to lift data from operational sources and deliver it to a modern cloud data warehouse • Well-suited for popular cloud applications like Salesforce, Stripe, and Marketa • Their transformations are SQL-based rules written by business users and set to run on a schedule or as new data becomes available 18
  • 27. Google • Alooma Data pipeline tool • Google Cloud Data Fusion is the foundation of Google Data Integration and offers 200+ adapters that range from RDBMS, Messaging (legacy and modern), mainframes, applications, SaaS (SFDC/SAP HANA etc.), Big Data and NoSQL systems – Data Fusion offers connectivity to 3rd party cloud systems such as AWS S3 • Cloud Data Fusion supports batch, real-time streaming and change data capture for Batch and Streaming integrations • Google provides a Data Governance platform for structured data • Cloud Data Fusion has a built-in metadata repository and some basic metadata management capabilities for data integration • Dataflow offers integration with Privacy APIs (in Cloud DLP), and real-time predictions using Video, NLP, and Image APIs in Cloud AI Platform 19
  • 28. IBM • IBM Information Server for IBM Cloud Pak for Data, which includes capabilities for integration, quality, and governance • IBM InfoSphere Information Server is available as a managed service on any cloud or self-managed for customer infrastructure • You pick the latency style • There is a shared catalog called the Knowledge Catalog • Machine learning capabilities are used in automatic generation of reusable data integration templates with “recipes”, schema propagation and assisted data discovery, classification and PII data identification 20
  • 29. Informatica • The Intelligent Data Platform includes native connectivity, data ingestion, data integration, data quality, data governance, data cataloging, data privacy and master data management • Commitment to artificial intelligence is with its AI, CLAIRE engine, permeating all products: such as automation and pipeline augmentation, metadata discovery, and lineage • High performant and scalable native connectivity to a broad set of data (on- premises and cloud) for all types of data (structured, semi-structured, streaming) with advanced security and encryption • Cloud Mass Ingestion ingest data from a variety of data • Codeless, template and wizard-based development environment • Informatica Enterprise Data Catalog can scan third party catalogs to integrate their metadata content with the rest of the enterprise metadata 21
  • 30. Matillion • Has a migrate function to move or clone business logic and resources, change data capture, queue messaging, and pub-sub • Dynamic ETL (such as job variables and iterators) plus data orchestration and staging mechanisms (such as functions to make real-time adjustments if jobs run long or fail) • Automation in pipeline building • There’s also an entry-level tool called Matillion Data Loader that can be easily leveraged by citizen data integrators and business analysts 22
  • 31. Talend • Talend is an entire data ecosystem, cloud-independent play • High on governed analytics, crowdsourced knowledge, cloud migrations, compliance and privacy (GDPR, HIPAA, CCPA, et cetera), data auditing, and 360-degree data hubs (Customer, Product, Employee 360°) • Embedded machine learning throughout the Talend Data Fabric with suggested assistance, advanced pattern discovery, automated delivery, data scoring, and supervised machine learning • Also Stitch data prep tool 23
  • 32. Key Takeaways & Application to the Enterprise • In the heterogenous vendor category, we have 4 vendors: Cloudera, IBM, Informatica, and Talend • Cloudera, with NiFi, Flink and Sqoop, presents a compelling enterprise option for data fabrics and pipelines and will grow as these environments proliferate • Specialist providers are appropriate for a limited scope – either from an associated technology standpoint within a single cloud service provider or, in the case of FiveTran and Matillion, from a project scope standpoint • Google’s data management and integration capabilities are high end market and would be a strong option for any GCP project • AWS (Glue) is a great option for AWS projects • Oracle data integration is a strong lock for Oracle projects • It is fully expected that enterprise environments would have a heterogenous vendor as well as a data preparation vendor like FiveTran, Matillion or Stitch from Talend
  • 34. Data Pipelines in the Enterprise and Comparison Presented by: William McKnight “#1 Global Influencer in Data Warehousing” Onalytica President, McKnight Consulting Group Inc 5000 @williammcknight www.mcknightcg.com (214) 514-1444