Operationalizing your Data Lake: Get Ready for Advanced Analytics

Presented by Parth Patel, Big Data Solutions Engineer, Zaloni

  1. Operationalizing your Data Lake: Get Ready for Advanced Analytics. October 22nd, 2017. Parth Patel | Big Data Solutions Engineer
  2. Enabling the data-powered enterprise
     • Industry-leading enterprise data lake management, governance and self-service platform
     • Expert data lake professional services (Design, Implementation, Workshops, Training)
     • Solutions to simplify implementation and reduce business risk
     Zaloni Confidential and Proprietary
  3. Data lakes are central to the modern data architecture: Increased Agility, New Insights, Improved Scalability. Zaloni proprietary – do not duplicate without permission.
  4. Data architecture modernization (diagram contrasting the Traditional and Modern architectures). Diagram labels: Data Lake; Sources; ETL; EDW; Derived (Transformed); Discovery Sandbox; Streaming; Unstructured Data; Various Sources; Data Discovery; Analytics; BI; Data Science.
  5. Zaloni Big Data Maturity Model (rows on the slide: Stage, Descriptor, Stage Today, Characteristics, Business Impact; axis: Value Realized)
     • Ignore: Data Warehouse (66% of market)
         • Emphasis on structured data
         • Limited ability to leverage data at scale
         • Business emphasis on retrospective reporting and analysis
         • Strong governance and security policies
         • Slow to accommodate business changes
     • Store: Data Swamp (22% of market)
         • Hadoop on premises or in the Cloud
         • Limited visibility and usability of data
         • Limited corporate oversight & governance
         • Sandbox or Dev Environments
         • Ad hoc and incremental growth of big data applications
         • Ad hoc and exploratory insights for individual use cases
     • Manage: Managed Data Lake (10% of market)
         • Acquire useful data from across the enterprise
         • Improved visibility and understanding via managed ingestion of data and metadata
         • Ensure security and privacy of sensitive data
         • Operationalize data at scale
         • Leverage enterprise governance & security policies
         • Scalable production data lake for new and improved business insights
     • Automate: Responsive Data Lake (2% of market)
         • Self-Service Ingestion & Provisioning
         • 360 View of Customer, Product, etc.
         • Enterprise Data Discovery
         • Operationalize analytical models into business fabric
         • Enables immediate data impact on business operations
     • Optimize: Self-Organizing Data Lake (0% of market)
         • Self-improving data lake via machine learning algorithms
         • True democratization of big data and analytics
         • Intelligent data remediation and curation
         • Recommended data security and governance policies
         • Lights-out business operations optimized for business success
  6. Managing the Data Supply Chain from Source to Consumer
     • Data Lake Management Platform: a software solution for data lake management that enables enterprise-wide scalability and provides end-to-end capabilities
     • Producers: File Data, Streaming, Relational, On-premise
     • Platform pillars: Enable, Govern, Engage
     • Platform capabilities: Batch Ingestion, Streaming Ingestion, Auto Discovery, Data Quality, Data Privacy and Security, Data Lifecycle Management, Catalog, Metadata Management, Operationalize Transformations, Self-Service Data Preparation, Self-Service Data
     • Consumers: Business Analysts, Researchers, Data Scientists, Applications
  7. Data Lake Reference Architecture
     • Sources: Sensors (or other time series data); Relational Data Stores (OLTP/ODS/DW); Logs (or other unstructured data); Social and shared data
     • Transient Landing Zone
         • Temporary store of source data
         • Consumers are IT, Data Stewards
         • Implemented in highly regulated industries
     • Raw Zone
         • Original source data ready for consumption
         • Consumers are ETL developers, data stewards, some data scientists
         • Single source of truth with history
     • Trusted Zone
         • Standardized on corporate governance/quality policies
         • Consumers are anyone with appropriate role-based access
         • Single version of truth
     • Refined Zone
         • Data required for LOB-specific views, transformed from existing certified data
         • Consumers are anyone with appropriate role-based access
     • Sandbox
  8. Machine learning for data lake implementations: Integrate Data Silos
     • Data silos shown: Loyalty, Customer Service, Transactions, Marketing, 3rd Party
     • Easily integrate data silos
     • Probabilistic data matching and record linkage
     • Automatically classify and encrypt/mask PII and other sensitive data for regulatory compliance
  9. Increasing data lake adoption through self-service: Self-Service Data Preparation
     • Extend data lake beyond Hadoop
     • Catalog traditional sources
     • Ingest datasets without IT
     • Prepare & provision data to your tool of choice
  10. DON’T GO IN THE DATA LAKE WITHOUT US. Questions?
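
Slide 8 mentions probabilistic data matching and record linkage across silos such as Loyalty and Marketing, but the deck does not say how it is done. The sketch below is only an illustration of the general idea, not Zaloni's method: it scores record pairs with a weighted average of string similarities (using Python's standard `difflib`) and links pairs above a threshold. The field names, weights, threshold, and sample records are all hypothetical assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical field weights (sum to 1); a real system would tune or learn these.
FIELD_WEIGHTS = {"name": 0.5, "email": 0.3, "city": 0.2}


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two case-normalized strings."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_score(rec_a: dict, rec_b: dict) -> float:
    """Weighted average of per-field similarities between two records."""
    return sum(
        weight * similarity(rec_a.get(field, ""), rec_b.get(field, ""))
        for field, weight in FIELD_WEIGHTS.items()
    )


def link_records(left: list, right: list, threshold: float = 0.85) -> list:
    """Pair each left record with its best-scoring right record, if above threshold."""
    links = []
    for a in left:
        best = max(right, key=lambda b: match_score(a, b), default=None)
        if best is not None and match_score(a, best) >= threshold:
            links.append((a["id"], best["id"]))
    return links


# Hypothetical records from two silos (made-up data for illustration only).
loyalty = [
    {"id": "L1", "name": "Jane Doe", "email": "jane.doe@example.com", "city": "Raleigh"},
]
marketing = [
    {"id": "M1", "name": "JANE DOE", "email": "Jane.Doe@example.com", "city": "raleigh"},
    {"id": "M2", "name": "John Smith", "email": "jsmith@example.org", "city": "Durham"},
]

if __name__ == "__main__":
    print(link_records(loyalty, marketing))
```

Comparing every left record with every right record is quadratic; larger datasets would normally first group candidates by a cheap key (often called blocking) before scoring, and production record linkage typically uses a statistical model rather than fixed weights.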