Unleashing the Power of your Data

Orit Alul (Sr. Solutions Architect) @ AWS:
As data is growing at an exponential rate, we are interested not only in being able to analyze the past or present but also in predicting the future!
In this session, Orit will talk about the power of data combined with machine learning.
The session covers how to build a highly scalable and flexible data architecture in the cloud that collects, processes, and analyzes data, so that you can get timely insights and react quickly to new information.
In addition, Orit will present best practices, performance and optimization tips for building a Data Lake in the cloud.

  1. Cloud Data Lake. Orit Alul, Solutions Architect, Amazon Web Services (@oritalul). © 2020, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  2. Agenda
       • Intro - Data Evolution
       • What is a Data Lake?
       • Architectural Principles for Data Platforms
  3. Business Monitoring, Business Insights, New Business Opportunity, Business Optimization, Business Transformation. Evolving Tools and Methods: AI/ML, SQL Query.
  4. The Data Architecture Challenges
       • Discovering the data
       • Maintaining a short time-to-insight
       • Analyzing the data by different personas
       • Being cost efficient
  5. What is a Data Lake?
  6. What is a Data Lake?
       • A centralized repository for both structured and unstructured data
       • Store data as-is in open-source file formats to enable direct analytics
  7. Why a Data Lake?
       • Decouple storage from compute, allowing you to scale
       • Enable advanced analytics across all of your data sources
       • Reduce complexity in ETL and operational overhead
       • Future extensibility as new database and analytics technologies are invented
  8. Traditionally, Analytics Looked Like This
       • Diagram: OLTP, ERP, CRM, LOB; Data Warehouse; Business Intelligence
       • Relational Data
       • TBs-PBs Scale
       • Schema Defined Prior to Data Load
       • Operational and Ad Hoc Reporting
       • Large Initial Capex + $$K / TB / Year
  9. Data Lakes Extend the Traditional Approach
       • Diagram: OLTP, ERP, CRM, LOB, Web, Sensors, Social, Devices; Data Lake; Catalog; DW Queries, Big Data Processing, Interactive, Real-Time; Business Intelligence, Machine Learning
       • TB-EBs Scale
       • All Data in one place, a Single Source of Truth
       • Relational and Non-Relational Data
       • Decouples (low cost) Storage and Compute
       • Schema on Read
       • Diverse Analytical Engines
  10. Benefits of a Data Lake – All Data in One Place. Store and analyze all of your data, from all of your sources, in one centralized location. “Why is the data distributed in many locations? Where is the single source of truth?”
  11. Benefits of a Data Lake – Quick Ingest. Quickly ingest data without needing to force it into a pre-defined schema. “How can I collect data quickly from various sources and store it efficiently?”
  12. Benefits of a Data Lake – Storage vs Compute. Separating your storage and compute allows you to scale each component as required. “How can I scale up with the volume of data being generated?”
  13. Benefits of a Data Lake – Schema on Read. A Data Lake enables ad-hoc analysis by applying schemas on read, not write. “Is there a way I can apply multiple analytics and processing frameworks to the same data?”
  14. Architectural Principles
  15. Architectural Principles
       • Build decoupled systems: Data → Store → Process → Store → Analyze → Insights
       • Use the right tool for the job: data structure, latency, throughput, access patterns
       • Leverage managed and serverless services: scalable/elastic, available, reliable, secure, no/low admin
       • Use log-centric design patterns: immutable logs (data lake), materialized views
       • Be cost-conscious: big data ≠ big cost
       • AI/ML enable your applications
  16. Thank you! Orit Alul, Solutions Architect, Amazon Web Services (@oritalul)
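
The schema-on-read benefit from slide 13 can be illustrated with a minimal Python sketch. It is not taken from the talk: the sample events, the `read_with_schema` helper, and both schema dictionaries are hypothetical, chosen only to show one stored dataset being read through two different schemas without rewriting it.

```python
import json

# Raw events stored as-is, one JSON document per line.
# Hypothetical sample data, not from the slides.
RAW_EVENTS = [
    '{"device": "sensor-1", "temp": "21.5", "ts": "2020-01-01T00:00:00"}',
    '{"device": "sensor-2", "temp": "19.0", "ts": "2020-01-01T00:01:00", "site": "tlv"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema (field name -> converter) at read time, not at write time."""
    for line in raw_lines:
        record = json.loads(line)
        # Fields absent from a record become None, so ingest never had to reject it.
        yield {
            field: (convert(record[field]) if field in record else None)
            for field, convert in schema.items()
        }


# Two consumers project the same stored data through different schemas.
TEMPERATURE_SCHEMA = {"device": str, "temp": float}
LOCATION_SCHEMA = {"device": str, "site": str}

temperatures = list(read_with_schema(RAW_EVENTS, TEMPERATURE_SCHEMA))
locations = list(read_with_schema(RAW_EVENTS, LOCATION_SCHEMA))

print(temperatures)
print(locations)
```

The stored lines are never modified; each consumer chooses its own fields and types when it reads, which is what lets several analytics frameworks share the same data.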
