Distributed Caching for
Generative AI: Optimizing
LLM Data Pipelines
Shouwei Chen @Alluxio
Head of Community,
Developer Relations
@jasminechenwang
Core Maintainer & PM
Shouwei Chen
Join the conversation
on Slack
alluxio.io/slack
1,200+ GitHub
contributors & growing
10,000+ Slack
Community Members
Top 10 Most Critical Java-Based
Open Source Projects [1]
GitHub's Top 100 Most Valuable
Repositories Out of 96 Million
[1] Google Comes Up With A Metric For Gauging Critical Open-Source Projects
Open Source Started From UC Berkeley AMPLab
Analytics & AI
in the Hybrid & Multi-Cloud Era
Available:
I/O Challenges in ML/DL
Metadata Scalability
CV training data often
consists of a massive
number of small raw
files (billions of 100 KB
photos)
Cloud API Cost
A training job can read
a single copy of the files
tens of thousands of times,
resulting in thousands of
dollars in API cost per
training run
High I/O Throughput
Training jobs are highly
concurrent and require
high I/O throughput to keep
GPUs and CPUs utilized
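
To make the cloud API cost point concrete, here is a back-of-the-envelope sketch. The file count, the number of reads per file, and the per-request price are illustrative assumptions rather than figures from the talk; substitute your own provider's current pricing.

```python
def request_cost_usd(num_files: int, reads_per_file: int,
                     price_per_1k_requests: float) -> float:
    """Estimate object-store GET charges for one training job.

    Assumes one GET request per file read and ignores LIST calls,
    data transfer charges, and retries.
    """
    total_requests = num_files * reads_per_file
    return total_requests / 1000 * price_per_1k_requests


# Hypothetical inputs: 1 million files, each read 10,000 times,
# at an assumed price of $0.0004 per 1,000 GET requests.
cost = request_cost_usd(1_000_000, 10_000, 0.0004)
print(f"Estimated GET cost: ${cost:,.2f}")  # prints "Estimated GET cost: $4,000.00"
```

Under these assumptions the request charges alone reach the "thousands of dollars" range, which is the cost a cache in front of the object store is meant to absorb.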
Using Alluxio For DL
Distributed Caching
[Diagram; labels: POSIX]
▪ No full data copy from data source
▪ Serve fast DL training without HPC hardware
▪ Full data management for your machine learning unstructured datasets
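
Because the cache is exposed through a POSIX interface, training code can read samples with ordinary file I/O. The sketch below illustrates that idea only; the mount path mentioned in the comments is hypothetical, and a temporary directory stands in for it so the example runs anywhere.

```python
import tempfile
from pathlib import Path
from typing import Iterator, Tuple


def iter_samples(root: str) -> Iterator[Tuple[str, bytes]]:
    """Yield (relative_path, raw_bytes) for every file under root, in sorted order."""
    root_path = Path(root)
    for path in sorted(p for p in root_path.rglob("*") if p.is_file()):
        yield path.relative_to(root_path).as_posix(), path.read_bytes()


# Stand-in for a hypothetical POSIX mount point such as /mnt/alluxio/train.
demo_root = tempfile.mkdtemp()
Path(demo_root, "cat").mkdir()
Path(demo_root, "cat", "0001.jpg").write_bytes(b"fake-image-bytes")

samples = list(iter_samples(demo_root))
print(samples)
```

Nothing in `iter_samples` is specific to the caching layer, which is the point: pointing `root` at a mounted path is the only change the training code would need.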
[Architecture diagram w/ Alluxio as access layer. Components: online data platform, offline data platform / data warehouse, inference cluster, offline training platform, training clusters. Data flows: training data, models.]
Problems solved compared to data migration
Expensive HPC hardware
Customized data migration adds engineering overhead
Engineers have to manually delete outdated data in the persistent layer (cloud storage & HDFS)
Users have to understand the data pipeline between data sources & AI/ML infra
Problems solved compared to direct access
Low GPU utilization
Cloud storage API cost within a region / data movement cost across regions
Coupled architecture - business driven (vendor lock-in)
Architecture diagram w/ Alluxio as access layer
High Scalability
Tens of billions of files /
training
ESSENTIAL
High Availability
99.99%
ESSENTIAL
High Performance
Higher GPU utilization
ESSENTIAL
Always increasing expectations…
Availability
● No single point of failure
● Fault tolerance
● Friendlier to K8s and cloud (no StatefulSet
needed)
Performance & Scalability
● Unlimited scalability
● Support for billions of small files
● Streamlined RPC calls
Multi-Tenant in One Cluster
● Proven cost-effectiveness
● Tenant isolation
● Serverless interface
● Easy management
Pluggable Quota and Security Management
● Customizable quota management
● Token passthrough for S3/GCS/OSS
● Impersonation & Krb5 for HDFS
Motivation & Benefits
Dora Core Components - Affinity
● The client decides which worker
to contact
● Workers handle
list/getFileStatus calls
● Fetch the worker list from
the service periodically
● Workers persist the metadata
to support quota and tenant
isolation
● For unsupported calls or
when a worker is down, clients
can fall back to UFS
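
The client-side flow above can be sketched as follows. This is a minimal illustration, not Dora's implementation: the slide does not name the selection scheme, so a consistent-hash ring is assumed here, and `AffinityClient`, `fetch_workers`, `read_from_worker`, and `read_from_ufs` are hypothetical names.

```python
import bisect
import hashlib
from typing import Callable, Dict, List, Optional


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit integer position on the ring."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class AffinityClient:
    """Hypothetical client that picks a worker per path via a consistent-hash ring."""

    def __init__(self, fetch_workers: Callable[[], List[str]], virtual_nodes: int = 100):
        self._fetch_workers = fetch_workers
        self._virtual_nodes = virtual_nodes
        self._ring: List[int] = []
        self._owners: Dict[int, str] = {}
        self.refresh()

    def refresh(self) -> None:
        """Rebuild the ring from the current worker list; call this periodically."""
        owners: Dict[int, str] = {}
        for worker in self._fetch_workers():
            for i in range(self._virtual_nodes):
                owners[_hash(f"{worker}#{i}")] = worker
        self._owners = owners
        self._ring = sorted(owners)

    def pick_worker(self, path: str) -> Optional[str]:
        """Return the worker responsible for path, or None if no workers are known."""
        if not self._ring:
            return None
        idx = bisect.bisect_right(self._ring, _hash(path)) % len(self._ring)
        return self._owners[self._ring[idx]]

    def read(self, path: str,
             read_from_worker: Callable[[str, str], bytes],
             read_from_ufs: Callable[[str], bytes]) -> bytes:
        """Read through the chosen worker; fall back to UFS if it is unreachable."""
        worker = self.pick_worker(path)
        if worker is not None:
            try:
                return read_from_worker(worker, path)
            except ConnectionError:
                pass  # worker is down: fall through to the under file system
        return read_from_ufs(path)


workers = ["worker-1", "worker-2", "worker-3"]
client = AffinityClient(lambda: list(workers))


def down_worker(worker: str, path: str) -> bytes:
    raise ConnectionError(f"{worker} unreachable")


data = client.read("/datasets/img/0001.jpg", down_worker, lambda path: b"from-ufs")
print(data)  # prints b'from-ufs'
```

Because every client hashes the same path to the same worker, no central lookup is needed on the read path, and a failed worker only costs a direct UFS read rather than an error.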
Performance Improvement by Netty
Netty can be enabled for data transmission through configuration.
Dora provides Netty-based data transmission that improves read
performance by 30%-50%.
Advantages compared to gRPC:
● Fewer data copies across different thread pools.
● Zero-copy transmission that avoids Protobuf
serialization.
● Avoids OOM by optimizing off-heap memory usage.
● Fewer bytes to transfer, as there is no additional
HTTP header.
IO throughput improvements
● Up to 9X for unstructured data
● 2-15X for structured data
Highly Concurrent Reads in Dora
Website
www.alluxio.io
Slack
https://alluxio.io/slack
Social Media
Twitter.com/alluxio
Linkedin.com/alluxio
GitHub
https://github.com/Alluxio

Alluxio Product school Webinar - Distributed Caching for Generative AI
