SlideShare a Scribd company logo
1 of 48
Scalable Web Architecture and
     Distributed Systems
        The Architecture of Open Source Applications II
                                                 Ver 1.0




                  아키텍트를 꿈꾸는 사람들 cafe.naver.com/architect1
                               현수명 soomong.net
                                                           #soomong
Scalable?




at a primitive level it's just connecting users with remote
resources via the internet - the part that makes it scalable is that
the resources, or access to those resources, are distributed
across multiple servers.
Key principles
basis for decisions in designing web architecture.


                   Availability                                Scalability




             Performance                                          Manageability




                      Reliability                                 Cost




it can be at odds with one another.

high Scalability (more server) VS low manageability (you have to operate an additional server)
high Scalability (more server) VS high cost (the price of the servers)
Key principles


        Availability        - The uptime of a website
                            - Constantly available and resilient to
                            failure

     Performance         - The speed of website
                         - fast responses and low latency


                             - consistently
           Reliability
                             - return the same data
Key principles


   How much traffic can it handle -       Scalability
how easy is it to add more storage -


                      Easy to operate -
                                            Manageability
         Ease of diagnosing problems -



                       How much –           Cost
        h/w, s/w, deploy, maintain -
           build, operate, training -
What are the right pieces?



How these pieces fit together?



         What are the right tradeoffs?
Example. Image Hosting Application

 User can upload images can be requested via a web or API
other important aspects
- no limit to the number of images
- low latency for downloads
- if upload , it should be there
- easy to maintain
- cost-effective
let’s assume that this application has two key parts.
 - upload(write)
 - query(read)
Problem.
  Longer writes will impact the time it takes to read the images.
  (since they two functions will be competing the shared resources)
  Write function will almost always be slower than reads.

Problem.
  Web server has an upper limit on the number of connections.
  (read is ok. but write is bottleneck)
Service
 split out reads and writes of image into their own services.
Service-Oriented Architecture (SOA)
it allows each piece to scale independently of one another like OOP.




 all requests to upload and retrieve images are processed by the
 same server.
 break out these two functions into their own services.
this allows us to scale each of them independently
(always do more reading than writing)


Flickr solves this read/write issue by distributing users across
different shards such that each shard can handle a set number of
users.




   Problem.
      When failure happens
Redundancy




 backup, copies
 If one fails , the system can failover to the healthy copy.
shared-nothing architecture
 - each node is able to operate independently
 - there is no central brain
 - new nodes can be added without special condition or knowledge.
 - they are much more resilient to failure




      Problem.
         There may be very large data sets that are
     unable to fit on a single server.
Partitions
  two choices. scale vertically or horizontally
to scale vertically means adding more resources to an individual
server.
 - adding more hard drives on a single server




to scale horizontally means adding more nodes.
 - adding more servers
 - it should be included as an intrinsic design principle of the system
architecture.
 - to break up your services into partitions, or shards.
 - (by geographic boundaries or by non-playing user VS playing users)




        an image’s name could be formed from a consistent hashing
        scheme mapped across servers.
Problem.
   data locality
  to perform a costly fetch of the required information across the network.


Problem.
   inconsistency
  race condition could occur.
The Building Blocks of
        Fast and Scalable Data Access
  hard part : scaling access to the data.




as they grow, there are two main challenges
scaling access to the app server and to the database.

In a highly scalable application design, the app (or web) server is typically
minimized and often embodies a shared-nothing architecture. This makes the app
server layer of the system horizontally scalable. As a result of this design, the
heavy lifting is pushed down the stack to the database server and supporting
services; it's at this layer where the real scaling and performance challenges come
into play
let’s assume you have many terabytes of data and you want to
allow users to access small portions of that data at random.




 This is particularly challenging because it can be very costly to
 load TBs of data into memory. This directly translates to disk IO.
to make data access a lot faster.




         Caches
               Proxies

                      Indexes

                             Load balancers
Caches
Principle : Recently requested data is likely to be requested
again.

like short-term memory.

Caches can exist at all levels in architecture, but are often found
at the level nearest to the front end.

There are a couple of places you can insert a cache.
1. To insert a cache on your request layer node.




Each time a request is made to the service, the node will quickly
return local, cached data if it exists.
If it is not in cache, the request node will query the data from disk.
2. Request layer is expanded to multiple nodes.




 Problem.
    If your load balancer randomly distributes requests across the
nodes, the same request will go to different nodes, thus increasing
cache misses. Two choices for overcoming this hurdle.
3. Global caches
All the nodes use the same single cache space.
Each of the request nodes queries the cache in the same way it would
a local one.

It can get a bit complicated because it is very easy to overwhelm a
single cache as the number of clients and requests increase.

But it is very effective in some architectures

There are two common forms of global cache
(Who has responsible for retrieval?)
3’. Global caches
Tend to use the first type.


However, there are some cases where the second implementation
makes more sense.

If the cache is being used for very large files, a low cache hit
percentage would cause the cache buffer to become overwhelmed
with cache misses
4. Distributed caches.
Each of its nodes own part of the cached data.
If a refrigerator acts as a cache to the grocery store, a distributed
cache is like putting your food in several locations , convenient
locations for retrieving snacks from, without a trip to the store.

Typically the cache is divided up using a consistent hashing
function.



Advantages.
The increased cache space that can be had just by adding nodes to
the request pool.

Disadvantage.
Remedying a missing node. Some distributed caches get around this
by storing multiple copies of the data on different nodes.
however, you can imagine how this logic can get complicated
quickly, especially when you add or remove nodes from the request
layer.

Problem.
   what to do when the data is not in the cache
popular open source cache is Memcached
Proxies
to filter requests, log requests, or sometimes transform requests
Collapsed forwarding

One way to use a proxy to speed up data access is to collapse the
same (or similar) requests together into one request, and then
return the single result to the requesting clients.
There is some cost associated with this design, since each
request can have slightly higher latency, and some requests may
be slightly delayed to be grouped with similar ones.

But it will improve performance in high load
situations, particularly when that same data is requested over
and over
to collapse requests for data that is spatially close together in
the origin store (consecutively on disk)




 We can set up our proxy to recognize the spatial locality of the
 individual requests, collapsing them into a single request and
 returning only bigB, greatly minimizing the reads from the data
 origin
it is best to put the cache in front of the proxy, for the same reason
 that it is best to let the faster runners start first in a crowded
 marathon race.




Squid                   Varnish
Indexes
to find the correct physical location of the desired data




 In the case of data sets that are many TBs in size, but with very
small payloads (e.g., 1 KB), indexes are a necessity for optimizing
data access.
there are many layers of indexes that serve as a map, moving you
from one location to the next, and so forth, until you get the specific
piece of data you want.




  used to create several different views of the same data
Load balancers
to distribute load across a set of nodes responsible for servicing
requests




Their main purpose is to handle a lot of simultaneous connections
and route those connections to one of the request nodes, allowing
the system to scale to service more requests by just adding nodes
There are many different algorithms

Load balancers can be implemented as software or hardware
appliances.

Open source software load balancer is HAProxy
Multiple load balancers




Like proxies, some load balancers can also route a request differently
depending on the type of request it is.
(Technically these are also known as reverse proxies.)
Queues
      effective management of writes


data may have to be written several places on different servers or
indexes, or the system could just be under high load. In the cases
where writes, or any task for that matter, may take a long time,
achieving performance and availability requires building asynchrony
into the system.
When the server receives more requests than it can handle, then
 each client is forced to wait for the other clients' requests to
 complete before a response can be generated.




This kind of synchronous behavior can severely degrade client performance
Solving this problem effectively requires abstraction between the
client's request and the actual work performed to service it.




 When a client submits task requests to a queue they are no longer
 forced to wait for the results; instead they need only
 acknowledgement that the request was properly received.
There are quite a few open source queues
like RabbitMQ, ActiveMQ, BeanstalkD,
but some also use services like Zookeeper, or even data stores
like Redis.
Yes. It is barely scratching the surface. :)
Reference



                     The Architecture of Open Source Applications
                     http://www.aosabook.org/en/distsys.html


  https://www.lib.uwo.ca/blogs/education/phoneflickr.jpg
Thank you

More Related Content

What's hot

Intro to containerization
Intro to containerizationIntro to containerization
Intro to containerizationBalint Pato
 
DevOps Real-Time Projects | Edureka
DevOps Real-Time Projects | EdurekaDevOps Real-Time Projects | Edureka
DevOps Real-Time Projects | EdurekaEdureka!
 
API Design, A Quick Guide to REST, SOAP, gRPC, and GraphQL, By Vahid Rahimian
API Design, A Quick Guide to REST, SOAP, gRPC, and GraphQL, By Vahid RahimianAPI Design, A Quick Guide to REST, SOAP, gRPC, and GraphQL, By Vahid Rahimian
API Design, A Quick Guide to REST, SOAP, gRPC, and GraphQL, By Vahid RahimianVahid Rahimian
 
DevOps without DevOps Tools
DevOps without DevOps ToolsDevOps without DevOps Tools
DevOps without DevOps ToolsJagatveer Singh
 
Improve the Development Process with DevOps Practices by Fedorov Vadim
Improve the Development Process with DevOps Practices by Fedorov VadimImprove the Development Process with DevOps Practices by Fedorov Vadim
Improve the Development Process with DevOps Practices by Fedorov VadimSoftServe
 
Reactive Microservices with Quarkus
Reactive Microservices with QuarkusReactive Microservices with Quarkus
Reactive Microservices with QuarkusNiklas Heidloff
 
Cwin16 - Paris - mule soft
Cwin16 - Paris - mule softCwin16 - Paris - mule soft
Cwin16 - Paris - mule softCapgemini
 
Introduction to Docker - 2017
Introduction to Docker - 2017Introduction to Docker - 2017
Introduction to Docker - 2017Docker, Inc.
 
DevOps Tutorial For Beginners | DevOps Tutorial | DevOps Tools | DevOps Train...
DevOps Tutorial For Beginners | DevOps Tutorial | DevOps Tools | DevOps Train...DevOps Tutorial For Beginners | DevOps Tutorial | DevOps Tools | DevOps Train...
DevOps Tutorial For Beginners | DevOps Tutorial | DevOps Tools | DevOps Train...Simplilearn
 
What is DevOps | DevOps Introduction | DevOps Training | DevOps Tutorial | Ed...
What is DevOps | DevOps Introduction | DevOps Training | DevOps Tutorial | Ed...What is DevOps | DevOps Introduction | DevOps Training | DevOps Tutorial | Ed...
What is DevOps | DevOps Introduction | DevOps Training | DevOps Tutorial | Ed...Edureka!
 
Implementing SRE practices: SLI/SLO deep dive - David Blank-Edelman - DevOpsD...
Implementing SRE practices: SLI/SLO deep dive - David Blank-Edelman - DevOpsD...Implementing SRE practices: SLI/SLO deep dive - David Blank-Edelman - DevOpsD...
Implementing SRE practices: SLI/SLO deep dive - David Blank-Edelman - DevOpsD...DevOpsDays Tel Aviv
 
Angular based enterprise level frontend architecture
Angular based enterprise level frontend architectureAngular based enterprise level frontend architecture
Angular based enterprise level frontend architectureHimanshu Tamrakar
 
What is DevOps? | DevOps Introduction | DevOps Tools | DevOps Tutorial For Be...
What is DevOps? | DevOps Introduction | DevOps Tools | DevOps Tutorial For Be...What is DevOps? | DevOps Introduction | DevOps Tools | DevOps Tutorial For Be...
What is DevOps? | DevOps Introduction | DevOps Tools | DevOps Tutorial For Be...Simplilearn
 
Building a Scalable Architecture for web apps
Building a Scalable Architecture for web appsBuilding a Scalable Architecture for web apps
Building a Scalable Architecture for web appsDirecti Group
 
SRE 101 (Site Reliability Engineering)
SRE 101 (Site Reliability Engineering)SRE 101 (Site Reliability Engineering)
SRE 101 (Site Reliability Engineering)Hussain Mansoor
 
Mass Migrate Virtual Machines to Kubevirt with Tool Forklift 2.0
Mass Migrate Virtual Machines to Kubevirt with Tool Forklift 2.0Mass Migrate Virtual Machines to Kubevirt with Tool Forklift 2.0
Mass Migrate Virtual Machines to Kubevirt with Tool Forklift 2.0Konveyor Community
 
Microservice Architecture
Microservice ArchitectureMicroservice Architecture
Microservice Architecturetyrantbrian
 

What's hot (20)

Intro to containerization
Intro to containerizationIntro to containerization
Intro to containerization
 
DevOps Real-Time Projects | Edureka
DevOps Real-Time Projects | EdurekaDevOps Real-Time Projects | Edureka
DevOps Real-Time Projects | Edureka
 
API Design, A Quick Guide to REST, SOAP, gRPC, and GraphQL, By Vahid Rahimian
API Design, A Quick Guide to REST, SOAP, gRPC, and GraphQL, By Vahid RahimianAPI Design, A Quick Guide to REST, SOAP, gRPC, and GraphQL, By Vahid Rahimian
API Design, A Quick Guide to REST, SOAP, gRPC, and GraphQL, By Vahid Rahimian
 
Introduction to DevOps
Introduction to DevOpsIntroduction to DevOps
Introduction to DevOps
 
DevOps without DevOps Tools
DevOps without DevOps ToolsDevOps without DevOps Tools
DevOps without DevOps Tools
 
Improve the Development Process with DevOps Practices by Fedorov Vadim
Improve the Development Process with DevOps Practices by Fedorov VadimImprove the Development Process with DevOps Practices by Fedorov Vadim
Improve the Development Process with DevOps Practices by Fedorov Vadim
 
Reactive Microservices with Quarkus
Reactive Microservices with QuarkusReactive Microservices with Quarkus
Reactive Microservices with Quarkus
 
Cwin16 - Paris - mule soft
Cwin16 - Paris - mule softCwin16 - Paris - mule soft
Cwin16 - Paris - mule soft
 
Introduction to Docker - 2017
Introduction to Docker - 2017Introduction to Docker - 2017
Introduction to Docker - 2017
 
DevOps Tutorial For Beginners | DevOps Tutorial | DevOps Tools | DevOps Train...
DevOps Tutorial For Beginners | DevOps Tutorial | DevOps Tools | DevOps Train...DevOps Tutorial For Beginners | DevOps Tutorial | DevOps Tools | DevOps Train...
DevOps Tutorial For Beginners | DevOps Tutorial | DevOps Tools | DevOps Train...
 
What is DevOps | DevOps Introduction | DevOps Training | DevOps Tutorial | Ed...
What is DevOps | DevOps Introduction | DevOps Training | DevOps Tutorial | Ed...What is DevOps | DevOps Introduction | DevOps Training | DevOps Tutorial | Ed...
What is DevOps | DevOps Introduction | DevOps Training | DevOps Tutorial | Ed...
 
Architecture: Microservices
Architecture: MicroservicesArchitecture: Microservices
Architecture: Microservices
 
Implementing SRE practices: SLI/SLO deep dive - David Blank-Edelman - DevOpsD...
Implementing SRE practices: SLI/SLO deep dive - David Blank-Edelman - DevOpsD...Implementing SRE practices: SLI/SLO deep dive - David Blank-Edelman - DevOpsD...
Implementing SRE practices: SLI/SLO deep dive - David Blank-Edelman - DevOpsD...
 
Why Microservice
Why Microservice Why Microservice
Why Microservice
 
Angular based enterprise level frontend architecture
Angular based enterprise level frontend architectureAngular based enterprise level frontend architecture
Angular based enterprise level frontend architecture
 
What is DevOps? | DevOps Introduction | DevOps Tools | DevOps Tutorial For Be...
What is DevOps? | DevOps Introduction | DevOps Tools | DevOps Tutorial For Be...What is DevOps? | DevOps Introduction | DevOps Tools | DevOps Tutorial For Be...
What is DevOps? | DevOps Introduction | DevOps Tools | DevOps Tutorial For Be...
 
Building a Scalable Architecture for web apps
Building a Scalable Architecture for web appsBuilding a Scalable Architecture for web apps
Building a Scalable Architecture for web apps
 
SRE 101 (Site Reliability Engineering)
SRE 101 (Site Reliability Engineering)SRE 101 (Site Reliability Engineering)
SRE 101 (Site Reliability Engineering)
 
Mass Migrate Virtual Machines to Kubevirt with Tool Forklift 2.0
Mass Migrate Virtual Machines to Kubevirt with Tool Forklift 2.0Mass Migrate Virtual Machines to Kubevirt with Tool Forklift 2.0
Mass Migrate Virtual Machines to Kubevirt with Tool Forklift 2.0
 
Microservice Architecture
Microservice ArchitectureMicroservice Architecture
Microservice Architecture
 

Similar to Scalable Web Architecture and Distributed Systems

Enhanced Dynamic Web Caching: For Scalability & Metadata Management
Enhanced Dynamic Web Caching: For Scalability & Metadata ManagementEnhanced Dynamic Web Caching: For Scalability & Metadata Management
Enhanced Dynamic Web Caching: For Scalability & Metadata ManagementDeepak Bagga
 
What is Scalability and How can affect on overall system performance of database
What is Scalability and How can affect on overall system performance of databaseWhat is Scalability and How can affect on overall system performance of database
What is Scalability and How can affect on overall system performance of databaseAlireza Kamrani
 
Top System Design Interview Questions
Top System Design Interview QuestionsTop System Design Interview Questions
Top System Design Interview QuestionsSoniaMathias2
 
EOUG95 - Client Server Very Large Databases - Paper
EOUG95 - Client Server Very Large Databases - PaperEOUG95 - Client Server Very Large Databases - Paper
EOUG95 - Client Server Very Large Databases - PaperDavid Walker
 
Synopsis cloud scalability_jatinchauhan
Synopsis cloud scalability_jatinchauhanSynopsis cloud scalability_jatinchauhan
Synopsis cloud scalability_jatinchauhanJatin Chauhan
 
Silicon India Java Conference: Building Scalable Solutions For Commerce Silic...
Silicon India Java Conference: Building Scalable Solutions For Commerce Silic...Silicon India Java Conference: Building Scalable Solutions For Commerce Silic...
Silicon India Java Conference: Building Scalable Solutions For Commerce Silic...Kalaiselvan (Selvan)
 
Queues, Pools and Caches - Paper
Queues, Pools and Caches - PaperQueues, Pools and Caches - Paper
Queues, Pools and Caches - PaperGwen (Chen) Shapira
 
Nosql-Module 1 PPT.pptx
Nosql-Module 1 PPT.pptxNosql-Module 1 PPT.pptx
Nosql-Module 1 PPT.pptxRadhika R
 
Handling Data in Mega Scale Systems
Handling Data in Mega Scale SystemsHandling Data in Mega Scale Systems
Handling Data in Mega Scale SystemsDirecti Group
 
Multi Tenancy In The Cloud
Multi Tenancy In The CloudMulti Tenancy In The Cloud
Multi Tenancy In The Cloudrohit_ainapure
 
Scale from zero to millions of users.pdf
Scale from zero to millions of users.pdfScale from zero to millions of users.pdf
Scale from zero to millions of users.pdfNedyalkoKarabadzhako
 
Data management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunitiesData management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunitiesEditor Jacotech
 
[IC Manage] Workspace Acceleration & Network Storage Reduction
[IC Manage] Workspace Acceleration & Network Storage Reduction[IC Manage] Workspace Acceleration & Network Storage Reduction
[IC Manage] Workspace Acceleration & Network Storage ReductionPerforce
 
Web20expo Scalable Web Arch
Web20expo Scalable Web ArchWeb20expo Scalable Web Arch
Web20expo Scalable Web Archroyans
 

Similar to Scalable Web Architecture and Distributed Systems (20)

Enhanced Dynamic Web Caching: For Scalability & Metadata Management
Enhanced Dynamic Web Caching: For Scalability & Metadata ManagementEnhanced Dynamic Web Caching: For Scalability & Metadata Management
Enhanced Dynamic Web Caching: For Scalability & Metadata Management
 
What is Scalability and How can affect on overall system performance of database
What is Scalability and How can affect on overall system performance of databaseWhat is Scalability and How can affect on overall system performance of database
What is Scalability and How can affect on overall system performance of database
 
Top System Design Interview Questions
Top System Design Interview QuestionsTop System Design Interview Questions
Top System Design Interview Questions
 
Document 22.pdf
Document 22.pdfDocument 22.pdf
Document 22.pdf
 
EOUG95 - Client Server Very Large Databases - Paper
EOUG95 - Client Server Very Large Databases - PaperEOUG95 - Client Server Very Large Databases - Paper
EOUG95 - Client Server Very Large Databases - Paper
 
Synopsis cloud scalability_jatinchauhan
Synopsis cloud scalability_jatinchauhanSynopsis cloud scalability_jatinchauhan
Synopsis cloud scalability_jatinchauhan
 
Queues, Pools and Caches paper
Queues, Pools and Caches paperQueues, Pools and Caches paper
Queues, Pools and Caches paper
 
Silicon India Java Conference: Building Scalable Solutions For Commerce Silic...
Silicon India Java Conference: Building Scalable Solutions For Commerce Silic...Silicon India Java Conference: Building Scalable Solutions For Commerce Silic...
Silicon India Java Conference: Building Scalable Solutions For Commerce Silic...
 
Queues, Pools and Caches - Paper
Queues, Pools and Caches - PaperQueues, Pools and Caches - Paper
Queues, Pools and Caches - Paper
 
Nosql-Module 1 PPT.pptx
Nosql-Module 1 PPT.pptxNosql-Module 1 PPT.pptx
Nosql-Module 1 PPT.pptx
 
No sql exploration keyvaluestore
No sql exploration   keyvaluestoreNo sql exploration   keyvaluestore
No sql exploration keyvaluestore
 
Handling Data in Mega Scale Systems
Handling Data in Mega Scale SystemsHandling Data in Mega Scale Systems
Handling Data in Mega Scale Systems
 
Multi Tenancy In The Cloud
Multi Tenancy In The CloudMulti Tenancy In The Cloud
Multi Tenancy In The Cloud
 
My Dissertation 2016
My Dissertation 2016My Dissertation 2016
My Dissertation 2016
 
Database System Architectures
Database System ArchitecturesDatabase System Architectures
Database System Architectures
 
MongoDB
MongoDBMongoDB
MongoDB
 
Scale from zero to millions of users.pdf
Scale from zero to millions of users.pdfScale from zero to millions of users.pdf
Scale from zero to millions of users.pdf
 
Data management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunitiesData management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunities
 
[IC Manage] Workspace Acceleration & Network Storage Reduction
[IC Manage] Workspace Acceleration & Network Storage Reduction[IC Manage] Workspace Acceleration & Network Storage Reduction
[IC Manage] Workspace Acceleration & Network Storage Reduction
 
Web20expo Scalable Web Arch
Web20expo Scalable Web ArchWeb20expo Scalable Web Arch
Web20expo Scalable Web Arch
 

More from hyun soomyung

Dependency Breaking Techniques
Dependency Breaking TechniquesDependency Breaking Techniques
Dependency Breaking Techniqueshyun soomyung
 
아꿈사 매니저소개
아꿈사 매니저소개아꿈사 매니저소개
아꿈사 매니저소개hyun soomyung
 
HTML5 & CSS3 - Video,Audio
HTML5 & CSS3 - Video,AudioHTML5 & CSS3 - Video,Audio
HTML5 & CSS3 - Video,Audiohyun soomyung
 
Domain Driven Design
Domain Driven DesignDomain Driven Design
Domain Driven Designhyun soomyung
 
The Art of Computer Programming 2.4 다중연결구조
The Art of Computer Programming 2.4 다중연결구조The Art of Computer Programming 2.4 다중연결구조
The Art of Computer Programming 2.4 다중연결구조hyun soomyung
 
The Art of Computer Programming 2.3.2 Tree
The Art of Computer Programming 2.3.2 TreeThe Art of Computer Programming 2.3.2 Tree
The Art of Computer Programming 2.3.2 Treehyun soomyung
 
Design Pattern - Multithread Ch10
Design Pattern - Multithread Ch10Design Pattern - Multithread Ch10
Design Pattern - Multithread Ch10hyun soomyung
 
The Art of Computer Programming 1.3.2 MIXAL
The Art of Computer Programming 1.3.2 MIXALThe Art of Computer Programming 1.3.2 MIXAL
The Art of Computer Programming 1.3.2 MIXALhyun soomyung
 
The Art of Computer Programming 1.2.5
The Art of Computer Programming 1.2.5The Art of Computer Programming 1.2.5
The Art of Computer Programming 1.2.5hyun soomyung
 
스터디그룹 패턴 (A PATTERN LANGUAGE FOR STUDY GROUPS)
스터디그룹 패턴 (A PATTERN LANGUAGE FOR STUDY GROUPS)스터디그룹 패턴 (A PATTERN LANGUAGE FOR STUDY GROUPS)
스터디그룹 패턴 (A PATTERN LANGUAGE FOR STUDY GROUPS)hyun soomyung
 
프로그래머의 길,멘토에게 묻다 2장
프로그래머의 길,멘토에게 묻다 2장프로그래머의 길,멘토에게 묻다 2장
프로그래머의 길,멘토에게 묻다 2장hyun soomyung
 
[페차쿠차] 배움의 기술
[페차쿠차] 배움의 기술[페차쿠차] 배움의 기술
[페차쿠차] 배움의 기술hyun soomyung
 
실전 윈도우 디버깅. Ch3. 디버거 해부
실전 윈도우 디버깅. Ch3. 디버거 해부실전 윈도우 디버깅. Ch3. 디버거 해부
실전 윈도우 디버깅. Ch3. 디버거 해부hyun soomyung
 
xUnitTestPattern/chapter8
xUnitTestPattern/chapter8xUnitTestPattern/chapter8
xUnitTestPattern/chapter8hyun soomyung
 
예제로 보는 Pattern 연상법
예제로 보는 Pattern 연상법예제로 보는 Pattern 연상법
예제로 보는 Pattern 연상법hyun soomyung
 
프로그램은 왜 실패하는가?
프로그램은 왜 실패하는가?프로그램은 왜 실패하는가?
프로그램은 왜 실패하는가?hyun soomyung
 

More from hyun soomyung (20)

Dependency Breaking Techniques
Dependency Breaking TechniquesDependency Breaking Techniques
Dependency Breaking Techniques
 
아꿈사 매니저소개
아꿈사 매니저소개아꿈사 매니저소개
아꿈사 매니저소개
 
HTML5 & CSS3 - Video,Audio
HTML5 & CSS3 - Video,AudioHTML5 & CSS3 - Video,Audio
HTML5 & CSS3 - Video,Audio
 
Hybrid app
Hybrid appHybrid app
Hybrid app
 
Domain Driven Design
Domain Driven DesignDomain Driven Design
Domain Driven Design
 
MapReduce
MapReduceMapReduce
MapReduce
 
MongoDB
MongoDBMongoDB
MongoDB
 
The Art of Computer Programming 2.4 다중연결구조
The Art of Computer Programming 2.4 다중연결구조The Art of Computer Programming 2.4 다중연결구조
The Art of Computer Programming 2.4 다중연결구조
 
The Art of Computer Programming 2.3.2 Tree
The Art of Computer Programming 2.3.2 TreeThe Art of Computer Programming 2.3.2 Tree
The Art of Computer Programming 2.3.2 Tree
 
Design Pattern - Multithread Ch10
Design Pattern - Multithread Ch10Design Pattern - Multithread Ch10
Design Pattern - Multithread Ch10
 
The Art of Computer Programming 1.3.2 MIXAL
The Art of Computer Programming 1.3.2 MIXALThe Art of Computer Programming 1.3.2 MIXAL
The Art of Computer Programming 1.3.2 MIXAL
 
The Art of Computer Programming 1.2.5
The Art of Computer Programming 1.2.5The Art of Computer Programming 1.2.5
The Art of Computer Programming 1.2.5
 
스터디그룹 패턴 (A PATTERN LANGUAGE FOR STUDY GROUPS)
스터디그룹 패턴 (A PATTERN LANGUAGE FOR STUDY GROUPS)스터디그룹 패턴 (A PATTERN LANGUAGE FOR STUDY GROUPS)
스터디그룹 패턴 (A PATTERN LANGUAGE FOR STUDY GROUPS)
 
Clojure Chapter.6
Clojure Chapter.6Clojure Chapter.6
Clojure Chapter.6
 
프로그래머의 길,멘토에게 묻다 2장
프로그래머의 길,멘토에게 묻다 2장프로그래머의 길,멘토에게 묻다 2장
프로그래머의 길,멘토에게 묻다 2장
 
[페차쿠차] 배움의 기술
[페차쿠차] 배움의 기술[페차쿠차] 배움의 기술
[페차쿠차] 배움의 기술
 
실전 윈도우 디버깅. Ch3. 디버거 해부
실전 윈도우 디버깅. Ch3. 디버거 해부실전 윈도우 디버깅. Ch3. 디버거 해부
실전 윈도우 디버깅. Ch3. 디버거 해부
 
xUnitTestPattern/chapter8
xUnitTestPattern/chapter8xUnitTestPattern/chapter8
xUnitTestPattern/chapter8
 
예제로 보는 Pattern 연상법
예제로 보는 Pattern 연상법예제로 보는 Pattern 연상법
예제로 보는 Pattern 연상법
 
프로그램은 왜 실패하는가?
프로그램은 왜 실패하는가?프로그램은 왜 실패하는가?
프로그램은 왜 실패하는가?
 

Recently uploaded

The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 

Recently uploaded (20)

The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 

Scalable Web Architecture and Distributed Systems

  • 1. Scalable Web Architecture and Distributed Systems The Architecture of Open Source Applications II Ver 1.0 아키텍트를 꿈꾸는 사람들 cafe.naver.com/architect1 현수명 soomong.net #soomong
  • 2. Scalable? at a primitive level it's just connecting users with remote resources via the internet - the part that makes it scalable is that the resources, or access to those resources, are distributed across multiple servers.
  • 3. Key principles basis for decisions in designing web architecture. Availability Scalability Performance Manageability Reliability Cost it can be at odds with one another. high Scalability (more server) VS low manageability (you have to operate an additional server) high Scalability (more server) VS high cost (the price of the servers)
  • 4. Key principles Availability - The uptime of a website - Constantly available and resilient to failure Performance - The speed of website - fast responses and low latency - consistently Reliability - return the same data
  • 5. Key principles How much traffic can it handle - Scalability how easy is it to add more storage - Easy to operate - Manageability Ease of diagnosing problems - How much – Cost h/w, s/w, deploy, maintain - build, operate, training -
  • 6. What are the right pieces? How these pieces fit together? What are the right tradeoffs?
  • 7. Example. Image Hosting Application User can upload images can be requested via a web or API
  • 8. other important aspects - no limit to the number of images - low latency for downloads - if upload , it should be there - easy to maintain - cost-effective
  • 9. let’s assume that this application has two key parts. - upload(write) - query(read)
  • 10. Problem. Longer writes will impact the time it takes to read the images. (since they two functions will be competing the shared resources) Write function will almost always be slower than reads. Problem. Web server has an upper limit on the number of connections. (read is ok. but write is bottleneck)
  • 11. Service split out reads and writes of image into their own services.
  • 12. Service-Oriented Architecture (SOA) it allows each piece to scale independently of one another like OOP. all requests to upload and retrieve images are processed by the same server. break out these two functions into their own services.
  • 13. this allows us to scale each of them independently (always do more reading than writing) Flickr solves this read/write issue by distributing users across different shards such that each shard can handle a set number of users. Problem. When failure happens
  • 14. Redundancy backup, copies If one fails , the system can failover to the healthy copy.
  • 15. shared-nothing architecture - each node is able to operate independently - there is no central brain - new nodes can be added without special condition or knowledge. - they are much more resilient to failure Problem. There may be very large data sets that are unable to fit on a single server.
  • 16. Partitions two choices. scale vertically or horizontally
  • 17. to scale vertically means adding more resources to an individual server. - adding more hard drives on a single server to scale horizontally means adding more nodes. - adding more servers - it should be included as an intrinsic design principle of the system architecture. - to break up your services into partitions, or shards. - (by geographic boundaries or by non-playing user VS playing users) an image’s name could be formed from a consistent hashing scheme mapped across servers.
  • 18. Problem. data locality to perform a costly fetch of the required information across the network. Problem. inconsistency race condition could occur.
  • 19. The Building Blocks of Fast and Scalable Data Access hard part : scaling access to the data. as they grow, there are two main challenges scaling access to the app server and to the database. In a highly scalable application design, the app (or web) server is typically minimized and often embodies a shared-nothing architecture. This makes the app server layer of the system horizontally scalable. As a result of this design, the heavy lifting is pushed down the stack to the database server and supporting services; it's at this layer where the real scaling and performance challenges come into play
  • 20. let’s assume you have many terabytes of data and you want to allow users to access small portions of that data at random. This is particularly challenging because it can be very costly to load TBs of data into memory. This directly translates to disk IO.
  • 21. to make data access a lot faster. Caches Proxies Indexes Load balancers
  • 22. Caches Principle : Recently requested data is likely to be requested again. like short-term memory. Caches can exist at all levels in architecture, but are often found at the level nearest to the front end. There are a couple of places you can insert a cache.
  • 23. 1. To insert a cache on your request layer node. Each time a request is made to the service, the node will quickly return local, cached data if it exists. If it is not in cache, the request node will query the data from disk.
  • 24. 2. Request layer is expanded to multiple nodes. Problem. If your load balancer randomly distributes requests across the nodes, the same request will go to different nodes, thus increasing cache misses. Two choices for overcoming this hurdle.
  • 25. 3. Global caches All the nodes use the same single cache space.
  • 26. Each of the request nodes queries the cache in the same way it would a local one. It can get a bit complicated because it is very easy to overwhelm a single cache as the number of clients and requests increase. But it is very effective in some architectures There are two common forms of global cache (Who has responsible for retrieval?)
  • 28. Tend to use the first type. However, there are some cases where the second implementation makes more sense. If the cache is being used for very large files, a low cache hit percentage would cause the cache buffer to become overwhelmed with cache misses
  • 29. 4. Distributed caches. Each of its nodes own part of the cached data.
  • 30. If a refrigerator acts as a cache to the grocery store, a distributed cache is like putting your food in several locations , convenient locations for retrieving snacks from, without a trip to the store. Typically the cache is divided up using a consistent hashing function. Advantages. The increased cache space that can be had just by adding nodes to the request pool. Disadvantage. Remedying a missing node. Some distributed caches get around this by storing multiple copies of the data on different nodes. however, you can imagine how this logic can get complicated quickly, especially when you add or remove nodes from the request layer. Problem. what to do when the data is not in the cache
  • 31. popular open source cache is Memcached
  • 32. Proxies to filter requests, log requests, or sometimes transform requests
  • 33. Collapsed forwarding One way to use a proxy to speed up data access is to collapse the same (or similar) requests together into one request, and then return the single result to the requesting clients.
  • 34. There is some cost associated with this design, since each request can have slightly higher latency, and some requests may be slightly delayed to be grouped with similar ones. But it will improve performance in high load situations, particularly when that same data is requested over and over
  • 35. to collapse requests for data that is spatially close together in the origin store (consecutively on disk) We can set up our proxy to recognize the spatial locality of the individual requests, collapsing them into a single request and returning only bigB, greatly minimizing the reads from the data origin
  • 36. it is best to put the cache in front of the proxy, for the same reason that it is best to let the faster runners start first in a crowded marathon race. Squid Varnish
  • 37. Indexes to find the correct physical location of the desired data In the case of data sets that are many TBs in size, but with very small payloads (e.g., 1 KB), indexes are a necessity for optimizing data access.
  • 38. there are many layers of indexes that serve as a map, moving you from one location to the next, and so forth, until you get the specific piece of data you want. used to create several different views of the same data
  • 39. Load balancers to distribute load across a set of nodes responsible for servicing requests Their main purpose is to handle a lot of simultaneous connections and route those connections to one of the request nodes, allowing the system to scale to service more requests by just adding nodes
  • 40. There are many different algorithms Load balancers can be implemented as software or hardware appliances. Open source software load balancer is HAProxy
  • 41. Multiple load balancers Like proxies, some load balancers can also route a request differently depending on the type of request it is. (Technically these are also known as reverse proxies.)
  • 42. Queues effective management of writes data may have to be written several places on different servers or indexes, or the system could just be under high load. In the cases where writes, or any task for that matter, may take a long time, achieving performance and availability requires building asynchrony into the system.
  • 43. When the server receives more requests than it can handle, then each client is forced to wait for the other clients' requests to complete before a response can be generated. This kind of synchronous behavior can severely degrade client performance
  • 44. Solving this problem effectively requires abstraction between the client's request and the actual work performed to service it. When a client submits task requests to a queue they are no longer forced to wait for the results; instead they need only acknowledgement that the request was properly received.
  • 45. There are quite a few open source queues like RabbitMQ, ActiveMQ, BeanstalkD, but some also use services like Zookeeper, or even data stores like Redis.
  • 46. Yes. It is barely scratching the surface. :)
  • 47. Reference The Architecture of Open Source Applications http://www.aosabook.org/en/distsys.html https://www.lib.uwo.ca/blogs/education/phoneflickr.jpg