Distributed Systems

Introduction to Distributed Systems
 What is a distributed system?
 A distributed system is a collection of independent computers that appear to
the users of the system as a single system.

Advantages of Distributed Systems
 Data sharing: allow many users to access a common
database
 Resource sharing: share expensive peripherals such as color
printers
 Communication: enhance human-to-human
communication, e.g., email, chat
 Flexibility: spread the workload over the available
machines
 Reliability: if one machine crashes, the system as a
whole can still survive, giving higher availability and
improved reliability.

Disadvantages of Distributed Systems
 Software: it is difficult to develop software for distributed
systems
 Network: transmissions can be lost
 Security: easy access also applies to secret data

Hardware Concepts
 MIMD (Multiple-Instruction Multiple-Data)
 Tightly Coupled versus Loosely Coupled
 Tightly coupled systems (multiprocessors)
o shared memory
o intermachine delay short, data rate high
 Loosely coupled systems (multicomputers)
o private memory
o intermachine delay long, data rate low

Bus versus Switched MIMD
• Bus: a single network, backplane, bus, cable or other
medium that connects all machines. E.g., cable TV
• Switched: individual wires from machine to machine,
with many different wiring patterns in use.
 Multiprocessors (shared memory)
 Bus
 Switched
 Multicomputers (private memory)
 Bus
 Switched

Switched Multiprocessors
 Switched Multiprocessors
 for connecting a large number (say, over 64) of processors
 crossbar switch: n**2 switch points
 omega network: 2x2 switches for n CPUs and n
memories, log2 n switching stages, each with n/2
switches
 total (n log2 n)/2 switches
 delay problem: e.g., with n=1024 there are 10 switching
stages from CPU to memory, so a request and its reply cross
a total of 20 switching stages. On a 100 MIPS CPU (10 nsec
per instruction), completing a memory access within one
instruction time needs a 0.5 nsec switching time per stage.
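The switch counts and the delay figure above can be checked with a few lines of arithmetic. This is a minimal sketch, not part of the original slides: the function names are made up for illustration, and it assumes n is a power of two and that a memory access must cross the network twice (request and reply) within one instruction time.

```python
def crossbar_switch_points(n: int) -> int:
    """Switch points in a crossbar connecting n CPUs to n memories."""
    return n * n


def omega_network(n: int) -> dict:
    """Stage and switch counts for an omega network built from 2x2 switches.

    Assumes n is a power of two (hypothetical helper for illustration).
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two >= 2")
    stages = n.bit_length() - 1          # log2(n) for a power of two
    per_stage = n // 2                   # n/2 switches in each stage
    return {
        "stages": stages,
        "switches_per_stage": per_stage,
        "total_switches": stages * per_stage,
    }


def required_switch_time_ns(n: int, mips: float) -> float:
    """Per-stage switching time so a request plus reply fits in one instruction."""
    instruction_ns = 1000.0 / mips                    # 100 MIPS -> 10 ns
    round_trip_stages = 2 * omega_network(n)["stages"]
    return instruction_ns / round_trip_stages


if __name__ == "__main__":
    n = 1024
    print(crossbar_switch_points(n))            # 1048576
    print(omega_network(n)["total_switches"])   # 5120
    print(required_switch_time_ns(n, 100.0))    # 0.5
```

For n=1024 the omega network needs far fewer switches than the crossbar, but the 0.5 nsec per-stage budget shows why the delay becomes the limiting factor.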

Cont…
 NUMA (Non-Uniform Memory Access): some memory is faster
to reach than other memory, so the placement of programs
and data matters
 building a large, tightly-coupled, shared memory
multiprocessor is possible, but is difficult and expensive

Multicomputers
 Bus-Based Multicomputers
 easy to build
 communication volume much smaller than in a
multiprocessor (CPU-to-CPU traffic only, no CPU-to-memory
traffic)
 relatively slow speed LAN (10-100 Mbps, compared to
300 Mbps and up for a backplane bus)
 Switched Multicomputers
 interconnection networks: E.g., grid, hypercube
 hypercube: n-dimensional cube
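As a small illustration of the hypercube topology, the sketch below numbers the 2**n nodes of an n-dimensional cube in binary and connects two nodes when their numbers differ in exactly one bit. The node numbering and function names are assumptions made for this example, not something the slides specify.

```python
from typing import List


def hypercube_neighbors(node: int, dim: int) -> List[int]:
    """Return the neighbors of `node` in a `dim`-dimensional hypercube.

    Nodes are numbered 0 .. 2**dim - 1; flipping one bit gives one neighbor.
    """
    if not 0 <= node < (1 << dim):
        raise ValueError("node id out of range for this dimension")
    return [node ^ (1 << bit) for bit in range(dim)]


def hop_distance(a: int, b: int) -> int:
    """Minimum number of links between two nodes: the count of differing bits."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    print(hypercube_neighbors(0, 3))   # [1, 2, 4]
    print(hop_distance(0, 7))          # 3
```

Each node has n links, and the longest path between any two nodes is n hops, so both wiring per node and worst-case delay grow only logarithmically with the number of machines.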

Software Concepts
• Software matters more to users than hardware
• Three types:
1. Network Operating Systems
2. (True) Distributed Systems
3. Multiprocessor Time Sharing

(True) Distributed Systems
 tightly-coupled software on loosely-coupled hardware
 provide a single-system image or a virtual uniprocessor
 a single, global interprocess communication
mechanism, process management, file system; the same
system call interface everywhere
 Ideal definition:
“A distributed system runs on a collection of computers that do
not have shared memory, yet looks like a single computer to
its users.”

Multiprocessor Operating Systems
 Tightly-coupled software on tightly-coupled hardware
 Examples: high-performance servers
 shared memory
 single run queue
 traditional file system as on a single-processor system:
central block cache

Design Issues of Distributed Systems
• Transparency
• Flexibility
• Reliability
• Performance
• Scalability

Transparency
• How to achieve the single-system image, i.e., how to
make a collection of computers appear as a single
computer.
• Hiding all the distribution from the users as well as
the application programs can be achieved at two
levels:
1) hide the distribution from users
2) at a lower level, make the system look transparent to
programs.
Both 1) and 2) require uniform interfaces, e.g., for access to
files and for communication.

Flexibility
• Make it easier to change
• Monolithic kernel: system calls trap to the kernel, which
serves all of them itself, e.g., UNIX.
• Microkernel: provides minimal services.
1) IPC
2) some memory management
3) some low-level process management and scheduling
4) low-level i/o
E.g., Mach can support multiple file systems and multiple
system interfaces.

Reliability
• A distributed system should be more reliable than a single
system. Example: 3 machines, each up with probability .95.
If failures are independent, the probability that at least one
machine is up is 1-.05**3 = .999875.
– Availability: fraction of time the system is usable.
Redundancy improves it.
– Need to maintain consistency
– Need to be secure
– Fault tolerance: need to mask failures, recover from
errors.
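The three-machine example above is a one-line calculation. The sketch below generalizes it under the assumption that machines fail independently and that the system counts as "up" when at least one machine is up; the function name is hypothetical.

```python
def prob_at_least_one_up(p_up: float, machines: int) -> float:
    """Probability that at least one of `machines` independent machines is up."""
    if not 0.0 <= p_up <= 1.0 or machines < 1:
        raise ValueError("need 0 <= p_up <= 1 and machines >= 1")
    p_all_down = (1.0 - p_up) ** machines   # every machine down at once
    return 1.0 - p_all_down


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, f"{prob_at_least_one_up(0.95, k):.6f}")
    # 1 0.950000
    # 2 0.997500
    # 3 0.999875
```

Real failures are often correlated (shared power, shared network), so these figures are an upper bound on what redundancy alone can deliver.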

Performance
• Without a gain here, why bother with distributed
systems?
• Performance loss due to communication delays:
– fine-grain parallelism: high degree of interaction, so
communication overhead dominates
– coarse-grain parallelism: long computations with little
interaction, a better fit
• Performance loss due to making the system fault
tolerant.

Scalability
• Systems grow with time or become obsolete. Techniques
whose resource needs grow linearly with the size of the
system are not scalable, e.g., a broadcast-based query won't
work for large distributed systems.
• Examples of bottlenecks
o Centralized components: a single mail server
o Centralized tables: a single URL address book
o Centralized algorithms: routing based on complete
information
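To see why a broadcast-based query scales poorly, it helps to count messages. The sketch below is a rough model with assumed parameters, not a measurement: every query is sent to all other nodes, and every node issues the same number of queries.

```python
def broadcast_messages(nodes: int, queries_per_node: int = 1) -> int:
    """Total messages when each query is broadcast to every other node.

    One query costs (nodes - 1) messages, so the cost per query grows
    linearly with system size and the total grows roughly quadratically.
    """
    if nodes < 1 or queries_per_node < 0:
        raise ValueError("nodes must be >= 1 and queries_per_node >= 0")
    return nodes * queries_per_node * (nodes - 1)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, broadcast_messages(n))
    # 10 90
    # 100 9900
    # 1000 999000
```

Growing the system by a factor of 100 multiplies the traffic by roughly 10,000 in this model, which is the sense in which linear per-operation cost fails to scale.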
