Research Issues in P2P Computing

How to Run Applications Faster?
• There are 3 ways to improve performance:
   – Work harder
   – Work smarter
   – Get help
• Computer analogy:
   – Work harder: faster hardware, i.e. high-performance processors or peripheral devices
   – Work smarter: optimized algorithms and techniques used to solve computational tasks
   – Get help: multiple computers to solve a particular task




OUTLINE
• Centralized vs. Distributed
• What is P2P?
• P2P Architectures
• P2P and Applications
• Search and Replication Techniques
• P2P Security
• Emerging P2P Applications
• Conclusion

Distributed….
• When a handful of powerful computers are linked together and communicate with each other:
   – The overall computing power available can be amazingly vast.
   – Such a system can have a higher performance share than a single supercomputer.
   – The objective of such systems is to minimize communication and computation cost.




Centralized?
• Computation in networks of processing nodes can be classified into centralized or distributed computations.
• A centralized solution relies on one node being designated as the computer node that processes the entire application locally.
• The central system is shared by all the users all the time.
• There is a single point of control and a single point of failure.

• A distributed system is an application that executes a collection of protocols to coordinate the actions of multiple processes on a communication network, such that all components cooperate to perform a single task or a small set of related tasks.




• The collaborating computers can access remote resources as well as local resources in the distributed system via the communication network.
• The existence of multiple autonomous computers is transparent to the user in a distributed system.
   – The user is not aware that the jobs are executed by multiple computers in remote locations.
   – A centralized algorithm is at the heart of a single computer.
   – A distributed algorithm is at the heart of a society of computers.

Examples of Distributed Systems
• The Internet
   – Heterogeneous network of computers and applications
   – Implemented through the Internet Protocol Stack




Computer Networks vs. Distributed Systems
• Computer Network: the autonomous computers are explicitly visible
• Distributed System: existence of multiple autonomous computers is transparent
• Many problems in common
• Normally, every distributed system relies on services provided by a computer network.

Distributed….
• Distributed systems are built on top of existing networking and operating systems software.
• The middleware enables computers to coordinate their activities and to share the resources of the system.
   – Middleware is the bridge that connects distributed applications across dissimilar physical locations, with dissimilar hardware platforms, network technologies, operating systems, and programming languages.
• Middleware provides standard services such as naming, concurrency control, event distribution, authorization to specify access rights to resources, security, etc.




Computing Platforms Evolution: Breaking Administrative Barriers
[Figure: PERFORMANCE (vertical axis) against computing platform (horizontal axis): Desktop (Single Processor), SMPs or SuperComputers, Local Cluster, Enterprise Cluster/Grid, Global Cluster/Grid, Inter Planet Cluster/Grid ??. Administrative barriers: Individual, Group, Department, Campus, State, National, Globe, Inter Planet, Universe.]

Foster-Kesselman
• The Foster-Kesselman duo organized a workshop entitled “Building a Computational Grid” at Argonne National Laboratory in 1997.
• At this moment the term “Grid” was born.
• The workshop was followed in 1998 by the publication of the book “The Grid: Blueprint for a New Computing Infrastructure” by Foster and Kesselman themselves.
• For these reasons they are not only considered the fathers of the Grid, but their book, which in the meantime was almost entirely rewritten and republished in 2003, is also considered the “Grid bible”.

Ian Foster
Mathematics and Computer Science Division
Argonne National Laboratory
Argonne, IL 60439

Carl Kesselman
Information Sciences Institute
University of Southern California
Marina del Rey, CA 90292




The Need for Collaboration?
• Worldwide business demands intense problem-solving capabilities for incredibly complex problems
   – This creates the need for dynamic collaboration among many computing resources working together.
• Achieving this level of resource collaboration within the bounds of the end user's quality requirements is a difficult challenge across all the technical communities.

Electric Grid and Grid Computing
• Computing grids are conceptually not unlike electrical grids.
• Electric power grid: a variety of resources contribute power into a shared "pool" for many consumers to access on an as-needed basis.
   – In an electrical grid, wall outlets allow us to link to an infrastructure of resources that generate, distribute, and bill for electricity.
   – When you connect to the electrical grid, you don't need to know where the power plant is or how the current gets to you.
• Grid computing uses middleware to coordinate disparate IT resources across a network, allowing them to function as a virtual whole.
   – The goal of a computing grid, like that of the electrical grid, is to provide users with access to the resources they need, when they need them.




Why Grids?
Large Scale Exploration needs them

Solving technology problems using computer modeling, simulation and analysis
[Figure: application areas: Life Sciences, Aerospace, Geographic Information Systems, CAD/CAM, Military Applications]




CERN’s Large Hadron Collider
1800 Physicists, 150 Institutes, 32 Countries
100 PB of data by 2010; 50,000 CPUs?

The Large Hadron Collider (LHC)
   A gigantic scientific instrument near Geneva
   It is a particle accelerator used by physicists to study the smallest known particles, the fundamental building blocks of all things.

Client-Server Model
The most widely used
[Figure: clients send an invocation to a server and receive a result; a server may in turn send an invocation to another server and receive a result. Key: Process, Computer.]

Client-Server
[Figure: a Source, Routers and “Interested” End-hosts.]

Why P2P?
[Figure: a Source, Routers and “Interested” End-hosts.]




Client-Server
[Figure: a Source, Routers and “Interested” End-hosts; the figure is marked “Overloaded!”.]

Why P2P?
• Personal computers: 80% idle CPU time
• Laptop: 90% idle CPU time
• Computers in our lab: 99% idle CPU time
[Figure: these machines are connected to the Internet.]
Hot Spots become hotter!




Problem with Client-Server Model
   – Scalability
      • As the number of users increases, there is a higher demand for computing power, storage space, and bandwidth on the server side.
   – Reliability
      • The whole network depends on the highly loaded server to function properly.

What is driving P2P?
• Clients are not so dumb.
• Billions of MHz of CPU, tons of terabytes of disk, millions of gigabits of network bandwidth, …
   – Unused resources.




Computer System Taxonomy
[Figure: taxonomy tree]
Computer Systems
   – Centralized Systems (mainframes, SMPs)
   – Distributed Systems
      • Client-server
      • Peer-to-Peer

P2P – An overlay network
• P2P overlay network
   – The connected nodes construct a virtual overlay network on top of the underlying network infrastructure
   – Peer-to-peer network topology is a virtual overlay at application layer
[Figure: peers labeled A to G, drawn in the overlay and in the underlying network.]
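
The overlay idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the routers `R1` and `R2`, the links in both graphs, and the helper `hops` do not come from the slides. It only shows that a single logical link between two peers in the overlay can correspond to several hops in the underlying network.

```python
from collections import deque

# Hypothetical physical network: hosts A, B, C reach each other via routers R1, R2.
underlay = {
    "A": ["R1"],
    "B": ["R1"],
    "C": ["R2"],
    "R1": ["A", "B", "R2"],
    "R2": ["R1", "C"],
}

# Hypothetical overlay: peers are neighbors at the application layer only.
overlay = {
    "A": ["B", "C"],
    "B": ["A"],
    "C": ["A"],
}

def hops(graph, src, dst):
    """Return the number of links on a shortest path from src to dst (BFS), or None."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

overlay_hops = hops(overlay, "A", "C")    # A and C are direct overlay neighbors
underlay_hops = hops(underlay, "A", "C")  # physical path A, R1, R2, C
print(overlay_hops, underlay_hops)
```

In this toy topology the peers A and C are one overlay hop apart, while the packets that carry their messages cross three physical links.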

[Figure: Client/server model: Clients, one of them through a Cache Proxy, reach the servers across the Internet; the area around the servers is marked “Congestion zone”. Peer-to-peer model: nodes labeled Client/Server surround the servers and the congestion zone.]

Typical Characteristics
• Large Scale: lots of nodes (up to millions)
• Dynamicity: frequent joins, leaves, failures
• Little or no infrastructure
   – No central server
• Symmetry: all nodes are “peers” and have the same role




What is it...
• P2P computing is the sharing of computer resources and services by direct exchange between systems.
• These resources and services include the exchange of information, processing cycles, cache storage, and disk storage for files.
• P2P computing takes advantage of existing desktop computing power and networking connectivity,
   – allowing economical clients to leverage their collective power to benefit the entire enterprise.
• In a P2P architecture, computers that have traditionally been used solely as clients communicate directly among themselves and can act as both clients and servers, assuming whatever role is most efficient for the network.
• Each node (peer), called a servent, acts as both a SERVer and a cliENT.

P2P Dominates Internet Traffic
• P2P has dominated Internet traffic
   – In 2006, P2P accounted for more than 60% of Internet traffic
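
The servent idea from the “What is it...” slide (one node acting as both SERVer and cliENT) can be sketched in a few lines of Python. This is a toy, in-process illustration under stated assumptions: the class `Servent`, its methods `serve` and `fetch`, and the file names are all hypothetical, and no real networking is involved.

```python
class Servent:
    """A peer that can both serve files (SERVer role) and request them (cliENT role)."""

    def __init__(self, name, shared_files=None):
        self.name = name
        self.shared = dict(shared_files or {})  # file name -> content

    def serve(self, file_name):
        """Server role: return the content of a shared file, or None if absent."""
        return self.shared.get(file_name)

    def fetch(self, other, file_name):
        """Client role: ask another servent for a file and keep a copy if found."""
        content = other.serve(file_name)
        if content is not None:
            self.shared[file_name] = content
        return content

alice = Servent("alice", {"song.mp3": b"la la"})
bob = Servent("bob", {"notes.txt": b"p2p"})

bob.fetch(alice, "song.mp3")   # bob acts as client, alice as server
alice.fetch(bob, "notes.txt")  # roles reversed
```

After the two calls each peer holds both files, and either one can now serve them to a third peer.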




[Figure: a network of peers; each peer has a shared folder and neighbors and acts as client and server. Labeled steps: Search, Retrieve File.]

Some Statistics about P2P Systems
• More than 200 million users registered with Skype, around 10 million online users. (2007)
• Around 4.7M hosts participate in SETI@Home (2006)
• BitTorrent (BT) accounts for 1/3 of Internet traffic (2007)
• More than 200,000 simultaneous online users on PPLive (streaming video network). (2007)
• More than 3,000,000 users downloaded PPStream. (2008)


P2P Applications
• In Peer-to-Peer (P2P) computing, applications fall into three main categories:
   – distributed computing,
   – file sharing, and
   – collaborative applications
• The three categories of P2P serve different purposes
   – Distributed computing applications typically require the decomposition of a larger problem into smaller parallel problems
   – File sharing applications require efficient search across wide area networks
   – Collaborative applications require update mechanisms to provide consistency in a multi-user environment
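
The decomposition step required by distributed computing applications can be sketched as follows. This is a minimal illustration, not how any particular P2P system works: the functions `split`, `work_unit` and `solve_in_parallel` are hypothetical names, the "problem" (a sum of squares) is chosen only because it splits cleanly, and local threads stand in for remote peers.

```python
from concurrent.futures import ThreadPoolExecutor

def split(items, n_parts):
    """Split a list into n_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks

def work_unit(chunk):
    """The small problem handled by one worker: sum of squares of its chunk."""
    return sum(x * x for x in chunk)

def solve_in_parallel(items, n_workers=4):
    """Decompose, process the chunks concurrently, then combine the partial results."""
    chunks = split(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(work_unit, chunks))
    return sum(partials)

data = list(range(1000))
total = solve_in_parallel(data)
```

The pattern only pays off when the small problems are independent, as they are here; the final `sum` is the single point where partial results are combined.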




P2P Network Architectures
• Centralized (Napster)
• Decentralized
   – Unstructured (Gnutella)
   – Structured (Chord)
• Hierarchical (MBone)
• Hybrid (EDonkey)

P2P Computing
• File sharing (e.g., Gnutella, Freenet, Limewire, KaZaA)
• Collaboration (e.g., Magi, Groove, Jabber)
• Distributed computing (e.g., SETI@home, Search for Extraterrestrial Intelligence)
[Figure: P2P application areas]
   – Communication and collaboration: Groove, Skype
   – File sharing: Napster, Gnutella, Kazaa, Freenet, Overnet
   – Distributed computing: SETI@Home, folding@Home




[Figure: taxonomy tree]
Computer Systems
   – Centralized Systems (mainframes, SMPs, workstations)
   – Distributed Systems
      • Client-server
      • Peer-to-Peer
         – Centralized
         – Decentralized
            • Structured
            • Unstructured

P2P FILE SHARING APPLICATIONS




P2P Applications
• File sharing (music, movies, …)
     – utilise the idle disk space for storage and the existing network bandwidth for search and download.
     – The cost of operation is very low
          • the majority of peers collect only objects that they are interested in anyway.
     – E.g.: Napster, KaZaA and Gnutella

Napster: Example
[Figure: peers m1 to m6 hold resources A to F; the central index lists m1: A, m2: B, m3: C, m4: D, m5: E, m6: F. A query "E?" sent to the index is answered with m5, and E is then fetched from m5.]
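The Napster example can be sketched as a central index that peers publish to and query. This is only an illustration of the idea on the slide, not Napster's actual protocol; the class name `CentralIndex` and its methods are hypothetical.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical central directory: resource name -> peers that hold it."""

    def __init__(self):
        self._holders = defaultdict(set)

    def publish(self, peer, resource):
        # A peer uploads an index entry for a resource it stores.
        self._holders[resource].add(peer)

    def lookup(self, resource):
        # Return the peers known to hold the resource (sorted for stable output).
        return sorted(self._holders.get(resource, ()))


# Index contents taken from the figure: m1 holds A, ..., m6 holds F.
index = CentralIndex()
for peer, resource in [("m1", "A"), ("m2", "B"), ("m3", "C"),
                       ("m4", "D"), ("m5", "E"), ("m6", "F")]:
    index.publish(peer, resource)

holders_of_e = index.lookup("E")  # the file itself is then fetched from that peer
```

Only the lookup goes through the server; the transfer happens directly between peers, which is why the central index is both the simplest part of the design and its single point of failure.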




                       File Sharing Services
• Publish – insert a new file into the network
• Lookup – given a file name X, find the host that stores the file
• Retrieval – get a copy of the file
• Join – join the network
• Leave – leave the network
                – Neighbors

                       Unstructured P2P
[Figure: two variants. In the first, queries are flooded to connected peers (search, then transfer). In the second, queries are flooded between supernodes: 1. query from a peer node to its supernode, 2. query between supernodes.]




                         Centralized P2P
•   Utilize a central directory for object location
•   For file-sharing P2P, location inquiries go to central servers; files are then downloaded directly from peers
•   Benefits
      – Simplicity
      – Limited bandwidth usage
•   Drawbacks
      – Unreliable (single point of failure), performance bottleneck, and scalability limits
      – Vulnerable to DoS attacks
      – Copyright infringement
[Figure: peers upload indexes to the Centralized Server; 1. query, 2. response, 3. transfer]

                         File Sharing: Gnutella
• Gnutella is a file sharing protocol
• Gnutella was originally designed by Nullsoft, a subsidiary of America Online.
• Its architecture is completely decentralised and distributed
• When a client wishes to connect to the network, it runs through a list of nodes that are most likely to be up, or takes a list from a website, and then connects to however many nodes it wants




Gnutella Search Mechanism
• Assume: m1's neighbors are m2 and m3; m3's neighbors are m4 and m5; … A, B, C, D, E, F are resources
• TTL
[Figure: peers m1 to m6 holding resources A to F; the query "E?" is flooded from peer to peer until it reaches m5, which holds E, and E is returned.]

Peer-to-Peer File Sharing is all about the trading of copyrighted music and videos without paying anything to the authors
[Figure: screenshot of KaZaA, a native Windows application, showing a query in the music category, a banner ad, and 3 million users online sharing 4 petabytes of data.]




• Advantages
  – Fast lookup
  – Low join and leave overhead
  – Popular files are replicated many times, so lookup with small TTL
    will usually find the file
     • Can choose to retrieve from a number of sources                                                 Searching
• Disadvantages
  – Not 100% success rate, since TTL is limited
  – Very high communication overhead
  – Uneven load distribution
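The trade-off listed above (limited success rate versus high communication overhead) can be seen in a small sketch of TTL-limited flooding. It is a simplification, not the Gnutella wire protocol: the function name `flood_search` is hypothetical, links are assumed to be bidirectional, m6 is left out because the slide does not give its links, and a node is assumed to forward to every neighbor, including the one the query came from.

```python
from collections import deque


def flood_search(neighbors, resources, start, wanted, ttl):
    """Flood a query from `start`; return (peers holding `wanted`, messages sent)."""
    seen = {start}
    hits = []
    messages = 0
    queue = deque([(start, ttl)])
    while queue:
        node, remaining = queue.popleft()
        if wanted in resources.get(node, ()):
            hits.append(node)
        if remaining == 0:
            continue  # TTL exhausted: the query is not forwarded any further
        for nxt in neighbors.get(node, ()):
            messages += 1            # every forwarded copy costs bandwidth
            if nxt not in seen:      # duplicate copies are dropped by the receiver
                seen.add(nxt)
                queue.append((nxt, remaining - 1))
    return hits, messages


# Topology from the slide: m1's neighbors are m2 and m3; m3's are m4 and m5.
neighbors = {
    "m1": ["m2", "m3"],
    "m2": ["m1"],
    "m3": ["m1", "m4", "m5"],
    "m4": ["m3"],
    "m5": ["m3"],
}
resources = {"m1": {"A"}, "m2": {"B"}, "m3": {"C"}, "m4": {"D"}, "m5": {"E"}}

found, cost = flood_search(neighbors, resources, "m1", "E", ttl=2)
missed, _ = flood_search(neighbors, resources, "m1", "E", ttl=1)
```

With a TTL of 2 the query reaches m5, while a TTL of 1 stops at m2 and m3 and misses the file, which is the "not 100% success rate" case.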




                                   Kazaa
      Sharman Networks
   Kazaa is a file sharing program that allows you to download audio, video, images, documents and software files.

                                   Search in Unstructured P2P
Two general types of search in unstructured P2P:
• Blind: try to propagate the query to a sufficient number of nodes (example: Gnutella)
• Informed: utilize information about document locations




Blind Search Methods
                     BFS and Random Walk
[Figure: BFS (left) and random walks (right)]
    • In unstructured networks, flooding would exhaust the bandwidth of the network.

                     APS – an example
• Node J holds the requested object
• Nodes deploy 2 walkers, initially
• All index values are 20
• TTL=3




                        Collaborative Community
• Rapidly changing work environment
   – Out-sourcing, in-sourcing, home-sourcing
   – Tight integration and team work with customers, partners, vendors
• P2P allows management of documents at the level of closed working groups.
• Collaboration software is designed to improve the productivity of individuals with common goals or interests.
• Groove is a collaborative P2P system (http://www.groove.net)
   – Part of the Microsoft Office system
   – Document sharing and collaboration
      • vital for a business.
   – Office Groove 2007 is a collaboration software program
      • helps teams work together dynamically and effectively, even if team members work for different organizations, or work remotely.
[Figure: Microsoft Office Groove 2007, "Work Together: Anyone, Anytime, Anyplace"]

                        Informed search
• Informed: utilize information about document locations.
   – APS




          Adaptive Probabilistic Search
• Each node keeps a local index consisting of one entry for each object it has requested per neighbor.
• Index values represent the probability of finding that object through that neighbor
• Searching is based on the simultaneous deployment of k walkers and probabilistic forwarding.
• If a hit occurs, the walker terminates successfully.
• On a miss, the query is forwarded to one of the node's neighbors.

Example (indices at node A)
• A chooses B with Pr=0.3
• A chooses C with Pr=0.5
• A chooses D with Pr=0.2
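The probabilistic forwarding step can be sketched as a weighted random choice over a node's index. This is a minimal illustration, not the full APS algorithm (index updates after hits and misses are not shown). The raw index values 30, 50 and 20 are an assumption chosen so that they normalize to the probabilities in the example, and all function names are hypothetical.

```python
import random


def forwarding_probabilities(index):
    """Normalize raw index values into per-neighbor forwarding probabilities."""
    total = sum(index.values())
    return {neighbor: value / total for neighbor, value in index.items()}


def choose_neighbor(index, rng):
    """Pick one neighbor with probability proportional to its index value."""
    neighbors = list(index)
    weights = [index[n] for n in neighbors]
    return rng.choices(neighbors, weights=weights, k=1)[0]


def deploy_walkers(index, k, rng):
    """Deploy k walkers; each one independently picks its first hop."""
    return [choose_neighbor(index, rng) for _ in range(k)]


# Assumed index at node A for one object (values are illustrative only).
index_at_a = {"B": 30, "C": 50, "D": 20}
probabilities = forwarding_probabilities(index_at_a)
first_hops = deploy_walkers(index_at_a, k=2, rng=random.Random(0))
```

Passing the random generator in explicitly keeps the sketch reproducible; a walker that misses at its next node would call `choose_neighbor` again with that node's own index.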




Distributed Computing: SETI@home
• Search for Extraterrestrial Intelligence: asks whether we are alone in the universe or whether there is intelligent life somewhere else.
• Over two million computers crunching away and downloading data gathered from the Arecibo radio telescope in Puerto Rico, USA
• The SETI@home project is widely regarded as the fastest computer in the world
• Sharing of resources such as computation power, network bandwidth and storage
• Achieves computing power more cheaply than a supercomputer can provide.
• Developed by the Space Sciences Laboratory at the University of California, Berkeley, in the United States. http://setiathome.ssl.berkeley.edu
• Launched in 1999




            How SETI@home works?
• Collecting the source data
   – The telescope at Arecibo collects source data from outer space.
   – SETI@home uses a data recorder to record the source data on removable tape.
• Distributing the source data
   – SETI@home divides the data into fixed-size work units.
   – SETI@home distributes these work units via the Internet from the servers to a client program.
   – The client program computes a result, returns it to the server, and gets another work unit.
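The distribution loop described above can be sketched as follows. This is only an illustration of the work-unit pattern, not the SETI@home client or server code: the function names are hypothetical, the unit size is arbitrary, and `analyze` is a placeholder standing in for the real signal analysis.

```python
from collections import deque


def split_into_work_units(data, unit_size):
    """Server side: cut the recorded data into work units of unit_size bytes.

    The last unit may be shorter if the data length is not a multiple.
    """
    return [data[i:i + unit_size] for i in range(0, len(data), unit_size)]


def analyze(unit):
    """Placeholder analysis: report the strongest sample in the unit."""
    return max(unit)


def run_client(pending, results):
    """Client side: fetch a unit, compute, return the result, fetch the next."""
    while pending:
        unit_id, unit = pending.popleft()
        results[unit_id] = analyze(unit)


recorded = bytes(range(10))  # stand-in for radio-telescope data
pending = deque(enumerate(split_into_work_units(recorded, unit_size=4)))
results = {}
run_client(pending, results)
```

Because each work unit is independent, any number of clients could drain the same queue in parallel, which is what lets the project scale to millions of volunteer machines.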




            How SETI@home works? …
• Scientific experiment - uses Internet-connected computers
• Distributes a screen saver–based application to users
• Applies signal analysis algorithms to different data sets to process radio-telescope data.
• Has more than 3 million users




[Figure: radio-telescope data is held on the main server. 2. The SETI client (screen saver) starts. 3. The SETI client gets data from the server and runs. 4. The client sends results back to the server.]




• “… a free program that uses the latest P2P…technology to bring affordable and high quality voice communications to people all over the world…”
• Skype offers voice, video, chat and data transfer services over IP
• The first stable version of Skype was released in July 2004; since then the number of users has kept growing.
• Skype now claims more than 20 million accounts and between 4 and 6 million users simultaneously connected.

Super nodes
• Super nodes are Skype clients run by users that have a “good” Internet connection and a “good” computer.
• Having a good Internet connection means having a public IP address, without firewall restrictions.
• A good computer is a machine that can forward other users' communications and handle many connections.
• Super nodes (SN) act as relays in the network
   – Hence, they need better connectivity and better performance.
• SNs are used to connect Skype clients (SC) together.




             Skype Software features
• VoIP from computer to computer
   – The most used feature.
• VoIP from computer to regular phone (Skype Out)
   – By registering on Skype's website it is possible to buy credit and then call all over the world at very attractive rates compared to those charged by phone companies.
• Video conferencing: introduced in Skype 2.0 in 2006.
• Instant Messaging: this feature is comparable to many other instant messaging clients such as MSN Messenger, Yahoo! Messenger, Google Talk, etc.
   – The main difference is that Skype does not tell the user whether the person they are chatting with is typing or not. This is due to the P2P design of the Skype network.
• File Transfer
   – The Skype network design has a big influence on the quality of file transfers. It can make them very fast (1 Mbps) or very slow (3 kbps).

             Skype
Skype – login
• Skype clients connect directly to login servers, whose IP addresses are hard-coded within the software.
   – In this connection the login name and the version are sent in clear text.
• The login server stores all user names and passwords and ensures that names are unique across the Skype name space




                Internet Telephony - Skype
• The participants form a self-organizing P2P overlay network to locate and communicate with other participants.
• Bandwidth is shared, and real-time sound or video is shared as a resource
• Skype has an architecture similar to that of its predecessor KaZaA
• There are three types of nodes in the Skype network:
   – Ordinary peers
   – Super nodes
   – Central login server
• Communications are encrypted (RSA)

• Connection to a bootstrap node
   – When the SC (Skype Client) is installed for the first time, it comes with a list of SNs to connect to.
   – First, the Skype Client tries to connect to 5 SNs by sending a UDP packet to the IP addresses of super nodes randomly chosen from the host cache.
   – When the client finds a super node to connect to, it refreshes its list of active and available super nodes in the host cache.
   – The SC connects to an SN




  Skype - user search
  • Similar to KaZaA (searching for the callee)
  • The client sends a user name to its SN and receives a few IP addresses and port numbers in reply
  • The client then contacts these nodes
  • If it cannot find the user, it sends the request to its SN once again and receives a few more IP addresses and port numbers
  • The process continues until the user is found

[Figure: Traffic volume by content type (Germany, BitTorrent)]




  Skype - call establishment
  • Routing in the Skype overlay network is done by the SNs.
  • When an SC tries to establish a call, it first asks its SN (if it is not an SN itself) where the callee is and tries to connect directly to it.
       – If the SC is restricted by a firewall, it connects to the callee using an SN as a relay.
       – If both the caller and the callee have public IP addresses, the caller sends signaling information over TCP to the callee

  What is PPLive?
       – An online video broadcasting and advertising network
            • Provides an online viewing experience comparable to that of traditional TV broadcasting
            • 75 million global installed base and 20 million monthly active users
            • 600+ channels on PPLive with content ranging from news, music, sports, movies, games, live video and other interactive services to a global audience
       – An efficient P2P technique platform and test bench
  History of PPLive:
  • Bill's story
       – Inventor of PPLive core technology
       – Dropped out of post-graduate program to start PPLive




  P2P VIDEO STREAMING                                                                                 PPLIVE
• Streaming video is content sent in compressed form over
  the Internet and displayed by the viewer in real time.
• With streaming video or streaming media, a Web user does
  not have to wait to download a file to play it - the media is
  sent in a continuous stream of data and is played as it
  arrives.
• The user needs a player, which is a special program that
  uncompresses and sends video data to the display and
  audio data to speakers.
• A player can be either an integral part of a browser or
  downloaded from the software maker's Web site.

• P2P streaming
   – P2P TV
       • PPLive, PPStream, Joost (by Skype
         founders), …




                            Industry Trends
PPLive is well positioned to exploit the next explosive growth
[Figure: P2P applications ranked from basic to advanced along a timeline marked 2001, 2003, 2004, 2005: file sharing (Napster), downloading (BitTorrent), VoIP (Skype), video streaming (PPLive).]

                            Streaming Tree Reconstruction after a Peer Departure
[Figure: streaming tree reconstruction after a peer departure]




                                           PPLive
• Media Server (channel management server): retrieve the list of channels via HTTP
• Membership Server: retrieve a small list of member nodes of interest via UDP

                                           Multi-tree Streaming
Since all peers are involved in the data distribution, the load is spread among all nodes.




                        Single-tree Streaming
• A common approach to P2P streaming is to organize participating peers into a single tree-structured overlay
    – The content is pushed from the source towards all peers.
    – This way of organizing peers is called single-tree streaming.
• In these systems, peers are hierarchically organized in a tree structure where the root is the stream source.
• The content is spread as a continuous flow of information from the source down the tree.
[Figure: A snapshot of a tree-based overlay with 231 nodes]
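Pushing content down a single tree, and the effect of a peer leaving before the tree is reconstructed, can be sketched as follows. This is an illustration under assumptions, not any particular system's implementation: the tree shape, the peer names and the function `push_chunk` are hypothetical.

```python
from collections import deque


def push_chunk(children, source, departed=frozenset()):
    """Push one chunk from the source down the tree; return who receives it."""
    received = []
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child in departed:
                continue  # the whole subtree below a departed peer is cut off
            received.append(child)
            queue.append(child)
    return received


# Hypothetical overlay: the stream source is the root of the tree.
children = {
    "source": ["p1", "p2"],
    "p1": ["p3", "p4"],
    "p2": ["p5"],
}

everyone = push_chunk(children, "source")
after_departure = push_chunk(children, "source", departed={"p1"})
```

When p1 departs, p3 and p4 stop receiving the stream until they are re-attached elsewhere, which is why a single tree needs reconstruction after each departure.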




                          BitTorrent
• Created by Bram Cohen in 2001

                          Overall Architecture
[Figure: a Web Server, a Tracker, and three peers: A (leech, the downloader, “US”), B (leech) and C (seed).]




                 What is BitTorrent?
• A peer-to-peer file transfer protocol
• Extremely popular today
• “Pull-based” “swarming” approach
• Each file split into smaller pieces
• Nodes request desired pieces from neighbors
   – As opposed to parents pushing data that they receive
• Pieces not downloaded in sequential order
• Encourages contribution by all nodes
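The pull-based approach can be sketched as a node working out which pieces it lacks and which neighbor to ask for each. This is a simplified illustration: the function name `pieces_to_request` is hypothetical, and taking the first neighbor in name order is an assumption made for brevity, not the piece-selection policy of real BitTorrent clients.

```python
def pieces_to_request(own, neighbor_pieces):
    """Map each missing piece index to one neighbor that can supply it."""
    plan = {}
    for neighbor, pieces in sorted(neighbor_pieces.items()):
        for piece in sorted(pieces - own):
            # Keep the first neighbor found for a piece; later ones are ignored.
            plan.setdefault(piece, neighbor)
    return plan


# Hypothetical state: we (peer A) hold pieces 0 and 1 of a four-piece file.
own_pieces = {0, 1}
neighbor_pieces = {"B": {1, 2}, "C": {0, 2, 3}}
plan = pieces_to_request(own_pieces, neighbor_pieces)
```

Here piece 2 is requested from B and piece 3 from C, so the download is spread over several peers and the pieces arrive out of order.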








                          BitTorrent Lingo
• Seeder = a peer that provides the complete file.
• Initial seeder = a peer that provides the initial copy.
• Leecher = one who is downloading.
[Figure: an initial seeder, a seeder and two leechers]




                          BitTorrent Basics
• Files are broken into pieces.
  – Users each download different pieces from the original uploader (seed).
  – Users exchange the pieces with their peers to obtain the ones they are missing.
• This process is organized by a centralized server called the Tracker.




                          Critical Elements
• A web server
   – stores and serves the .torrent file.
   – For example:
       • http://bt.btchina.net
       • http://bt.ydy.com/
[Figure: a Web Server holding The Lord of Ring.torrent and Troy.torrent]




                  BitTorrent Swarm
• Swarm
   – Set of peers all downloading the same file
   – Organized as a random mesh
• Each node knows the list of pieces downloaded by its neighbors
• A node requests pieces it does not own from its neighbors
• Swarm
   – The group of machines that are collectively connected for a particular file.
       • For example, if you start a BitTorrent client and it tells you that you're connected to 10 peers and 3 seeds, then the swarm consists of you and those 13 other people.

                  Critical Elements
• The .torrent file
   – Static 'metainfo' file containing the necessary information:
       • URL of tracker
       • Piece length – usually 256 KB
       • SHA-1 hashes of each piece in file
       • IP address of the Tracker
[Figure: Matrix.torrent]
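The per-piece SHA-1 hashes listed in the .torrent file let a downloader check every piece it receives from another peer. The sketch below shows only that hashing and checking step, assuming the 256 KB piece length mentioned on the slide; it does not build or parse a real .torrent file, and the function names are hypothetical.

```python
import hashlib

PIECE_LENGTH = 256 * 1024  # 256 KB, the usual piece length named on the slide


def piece_hashes(data, piece_length=PIECE_LENGTH):
    """Split the file contents into pieces and return one SHA-1 digest per piece."""
    return [
        hashlib.sha1(data[offset:offset + piece_length]).digest()
        for offset in range(0, len(data), piece_length)
    ]


def verify_piece(index, piece, hashes):
    """Check a received piece against the digest published for that index."""
    return hashlib.sha1(piece).digest() == hashes[index]


# Stand-in file contents: one full piece plus a short final piece.
contents = b"a" * (PIECE_LENGTH + 10)
hashes = piece_hashes(contents)
good = verify_piece(0, contents[:PIECE_LENGTH], hashes)
bad = verify_piece(1, b"corrupted data", hashes)
```

A piece that fails the check is discarded and requested again, so a peer never has to trust the neighbor that sent it.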




                                                                                                     How a node enters a swarm
                  Critical Elements
                                                                                                       for file “popeye.mp4”
 • A BitTorrent tracker
   – The tracker maintains information about all BitTorrent                                                                                • File popeye.mp4.torrent
     clients utilizing each torrent.                                                                                                         hosted at a (well-known)
   – The tracker identifies the network location of each client                                                                              webserver
     either uploading or downloading the P2P file associated with                                                                          • The .torrent has address of
     a torrent.
                                                                                                                                             tracker for file
   – It also tracks which fragment(s) of that file each client
     possesses, to assist in efficient data sharing between clients.                                                                       • The tracker, which runs on a
       • i.e. the tracker keeps track of all peers downloading file                                                                          webserver as well, keeps
   For example:                                                                                                                              track of all peers
       • http://bt.cnxp.com:8080/announce                                                                                                    downloading file
       • http://btfans.3322.org:6969/announce
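
The tracker's bookkeeping described above can be sketched as an in-memory registry. This is a toy model under stated assumptions: the class name, method names, and the idea of passing piece sets in the announce call are hypothetical, and a real tracker speaks HTTP (or UDP) to its announce URL rather than being called directly.

```python
from collections import defaultdict


class ToyTracker:
    """Keeps, per torrent, the network location and known pieces of each client."""

    def __init__(self):
        # torrent id -> {(ip, port): set of piece indices the client reported}
        self._swarms = defaultdict(dict)

    def announce(self, torrent_id, peer_addr, pieces=()):
        """Register (or refresh) a client and return the other peers in the swarm."""
        swarm = self._swarms[torrent_id]
        others = [addr for addr in swarm if addr != peer_addr]
        swarm[peer_addr] = set(pieces)
        return others

    def pieces_of(self, torrent_id, peer_addr):
        """Return the pieces last reported by a client (empty set if unknown)."""
        return set(self._swarms[torrent_id].get(peer_addr, set()))


if __name__ == "__main__":
    tracker = ToyTracker()
    tracker.announce("popeye.mp4", ("192.0.2.10", 6881), pieces=range(8))  # seeder
    print(tracker.announce("popeye.mp4", ("192.0.2.20", 6881)))  # new downloader
```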




                    Critical Elements                                                                How a node enters a swarm
                                                                                                       for file “popeye.mp4”
• An end user (peer)                                                                                               www.bittorrent.com
  – Users who want to use BitTorrent must install the                                                                                      • File popeye.mp4.torrent
    corresponding client software or a web-browser plug-in.                                                                                  hosted at a (well-known)
                                                                                                1
  – Downloader (leecher) : Peer has only a part ( or none ) of                                                                               webserver
    the file.                                                                          Peer                                                • The .torrent has address of
                                                                                                                                             tracker for file
  – Seeder: Peer has the complete file, and chooses to stay
                                                                                                                                           • The tracker, which runs on a
    in the system to allow other peers to download
                                                                                                                                             webserver as well, keeps
  – BitTorrent clients connect to a tracker when attempting                                                                                  track of all peers
    to work with torrent files.                                                                                                              downloading file
     • The tracker notifies the client of the P2P file location (that is
       normally on a different, remote server).




How a node enters a swarm                                                 Three elements necessary to sharing a file
               for file “popeye.mp4”                                                              with BitTorrent
                         www.bittorrent.com                                        •       The tracker - coordinates connections among the peers.
                                                                                       –      Tracker doesn't know anything of the actual contents of a file
                                                  • File popeye.mp4.torrent            –      Generally, it's considered good manners to continue seeding a file after you
                                                    hosted at a (well-known)                  have finished downloading, to help out others.
                                                    webserver                      •       The web server - stores and serves the .torrent file.
             2                                    • The .torrent has address of    •       At least one seeder
Peer
                                                                                       –      Contains any of the file's actual contents.
                                                    tracker for file                   –      The seeder is almost always an end-user's desktop machine (peer), rather
                              Tracker             • The tracker, which runs on a              than a dedicated server machine.
                                                    webserver as well, keeps           –      Seeding is monitored by the Tracker
                                                                                       –      Seed your file for a long time to prevent peers from being left with
                                                    track of all peers                        incomplete files.
                                                    downloading file               •       When you finish a download in BitTorrent, and you are only
                                                                                           uploading, you're seeding!




             How a node enters a swarm
                                                                                                                         File sharing
               for file “popeye.mp4”
                         www.bittorrent.com
                                                                                       Large files are broken into pieces of size between
                                                  • File popeye.mp4.torrent
                                                    hosted at a (well-known)           64 KB and 1 MB
                                                    webserver
Peer                                              • The .torrent has address of
                                                    tracker for file
         3                    Tracker             • The tracker, which runs on a
                                                    webserver as well, keeps
                                                    track of all peers
                                                    downloading file
                                                                                               1           2         3          4       5         6          7       8
 Swarm




                       BT: publishing a file                                                                     A trivial example
                                                                                                                                    {1,2,3,4,5,6,7,8,9,10}
                             Harry Potter.torrent
                 Bob
                                                                                                                         User

                                                                                                                       Seeder:
                                                                                                                       John
                                                             Web Server




                                                                                                                                                        {}
                                                                                                                                                         {1,2,3}
                                                        Tracker                                                                                          {1,2,3,5}
                                                                                                                {}
                                                                                                               {1,2,3}
                                                                                                                {1,2,3,4}
                                                                                                                {1,2,3,4,5}                      User




       Downloader:       Seeder:              Downloader:                                           User
                                                                                                                                             Downloader
       A                 B                    C                                                 Downloader                                   Joe
                                                                                                Fan Bin
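
The rule that a node requests the pieces it does not own from its neighbors can be sketched with plain sets. The piece sets below are loosely modeled on the figure, but which set belongs to which user is an assumption, and the function name is hypothetical. The sketch simply takes the first neighbor (in name order) that holds a missing piece; it does not implement the piece-selection policy of any real client.

```python
def plan_requests(own, neighbors):
    """Map each piece we lack to one neighbor that has it.

    own: set of piece numbers this node already holds.
    neighbors: dict of neighbor name -> set of piece numbers that neighbor holds.
    """
    plan = {}
    for name in sorted(neighbors):
        for piece in sorted(neighbors[name] - own):
            # Keep the first neighbor found for a piece; later holders are ignored.
            plan.setdefault(piece, name)
    return plan


if __name__ == "__main__":
    joe = {1, 2, 3, 5}
    neighbors = {
        "John": set(range(1, 11)),   # seeder: all ten pieces
        "Fan Bin": {1, 2, 3, 4, 5},  # another downloader
    }
    for piece, source in sorted(plan_requests(joe, neighbors).items()):
        print(f"piece {piece}: ask {source}")
```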




Types of P2P Attacks
         P2P Technical Challenges
                                                                 • Poisoning: a client can provide content that doesn't
                                                                   match the description.
   •   Routing protocols                                           – A client A can broadcast a message saying it needs file
   •   Network topologies                                            'X'. A malicious client can send a message back to A
   •   Peer discovery                                                saying it has file X, then send it file Y.
   •   Communication/coordination protocols                      • Denial-of-service attacks, which reduce or halt
   •   Quality of service                                          overall network activity.
   •   Security                                                  • Defection attacks which allow a client to participate
                                                                   on the network with a very low upload-to-
                                                                   download ratio.




                                                                              Types of P2P Attacks….


         P2P SECURITY                                             • Virus attacks, where a malicious client can add
                                                                    viruses into files shared on the network.
                                                                  • Malware attacks, where the P2P software
            Security is the condition of being protected            contains spyware.
                       against danger or loss.                    • Filtering attacks, where network operators may
                                                                    attempt to prevent P2P traffic from being
                                                                    carried on their networks.




                    P2P Security                                              Attacks On & From
• P2P file sharing networks are constantly under                  • Attacks on P2P systems:
  attack.
• P2P is potentially more vulnerable than client server.
  – Decentralized
  – More difficult to manage and control                          • Attacks from P2P Systems:
• Need to understand the security issues for
  architecting future P2P apps


                                                           111                                                             114




Attacks on P2P sharing                                                             File Pollution
   Two types:
                                                                           Unsuspecting users
                                                                                                                      Alice
                                                                           spread pollution !
   • Pollution: file corruption (targets the file content)
   • Index poisoning (targets the file index)




                                                                   115                                Bob                118




                                                                                                  File Pollution
original content
                     polluted content

                                                                           Unsuspecting users
                                                                           spread pollution !
        pollution
        company




                                                                                           Yuck

                     File Pollution
                                                                   116                                                   119




                        File Pollution                                                   INDEX POISONING
                                                                         • Aim of the attacker is to make several
                                                                           peers believe that some popular file is
                                                                           present with the victim.
                                                                         • Attacker sends a location publish
                                                 pollution                 message to every crawled peer.
                                                 server                  • In this message, the attacker includes
                                                                           victim's IP address and port number.
      pollution                                                          • Attacker puts the file hash of a popular
      company                                                              file along with the message.
                                        file sharing                     • A crawled peer B adds this file hash to its index,
                                        network                            along with the location of the victim.
                    pollution                                pollution   • When a peer C searches for that file, it
                    server                                   server        may be told by some poisoned peer that
                                                                           victim has the file.

                                                 pollution
                                                 server
                                                                   117




Index Poisoning                                                                      Free Riding

                                                                               • Peers share little or no data in P2P file-sharing
                                                                                 systems
                                 index                     23.123.78.6
                           title     location                                  • Measurement
                           bigparty 123.12.7.98
                           smallfun 23.123.78.6
                                                                                   – Nearly 70% of Gnutella users share no files
      123.12.7.98          heyhey 234.8.89.20                                      – Nearly 50% of all responses are returned by the
                                                                                     top 1% of sharing hosts
                                          file sharing                         • Incentive mechanisms to encourage user
                                          network
                                                                                 cooperation

                                                         234.8.89.20
                                                                         121




                    Index Poisoning

                                                                                                             P2P Worms
                                 index                     23.123.78.6
                           title     location
                           bigparty 123.12.7.98                                                 Topological                 Passive
      123.12.7.98
                           smallfun 23.123.78.6
                           heyhey 234.8.89.20
                                                                                                Scan Worms                  Worms
                           bighit    111.22.22.22




                                                                               A computer worm is a self-replicating malware computer program.
                                                         234.8.89.20           It uses a computer network to send copies of itself to other nodes
                           111.22.22.22                                        It may do so without any user intervention.
                                                                         122




   ROUTING TABLE POISONING                                                            TOPOLOGICAL WORM ATTACK
• The aim of the attacker is to
  make peers add the victim as
  their neighbor.
• The attacker sends node
  announcement messages to
  every crawled peer.
• The attacker includes the victim's IP
  address and port number in
  these messages.
• The peers add the victim as their
  neighbor.
• Query messages are then forwarded
  to the victim.




TOPOLOGICAL WORM ATTACK                                                                        Effects

                                                                                  • Eating up free disk space
                                                                                  • Benjamin opens a Web page at
                                                                                    benjamin.xww.de to display banner ads.
                                                                                    – One morning the benjamin.xww.de Web site
                                                                                      displayed the message: "Domain closed due to
                                                                                      massive abuse."




                     PASSIVE P2P WORMS

• Vulnerability in the protocol
• Wait for the vulnerable targets to contact them
• Case 1
  – Worm can create infected copies of itself with attractive filenames and
    place them in the shared folder of the P2P client or will replace the files
    present in the shared folder with itself                                        How vulnerable is BitTorrent?
  – e.g. VBS.Gnutella, Benjamin Worm etc.
• Case 2
  – Answers positively to a proportion of search queries by changing the
    name of the corrupted file to match the search query
  – e.g. Gnuman




                                                                                                                                     131




               P2P-Worm.Win32.Benjamin.a
                                                                                               Pollution Attack
 • P2P-Worm.Win32.Benjamin.a (Kaspersky Lab) is also
   known as: Worm.P2P.Benjamin.a (Kaspersky Lab),                                 • 1. The peers
   W32/Benjamin.worm (McAfee),                                                      receive the peer
   W32.Benjamin.Worm (Symantec),
   Win32.HLLW.Benjamin (Doctor Web)                                                 list from the
 • This worm uses the Kazaa file exchange P2P network                               tracker.
   to spread itself.
 • Benjamin is written in Borland Delphi and is
   approximately 216 Kb in size - it is compressed by the
   AsPack utility.




Pollution Attack                    DDOS Attack
• 2. One peer                   • DDOS = Distributed denial of service
  contacts the                  • Based on the fact that the BitTorrent tracker has no
  attacker for a                  mechanism for validating peers.
  chunk of the file.            • Uses modified client software




             Pollution Attack                    DDOS Attack
• 3. The attacker sends         • 1. The attacker
  back a false                    downloads a large
  chunk.                          number of torrent
• This false chunk                files from a web
  will fail its hash              server.
  and will be
  discarded.
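
The reason a false chunk is discarded can be sketched as a hash comparison against the SHA-1 digests carried in the .torrent file. This is a minimal sketch; the function name and the way the expected digests are stored (a list indexed by piece number) are assumptions for illustration.

```python
import hashlib


def accept_piece(index, payload, expected_hashes):
    """Return True only if the payload's SHA-1 matches the digest listed for that piece."""
    return hashlib.sha1(payload).digest() == expected_hashes[index]


if __name__ == "__main__":
    genuine = [b"piece-0 data", b"piece-1 data"]
    expected = [hashlib.sha1(p).digest() for p in genuine]  # as published in the .torrent

    print(accept_piece(0, genuine[0], expected))        # genuine chunk is kept
    print(accept_piece(1, b"polluted data", expected))  # false chunk is discarded
```

The check protects file integrity, but the victim has already spent download bandwidth on the false chunk, which is what the pollution attack exploits.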




             Pollution Attack                    DDOS Attack
• 4. The attacker               • 2. The attacker parses
  requests all chunks             the torrent files with a
                                  modified BitTorrent
  from the swarm and              client and, when it
  wastes the peers'               announces that it is
  upload bandwidth.               joining the swarm, gives
                                  the victim's IP address
                                  and port number as its own.




Current Solutions: Pollution
                              DDOS Attack
                                                                                                       Attacks
• 3. As the tracker                                                                      • Blacklisting
  receives requests for a                                                                  – Achieved using software such as Peer Guardian or
  list of participating                                                                      moBlock.
  peers from other                                                                         – Blocks connections from blacklisted IPs which are
  clients it sends the                                                                       downloaded from an online database.
  victim's IP and port
  number.
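
The blacklisting approach can be sketched as a membership test of a peer's IP address against a list of blocked addresses and ranges. This is an illustrative sketch only: the one-entry-per-line format, the function names, and the addresses (taken from documentation-reserved ranges) are assumptions, not the list format used by Peer Guardian or moBlock.

```python
import ipaddress


def load_blacklist(lines):
    """Parse one address or CIDR range per line, skipping blanks and '#' comments."""
    networks = []
    for line in lines:
        entry = line.strip()
        if entry and not entry.startswith("#"):
            networks.append(ipaddress.ip_network(entry))
    return networks


def is_blocked(addr, blacklist):
    """Return True if the address falls inside any blacklisted network."""
    ip = ipaddress.ip_address(addr)
    return any(ip in network for network in blacklist)


if __name__ == "__main__":
    blacklist = load_blacklist([
        "# hypothetical entries downloaded from an online database",
        "203.0.113.0/24",
        "198.51.100.7",
    ])
    for peer in ("203.0.113.45", "198.51.100.7", "192.0.2.1"):
        print(peer, "blocked" if is_blocked(peer, blacklist) else "allowed")
```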




                                                                                        Solutions – TRUST and REPUTATION
                              DDOS Attack
                                                                                         • Most of the solutions proposed to solve the problem of attacks are
 • 4. The peers then                                                                       based on building trust (and/or reputation) between
   attempt to                                                                              the peers
   connect to the                                                                        • Some of the popular approaches are:
                                                                                            – DCRS - Bit Torrent
   victim to try and
                                                                                            – EigenTrust
   download a chunk
                                                                                            – XRep
   of the file.                                                                          • These approaches do slow down the attack




                         Attack illustration
                                                                                          What is Trust? What is reputation?
                                                                                       • Trust – a peer's belief in another peer's capabilities, honesty
                                           victim
                                                                                         and reliability based on its own experiences.
                                                                                       • Reputation – a peer's belief in another peer's capabilities,
                                                        Who has the files?               honesty and reliability based on recommendations received
                                          Tracker                                        from other peers.
                                                                             clients
                                                                                         – Reputation can be centralized, computed by a third party, or it can
   Discussion                                                                              be decentralized, computed independently by each peer after
     forum                                                                                 asking other peers for recommendations.
                                                Victim has the files!
                .torrent
                   .torrent
                      .torrent
                         .torrent
                            .torrent
                               .torrent   attacker




What is Trust? ……..                                       An Example Trust Management System
• Both Trust and Reputation are used to evaluate a peer's                            (BitTorrent)
  trustworthiness.
• Trust and Reputation increase or decrease with further             • Debit-Credit Reputation System
  experience.                                                        • Each client calculates a local trust
• Trust and reputation both depend on some context.                    score for their peers Based on valid
                                                                       pieces uploaded /downloaded
• For example:
                                                                     • Tracker combines these individual
  – Mike trusts John as his doctor, but he doesn't trust John as a     scores to make a global score
    mechanic who can fix his car.
      • In the context of seeing a doctor, John is trustworthy
      • In the context of fixing a car, John is untrustworthy.




                                                                                       DCRS… …(cont’d)
      What is Trust Management ?
                                                                      Local Trust Score Computation
  • The term “Trust Management” was coined by Blaze
    et al. (1996):                                                    Fij = Uij - Dij, where
                                                                               Uij – the number of chunks that i uploaded to j,
    – a coherent framework for the study of security
                                                                               Dij- the number of chunks that i downloaded from j
      policies, security credentials and trust                        Using Fij, the local trust score LTij is computed as
      relationships.                                                      -1 if bogus chunk is uploaded by peer j
                                                                          0 if Fij >t
                                                                          1 if Fij <= t, where t is the fairness threshold
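
The local trust score rule above translates directly into a small function. This is a sketch of the rule exactly as stated on the slide; the function and parameter names are hypothetical, and how a bogus chunk is detected (for example by a failed hash check) is left outside the sketch.

```python
def local_trust(uploaded_to_j, downloaded_from_j, threshold, bogus_chunk_from_j=False):
    """Local trust score LTij that peer i assigns to peer j.

    uploaded_to_j:      Uij, chunks that i uploaded to j
    downloaded_from_j:  Dij, chunks that i downloaded from j
    threshold:          t, the fairness threshold
    """
    if bogus_chunk_from_j:
        return -1  # j uploaded a bogus chunk
    fairness = uploaded_to_j - downloaded_from_j  # Fij = Uij - Dij
    return 0 if fairness > threshold else 1


if __name__ == "__main__":
    print(local_trust(10, 8, threshold=5))   # Fij = 2, within the threshold
    print(local_trust(20, 3, threshold=5))   # Fij = 17, j takes far more than it gives
    print(local_trust(4, 4, threshold=5, bogus_chunk_from_j=True))
```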




              Reputation Management                                                      DCRS… …(cont’d)
• Need for trust mechanisms                                          Global Trust Score Computation
  – To assess the trustworthiness of peers and content               • A global trust score represents the rest of the
        • Malicious peers can generate an unlimited number of        swarm's opinion of a peer.
          inauthentic files                                          • At regular intervals the tracker receives the local trust
   – To deter malicious behavior                                       scores of peers in the swarm.
• Reputation rests on the assumption that past behavior is          • The tracker chooses k random local trust scores for
  indicative of future behavior                                        peer j, where k is less than the number of peers in
                                                                       the swarm.
• Use of reputation to build trust
                                                                     • Tracker uses k local trust scores for peer j and sets the
                                                                       average of them as the global trust score for j
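
The tracker-side global trust computation can be sketched as sampling k reported local scores and averaging them. The function name and the injectable random source are assumptions made for illustration. The sketch also assumes the reports come from peers other than j, so requiring k to be at most the number of reports keeps k below the swarm size, as the slide requires.

```python
import random


def global_trust(local_scores_for_j, k, rng=random):
    """Average k randomly chosen local trust scores reported about peer j."""
    scores = list(local_scores_for_j)
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of reported scores")
    sample = rng.sample(scores, k)  # k distinct reports, chosen at random
    return sum(sample) / k


if __name__ == "__main__":
    # Local scores that five other peers reported about peer j (-1, 0 or 1 each).
    reports = [1, 1, 0, -1, 1]
    print(global_trust(reports, k=3, rng=random.Random(0)))
```

Sampling only k reports, rather than all of them, limits the influence any fixed group of colluding peers can expect to have on a single update.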




Research Issues in P2P Networks

Research Issues in P2P Networks

  • 1. Research Issues in P2P Computing
    How to Run Applications Faster?
      • There are 3 ways to improve performance:
        – Work Harder
        – Work Smarter
        – Get Help
      • Computer Analogy
        – Faster hardware: high performance processors or peripheral devices
        – Optimized algorithms and techniques used to solve computational tasks
        – Multiple computers to solve a particular task
    OUTLINE
      • Centralized Vs. Distributed
      • What is P2P?
      • P2P Architectures
      • P2P and Applications
      • Search and Replication Techniques
      • P2P Security
      • Emerging P2P Applications
      • Conclusion
    Distributed….
      • When a handful of powerful computers are linked together and communicate with each other
        – the overall computing power available can be amazingly vast.
        – Such a system can have a higher performance share than a single supercomputer.
        – The objective of such systems is to minimize communication and computation cost.
    Centralized?
      • Computation in networks of processing nodes can be classified into centralized or distributed computations.
      • A centralized solution relies on one node being designated as the computer node that processes the entire application locally.
      • The central system is shared by all the users all the time.
      • There is a single point of control and a single point of failure.
    • A distributed system is an application that executes a collection of protocols to coordinate the actions of multiple processes on a communication network, such that all components cooperate together to perform a single or small set of related tasks.
  • 2. Examples of Distributed Systems
      • The Internet
        – Heterogeneous network of computers and applications
        – Implemented through the Internet Protocol Stack
    • The collaborating computers can access remote resources as well as local resources in the distributed system via the communication network.
    • The existence of multiple autonomous computers is transparent to the user in a distributed system.
        – The user is not aware that the jobs are executed by multiple computers that reside in remote locations.
        – A centralized algorithm is at the heart of a single computer.
        – A distributed algorithm is at the heart of a society of computers.
    Computer Networks vs. Distributed Systems
      • Computer Network: the autonomous computers are explicitly visible
      • Distributed System: existence of multiple autonomous computers is transparent
      • Many problems in common
      • Normally, every distributed system relies on services provided by a computer network.
    Distributed….
      • Distributed systems are built up on top of existing networking and operating systems software.
      • The Middleware enables computers to coordinate their activities and to share the resources of the system
        – Middleware is the bridge that connects distributed applications across dissimilar physical locations, with dissimilar hardware platforms, network technologies, operating systems, and programming languages.
      • Middleware provides standard services such as naming, concurrency control, event distribution, authorization to specify access rights to resources, security etc.
  • 3. Computing Platforms Evolution: Breaking Administrative Barriers
      [Figure: performance rising as platforms cross administrative barriers. Administrative barriers: Individual, Group, Department, Campus, State, National, Globe, Inter Planet, Universe. Platforms: Desktop (Single Processor), SMPs or SuperComputers, Local Cluster, Enterprise Cluster/Grid, Global Cluster/Grid, Inter Planet Cluster/Grid ??]
    Foster-Kesselman
      • The Foster-Kesselman duo organized in 1997, at Argonne National Laboratory, a workshop entitled “Building a Computational Grid”.
      • At this moment the term “Grid” was born.
      • The workshop was followed in 1998 by the publication of the book “The Grid: Blueprint for a New Computing Infrastructure” by Foster and Kesselman themselves.
      • For these reasons they are not only to be considered the fathers of the Grid but their book, which in the meantime was almost entirely rewritten and re-published in 2003, is also considered the “Grid bible”.
      Ian Foster, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439
      Carl Kesselman, Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292
    The Need for Collaboration?
      • The worldwide business demands intense problem-solving capabilities for incredibly complex problems
        – the need for dynamic collaboration of many computing resources to be able to work together.
      • This is a difficult challenge across all the technical communities to achieve this level of resource collaboration within the bounds of the necessary quality requirements of the end user.
    Electric Grid and Grid Computing
      • Computing grids are conceptually not unlike electrical grids.
      • Electric power grid - a variety of resources contribute power into a shared "pool" for many consumers to access on an as-needed basis.
        – In an electrical grid, wall outlets allow us to link to an infrastructure of resources that generate, distribute, and bill for electricity.
        – When you connect to the electrical grid, you don't need to know where the power plant is or how the current gets to you.
      • Grid computing uses middleware to coordinate disparate IT resources across a network, allowing them to function as a virtual whole.
        – The goal of a computing grid, like that of the electrical grid, is to provide users with access to the resources they need, when they need them.
    Why Grids?
      Large Scale Exploration needs them
    Solving technology problems using computer modeling, simulation and analysis
      Geographic Information Systems, Life Sciences, Aerospace, CAD/CAM, Military Applications
  • 4. CERN’s Large Hadron Collider
      1800 Physicists, 150 Institutes, 32 Countries
      100 PB of data by 2010; 50,000 CPUs?
    Client-Server Model
      The most widely used
      [Figure: clients send an invocation to a server and receive a result. Key: Process, Computer]
    The Large Hadron Collider (LHC)
      A gigantic scientific instrument near Geneva
      It is a particle accelerator used by physicists to study the smallest known particles – the fundamental building blocks of all things.
    [Figure labels: Source, Router, “Interested” End-host]
    Client-Server
      [Figure labels: Source, Router, “Interested” End-host]
    Why P2P?
  • 5. Client-Server
      Overloaded! Hot Spots become hotter
      [Figure labels: Source, Router, “Interested” End-host]
    Why P2P?
      Personal Computers: 80% idle CPU time
      Laptop: 90% idle CPU time
      Computers in our Lab: 99% idle CPU time!
      [Figure label: Internet]
    What is driving P2P?
      • Clients are not so dumb.
      • Billions of MHz CPU, tons of terabytes disk, millions of gigabits network bandwidth, …
        – Unused resources.
    Problem with Client-Server Model
      – Scalability
        • As the number of users increases, there is a higher demand for computing power, storage space, and bandwidth associated with the server-side
      – Reliability
        • The whole network will depend on the highly loaded server to function properly
    Computer System Taxonomy
      Computer Systems
        – Centralized Systems (mainframes, SMPs)
        – Distributed Systems
          • Client-server
          • Peer-to-Peer
    P2P – An overlay network
      • P2P overlay network
        – The connected nodes construct a virtual overlay network on top of the underlying network infrastructure
        – Peer-to-peer network topology is a virtual overlay at application layer
      [Figure: overlay of nodes A to G on top of the underlying network]
  • 6. Typical Characteristics
      • Large Scale: lots of nodes (up to millions)
      • Dynamicity: frequent joins, leaves, failures
      • Little or no infrastructure
        – No central server
      • Symmetry: all nodes are “peers”
        – have same role
    [Figure: Client/server model versus Peer-to-peer model. Labels: Internet, Client, Proxy, Cache, server, Client/Server, Congestion zone]
    What is it...
      • P2P computing is the sharing of computer resources and services by direct exchange between systems.
      • These resources and services include the exchange of information, processing cycles, cache storage, and disk storage for files.
      • P2P computing takes advantage of existing desktop computing power and networking connectivity,
        – allowing economical clients to leverage their collective power to benefit the entire enterprise.
      • In a P2P architecture, computers that have traditionally been used solely as clients communicate directly among themselves and can act as both clients and servers, assuming whatever role is most efficient for the network.
    P2P Dominates Internet Traffic
      • P2P has dominated Internet traffic
        In 2006, more than 60% of Internet traffic
    • Each node (peer), called a servent, acts as both a SERVer and a cliENT
      [Figure: peers, each with a shared folder and neighbors, acting as client and server. Labels: Search, Retrieve File]
    Some Statistics about P2P Systems
      • More than 200 million users registered with Skype, around 10 million on-line users. (2007)
      • Around 4.7M hosts participate in SETI@Home (2006)
      • BT accounts for 1/3 of Internet traffic (2007)
      • More than 200,000 simultaneous online users on PPLive (streaming video network). (2007)
      • More than 3,000,000 users downloaded PPStream. (2008)
  • 7. P2P Applications
      • In Peer-to-Peer (P2P) computing, applications are segregated into three main categories:
        – distributed computing,
        – file sharing, and
        – collaborative applications
      • The three categories of P2P serve different purposes
        – Distributed computing applications typically require the decomposition of a larger problem into smaller parallel problems
        – File sharing applications require efficient search across wide area networks and
        – Collaborative applications require update mechanisms to provide consistency in a multi-user environment
    P2P Network Architectures
      • Centralized (Napster)
      • Decentralized
        – Unstructured (Gnutella)
        – Structured (Chord)
      • Hierarchical (MBone)
      • Hybrid (EDonkey)
    P2P Computing
      • File sharing (e.g., Gnutella, Freenet, Limewire, KaZaA)
      • Collaboration (e.g., Magi, Groove, Jabber)
      • Distributed computing (e.g., SETI@home, Search for Extraterrestrial Intelligence)
      [Figure: Communication and collaboration: Groove, Skype. File sharing: Napster, Gnutella, Kazaa, Freenet, Overnet. Distributed computing: SETI@Home, folding@Home]
    Computer Systems
      – Centralized Systems (mainframes, SMPs, workstations)
      – Distributed Systems
        • Client-server
        • Peer-to-Peer
          – Centralized
          – Decentralized: Structured, Unstructured
    P2P FILE SHARING APPLICATIONS
  • 8. P2P Applications
      • File sharing (music, movies, …)
        – utilise the idle disk space for storage and the existing network bandwidth for search and download.
        – The cost of operation is very low
          • majority of peers collect only objects that they are interested in anyway.
        – E.g.: Napster, KaZaA and Gnutella
    Napster: Example
      [Figure: peers m1 to m6 with resources A to F; directory entries m1: A, m2: B, m3: C, m4: D, m5: E, m6: F; query “E?” and reply “E”]
    File Sharing Services
      • Publish – insert a new file into the network
      • Lookup – given a file name X, find the host that stores the file
      • Retrieval – get a copy of the file
      • Join – join the network
      • Leave – leave the network
    Unstructured P2P
      Flooded to connected peers
      Flooded between supernodes
      [Figure labels: search, transfer, supernode, peer node, Neighbors, 1. query, 2. query]
    Centralized P2P
      • Utilize a central directory for object location
      • For file-sharing P2P, location inquiry from central servers, then downloaded directly from peers
      • Benefits
        – Simplicity
        – Limited bandwidth usage
      • Drawbacks
        – Unreliable (single point of failure), performance bottleneck, and scalability limits
        – Vulnerable to DoS attacks
        – Copyright infringement
      [Figure: Centralized Server. 1. query, upload indexes; 2. response; 3. transfer]
    File Sharing: Gnutella
      • Gnutella is a file sharing protocol
      • Gnutella was originally designed by Nullsoft, a subsidiary of America Online.
      • Its architecture is completely decentralised and distributed
      • When a client wishes to connect to the network they run through a list of nodes that are most likely to be up or take a list from a website and then connect to however many nodes they want
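
The publish, lookup, and leave services listed above, combined with the central directory of the centralized (Napster-style) design, can be sketched roughly as follows. This is only an illustration: the class `CentralDirectory` and its method names are hypothetical, and the peer and resource names are borrowed from the Napster example figure. Retrieval is not modelled, because in this design the file transfer happens directly between peers.

```python
from collections import defaultdict


class CentralDirectory:
    """Toy central index: maps a file name to the peers that announced it."""

    def __init__(self):
        self._index = defaultdict(set)  # file name -> set of peer ids

    def publish(self, peer, filename):
        """A peer uploads an index entry for a file it stores."""
        self._index[filename].add(peer)

    def lookup(self, filename):
        """Return the peers known to hold the file (possibly none)."""
        return sorted(self._index.get(filename, ()))

    def leave(self, peer):
        """Drop every index entry of a peer that leaves the network."""
        for holders in self._index.values():
            holders.discard(peer)


directory = CentralDirectory()
for peer, resource in [("m1", "A"), ("m2", "B"), ("m5", "E")]:
    directory.publish(peer, resource)

print(directory.lookup("E"))  # the requester would then fetch E from this peer
directory.leave("m5")
print(directory.lookup("E"))
```

Every lookup goes through the one directory object, which mirrors the single point of failure and the performance bottleneck named among the drawbacks.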
  • 9. Gnutella Search Mechanism
      • Assume: m1's neighbors are m2 and m3; m3's neighbors are m4 and m5; …
      • A, B, C, D, E, F are resources
      • TTL
      [Figure: the query "E?" is flooded from peer to peer until it reaches m5, which holds E]
    Peer-to-Peer File Sharing is all about the trading of copyrighted music and videos without paying anything to the authors
    KaZaA
      [Screenshot: native Windows application with a music category and a banner ad; 3 million users online sharing 4 PetaBytes of data]
    Gnutella
      • Advantages
        – Fast lookup
        – Low join and leave overhead
        – Popular files are replicated many times, so a lookup with a small TTL will usually find the file
          • Can choose to retrieve from a number of sources
      • Disadvantages
        – Not a 100% success rate, since the TTL is limited
        – Very high communication overhead
        – Uneven load distribution
    Searching
    Kazaa
      • Sharman Networks
      • Kazaa is a file sharing program that allows you to download audio, video, images, documents and software files.
    Search in Unstructured P2P
      Two general types of search in unstructured P2P:
      • Blind: try to propagate the query to a sufficient number of nodes (example: Gnutella)
      • Informed: utilize information about document locations
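A TTL-limited flood like the one described for Gnutella can be sketched as a breadth-first traversal that stops forwarding when the TTL reaches zero. The sketch below is a simplified model, not the Gnutella wire protocol: the function name is hypothetical, the link from m5 to m6 and the placement of resources on peers are assumptions, and every forwarded copy of the query is counted as one message.

```python
from collections import deque


def flood_search(neighbors, resources, start, wanted, ttl):
    """Flood a query from `start`; return (peers holding `wanted`, messages sent)."""
    hits = []
    visited = {start}
    queue = deque([(start, ttl)])
    messages = 0
    while queue:
        peer, remaining = queue.popleft()
        if wanted in resources.get(peer, ()):
            hits.append(peer)
        if remaining == 0:
            continue  # TTL exhausted: this peer does not forward the query
        for nxt in neighbors.get(peer, ()):
            messages += 1  # every forwarded copy costs bandwidth
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, remaining - 1))
    return hits, messages


# m1 -> m2, m3 and m3 -> m4, m5 follow the slide; m5 -> m6 is assumed.
neighbors = {"m1": ["m2", "m3"], "m3": ["m4", "m5"], "m5": ["m6"]}
# Assumed placement of resources A..F on peers m1..m6.
resources = {"m1": {"A"}, "m2": {"B"}, "m3": {"C"},
             "m4": {"D"}, "m5": {"E"}, "m6": {"F"}}

found_with_ttl_2 = flood_search(neighbors, resources, "m1", "E", ttl=2)
found_with_ttl_1 = flood_search(neighbors, resources, "m1", "E", ttl=1)
```

With a TTL of 2 the query reaches m5; with a TTL of 1 it stops at m2 and m3 and the search fails, which is the "not a 100% success rate" disadvantage in miniature.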
  • 10. Blind Search Methods
      BFS and Random Walk
      • BFS
      • Random walks
      • In unstructured networks, flooding would exhaust the bandwidth of the network.
    Informed search
      • Informed: utilize information about document locations.
      • APS
    APS – an example
      • Node J holds the requested object
      • Nodes deploy 2 walkers
      • Initially all index values are 20
      • TTL=3
    Adaptive Probabilistic Search
      • Each node keeps a local index consisting of one entry per neighbor for each object it has requested.
      • Index values represent the relative probability of finding that object through that neighbor
      • Searching is based on the simultaneous deployment of k walkers and probabilistic forwarding.
      • If a hit occurs, the walker terminates successfully.
      • On a miss, the query is forwarded to one of the node's neighbors.
      Example (indices at node A)
        A chooses B with Pr=0.3
        A chooses C with Pr=0.5
        A chooses D with Pr=0.2
    Collaborative Community
      • Rapidly changing work environment
        – Out-sourcing, in-sourcing, home-sourcing
        – Tight integration and team work with customers, partners, vendors
      • P2P allows management of documents at the level of closed working groups.
      • Collaboration software is designed to improve the productivity of individuals with common goals or interests.
      • Groove is a collaborative P2P system (http://www.groove.net)
    Microsoft Office Groove 2007
      – Part of the Microsoft Office system
      – Document sharing and collaboration
        • vital for a business.
      – Office Groove 2007 is a collaboration software program
        • helps teams work together dynamically and effectively, even if team members work for different organizations, or work remotely.
      Work Together: Anyone, Anytime, Anyplace
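The probabilistic forwarding step of Adaptive Probabilistic Search can be sketched as a weighted random choice over a node's index entries. The index values 30, 50 and 20 are assumed here because they reproduce the probabilities 0.3, 0.5 and 0.2 of the example at node A; the function names are hypothetical, and the way APS updates index values after a hit or a miss is not covered.

```python
import random


def forwarding_probabilities(index):
    """Normalize a node's index values for one object into probabilities."""
    total = sum(index.values())
    return {neighbor: value / total for neighbor, value in index.items()}


def choose_next_hop(index, rng=random):
    """Pick the neighbor a walker is forwarded to, weighted by index value."""
    neighbors = list(index)
    weights = [index[n] for n in neighbors]
    return rng.choices(neighbors, weights=weights, k=1)[0]


# Assumed index values at node A for one requested object.
index_at_a = {"B": 30, "C": 50, "D": 20}
probabilities = forwarding_probabilities(index_at_a)
next_hop = choose_next_hop(index_at_a, rng=random.Random(7))
```

Deploying k walkers would amount to calling `choose_next_hop` once per walker at each step until a hit occurs or the TTL runs out.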
  • 11. Distributed Computing: SETI@home
      • Search for Extraterrestrial Intelligence: asks whether we are alone in the universe or whether there is intelligent life somewhere else.
      • Over two million computers crunching away and downloading data gathered from the Arecibo radio telescope in Puerto Rico, USA
      • The SETI@Home project is widely regarded as the fastest computer in the world
      • Sharing of resources such as computation power, network bandwidth and storage
      • Achieves computing power more cheaply than a supercomputer can provide.
      • Developed by the Space Sciences Laboratory at the University of California, Berkeley, in the United States. http://setiathome.ssl.berkeley.edu
      • Launched in 1999
    How SETI@home works?
      • Collect the data source
        – The telescope at Arecibo collects data from outer space.
        – SETI@home uses a data recorder to record the data on removable tape.
      • Distribution of the data source
        – SETI@home divides the data into fixed-size work units.
        – SETI@home distributes these work units via the Internet from the servers to a client program.
        – The client program computes a result, returns it to the server, and gets another work unit.
    How SETI@home works? …
      • Scientific experiment that uses Internet-connected computers
      • Distributes a screen saver–based application to users
      • Applies signal analysis algorithms to different data sets to process radio-telescope data.
      • Has more than 3 million users
      [Figure: Radio-telescope Data, Main Server; 2. SETI client (screen saver) starts; 3. SETI client gets data from server and runs; 4. Client sends results back to server]
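The work-unit cycle described above (split the recorded data, hand a unit to a client, collect the result, hand out the next unit) can be sketched in a few lines. Everything here is a stand-in: the function names are hypothetical, the "analysis" is just the peak byte value rather than real signal processing, and the server is modelled as a plain list.

```python
def split_into_work_units(data: bytes, unit_size: int):
    """Cut recorded data into work units of `unit_size` bytes (the last may be shorter)."""
    return [data[i:i + unit_size] for i in range(0, len(data), unit_size)]


def analyse(unit: bytes) -> int:
    """Placeholder for signal analysis: report the strongest sample in the unit."""
    return max(unit)


def run_client(pending_units):
    """Fetch a unit, compute its result, return it, then ask for the next unit."""
    results = {}
    while pending_units:
        unit_id, unit = pending_units.pop(0)   # "get a work unit from the server"
        results[unit_id] = analyse(unit)       # "send the result back"
    return results


recorded = bytes(range(10))                    # stand-in for telescope data
units = split_into_work_units(recorded, unit_size=4)
results = run_client(list(enumerate(units)))
```

Since each unit is analysed independently, any number of clients could drain the same pool of units in parallel.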
  • 12. Skype
      • "… a free program that uses the latest P2P…technology to bring affordable and high quality voice communications to people all over the world…"
      • Skype offers voice, video, chat and data transfer services over IP
      • The first stable version of Skype was released in July 2004; since then the number of users has kept growing.
      • Skype now claims more than 20 million accounts and between 4 and 6 million users connected simultaneously.
    Super nodes
      • Super nodes are Skype clients run by users that have a "good" Internet connection and a "good" computer.
      • Having a good Internet connection means having a public IP address, without firewall restrictions.
      • A good computer is a machine that can forward other users' communications and handle many connections.
      • SN act as relays in the network
        – Hence, they need better connectivity and better performance.
      • SN are used to connect SC together.
    Skype Software features
      • VoIP from computer to computer
        – The most used feature.
      • VoIP from computer to regular phone (Skype Out)
        – By registering on Skype's website it is possible to buy credit and then call all over the world at very attractive rates compared with those of phone companies.
      • Video conferencing
        – Introduced in Skype 2.0 in 2006.
      • Instant Messaging
        – This feature is comparable to many other instant messaging clients like MSN Messenger, Yahoo! Messenger, Google Talk, etc.
        – The main difference is that Skype does not tell the user whether the person he is chatting with is typing or not. This is due to the P2P design of the Skype network.
      • File Transfer
        – The Skype network design has a big influence on the quality of file transfers. It can make them very fast (1 Mbps) or very slow (3 kbps).
    Skype – login
      • Skype clients connect directly to login servers, whose IP addresses are hard-coded within the software.
        – In this connection the login name and the version are sent in clear text.
      • The login server stores all user names and passwords and ensures that names are unique across the Skype name space
    Internet Telephony - Skype
      • The participants form a self-organizing P2P overlay network to locate and communicate with other participants.
      • The bandwidth is shared, and the real-time sound or video is the shared resource
      • Skype has an architecture similar to that of its predecessor KaZaA
      • There are three types of nodes in the Skype network:
        – Ordinary peers
        – Super-nodes
        – Central login server
      • Communications are encrypted (RSA)
      • Connection to a bootstrap node
        – When the SC (Skype Client) is installed for the first time, it comes with a list of SN to connect to.
        – First, the Skype Client tries to connect to 5 SN by sending a UDP packet to the IP addresses of super nodes randomly chosen from the host cache.
        – When the client finds a super node to connect to, it refreshes its list of active and available super nodes in the host cache.
        – The SC connects to a SN
  • 13. [Figure: Traffic volume by content type (Germany, BitTorrent)]
    Skype - user search
      • Similar to KaZaA (searching for the callee)
      • The client sends a user name to its SN and receives in answer a few IP addresses and port numbers
      • Subsequently the client contacts these nodes
      • If it cannot find the user, it sends the request to its SN once again and receives another few IP addresses and port numbers
      • The process continues until the user is found
    Skype - call establishment
      • Routing in the Skype overlay network is done by the SN.
      • When a SC tries to establish a call, it first asks its SN (if it is not a SN itself) where the callee is and tries to connect directly to it.
        – If the SC is restricted by a firewall, it connects to the callee using a SN as a relay.
        – If both the caller and the callee have public IP addresses, the caller sends signaling information over TCP to the callee
    What is PPLive?
      – An online video broadcasting and advertising network
        • Provides an online viewing experience comparable to that of traditional TV broadcasting
        • 75 million global installed base and 20 million monthly active users
        • 600+ channels on PPLive with content ranging from news, music, sports, movies, games, live video and other interactive services to a global audience
      – An efficient P2P technique platform and test bench
    History of PPLive:
      • Bill's story
        – Inventor of PPLive core technology
        – Dropped out of a post-graduate program to start PPLive
    P2P VIDEO STREAMING
    PPLIVE
      • Streaming video is content sent in compressed form over the Internet and displayed by the viewer in real time.
      • With streaming video or streaming media, a Web user does not have to wait to download a file to play it: the media is sent in a continuous stream of data and is played as it arrives.
      • The user needs a player, which is a special program that decompresses the stream and sends video data to the display and audio data to the speakers.
      • A player can be either an integral part of a browser or downloaded from the software maker's Web site.
      • P2P streaming – P2P TV
      • PPLive, PPStream, Joost (by Skype founders), …
  • 14. Streaming Tree Reconstruction after a Peer Departure
      [Figure]
    Industry Trends
      PPLive is well positioned to exploit the next explosive growth
      [Figure: timeline (2001, 2003, 2004, 2005) from basic to advanced applications: File Sharing (Napster), Downloading (BitTorrent), VOIP (Skype), Video Streaming (PPLive)]
    PPLive
      • Media Server (channel management server)
        – Retrieve the list of channels via HTTP
      • Membership Server
        – Retrieve a small list of member nodes of interest via UDP
    Multi-tree Streaming
      • Since all peers are involved in the data distribution, the load is spread among all nodes.
    Single-tree Streaming
      • A common approach to P2P streaming is to organize participating peers into a single tree-structured overlay
        – The content is pushed from the source towards all peers.
        – This way of organizing peers is called single-tree streaming.
      • In these systems, peers are hierarchically organized in a tree structure where the root is the stream source.
      • The content is spread as a continuous flow of information from the source down the tree.
      [Figure: a snapshot of a tree-based overlay with 231 nodes]
  • 15. Bit Torrent
      • Created by Bram Cohen in 2001
    What is BitTorrent?
      • A peer-to-peer file transfer protocol
      • Extremely popular today
      • "Pull-based" "swarming" approach
        • Each file is split into smaller pieces
        • Nodes request desired pieces from neighbors
          • As opposed to parents pushing data that they receive
        • Pieces are not downloaded in sequential order
      • Encourages contribution by all nodes
    Overall Architecture
      [Figure, repeated across animation steps: a Web Server, a Tracker, the Downloader ("US"), and peers A, B and C, each labelled as a seed or a leech]
  • 16. BitTorrent Lingo
      • Seeder = a peer that provides the complete file.
      • Initial seeder = a peer that provides the initial copy.
      • Leecher = one who is downloading
      [Figure: an initial seeder, a seeder and two leechers]
    BitTorrent Basics
      • Files are broken into pieces.
        – Users each download different pieces from the original uploader (seed).
        – Users exchange the pieces with their peers to obtain the ones they are missing.
      • This process is organized by a centralized server called the Tracker.
    Critical Elements
      • A web server
        – stores and serves the .torrent file.
        – For example:
          • http://bt.btchina.net
          • http://bt.ydy.com/
      [Figure: a Web Server serving "The Lord of Ring.torrent" and "Troy.torrent"]
  • 17. BitTorrent Swarm
      • Swarm
        – Set of peers all downloading the same file
        – Organized as a random mesh
      • Each node knows the list of pieces downloaded by its neighbors
      • A node requests pieces it does not own from its neighbors
      • swarm
        – The group of machines that are collectively connected for a particular file.
          • For example, if you start a BitTorrent client and it tells you that you're connected to 10 peers and 3 seeds, then the swarm consists of you and those 13 other people.
    Critical Elements
      • The .torrent file
        – Static 'metainfo' file that contains the necessary information:
          • URL of tracker
          • Piece length – usually 256 KB
          • SHA-1 hashes of each piece in the file
          • IP address of the Tracker
      [Figure: Matrix.torrent]
    Critical Elements
      • A BitTorrent tracker
        – The tracker maintains information about all BitTorrent clients utilizing each torrent.
        – The tracker identifies the network location of each client either uploading or downloading the P2P file associated with a torrent.
        – It also tracks which fragment(s) of that file each client possesses, to assist in efficient data sharing between clients.
          • i.e. the tracker keeps track of all peers downloading the file
        For example:
          • http://bt.cnxp.com:8080/announce
          • http://btfans.3322.org:6969/announce
    Critical Elements
      • An end user (peer)
        – Users who want to use BitTorrent must install the corresponding software or a plug-in for web browsers.
        – Downloader (leecher): a peer that has only a part (or none) of the file.
        – Seeder: a peer that has the complete file and chooses to stay in the system to allow other peers to download
        – BitTorrent clients connect to a tracker when attempting to work with torrent files.
          • The tracker notifies the client of the P2P file location (that is normally on a different, remote server).
    How a node enters a swarm for file "popeye.mp4"
      • File popeye.mp4.torrent is hosted at a (well-known) webserver
      • The .torrent has the address of the tracker for the file
      • The tracker, which runs on a webserver as well, keeps track of all peers downloading the file
      [Figure, step 1: the Peer contacts www.bittorrent.com]
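The piece-related fields of a .torrent file can be illustrated by cutting a file into pieces and hashing each one with SHA-1. This is a simplified sketch that only uses Python's standard `hashlib`: the helper names are hypothetical, the result is a plain dictionary rather than the encoded file a real client writes, and the tracker URL is simply one of the example URLs listed above.

```python
import hashlib

PIECE_LENGTH = 256 * 1024  # "usually 256 KB", as stated on the slide


def piece_hashes(data: bytes, piece_length: int = PIECE_LENGTH):
    """Return one 20-byte SHA-1 digest per piece of `data`."""
    return [hashlib.sha1(data[i:i + piece_length]).digest()
            for i in range(0, len(data), piece_length)]


def build_metainfo(name: str, data: bytes, tracker_url: str):
    """Collect the fields the slide lists for the static metainfo file."""
    return {
        "announce": tracker_url,        # URL of the tracker
        "name": name,
        "piece length": PIECE_LENGTH,
        "pieces": piece_hashes(data),   # SHA-1 hash of each piece
    }


content = b"x" * (600 * 1024)           # stand-in for the shared file
metainfo = build_metainfo("popeye.mp4", content,
                          "http://bt.cnxp.com:8080/announce")
```

A 600 KB file with 256 KB pieces yields three hashes, the last one covering a shorter final piece.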
  • 18. How a node enters a swarm for file "popeye.mp4"
      • File popeye.mp4.torrent is hosted at a (well-known) webserver
      • The .torrent has the address of the tracker for the file
      • The tracker, which runs on a webserver as well, keeps track of all peers downloading the file
      [Figure, steps 2 and 3: the Peer obtains the .torrent from www.bittorrent.com, contacts the Tracker, and joins the Swarm]
    Three elements necessary to sharing a file with BitTorrent
      • The tracker - coordinates connections among the peers.
        – The tracker doesn't know anything of the actual contents of a file
        – Generally, it's considered good manners to continue seeding a file after you have finished downloading, to help out others.
      • The web server - stores and serves the .torrent file.
      • At least one seeder
        – Holds the file's actual contents.
        – The seeder is almost always an end-user's desktop machine (peer), rather than a dedicated server machine.
        – Seeding is monitored by the Tracker
        – Seed your file for a long time to prevent peers from being left with incomplete files.
      • When you finish a download in BitTorrent, and you are only uploading, you're seeding!
    File sharing
      • Large files are broken into pieces of size between 64 KB and 1 MB
      [Figure: a file divided into pieces 1 to 8]
    BT: publishing a file
      [Figure: Harry Potter.torrent is placed on a Web Server; a Tracker; a seeder and downloaders (labels: Bob, John, Joe, Fan Bin, User)]
    A trivial example
      [Figure: the Seeder holds pieces {1,2,3,4,5,6,7,8,9,10}; Downloaders A, B and C start with {} and their piece sets grow over time, e.g. {1,2,3}, {1,2,3,4}, {1,2,3,5}, {1,2,3,4,5}]
  • 19. P2P Technical Challenges
      • Routing protocols
      • Network topologies
      • Peer discovery
      • Communication/coordination protocols
      • Quality of service
      • Security
    P2P SECURITY
      Security is the condition of being protected against danger or loss.
    P2P Security
      • P2P file sharing networks are constantly under attack.
      • P2P is potentially more vulnerable than client-server.
        – Decentralized
        – More difficult to manage and control
      • Need to understand the security issues for architecting future P2P apps
    Attacks On & From
      • Attacks on P2P systems
      • Attacks from P2P systems
    Types of P2P Attacks
      • Poisoning: a client can provide content that doesn't match the description.
        – A client A can broadcast a message saying it needs file 'X'. A malicious client can send a message back to A saying it has file X, then send it file Y.
      • Denial of Service attacks, which decrease or halt overall network activity.
      • Defection attacks, which allow a client to participate in the network with a very low upload-to-download ratio.
    Types of P2P Attacks….
      • Virus attacks, where a malicious client can add viruses to files shared on the network.
      • Malware attacks, where the P2P software contains spyware.
      • Filtering attacks, where network operators may attempt to prevent P2P network data from being carried.
  • 20. Attacks on P2P sharing
      Two types:
      • Pollution: file corruption → File Content
      • Index poisoning → File Index
    File Pollution
      [Figure: a pollution company turns original content into polluted content; pollution servers inject it into the file sharing network; unsuspecting users such as Alice and Bob spread the pollution]
    INDEX POISONING
      • The aim of the attacker is to make several peers believe that some popular file is present at the victim.
      • The attacker sends a location publish message to every crawled peer.
      • In this message, the attacker includes the victim's IP address and port number.
      • The attacker puts the file hash of a popular file in the message.
      • Peer B adds this file hash to its index along with the location of the victim.
      • When a peer C searches for that file, it may be told by some poisoned peer that the victim has the file.
  • 21. Index Poisoning
      [Figure: hosts 23.123.78.6, 123.12.7.98 and 234.8.89.20 in a file sharing network, with the index below]
        title      location
        bigparty   123.12.7.98
        smallfun   23.123.78.6
        heyhey     234.8.89.20
    Index Poisoning
      [Figure: the same index after poisoning, with an added entry pointing at host 111.22.22.22]
        title      location
        bigparty   123.12.7.98
        smallfun   23.123.78.6
        heyhey     234.8.89.20
        bighit     111.22.22.22
    Free Riding
      • Peers share little or no data in P2P file-sharing systems
      • Measurement
        – Nearly 70% of Gnutella users share no files
        – Nearly 50% of all responses are returned by the top 1% of sharing hosts
      • Incentive mechanisms to encourage user cooperation
    P2P Worms
      • Topological Scan Worms
      • Passive Worms
      A computer worm is a self-replicating malware computer program. It uses a computer network to send copies of itself to other nodes. It may do so without any user intervention.
    ROUTING TABLE POISONING
      • The aim of the attacker is to make the peers add the victim as their neighbor
      • The attacker sends node announcement messages to every crawled peer.
      • The attacker includes the victim's IP address and port number in these messages
      • The peers add the victim as their neighbor
      • Query messages are forwarded to the victim
    TOPOLOGICAL WORM ATTACK
  • 22. TOPOLOGICAL WORM ATTACK
      [Figure]
    PASSIVE P2P WORMS
      • Vulnerability in the protocol
      • Wait for the vulnerable targets to contact them
      • Case 1
        – The worm creates infected copies of itself with attractive filenames and places them in the shared folder of the P2P client, or replaces the files present in the shared folder with itself
        – e.g. VBS.Gnutella, Benjamin Worm etc.
      • Case 2
        – The worm answers positively to a proportion of search queries by changing the name of the corrupted file to match the search query
        – e.g. Gnuman
    P2P-Worm.Win32.Benjamin.a
      • P2P-Worm.Win32.Benjamin.a (Kaspersky Lab) is also known as: Worm.P2P.Benjamin.a (Kaspersky Lab), W32/Benjamin.worm (McAfee), W32.Benjamin.Worm (Symantec), Win32.HLLW.Benjamin (Doctor Web)
      • This worm uses the Kazaa file exchange P2P network to spread itself.
      • Benjamin is written in Borland Delphi and is approximately 216 KB in size; it is compressed by the AsPack utility.
    Effects
      • Eating up free disk space
      • Benjamin opens a Web page called benjamin.xww.de to display banner ads.
        – One morning the Benjamin.xww.de Web site had a message saying: "Domain closed due to massive abuse."
    How vulnerable is BitTorrent?
    Pollution Attack
      • 1. The peers receive the peer list from the tracker.
  • 23. Pollution Attack
      • 2. One peer contacts the attacker for a chunk of the file.
    Pollution Attack
      • 3. The attacker sends back a false chunk.
      • This false chunk will fail its hash check and will be discarded.
    Pollution Attack
      • 4. The attacker requests all chunks from the swarm and wastes their upload bandwidth.
    DDOS Attack
      • DDOS = Distributed denial of service
      • Based on the fact that the BitTorrent Tracker has no mechanism for validating peers.
      • Uses modified client software
    DDOS Attack
      • 1. The attacker downloads a large number of torrent files from a web server.
    DDOS Attack
      • 2. The attacker parses the torrent files with a modified BitTorrent client and, as he announces that he is joining the swarm, spoofs his IP address and port number with the victim's.
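The reason a false chunk "will fail its hash and will be discarded" is that the receiver compares the SHA-1 digest of each received chunk with the digest recorded in the .torrent file. A minimal sketch of that check follows; the function name and the sample payloads are assumptions, and only the standard `hashlib` module is used.

```python
import hashlib


def accept_chunk(chunk_index: int, payload: bytes, expected_hashes) -> bool:
    """Keep a chunk only if its SHA-1 digest matches the one from the .torrent."""
    return hashlib.sha1(payload).digest() == expected_hashes[chunk_index]


# Digests as they would have been recorded when the torrent was created.
genuine_chunks = [b"first chunk of the file", b"second chunk of the file"]
expected_hashes = [hashlib.sha1(c).digest() for c in genuine_chunks]

kept = accept_chunk(0, genuine_chunks[0], expected_hashes)
discarded = not accept_chunk(1, b"polluted data from the attacker", expected_hashes)
```

The check protects the integrity of the file, but the victim has still spent download time and bandwidth on the discarded chunk, which is what the pollution attack exploits.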
  • 24. DDOS Attack
      • 3. As the tracker receives requests for a list of participating peers from other clients, it sends the victim's IP address and port number.
    DDOS Attack
      • 4. The peers then attempt to connect to the victim to try to download a chunk of the file.
    Attack illustration
      [Figure: the attacker collects .torrent files from a discussion forum; clients ask the Tracker "Who has the files?"; the Tracker answers "Victim has the files!"; the clients contact the victim]
    Current Solutions: Pollution Attacks
      • Blacklisting
        – Achieved using software such as PeerGuardian or MoBlock.
        – Blocks connections from blacklisted IPs, which are downloaded from an online database.
    Solutions – TRUST and REPUTATION
      • Most of the solutions proposed to solve the problem of attacks are based on building trust (and/or reputation) between the peers
      • Some of the popular approaches are:
        – DCRS - BitTorrent
        – EigenTrust
        – XRep
      • These approaches do slow down the attack
    What is Trust? What is reputation?
      • Trust – a peer's belief in another peer's capabilities, honesty and reliability based on its own experiences.
      • Reputation – a peer's belief in another peer's capabilities, honesty and reliability based on recommendations received from other peers.
        – Reputation can be centralized, computed by a third party, or it can be decentralized, computed independently by each peer after asking other peers for recommendations.
  • 25. What is Trust? ……..
      • Both Trust and Reputation are used to evaluate a peer's trustworthiness.
      • Trust and Reputation increase or decrease with further experience.
      • Trust and reputation both depend on some context.
      • For example:
        – Mike trusts John as his doctor, but he doesn't trust John as a mechanic who can fix his car.
          • In the context of seeing a doctor, John is trustworthy
          • In the context of fixing a car, John is untrustworthy.
    What is Trust Management ?
      • "Trust Management" was first coined by Blaze et al. in 1996
        – a coherent framework for the study of security policies, security credentials and trust relationships.
    Reputation Management
      • Need for trust mechanisms
        – To assess the trustworthiness of peers and the content
          • Malicious peers generate an unlimited number of inauthentic files
        – To deter malicious behavior
      • Reputation is an assumption that past behavior is indicative of future behavior
      • Use of reputation to build trust
    An Example Trust Management System (BitTorrent)
      • Debit-Credit Reputation System
      • Each client calculates a local trust score for its peers, based on valid pieces uploaded/downloaded
      • The tracker combines these individual scores to make a global score
    DCRS… …(cont'd)
      Local Trust Score Computation
      \[ F_{ij} = U_{ij} - D_{ij} \]
      where \(U_{ij}\) is the number of chunks that i uploaded to j and \(D_{ij}\) is the number of chunks that i downloaded from j.
      Using \(F_{ij}\), the local trust score \(LT_{ij}\) is computed as
      \[ LT_{ij} = \begin{cases} -1 & \text{if a bogus chunk is uploaded by peer } j \\ 0 & \text{if } F_{ij} > t \\ 1 & \text{if } F_{ij} \le t \end{cases} \]
      where t is the fairness threshold.
    DCRS… …(cont'd)
      Global Trust Score Computation
      • Global Trust Scores represent the opinion that the rest of the swarm has of a peer.
      • At regular intervals the tracker receives the local trust scores of the peers in the swarm.
      • The tracker chooses k random local trust scores for peer j in the swarm, where k is less than the number of peers in the swarm.
      • The tracker uses the k local trust scores for peer j and sets their average as the global trust score for j
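The two DCRS computations can be sketched directly from the definitions above: a local score from the upload/download balance and the fairness threshold, and a global score as the average of k randomly chosen local scores. The function names, the sample numbers and the use of Python's `random` and `statistics` modules are assumptions for illustration; how the tracker collects the scores is not modelled.

```python
import random
from statistics import mean


def local_trust(uploaded_to_j, downloaded_from_j, threshold, bogus_chunk=False):
    """Local trust score LT_ij that peer i assigns to peer j."""
    if bogus_chunk:
        return -1                        # j uploaded a bogus chunk
    fairness = uploaded_to_j - downloaded_from_j   # F_ij = U_ij - D_ij
    return 0 if fairness > threshold else 1


def global_trust(local_scores_for_j, k, rng=random):
    """Tracker side: average k randomly chosen local scores reported about j."""
    if not 0 < k <= len(local_scores_for_j):
        raise ValueError("k must be between 1 and the number of reported scores")
    return mean(rng.sample(local_scores_for_j, k))


# i uploaded 10 chunks to j but received only 2 back; threshold t = 5 (assumed).
unfair = local_trust(10, 2, threshold=5)
fair = local_trust(3, 2, threshold=5)
polluter = local_trust(0, 4, threshold=5, bogus_chunk=True)

reported = [1, 1, 0, -1, 1]              # local scores other peers hold for j
score = global_trust(reported, k=3, rng=random.Random(1))
```

Sampling only k scores instead of using all of them limits how much a group of colluding peers can shift the global score in any single round.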