Network Coding Based P2P File Sharing: History
Contributors:

Peer-to-peer (P2P)  files sharing networks have been one of the most reliable platforms to realize
the potential of network coding, since end hosts at the edge of the Internet have abundant computational resources with modern processors.
In this survey paper, we extensively summarize, assess, compare, and classify the most recently used techniques to improve P2P content distribution systems performance using network coding. To the best of our knowledge, this survey is the first comprehensive survey that specifically focuses on the performance of network coding based P2P file sharing systems.

  • peer-to-peer computing
  • random linear network coding
  • file sharing
  • rarest-piece issue
The Peer-to-Peer (P2P) architecture has triggered a technical revolution of large content distribution service on the Internet. The P2P model is the antithesis of the classic client–server architecture, in which each node can behave either as a server or a client. In the client–server architecture, data stored on the server is sent to clients on a request basis. However, for a large number of clients, the workload on the server may be too heavy, leading to extremely low download rates. Moreover, for this architecture, the server represents a single point of failure. On the contrary, in the P2P architecture, each node is called a servent [1,2] which means that a node may behave as a server as well as a client at the same time. This leads to higher bandwidth utilization, more availability of content, and reduced download times [3]. The BitTorrent protocol [4] is considered to be one of the most successful P2P file sharing protocols. BitTorrent-like systems work in two phases, the first phase utilizes a discovery protocol, in which peers discover and establish connections to each other. Detailed discussions on the discovery protocol can be found in [5,6,7,8,9,10]. The second phase utilizes file dissemination protocol, in which peers start to download and upload the chunks of a file until they get the complete file.
In the file dissemination protocol, the algorithm used for the selection of the pieces to download is highly important to achieve an acceptable level of system performance. In fact, wrong selection choices may lead to a situation where pieces of a file owned by a peer are not required by any other peer. Conversely, a piece which is needed by many peers is either very rare or not available within the network. Consequently, the peer requires a specific policy for downloading the pieces of the file. The original specification of the BitTorrent protocol [4] requires that clients are able to download pieces in a purely random way. Subsequently, new selection techniques were introduced to improve performance [11,12]. In general, an involved peer in any P2P file sharing network must answer the following questions when requesting a file: (1) which pieces should be downloaded, and (2) from which peers? This is known as piece scheduling [13], piece selection [14], or the coupon collector’s problem [15]. The other important challenge, especially in large-scale content distribution networks, is that the departure, or the failure of a peer is naturally dynamic as the peer may depart or fail without notice, which is known as dynamics peer participation or churn [16]. Subsequently, pieces that were owned by the departing peers may become rare or unavailable.
With the assistance of network coding [17], the aforementioned issues can be partially resolved. Rather than distributing the original data pieces, a peer shares encoded pieces, where each encoded piece is a linear combination of the original pieces. In this manner, there is no need for the requester peer to ask for special data pieces since all encoded pieces are almost equally likely to be beneficial to the peer. A peer in the network can blindly download encoded pieces from other peers. When a sufficient number of encoded pieces are owned, the original file can be reconstructed. Using this scheme, the rarest piece problem is avoided, and the download time of the file is potentially reduced [18]. Another feature of network coding is the robustness against sudden peer departure. In the original BitTorrent, if a peer who is the single owner of a piece, leaves the swarm, the complete original file cannot be collected by the remaining peers. However, with network coding, the concern regarding the absence of certain data pieces is no longer an issue. Since the pieces are linearly combined together, each of them is available in a large number of encoded pieces in the network.
The main contributions of this paper are as follows. First, we provide a detailed tutorial of network coding implementation for P2P file sharing. Second, we extensively survey the most important and recent network coding based P2P file sharing protocols and provide a classification for those protocols. Third, this paper, to the best of our knowledge, is the first paper that is specific to network coding based P2P content distribution systems.

This entry is adapted from the peer-reviewed paper 10.3390/app10072206