In a generalization of the FileTransfer application, a data set has to be distributed from a single source to a (possibly large) set of receivers.
There are many use cases for this; here are some examples:
- The LHC case: Data is generated at one point (CERN, the "Tier-0"), and identical copies must be transferred to a distributed set of "Tier-1" centers for storage, processing, and further (partial) dissemination.
- Software download: A new version of, e.g., a GNU/Linux distribution is released by a publisher. Many thousands of users want to download it within the first few hours or days.
- OS/VM image distribution for data centers and clouds: A disk image containing an OS installation should be replicated to many servers so that virtual machines can be started from it.
Protocols and Systems
- BitTorrent is a peer-to-peer protocol that distributes the dissemination work among a large and dynamic set of nodes.
- Mirror servers can be used to store copies of popular files; clients are directed (for example through DNS or HTTP redirects, or by manual selection) to a mirror that is "close" to them and/or has free capacity.
- Multicast replicates data efficiently when many destinations are interested in the same data, because the network rather than the sender duplicates the packets. However, building reliable transmission protocols on top of it is not easy. Examples of such attempts are udpcast and FLUTE. The Ghost software by Symantec (originally Norton) is a popular commercial system that can use multicast for image distribution.
- USENET News solves an even more general problem: It distributes data (articles) from many sources to many receivers in a (rather static) peer-to-peer network.
- scp-wave from the OpenNebula cloud toolkit builds on SSH's scp and uses a distribution tree.
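
The distribution-tree idea mentioned in the last item can be illustrated with a small sketch. This is not scp-wave's actual algorithm, only a minimal model under an assumed strategy: in every round, each host that already holds the file copies it to one host that does not, so the number of holders roughly doubles per round. The function name `wave_schedule` and the host names are hypothetical.

```python
def wave_schedule(source, receivers):
    """Plan copy rounds for a doubling distribution tree.

    Returns a list of rounds; each round is a list of
    (sender, receiver) pairs that could run in parallel.
    Assumption: every holder sends to exactly one new host per round.
    """
    holders = [source]          # hosts that already have the file
    pending = list(receivers)   # hosts still waiting for it
    rounds = []
    while pending:
        # Each current holder serves one pending host in this round.
        batch = pending[:len(holders)]
        pending = pending[len(holders):]
        rounds.append(list(zip(holders, batch)))
        # The hosts served in this round become senders in the next one.
        holders.extend(batch)
    return rounds


if __name__ == "__main__":
    plan = wave_schedule("src", ["h%d" % i for i in range(1, 8)])
    for number, pairs in enumerate(plan, start=1):
        print("round", number, pairs)
```

With seven receivers this plan needs three rounds instead of seven sequential copies from the source. In a real deployment each pair would correspond to one copy operation (for instance one scp invocation), and unequal link speeds or failed hosts would call for a less rigid schedule.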
- 08 Dec 2010