Leaf-Spine Architecture, Fat Tree, Folded Clos
An architecture for data center networks that is gaining popularity for its scalability properties.
Servers are connected to "leaf" switches. These are often arranged as "top-of-rack" (ToR) switches. In a redundant setup, each server connects to two leaf switches.
Each leaf switch has connections to all "spine" switches. The spine switches aren't connected directly to each other. Any packet from a given server to a server in another rack goes through the sending server's leaf switch, then one of the spine switches, then the receiving server's leaf switch. Equal-cost multipath (ECMP) routing is used to distribute traffic across the set of spine switches.
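The forwarding path and the role of equal-cost multipath routing can be illustrated with a minimal Python sketch. It assumes that each flow is identified by a 5-tuple and that the spine is chosen by hashing that tuple, so all packets of one flow take the same path. The names pick_spine and path, the switch names, and the use of SHA-256 are hypothetical choices for illustration; real switches use their own hardware hash functions.

```python
import hashlib

def pick_spine(flow, spines):
    """Choose a spine for a flow by hashing its 5-tuple (illustrative only)."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    # Reduce the hash to an index into the list of equal-cost next hops.
    index = int.from_bytes(digest[:4], "big") % len(spines)
    return spines[index]

def path(src_leaf, dst_leaf, flow, spines):
    """Return the switches a packet traverses between two servers."""
    if src_leaf == dst_leaf:
        # Assumption: servers on the same leaf are switched locally.
        return [src_leaf]
    return [src_leaf, pick_spine(flow, spines), dst_leaf]

# Hypothetical fabric with four spines.
spines = ["spine1", "spine2", "spine3", "spine4"]
# 5-tuple: source IP, destination IP, protocol number, source port, destination port.
flow = ("10.0.1.5", "10.0.2.9", 6, 49152, 443)
print(path("leaf1", "leaf2", flow, spines))
```

Because the hash depends only on the flow's 5-tuple, packets of one flow stay in order on a single path, while different flows spread across the spines.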
Advantages of the Leaf-Spine Architecture
Compared with earlier data center network architectures such as three-tier, leaf-spine networks can "scale out" to fairly large numbers of servers by adding more switches. Both the leaf and the spine switches can be relatively simple, non-modular switches based on "merchant silicon" chipsets. These switches tend to be inexpensive and efficient in terms of delay and power consumption.
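As a rough illustration of how such a fabric scales, the following Python sketch computes the size of a two-tier leaf-spine network from the switch port counts. It assumes that every leaf has exactly one uplink to every spine and that all remaining leaf ports face servers. The function names, port counts and link speeds are hypothetical example values, not figures from any particular product.

```python
def fabric_capacity(spine_ports, leaf_ports, num_spines, redundant=False):
    """Return (max_leaves, max_servers) for a two-tier leaf-spine fabric."""
    # Assumption: one uplink from each leaf to each spine.
    downlinks = leaf_ports - num_spines
    if downlinks <= 0:
        raise ValueError("leaf has no ports left for servers")
    # Each spine needs one port per leaf, so spine port count caps the leaves.
    max_leaves = spine_ports
    server_ports = max_leaves * downlinks
    # In a redundant setup each server consumes a port on two leaves.
    max_servers = server_ports // 2 if redundant else server_ports
    return max_leaves, max_servers

def oversubscription(downlinks, down_gbps, uplinks, up_gbps):
    """Ratio of server-facing to spine-facing bandwidth on one leaf."""
    return (downlinks * down_gbps) / (uplinks * up_gbps)

# Hypothetical example: 32-port spines, 48-port leaves, 4 spines.
print(fabric_capacity(32, 48, 4))                  # (32, 1408)
print(fabric_capacity(32, 48, 4, redundant=True))  # (32, 704)
print(oversubscription(44, 10, 4, 40))             # 2.75
```

Under these assumptions, adding spines increases the bandwidth between leaves, while the number of ports on each spine limits how many leaves can be attached.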
Limitations of the Leaf-Spine Architecture
Traffic forwarding through a leaf-spine network needs to be able to make use of multiple paths. Traditional Ethernet forwarding based on Spanning Trees cannot do this. In the Layer-2 approach, the Ethernet forwarding model is enhanced to support multiple paths, for example using protocols such as TRILL. In the "overlay" approach, traffic is forwarded across the leaf-spine fabric using Layer-3 (IP) routing. Where transparent Layer-2 connectivity is required between servers (either physical, or virtual machines running on them), Ethernet frames are encapsulated in IP packets using protocols such as GRE, VXLAN, STT or similar.
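To make the encapsulation step more concrete, here is a minimal Python sketch of the 8-byte VXLAN header as laid out in RFC 7348: an 8-bit flags field, 24 reserved bits, a 24-bit VXLAN Network Identifier (VNI) and 8 more reserved bits. The helper names vxlan_header, encapsulate and decapsulate are hypothetical, and the sketch leaves out the outer Ethernet, IP and UDP headers that a real tunnel endpoint would also add.

```python
import struct

VXLAN_UDP_PORT = 4789        # IANA-assigned destination UDP port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid

def vxlan_header(vni):
    """Build the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits).
    # Word 2: VNI (24 bits) + reserved (8 bits).
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)

def encapsulate(inner_frame, vni):
    """Prefix an Ethernet frame with a VXLAN header (the UDP payload)."""
    return vxlan_header(vni) + inner_frame

def decapsulate(payload):
    """Split a VXLAN UDP payload into (vni, inner Ethernet frame)."""
    flags_word, vni_word = struct.unpack("!II", payload[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return vni_word >> 8, payload[8:]

# Placeholder bytes standing in for a real Ethernet frame.
frame = b"\x00" * 14 + b"payload"
packet = encapsulate(frame, vni=5000)
print(decapsulate(packet) == (5000, frame))  # True
```

The VNI plays a role similar to a VLAN ID: it keeps the Layer-2 segments of different tenants apart while their frames cross the same routed fabric.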
In the traditional three-tier data center network architecture, aggregation switches are relatively complex modular systems, which provide a natural place for additional functionality such as firewalls. The simple switches used in typical leaf-spine networks don't support such sophisticated processing. Where such functions are needed, they have to be implemented somewhere else. Often they are moved towards the edge, i.e. into the servers or hypervisors.