A performance issue can be observed with some distributed applications in data center networks. The basic pattern is that a client sends a request to multiple servers, and these servers send their responses at more or less the same time. If the senders are well synchronized and the responses are large, this leads to bursts of traffic arriving from multiple ports at the switches "upstream" from the receiver. Because data center switches such as ToR (top-of-rack) switches often have small buffers, this can lead to correlated packet loss. TCP reacts badly to this kind of loss, and for applications that have to wait for all responses before they can proceed, overall completion times can increase dramatically. This phenomenon is known as (TCP) incast.
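To make the traffic pattern concrete, here is a minimal sketch in Python of the scatter/gather pattern that triggers incast. The server addresses, port, and request format are made up for illustration; in a real cluster these would be storage or worker nodes that all answer the same request.

<verbatim>
import concurrent.futures
import socket
import time

# Hypothetical backends (illustrative addresses, not a real deployment).
SERVERS = [("10.0.0.%d" % i, 9000) for i in range(1, 41)]
REQUEST = b"GET block-42\n"

def fetch(addr):
    # One TCP connection per server; each server answers with a large reply.
    with socket.create_connection(addr, timeout=5) as s:
        s.sendall(REQUEST)
        chunks = []
        while True:
            data = s.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)

start = time.time()
with concurrent.futures.ThreadPoolExecutor(max_workers=len(SERVERS)) as pool:
    # All 40 replies converge on the client's access link at the same time;
    # the job is only done when the *slowest* response has arrived.
    replies = list(pool.map(fetch, SERVERS))
print("completion time: %.3f s" % (time.time() - start))
</verbatim>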
There have been a variety of suggestions to address performance issues due to incast:
The effect of incast congestion can be mitigated by providing large enough buffers in the network devices where incast occurs. However, this directly increases the cost of the network, and larger buffers also lead to higher queueing delays under congestion, which harms performance in general.
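A back-of-the-envelope calculation shows why modest buffers overflow under incast. The numbers below are purely illustrative, not vendor specifications:

<verbatim>
senders = 40
response_bytes = 256 * 1024            # 256 KB reply from each server
burst = senders * response_bytes       # bytes converging on one egress port
port_buffer = 1 * 1024 * 1024          # 1 MB of buffer behind that port

print("burst: %.1f MB, buffer: %.1f MB, overflow: %s"
      % (burst / 2.0**20, port_buffer / 2.0**20, burst > port_buffer))
# -> burst: 10.0 MB, buffer: 1.0 MB, overflow: True
</verbatim>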
TCP can be reconfigured to retransmit more aggressively after packet loss. In particular, the original incast paper recommends setting the minimum RTO (Retransmission Timeout) to a low value such as one millisecond. (The minimum RTO in Linux defaults to 200 ms; it can be lowered per route with iproute2's rto_min option.)
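The following sketch, with illustrative round-trip and transfer times, shows why the minimum RTO dominates incast completion times: a single tail-loss timeout stalls the whole query for RTO_min, which at the Linux default is orders of magnitude longer than the transfer itself.

<verbatim>
transfer = 0.001              # ~1 ms to deliver the responses without loss

for rto_min in (0.200, 0.001):        # Linux default 200 ms vs. proposed 1 ms
    completion = transfer + rto_min   # one tail-loss timeout stalls the query
    print("RTO_min = %5.0f ms -> completion with one timeout: %7.1f ms"
          % (rto_min * 1e3, completion * 1e3))
# -> a 200 ms RTO_min turns a ~1 ms query into a ~201 ms one;
#    a 1 ms RTO_min keeps it around 2 ms.
</verbatim>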
It has been suggested that AQM (Active Queue Management) and ECN (Explicit Congestion Notification) can improve the behavior of TCP in incast situations.
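As a toy illustration of how AQM with ECN changes the picture: instead of dropping packets only once the buffer is full, the switch marks packets as soon as the queue exceeds a threshold, so senders learn about congestion before anything is lost and no retransmission timeout is triggered. The queue limit and threshold below are arbitrary illustrative values.

<verbatim>
QUEUE_LIMIT = 100     # packets of buffer at the port (illustrative)
MARK_THRESHOLD = 20   # start ECN-marking well before the buffer is full

def enqueue(queue_len, ecn_capable):
    """Return the switch's action for one arriving packet."""
    if queue_len >= QUEUE_LIMIT:
        return "drop"                    # tail drop: the only option without AQM
    if ecn_capable and queue_len >= MARK_THRESHOLD:
        return "mark"                    # congestion signal, no loss, no RTO
    return "forward"

for qlen in (5, 30, 100):
    print("queue=%3d -> %s" % (qlen, enqueue(qlen, ecn_capable=True)))
</verbatim>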
Improvements to TCP's congestion control have been a popular research topic for a long time. DCTCP (Data Center TCP) was developed partly in response to incast problems.
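At its core, DCTCP keeps a running estimate alpha of the fraction of ECN-marked packets and reduces the congestion window in proportion to it, rather than halving the window at any sign of congestion. A minimal sketch of that per-window update rule, following the DCTCP paper (the gain of 1/16 is the paper's suggested value; the traffic numbers are made up):

<verbatim>
G = 1.0 / 16          # gain for the moving average, per the DCTCP paper

def dctcp_update(alpha, cwnd, marked, total):
    """One per-window update: 'marked' of 'total' packets carried ECN marks."""
    frac = marked / float(total)
    alpha = (1 - G) * alpha + G * frac      # smoothed fraction of marked packets
    if marked:
        cwnd = cwnd * (1 - alpha / 2)       # cut in proportion to congestion,
                                            # not by half as standard TCP would
    return alpha, cwnd

alpha, cwnd = 0.0, 100.0
for marked in (0, 2, 10, 0):
    alpha, cwnd = dctcp_update(alpha, cwnd, marked, total=100)
    print("marked=%2d  alpha=%.4f  cwnd=%.1f" % (marked, alpha, cwnd))
</verbatim>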
Senders can artificially limit their sending rate in order to reduce the chance of congestion.
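One concrete way to do this on Linux is socket pacing: the fq queueing discipline (and, in newer kernels, TCP itself) honours a per-socket maximum pacing rate. A hedged sketch; SO_MAX_PACING_RATE is the real Linux socket option, but the numeric fallback and the chosen rate are illustrative.

<verbatim>
import socket

# Linux-only; the constant is 47 in <asm-generic/socket.h> in case this
# Python build does not expose it.
SO_MAX_PACING_RATE = getattr(socket, "SO_MAX_PACING_RATE", 47)

def paced_connection(addr, rate_bytes_per_sec):
    """Open a TCP connection whose sending rate is capped by the kernel."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, SO_MAX_PACING_RATE, rate_bytes_per_sec)
    s.connect(addr)
    return s

# Example: cap each response sender at ~10 MB/s so that 40 parallel replies
# do not instantaneously overrun the bottleneck link's buffer.
# sock = paced_connection(("192.0.2.1", 9000), 10 * 1024 * 1024)
</verbatim>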
Changes at the application level can also reduce incast-induced congestion, for example by de-synchronizing the reply traffic in distributed applications.
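For example, each server can add a small random delay (jitter) before replying, or the client can stagger its requests, so that the responses are spread out in time instead of forming one synchronized burst. A minimal sketch of server-side jitter; the delay bound is an arbitrary illustrative value.

<verbatim>
import random
import time

MAX_JITTER = 0.005   # up to 5 ms; should exceed the burst duration at the
                     # bottleneck but stay small against the query deadline

def send_reply(sock, payload):
    # De-synchronize the senders: each server waits a random amount of time
    # before transmitting, so the replies no longer arrive in one burst.
    time.sleep(random.uniform(0, MAX_JITTER))
    sock.sendall(payload)
</verbatim>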
– %USERSIG{SimonLeinen - 2014-12-31 - 2016-06-25}%