Thursday, August 13, 2009

TCP and Congestion Control, Summarized


TCP has four mechanisms that exist primarily to deal with congestion in the network. As mentioned in class, while routers can help, they really can't solve the problem -- it depends on the proper behavior of nodes at the edges of the network. The four mechanisms are Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery.

Slow Start introduces the Congestion Window, CWND, which reflects the sender's estimate of what the network can support before triggering additional congestion losses. During Slow Start, CWND grows rapidly (roughly doubling every round trip) until the first packet loss. The rapid opening of CWND during the early stages of the TCP connection is motivated by the desire to use the network's bandwidth efficiently, by filling it with useful work quickly (but not too quickly).

Congestion Avoidance is invoked when the connection is in its steady state, gently opening CWND around the point that might trigger an additional packet loss.

In reality both mechanisms are implemented together, using the variable ssthresh (slow start threshold -- the window size that caused loss divided by 2) and considering other events in the dynamics of the connection. For example, if the packet loss caused a timeout before retransmission, CWND is reset to one segment and the connection returns to Slow Start. However, if some data is reaching the other end of the connection, as indicated by duplicate acknowledgments, CWND is reset to ssthresh and grows gently from there.

Fast Retransmit is a mechanism that allows retransmission to take place without waiting for a full timeout. The rule is to wait for three DUP ACKs before triggering a retransmission. Intuitively, this means that while a segment may have been lost, other segments transmitted from the congestion window are making it to the receiver. So it should be safe to retransmit the lost segment without triggering further losses.

Fast Recovery is basically a mechanism to re-enter Congestion Avoidance after a Fast Retransmit. Three DUP ACKs cause a retransmission, cut the current CWND in half, and reset ssthresh to this value. CWND is then opened by one segment for each additional DUP ACK that arrives, inflating the window to account for another segment that has left the network. The next ACK (not a DUP ACK) indicates that the retransmitted segment arrived at the receiver; at that point CWND is reset to ssthresh and Congestion Avoidance continues as before.

During these special-case retransmissions, the sender has to be careful about how it adjusts its round trip estimates, otherwise it may compute a round trip timeout that is too conservative.

If this all sounds complicated, it is! In a sense, the proof is in the pudding, because the congestion collapse events of the mid-1980s have not been witnessed again. That isn't to say that the network never breaks down -- denial of service attacks, uncontrolled surging traffic, etc. have damaged the behavior of the network. It is just that because the vast majority of endpoints play by the rules, the network has scaled to huge numbers of users -- a significant (and miraculous) engineering achievement.
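
As a rough illustration of how the four mechanisms above fit together, here is a minimal Python sketch of the sender's window bookkeeping. It is a toy model under stated assumptions, not how any real TCP stack is written: the class name `RenoWindow`, its method names, the initial ssthresh of 64 segments, and the floor of 2 segments on ssthresh are all choices made for this example, and the window is counted in whole segments rather than bytes. It follows the simplified description in this post (CWND is halved on the third DUP ACK); real implementations differ in details such as the exact window inflation.

```python
from dataclasses import dataclass


@dataclass
class RenoWindow:
    """Toy model of a sender's congestion state, measured in segments."""
    cwnd: float = 1.0        # congestion window (CWND)
    ssthresh: float = 64.0   # slow start threshold; initial value is an assumption
    dup_acks: int = 0        # consecutive duplicate ACKs seen so far
    in_fast_recovery: bool = False

    def on_new_ack(self) -> None:
        """An ACK for new data arrived (not a DUP ACK)."""
        self.dup_acks = 0
        if self.in_fast_recovery:
            # The retransmitted segment got through: deflate the window
            # back to ssthresh and resume congestion avoidance.
            self.cwnd = self.ssthresh
            self.in_fast_recovery = False
        elif self.cwnd < self.ssthresh:
            # Slow start: +1 segment per ACK, which doubles CWND each round trip.
            self.cwnd += 1.0
        else:
            # Congestion avoidance: about +1 segment per round trip.
            self.cwnd += 1.0 / self.cwnd

    def on_dup_ack(self) -> bool:
        """A DUP ACK arrived. Returns True if the caller should retransmit now."""
        self.dup_acks += 1
        if self.in_fast_recovery:
            # Each extra DUP ACK means another segment has left the network.
            self.cwnd += 1.0
            return False
        if self.dup_acks == 3:
            # Fast retransmit: halve the window and remember it in ssthresh.
            # The floor of 2 segments is an assumption of this sketch.
            self.ssthresh = max(self.cwnd / 2.0, 2.0)
            self.cwnd = self.ssthresh
            self.in_fast_recovery = True
            return True
        return False

    def on_timeout(self) -> None:
        """The retransmission timer expired: fall back to slow start."""
        self.ssthresh = max(self.cwnd / 2.0, 2.0)
        self.cwnd = 1.0
        self.dup_acks = 0
        self.in_fast_recovery = False
```

To experiment with it, create a `RenoWindow`, feed it a sequence of `on_new_ack()`, `on_dup_ack()`, and `on_timeout()` calls, and watch how `cwnd` and `ssthresh` move. Keeping all the state in one small object makes it easy to see that Slow Start and Congestion Avoidance are really two branches of the same ACK handler, selected by comparing `cwnd` to `ssthresh`.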
