NERSC Case Study: Stateful Firewall

Background

This case study was presented at Joint Techs 2006 (Albuquerque, NM) by Brian Draney of NERSC. NERSC is the US DoE's scientific computing centre, with ~20 TFlops of processing power and 8.8 PB of storage. It uses a 10GE LAN backbone and connects to ESnet at 10 Gbps.

Symptom

A sluggish transfer of data between two end hosts.

Troubleshooting

For a specific IP flow, the original packets did not appear to be getting through, but all the re-transmits were. In the Xplot below, all the red points represent re-transmitted packets.

Large Send Offload Re-transmit case

The sender's route table showed that the correct PMTU (Path Maximum Transmission Unit) was being used for the destination, but tcpdump showed 64 kB packets leaving an interface capable of only 9 kB.
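
The comparison described above, packet sizes seen in a capture against the path MTU, can be sketched in a few lines of Python. This is only an illustration and not the tooling used at NERSC: the function name oversized_packets, the 9000-byte PMTU and the sample packet lengths are all assumptions made for the example.

```python
from typing import Iterable, List


def oversized_packets(lengths: Iterable[int], pmtu: int) -> List[int]:
    """Return the IP packet lengths (in bytes) that exceed the path MTU."""
    if pmtu <= 0:
        raise ValueError("pmtu must be positive")
    return [n for n in lengths if n > pmtu]


# Hypothetical IP packet lengths as they might be read from a capture:
# two jumbo-sized packets and two far larger than the path can carry.
sample_lengths = [9000, 65160, 8948, 65160]
ASSUMED_PMTU = 9000  # assumed 9000-byte jumbo-frame path MTU

flagged = oversized_packets(sample_lengths, ASSUMED_PMTU)
print(f"{len(flagged)} of {len(sample_lengths)} packets exceed the PMTU")
```

Any length returned by the function is a packet that, if it really left the interface at that size, could not cross the path intact.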

Outcome

It was determined that a Large Send Offload (LSO) NIC was being used, and that it was not honouring the path MTU (because it did not appear to have access to the host's routing table). Over-sized packets were being sent, and these were being dropped. However, the re-transmitted packets were handled by the host's kernel rather than the LSO engine; these did honour the PMTU, and so did get through.
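
The case study does not say which operating system the sender ran or how the offload setting was inspected. Assuming a Linux host, where "ethtool -k <interface>" lists offload features and reports TCP Large Send Offload as "tcp-segmentation-offload", a minimal sketch of checking that output might look like the following. The helper name tso_enabled and the sample output are hypothetical, and the sample text is illustrative rather than captured from a real host.

```python
def tso_enabled(ethtool_output: str) -> bool:
    """Return True if `ethtool -k` output reports TCP segmentation offload as on."""
    for line in ethtool_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "tcp-segmentation-offload":
            # The value may carry a suffix such as "[fixed]";
            # the first word is the on/off state.
            words = value.split()
            return bool(words) and words[0] == "on"
    # Feature not listed at all: treat as not enabled.
    return False


# Illustrative text in the style of `ethtool -k eth0` (not real output).
sample_output = """Features for eth0:
rx-checksumming: on
tcp-segmentation-offload: on
generic-segmentation-offload: off
"""

print("LSO/TSO enabled:", tso_enabled(sample_output))
```

On such a host, switching the feature off with "ethtool -K <interface> tso off" and repeating the transfer would be one way to test whether the offload engine is responsible; the case study does not report whether this was done.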

-- TobyRodwell - 16 Feb 2006
