The following material was presented by the Pine Mountain Group...

Network Slowdowns - Determining Where The Fault Lies.

It's all about layers!

Each protocol layer handles its own tasks and responsibilities apart from the others. For instance, TCP takes data from the Application layer and moves it between client and server independently of the application. By understanding the functions of the layers and how to analyze communication sessions, you can be a productive forensic network analyst!

One really good way to see whether "slowness" is due to "the network" or "the application" is to examine a packet trace captured while the slowness is occurring. Assuming for the moment that the network layer is moving packets correctly, take a look at the TCP layer functions:

Are there retransmitted packets? Who is retransmitting the packets, the client or server?

* If the client is retransmitting packets, then the server is not acknowledging the data from the client and so the client re-sends the data.

* If the server is retransmitting packets, then the client is not acknowledging the data from the server and so the server re-sends the data.
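The check above can be sketched in a few lines of Python. This is only an illustration, not how any particular analyzer works: the Segment record, its field names, and the sample trace are all hypothetical, and the heuristic (the same sender sending the same sequence number and length twice) is a simplification of what a real analyzer does.

```python
from collections import Counter
from typing import NamedTuple

class Segment(NamedTuple):
    # Hypothetical, simplified view of one TCP segment from a trace.
    src: str      # sender label, e.g. "client" or "server"
    dst: str      # receiver label
    seq: int      # sequence number of the first payload byte
    length: int   # payload length in bytes

def count_retransmissions(segments):
    """Count, per sender, data segments that repeat an earlier (seq, length)."""
    seen = set()
    retransmits = Counter()
    for seg in segments:
        if seg.length == 0:
            continue  # pure ACKs carry no data, so they are not counted
        key = (seg.src, seg.dst, seg.seq, seg.length)
        if key in seen:
            retransmits[seg.src] += 1  # same bytes sent again by this sender
        else:
            seen.add(key)
    return retransmits

# Made-up trace: the client sends the segment at seq 1500 twice.
trace = [
    Segment("client", "server", 1000, 500),
    Segment("server", "client", 9000, 0),
    Segment("client", "server", 1500, 500),
    Segment("client", "server", 1500, 500),
    Segment("server", "client", 9000, 1460),
]
print(dict(count_retransmissions(trace)))
```

In this made-up trace only the client shows a retransmission, which by the reasoning above would mean the server did not acknowledge the client's data in time.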

Once retransmitted packets are found and their direction is known, the next step is to determine which layer is at fault for the retransmission.

Again, assuming the network layer is fine, we are trying to determine whether the problem lies in the TCP stack, in the entire computer, or in only one application on the computer.

* If the entire TCP stack, and thereby every application on the system, is experiencing a slowdown, then it points to a "platform" problem, meaning all systems with the same operating software are affected.

* If only one application is experiencing problems, then it is specific to that application. We can watch all traffic for an application by filtering on that application's TCP port number. That way we see only that application's traffic and can analyze all users' traffic for it. If other general applications (like FTP) run acceptably fast, you can further isolate the problem to the specific application: the platform and the FTP application are both fast, so only the one application is slow.

* If only one user is experiencing problems, then it has something to do with the process that specific user is performing. In this case, we filter on the application's port number on the server and the user's unique port number. Keep in mind that the server's port number for an application stays the same, while the user's port number changes with each session. So to build this filter, start by filtering on the server and application port, log in, and then tighten the filter to include the user's port number for that session.

* If you don't know the port number of an application, you can find it by examining all the traffic going to or from a server and looking at the data and transaction types with your trusty analyzer.
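The two filters described above (all traffic for one application, then one user's session) can be sketched as follows. The Packet record, the addresses, and the port numbers are hypothetical values chosen for illustration; a real analyzer would apply the same logic through its own filter syntax.

```python
from typing import NamedTuple

class Packet(NamedTuple):
    # Hypothetical, simplified view of one TCP packet from a trace.
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

def matches(pkt, server_ip, server_port, client_port=None):
    """True if pkt belongs to the application at server_ip:server_port.

    If client_port is given, also require that one user's session port.
    """
    if (pkt.dst_ip, pkt.dst_port) == (server_ip, server_port):
        peer_port = pkt.src_port   # client -> server direction
    elif (pkt.src_ip, pkt.src_port) == (server_ip, server_port):
        peer_port = pkt.dst_port   # server -> client direction
    else:
        return False               # some other application or host
    return client_port is None or peer_port == client_port

SERVER_IP, APP_PORT = "10.0.0.5", 1433   # assumed server and application port

trace = [
    Packet("10.0.0.21", 50112, "10.0.0.5", 1433),  # user A -> application
    Packet("10.0.0.5", 1433, "10.0.0.21", 50112),  # application -> user A
    Packet("10.0.0.22", 50340, "10.0.0.5", 1433),  # user B -> application
    Packet("10.0.0.21", 50113, "10.0.0.5", 21),    # user A -> FTP control port
]

# Step 1: every user's traffic for the application.
app_traffic = [p for p in trace if matches(p, SERVER_IP, APP_PORT)]
# Step 2: tighten to one user's session once its client port is known.
user_traffic = [p for p in trace
                if matches(p, SERVER_IP, APP_PORT, client_port=50112)]
print(len(app_traffic), len(user_traffic))  # prints: 3 2
```

Matching on the client port alone assumes no other client happens to use the same port number at the same time; adding the client's IP address to the test would remove that assumption.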

OK, so how do you know if it is an Application or TCP layer problem?

* Take a look at the TCP Window size advertisements between the client and server. They are usually about 8760 bytes or more. If you see the Window stay below its highest point for more than a second, it may point to a slow application layer pulling data out of TCP's buffer. If the application is very slow, the TCP Window may go to zero, preventing additional data from being sent.

* The Window size dictates how much unacknowledged data can be sent, and if it goes to zero, then the recipient of the zero window cannot send any more data until the application catches up. If this is what you see, then the application is constrained and the slowdown is due to the application layer. Since there are many possible reasons for this, we will talk about it in a future tip.
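As a rough sketch of the Window check described above, the following Python scans the window advertisements from one host and reports zero-window moments and any stretch where the window stays below its peak for more than a second. The WindowSample record and the sample values are hypothetical, and the one-second threshold is simply the rule of thumb from the text.

```python
from typing import NamedTuple

class WindowSample(NamedTuple):
    # Hypothetical record: one window advertisement from a single host.
    time: float   # seconds since the start of the capture
    window: int   # advertised TCP receive window in bytes

def analyze_window(samples, min_duration=1.0):
    """Return (zero_times, low_periods) for a non-empty, time-ordered list.

    zero_times:  times at which a zero window was advertised.
    low_periods: (start, end) spans where the window stayed below its
                 highest observed value for longer than min_duration seconds.
    """
    peak = max(s.window for s in samples)
    zero_times = [s.time for s in samples if s.window == 0]
    low_periods = []
    start = None
    for s in samples:
        if s.window < peak:
            if start is None:
                start = s.time          # window just dropped below the peak
        elif start is not None:
            if s.time - start > min_duration:
                low_periods.append((start, s.time))
            start = None                # window recovered to the peak
    # The window may still be low when the capture ends.
    if start is not None and samples[-1].time - start > min_duration:
        low_periods.append((start, samples[-1].time))
    return zero_times, low_periods

# Made-up advertisements: the window shrinks, hits zero, then recovers.
samples = [
    WindowSample(0.0, 8760), WindowSample(0.2, 5840), WindowSample(0.6, 2920),
    WindowSample(1.0, 0),    WindowSample(1.8, 0),    WindowSample(2.4, 8760),
    WindowSample(2.5, 7300), WindowSample(2.7, 8760),
]
zero_times, low_periods = analyze_window(samples)
print(zero_times, low_periods)
```

Here the window is reported as zero at 1.0 and 1.8 seconds and as depressed from 0.2 to 2.4 seconds, the pattern the text associates with a constrained application. The brief dip at 2.5 seconds is ignored because it lasts less than a second.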

Here are a few things we've seen as the cause of some "slow network" problems:

[These assume the Network layer and below are operating correctly.]

* Packet retransmission due to the server being too fast for the client - packets reach the client, but the client cannot receive them at such a high packet-per-second rate.

* Packet retransmission due to the server being overcome by too many requests from too many clients.

* Applications needing more CPU slices due to their load.

* Complex SQL queries taking so long to complete that the querying client and others are significantly delayed.

* Too many simultaneous client sessions at one time, delaying response time, regardless of the amount of data.

* File lock gridlock slowing access to shared files.

* Temporary files filling disk space faster than the operating system can purge old files to make room for new ones.

Once a computer or network professional understands protocol layers, everything else is easy. Many technologists configure and install things without a clear understanding of the OSI Model. Yes, you've seen it, heard it, but does it live in your head and do you really understand it intuitively when troubleshooting problems?

Get trained in Network Forensics and Standards-Based Network Analysis! Become a Certified NetAnalyst(tm) <http://www.pmg.com/splash_cna.htm>.

Bill Alderson

Executive NetAnalyst

Pine Mountain Group, Inc.

 

Often, valuable information comes from other analysts, training courses and first-hand experience. Are you interested in sharing a tip with your peers? Respond with your topic to reply@pmg.com.
