Introduction to Profiling

Profiling an application means investigating its runtime performance by collecting metrics while it executes. One of the most popular metrics is the method call count: the number of times each function (method) of the program was called during a run. Another useful metric is method clock time: the elapsed time spent in each method of the program. You can also measure CPU (Central Processing Unit) time, which directly reflects the work done on behalf of the method by any of the computer's processors. Unlike clock time, CPU time does not include I/O, sleep, context switch, or wait time.
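As a rough illustration of these three metrics, the following sketch collects a call count, clock time, and CPU time for a piece of code. The class MethodMetrics and its measure method are hypothetical names invented for this example. It assumes the JVM supports and has enabled per-thread CPU time measurement through the standard ThreadMXBean.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper for illustration; not part of any profiler API.
final class MethodMetrics {
    long callCount;   // number of times the measured code ran
    long clockNanos;  // elapsed (clock) time, including sleep and waits
    long cpuNanos;    // CPU time consumed by the calling thread only

    private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();

    void measure(Runnable body) {
        long clockStart = System.nanoTime();
        // Assumption: CPU time measurement is supported and enabled.
        // The call returns -1 when it is disabled.
        long cpuStart = THREADS.getCurrentThreadCpuTime();
        try {
            body.run();
        } finally {
            cpuNanos += THREADS.getCurrentThreadCpuTime() - cpuStart;
            clockNanos += System.nanoTime() - clockStart;
            callCount++;
        }
    }
}
```

If the measured code mostly sleeps or waits for I/O, clockNanos grows with every call while cpuNanos should stay comparatively small.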

For more in-depth analysis of program performance, it is very useful to analyze a call graph. A call graph captures the "call" relationships between methods. The nodes of the call graph represent the program's methods, while the directed arcs represent calls made from one method to another. In a call graph, the call counts or the timing data are collected per arc rather than per method.
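One possible in-memory representation of such a call graph is sketched below. The class CallGraph, its Arc type, and the method names used in the usage note are assumptions made for this example, not the data model of any particular profiler. Nodes are identified by method name, and each arc carries its own call count and time.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative call graph: nodes are method names, arcs carry the metrics.
final class CallGraph {
    // Metrics attached to one directed arc caller -> callee.
    static final class Arc {
        long calls;
        long nanos;
    }

    // caller -> (callee -> arc data)
    private final Map<String, Map<String, Arc>> arcs = new LinkedHashMap<>();

    // Add call count and time to the arc caller -> callee, creating it if needed.
    void record(String caller, String callee, long calls, long nanos) {
        Arc arc = arcs.computeIfAbsent(caller, k -> new LinkedHashMap<>())
                      .computeIfAbsent(callee, k -> new Arc());
        arc.calls += calls;
        arc.nanos += nanos;
    }

    // A method's total call count is the sum over all arcs that point at it.
    long callCount(String method) {
        long total = 0;
        for (Map<String, Arc> outgoing : arcs.values()) {
            Arc arc = outgoing.get(method);
            if (arc != null) {
                total += arc.calls;
            }
        }
        return total;
    }
}
```

For example, after recording two calls on the arc main -> render and three calls on the arc parse -> render, callCount("render") returns 5. The per-method count can always be derived from the arcs, but not the other way around.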

Generally, a metric is a mapping that associates numerical values with static or dynamic program elements such as functions, variables, classes, objects, types, or threads. The numerical values may represent various resources used by the program.

Overhead and Intrusion

A major side effect of profiling is that the profiling itself consumes memory and CPU time. This introduces two problems. One is overhead: you'll notice that profiling runs take longer than normal runs, in some cases substantially longer. The other problem is intrusion. When the metrics collection uses the same resources that you want to measure, the numbers you get describe not only the application, but the application plus whatever you use to collect the metrics.

There are two basic techniques for profiling: tracing and sampling.

Tracing

Java VMs use tracing with reduction. Briefly, here's how it works. The profile data is collected whenever the application makes a function call. The names of the calling method and the called method (sometimes called the "callee") are recorded along with the time spent in the call. The data is accumulated, so consecutive calls from the same caller to the same callee increase the recorded time value instead of producing separate records. The number of calls is also recorded.

Tracing requires frequent reading of the current time (or of whatever other resource is being measured), and can introduce large overhead. It produces accurate call counts and an accurate call graph, but the timing data can be substantially distorted by the additional overhead.
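The accumulation step described in this section can be sketched as follows. This is a simplified, single-threaded illustration, not the mechanism of any actual Java VM. The class ReducingTracer and its enter and exit hooks are hypothetical, and the sketch assumes an instrumenting tool would call them at every method entry and exit.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Sketch of tracing with reduction for one thread. The enter/exit hooks are
// hypothetical; an instrumenting tool is assumed to call them.
final class ReducingTracer {
    // Accumulated data for one caller -> callee arc.
    static final class ArcData {
        long calls;
        long nanos;
    }

    private static final class Frame {
        final String method;
        final long startNanos;

        Frame(String method, long startNanos) {
            this.method = method;
            this.startNanos = startNanos;
        }
    }

    private final Deque<Frame> stack = new ArrayDeque<>();
    private final Map<String, ArcData> arcs = new HashMap<>();

    // Called on method entry: remember who is running and when it started.
    void enter(String method) {
        stack.push(new Frame(method, System.nanoTime()));
    }

    // Called on method exit: fold this call into the caller -> callee record.
    void exit() {
        long now = System.nanoTime();
        Frame callee = stack.pop();
        String caller = stack.isEmpty() ? "<root>" : stack.peek().method;
        ArcData data = arcs.computeIfAbsent(caller + " -> " + callee.method,
                                            k -> new ArcData());
        data.calls++;                           // the number of calls is recorded
        data.nanos += now - callee.startNanos;  // the time is accumulated, not listed
    }

    // Returns the accumulated data for an arc, or null if it was never seen.
    ArcData arc(String caller, String callee) {
        return arcs.get(caller + " -> " + callee);
    }
}
```

Every call costs two clock reads plus a map update, which illustrates why the overhead of tracing grows with the number of calls and why short, frequently called methods distort the timing data the most.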

Sampling

In sampling, the program runs at its own pace, but from time to time the profiler checks the application state more closely. The profiler checks the state by temporarily interrupting the program's progress and determining which method is executing. The sampling interval is the elapsed time between two consecutive status checks. Sampling uses "wall clock time" as the basis for the sampling interval, but only collects data for threads that are currently scheduled on a CPU. The methods that consume more CPU time are therefore detected more frequently. With a large number of samples, the CPU time of each function can be estimated quite well.
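The idea can be approximated in plain Java as shown below. This is a minimal sketch under stated assumptions, not how a production profiler is built. The class StackSampler is hypothetical, and Thread.State.RUNNABLE is used as a stand-in for "scheduled on a CPU", which is only an approximation because a RUNNABLE thread may also be waiting in native I/O. The state and the stack are also read at slightly different moments.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sampler: counts how often each method is seen on top of the stack.
final class StackSampler {
    private final Map<String, Integer> hits = new HashMap<>();
    private int total;

    // Record one observation of a thread's state and stack.
    void record(Thread.State state, StackTraceElement[] stack) {
        // Skip threads that are sleeping, waiting, or blocked, and empty stacks.
        if (state != Thread.State.RUNNABLE || stack.length == 0) {
            return;
        }
        StackTraceElement top = stack[0];  // the method executing right now
        hits.merge(top.getClassName() + "." + top.getMethodName(), 1, Integer::sum);
        total++;
    }

    // Sample the target at a fixed wall clock interval until it terminates.
    void sampleWhileAlive(Thread target, long intervalMillis) throws InterruptedException {
        while (target.isAlive()) {
            record(target.getState(), target.getStackTrace());
            Thread.sleep(intervalMillis);
        }
    }

    // Estimated fraction of the sampled CPU time spent in the given method.
    double share(String method) {
        return total == 0 ? 0.0 : hits.getOrDefault(method, 0) / (double) total;
    }

    int totalSamples() {
        return total;
    }
}
```

Multiplying share(method) by the CPU time of the run gives an estimate of that method's CPU time. The estimate improves as the number of samples grows, which is why short runs produce unreliable figures.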

Sampling is a complementary technique to tracing. It has relatively low overhead and produces fairly accurate timing data (at least for long-running applications), but it cannot produce call counts. Also, the call graph is only partial: usually a number of less significant arcs and nodes are missing.
