Posts Tagged ‘monitoring’

Some thoughts on stress testing web applications with JMeter (part 2)

March 30, 2010

In this second part on testing web applications with JMeter, I will mainly write about running the test plans, recording the results and interpreting them.

When do I stop?

One of the main questions you have to ask yourself when you start stress testing a web application is: when do I stop? The question is not as easy as it seems: the answer depends on your initial objectives and on “scientific” criteria that let you decide when those objectives have been met. In the end, it comes down to measuring and interpreting the “results” of your stress tests.

Before going any further, we should spend some time on the measurable outcomes of a stress test. There are two main measures that you can record when you run a stress test on a web application:

  • The throughput: the number of requests per unit of time (seconds, minutes, hours) that are sent to your server during the test.
  • The response time: the elapsed time from the moment a given request is sent to the server until the moment the last bit of information has returned to the client.

The throughput is the real load processed by your server during a run but it does not tell you anything about the performance of your server during this same run. This is the reason why you need both measures in order to get a real idea about your server’s performance during a run. The response time tells you how fast your server is handling a given load.

We are now much closer to finding an answer to our initial question: you can stop stress testing your application when, for a measured throughput, the measured response time is “too high”. This is the right answer in an ideal world where information systems behave in a deterministic manner … another way to answer our question could also be: you can stop stress testing your application when your system crashes / collapses / starts to behave unexpectedly … 🙂

However, I will stick to our first answer for a while as it contains another interesting question: what is a “high” response time for a web application (or any application or information system used by real people)? A very interesting answer is given in the article already mentioned in my previous post and in this one as well. In short, usability studies make it possible to define response time limits at which the user’s interaction with an information system radically changes. These limits are closely related to human nature: psychology as well as brain performance 🙂

  • 0.1 second is about the limit for having the user feel that the system is reacting instantaneously, meaning that no special feedback is necessary except to display the result.
  • 1.0 second is about the limit for the user’s flow of thought to stay uninterrupted, even though the user will notice the delay. Normally, no special feedback is necessary during delays of more than 0.1 but less than 1.0 second, but the user does lose the feeling of operating directly on the data.
  • 10 seconds is about the limit for keeping the user’s attention focused on the dialogue. For longer delays, users will want to perform other tasks while waiting for the computer to finish, so they should be given feedback indicating when the computer expects to be done. Feedback during the delay is especially important if the response time is likely to be highly variable, since users will then not know what to expect.

Using these limits allows us to give a precise end point to the stress tests of a system: they help us define, together with our client (or users), what an acceptable response time is. For example, the last time I ran stress tests for a client, we agreed that the acceptable upper limit of the response times for his system was 7 seconds: he wanted to know how many concurrent users his system could handle within that limit.

The remaining problem now is how to measure / estimate the throughput and response times of our system using JMeter: some simple statistics and mathematics are needed here.

Run your test plan and record the meaningful measures …

First of all, JMeter provides several different “listeners” that record these 2 variables in various ways (graphics, tables, trees, files). I would say that most of these “listeners” are dispensable or, to put it differently, that one of them is a must-have in order to have all the necessary information at hand: the Summary Report.

In order to understand this report and to implement scenarios efficiently we must keep the following things in mind:

  • JMeter records response times and throughput for each “sampler” of each “thread group” defined in your test plan.
  • In the Summary Report, one line is displayed for each distinct “sampler” name: you can group or differentiate samplers in the report just by playing with their names.
  • Each “sampler” is executed many times: the Summary Report provides us with mean values (and standard deviations) for the throughput and response times of each named “sampler”.
  • Global values (mean and standard deviation) for throughput and response times are also calculated in the Summary Report.
  • The Summary Report allows you to store the measures of each run in a “csv” file: you can thus analyse and interpret the results in a spreadsheet program.

Other reports are also useful, particularly at the beginning when building and testing your scenarios:

  • The View Results Tree is very handy when “debugging” a scenario as it lets you inspect all the HTTP Requests and Responses exchanged with the server. The drawback is that it consumes too much memory to be used in a large stress test.
  • The View Results in Table listener is also useful in the early stages of the stress test implementation as it gives a good and fast overview of the execution of a test plan. However, this listener also consumes too much memory to be used in a large stress test.
  • I have also found some very interesting JMeter plugins on a Google Code project. One of them, the “Active Threads Over Time” listener, helped me a lot when trying to set the throughput during ramp up by playing with the “ramp up” and “number of threads” parameters of the thread group.

One more element that you should keep in mind when performing stress tests is the performance bottleneck of the computer running the tests themselves:

  • It is very common when running stress tests on large production systems to reach the limits of the computer running the tests before reaching the limits of the tested server.
  • When the computer running the tests reaches its limits (memory, number of threads, cpu …), all the measures recorded by the stress testing tool are wrong or at least biased.
  • There are two ways to tackle this problem: (1) optimize your scenarios and the way you run them, and (2) set up a distributed infrastructure.

(1) In the JMeter manual, you will find the following advice in section 16.6 of the Best Practices page:

Some suggestions on reducing resource usage.

  • Use non-GUI mode: jmeter -n -t test.jmx -l test.jtl
  • Use as few Listeners as possible; if using the -l flag as above they can all be deleted or disabled.
  • Rather than using lots of similar samplers, use the same sampler in a loop, and use variables (CSV Data Set) to vary the sample.
  • Don’t use functional mode
  • Use CSV output rather than XML
  • Only save the data that you need
  • Use as few Assertions as possible

If your test needs large amounts of data – particularly if it needs to be randomised – create the test data in a file that can be read with CSV Dataset. This avoids wasting resources at run-time.

(2) In the JMeter manual, you will find the Remote Testing page giving you the precise instructions needed to set up a distributed testing environment, and a PDF describing how it all works architecture-wise. My experience is that it is all very easy to set up and that it gives excellent results: in the end, it comes down to running the “jmeter-server” scripts on the slaves and configuring the existing hosts in the master’s configuration file. The only 2 or 3 little problems I came across with distributed testing are:

  • Do not forget to give memory to your jmeter slaves and master (set Xms and Xmx in the file): the default values are very low.
  • If you use external resources such as a CSV Data Set, you should have them on all your slave installations at the same location (a full path is needed in your scenario).
  • Beware of multiple thread groups and schedulers: they leak huge amounts of memory on the slaves.

Last but not least, you should never perform your stress tests against a server or infrastructure that was just started. Servers usually need a warm-up before they reach their full speed: this is particularly true for the Java platform where you surely don’t want to measure class loading time, JSP compilation time or native compilation time.

Interpret the results …

In order to interpret the results of a stress test, it is important to understand some basic elements of Statistics:

(1) The mean value (μ)

The following equation shows how the mean value (μ) is calculated:

μ = 1/n * Σi=1…n xi

The mean value of a given measure is what is commonly referred to as the average value of this measure. An important thing to understand is that the mean value can be very misleading as it does not show you how close (or far) your values are from the average. An example is always better than a long explanation.

Let’s assume that we are measuring response times in milliseconds in 2 different stress tests:

Stress Test 1:

  • x1=100
  • x2=110
  • x3=90
  • x4=900
  • x5=890
  • x6=910

gives you μ = 1/6 * (100 + 110 + 90 + 900 + 890 + 910) = 500 ms

Stress Test 2:

  • x1=490
  • x2=510
  • x3=535
  • x4=465
  • x5=590
  • x6=410

gives you μ = 1/6 * (490 + 510 + 535 + 465 + 590 + 410) = 500 ms

In both cases the mean value (μ) is the same. However, if you look closely at the values taken by the response times, you will see that in the first case the values are “far” from the mean value whereas in the second case they are “close” to it. It is quite obvious from this example that a measure of this distance to the mean value is needed in order to draw any kind of conclusion based on the mean value.
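The comparison above can be reproduced with a few lines of Java. This is only a minimal sketch: the class and method names are made up for the illustration, and the minimum and maximum are printed simply to make the different spread of the two series visible.

```java
import java.util.Arrays;

public class MeanExample {
    // Arithmetic mean: the sum of the values divided by their count.
    static double mean(double[] values) {
        return Arrays.stream(values).sum() / values.length;
    }

    static void describe(String name, double[] values) {
        double min = Arrays.stream(values).min().getAsDouble();
        double max = Arrays.stream(values).max().getAsDouble();
        System.out.println(name + ": mean=" + mean(values) + " ms, min=" + min + " ms, max=" + max + " ms");
    }

    public static void main(String[] args) {
        // Response times (in ms) of the two theoretical stress tests above.
        describe("Stress Test 1", new double[] {100, 110, 90, 900, 890, 910});
        describe("Stress Test 2", new double[] {490, 510, 535, 465, 590, 410});
    }
}
```

Both series share the same mean, while the minimum and maximum show how differently the values are spread around it.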

(2) The standard deviation (σ)

The following equation shows how the standard deviation (σ) is calculated:

σ = sqrt( 1/n * Σi=1…n (xi-μ)^2 )

The standard deviation (σ) measures the typical distance of the values from their average (μ): it is the square root of the mean squared distance to the mean. In other words, it gives us a good idea of the dispersion or variability of the measures around their mean value. Let’s go back to our example and calculate the standard deviation of each of our theoretical stress tests:

Stress Test 1:

σ = sqrt( 1/6 * ( (100-500)^2 + (110-500)^2 + (90-500)^2 + (900-500)^2 + (890-500)^2 + (910-500)^2 ) ) = sqrt(960400 / 6) ≈ 400 ms

Stress Test 2:

σ = sqrt( 1/6 * ( (490-500)^2 + (510-500)^2 + (535-500)^2 + (465-500)^2 + (590-500)^2 + (410-500)^2 ) ) = sqrt(18850 / 6) ≈ 56 ms

The 2 values of the standard deviation calculated above are very different:

  • in the first case, the standard deviation is high compared to the mean value, which shows us that our measures are very variable (or mostly far from the mean value) and that the mean value is not very significant.
  • in the second case, the standard deviation is low compared to the mean value, which shows us that our measures are not dispersed (or mostly close to the mean value) and that the mean value is significant.

(3) The sampling size and the quality of the measure

Another interesting question is whether our calculated mean value is a good estimation of the “real” mean value. In other words, when calculating the mean value of the response time during a test case, do we have a good estimation of the “real” mean response time of the same scenario repeated indefinitely? In probability theory, the Central Limit Theorem states conditions under which the mean of a sufficiently large number of independent random variables, each with finite mean and variance, will be approximately normally distributed.

The measures of response times and throughput obtained during stress tests comply with the conditions of the Central Limit Theorem, as we usually have a large number of independent and random measures with a finite mean value and standard deviation (both calculated by JMeter). We can thus assume that the mean values of the response time and the throughput are approximately normally distributed.

This allows us to calculate a Confidence Interval for these mean values. The Confidence Interval gives us a measure of the quality of our mean values as it allows us to calculate the variability of our mean value (an interval) with a predefined probability. You can for example decide to calculate your Confidence Interval at 95%, which will tell you that the probability to have the “real” mean value within the calculated interval is 95%. Conversely, you can decide to calculate the probability to have your mean value within a given interval (see the examples below).

The following equation shows how the Confidence Interval (CI) is calculated:

CI = [μ - Z*σ/√n, μ + Z*σ/√n]

where:

  • μ is the calculated mean value of our sample,
  • σ is the calculated standard deviation of our sample,
  • n is the number of measures in our sample,
  • and Z is the value for which the area under the “bell shaped curve” of the standard normal distribution between -Z and +Z is equal to the chosen Confidence C; equivalently, the area between 0 and Z is half of C (anyone who can explain this better is welcome).

The following table gives values of Z for various given values of Confidence C:

C        Z
0.80     1.281551565545
0.90     1.644853626951
0.95     1.959963984540
0.98     2.326347874041
0.99     2.575829303549
0.995    2.807033768344
0.998    3.090232306168
0.999    3.290526731492
0.9999   3.890591886413
0.99999  4.417173413469


If we go back to our previous examples, we can calculate the confidence intervals of our mean values at 95%:

CI1 = [500 - 1.96*400/sqrt(6); 500 + 1.96*400/sqrt(6)] ≈ [180; 820]

CI2 = [500 - 1.96*56/sqrt(6); 500 + 1.96*56/sqrt(6)] ≈ [455; 545]

This means that we can be 95% confident that the “real” mean response time lies in the calculated confidence interval.

We can also calculate the probability to have the mean value in the interval [490, 510]:

10 = Z1 * 400 / sqrt(6) => Z1 = 10 * sqrt(6) / 400 => Z1 ≈ 0.06 => C1 ≈ 5%

10 = Z2 * 56 / sqrt(6) => Z2 = 10 * sqrt(6) / 56 => Z2 ≈ 0.44 => C2 ≈ 34%


These are just given as examples of how to calculate the confidence interval … the conditions are not met for the Central Limit Theorem with such a small sample.

The last 2 examples were made using the following Standard Normal Distribution Tables.
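For readers who prefer code to tables, here is a minimal Java sketch of the same calculations for the two sample series. It assumes the usual population definition of the standard deviation, σ = sqrt(1/n * Σ (xi-μ)^2), and takes Z for a 95% confidence from the table above. The class and method names are invented for the illustration.

```java
public class ConfidenceIntervalExample {
    static double mean(double[] x) {
        double sum = 0;
        for (double v : x) sum += v;
        return sum / x.length;
    }

    // Population standard deviation: square root of the mean squared distance to the mean.
    static double stdDev(double[] x) {
        double mu = mean(x);
        double sumSq = 0;
        for (double v : x) sumSq += (v - mu) * (v - mu);
        return Math.sqrt(sumSq / x.length);
    }

    // Confidence interval [mu - z*sigma/sqrt(n), mu + z*sigma/sqrt(n)].
    static double[] confidenceInterval(double[] x, double z) {
        double mu = mean(x);
        double halfWidth = z * stdDev(x) / Math.sqrt(x.length);
        return new double[] {mu - halfWidth, mu + halfWidth};
    }

    public static void main(String[] args) {
        double z95 = 1.959963984540; // Z for C = 0.95
        double[][] tests = {
            {100, 110, 90, 900, 890, 910}, // Stress Test 1
            {490, 510, 535, 465, 590, 410} // Stress Test 2
        };
        for (double[] t : tests) {
            double[] ci = confidenceInterval(t, z95);
            System.out.printf("mean=%.1f ms, stddev=%.1f ms, CI95=[%.1f; %.1f]%n",
                    mean(t), stdDev(t), ci[0], ci[1]);
        }
    }
}
```

As noted above, six values are far too few for the Central Limit Theorem to apply, so the output is only meant to show the mechanics.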


As a conclusion, we can say that the best way to interpret our stress test results is to use the Summary Report provided by JMeter and to store it in a “csv” file for every run. In this report we can find the mean response time, the mean throughput, the standard deviation of the response time and the standard deviation of the throughput, for every named sampler and globally for the run.

Based on the explanations above, I recommend the following methodology:

  • If we have a high number of samples (which is usually the case in stress tests) and a low standard deviation, then we can conclude with little risk that we have a good estimation of the mean value of both the response time and the throughput of our system, and that the “real” values will be close to the calculated mean values.
  • If we have a high number of samples (which is usually the case in stress tests) and a high standard deviation, we probably have a good estimation of the mean value but should nevertheless consider estimating a confidence interval. In any case, if the variability of the measure is high, a technical investigation is needed, as variability of response times and throughput is usually related to instability of the tested system.
  • If we have a low number of samples and a high standard deviation, then we almost certainly have a very bad estimation of the mean value, which means that we are measuring the wrong thing, the wrong way.
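If you prefer to post-process the recorded results yourself rather than in a spreadsheet, the methodology above can be sketched in Java as follows. This is a rough sketch under several assumptions: the results file is a CSV with a header line containing columns named “elapsed” (response time in ms) and “label” (sampler name), no field contains a quoted comma, and the class name and the sample lines in main are made up for the example.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class JtlSummary {
    /** Running totals of the elapsed times (in ms) recorded for one sampler label. */
    static final class Stats {
        long n;
        double sum;
        double sumSq;

        void add(double elapsed) { n++; sum += elapsed; sumSq += elapsed * elapsed; }
        double mean() { return sum / n; }
        // Population standard deviation, computed as sqrt(E[x^2] - mean^2).
        double stdDev() { return Math.sqrt(Math.max(0.0, sumSq / n - mean() * mean())); }
        // Half-width of the 95% confidence interval of the mean.
        double ci95HalfWidth() { return 1.96 * stdDev() / Math.sqrt(n); }
    }

    /** Groups the samples by label; expects a header line naming the columns. */
    static Map<String, Stats> summarize(List<String> csvLines) {
        List<String> header = Arrays.asList(csvLines.get(0).split(","));
        int elapsedCol = header.indexOf("elapsed"); // assumed column name
        int labelCol = header.indexOf("label");     // assumed column name
        Map<String, Stats> byLabel = new TreeMap<>();
        for (String line : csvLines.subList(1, csvLines.size())) {
            if (line.trim().isEmpty()) continue;
            String[] fields = line.split(","); // naive split: assumes no quoted commas
            byLabel.computeIfAbsent(fields[labelCol], k -> new Stats())
                   .add(Double.parseDouble(fields[elapsedCol]));
        }
        return byLabel;
    }

    public static void main(String[] args) throws Exception {
        // Reads the file given as first argument, or falls back to a tiny made-up sample.
        List<String> lines = args.length > 0
                ? Files.readAllLines(Paths.get(args[0]))
                : Arrays.asList("timeStamp,elapsed,label", "1,100,home", "2,200,home", "3,50,login");
        summarize(lines).forEach((label, s) -> System.out.printf(
                "%s: n=%d mean=%.1f ms stddev=%.1f ms CI95=+/-%.1f ms%n",
                label, s.n, s.mean(), s.stdDev(), s.ci95HalfWidth()));
    }
}
```

A sampler with many samples and a small CI95 half-width falls into the first case above; a large half-width despite many samples is the signal to investigate the system.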

Monitor your systems while you run the tests …

It is often useful to monitor the system (and its various components) while you are stressing it. Various tools may be used that vary from one platform to another. On the Java platform you may use the excellent “jvisualvm” provided with the latest versions of the JDK and interacting with the various monitoring hooks integrated in the JVM.

Monitoring Java Web Applications is a subject in itself … I can try to share my thoughts on it some time … in another post 😉

Some thoughts on stress testing web applications with JMeter (Part 1)

March 17, 2010

A small intro …

Now that I am almost finished with the “stress test” task I was talking about in my previous post, I have several thoughts and experiences to share on the subject. I am also planning to write about Java web application profiling in a following post as it relates to the results of a “stress test” task.

The tool I have used to carry out stress test tasks is JMeter (the latest version available at the time of this writing), so I will write about JMeter. However, I am interested in any feedback (experience) concerning other tools (or JMeter).

State clearly your objectives …

It is important that you state your objectives clearly as the overall methodology of the stress tests will greatly depend on these objectives.

Some classical examples follow:

  • Give a precise estimate of the maximum load that a given system may serve (peak): this is usually done in order to help plan the future infrastructure of a live system.
  • Find precisely the bottlenecks of a live system during a peak: this is usually done as a preliminary task to profiling and performance tuning tasks.
  • Find precisely the origin of possible leaks (memory, connections to resources, various resources) during a long run: this is also usually done as a preliminary task to profiling and tuning tasks.
  • Prove that the system you have implemented can hold a theoretical load: usually this was a client’s requirement expressed during the very early stages of a project (for example in the call for tender)
  • Any combination of the aforementioned objectives …

These different objectives lead to different types of scenarios. In my opinion, a good methodology is always to try to implement scenarios that are as close as possible to real and typical use cases of the system you want to test. However, in some cases (bullets 2 and 3 above) you may need to write artificial scenarios that will help you identify precisely a functionality of your system that has performance problems.

The following paragraph is about writing “real case” scenarios and test plans covering the aforementioned objectives.

Write good quality scenarios and test plans …

First, a distinction must be made between “scenarios” on one hand and “test plans” on the other:

A scenario is (or at least should be) an actual use case of your application carried out by a single user. In JMeter terms, a scenario is a combination of “samplers” and “controllers” that will be executed by a single “thread” of a “thread group”.

A test plan is the “way” a given scenario will be executed in order to achieve a given objective (such as the ones described in the previous paragraph). In JMeter terms, the “way” the scenario will be executed mainly means playing with the following variables on the thread group: the number of threads, the ramp up time and the number of loops executed by a thread.

It is very important to understand the exact meaning of these 3 parameters:

  • The “number of threads” in a thread group is the actual number of threads spawned by JMeter, each one of them used to execute the scenario. In other words, this variable is the number of users executing a “real life” use case on your system. This number is not the number of concurrent / parallel users executing a “real life” use case on your system: the concurrency of the users depends on both the duration of your scenario and the ramp up time configured on the thread group.
  • The “ramp up time” in a thread group is the actual time taken by JMeter to spawn all the threads. If the ramp up time is small compared to the number of threads and the mean duration of a scenario then the number of concurrent threads accessing your system will be high and vice versa. A rough estimation of the throughput (number of requests per second) during the ramp up period of your test plan is: number of threads / ramp up time (in seconds).
  • The “number of loops” in a thread group is the actual number of times that the scenario will be executed by each thread.
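The arithmetic behind these three parameters can be sketched in a few lines of Java. The start rate is the “number of threads / ramp up time” estimate given above. The concurrency estimate (start rate multiplied by the scenario duration, capped by the number of threads, for a single loop) is an additional simplifying assumption, not something JMeter reports, and all names are invented for the example.

```java
public class RampUpEstimate {
    // Threads started per second during the ramp-up period.
    static double threadsStartedPerSecond(int threads, double rampUpSeconds) {
        return threads / rampUpSeconds;
    }

    // Rough number of threads running at the same time when each thread
    // executes the scenario once (number of loops = 1).
    static double estimatedConcurrentThreads(int threads, double rampUpSeconds, double scenarioSeconds) {
        double started = threadsStartedPerSecond(threads, rampUpSeconds) * scenarioSeconds;
        return Math.min(threads, started);
    }

    public static void main(String[] args) {
        int threads = 100;           // "number of threads" of the thread group
        double rampUpSeconds = 50;   // "ramp up time" of the thread group
        double scenarioSeconds = 20; // mean duration of one scenario, think times included

        System.out.println("threads started per second: "
                + threadsStartedPerSecond(threads, rampUpSeconds));
        System.out.println("estimated concurrent threads: "
                + estimatedConcurrentThreads(threads, rampUpSeconds, scenarioSeconds));
    }
}
```

With these hypothetical values, only part of the 100 threads run at the same time, which illustrates why the number of threads is not the number of concurrent users.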

Now let’s go back to the implementation of “real case” scenarios using JMeter. I recommend this interesting article on the subject sent to me by a colleague (thanks Petros 😉 ). Some very good methodological hints are given concerning the writing of scenarios in the first paragraphs. Basically, I can give 4 main hints on the subject that are easy to follow and implement with JMeter:

  • Keep scenarios simple:
    Each scenario should correspond to one use case. This makes things much simpler and more logical, particularly when it comes to interpreting the results of the stress tests.
  • Use “recording” techniques to generate your scenario from a “real” usage of the application:
    JMeter comes with a proxy component which, when started, will record all the HTTP Request and Response cycles originating from a web browser configured to access your system through this proxy. There are well-known problems with the usage of this proxy when dealing with HTTPS: often, a simple solution is to do all the recording in HTTP and switch the protocol to HTTPS in your scenario afterwards (this supposes that you can make your system run under HTTP for the time of the recording).
  • Don’t forget to record the “think time” of the users:
    The “think time” of a user is the elapsed time between 2 user actions. During this time, the user may be thinking about what to do next, answering an urgent call on the phone, talking with a friend … this must be part of the scenario. Fortunately, JMeter allows you to record these “think times” and translate them into “Gaussian Waits” inside your scenario (see the article mentioned above for hints on how to do it). In any case, you should always have “waits” in your scenarios simulating these “think times” of the real users as realistically as possible.
  • Read the JMeter User’s Manual, particularly the “Component Reference”, in order to find all the possibilities provided by the tool. For example:
    You can use an external csv file containing (username, password) pairs in order to have each thread log into your system with different credentials.
    You can use regular expressions to parse HTTP Responses and extract data necessary to chain your samplers
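To make the idea of a Gaussian “think time” more concrete, here is a small Java sketch that draws pauses from a normal distribution around a mean value. It is not JMeter’s own implementation: the class name, the parameters and the clamping of negative values to zero are assumptions made for the illustration.

```java
import java.util.Random;

public class ThinkTimeSketch {
    // Draws a pause around meanMillis with the given standard deviation;
    // negative draws are clamped to 0 since a pause cannot be negative.
    static long thinkTimeMillis(Random random, long meanMillis, long deviationMillis) {
        double delay = meanMillis + random.nextGaussian() * deviationMillis;
        return Math.max(0L, Math.round(delay));
    }

    public static void main(String[] args) {
        Random random = new Random();
        // Hypothetical user who thinks for about 3 seconds (+/- 1 second) between actions.
        for (int step = 1; step <= 5; step++) {
            long pause = thinkTimeMillis(random, 3000, 1000);
            System.out.println("user action " + step + ", then a think time of " + pause + " ms");
        }
    }
}
```

Most pauses fall close to the mean while a few are noticeably shorter or longer, which is closer to real user behaviour than a constant wait.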

Once you have your scenario ready, you must configure your test plan in order to meet your objectives. The tuning of the main parameters of your test plan (number of threads, ramp up and number of loops) is often a “trial and error” procedure. However, we can give the following 3 hints:

  • You should try to have a constant throughput during a run:
    It is often very difficult to “control” the throughput particularly during the ramp up period
  • If your objective is to simulate a “peak”:
    You should have a “high” number of threads and a “low” ramp up time and number of loops
  • If your objective is to simulate a “long run”:
    You should have a “medium” number of threads, a “higher” ramp up time and a “high” number of loops

Note: The terms “high”, “higher”, “medium” and “low” are deliberately qualitative in the 3 bullets above as they depend on the system you are testing.

To be continued …

This post is already too long: it seems I have too much to say on the subject 😉 Never mind, I will carry on in a following post tomorrow covering the remaining subjects: running the test plans, recording the meaningful measures, interpreting the results, monitoring the systems …

JVM Monitoring with Oracle Application Server 10g R2

March 2, 2010

A little introduction

I was recently asked to perform some stress tests on a system running Oracle Application Server 10g Release 2 installed on Windows 2003 server. Among other things, the objective was to monitor the system and profile the code in order to detect possible flaws in the code and the server configuration.

One of the tasks I had to do was to find a way to monitor the application server’s JVM during the stress tests. Naively, I thought that I could easily use “visualgc” (jvmstat 3.0) or even better the “jvisualvm” provided with all the latest releases of the JDK. The rest of this post shows how wrong (and ignorant) I was …

First thing to do: install a decent JVM

As you may already know, only “recent” versions of the JDK are bundled with monitoring tools (jps, jstat, jstatd, jvisualvm …) and unfortunately Oracle Application Server 10g R2 is not bundled with something that can be called a “recent” JVM … JDK 1.4.2

However, this is no real problem as you can monitor an older JVM with the tools provided in a recent one: more precisely, you can monitor any JVM with a version number greater than or equal to 1.4.1 (see jvmstat doc). Basically, you just need to:

  • download and install the latest available JDK (for example 1.6.0_18): jps, jstat and jstatd are included starting with jdk 1.5
  • download and install jvmstat 3.0 if you wish to have the “visualgc” tool and documentation for all the monitoring tools in one bundle.

Once you have done this you can try to run jps on your Windows 2003 Server where you have your Oracle Application Server 10g R2 installed … and … no, it won’t show you any of the JVM’s of the platform 🙂

Still, you can test that everything works as expected by writing a simple test class such as the following one and running it with the JDK bundled in the Oracle Application Server:

public class Test {
  public static void main(String[] args) throws Exception {
    // Print a dot every 10 seconds so that the process stays alive and visible to jps
    while (true) { System.out.print("."); Thread.sleep(10000); }
  }
}

Once you have run it, this class should output a dot every 10 seconds in your console. If you run jps from another console, you should see a Java process corresponding to your running test class listed in the output produced by the jps tool. This should be enough to reassure you and prove that the jps from a JDK 1.6 can monitor Java processes originating from a JDK 1.4.2 😉

As a matter of fact the main reason why you don’t “see” the Java processes of your Oracle Application Server listed in the output produced by the jps tool is that they are run by very different OS users. This user / permission issue is documented in each of the monitoring tools: for example for jps see towards the end of the “Description” section.

Second thing to do: run the tools with the proper user

Oracle Application Server is installed as a Windows Service and as such all its processes are owned and executed by the Local System User.

When you run the jps tool (or any other monitoring tool provided with your freshly installed JDK 1.6), the user owning and executing the monitoring processes is the one you used to log into your Windows Server. There are several ways to run a command prompt as the same “Local System User” that runs the Windows Services; 2 of them are documented here. I chose to use the psexec tool from Sysinternals:

psexec -i -s cmd.exe

Once you have a command prompt owned by the “Local System User”, all the processes run from there inherit this user. If you run jps from within this command prompt you will find more Java processes listed in the output of the tool … but … once again no luck, the Java processes corresponding to the “OC4J Homes” running your web applications are not there 🙂

Third thing to do: set the temporary directory of the JVM’s

That is the trickiest part of the procedure, and finding a solution involved a “deep dive” into the source code of the jps tool and jvmstat classes.

As far as I have understood, starting from JDK 1.4.1 all the JVM’s can produce real time performance metrics in files. These files are located in the temporary directory of the user running the Java process under a folder named “hsperfdata_<user>”. On the Windows platform, for a Java process spawned by a Windows service (and thus owned by the “Local System User”), its performance file is located under c:\WINDOWS\Temp\hsperfdata_SYSTEM and is named after the id of the process.

When you run jps (or any other monitoring tool), the OS user running the command is used to determine the directory where the performance files should be found (based on the user’s temporary directory). For example, jps will return an entry for each file present in this directory.
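The lookup described above can be imitated with a few lines of Java. This is only a simplified sketch of the idea, not the actual jps code: it assumes that the performance files live under the temporary directory reported by the JVM running the sketch, in a folder named “hsperfdata_<user>”, and it only reads file names (the process ids) without parsing the files. The class and method names are made up, and it needs a recent JDK to compile (not the 1.4.2 one).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class HsperfdataList {
    // Returns the entries of the directory whose names look like process ids.
    static List<String> listPids(Path dir) {
        List<String> pids = new ArrayList<>();
        if (!Files.isDirectory(dir)) {
            return pids; // no performance data directory for this user
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                if (name.matches("\\d+")) {
                    pids.add(name);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(pids);
        return pids;
    }

    public static void main(String[] args) {
        // Directory derived from the current user and temporary directory,
        // which is why the user running the tool matters so much.
        Path dir = Paths.get(System.getProperty("java.io.tmpdir"),
                "hsperfdata_" + System.getProperty("user.name"));
        System.out.println(dir + " -> " + listPids(dir));
    }
}
```

Run as two different users, the sketch looks into two different directories, which mirrors why jps does not show the processes of another user.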

However, when a Java process corresponding to an “OC4J Home” is spawned by OPMN, the location of the temporary directory is overridden through an environment variable and points to the temporary directory of the user who installed the server (some Administrator user).

The problem is that:

  • on one hand, you have to run jps as the “Local System User” in order to have sufficient privileges to monitor the Java processes of the Application Server (because it is started as a service).
  • on the other hand the performance files are not located under the temporary directory of this same “Local System User”

The solution is to override the location of the temporary directory of the “OC4J Home”. Fortunately, this is easy using the Enterprise Manager console: in the “Administration” tab of every “OC4J Home”, there is a “Server Properties” link that opens a web form where you can find an “Environment” section. In this section, you just need to add an environment variable named “TEMP” with a value set to “c:\WINDOWS\Temp”.

Once this is done and your “OC4J Home” is restarted, your jps tool run as the “Local System User” will return (among others) the Java processes corresponding to your “OC4J Homes”. Moreover, under the directory “c:\WINDOWS\Temp\hsperfdata_SYSTEM” a new file will appear for each of these Java processes.

Fourth thing to do: the final monitoring architecture

As I have 3 Oracle Application Servers 10g installed on 3 different servers, my initial idea was to be able to monitor them all from a remote PC using the “jstatd” tool on the servers and the “visualgc” or “jvisualvm” tools on the PC.

Running jstatd on a server is not very different from running jps. It has to be run as the “Local System User” with a policy file allowing the embedded RMI Server to be started (see jstatd documentation for more details):

grant codebase "file:${java.home}/../lib/tools.jar" {
    permission java.security.AllPermission;
};

In order to check that everything is ok, the jps tool can be used from a remote pc passing the host name or IP of the server.

jps -l <server_host_name_or_ip>

An even better way to do this and to have it automated is to install jstatd as a Windows Service as:

  • it will run with the needed user (Local System User)
  • it can be started automatically

Instructions on how to install jstatd as a Windows Service can be found here. A brief summary follows:

  1. Get the instsrv.exe and srvany.exe tools for example from the Windows Server 2003 Resource Kit Tools.
  2. Run the following command to install the service:
    c:\<location>\instsrv.exe jstatd c:\<location>\srvany.exe
  3. Use a Windows Registry editor to create a key named “Parameters” under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\jstatd. Then inside the Parameters key create a new String value (REG_SZ) named Application containing the full path to jstatd.exe and the security parameter (policy file).
  4. Use the Windows Services management application to check that the jstatd service is configured to run as the local system user.
  5. Start the jstatd service.

Once jstatd is installed as a Windows Service on every server, the “jvisualvm” tool can be used to connect to these servers (Remote Host) and monitor their instrumented JVM’s. The “visualgc” tool can either be embedded as a plug-in of the “jvisualvm” tool or be run independently against the various instrumented JVM’s on the various servers where jstatd is running.