Performance Testing

This section gives an overview of performance testing: a set of test varieties based on the ISO25010 quality characteristic "performance", or rather "performance efficiency" as the ISO standard calls it.

Performance efficiency is defined as performance relative to the amount of resources used under stated conditions. [ISO25010]  

What is performance?

Performance relates to the duration of the execution of tasks by an IT system and the load such a system can handle. Since there are many different uses of IT systems, the performance expectations of the users will also differ. Examples of performance that are often important are the response time of a real-time system, the duration of a batch process and the number of users that can be handled simultaneously. When testing performance, there are various layers of an IT system involved. Each layer may have a specific effect on the overall performance as experienced by the user.

We distinguish the following varieties of performance testing:

  • Load/stress testing:
    Used to exercise the system under the various expected loads, at component level as well as end-user level. At component level, performance testing checks conformance to basic performance requirements (database transactions per second, API calls per second, etc.). At end-user level we test performance in a real-life situation, with users "hitting" the system from all possible interfaces in real life (mobile/web/workstation etc.). Stress testing aims at going beyond the regular load expectations/requirements. This may vary from looking ahead to an expected increase in load over the coming months or years, to deliberately driving the system to its breaking point (in order to test failover/restore processes).
Figure: Performance testing varieties related to application architecture.

  • Functional performance testing:
    The created functionality should be able to cope with a multi-user and/or heavy load situation. This may range from requiring a correct implementation of batch processes that will not impact each other (or other business processes) to application functionality that will correctly access data without conflicts in a multi-user situation (e.g. how to deal with locked records etc.). This variety of testing is mainly executed during development activities.
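
The load-testing idea above can be sketched in a few lines of Python. This is a minimal illustration, not a real tool: the system under test is simulated with a short sleep, and names such as `run_load_test` are invented for this example.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

def call_system_under_test() -> float:
    """Stand-in for one real request (e.g. an HTTP call); replace with your client code."""
    start = time.perf_counter()
    time.sleep(0.005)  # simulated processing time of the system under test
    return time.perf_counter() - start

def run_load_test(concurrent_users: int, requests_per_user: int) -> dict:
    """Fire requests from several simulated users in parallel and collect latencies."""
    def user_session(_):
        return [call_system_under_test() for _ in range(requests_per_user)]

    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        sessions = list(pool.map(user_session, range(concurrent_users)))
    elapsed = time.perf_counter() - started

    latencies = [lat for session in sessions for lat in session]
    return {
        "requests": len(latencies),
        "throughput_rps": len(latencies) / elapsed,
        "p95_seconds": statistics.quantiles(latencies, n=20)[-1],  # 95th percentile
    }

result = run_load_test(concurrent_users=10, requests_per_user=20)
print(result)
```

Raising `concurrent_users` beyond the expected production level turns the same script into a simple stress test.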

The ISO25010 standard defines three subcharacteristics for performance:

  • Time behavior
    the degree to which the response and processing times and throughput rates of a product or system, when performing its functions, meet requirements.
  • Resource utilization
    the degree to which the amounts and types of resources used by a product or system, when performing its functions, meet requirements.
  • Capacity
    the degree to which the maximum limits of a product or system parameter meet requirements.

Performance testing will need to address all three of these subcharacteristics.
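
As a sketch of how the three subcharacteristics translate into concrete measurements, the following Python fragment checks a stand-in workload against example requirements. The workload, thresholds and batch sizes are all illustrative assumptions, not values from any standard.

```python
import time
import tracemalloc

def process_batch(items):
    """Stand-in workload; replace with the component under test."""
    return [x * x for x in items]

# Time behavior: response time measured against a stated requirement.
start = time.perf_counter()
process_batch(range(10_000))
response_time = time.perf_counter() - start

# Resource utilization: peak memory measured against a stated requirement.
tracemalloc.start()
process_batch(range(10_000))
_, peak_bytes = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Capacity: the largest batch still handled within the time requirement.
capacity = 0
for size in (1_000, 10_000, 100_000):
    start = time.perf_counter()
    process_batch(range(size))
    if time.perf_counter() - start < 0.5:  # example requirement: 0.5 s
        capacity = size

print(f"time behavior: {response_time:.4f}s, "
      f"resource utilization: {peak_bytes} bytes peak, capacity: {capacity}")
```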

What is performance testing?

Performance testing establishes whether an IT system complies with the relevant performance requirements. Correctly dealing with performance requirements becomes even more important as performance-critical technology integrates into all aspects of our professional and personal lives (e.g. cloud-based computing, mobile solutions and "the internet of things").

In real life, requirements for performance are often not specified; for the business stakeholders it is evident that the performance must be good, but they are often not able to specify what, for them, defines "good". During the refinement of user stories (or other requirements documents), the team members (for example, the operations people) should pay extra attention to non-functional requirements such as performance efficiency. Even when performance requirements are completely missing, it is still useful for the team to organize and perform performance testing.

The organizational aspects of performance testing 

Perhaps the most important aspect of implementing successful performance testing is the organizational aspect. The IT organization (for instance in the form of a "performance expertise centre") should be able to support organization-wide changes in design, development, testing and maintenance practices regarding system performance. This requires the capability to implement tools and methodologies that meet both organization-wide and project-specific needs and help define and manage performance requirements (both for sequential and high-performance IT delivery and also for maintenance).  

In DevOps, performance testing should be able to support work processes in sprints as well as across sprints or teams relating to the end-to-end business process. Supporting this performance testing effort can be done via a structured performance testing approach, providing easy access to tools as well as expertise (specialist resource pool). 

The technical aspects of performance testing 

The technical aspects of performance testing have to answer questions of time behavior, resource utilization and capacity. The specific technical implementation is dependent on the available tools and infrastructure. The technical implementation of these tools is managed via the implementation of a load model, iteration model and performance metrics plan (for a detailed description see addendum):  

  • Load model: The load model describes at a logical level how the test object will be loaded (user profiles) and how the time behavior will be measured. It consists of the performance requirement(s), description of the test object and the level of load to be generated.  

  • Iteration model: The iteration model describes at a technical level how the test object will be loaded and how the capacity of the system will be measured.  

  • Performance metrics plan: The performance metrics plan indicates which parts of the test environment are monitored as part of the test execution and defines the reporting structure for all performance metrics.  
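
The three deliverables above can be represented as simple data structures, which makes them easy to version alongside the test scripts. The sketch below is illustrative: the class and field names are assumptions for this example, not part of any tool or standard.

```python
from dataclasses import dataclass, field

@dataclass
class LoadModel:
    """Logical view: what load is applied and how time behavior is measured."""
    requirements: dict   # e.g. {"p95_response_seconds": 2.0}
    test_object: str     # workflow or click-path under test
    user_profiles: dict  # level of load per interface, e.g. {"browser": 800, "mobile": 200}

@dataclass
class IterationModel:
    """Technical view: how the load is generated and capacity is measured."""
    ramp_up_seconds: int
    steady_state_seconds: int
    load_generators: list

@dataclass
class MetricsPlan:
    """What is monitored during execution and how results are reported."""
    monitored_components: list
    report_metrics: list = field(default_factory=lambda: ["p95", "throughput", "errors"])

plan = (
    LoadModel({"p95_response_seconds": 2.0}, "checkout workflow",
              {"browser": 800, "mobile": 200}),
    IterationModel(ramp_up_seconds=300, steady_state_seconds=1800,
                   load_generators=["gen-01", "gen-02"]),
    MetricsPlan(monitored_components=["web tier", "app tier", "database"]),
)
print(plan)
```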

The detailed description of the organizational and technical aspects of a performance testing implementation is worked out in the different steps of the performance testing varieties relevant for the organization in question.

Performance testing varieties

Assuring the right level of performance, including performance testing, is possible in different test varieties, with a big difference in the type of tools used, skill sets needed and, most importantly, the depth of testing and analysis.

The following sections describe the most important performance testing varieties (both static and dynamic), with an overview of relevant (though not necessarily all) activities related to the performance of an IT system.

Designing for performance

For a long time, the main concern for performance testing was to gather requirements following the design phase and expand on the performance aspects of those requirements. In many performance test activities this resulted in at least knowing when performance was (extremely) bad. Designing for performance instead puts the focus on designing for good performance by the different stakeholders. That means, for instance, that infrastructure design choices conform to best practices for performance.

Additional design choices are decided on through the analysis of the performance consequences of those decisions. Design for performance testing translates into a set of performance requirements that will later be used to validate those design choices.

The performance testing specialist is involved where necessary to provide input for the design and to make the resulting performance requirements as SMART as possible.

Figure: Performance activities and test varieties.

Performance testing activities and deliverables:

  • (Initial) Load model:
    • Performance requirements:
      Perform a product risk analysis that results in SMART performance requirements (instead of a generic claim that "performance is a high priority")
    • Test object:
      Describe the designed test object in one or more click-path or workflow descriptions; this also allows the prioritization of certain workflows or parts of the test object for performance testing;
    • Level of load:
      Get an early estimation of expected usage levels;
  • (Initial) Iteration model:
    The need to be able to handle a realistic load necessitates design choices on everything from database usage to security aspects. The iteration model translates the way the expected load will arrive into the best way to simulate that load with the available performance test tooling (or it may signal the need for additional tooling)
  • (Initial) Performance metrics plan:
    The need to measure results and report based on the stated requirements requires the performance metrics plan to match and grow with the load and iteration model. The initial plan states the specific components to monitor in order to validate the stated requirements.

Developing for performance 

At the most technical level, a project should look at performance at a component (unit testing) as well as a system (integration testing) level. This means working with platform-specific best practices in design and development. With different design choices (and performance consequences) for anything ranging from ERP-based systems to portal or mobile solutions, there is no one-size-fits-all solution.

Constant vigilance is required to keep up with new versions of development frameworks and the resulting performance impacts (both good and bad). This "develop for performance (testing)" approach is then validated for the first time by testing those specific components (web services/database access/etc.) for performance during unit or unit-integration testing.

Performance testing activities & deliverables:

  • (Working) Load model:
    Provides part of the requirements against which the development approach (best practices, both market-wide and company-specific) will be measured, as well as the specific test scripts to test the resulting test objects;
  • (Working) Iteration model:
    The specific technical needs resulting from the design and development choices may require platform-specific (including database-specific) tooling and training;
  • (Working) Performance metrics plan:
    In order to operate in the development environment, tool and implementation choices are made regarding monitoring tools and/or specifically designed stubs or simulation software for not yet available system components.  
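
A stub for a not-yet-available component can be as simple as a function that mimics the real interface and an assumed latency profile, so the calling code can already be performance-tested. The payment service, its latency range and all names below are hypothetical examples.

```python
import random
import time

def payment_service_stub(order_id: str) -> dict:
    """Stub for a payment component that is not yet available.
    Mimics the real service's interface and an assumed latency profile
    so performance tests of the calling code can run early."""
    time.sleep(random.uniform(0.02, 0.08))  # assumed latency of the real service
    return {"order_id": order_id, "status": "approved"}

def checkout(order_id: str, payment_service=payment_service_stub) -> dict:
    """Code under test: depends on the payment component via injection,
    so the stub can later be swapped for the real service."""
    start = time.perf_counter()
    result = payment_service(order_id)
    result["latency_seconds"] = time.perf_counter() - start
    return result

print(checkout("order-42"))
```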

Testing for performance 

In addition to component performance testing during development, there need to be specific moments for acceptance performance testing. Due to technical challenges, a single acceptance performance test will sometimes (wrongly) be the only performance test variety executed on a system, for example when there is only limited time for all required transactions to be executed on a production-like testing environment.

If load and iteration models are finalized for even part of the final transactions to be tested, performance testing can start early. This can be as early as the first release of a prototype. Performance testing is also possible at database level (testing the performance of specific queries or stored procedures) or on individual APIs. Based on the requirements derived from the total set of transactions (e.g. 10,000 visitors to a website result in 2,000 calls to a specific API), individual performance test scripts can already be run, and the results fed back to earlier phases in the project.
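
Deriving a component-level load target from an end-user requirement, as in the visitors-to-API-calls example above, is simple arithmetic. The peak factor below is an assumed value for illustration.

```python
# Derive component-level load from end-user load: if a given share of site
# visitors triggers a specific API call, the API's required throughput
# follows from the visitor requirement.
visitors_per_hour = 10_000
api_calls_per_visitor = 0.2  # 2,000 calls per 10,000 visitors
peak_factor = 3              # assumed peak-to-average ratio

avg_calls_per_second = visitors_per_hour * api_calls_per_visitor / 3600
peak_calls_per_second = avg_calls_per_second * peak_factor

print(f"average: {avg_calls_per_second:.2f} calls/s, "
      f"peak target: {peak_calls_per_second:.2f} calls/s")
```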

As more of the system, or even a complete end-to-end business process, becomes available, performance testing is executed as close to the production-like situation as possible. The overall performance test strategy determines how close the performance test will come to the expected load and mix of usage patterns for the expected group of end users.

Important activities when executing performance testing: 

  • Network traffic capture and playback tooling:
    Designing, building and maintaining the (production-like) test environment and using the available tools to create performance test scripts to match the designated user profiles
  • Multi-load generator setup:
    Designing and implementing the performance scenario (mix of test scripts) to simulate the load on the system as defined in the load model to production-like levels.
  • Test lab set-up:
    Design and maintain a lab with sufficient capacity or flexibility to run multiple applications in a test scenario (additional skills needed in network components, virtual machine management, network experience etc.)
  • Environment monitoring:
    Running and maintaining similar tooling as used in the production environment with the performance test tooling able to tie into those monitoring results.

Monitoring for performance

Performance monitoring in the production environment has multiple goals. The primary goal is to safeguard business processes and provide early warnings of performance degradation. This is done by monitoring available resources throughout the IT infrastructure. This can be supported by running performance test scripts (for a very limited number of simulated users) in the production environment to monitor the end-user performance experience.

A second goal of running test scripts and monitoring results in production is to provide feedback to earlier performance testing levels. Are the designed tests still an accurate representation of user behavior? This prevents a test and monitoring setup that no longer represents real-life usage. For instance, when more and more traffic is generated from mobile devices, the resulting load can be run alongside traditional (PC-based) browser usage in all earlier executed performance testing varieties. This results in updates of the load and iteration models.
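
A synthetic production probe of the kind described above can be sketched as a small script that runs a handful of scripted transactions and flags degradation against a baseline. Here the transaction is simulated with a sleep; the function names and tolerance factor are assumptions for this example.

```python
import statistics
import time

def synthetic_transaction() -> float:
    """One scripted end-user transaction run against production
    (simulated here with a sleep; replace with your real client script)."""
    start = time.perf_counter()
    time.sleep(0.01)
    return time.perf_counter() - start

def probe(baseline_p95: float, samples: int = 10, tolerance: float = 1.5) -> dict:
    """Run a few synthetic transactions and flag degradation when the
    observed p95 exceeds the baseline by the tolerance factor."""
    latencies = [synthetic_transaction() for _ in range(samples)]
    p95 = statistics.quantiles(latencies, n=20)[-1]
    return {"p95": p95, "degraded": p95 > baseline_p95 * tolerance}

print(probe(baseline_p95=0.02))
```

Scheduled from a monitoring host (e.g. every few minutes), such a probe gives an early warning of end-user performance degradation with a negligible extra load on production.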

If additional tools or approaches become available to monitor performance, this also results in feedback to earlier test varieties. The performance metrics plan will need to be updated in order to keep providing reports/results that can be compared across the different performance test varieties.

Performance testing as part of the IT delivery lifecycle

Every organization or IT process has its own specific challenges and processes in place regarding the development process, infrastructure and testing approach. This results in either executing a single performance test variety, or a combination of activities. With the integration of performance testing in the IT delivery lifecycle, a number of improvements become possible that will allow for a more thorough level of performance testing:  

  • Make performance a major part of the Definition of Done. This not only requires extra attention to design and development for performance; it also requires a testing infrastructure and tool setup that will provide the benefits of all relevant varieties of performance testing during sprints.
  • Schedule end-to-end performance testing in a timely manner relative to the IT system's release, to allow time for the last fixes.
  • Facilitate the feedback loop from end-to-end performance testing and production performance monitoring to any part of the DevOps information flow.
  • Integrate dedicated (unit/component) performance acceptance tests into the CI/CD pipeline.
  • Etc.
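
A performance acceptance test in a CI/CD pipeline can be a plain test function that fails the build when a latency budget is exceeded. The component, the budget and the sample count below are illustrative assumptions.

```python
import statistics
import time

# Example performance requirement for this component (illustrative number).
P95_BUDGET_SECONDS = 0.05

def component_under_test():
    """Stand-in for the unit/component exercised in the pipeline."""
    time.sleep(0.001)

def test_component_meets_p95_budget():
    """Fails the pipeline when the measured p95 exceeds the stated budget."""
    latencies = []
    for _ in range(50):
        start = time.perf_counter()
        component_under_test()
        latencies.append(time.perf_counter() - start)
    p95 = statistics.quantiles(latencies, n=20)[-1]
    assert p95 < P95_BUDGET_SECONDS, f"p95 {p95:.4f}s exceeds budget"

test_component_meets_p95_budget()
print("performance gate passed")
```

Because the test is just a function with an assertion, any test runner already in the pipeline can execute it as part of the Definition of Done.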

The major challenge is to unlock these performance testing capabilities for an organization. This can be done by temporary or part-time availability of experts from a specialized performance testing team to help other teams. Another aspect is to provide a reusable infrastructure and tool setup for all teams. This reusable infrastructure will either be part of the standard CI/CD pipeline, or a dedicated performance test pipeline can be provided.

A dedicated setup may be required due to the costs involved in running a production-sized performance testing environment. Even with the benefits of cloud capacity that can easily be switched off or removed, a greater level of control than the standard pipeline offers will then be useful. The costs involved in gearing up for extensive performance testing capacity can be significantly higher than those of a standard CI/CD pipeline, so triggering such actions should be a well thought-out decision instead of something anyone can trigger from the standard pipeline.

No matter which environment and pipeline choices are made, the setup needs to provide a framework for quick definition and deployment of performance tests. Because the value of performance results obtained in a limited environment, compared to the expected full-scale production environment, is known, such a centralized setup also allows performance testing in limited test environments (the standard pipeline). If needed, extra tools can be made available that make earlier testing possible, such as service virtualization tools for simulating part of the final end-to-end situation.