Usability testing

What is usability testing?

Usability testing is a test variety based on the quality characteristic usability, one of the so-called non-functionals.

Definition
Usability is the degree to which a product or system can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use. [ISO25010 2011]

Usability testing is an important test variety for creating an intuitive system, leading to productive, efficient and satisfied users of that system. It is generally an iterative process where different techniques should be employed throughout all phases of a project. This way, the final product becomes optimally suited to the demands and the needs of the end users. It is generally better to start testing the usability earlier rather than later. If there is no possibility to test early, testing late in the process is highly preferable to no testing at all. Still, by testing early one can prevent unnecessary expenses.

Usability testing (also known as User eXperience testing or UX testing) covers a wide range of techniques that are applicable to different phases or iterations of a project or system lifecycle. For instance, personas can be used during requirements elicitation, while observation can take place as soon as there is a product that can be used by a user. This may be a mock-up during early stages and a partial or even complete product during acceptance. Some techniques are executed by usability specialists, for example a heuristic evaluation of a functional system. Others are executed together with end users by user researchers, for example an observation of users interacting with a prototype of a system. Techniques can even be applied in a group setting, such as a walkthrough of a design with (a focus group of) users, developers and designers.

It is advisable to start by planning specifically for usability testing, to determine which techniques will be used in which phases of a project to cover all quality risks related to the usability of a system. When you work in sprints, you can make usability testing part of the sprint if there are usability testers in the team. It is also possible to work with a separate team of specialists and researchers who support other teams when needed, or who proactively follow other teams' sprints to provide advice and support.

Roles in usability testing

In usability there may be confusion about the term “tester”. The tester is often a regular user. Sometimes this user doesn’t even realize that they are testing. To prevent this confusion, we speak of the respondent and the end user. These roles are described in the following sub-sections.

Why do we use real users for usability testing? Well, nothing can beat testing with real users. In usability testing, we distinguish four important roles, which are described in the following sections. Others may also be involved in usability testing, for example (but not limited to) product owners, stakeholders, developers or analysts.

End users

For most projects, the end users are the people who will actually use the system once it is realized. To add the most value to their experience with the system it is very important to involve them in the process. Usability testing is an excellent way to do this. Their input provides legitimacy to any statement about the usability of a system and helps acceptance within the organization. Therefore, any end user involvement will provide added value to the system. The data gathered with the involvement of end users will also validate choices that are made, making them data-driven instead of “guesses”.

Respondent

Once a person is selected to participate in a usability test, this person becomes a respondent. A respondent can be an end user, but this is not always necessary. Sometimes it is preferable to select a respondent who is not an end user. In usability testing, “respondent” refers to a specific person who participated in a test, whereas “end user” refers to users in general and not to a specific person.

User researcher

The user researcher, also called UX researcher, assesses usability and makes recommendations to improve it, based on data gathered with usability testing. The user researcher is the professional who determines which techniques will be used in which phase. The user researcher will also coordinate, prepare or execute these techniques. They may also perform design activities, such as creating wireframes or other prototypes, or creating redesign proposals.

Designer

There are many different types of designers. In usability testing, there may be a role for the functional, graphic, interaction or user experience (UX) designer. They may have different tasks depending on their expertise and the techniques used. Designers may provide input for test preparation, use test results to create a redesign, participate in session-based techniques or even perform techniques themselves. A (UX) designer can also perform tasks that a user researcher can perform. However, it is preferable to have the usability of a product tested by someone who has not designed it. This prevents the test from building on the assumptions behind earlier design choices, and prevents emotional attachment and bias. It is a misconception that testing with actual users is not necessary if a UX or interaction designer does their job correctly. Even with the best designers on staff, testing with actual users is always a key success factor.

Quality subcharacteristics

In usability, we have to pay attention to a great number of different attributes. The ISO25010 standard defines the following subcharacteristics for usability:

Appropriateness recognizability, operability, learnability, user error protection, user interface aesthetics and accessibility.

Usability test plan and usability testing techniques

A usability test plan may describe the product risks specifically concerning usability and which techniques will be used to cover these. Additionally, topics such as functionality, scoping, tooling, resourcing, planning, budget and reporting may be addressed. In usability testing a wide range of techniques can be used. The following sections contain a list of common static and dynamic testing techniques:

Static testing techniques for usability testing

Focus groups – an informal technique for group reviewing where users discuss a product. Due to its informal way of working, a very skilled moderator is required.
Card sorting – a preventative technique to generate input to a user-friendly design of a new system. (Future) users of the system get the task to make logical groups of paper cards that describe parts of the system. Card sorting is a good technique to create an intuitive structure, such as a navigation structure for a website, or to check if a current structure actually meets the users’ expectations.
Reverse card sorting – a reviewing technique to evaluate the structure of an application, also known as tree testing. At least 5 (future) users each get a specific task that they should perform by navigating through a mock-up of the system structure on paper cards.
Cognitive walkthrough – a formal technique for group reviewing to assess the ease of use of a product for a user. In this technique, personas are used to represent various typical types of users.
Consistency inspection – an informal technique for group reviewing of an interface design. This review can be performed by designers and/or usability experts.
Model-based inspection – a formal technique of individual reviewing of user manuals. In this technique, the tester creates a model of the use of the test object. This model is then reviewed to determine if it resembles the intended use of the test object.
Pluralistic walkthrough – a formal technique for group reviewing that uses a structured description of the process, for example in a use case description or a process flow diagram. This is specifically useful for testing paper prototypes in an early stage.
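
The results of a card sorting session, as described above, are often summarized by counting how frequently respondents placed two cards in the same group. The following is a minimal sketch in Python, assuming each respondent's sort is recorded as a list of groups of card names. The function name `co_occurrence` and the sample cards are hypothetical and serve only as an illustration.

```python
from collections import Counter
from itertools import combinations

def co_occurrence(sorts):
    """Count, per pair of cards, how many respondents put both cards in one group."""
    counts = Counter()
    for groups in sorts:              # one respondent's sort: a list of groups
        for group in groups:          # one group: a list of card names
            # sorted() gives every pair a stable (alphabetical) key
            for pair in combinations(sorted(set(group)), 2):
                counts[pair] += 1
    return counts

# Hypothetical sorts from three respondents (illustration only).
sorts = [
    [["Login", "Profile"], ["Invoices", "Payments"]],
    [["Login", "Profile", "Payments"], ["Invoices"]],
    [["Login"], ["Profile"], ["Invoices", "Payments"]],
]

pairs = co_occurrence(sorts)
for (card_a, card_b), n in pairs.most_common():
    print(f"{card_a} + {card_b}: grouped together by {n} of {len(sorts)} respondents")
```

Pairs that many respondents group together are candidates for sharing a place in the navigation structure. Pairs that are rarely grouped together probably belong in different parts of it.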

Dynamic testing techniques for usability testing

Interviews – this technique can be used to measure very specific user-experience aspects that cannot be measured by any other technique, for example pleasurability (does the system bring pleasure to the user?). Interviews differ along three dimensions: medium (in-person, telephone, chat, etc.), structure (unstructured, fully structured or somewhere in between) and question type (closed versus open questions).
Interviews can often be combined with other techniques.
Heuristic evaluation – an informal technique in which usability experts perform a static test of an interface. The principles used for this test are called heuristics and are mostly listed on a checklist. A common checklist for this is “10 Usability Heuristics for User Interface Design” by Jakob Nielsen [Nielsen 1994].
Perspective based inspection – an informal individual technique where one or more inspectors assess a user interface design from various perspectives. One of the possible perspectives is to use personas.
Thinking out loud – an informal technique where the tester says out loud what they are thinking while using the test object.
Thinking out loud can often be combined with other techniques.
Observation – an informal individual technique where an observer observes the behavior of the respondent (user) while using the test object. Observations can be done in a natural environment (which is also called field research) or in a usability laboratory. More info on usability lab in the wiki.
Logging actual use – a tool-based technique to record the use of the test object by the tester. This technique can be used to support other techniques. The registered information can be analyzed to gain insight into what a user does or doesn’t do. An example of logging actual use is eye tracking whereby a camera with smart software registers the spot on the monitor that the user looks at.
A/B testing – a technique for testing two variants of a user interface (e.g. a website or a mobile app). The variants differ in one aspect. Part of the users gets variant A, another part gets variant B, and the use is logged. By studying the log, the tester can determine which variant has the best results, so that this variant will be implemented or used for further development.
If more than two variants of the same aspect are compared, this is often called A/B/n testing. If several aspects are varied at the same time, the technique is called multivariate testing.
Prototype testing – testing a prototype of the system. There are 3 types of prototype: vertical prototype (all functionality of a subset of the system), horizontal prototype (partial functionality of the total system) and scenario prototype (partial functionality for a subset of the system). Depending on the purpose of testing a choice for one of the prototypes is made.
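
To decide whether the logged difference between variant A and variant B in an A/B test is more than chance, the two success rates can be compared statistically. The following is a minimal sketch in Python using a two-proportion z-test, which is one common choice and not a method prescribed by this text. The function name `two_proportion_z_test` and the visitor numbers are hypothetical.

```python
import math

def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Compare two success rates; return (rate_a, rate_b, z, two-sided p-value)."""
    rate_a = success_a / total_a
    rate_b = success_b / total_b
    # Pooled rate under the assumption that both variants perform equally well
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        raise ValueError("cannot test: all users succeeded or all users failed")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value

# Hypothetical log: 120 of 1000 users completed the task with variant A,
# 150 of 1000 users completed it with variant B.
rate_a, rate_b, z, p_value = two_proportion_z_test(120, 1000, 150, 1000)
print(f"A: {rate_a:.1%}  B: {rate_b:.1%}  z = {z:.2f}  p = {p_value:.3f}")
```

A small p-value suggests that the difference between the variants is unlikely to be caused by chance alone. With small numbers of users the test has little power, so a result that is not significant does not show that the variants perform equally well.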

Environments to perform usability tests

Usability Laboratory – In the laboratory the person who tests the test object (often a real-life user) is usually separated from the observer(s) by a one-way mirror. This way, the respondent is not distracted by the observers. In the usability lab many of the aforementioned techniques can be applied: separately, combined or one after the other.

Real-world environment – The user is observed while using an operational system, or a new (not yet released) version of a system, in their own regular work environment. The user is in their normal situation and is therefore most likely to behave naturally.

Personas

A persona is an archetype of a user of the system. A persona represents the goals and tasks of a specific group of users together with their personal characteristics. Although a persona is fictitious, their behavior is based on knowledge about real users. Several personas together represent (a subset of) the user community. Personas can be combined with several of the techniques mentioned above.

Artifacts

A usability plan may describe the product risks that specifically concern usability and the techniques that will be used to cover them. Additionally, topics such as scoping, tooling, resourcing, planning, budget or reporting may be addressed.

Other artifacts may depend on the technique being used. Many techniques will result in findings, often compiled into a report. Proposals for the redesign of certain system parts may also be included.

Some techniques by nature result in specific artifacts, such as personas, prototype testing and observation. Other techniques, such as surveys, reverse card sorting or A/B testing, may yield quantitative data that can be statistically analyzed to draw objective conclusions on the usability of a system.
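
Quantitative results, such as the share of respondents who complete a task in reverse card sorting, are easier to interpret when reported with a confidence interval. The following is a minimal sketch in Python, assuming a pass/fail outcome per respondent. The Wilson score interval used here is one common choice for small samples, not a method prescribed by this text. The function name `wilson_interval` and the numbers are hypothetical.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Return the Wilson score interval for a success rate (z=1.96 is about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    rate = successes / trials
    denom = 1 + z**2 / trials
    center = (rate + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(
        rate * (1 - rate) / trials + z**2 / (4 * trials**2)
    ) / denom
    return center - half_width, center + half_width

# Hypothetical result: 7 of 10 respondents found the right item in the structure.
low, high = wilson_interval(successes=7, trials=10)
print(f"Observed success rate 70%, plausible range {low:.0%} to {high:.0%}")
```

With only a handful of respondents the interval is wide. This is a reminder to treat percentages from small tests as indications, not as precise measurements.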

Success factors

Involve users

The most essential success factor is whether or not users are involved. User focus may be achieved to some extent by experienced UX designers or user researchers. However, it cannot replace data gathered by testing with actual end users of the system. They are the only people who determine the added value of the system since they will be using it. Users are the main focus in certain techniques, such as card sorting, observation or thinking out loud, or provide their point of view in other techniques, such as walkthroughs, focus groups or personas.

Start early

Applying inspection techniques to the requirements or the design of a system at an early stage of a project helps prevent faults. Otherwise, faults found after the system has been developed will probably have a severe impact on the project, causing excessive and unnecessary amounts of rework and delays. By then such rework often needs to be planned for another iteration or release or, when the faults are critical, it will delay the release altogether. As described earlier, it is best to use several techniques throughout a project so that the usability of the product can be optimized in an iterative process, regardless of the project methodology being used. In DevOps, take care not to overdo it: creating large reports may not be effective, and light-weight techniques are preferable in this situation.

Test often

Generally, testing often is preferred to testing less often with more respondents. If you have a budget to perform a test with ten respondents, it would be preferable to test twice with five respondents, sometimes even dividing the ten respondents over three tests, rather than testing once with the whole group. There may be circumstances when testing with a larger group is preferable, for example when you have to use limited (external) resources. In any case, always consider what is best in the specific situation.
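
The preference for several small tests can be illustrated with a commonly used problem-discovery model, in which the share of usability problems found by n respondents is 1 - (1 - p)^n. The following is a minimal sketch in Python. The function name `share_of_problems_found` is hypothetical, and the value p = 0.31 (the chance that a single respondent reveals a given problem) is an assumed, illustrative value that differs per product and per test.

```python
def share_of_problems_found(respondents, p=0.31):
    """Expected share of problems found, assuming every respondent
    independently reveals each problem with probability p (an assumption)."""
    return 1 - (1 - p) ** respondents

for n in (1, 3, 5, 10):
    print(f"{n:2d} respondents: about {share_of_problems_found(n):.0%} of the problems")
```

Under these assumptions, the second group of five respondents adds far less than the first group did. Testing with five, fixing the findings and then testing the improved version with five new respondents therefore tends to reveal more than a single test with ten.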

Observing

While it is always good to ask the user for their opinion, it is even better to observe users as they interact with a system, or a prototype of that system. Users may behave differently than predicted, since they are often not fully aware of their own needs. When users develop a routine in their work, they perform their tasks partially or wholly without even thinking about it. This creates blind spots for users concerning the way they work and consequently how a system should work for them. A common pitfall during observations is to interact with a respondent too much. As a result, respondents may start to rely on guidance during test execution or provide comments that they think are socially desirable rather than their own opinions. If it is possible to leave the respondent alone while still observing, it is important to weigh the pros and cons of staying with the respondent versus leaving the respondent alone. When needed, this can be shifted while performing the test. However, it is highly preferable to assess this correctly in advance as shifting during the test will influence the data.

A special note with regard to observing: not only the user should be observed, but also the people from the DevOps team who are present to watch the user. The reactions of the people who created the system can reveal a lot about misunderstandings of user expectations and user behavior.

User focus

A user focus within the project (and even more so in the entire organization) helps keep the attention directed towards the specific demands and needs of the end users. The entire project team should be aware that the users of its product have a unique perspective. They will have needs and demands that may not be described in the requirements and some may even be subconscious. Also, project team members are often not representative of end users, who may not share their IT domain knowledge and don’t work with the product on a daily basis. A seemingly minor obstacle may become a major frustration when encountered multiple times a day. An awareness of these differences in perspectives will help a project team keep their focus on the end users.