Experience-Based Approach: Quality Hunting

Delivering software with the right quality at the right moment is only possible if there is a clear view of what quality level is needed to achieve the intended business value (which is elaborated in the requirements). The people involved also need a clear view of what the quality level currently is. Static testing and dynamic testing, performed by people in IT delivery teams, contribute to the understanding of the current quality level. However, part of the image of quality depends on the impression the stakeholders have from their personal experience. To support building such personal experience, you can organize a collaborative activity with a gamification approach, called quality hunting (note: this is not the same as bug hunting).

What is Quality Hunting?

Quality hunting is a gamified event where quality hunting teams (typically half the size of an Agile team, so about 4 people) compete to give the most valuable assessment of the quality level of their test object in a limited timeframe.

The main goal of quality hunting is to get a shared view of the quality level of the test object. Because multiple quality hunting teams, consisting of stakeholders and members of IT delivery teams, compete to get the best understanding of the current quality level, they are challenged to come up with fresh and innovative ways to assess the quality and to give good insight into one or more aspects of the quality level.

Figure: Quality hunting (a gamified approach to providing information about quality).

During the quality hunting event, each team documents its findings in a way similar to how debriefing information is presented in exploratory testing. When anomalies are found during the quality hunt, they are handled according to the anomaly management process of the organization.

The information gathered by the different teams is usually complementary and extends the information that was obtained with earlier scenario-based testing and other quality engineering activities.
The stakeholders benefit from quality hunting because, in a short timeframe, many ideas about, and new angles for, relevant quality aspects are investigated, so the stakeholders get a broad and varied overview of the quality level of their system.

How to Organize Quality Hunting

The game element in quality hunting consists of a (usually small) reward that is awarded to the team that gives the most informative report and/or the most specific angle on the quality. Quality hunting is about getting the most valuable information about the quality level of the software. This makes it a different and broader activity than bug hunting, which is a gamified testing approach that only motivates finding faults and failures. Quality hunting is organized during a pipeline step related to testing the overall business process, as part of the manual testing.
The collaborative way of working promotes interaction between the members of the quality hunting team, but also (while discussing the results of the quality hunt) between the different teams. This way, quality hunting contributes to the culture of collaboration and learning of the Agile mindset and the DevOps culture.

Quality Hunting is Part of Experience-Based Testing

In the TMAP books Neil’s quest for quality [Boersma 2014] and Quality for DevOps teams [Marselis 2020], we described how test design is divided into coverage-based testing (using test design techniques to create test scenarios) and experience-based testing (which uses people’s experience to design tests).
In experience-based testing so far, we distinguished four different approaches. We now add quality hunting as the fifth approach to experience-based testing.

Figure: Five approaches to experience-based testing.

The five approaches to experience-based testing are (from left to right in the figure): checklists, error guessing, exploratory testing, quality hunting and crowd testing.

Using GenAI in Quality Hunting

In amplified quality engineering we elaborate on both Quality Engineering WITH GenAI and Quality Engineering FOR GenAI. In this section, we look into how GenAI can support you in preparing and performing a quality hunt (QE WITH GenAI).
Generative AI tools can support all sorts of activities, so the examples below are in no way intended to be a complete list.

A quality hunting team can start its activities by creating a list of test ideas, so that it has an abundance of possibilities to vary its testing during the quality hunt. Involving a GenAI tool in the brainstorm for test ideas will support the creativity of the team members and help in quickly registering the test ideas. The test ideas can be twofold: they can be about which aspects of quality are of interest, and about how to assess such aspects.
During the quality hunt, the team needs to keep track of what they did and what information they found about the quality level. GenAI tools (for example dedicated AI agents) are very useful in keeping track of all this information in a structured and accessible way (for example by automatically making screenshots, or by preparing an anomaly report of a failure that was pointed out by a team member).
GenAI tools and/or agents can be instructed to support the quality hunting team in assessing specific quality characteristics, for example by automatically scanning the system for known security vulnerabilities (for example based on the OWASP top 10 for LLMs [OWASP 2025]).
At the end of a quality hunt, the teams need to summarize and report their assessment of the quality level. GenAI tools (especially LLMs) are very good at summarizing the previously gathered information and even generating graphs and charts to present the quality status in a clear and easy-to-understand way.
GenAI tools can observe the way of working of the quality hunting team and learn how to repeat an assessment of the same quality aspects with only minor support from people (one or two “human experts in the lead”), so that regression tests can be performed very efficiently.

Using Quality Hunting to Assess GenAI-based Solutions

In this section, we elaborate on how quality hunting can be used to assess the quality level of a GenAI-based solution (QE FOR GenAI).
Below are a few examples:

GenAI is intrinsically non-deterministic, meaning the same prompt can result in different answers on different tries. Therefore, testing in just one run does not provide the level of confidence needed to support the decision to go live with a (new version of a) GenAI-based system.
In a quality hunting event, the team can use a “how many out of 5” approach, meaning they do a similar test 5 times and determine how many times the result meets the expectation. They must decide beforehand how many times the result must be right to consider it a pass.
The quality hunting team can assess specific quality characteristics that are especially of interest for GenAI-based systems.
A quality characteristic that was specifically defined for AI is “humaneness”. Since deciding whether the behavior of a system is humane or not is not a simple pass/fail evaluation, this is particularly suitable for evaluation in a quality hunting event.
Another important quality characteristic is usability. It is quite hard to assess the level of usability by designing and elaborating test cases, grouping those test cases in test scenarios and executing these tests manually or automated. For a quality hunting team that consists of various people with different backgrounds (both business and IT), it is much easier to assess the usability. So, quality hunting is particularly suited to assessing usability.
Functional accuracy of a system is important. This goes for regular IT systems, but it is even more important for GenAI-based IT systems, especially because GenAI tends to always give a convincing-looking answer, even if it actually doesn’t know the answer and is merely confabulating. A quality hunting team composed of people with diverse backgrounds is more likely than an average testing team to distinguish between true and false information provided by the GenAI system.
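
The “how many out of 5” approach mentioned above can be illustrated with a small sketch. This is only one possible way to record such a check, not a prescribed TMAP implementation. The function name passes_n_out_of_k, the threshold of 4 and the stand-in fake_genai_answer are assumptions made for illustration; in a real quality hunt the call would go to the GenAI-based system under test, and the expectation check would often be a human judgment rather than a string comparison.

```python
import random
from typing import Callable, Tuple


def passes_n_out_of_k(
    run_once: Callable[[], str],
    meets_expectation: Callable[[str], bool],
    runs: int = 5,
    required: int = 4,
) -> Tuple[bool, int]:
    """Repeat a similar test `runs` times and compare the number of
    acceptable results with a threshold that was agreed beforehand."""
    if not 0 < required <= runs:
        raise ValueError("required must be between 1 and runs")
    successes = sum(1 for _ in range(runs) if meets_expectation(run_once()))
    return successes >= required, successes


def fake_genai_answer(rng: random.Random) -> str:
    """Hypothetical stand-in for a non-deterministic GenAI system."""
    return "Paris" if rng.random() < 0.8 else "Lyon"


if __name__ == "__main__":
    rng = random.Random()
    passed, successes = passes_n_out_of_k(
        run_once=lambda: fake_genai_answer(rng),
        meets_expectation=lambda answer: answer == "Paris",
    )
    print(f"{successes} out of 5 runs met the expectation; pass = {passed}")
```

Fixing the required number of successes as a parameter before the runs start mirrors the point that the team must decide the pass criterion beforehand, not after seeing the results.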