One complaint against computer-generated test cases is that they differ from those designed by humans. Somehow, computer-generated test cases have a different feel to them, and it is sometimes difficult for humans to grasp the crux or focal point of a test case produced by a model-based test generator such as Conformiq Designer™. But why do the tests look different? And does it matter? Are human-designed test cases inherently better than computer-generated ones, or vice versa?
The structure and feel of a human-designed test case set is the result of aiming to fulfill multiple goals at once. One is test coverage, i.e. the (perceived or estimated) capability of the designed test set to actually spot potential faults. In other words, it is the efficiency of the test set in probing the system under test at various function points. Another goal is understandability and maintainability, i.e. the capability of human operators to later update and modify the test set while also understanding its structure and content. A third design goal is low redundancy, i.e. avoidance of test cases that do not contribute to test coverage in any significant manner. To achieve these goals, a human test designer employs his or her cognitive system, which is very different from that of a computer. The human approach to a design problem like test design is hierarchical and plan-driven, as humans have to handle larger problems by splitting them into subproblems and smaller tasks. In short, the human approach to test case design is driven by the structure of the human brain and the different goals of (manual) test case design.
The difference with model-based test generation (by computers) is twofold. First, the “cognitive system” of a computer generating tests is an algorithm, and there is a great deal of freedom in how test generation algorithms can be designed. A test generation algorithm does not need to mimic the structure of the human cognitive system, yet it can still produce test sets that achieve the actual qualitative and quantitative goals set for them. For example, the Conformiq Designer algorithms handle the system specification as a whole and optimize the produced test sets from a global perspective, a very hard task for humans. Second, a computer-generated test set does not need to be understandable and maintainable in the same way as a human-designed one, because the test set will also be maintained by the algorithm and not by a human operator. Therefore, computers can generate test sets that humans could not efficiently manage. And this is one of the strengths of the model-based test generation approach. Because test sets do not need to be optimized for human understanding, they can be optimized rigorously for coverage, non-redundancy, and testing efficiency. Taking the human out of the loop makes the test sets stronger when it comes to their core purpose, i.e. discovering potential faults.
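To make the idea of global optimization for coverage and non-redundancy concrete, here is a minimal, purely illustrative sketch in Python. It is not Conformiq Designer's actual algorithm; it simply shows one classic way a machine can select test cases globally, by greedily picking candidates that add new coverage and dropping those that are redundant. The test-case names and coverage targets are hypothetical.

    # Illustrative sketch only: greedy selection of test cases for coverage.
    # Not the Conformiq Designer algorithm; names and targets are hypothetical.

    def select_tests(candidates: dict[str, set[str]]) -> list[str]:
        """Greedily pick test cases until no candidate adds new coverage.

        `candidates` maps a test-case name to the set of coverage targets
        (e.g. model transitions, branches, requirements) it exercises.
        """
        selected: list[str] = []
        covered: set[str] = set()
        while candidates:
            # Pick the candidate covering the most still-uncovered targets.
            best = max(candidates, key=lambda t: len(candidates[t] - covered))
            if not candidates[best] - covered:
                break  # every remaining candidate is redundant
            selected.append(best)
            covered |= candidates.pop(best)
        return selected

    if __name__ == "__main__":
        # Hypothetical candidates derived from a behavioral model.
        tests = {
            "tc_login_ok":    {"T1", "T2", "T5"},
            "tc_login_fail":  {"T1", "T3"},
            "tc_logout":      {"T5", "T6"},
            "tc_login_retry": {"T1", "T3"},  # fully redundant with tc_login_fail
        }
        print(select_tests(tests))
        # -> ['tc_login_ok', 'tc_login_fail', 'tc_logout']

Notice that nothing in the selection reflects how a human would group or name scenarios; the only criterion is global coverage with no redundancy, which is precisely why the resulting set can look unfamiliar.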
Because of this, it is a fallacy to evaluate computer-generated test cases by comparing them to human-designed test sets and considering any difference to be a problem. Computer-designed test sets differ because they can. And that makes all the difference.