Testability

Enable JavaScript to unlock Galleries, BibTeXs, and the Contact Form.

Testability is defined as the degree to which a system or component can be tested via runtime observation, determining how hard it is to write effective tests for a piece of software. It is an essential design-time concern that developers often ignore, despite the fact that testing can account for 30% to 50% of the entire cost of a system.

Controllability and Observability

At its heart, testability is the combination of two measurable metrics: controllability and observability.

Controllability measures how easy it is to provide a component with specific inputs and bring it into a desired state for testing. If you cannot force the software into a specific scenario or condition, creating an effective test is impossible.
Observability measures how easily one can see the behavior of a program, including its outputs, quality attribute performance, and its indirect effects on the environment. Tests rely on observability to verify whether functionality conforms to the specification.

A major challenge occurs when a system depends on external components, such as a booking system interacting with a Global Distribution System (GDS). In these cases, developers must handle indirect inputs (responses from external services) and indirect outputs (requests sent to external services). Verifying these requires specific design patterns to maintain controllability and observability without actually “buying flights” during every test run.

Designing for Testability

Designing testable software requires proactive architectural decisions. Many principles that improve other qualities, such as changeability, also synergize with testability.

SOLID Principles: Smaller pieces of functionality, as mandated by the Single Responsibility Principle, are much easier to test. The Interface Segregation Principle reduces effort by creating smaller interfaces that are easier to mock or stub. Finally, the Dependency Inversion Principle makes it easier to inject test doubles because dependencies only go in one direction.
Test Doubles: To address controllability of inputs, developers use test stubs to provide pre-coded answers. To observe indirect outputs, test spies or mock components are used to verify that the correct messages were sent to external systems.
Architectural Tactics: Highly testable designs minimize cyclic dependencies, which otherwise prevent components from being tested in isolation. They also provide ways to manipulate configuration settings easily and ensure all component states can be accessed by the test.

Testing Quality Attributes

Testability extends beyond functional correctness to include the verification of quality attribute scenarios.

Reliability: Systems like Netflix test reliability by “killing” random services (a controllability challenge) and observing how the rest of the system is impacted (an observability challenge). This often involves fault injection via test stubs.
Performance: Developers can inject latencies into connectors or components to analyze the impact on the whole process. This often includes stress testing to see how the system manages at its limits.
Security: This is tested by simulating attacks, such as malicious input injection or unauthorized requests, and measuring the time it takes for the system to detect or repair the breach.
Availability: Because observing 99.9% uptime over a year is impractical, developers inject faults in rare, high-load situations and mathematically extrapolate the system behavior to estimate long-term availability.

Increasing Test Coverage

Because specifying every input-output relationship is costly (the oracle problem), advanced techniques are used to increase coverage.

Monkey Testing: This involves a “monkey” that randomly triggers system events (like UI clicks) to see if the system crashes or hits an undesirable state. While good for finding runtime errors, it cannot identify logic errors because it doesn’t know what the correct output should be.
Metamorphic Testing: This samples the input space and checks if essential functional invariants hold true. For example, in a search engine, searching for the same query twice should yield the same results regardless of the user profile.
Test-Driven Development (TDD): In TDD, developers write the test first, implement the minimum code to pass it, and then refactor. This approach guarantees testability because code is never written without a corresponding test, leading to 100% unit test coverage and modular design.

Domain-Specific Testability

The approach to testability varies significantly based on the risk profile of the domain.

Web Applications: Testing is often visual and challenging to automate, requiring frameworks like Selenium or Playwright to simulate user clicks and assert element visibility.
Spacecraft Software (NASA): In high-stakes environments where failures are not an option, testability is critical because faults can only be detected on Earth before launch. NASA employs rigorous formal design reviews, restricts language constructs (e.g., no recursion), and only trusts software that has been “tested in space”.
Startups: For small teams, testability is a tool for value proposition evaluation, often using “Wizard of Oz” approaches to mock part of a system with human intervention to evaluate a concept before building it.