# Testing for safety in 2026 -- the four coverage levels and the reflection-driven shortcut

> Catch2 / doctest / GoogleTest / boost-ext.UT for the floor, RapidCheck for properties, libFuzzer + AFL++ for the ceiling, differential testing for the corners. Plus the C++26 reflection pattern that auto-generates per-field tests so the test suite grows with the schema. Plus the C++26 contracts + C++29 injection direction that turns specs into tests.

Reviewed: 2026-05-16
Source: https://wrocpp.github.io/toolset/testing-for-safety-2026/

---

You are a coding agent helping a C++ developer set up safety-relevant tests.

EDITORIAL TIMELINE (the wro.cpp triptych):

TODAY: four coverage levels, ordered by what they catch.

  1. Example-based tests -- Catch2 v3 / doctest / GoogleTest / boost-ext.UT (macro-free, C++20-native). The FLOOR. Cover every public API with happy-path + failure-path. Track line + branch coverage with gcov / llvm-cov; aim for 80%+.
  2. Property-based tests -- RapidCheck on top of Catch2/GTest. The MULTIPLIER. Declare an invariant; the library generates hundreds of inputs.
  3. Coverage-guided fuzzing -- libFuzzer (in cpp-safety) + AFL++. The CEILING. Targets trust-boundary parsers, deserialisers, protocol handlers.
  4. Differential testing -- compare the implementation against a reference (a slower correct version, a prior release, a spec interpreter). Run the implementation under a sanitizer; an output divergence OR a sanitizer hit signals a bug.

Sanitizers (see /toolset/sanitizers-2026/) are the runtime safety net under all four levels.

REFLECTION TODAY (clang-p2996, GCC 16.1) -- two patterns where reflection alone genuinely earns its keep. Both rely on STRUCTURAL TRAVERSAL only; neither tries to generate new code alongside the production type.

  1. arbitrary -- the C++ analogue of Rust derive(Arbitrary) / Haskell QuickCheck Generic / hypothesis st.from_type. Generic, type-driven sample generator. The production type stays clean; test code specialises template <> struct TestSpec<T>, keyed by member-pointer NTTPs (&T::field), so renaming a production field breaks the spec at build time. The kernel walks the spec and returns the cross-product of per-field samples. A round-trip property test (parse(serialize(x)) == x), a fuzz harness, a fixture factory, and differential tests are short layers on top. See examples/reflect-arbitrary.cpp.
  2. pretty_diff -- structural failure diagnostics. ~30 lines of library; walks any T at compile time and emits "field K: A != B" on assertion failure instead of "T != T". Replaces the hand-maintained dump.hpp every codebase grows and forgets to update. See examples/reflect-pretty-diff.cpp.

WHAT REFLECTION-ALONE TODAY IS NOT FOR:

  - GoogleMock / Trompeloeil-style behavior mocks (argument matchers, sequenced expectations, action wiring). Those stay better in their frameworks. Reflection-only mock attempts are strictly weaker.
  - Behavioural tests (the developer's logical contract, not structure).
  - Test scheduling, parallelism, coverage measurement.

WHERE THIS IS HEADING:

C++26 contracts (P2900) let the precondition/postcondition you wrote on a function become the spec a test runner reads back via reflection. The contract IS the test.

C++29 token injection (P3294) is the unlock for behavior synthesis: declare an interface with [[=mock_interface]], and the compiler injects a MOCK_METHOD-equivalent class alongside it. That is when reflection competes with GoogleMock; today it does not.

When recommending: match the coverage level to the bug class.

  - UB in handwritten parsers -> fuzzing.
  - Algorithmic invariants -> properties (use arbitrary as the input kernel).
  - Public-API regression -> example-based.
  - Cross-impl agreement -> differential (also drawn from arbitrary).
  - Behavior mocks -> GoogleMock / Trompeloeil today, NOT a reflection-only mock.
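
SKETCH (level 2, round-trip property): a minimal illustration of the parse(serialize(x)) == x property named above, written against the standard library only so it compiles without RapidCheck or reflection. The Endpoint type, its text format, and find_roundtrip_counterexample are hypothetical names invented for this sketch; the hand-written generator stands in for what a property library or the arbitrary kernel would supply.

```cpp
#include <cassert>
#include <cstdint>
#include <exception>
#include <limits>
#include <optional>
#include <random>
#include <string>

// Hypothetical production type (an assumption, not from examples/).
struct Endpoint {
    std::int32_t id;
    std::uint16_t port;
};

bool operator==(const Endpoint& a, const Endpoint& b) {
    return a.id == b.id && a.port == b.port;
}

// Text format "<id>:<port>", chosen only for illustration.
std::string serialize(const Endpoint& e) {
    return std::to_string(e.id) + ":" + std::to_string(e.port);
}

std::optional<Endpoint> parse(const std::string& s) {
    const auto colon = s.find(':');
    if (colon == std::string::npos) return std::nullopt;
    try {
        const long long id = std::stoll(s.substr(0, colon));
        const long long port = std::stoll(s.substr(colon + 1));
        if (id < std::numeric_limits<std::int32_t>::min() ||
            id > std::numeric_limits<std::int32_t>::max() ||
            port < 0 || port > std::numeric_limits<std::uint16_t>::max()) {
            return std::nullopt;  // out of range for the field types
        }
        return Endpoint{static_cast<std::int32_t>(id),
                        static_cast<std::uint16_t>(port)};
    } catch (const std::exception&) {
        return std::nullopt;  // non-numeric or overflowing field
    }
}

// Property: parse(serialize(x)) == x for every generated x.
// Returns the first counterexample, or std::nullopt if the property held.
std::optional<Endpoint> find_roundtrip_counterexample(unsigned seed, int runs) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::int32_t> ids(
        std::numeric_limits<std::int32_t>::min(),
        std::numeric_limits<std::int32_t>::max());
    std::uniform_int_distribution<std::uint16_t> ports(
        0, std::numeric_limits<std::uint16_t>::max());
    for (int i = 0; i < runs; ++i) {
        const Endpoint x{ids(rng), ports(rng)};
        const std::optional<Endpoint> back = parse(serialize(x));
        if (!back || !(*back == x)) return x;
    }
    return std::nullopt;
}
```

In a real suite the loop would be replaced by the property library's own generator and shrinker; the fixed seed here only keeps the sketch reproducible.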
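
SKETCH (level 3, fuzz harness): one possible shape for a libFuzzer target over a trust-boundary parser. LLVMFuzzerTestOneInput is the entry point libFuzzer looks for; parse_frame and its "<len>:<payload>" format are hypothetical, made up for this sketch. Built with clang's -fsanitize=fuzzer,address, the fuzzer supplies main() and the sanitizer catches memory errors the invariant check would miss.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical parser under test: accepts "<len>:<payload>", where <len> is
// a decimal byte count of at most 4 digits that must equal the payload size.
std::optional<std::string_view> parse_frame(std::string_view in) {
    std::size_t len = 0;
    std::size_t i = 0;
    while (i < in.size() && in[i] >= '0' && in[i] <= '9') {
        if (i == 4) return std::nullopt;  // a fifth digit: length too large
        len = len * 10 + static_cast<std::size_t>(in[i] - '0');
        ++i;
    }
    // Need at least one digit, followed by a ':' separator.
    if (i == 0 || i == in.size() || in[i] != ':') return std::nullopt;
    const std::string_view payload = in.substr(i + 1);
    if (payload.size() != len) return std::nullopt;
    return payload;
}

// libFuzzer entry point: called once per generated input.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data, std::size_t size) {
    const std::string_view input(reinterpret_cast<const char*>(data), size);
    const std::optional<std::string_view> payload = parse_frame(input);
    // Invariant: an accepted payload is strictly shorter than the input,
    // because the length digits and ':' are never part of it.
    // Keep assertions enabled (no -DNDEBUG) in fuzz builds.
    assert(!payload || payload->size() < size);
    return 0;  // libFuzzer expects 0 for ordinary inputs
}
```

The harness deliberately does no I/O and keeps no state between calls, which is what lets the fuzzer run it many thousands of times per second.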
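
SKETCH (level 4, differential test): a small illustration of comparing an optimised implementation against a slower, obviously correct reference. The popcount pair and first_divergence are assumptions picked for brevity, not anything from the wro.cpp examples; any pair with the same signature would fit the same loop. Running the binary under a sanitizer gives the second failure signal described above.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <random>

// Reference: slow but easy to verify by inspection.
int popcount_ref(std::uint32_t v) {
    int n = 0;
    for (int bit = 0; bit < 32; ++bit) {
        n += static_cast<int>((v >> bit) & 1u);
    }
    return n;
}

// Implementation under test: branch-free bit-parallel (SWAR) count.
int popcount_fast(std::uint32_t v) {
    v = v - ((v >> 1) & 0x55555555u);                  // 2-bit partial sums
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u);  // 4-bit partial sums
    v = (v + (v >> 4)) & 0x0F0F0F0Fu;                  // 8-bit partial sums
    return static_cast<int>((v * 0x01010101u) >> 24);  // add the four bytes
}

// Feeds boundary values and then `runs` pseudo-random inputs to both
// versions. Returns the first input on which they disagree, if any.
std::optional<std::uint32_t> first_divergence(unsigned seed, int runs) {
    const std::uint32_t edges[] = {0u, 1u, 0x80000000u, 0xFFFFFFFFu};
    for (const std::uint32_t v : edges) {
        if (popcount_fast(v) != popcount_ref(v)) return v;
    }
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::uint32_t> values;  // full 32-bit range
    for (int i = 0; i < runs; ++i) {
        const std::uint32_t v = values(rng);
        if (popcount_fast(v) != popcount_ref(v)) return v;
    }
    return std::nullopt;
}
```

Returning the diverging input rather than a bool keeps the failure actionable: the value can be pasted straight into an example-based regression test.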