Ensuring the accuracy and reliability of mathematical libraries poses challenges that go beyond traditional software testing. A recent community discussion among practitioners surfaced practical lessons on testing numerical routines, particularly in high-performance computing environments and specialized mathematical libraries.
The Challenge of Floating-Point Precision
One of the most significant challenges in numerical testing is dealing with floating-point precision. Participants noted that a fixed absolute tolerance such as 1e-6 is problematic: it is far too loose when the expected values are tiny and far too strict when they are large. The discussion suggested relative error rather than absolute error as the standard approach for floating-point math, though this too has its limitations.
Relative error is not a panacea either. As one participant pointed out, if a computation of 1 + 1e9 produced 1e9 - 1 instead, the result would still fall easily within a relative error bound of 1e-6, since the actual relative error is only about 2e-9. More generally, relative error works only if the computation scales multiplicatively from zero.
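The trade-off above can be sketched with Python's standard `math.isclose`, which accepts both a relative and an absolute tolerance. The helper name `close_enough` and the specific tolerance values are assumptions chosen for illustration, not recommendations from the discussion.

```python
import math

def close_enough(actual, expected, rel_tol=1e-6, abs_tol=1e-12):
    """Combined check: passes if either the relative or the absolute bound is met."""
    return math.isclose(actual, expected, rel_tol=rel_tol, abs_tol=abs_tol)

# The example from the text: an absolute error of 2 is invisible at rel_tol=1e-6.
expected = 1 + 1e9
wrong = 1e9 - 1
print(close_enough(wrong, expected))                 # True: relative error is about 2e-9
print(close_enough(wrong, expected, rel_tol=1e-12))  # False: a tighter bound catches it

# Near zero, a purely relative check rejects almost everything,
# which is why an absolute floor is usually added.
print(math.isclose(1e-17, 0.0, rel_tol=1e-6))        # False (abs_tol defaults to 0.0)
print(close_enough(1e-17, 0.0))                      # True
```

Neither tolerance is correct in general; the right pair depends on how the error of the specific routine is expected to scale.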
Surprising Vendor Library Issues
A particularly noteworthy revelation from the discussion is the number of bugs found in vendor-supplied mathematical libraries. Engineers working with Kokkos Kernels reported finding issues in major libraries including OpenBLAS, MKL, cuSPARSE, and rocSPARSE. This highlights the importance of thorough testing even when relying on established vendor solutions.
Edge Cases and Simple Matrices
Testing with edge cases has proven to be one of the most effective strategies. Professionals recommend testing with matrices of dimensions {0,1,2,3,4} and special values like NaN, +0, -0, +1, -1, +Inf, -Inf. These simple cases often reveal critical issues that more complex test cases might miss.
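A minimal sketch of such a grid of edge cases follows. The routine under test, `mat_scale`, is a hypothetical stand-in (a real suite would call the library function being verified), and the property checked, that scaling by 1.0 returns the input unchanged, is just one simple example. Note that the comparison helper has to treat NaN as equal to NaN and distinguish +0 from -0, which plain `==` does not do.

```python
import math

SPECIAL_VALUES = [math.nan, 0.0, -0.0, 1.0, -1.0, math.inf, -math.inf]
SIZES = [0, 1, 2, 3, 4]

def mat_scale(alpha, a):
    """Hypothetical routine under test: multiply every entry of a by alpha."""
    return [[alpha * v for v in row] for row in a]

def same_float(u, v):
    """Strict comparison: NaN matches NaN, and +0.0 does not match -0.0."""
    if math.isnan(u) or math.isnan(v):
        return math.isnan(u) and math.isnan(v)
    return u == v and math.copysign(1.0, u) == math.copysign(1.0, v)

def run_edge_cases():
    """Scaling by 1.0 must return the input unchanged for every size and fill value."""
    checked = 0
    for n in SIZES:
        for fill in SPECIAL_VALUES:
            a = [[fill] * n for _ in range(n)]  # n-by-n matrix filled with one value
            out = mat_scale(1.0, a)
            assert len(out) == n and all(len(row) == n for row in out)
            assert all(same_float(o, v)
                       for row_out, row_in in zip(out, a)
                       for o, v in zip(row_out, row_in))
            checked += 1
    return checked

print(run_edge_cases())  # 35 combinations: 5 sizes x 7 special values
```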
Key Testing Approaches for Numerical Libraries:
- Edge case testing with matrices of dimensions {0,1,2,3,4}
- Special value testing: NaN, +0, -0, +1, -1, +Inf, -Inf
- Property-based testing with random inputs
- Inverse function testing (round-trip verification)
- Testing against known analytical solutions
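The last three items in the list can be sketched with the standard library alone. The helper `check_round_trip`, the sample range, and the tolerances are assumptions for illustration; a dedicated property-based testing framework would typically generate and shrink inputs more cleverly than a seeded random generator.

```python
import math
import random

def check_round_trip(forward, inverse, samples, rel_tol=1e-12, abs_tol=1e-12):
    """Return the inputs for which inverse(forward(x)) is not close to x."""
    return [x for x in samples
            if not math.isclose(inverse(forward(x)), x,
                                rel_tol=rel_tol, abs_tol=abs_tol)]

rng = random.Random(12345)  # fixed seed so any failure can be reproduced
samples = [rng.uniform(-50.0, 50.0) for _ in range(1000)]

# Inverse-function (round-trip) property: log(exp(x)) should recover x.
round_trip_failures = check_round_trip(math.exp, math.log, samples)

# Known analytical identity: sin(x)^2 + cos(x)^2 = 1 for every x.
identity_failures = [x for x in samples
                     if not math.isclose(math.sin(x) ** 2 + math.cos(x) ** 2,
                                         1.0, rel_tol=1e-12)]

print(len(round_trip_failures), len(identity_failures))
```

Returning the failing inputs rather than a bare pass/fail makes it easier to turn a random failure into a permanent regression case.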
Version Compatibility Concerns
A critical issue raised by the community involves version-to-version compatibility in numerical libraries. Many Python numerical libraries change their internal representations and algorithms between versions, potentially producing slightly different results. This can have serious implications for industries like finance where reproducibility is crucial.
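One way to detect this kind of drift is to snapshot results bit-for-bit and compare them after an upgrade. The sketch below is an assumption about how such a check might look, not a practice described in the discussion; the names `snapshot` and `drift` are hypothetical, and the two result dictionaries are made-up stand-ins for the output of two library versions.

```python
import math

def snapshot(results):
    """Serialize floats exactly; float.hex() preserves every bit of the value."""
    return {name: value.hex() for name, value in results.items()}

def drift(results, golden):
    """Return {name: (old, new)} for each value that is not bit-identical to the snapshot."""
    changed = {}
    for name, hex_value in golden.items():
        old = float.fromhex(hex_value)
        new = results[name]
        if old.hex() != new.hex():
            changed[name] = (old, new)
    return changed

# Illustrative values only: pretend these came from two versions of a library.
v1 = {"mean": 0.1 + 0.2, "norm": math.sqrt(2.0)}
v2 = {"mean": 0.3, "norm": math.sqrt(2.0)}

golden = snapshot(v1)
print(drift(v1, golden))  # {}
print(drift(v2, golden))  # {'mean': (0.30000000000000004, 0.3)}
```

Whether a reported difference is a regression or an acceptable rounding change is still a judgment call; the snapshot only makes the change visible.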
Hardware-Level Considerations
Participants also pointed out that floating-point results can vary even at the hardware level. x86 processors can perform floating-point arithmetic either with SSE instructions, which operate on 32-bit and 64-bit IEEE 754 values, or with the legacy x87 instructions, which hold intermediate values in 80-bit extended precision. The same source code can therefore produce slightly different results depending on compiler settings and hardware capabilities.
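Python code cannot choose between SSE and x87, so the sketch below uses evaluation order as a stand-in for this kind of last-bit variation and measures the difference in units in the last place (ULPs). The helper `ulp_distance` is a hypothetical name, it only handles finite values, and `math.ulp` requires Python 3.9 or later.

```python
import math

def ulp_distance(a, b):
    """Approximate distance between two finite floats, in ULPs of the larger magnitude."""
    return abs(a - b) / math.ulp(max(abs(a), abs(b)))

# Floating-point addition is not associative, so regrouping alone changes the last bit.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

print(left == right)              # False
print(ulp_distance(left, right))  # 1.0

# A test that tolerates a few ULPs accepts both groupings,
# while still rejecting a result that is genuinely wrong.
MAX_ULPS = 4  # assumed budget for illustration
print(ulp_distance(left, right) <= MAX_ULPS)  # True
print(ulp_distance(left, 0.7) <= MAX_ULPS)    # False
```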
In conclusion, the testing of numerical routines requires a multi-faceted approach combining traditional unit testing with specialized techniques for handling floating-point arithmetic. The community's experience suggests that a comprehensive testing strategy should include edge cases, property-based testing, and careful consideration of hardware and version compatibility issues.
Source Citations: Unit Testing Numerical Routines