Validation Methodology: Comparing Python QUAL2K with the Legacy Excel Model

The Python QUAL2K engine is validated by running identical inputs through both the legacy EPA QUAL2K Excel/VBA model and the new Python implementation, then comparing every output value. This approach checks numerical equivalence between the two implementations, within the tolerances listed below, without relying on field data. Field data would test model calibration, not implementation correctness.

Validation Pipeline

Test Basins

Validation covers multiple river basins of varying complexity:

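| Model        | Location      | Reaches | Point Sources | Diffuse Sources | Features Tested                                     |
|--------------|---------------|---------|---------------|-----------------|-----------------------------------------------------|
| Paraluz      | Brazil        | 2       | 0             | 0               | Basic hydraulics, headwater-only, BOD/DO decay      |
| Bolder       | Colorado, USA | 50      | 2             | 1               | Large model, many reaches, full constituent set     |
| Rio Cauca    | Colombia      | 17      | 3             | 2               | Tropical river, nitrification-dominant DO dynamics  |
| Rio Meléndez | Colombia      | 8       | 1             | 0               | Auto-calibration, urban river, WWTP effluent        |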

Comparison Metrics

For each output variable at each reach, three metrics are computed:

  • Absolute difference: |Python − Excel|
  • Relative difference (%): 100 × |Python − Excel| / |Excel| (computed only when Excel ≠ 0)
  • Maximum error: the largest relative difference (%) across all reaches
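
The three metrics can be sketched in a few lines of Python. This is an illustration only, not the code used by the validation suite: the function name compare_outputs and the list-based inputs are assumptions, and the absolute value of the Excel result is used as the denominator so the percentage stays non-negative. Reaches where the Excel value is zero get no relative difference (None) and are skipped when taking the maximum.

```python
def compare_outputs(python_vals, excel_vals):
    """Compare one output variable across reaches (hypothetical helper).

    Returns (abs_diff, rel_diff, max_error):
      abs_diff  - |Python - Excel| per reach
      rel_diff  - relative difference in percent per reach,
                  or None where the Excel value is zero
      max_error - largest defined relative difference, or None
    """
    abs_diff = [abs(p - e) for p, e in zip(python_vals, excel_vals)]
    rel_diff = [
        100.0 * d / abs(e) if e != 0 else None
        for d, e in zip(abs_diff, excel_vals)
    ]
    defined = [r for r in rel_diff if r is not None]
    max_error = max(defined) if defined else None
    return abs_diff, rel_diff, max_error


# Made-up values for three reaches; the last Excel value is zero.
abs_diff, rel_diff, max_error = compare_outputs([10.0, 5.0, 1.0], [10.0, 4.0, 0.0])
print(abs_diff)   # [0.0, 1.0, 1.0]
print(rel_diff)   # [0.0, 25.0, None]
print(max_error)  # 25.0
```

Note that a reach with a zero Excel value still contributes an absolute difference, so a nonzero Python result there remains visible even though it has no percentage.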

Pass/Fail Criteria

| Variable Category         | Pass Threshold | Rationale                                                 |
|---------------------------|----------------|-----------------------------------------------------------|
| Hydraulics (U, H, B, Ac)  | < 0.1%         | Deterministic equations, exact match expected             |
| DO, CBOD                  | < 1.0%         | ODE integration differences (Euler vs analytical)         |
| Nutrients (N, P)          | < 2.0%         | Cascading rate interactions can amplify small differences |
| Pathogens                 | < 1.0%         | Simple first-order, exact match expected                  |
| Temperature               | < 0.5%         | Direct input, minimal computation                         |
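
As a rough sketch of how these thresholds might be applied, the snippet below maps each variable category to its limit and checks a maximum relative error against it. The dictionary keys and the function name passes are hypothetical and chosen for illustration; the actual validation script may organize this differently. The comparison is strict, matching the "<" in the table.

```python
# Pass thresholds in percent, taken from the table above.
# The category keys are illustrative names, not identifiers from the project.
THRESHOLDS_PCT = {
    "hydraulics": 0.1,
    "do_cbod": 1.0,
    "nutrients": 2.0,
    "pathogens": 1.0,
    "temperature": 0.5,
}


def passes(category, max_error_pct):
    """Return True if the maximum relative error (%) is strictly below the threshold."""
    return max_error_pct < THRESHOLDS_PCT[category]


# Made-up maximum errors for one basin.
results = {"hydraulics": 0.02, "do_cbod": 0.4, "nutrients": 2.3}
for category, err in results.items():
    status = "PASS" if passes(category, err) else "FAIL"
    print(f"{category}: {err}% -> {status}")
```

With these made-up numbers, hydraulics and DO/CBOD pass while nutrients fails, because 2.3% exceeds the 2.0% limit.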

Known Differences

Running the Validation Suite

The validation suite can be run locally via the Python CLI:

python validate_all_models.py

This processes all test basins and produces a summary report with pass/fail status and maximum errors per variable. See the Model Comparisons page for actual results.