Validation Methodology: Comparing Python QUAL2K with the Legacy Excel Model

The Python QUAL2K engine is validated by running identical inputs through both the legacy EPA QUAL2K Excel/VBA model and the new Python implementation, then comparing every output value. This approach tests the numerical equivalence of the two implementations without relying on field data, which tests model calibration rather than implementation correctness.

Validation Pipeline

Figure 1: Automated validation pipeline

Test Basins

Validation covers multiple river basins of varying complexity:

Model	Location	Reaches	Point Sources	Diffuse Sources	Features Tested
Paraluz	Brazil	2	0	0	Basic hydraulics, headwater-only, BOD/DO decay
Boulder	Colorado, USA	50	2	1	Large model, many reaches, full constituent set
Rio Cauca	Colombia	17	3	2	Tropical river, nitrification-dominant DO dynamics
Rio Meléndez	Colombia	8	1	0	Auto-calibration, urban river, WWTP effluent

Comparison Metrics

For each output variable at each reach, three metrics are computed:

Absolute difference: |Python − Excel|
Relative difference (%): 100 × |Python − Excel| / |Excel| (computed only when Excel ≠ 0)
Maximum error: the largest relative difference (%) across all reaches
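
The three metrics can be sketched in a few lines of Python. This is a minimal illustration, not the code in the validation suite: the function name compare_values and its list-based inputs are hypothetical, and it assumes the relative difference is left undefined (NaN) where the Excel value is zero and that the denominator is the magnitude of the Excel value.

```python
import math


def compare_values(python_vals, excel_vals):
    """Compare one output variable across reaches (hypothetical helper).

    Returns (absolute differences, relative differences in %, maximum
    relative difference in %). Relative differences are NaN wherever the
    Excel value is zero, and those reaches are skipped when taking the maximum.
    """
    if len(python_vals) != len(excel_vals):
        raise ValueError("both models must report the same number of reaches")

    abs_diff = []
    rel_diff = []
    for py, xl in zip(python_vals, excel_vals):
        d = abs(py - xl)
        abs_diff.append(d)
        # Assumption: divide by |Excel| so negative values give a positive percentage.
        rel_diff.append(100.0 * d / abs(xl) if xl != 0 else math.nan)

    defined = [r for r in rel_diff if not math.isnan(r)]
    max_rel = max(defined) if defined else math.nan
    return abs_diff, rel_diff, max_rel


# Illustrative values only, not results from any test basin.
excel = [8.0, 7.5, 0.0]
python = [8.0, 7.56, 0.001]
abs_diff, rel_diff, max_rel = compare_values(python, excel)
```

In this example the third reach has an Excel value of zero, so only its absolute difference is meaningful; the maximum error is taken over the first two reaches.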

Pass/Fail Criteria

Variable Category	Pass Threshold	Rationale
Hydraulics (U, H, B, Ac)	< 0.1%	Deterministic equations, exact match expected
DO, CBOD	< 1.0%	ODE integration differences (Euler vs analytical)
Nutrients (N, P)	< 2.0%	Cascading rate interactions can amplify small differences
Pathogens	< 1.0%	Simple first-order, exact match expected
Temperature	< 0.5%	Direct input, minimal computation
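
A pass/fail check against these thresholds might look like the sketch below. The dictionary keys and the passes function are hypothetical names chosen for illustration; only the threshold values come from the table above, and the comparison is strict to match the "<" in the table.

```python
# Pass thresholds in percent, copied from the table above.
# The category keys are illustrative names, not identifiers from the suite.
THRESHOLDS_PCT = {
    "hydraulics": 0.1,   # U, H, B, Ac
    "do_cbod": 1.0,      # DO, CBOD
    "nutrients": 2.0,    # N, P
    "pathogens": 1.0,
    "temperature": 0.5,
}


def passes(category, max_rel_error_pct):
    """Return True when the maximum relative error (%) is strictly below
    the threshold for the given variable category."""
    return max_rel_error_pct < THRESHOLDS_PCT[category]


# Illustrative values only, not results from any test basin.
print(passes("hydraulics", 0.05))  # below the 0.1% threshold
print(passes("do_cbod", 1.2))      # above the 1.0% threshold
```

Because the comparison is strict, an error exactly equal to the threshold counts as a failure in this sketch.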

Known Differences

Running the Validation Suite

The validation suite can be run locally via the Python CLI:

python validate_all_models.py

This processes all test basins and produces a summary report with pass/fail status and maximum errors per variable. See the Model Comparisons page for actual results.