Journal of Hydrometeorology | Newman et al. 
The concepts of model benchmarking, model agility, and large-sample hydrology are becoming more prevalent in hydrologic and land-surface modeling. As modeling systems become more sophisticated, these concepts can help improve our modeling capabilities and understanding. In this paper, their utility is demonstrated with an application of the physically based Variable Infiltration Capacity (VIC) model. We implement VIC for a sample of 531 basins across the contiguous USA, incrementally increase model agility, and compare each implementation to a benchmark. The use of a large-sample set allows for statistically robust comparisons and subcategorization across hydroclimate conditions. Our benchmark is a calibrated, time-stepping, conceptual hydrologic model. This model is constrained by physical relationships such as the water balance. It complements purely statistical benchmarks through its increased physical realism, and it permits physically motivated benchmarking using metrics that relate one variable to another (e.g., runoff ratio).
We find that increasing model agility along the parameter dimension, as measured by the number of model parameters available for calibration, does increase model performance for calibration and validation periods relative to less agile implementations. However, as agility increases, transferability decreases, even for a complex model such as VIC. The benchmark outperforms VIC in even the most agile case when evaluated across the entire basin set. However, VIC meets or exceeds benchmark performance in basins with high runoff ratios (greater than ~0.8), highlighting the ability of large-sample comparative hydrology to identify hydroclimatic performance variations.
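As a rough illustration of the subcategorization described above, the sketch below computes a runoff ratio (long-term runoff divided by long-term precipitation) for each basin and averages the model-minus-benchmark score difference above and below a threshold. It is not the authors' code. The function names, the dictionary fields, and the generic "score" are assumptions made for illustration, the abstract does not name the performance metric, and the basin values are synthetic.

```python
from statistics import mean


def runoff_ratio(total_runoff, total_precip):
    """Long-term runoff divided by long-term precipitation (same units)."""
    if total_precip <= 0:
        raise ValueError("total precipitation must be positive")
    return total_runoff / total_precip


def compare_by_runoff_ratio(basins, threshold=0.8):
    """Mean (model - benchmark) score for basins above and at/below the threshold.

    `basins` is a list of dicts with hypothetical keys: runoff_mm, precip_mm,
    model_score, benchmark_score. Returns None for an empty group.
    """
    groups = {"high": [], "low": []}
    for basin in basins:
        rr = runoff_ratio(basin["runoff_mm"], basin["precip_mm"])
        key = "high" if rr > threshold else "low"
        groups[key].append(basin["model_score"] - basin["benchmark_score"])
    return {k: (mean(v) if v else None) for k, v in groups.items()}


# Synthetic values for illustration only; not results from the paper.
basins = [
    {"runoff_mm": 900.0, "precip_mm": 1000.0, "model_score": 0.75, "benchmark_score": 0.70},
    {"runoff_mm": 200.0, "precip_mm": 800.0, "model_score": 0.50, "benchmark_score": 0.65},
    {"runoff_mm": 850.0, "precip_mm": 1000.0, "model_score": 0.80, "benchmark_score": 0.80},
]
result = compare_by_runoff_ratio(basins)
```

A positive value in a group means the model scored higher than the benchmark on average in that group, assuming a score where larger is better. With a large basin sample, the same grouping could be applied to any other hydroclimatic attribute.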
*Full text can be found [here](http://journals.ametsoc.org/doi/pdf/10.1175/JHM-D-16-0284.1).*