Model evaluation: Top-level page in hierarchy on model evaluation.
Experimental data sets: Top-level page in hierarchy on data sets.
This page summarises in list form some pitfalls that persons working with quality assurance of dispersion models should pay attention to. The list does not pretend to be comprehensive, but only highlights some problems.
The list is a spin-off from COST 732.
Feel free to elaborate on the list or add more pitfalls to it.
General advice when working with evaluation data sets
Some seemingly trivial issues concerning units frequently lead to errors: Make sure that parameters are in the units you think they are! Be careful, and use common sense and plausibility checks. In particular, conventions concerning time can cause problems: Are indicated times in UTC, in local standard time, or in Daylight Saving Time? Does a time stamp refer to the beginning or the end of the sampling period?
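As an illustration of the time stamp questions above, the following Python sketch brings a time stamp to UTC and to a "start of sampling period" convention. The function name, its parameters and the example record are hypothetical; the UTC offset, the averaging time and the stamp convention must be taken from the documentation of the data set in question.

```python
from datetime import datetime, timedelta, timezone


def to_utc_period_start(stamp, utc_offset_hours, averaging_minutes, stamp_marks_end):
    """Return the UTC start of the sampling period for a naive local time stamp.

    All arguments are assumptions that must come from the data set
    documentation; nothing here can be inferred from the data alone.
    """
    # Attach the documented offset (e.g. +2 for Central European Summer Time).
    local = stamp.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    utc = local.astimezone(timezone.utc)
    # If the stamp marks the end of the period, step back by the averaging time.
    if stamp_marks_end:
        utc -= timedelta(minutes=averaging_minutes)
    return utc


# Hypothetical record: 14:00 local time at UTC+2, hourly average,
# time stamp referring to the end of the sampling period.
start = to_utc_period_start(datetime(2005, 7, 1, 14, 0), 2, 60, True)
print(start.isoformat())  # 2005-07-01T11:00:00+00:00
```

Applying the same conversion to measurements and model input makes a mismatch of one or two hours between the two visible before any statistics are computed.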
Validation (comparison of model performance against data set)
- If data quality is not carefully ensured, erroneous conclusions concerning model performance can be drawn.
- Statistical metrics should not stand alone as the result of model validation. They can conceal important information. It is recommended to perform exploratory analysis of data - i.e. inspect graphs of various kinds. This is explained very well in the "BOOT User's Guide" by Chang and Hanna (2005), available for download through the web site of the Model Validation Kit, http://www.harmo.org/kit/download.asp
- A basic assumption is that, for an ideal model, predicted values should match measured values. Consider carefully whether this assumption is justified! As an example, it is not obvious that a peak concentration measured along an arc in a field experiment should match the modelled maximum along that arc. There are issues of spatial resolution and of stochastic variability.
- When the ratio of predicted to measured concentration is considered, pay attention to the treatment of small measured values.
- For steady-state models different choices of convergence criteria can lead to different results.
- Most CFD models operate for neutral stability conditions only.
- Be careful that your model does not internally generate negative concentrations.
- Due to large CPU time requirements, calculations using a CFD model are often performed for one value of wind speed only. Subsequently, the results are scaled to other wind speeds assuming perfect "Reynolds number independence". This can be erroneous in the case of low wind speeds in real-life conditions. Several factors, such as traffic-produced turbulence in street canyons, large horizontal wind meandering, and thermal stability, can significantly influence the flow and dispersion conditions at low wind speeds. In the case of air quality assessment in urban areas, low wind speed conditions are crucial, because they normally result in the highest ground level concentrations.
- A common problem is the correct representation of ground level sources in grid-point models. Road traffic should preferably be represented by a volume source with the vertical extension determined by the height of the mixing zone behind moving vehicles. In the case of models that represent ground level sources as a flux boundary condition or always assign the emissions to the lowest numerical grid layer, unrealistically high street-level concentrations will be predicted if the lowest layer is much thinner than the actual height of the volume source, and unrealistically low concentrations if it is much thicker.
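The sensitivity to small measured values mentioned in the list can be illustrated with FAC2, the fraction of predictions within a factor of two of the measurements. The Python sketch below is only one possible treatment, not a prescribed one: pairs whose measured value does not exceed a threshold are skipped and counted. The function name, the threshold values and the example numbers are hypothetical.

```python
def fac2(predicted, measured, threshold=0.0):
    """Return (FAC2, number of skipped pairs).

    FAC2 is the fraction of pairs with 0.5 <= predicted/measured <= 2.
    Pairs with measured <= threshold are skipped (an assumed treatment;
    other choices, such as flooring small values, give other results).
    """
    pairs = list(zip(predicted, measured))
    kept = [(p, m) for p, m in pairs if m > threshold]
    skipped = len(pairs) - len(kept)
    if not kept:
        return float("nan"), skipped
    # Written without a division so that no ratio has to be formed explicitly.
    hits = sum(1 for p, m in kept if 0.5 * m <= p <= 2.0 * m)
    return hits / len(kept), skipped


# Hypothetical concentrations in arbitrary units.
measured = [10.0, 4.0, 0.02, 0.0]
predicted = [12.0, 1.0, 0.2, 0.1]

# Only the exact zero is skipped: one of three remaining pairs is a hit.
print(fac2(predicted, measured))
# With an assumed detection limit of 0.05, one of two remaining pairs is a hit.
print(fac2(predicted, measured, threshold=0.05))
```

In this example the metric changes from about 0.33 to 0.5 solely because of the threshold, so the chosen treatment and the number of skipped pairs should be reported together with the metric.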
Wind tunnel data
Most often, wind tunnel data are restricted to neutral stability conditions.