Simulations of current climate conditions serve to evaluate the performance of RCMs. Since the SAR, a vast number of such simulations have been conducted (McGregor, 1997; Appendices 10.1 to 10.3). These fall into two categories: RCMs driven by observed (or "perfect") boundary conditions and RCMs driven by GCM boundary conditions. Observed boundary conditions are derived from Numerical Weather Prediction (NWP) analyses (e.g., European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis, Gibson et al., 1997; or National Centers for Environmental Prediction (NCEP) reanalysis, Kalnay et al., 1996). Over most regions they give an accurate representation of the large-scale flow and tropospheric temperature structure (Gibson et al., 1997), although errors are still present due to poor data coverage and to observational uncertainty. The analyses may be used to drive RCM simulations for short periods, for comparison with individual episodes, or over long periods to allow statistical evaluation of the model climatology. Comparison with climatologies is the only available evaluation tool for RCMs driven by GCM fields, with the caveats applied to GCM validation concerning the influence of sample size and decadal variability (see Sections 10.2, 10.3, and 10.4). Despite these caveats, relatively short simulations (several years) can identify major systematic RCM biases if they yield departures from observations significantly greater than the observed natural variability (Machenhauer et al., 1996, 1998; Christensen et al., 1997; Jones et al., 1999).
A serious problem often encountered in RCM evaluation is the lack of good quality, high-resolution observed data. In many regions, observations are extremely sparse or not readily available. In addition, little work has been carried out on how to use point measurements to evaluate the grid-box mean values from a climate model, especially when using sparse station networks or stations in complex topographical terrain (e.g., Osborn and Hulme, 1997). Most of the observational data available at typical RCM resolution (order of 50 km) is for precipitation and daily minimum and maximum temperature. While these fields have been shown to be useful for evaluating model performance, they are also the end product of a series of complex processes, so that the evaluation of individual model dynamical and physical processes is necessarily limited. Additional fields, such as the surface energy and water fluxes, need to be examined in model evaluation to broaden the perspective on model performance and to help delineate sources of model error.
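The point-to-grid-box comparison mentioned above can be sketched in code. The following is an illustrative example only, not a method from the studies cited: it averages whatever stations fall inside a model grid box to form an observed grid-box estimate. The station coordinates, values, and box size are invented for illustration, and a real evaluation would also need to account for station representativeness and elevation differences in complex terrain.

```python
def gridbox_mean(stations, lat0, lon0, dlat, dlon):
    """Average all station values falling inside the grid box whose
    south-west corner is (lat0, lon0) and whose size is dlat x dlon degrees.
    Returns None if no station lies in the box (the box cannot be evaluated)."""
    inside = [value for (lat, lon, value) in stations
              if lat0 <= lat < lat0 + dlat and lon0 <= lon < lon0 + dlon]
    if not inside:
        return None
    return sum(inside) / len(inside)

# Hypothetical daily-mean temperatures (deg C) at three stations
stations = [(47.1, 8.2, 14.5), (47.3, 8.4, 14.0), (47.4, 8.1, 13.0)]

# A 0.5-degree box, roughly the 50 km scale typical of RCMs
obs_boxmean = gridbox_mean(stations, 47.0, 8.0, 0.5, 0.5)

model_boxmean = 13.0          # hypothetical model grid-box value
bias = model_boxmean - obs_boxmean
```

With a dense network this simple average may be adequate; with sparse networks the observed box mean is itself uncertain, which is one reason the text cautions against over-interpreting small apparent model biases.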
Despite these problems, the situation is steadily improving in terms of grid-cell climatologies (Daly et al., 1994; New et al., 1999, 2000; Widmann and Bretherton, 2000), with various groups developing high-resolution regional climatologies (e.g., Christensen et al., 1998; Frei and Schär, 1998). In addition, regional programs such as the Global Energy and Water Cycle Experiment (GEWEX) Continental-Scale International Program (GCIP) have been designed with the purpose of developing sets of observation databases at the regional scale for model evaluation (GCIP, 1998).
Ideally, experiments using analyses of observations to drive the RCMs should precede any attempt to simulate climate change. The model behaviour with realistic forcing should be as close as possible to that of the real atmosphere, and experiments driven by analyses of observations can reveal systematic model biases attributable primarily to the internal model dynamics and physics.
A list of published RCM simulations driven by analyses of observations is given in Appendix 10.1. Many of these studies present regional differences (or biases) of seasonally or monthly averaged surface air temperature and precipitation from observed values. They indicate that current RCMs can reproduce average observations over regions of size 10⁵ to 10⁶ km², with errors generally below 2°C for temperature and within 5 to 50% of observed values for precipitation (Giorgi and Shields, 1999; Small et al., 1999a,b; van Lipzig, 1999; Pan et al., 2000). Uncertainties in the analysis fields used to drive the models, and in the observed station data sets, should be considered in the interpretation of these biases.
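The two bias measures quoted above are of different kinds: an absolute difference for temperature and a relative (percentage) difference for precipitation. A minimal sketch, with invented example values rather than results from any of the cited studies:

```python
def temperature_bias(model_mean, obs_mean):
    """Absolute bias of a seasonal-mean temperature, in deg C."""
    return model_mean - obs_mean

def precipitation_bias_percent(model_mean, obs_mean):
    """Relative bias of seasonal-mean precipitation, as a percentage
    of the observed mean (obs_mean must be non-zero)."""
    return 100.0 * (model_mean - obs_mean) / obs_mean

# Hypothetical regional seasonal means (model vs. observed)
t_bias = temperature_bias(14.5, 15.0)           # deg C
p_bias = precipitation_bias_percent(2.5, 2.0)   # percent
```

Expressing the precipitation bias relative to the observed mean is what allows a single 5 to 50% range to describe regions with very different absolute precipitation amounts.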
Various RCM intercomparison studies have been carried out to identify different or common model strengths and weaknesses, over Europe by Christensen et al. (1997), over the USA by Takle et al. (1999), and over East Asia by Leung et al. (1999a). For Europe a wide range of performance was reported, with the better models exhibiting a good simulation of surface air temperature (sub-regional monthly bias in the range ±2°C), except over south-eastern Europe during summer. For the USA, a major finding was that the model ability to simulate precipitation episodes varied depending on the scale of the relevant dynamical forcing. Organised synoptic-scale precipitation systems were well simulated deterministically, while episodes of mesoscale and convective precipitation were represented in a more stochastic sense, with a lower degree of agreement with the observed events and among models. Over East Asia, a major factor in determining model performance was found to be the simulation of cloud radiative processes.