In this chapter we take a two-pronged approach to evaluation. As is traditional, we discuss how well models simulate the mean seasonal climate for a number of variables (i.e., the average for a given season taken over many simulated years). Since the characterisation of a climate state includes its variability, we also describe simulated climate variability over a range of time-scales. In addition, we discuss aspects of the variability in the behaviour of specific phenomena. Evaluation of the performance of global models in specific geographical regions is the subject of Chapter 10.
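As an illustration of the two quantities described above, the following minimal sketch computes a mean seasonal value and its year-to-year spread from monthly model output. The data layout (a mapping from year to twelve monthly values), the function names and the use of the standard deviation as the measure of variability are assumptions made for this example, not a description of how any particular modelling group processes its output.

```python
from statistics import mean, pstdev

# Month indices (0 = January) for each conventional three-month season.
SEASONS = {
    "DJF": (11, 0, 1),
    "MAM": (2, 3, 4),
    "JJA": (5, 6, 7),
    "SON": (8, 9, 10),
}

def seasonal_means_by_year(monthly_by_year, season):
    """Return one seasonal average per simulated year, in year order.

    monthly_by_year (hypothetical layout) maps a year label to a list of
    12 monthly values. Simplification: DJF uses December of the same
    calendar year rather than the preceding December.
    """
    months = SEASONS[season]
    return [
        mean(values[m] for m in months)
        for _, values in sorted(monthly_by_year.items())
    ]

def seasonal_climatology(monthly_by_year, season):
    """Return (mean seasonal value, interannual standard deviation)."""
    per_year = seasonal_means_by_year(monthly_by_year, season)
    return mean(per_year), pstdev(per_year)
```

The first returned value corresponds to the mean seasonal climate; the second is one simple measure of variability on the interannual time-scale. Variability on other time-scales would need different averaging windows.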
We use a wide range of "observations" to evaluate models. Often, however, the most useful source for a particular variable is a product of one of the reanalysis projects, most commonly that of the National Centers for Environmental Prediction (NCEP) (Kalnay et al., 1996) or that of the European Centre for Medium-Range Weather Forecasts (ECMWF) (Gibson et al., 1997). Although products from a data assimilation system are not direct "observations" (they are the outcome of a combination of observed and model data), their global gridded nature and high time resolution make them extremely useful when their accuracy is not in question. Some additional useful reanalysis products are not the result of a direct combination of observed and model data but are the outcome of model integration, and hence must be used with caution. It is important to note that the available variables are not all of the same quality and, especially for data-sparse regions, implicitly contain contributions from the errors in the underlying model (see also Chapter 2). The overall quality of reanalysis products is continually assessed at regular International Reanalysis Workshops.
Recent discussions by Randall and Wielicki (1997), Shackley et al. (1998 and 1999), Henderson-Sellers and McGuffie (1999) and Petersen (2000) illustrate many of the confusions and uncertainties that accompany attempts to evaluate climate models, especially when such models become very complex. We recognise that, unlike the classic concept of Popper (1982), our evaluation process is not as clear-cut as a simple search for "falsification". While we do not consider that the complexity of a climate model makes it impossible ever to prove such a model "false" in any absolute sense, it does make the task of evaluation extremely difficult and leaves room for a subjective component in any assessment. The very complexity of climate models places severe limits on our ability to analyse and understand the model processes, interactions and uncertainties (Rind, 1999). It is always possible to find errors in simulations of particular variables or processes in a climate model. What is important to establish is whether such errors make a given model "unusable" in answering specific questions.
Models are evaluated in two fundamentally different ways. In the first, the important issues are the degree to which a model is physically based and the degree of realism with which essential physical and dynamical processes and their interactions have been modelled. This first type of evaluation is undertaken in Chapter 7. (We discuss the related aspects of the numerical formulation and numerical resolution in Section 8.9.) In the second, the aim is to quantify model errors, to consider the causes of those errors (where possible) and to understand the nature of interactions within the model. We fully recognise that many of the evaluation statements we make contain a degree of subjective scientific perception and may contain much "community" or "personal" knowledge (Polanyi, 1958). For example, the very choice of model variables and model processes that are investigated is often based upon the subjective judgement and experience of the modelling community.
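To make the second type of evaluation concrete, the sketch below shows two common ways in which a model error can be quantified against a reference field: the weighted mean bias and the weighted root-mean-square error. The choice of these two metrics, the function names and the flat-list data layout are assumptions for illustration only; the weights would typically represent grid-cell area (for example, proportional to the cosine of latitude), which is likewise an assumption here.

```python
from math import sqrt

def weighted_bias(model, reference, weights):
    """Weighted mean of (model - reference) over all grid points."""
    total = sum(w * (m - r) for m, r, w in zip(model, reference, weights))
    return total / sum(weights)

def weighted_rmse(model, reference, weights):
    """Weighted root-mean-square of (model - reference)."""
    total = sum(w * (m - r) ** 2 for m, r, w in zip(model, reference, weights))
    return sqrt(total / sum(weights))
```

A small bias alongside a large root-mean-square error would indicate compensating regional errors, which is one reason why a single number rarely suffices and a degree of judgement remains in any assessment.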
The aim of our evaluation process is to assess the ability of climate models to simulate the climate of the present and the past. Wherever possible we concentrate on coupled models; where necessary, however, we examine the individual model components. This assessment then acts as a guide to the capabilities of models used for projections of future climate.