Steve Easterbrook (University of Toronto)

Title: Constructive and External Validity for Climate Modeling

Discussions of the validity of scientific computational models tend to treat “the model” as a unitary artifact, asking questions about its fidelity with respect to observational data and its predictive power with respect to future situations. For climate modeling, both of these questions are problematic because of long timescales and inhomogeneities in the available data. Our ethnographic studies of the day-to-day practices of climate modelers suggest an alternative framework for model validity, one that focuses on a modeling system rather than any individual model. Any given climate model can be configured for a huge variety of different simulation runs, and only ever represents a single instance of a continually evolving body of program code. Furthermore, its execution is always embedded in a broader social system of scientific collaboration, which selects suitable model configurations for specific experiments and interprets the results of the simulations within the context of the current body of theory about earth system processes.

We propose that the validity of a climate modeling system can be assessed with respect to two criteria: Constructive Validity, which refers to the extent to which the day-to-day practices of climate model construction involve the continual testing of hypotheses about the ways in which earth system processes are coded into the models, and External Validity, which refers to the appropriateness of claims about how well model outputs ought to correspond to past or future states of the observed climate system.

For example, a typical feature of the day-to-day practice of climate model construction is the incremental improvement of the representation of specific earth system processes in the program code, via a series of hypothesis-testing experiments. Each experiment begins with a hypothesis (drawn from current or emerging theories about the earth system) that a particular change to the model code ought to result in a predictable change to the climatology produced by various runs of the model. The hypothesis is then tested empirically, using the current version of the model as a control and the modified version as the experimental case (a minimal sketch of this pattern is given below). Such experiments are then replicated for various configurations of the model, and the results are evaluated in a peer review process by the scientific working groups responsible for steering the ongoing model development effort.

Assessment of the constructive validity of a modeling system would take account of how well the day-to-day practices in a climate modeling laboratory adhere to rigorous standards for such experiments, and how routinely they test the assumptions that are built into the model in this way. Similarly, assessment of the external validity of the modeling system would take account of how well knowledge of the strengths and weaknesses of particular instances of the model is taken into account when making claims about the scope of applicability of model results. We argue that this framework offers a more coherent treatment of questions of model validity, as it corresponds more directly with the way in which climate models are actually developed and used.
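
To make the control-versus-experiment pattern concrete, the following Python sketch illustrates one such hypothesis test on simulated climatologies. It is illustrative only, not the authors’ actual tooling: run_simulation is a stand-in stub rather than any real model interface, and the diagnostic, effect size, noise level, and choice of significance test are all assumptions made for the example.

# Illustrative sketch of the hypothesis-testing pattern described above.
# All names and numbers are invented stand-ins, not a real model interface.

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def run_simulation(modified_code: bool, n_years: int = 30) -> np.ndarray:
    """Stand-in for a climate model run, returning annual means of one
    diagnostic (here, a notional precipitation rate in mm/day). A real
    experiment would launch the full model with either the current
    (control) or the modified (experimental) process code."""
    climatology = 2.7                        # arbitrary baseline climatology
    effect = 0.05 if modified_code else 0.0  # hypothesized shift from the code change
    noise = rng.normal(0.0, 0.08, size=n_years)  # stand-in for internal variability
    return climatology + effect + noise

control = run_simulation(modified_code=False)    # current version of the model
experiment = run_simulation(modified_code=True)  # modified process representation

# The hypothesis predicts a shift in the simulated climatology that is
# detectable above internal variability; a two-sample t-test is one
# simple way to check for that.
result = stats.ttest_ind(experiment, control)
print(f"control mean    = {control.mean():.3f}")
print(f"experiment mean = {experiment.mean():.3f}")
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.3g}")

In the practice described above, such a comparison would then be replicated across multiple model configurations, and the outcome reviewed by the relevant scientific working group before the change is accepted into the evolving model code base.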