Models for Models: Statistical Methods for Climate Model Output and Other Massive Datasets

Event Date: 

Thursday, January 10, 2008 - 3:15pm to 4:15pm

Event Date Details: 

Refreshments served at 3:00 PM

Event Location: 

  • South Hall 5607F

Cari Kaufman

I will present two novel statistical methods applicable to analyzing climate model output. The first, likelihood approximation using covariance tapering, is useful in analyzing the large spatial datasets that climate models often produce, for which traditional likelihood-based methods are computationally infeasible. In the tapering approach, covariance matrices are multiplied element-wise by a sparse correlation matrix. The resulting matrices are themselves sparse and can be manipulated using more efficient sparse matrix algorithms. I will present some theoretical results justifying the use of tapering and demonstrate its efficiency in practice.
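
As a rough illustration of the tapering idea, the sketch below builds a tapered covariance matrix and evaluates a Gaussian log-likelihood with sparse routines. It is a minimal sketch under stated assumptions, not the speaker's implementation: the exponential covariance, the Wendland taper, the nugget, and the names `tapered_covariance` and `tapered_loglik` are all choices made here for illustration.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import splu
from scipy.spatial import cKDTree


def wendland_taper(d, gamma):
    # Compactly supported correlation (assumed choice): zero beyond distance gamma.
    r = d / gamma
    return np.where(r < 1.0, (1.0 - r) ** 4 * (1.0 + 4.0 * r), 0.0)


def tapered_covariance(coords, gamma, sill=1.0, range_=0.2, nugget=1e-6):
    """Element-wise product of an exponential covariance and a sparse taper."""
    n = coords.shape[0]
    tree = cKDTree(coords)
    # Only pairs closer than gamma can be nonzero after tapering.
    pairs = tree.query_pairs(gamma, output_type="ndarray")
    i, j = pairs[:, 0], pairs[:, 1]
    d = np.linalg.norm(coords[i] - coords[j], axis=1)
    vals = sill * np.exp(-d / range_) * wendland_taper(d, gamma)
    upper = sparse.coo_matrix((vals, (i, j)), shape=(n, n))
    # Symmetrize and add the diagonal (taper equals 1 at distance 0).
    K = upper + upper.T + (sill + nugget) * sparse.identity(n)
    return K.tocsc()


def tapered_loglik(y, K):
    """Zero-mean Gaussian log-likelihood using a sparse LU factorization."""
    n = y.shape[0]
    lu = splu(K)
    # For a positive definite K, log det K is the sum of log |U_ii|.
    logdet = np.sum(np.log(np.abs(lu.U.diagonal())))
    quad = y @ lu.solve(y)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)
```

Because only pairs of locations within the taper range `gamma` are stored, both memory use and factorization cost depend on the number of nearby pairs rather than on the full n-by-n matrix.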

The second method addresses the question of how we can attribute sources of variability in climate model output. In particular, I will consider regional climate models (RCMs). RCMs address smaller spatial regions than do global climate models (GCMs), but their higher resolution better captures the impact of local features such as lakes and mountains. GCM output is often used to provide boundary conditions for RCMs, and it is an open scientific question how much variability in the RCM output is attributable to the RCM itself, and how much is due simply to large-scale forcing from the GCM. I will consider data from the Prudence Project, in which RCMs were crossed with GCM forcings in a designed experiment. Using this dataset as a motivating example, I will present a framework for Bayesian functional ANOVA modeling using Gaussian process prior distributions. In this framework, inference can be carried out either in a summary fashion, by examining the joint posterior distribution of the covariance parameters in the corresponding Gaussian processes, or locally, by studying functional and fully Bayesian versions of the usual ANOVA decompositions. These decompositions can be used to create useful graphical displays summarizing the contributions of each factor across space.
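
To make the structure of such a decomposition concrete, the sketch below computes the classical two-way ANOVA decomposition separately at each spatial location for a crossed RCM-by-GCM design. This is only a non-Bayesian analogue and does not use Gaussian process priors; the array layout and the names `pointwise_anova` and `variance_shares` are assumptions made for illustration.

```python
import numpy as np


def pointwise_anova(y):
    """Decompose y[i, j, s] (RCM i, GCM j, site s) into additive effects per site."""
    mu = y.mean(axis=(0, 1))              # grand mean at each site
    alpha = y.mean(axis=1) - mu           # RCM main effects, shape (n_rcm, n_sites)
    beta = y.mean(axis=0) - mu            # GCM main effects, shape (n_gcm, n_sites)
    inter = y - mu - alpha[:, None, :] - beta[None, :, :]  # interaction terms
    return mu, alpha, beta, inter


def variance_shares(alpha, beta, inter):
    """Fraction of the total sum of squares attributed to each factor, per site."""
    n_rcm, n_gcm = alpha.shape[0], beta.shape[0]
    ss_rcm = n_gcm * (alpha ** 2).sum(axis=0)
    ss_gcm = n_rcm * (beta ** 2).sum(axis=0)
    ss_int = (inter ** 2).sum(axis=(0, 1))
    total = ss_rcm + ss_gcm + ss_int
    return ss_rcm / total, ss_gcm / total, ss_int / total
```

Mapping the three shares over the spatial domain gives a simple display of where the RCM, the GCM forcing, or their interaction accounts for most of the variability.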