
## Smoothing Spline ANOVA Models

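Consider the general smoothing spline regression model introduced earlier, with $f$ being a function of multivariate variables $t = (t_1, \ldots, t_d)$. Each variable $t_k$ could itself be a vector. Assume that $t_k \in \mathcal{T}_k$, where $\mathcal{T}_k$ is an arbitrary domain. Then $f$ is a function of $t \in \mathcal{T} = \mathcal{T}_1 \otimes \cdots \otimes \mathcal{T}_d$.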

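Suppose that we want to use the RKHS $\mathcal{H}^{(k)} = \{1\} \oplus \mathcal{H}^{(k)}_1$ to model the effect of variable $t_k$. Denote $\mathcal{A}_k$ as the projection operator onto $\{1\}$ in $\mathcal{H}^{(k)}$. Then

$$f = \prod_{k=1}^d \left( \mathcal{A}_k + (I - \mathcal{A}_k) \right) f = \mu + \sum_{k=1}^d f_k(t_k) + \sum_{k<l} f_{kl}(t_k, t_l) + \cdots + f_{1 \cdots d}(t_1, \ldots, t_d), \qquad (15)$$

where $\mu$ is a constant, the $f_k$'s are main effects, the $f_{kl}$'s are two-factor interactions, and so on.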
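When both variables are discrete, the projection onto constants is simply an average over the levels of a factor, and the decomposition reduces to the familiar two-way ANOVA split. The following is a minimal sketch of that special case in plain Python (not part of ASSIST; the function name `anova_decompose` and the example table are made up for illustration).

```python
def anova_decompose(table):
    """Split a K1 x K2 table f[i][j] into mean, main effects and interaction."""
    k1, k2 = len(table), len(table[0])
    # Averaging over both factors gives the overall mean.
    mu = sum(sum(row) for row in table) / (k1 * k2)
    # Averaging over the second factor only, then over the first factor only.
    row_mean = [sum(row) / k2 for row in table]
    col_mean = [sum(table[i][j] for i in range(k1)) / k1 for j in range(k2)]
    # Main effects are the averaged functions with the mean removed.
    f1 = [rm - mu for rm in row_mean]
    f2 = [cm - mu for cm in col_mean]
    # The interaction is whatever remains after removing mean and main effects.
    f12 = [[table[i][j] - mu - f1[i] - f2[j] for j in range(k2)]
           for i in range(k1)]
    return mu, f1, f2, f12


table = [[1.0, 2.0, 6.0],
         [3.0, 5.0, 7.0]]
mu, f1, f2, f12 = anova_decompose(table)
print(mu, f1, f2)
print(f12)
```

Each main effect averages to zero over its own factor, and the interaction averages to zero over either factor, which is what makes the components identifiable.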

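(15) is just the simplest form of the so-called SS ANOVA decomposition. The classical ANOVA models are special cases with all variables being discrete. In general, suppose that we want to use the RKHS $\mathcal{H}^{(k)} = \mathrm{span}\{\phi^{(k)}_1, \ldots, \phi^{(k)}_{M_k}\} \oplus \mathcal{H}^{(k)}_1$ to model the effect of variable $t_k$, where the $\phi^{(k)}_j$'s are basis functions for the null space $\mathcal{H}^{(k)}_0$. Denote $\mathcal{A}^{(k)}_j$ as the projection operator onto $\{\phi^{(k)}_j\}$ in $\mathcal{H}^{(k)}$. Then expansion of the following equation

$$f = \prod_{k=1}^d \left( \mathcal{A}^{(k)}_1 + \cdots + \mathcal{A}^{(k)}_{M_k} + \left( I - \mathcal{A}^{(k)}_1 - \cdots - \mathcal{A}^{(k)}_{M_k} \right) \right) f \qquad (16)$$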

provides the general form of the SS ANOVA decomposition. (16) decomposes $f$ in the tensor product space $\mathcal{H}^{(1)} \otimes \cdots \otimes \mathcal{H}^{(d)}$ into orthogonal and interpretable components. Which decomposition to use depends on prior knowledge and the purpose of the study. It is more precise to think of SS ANOVA decompositions as a powerful technique rather than as some specific models. See Wahba (1990), Gu and Wahba (1993a), Gu and Wahba (1993b), Wahba et al. (1995), Wang et al. (1995), Wang et al. (1997), Wang (1998a), Wang and Wahba (1998) and references therein for more details. As in classical ANOVA, the model space is usually a subspace containing lower-order components. After a model is chosen, we can regroup and write the model space as
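$$\mathcal{M} = \mathcal{H}^0 \oplus \mathcal{H}^1 \oplus \cdots \oplus \mathcal{H}^p, \qquad (17)$$

where $\mathcal{H}^0$ is a finite dimensional space containing functions which are not going to be penalized, and $\mathcal{H}^j$, $j = 1, \ldots, p$, is an RKHS with reproducing kernel $R^j$. The estimate of $f$ is the minimizer of
$$\frac{1}{n} \sum_{i=1}^n (y_i - L_i f)^2 + \lambda \sum_{j=1}^p \theta_j^{-1} \| P_j f \|^2, \qquad (18)$$

where $P_j$ is the orthogonal projection of $f$ onto $\mathcal{H}^j$ in $\mathcal{M}$. Let $\xi_{ij}(t) = L_{i(\cdot)} R^j(t, \cdot)$ and $\Sigma_\theta = \sum_{j=1}^p \theta_j \{ L_i \xi_{lj} \}_{i,l=1}^n$. The solution to (18) is
$$\hat{f}(t) = \sum_{\nu=1}^M d_\nu \phi_\nu(t) + \sum_{i=1}^n c_i \sum_{j=1}^p \theta_j \xi_{ij}(t), \qquad (19)$$

where $\phi_1, \ldots, \phi_M$ are basis functions of $\mathcal{H}^0$, and $c = (c_1, \ldots, c_n)^T$ and $d = (d_1, \ldots, d_M)^T$ are solutions to the linear system of the single-penalty case with $\Sigma$ replaced by $\Sigma_\theta$. Smoothing parameters can be estimated similarly using the GCV, GML and UBR methods (Wahba, 1990). The Fortran subroutine dmudr.r in RKPACK was developed to solve these equations and estimate the smoothing parameters for $p \ge 2$. In our ASSIST package, the function dmudr serves as an intermediate interface between S and the driver dmudr.r.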

ssr can also be used to fit SS ANOVA models. Basis functions can be specified as before using the formula argument. Reproducing kernels can be specified using the argument rk as a list of expressions.

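Example 7 Consider $d = 2$, $\mathcal{T}_1 = \{1, \ldots, K\}$ and $\mathcal{T}_2 = [0,1]$. Functional data are a typical example of this case (Ramsay and Silverman, 1997). Suppose that we want to model the $t_1$ effect using a one-way ANOVA effect model with $\mathcal{H}^{(1)} = R^K$, and the $t_2$ effect using a linear spline $\mathcal{H}^{(2)} = W_1[0,1]$. Define two projection operators

$$\mathcal{A}_1 f = \frac{1}{K} \sum_{t_1=1}^K f, \qquad \mathcal{A}_2 f = \int_0^1 f \, dt_2. \qquad (20)$$

We have

$$f = (\mathcal{A}_1 + I - \mathcal{A}_1)(\mathcal{A}_2 + I - \mathcal{A}_2) f = \mu + f_1(t_1) + f_2(t_2) + f_{12}(t_1, t_2),$$

with the four components lying in $\{1\}$, $\mathcal{H}^{(1)}_1$, $\mathcal{H}^{(2)}_1$ and $\mathcal{H}^{(1)}_1 \otimes \mathcal{H}^{(2)}_1$ respectively, where $\mathcal{H}^{(k)}_1$ is the orthogonal complement of $\{1\}$ in $\mathcal{H}^{(k)}$.

Here the construction of a polynomial spline based on averaging operators is used. The alternative construction may also be used to derive a similar SS ANOVA decomposition. This SS ANOVA model can be fitted by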
    ssr(y~1, rk=list(shrink1(t1),linear(t2),rk.prod(shrink1(t1),linear(t2))))

where rk.prod is a function in our library calculating the product of two reproducing kernels.
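The product rule behind rk.prod can be illustrated outside S. The sketch below, in plain Python, multiplies two kernels pointwise to obtain the kernel of the interaction space. The kernel formulas are assumptions chosen for illustration: $\delta_{ij} - 1/K$ for mean-zero functions on $\{1, \ldots, K\}$, and the Bernoulli-polynomial form $k_1(s)k_1(t) + k_2(|s-t|)$ for the linear spline. They are not taken from the ASSIST source, and the helper names are hypothetical.

```python
def k1(x):
    """Scaled Bernoulli polynomial of degree 1."""
    return x - 0.5


def k2(x):
    """Scaled Bernoulli polynomial of degree 2."""
    return (k1(x) ** 2 - 1.0 / 12.0) / 2.0


def linear_rk(s, t):
    """Assumed kernel for functions in W_1[0,1] that integrate to zero."""
    return k1(s) * k1(t) + k2(abs(s - t))


def anova_rk(i, j, K):
    """Assumed kernel for mean-zero functions on {1, ..., K}."""
    return (1.0 if i == j else 0.0) - 1.0 / K


def product_rk(x, y, K):
    """Kernel of the tensor product space at x = (i, s) and y = (j, t)."""
    (i, s), (j, t) = x, y
    return anova_rk(i, j, K) * linear_rk(s, t)


K = 3
points = [(1, 0.2), (2, 0.5), (3, 0.9), (1, 0.7)]
# Gram matrix of the interaction kernel at the design points.
gram = [[product_rk(x, y, K) for y in points] for x in points]
for row in gram:
    print([round(v, 4) for v in row])
```

Because each factor kernel is symmetric and positive semi-definite, so is their pointwise product, which is why a product of reproducing kernels is again a reproducing kernel.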

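Suppose that instead of a linear spline, we want to model the $t_2$ effect using a cubic spline $\mathcal{H}^{(2)} = W_2[0,1] = \{1\} \oplus \{t_2 - .5\} \oplus \mathcal{H}^{(2)}_1$.

We have

$$\mathcal{M} = \{1\} \oplus \{t_2 - .5\} \oplus \mathcal{H}^{(1)}_1 \oplus \mathcal{H}^{(2)}_1 \oplus \left( \mathcal{H}^{(1)}_1 \otimes \{t_2 - .5\} \right) \oplus \left( \mathcal{H}^{(1)}_1 \otimes \mathcal{H}^{(2)}_1 \right),$$

where $\mathcal{H}^{(1)}_1$ and $\mathcal{H}^{(2)}_1$ denote the penalized subspaces for $t_1$ and $t_2$.

Since the reproducing kernel of a tensor product space is the product of the reproducing kernels of its factors, this SS ANOVA model can be fitted by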
    ssr(y~I(t2-.5), rk=list(shrink1(t1),cubic(t2),
                            rk.prod(shrink1(t1),kron(t2-.5)),
                            rk.prod(shrink1(t1),cubic(t2))))

where the function kron in our library calculates the reproducing kernel for the space $\{t_2 - .5\}$.

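Example 8 Consider $d = 2$, $\mathcal{T}_1 = [0,1]$ and $\mathcal{T}_2 = [0,1]$, a case with two continuous covariates. If we model both covariates using linear splines, $\mathcal{H}^{(k)} = W_1[0,1] = \{1\} \oplus \mathcal{H}^{(k)}_1$ for $k = 1, 2$, then

$$\mathcal{H}^{(1)} \otimes \mathcal{H}^{(2)} = \{1\} \oplus \mathcal{H}^{(1)}_1 \oplus \mathcal{H}^{(2)}_1 \oplus \left( \mathcal{H}^{(1)}_1 \otimes \mathcal{H}^{(2)}_1 \right).$$

Thus

$$f(t_1, t_2) = \mu + f_1(t_1) + f_2(t_2) + f_{12}(t_1, t_2).$$

This SS ANOVA model can be fitted by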
    ssr(y~1, rk=list(linear(t1),linear(t2),rk.prod(linear(t1),linear(t2))))
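The bookkeeping behind this call can be sketched by expanding the product of per-variable components mechanically. The snippet below is an illustration only: the labels `const` and `smooth` and the function `expand` are hypothetical names, not ASSIST objects. A term is treated as unpenalized only when every factor is a null-space piece.

```python
from itertools import product

# Per-variable components when each covariate is modeled by a linear spline:
# a constant (null space) piece and a penalized smooth piece.
components = {
    "t1": ["const", "smooth"],
    "t2": ["const", "smooth"],
}


def expand(components):
    """Enumerate tensor-product terms and flag the penalized ones."""
    names = list(components)
    terms = []
    for combo in product(*(components[v] for v in names)):
        label = " x ".join(f"{c}({v})" for v, c in zip(names, combo))
        penalized = any(c == "smooth" for c in combo)
        terms.append((label, penalized))
    return terms


terms = expand(components)
for label, penalized in terms:
    print(label, "-> rk list" if penalized else "-> formula")
```

The single unpenalized term corresponds to the intercept in `y~1`, and the three penalized terms correspond to the three entries of the rk list.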

If we want to model both variables using cubic splines

 (21)

We have

SS ANOVA model (21) can be fitted by
    ssr(y~I(t1-.5)+I(t2-.5), rk=list(cubic(t1),cubic(t2),
                                     rk.prod(kron(t1-.5),cubic(t2)),
                                     rk.prod(cubic(t1),kron(t2-.5)),
                                     rk.prod(cubic(t1),cubic(t2))))


For the purpose of model building and inference, one may want to construct Bayesian confidence intervals for combinations of components in the model space (17). Gu and Wahba (1993b) provided formulae to calculate posterior covariances for any combination of components. Denote $f_{0j}$, $j = 1, \ldots, M$, as components in the null space $\mathcal{H}^0$, and $f_{1j}$ as the component in space $\mathcal{H}^j$, $j = 1, \ldots, p$. Let $\delta_{ij}$ be a sequence of 0's and 1's. The utility function predict calculates posterior means and standard deviations for the combination $\sum_{j=1}^M \delta_{0j} f_{0j} + \sum_{j=1}^p \delta_{1j} f_{1j}$. Multiple combinations can be computed simultaneously. For example, after fitting the SS ANOVA model (21) and saving the fit into an object, say ssrfit, one may calculate the posterior means and standard deviations for the smooth-smooth interaction and the total interaction by

    predict(ssrfit,terms=c(0,0,0,0,0,0,0,1))
    predict(ssrfit,terms=c(0,0,0,0,0,1,1,1))

These two statements can be combined into one
    predict(ssrfit,terms=matrix(c(0,0,0,0,0,0,0,1,0,0,0,0,0,1,1,1),
                                nrow=2,byrow=TRUE))
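To make the 0/1 coding concrete, the sketch below maps each indicator row to the components it selects. It assumes, as the two single-vector calls above suggest, that positions follow the order of the terms in the formula (including the intercept) followed by the kernels in the rk list, and that a matrix holds one combination per row. The component labels and the helper `selected` are illustrative only.

```python
# Assumed component order for the cubic-spline model: the three terms
# generated by the formula first, then the five kernels in the rk list.
components = [
    "1", "t1-.5", "t2-.5",
    "cubic(t1)", "cubic(t2)",
    "kron(t1-.5)*cubic(t2)",
    "cubic(t1)*kron(t2-.5)",
    "cubic(t1)*cubic(t2)",
]


def selected(terms_row):
    """Return the names of the components flagged by one 0/1 row."""
    if len(terms_row) != len(components):
        raise ValueError("one indicator is needed per component")
    return [name for name, flag in zip(components, terms_row) if flag == 1]


flat = [0, 0, 0, 0, 0, 0, 0, 1,
        0, 0, 0, 0, 0, 1, 1, 1]
# Row-wise fill: each consecutive block of 8 values is one combination.
n = len(components)
rows = [flat[i:i + n] for i in range(0, len(flat), n)]
for row in rows:
    print(selected(row))
```

The first row picks out the smooth-smooth interaction alone, and the second row picks out all three interaction kernels.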

An object of class "bCI" is returned by this predict function, and the generic function plot can be used to plot these combinations with Bayesian confidence intervals. See the help file of plot.bCI for details. The predict function can also be used to calculate predicted values at any given points.

Yuedong Wang 2004-05-19