
Smoothing Spline ANOVA Models

Consider model ([*]) with $f$ being a function of multiple variables $t_1, \cdots, t_d$, where each variable $t_k$ may itself be a vector. Assume that $t_k \in {\cal T}_k$, where ${\cal T}_k$ is an arbitrary domain. Then $f$ is a function of $\mbox{\boldmath$t$}=(t_1,\cdots,t_d) \in {\cal T}={\cal T}_1 \otimes \cdots \otimes {\cal T}_d$.

Suppose that we want to use the RKHS ${\cal H}^{(k)}=\{ 1^{(k)} \} \oplus {\cal H}_1^{(k)}$ to model the effect of variable $t_k$. Denote $P^k$ as the projection operator onto $\{ 1^{(k)} \}$ in ${\cal H}^{(k)}$. Then

\begin{eqnarray*}
f &=& \left[\prod_{k=1}^d (P^k+I-P^k)\right]f \\
  &=& \sum_{ B \subseteq \{1,\cdots,d \} } \left[ \prod_{k \in B} (I-P^k)
      \prod_{k \in B^c} P^k \right] f \\
  &=& \mu + \sum_{k=1}^d f_k(t_k) + \sum_{k<l}
      f_{kl}(t_k,t_l) + \cdots + f_{1,\cdots,d}(t_1,\cdots,t_d) ,     (15)
\end{eqnarray*}

where the $f_k$'s are main effects, the $f_{kl}$'s are two-factor interactions, and so on.
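For example, when $d=2$, matching terms in (15) identifies the components as

\begin{eqnarray*}
\mu &=& P^1 P^2 f, \\
f_1(t_1) &=& (I-P^1) P^2 f, \\
f_2(t_2) &=& P^1 (I-P^2) f, \\
f_{12}(t_1,t_2) &=& (I-P^1)(I-P^2) f .
\end{eqnarray*}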

Equation (15) is just the simplest form of the so-called SS ANOVA decomposition. The classical ANOVA models are special cases in which all variables are discrete. In general, suppose that we want to use the RKHS ${\cal H}^{(k)}={\cal H}_0^{(k)} \oplus {\cal H}_1^{(k)}=
\{ \phi_1^{(k)} \} \oplus \cdots \oplus \{ \phi_{m_k}^{(k)}\} \oplus {\cal H}_1^{(k)}$ to model the effect of variable $t_k$, where the $\phi_j^{(k)}$'s are basis functions of the null space ${\cal H}_0^{(k)}$. Denote $P_j^k$ as the projection operator onto $\{ \phi_j^{(k)} \}$ in ${\cal H}^{(k)}$. Then expansion of the following equation

$\displaystyle f = [\prod_{k=1}^d (P_1^k+\cdots+P_{m_k}^k+I-\sum_{j=1}^{m_k}P_j^k)] f$     (16)

provides the general form of the SS ANOVA decomposition. Expansion (16) decomposes $f$ in the tensor product space ${\cal H}^{(1)} \otimes \cdots \otimes {\cal H}^{(d)}$ into orthogonal and interpretable components. Which decomposition to use depends on prior knowledge and the purpose of the study. It is more precise to think of SS ANOVA decompositions as a powerful technique rather than as specific models. See Wahba (1990), Gu and Wahba (1993a), Gu and Wahba (1993b), Wahba et al. (1995), Wang et al. (1995), Wang et al. (1997), Wang (1998a), Wang and Wahba (1998) and the references therein for more details. As in classical ANOVA, the model space is usually a subspace containing lower-order components. After a model is chosen, we can regroup and write the model space as
$\displaystyle {\cal H}={\cal H}_0\oplus \sum_{k=1}^p{\cal H}_{k} ,$     (17)

where ${\cal H}_0=\mbox{span} \{\phi_1(\mbox{\boldmath$t$}), \cdots, \phi_M(\mbox{\boldmath$t$}) \}$ is a finite-dimensional space containing functions that are not penalized, and ${\cal H}_{k}$ is an RKHS with reproducing kernel $R_{k} (\mbox{\boldmath$s$}, \mbox{\boldmath$t$})$. The estimate of $f$ is the minimizer of
$\displaystyle \frac{1}{n}\sum_{i=1}^n(y_i-L_if)^2 + \lambda \sum_{k=1}^p
\theta_k^{-1} \vert\vert P_{k}f\vert\vert^2,$     (18)

where $P_k$ is the orthogonal projection operator onto ${\cal H}_{k}$ in ${\cal H}$. Let $\xi_{ki}(\mbox{\boldmath$t$})=P_1 L_{i(.)} R_k(\mbox{\boldmath$t$},.)$ and $\Sigma_k=\{<\xi_{ki}, \xi_{kj}>\}_{i, j=1}^n$. The solution to (18) is
$\displaystyle \hat{f}(\mbox{\boldmath$t$})=\sum_{i=1}^M d_i \phi_i(\mbox{\boldmath$t$}) +\sum_{j=1}^n c_j
(\sum_{k=1}^p\theta_k\xi_{kj}(\mbox{\boldmath$t$})),$     (19)

where $\mbox{\boldmath$c$}$ and $\mbox{\boldmath$d$}$ are solutions to ([*]) with $\Sigma$ replaced by $\sum_{k=1}^p\theta_k\Sigma_k$. Smoothing parameters $\lambda/\theta_1,\cdots,\lambda/\theta_p$ can be estimated similarly using GCV, GML and UBR methods (Wahba, 1990). The Fortran subroutine dmudr.r in RKPACK was developed to solve equations ([*]) and estimate the smoothing parameters for $p \ge 1$. In our ASSIST package, the function dmudr serves as an intermediate interface between S and the driver dmudr.r.
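Note that only the ratios $\lambda/\theta_k$ enter (18); the criterion can therefore be written equivalently with one smoothing parameter per penalized subspace,

\begin{displaymath}
\frac{1}{n}\sum_{i=1}^n(y_i-L_if)^2 + \sum_{k=1}^p
\lambda_k \vert\vert P_{k}f\vert\vert^2 , \qquad \lambda_k=\lambda/\theta_k .
\end{displaymath}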

ssr can also be used to fit SS ANOVA models. Basis functions $\phi_1,\cdots,\phi_M$ can be specified as before using the formula argument. Reproducing kernels $R_1(\mbox{\boldmath$s$},\mbox{\boldmath$t$}),\cdots,R_p(\mbox{\boldmath$s$},\mbox{\boldmath$t$})$ can be specified using the argument rk as a list of expressions.
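The following is a minimal self-contained sketch of this interface on simulated data (the data and object names are hypothetical; the spar argument, with values "v", "m" and "u" selecting GCV, GML and UBR respectively, is an assumption about the ssr interface rather than something stated above):

    ## two continuous covariates, additive model with two linear-spline terms
    set.seed(100)
    t1 <- runif(100)
    t2 <- runif(100)
    y  <- sin(2*pi*t1) + t2^2 + 0.3*rnorm(100)
    ## basis functions enter through the formula, reproducing kernels through rk;
    ## spar="m" (assumed argument) requests GML instead of the default GCV
    fit <- ssr(y~1, rk=list(linear(t1), linear(t2)), spar="m")
Interaction terms are added to the rk list with rk.prod, as in the examples below.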

Example 7 Consider $d=2$, $t_1 \in {\cal T}_1=\{ 1, \cdots, K \}$ and $t_2 \in {\cal T}_2 = [0,1]$. Functional data are a typical example of this case (Ramsay and Silverman, 1997). Suppose that we want to model the $t_1$ effect using a one-way ANOVA effects model with ${\cal H}^{(1)}=R^K=\{ 1 \} \oplus \{ g:~\sum_{t_1=1}^K g(t_1) = 0 \}$, and the $t_2$ effect using a linear spline with ${\cal H}^{(2)} = W_1([0,1])=\{ 1 \} \oplus
\{ g \in W_1([0,1]):~\int_0^1 g(t_2) dt_2=0 \}$. Define two projection operators

\begin{eqnarray*}
P_1^1f &=& \sum_{t_1=1}^K f(t_1,t_2) / K, \\
P_1^2f &=& \int_0^1 f(t_1,t_2) dt_2.
\end{eqnarray*}

Then (16) leads to
$\displaystyle f(t_1,t_2) = \mu + s_1(t_1) + s_2(t_2) + ss_{12}(t_1,t_2).$     (20)

We have

\begin{eqnarray*}
{\cal H}_0&=&\mbox{span} \{ 1 \}, \\
R_1((s_1,s_2), (t_1,t_2) )&=& I_{[s_1=t_1]}-1/K, \\
R_2((s_1,s_2), (t_1,t_2) )&=& k_1(s_2) k_1(t_2) + k_2(s_2-t_2), \\
R_3((s_1,s_2), (t_1,t_2) )&=& [I_{[s_1=t_1]}-1/K][k_1(s_2) k_1(t_2) + k_2(s_2-t_2)] .
\end{eqnarray*}

Construction ([*]) of a polynomial spline is used here; construction ([*]) may also be used to derive a similar SS ANOVA decomposition. The SS ANOVA model (20) can be fitted by
   ssr(y~1, rk=list(shrink1(t1),linear(t2),rk.prod(shrink1(t1),linear(t2))))
where rk.prod is a function in our library calculating the product of two reproducing kernels.
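For illustration, here is a minimal simulated-data sketch of this fit (the data, sample sizes and object names are hypothetical and only meant to show the call in context):

    set.seed(1)
    K  <- 4                                  # levels of the discrete variable t1
    m  <- 50                                 # design points of t2 per level
    t1 <- rep(1:K, each=m)
    t2 <- rep(seq(0, 1, length=m), K)
    f  <- 0.5*(t1-(K+1)/2) + sin(2*pi*t2) + 0.3*(t1-(K+1)/2)*cos(2*pi*t2)
    y  <- f + 0.2*rnorm(K*m)
    fit <- ssr(y~1, rk=list(shrink1(t1), linear(t2),
                            rk.prod(shrink1(t1), linear(t2))))
The three elements of the rk list correspond to $s_1$, $s_2$ and $ss_{12}$ in (20).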

Suppose that instead of a linear spline, we want to model the $t_2$ effect using a cubic spline

\begin{displaymath}
{\cal H}^{(2)} = W_2([0,1])=\{ 1 \} \oplus \{ k_1 \} \oplus
\{ g \in W_2([0,1]):~\int_0^1 g(t_2) dt_2=\int_0^1 g^{\prime}(t_2) dt_2=0 \}.
\end{displaymath}

Define an additional projection operator

\begin{displaymath}
P_2^2f = [\int_0^1 (\partial f / \partial t_2) dt_2] k_1 .
\end{displaymath}

Then (16) leads to

\begin{eqnarray*}
f= \mu + s_1(t_1) + \beta k_1(t_2) + s_2(t_2)+sl_{12}(t_1,t_2) +ss_{12}(t_1,t_2) .
\end{eqnarray*}

We have

\begin{eqnarray*}
{\cal H}_0&=&\mbox{span} \{ 1, k_1(t_2) \}, \\
R_1((s_1,s_2), (t_1,t_2) )&=& I_{[s_1=t_1]}-1/K, \\
R_2((s_1,s_2), (t_1,t_2) )&=& k_2(s_2) k_2(t_2) - k_4(s_2-t_2), \\
R_3((s_1,s_2), (t_1,t_2) )&=& (I_{[s_1=t_1]}-1/K) k_1(s_2) k_1(t_2), \\
R_4((s_1,s_2), (t_1,t_2) )&=& (I_{[s_1=t_1]}-1/K)[k_2(s_2) k_2(t_2) - k_4(s_2-t_2)] .
\end{eqnarray*}

Since $k_1(t_2)=t_2-.5$, this SS ANOVA model can be fitted by
    ssr(y~I(t2-.5), rk=list(shrink1(t1),cubic(t2),
                            rk.prod(shrink1(t1),kron(t2-.5)),
                            rk.prod(shrink1(t1),cubic(t2))))
where the function kron in our library calculates the reproducing kernel for the space $\mbox{span} \{ k_1(t_2) \}$.

Example 8 Consider $d=2$, $t_1 \in {\cal T}_1=[0,1]$ and $t_2 \in {\cal T}_2 = [0,1]$, a case with two continuous covariates. If we model both covariates using linear splines

\begin{eqnarray*}
{\cal H}^{(1)} &=& W_1([0,1])=\{ 1 \} \oplus
\{ g \in W_1([0,1]):~\int_0^1 g(t_1) dt_1=0 \}, \\
{\cal H}^{(2)} &=& W_1([0,1])=\{ 1 \} \oplus
\{ g \in W_1([0,1]):~\int_0^1 g(t_2) dt_2=0 \},
\end{eqnarray*}

then (16) leads to

\begin{displaymath}
f(t_1,t_2) = \mu + s_1(t_1) + s_2(t_2) + ss_{12}(t_1,t_2).
\end{displaymath}

Thus

\begin{eqnarray*}
{\cal H}_0&=&\mbox{span} \{ 1 \}, \\
R_1((s_1,s_2), (t_1,t_2) )&=& k_1(s_1) k_1(t_1) + k_2(s_1-t_1), \\
R_2((s_1,s_2), (t_1,t_2) )&=& k_1(s_2) k_1(t_2) + k_2(s_2-t_2), \\
R_3((s_1,s_2), (t_1,t_2) )&=& [k_1(s_1) k_1(t_1) + k_2(s_1-t_1)][k_1(s_2) k_1(t_2) + k_2(s_2-t_2)].
\end{eqnarray*}

This SS ANOVA model can be fitted by
    ssr(y~1, rk=list(linear(t1),linear(t2),rk.prod(linear(t1),linear(t2))))
If we want to model both variables using cubic splines

\begin{eqnarray*}
{\cal H}^{(1)} &=& W_2([0,1])=\{ 1 \} \oplus \{ k_1 \} \oplus
\{ g \in W_2([0,1]):~\int_0^1 g(t_1) dt_1=\int_0^1 g^{\prime}(t_1) dt_1=0 \}, \\
{\cal H}^{(2)} &=& W_2([0,1])=\{ 1 \} \oplus \{ k_1 \} \oplus
\{ g \in W_2([0,1]):~\int_0^1 g(t_2) dt_2=\int_0^1 g^{\prime}(t_2) dt_2=0 \},
\end{eqnarray*}

then (16) leads to
$\displaystyle f(t_1,t_2)= \mu + \alpha k_1(t_1) + \beta k_1(t_2) + s_1(t_1) + s_2(t_2)
+ ls_{12}(t_1,t_2) + sl_{12}(t_1,t_2) + ss_{12}(t_1,t_2).$     (21)

We have

\begin{eqnarray*}
{\cal H}_0&=&\mbox{span} \{ 1, k_1(t_1), k_1(t_2) \}, \\
R_1((s_1,s_2), (t_1,t_2) )&=& k_2(s_1) k_2(t_1) - k_4(s_1-t_1), \\
R_2((s_1,s_2), (t_1,t_2) )&=& k_2(s_2) k_2(t_2) - k_4(s_2-t_2), \\
R_3((s_1,s_2), (t_1,t_2) )&=& k_1(s_1) k_1(t_1) [k_2(s_2) k_2(t_2) - k_4(s_2-t_2)], \\
R_4((s_1,s_2), (t_1,t_2) )&=& [k_2(s_1) k_2(t_1) - k_4(s_1-t_1)] k_1(s_2) k_1(t_2), \\
R_5((s_1,s_2), (t_1,t_2) )&=& [k_2(s_1) k_2(t_1) - k_4(s_1-t_1)][k_2(s_2) k_2(t_2) - k_4(s_2-t_2)] .
\end{eqnarray*}

The SS ANOVA model (21) can be fitted by
   ssr(y~I(t1-.5)+I(t2-.5), rk=list(cubic(t1),cubic(t2),
                                    rk.prod(kron(t1-.5),cubic(t2)),
                                    rk.prod(cubic(t1),kron(t2-.5)),
                                    rk.prod(cubic(t1),cubic(t2))))
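As an end-to-end sketch (simulated data; the object name ssrfit matches the one used in the discussion below, everything else is hypothetical):

    set.seed(10)
    n  <- 200
    t1 <- runif(n)
    t2 <- runif(n)
    f  <- sin(2*pi*t1) + t2^2 + 0.5*sin(2*pi*t1)*cos(2*pi*t2)
    y  <- f + 0.2*rnorm(n)
    ## M=3 null-space terms (1, k1(t1), k1(t2)) and p=5 reproducing kernels,
    ## giving the 8 components referred to in the next paragraph
    ssrfit <- ssr(y~I(t1-.5)+I(t2-.5), rk=list(cubic(t1),cubic(t2),
                                               rk.prod(kron(t1-.5),cubic(t2)),
                                               rk.prod(cubic(t1),kron(t2-.5)),
                                               rk.prod(cubic(t1),cubic(t2))))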

For the purpose of model building and inference, one may want to construct Bayesian confidence intervals for combinations of components in the model space (17). Gu and Wahba (1993b) provided formulae for calculating posterior covariances of any combination of components. Denote by $f_{0v}(\mbox{\boldmath$t$}) = d_v \phi_v(\mbox{\boldmath$t$}),~v=1,\cdots,M$, the $M$ components in the null space ${\cal H}_0$, and by $f_{k}(\mbox{\boldmath$t$})=\sum_{i=1}^n c_i\theta_{k}\xi_{ki}(\mbox{\boldmath$t$})$ the component in the space ${\cal H}_k$, $k=1,\cdots,p$. Let $\delta_j,~j=1,\cdots,(M+p)$, be a sequence of 0's and 1's. The utility function predict calculates posterior means and standard deviations for the combination $\sum_{i=1}^M \delta_i f_{0i}+\sum_{k=1}^p \delta_{M+k}f_{k}$. Multiple combinations can be computed simultaneously. For example, after fitting the SS ANOVA model (21) and saving the result into an object, say ssrfit, one may calculate the posterior means and standard deviations of the smooth-smooth interaction $ss_{12}$ and of the total interaction $ls_{12}(t_1,t_2) + sl_{12}(t_1,t_2) + ss_{12}(t_1,t_2)$ by

    predict(ssrfit,terms=c(0,0,0,0,0,0,0,1))
    predict(ssrfit,terms=c(0,0,0,0,0,1,1,1))
These two statements can be combined into one
    predict(ssrfit,terms=matrix(c(0,0,0,0,0,0,0,1,0,0,0,0,0,1,1,1),
                                ncol=2,byrow=T))
An object of class "bCI" is returned by this predict function, and the generic function plot can be used to plot these combinations with Bayesian confidence intervals. See the help file of plot.bCI for details. The predict function can also be used to calculate predicted values at any given points.
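For example, continuing with ssrfit above, one might evaluate and plot a combination as follows (a hedged sketch: the grid is hypothetical, and the newdata argument is assumed to follow the usual S predict convention rather than being documented here):

    ## posterior mean and standard deviation of the total interaction of (21)
    p.int <- predict(ssrfit, terms=c(0,0,0,0,0,1,1,1))
    plot(p.int)             # plot.bCI draws the Bayesian confidence intervals
    ## predicted values of the full surface f on a new grid of points
    grid <- expand.grid(t1=seq(0,1,length=20), t2=seq(0,1,length=20))
    predict(ssrfit, newdata=grid, terms=rep(1,8))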

