next up previous
Next: Generalized Smoothing Spline Models Up: Smoothing Spline Regression Models Previous: Smoothing Spline ANOVA Models

Spline Smoothing with Correlated Random Errors and/or Unequal Weights

So far we have assumed that random errors are independent with equal variances. In this section, we consider model ([*]) with model space for $f$ given in ([*]), and $\mbox{\boldmath$\epsilon$}=(\epsilon_1, \cdots,
\epsilon_n)^T \sim N (0, \sigma^2 W^{-1})$. When $W$ is completely known, we transform the data into iid case and fit as in previous sections. In the following we assume that $W$ depends on a parsimonious set of parameters $\mbox{\boldmath$\tau$}$.

We estimate $f$ as the minimizer of the following penalized weighted least squares

$\displaystyle (\mbox{\boldmath$y$}-\mbox{\boldmath$f$})^T W (\mbox{\boldmath$y$...
...oldmath$f$})+n\lambda \sum_{k=1}^p
\theta_k^{-1} \vert\vert P_{k}f\vert\vert^2.$     (22)

Again, the solution has the form ([*]) (Wang, 1998b).

Correlated random errors may have a profound effect on methods for selecting smoothing parameters such as GCV, GML and UBR (Wang, 1998b). Extensions of GCV, GML and UBR methods with correlated random errors were developed in Wang (1998b). It was found that the extended GML method has the best overall performance. Therefore, we concentrate on the extended GML method here. Fixing $\lambda=1$, the extended GML method estimates smoothing parameters $\mbox{\boldmath$\theta$}$ in ([*]) together with the variance-covariance parameters $\mbox{\boldmath$\tau$}$ as the minimizers of

$\displaystyle GML(\mbox{\boldmath$\theta$},\mbox{\boldmath$\tau$})=\frac{\displ...
...}}
{\displaystyle[{\mbox det}^+
(W(I-A(\mbox{\boldmath$\theta$})))]^{1/(n-M)}}.$     (23)

To compute the minimizers of ([*]), we use the connection between smoothing spline models and linear mixed-effects models (LMM) (Opsomer et al., 2001; Wang, 1998b). Let $\Sigma_k=Z_k Z_k^T$, where $Z_k$ is a $n\times
m_k$ matrix with $m_k=\mbox{rank}(\Sigma_k)$. Consider the following LMM

$\displaystyle \mbox{\boldmath$y$}=T\mbox{\boldmath$d$}+ \sum_{k=1}^p Z_k \mbox{\boldmath$b$}_k + \mbox{\boldmath$\epsilon$},$     (24)

where $\mbox{\boldmath$b$}_k$'s are random effects, $\mbox{\boldmath$b$}_k\sim N(0, \sigma^2 \theta_k I_{m_k}/n)$ with $I_{m_k}$ being the identity matrix of order $m_k$, and $\mbox{\boldmath$b$}_k$ are mutually independent and are independent of $\mbox{\boldmath$\epsilon$}$. It is easy to show that the restricted maximum likelihood (REML) estimate of the variance components $\mbox{\boldmath$\theta$}$ and $\mbox{\boldmath$\tau$}$ in model ([*]) are the minimizers of ([*]). Gu and Wahba (1991) noticed this connection and indicated that RKPACK can be used to fit LMMs ([*]) with independent random errors. Here the opposite is done: we use software for LMMs to fit smoothing spline models with correlated random errors. We first calculate $Z_k$ through Choleski decomposition using the function chol.new. Then we calculate the GML (REML) estimates of $\mbox{\boldmath$\theta$}$ and $\mbox{\boldmath$\tau$}$ using the function lme in the nlme library (Pinheiro and Bates, 2000). Finally we transform the data and call dsidr.r in RKPACK to calculate $\mbox{\boldmath$c$}$ and $\mbox{\boldmath$d$}$ in ([*]) with the smoothing parameters fixed at these GML estimates. All these steps are performed internally.

Two options in ssr can be used to specify the correlation and variance structures for Gaussian data. The first option, correlation, specifies the correlation structure of random errors. It inputs a corMatrix class as in nlme. Available correlation structures include corSymm, corAR1, corCAR1, corARMA, corExp, corLin, corGaus, corSpher, corSpatial. See the document of nlme (Pinheiro and Bates, 2000) for a complete list. New structures can be added through facilities provided in nlme. The following statement fits a thin plate spline with an exponential correlation structure

   ssr(y~t1+t2, rk=tp.pseudo(cbind(t1,t2)), corr=corExp(form=~t1+t2), 
       spar=``m'')

The second option, weights, specifies the variance structure for the variance function of the random errors in the form of a varFunc class as in nlme. Available varFunc classes include varFixed, varIdent, varPower, varExp and varComb. See Pinheiro and Bates (2000) for details. New varFunc classes representing user-defined variance functions can be added. For example, the following statement fits a cubic spline with fixed weights (assuming a vector of w has been created)

   ssr(y~t, rk=cubic(t), weights=w)


next up previous
Next: Generalized Smoothing Spline Models Up: Smoothing Spline Regression Models Previous: Smoothing Spline ANOVA Models
Yuedong Wang 2004-05-19