Next: The ssr Function Up: General Smoothing Spline Regression Previous: General Smoothing Spline Regression


Model and Estimation

The general smoothing spline regression (SSR) model with one variable assumes that (Wahba, 1990)

$\displaystyle y_i=L_i f + \epsilon_i, \ \ \ \ i=1, \cdots, n,$     (1)

where the $y_i$'s are univariate responses; $f$ is an unknown function of an independent variable $t$, with $t$ belonging to an arbitrary domain ${\cal T}$ and $f\in{\cal H}$, a given reproducing kernel Hilbert space (RKHS); the $L_i$'s are bounded linear functionals on ${\cal H}$; and the $\epsilon_i$'s are random errors with $\epsilon_i\stackrel{iid}{\sim} \mbox{N}(0, \sigma^2)$. Note that $t$ may be a vector. In most applications, the $L_i$'s are evaluation functionals at design points: $L_if=f(t_i)$.
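
To make model (1) concrete, the following minimal Python sketch simulates data in the common case where the $L_i$'s are evaluation functionals. The function name `simulate_ssr_data`, the test function $\sin(2\pi t)$, the design on $[0,1]$ and the noise level are all assumptions chosen for illustration; none of them comes from the ASSIST package.

```python
import numpy as np

def simulate_ssr_data(f, t, sigma, seed=0):
    """Draw y_i = f(t_i) + eps_i with eps_i iid N(0, sigma^2).

    Hypothetical helper for illustration: the L_i are taken to be
    evaluation functionals, L_i f = f(t_i).
    """
    rng = np.random.default_rng(seed)
    t = np.asarray(t, dtype=float)
    eps = rng.normal(loc=0.0, scale=sigma, size=t.shape)
    return f(t) + eps

# Assumed example: n = 50 equally spaced design points on [0, 1].
t = np.linspace(0.0, 1.0, 50)
y = simulate_ssr_data(lambda u: np.sin(2.0 * np.pi * u), t, sigma=0.1)
```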

Suppose that

$\displaystyle {\cal H}={\cal H}_0\oplus{\cal H}_1,$     (2)

where ${\cal H}_0$ is a finite dimensional space with basis functions $\phi_1(t), \cdots, \phi_M(t)$, and ${\cal H}_1$ is an RKHS with reproducing kernel $R_1 (s, t)$. See Aronszajn (1950) and Wahba (1990) for more information about RKHS. The estimate of $f$, $\hat{f}_{\lambda}$, is the minimizer of the following penalized least squares criterion

$\displaystyle \frac{1}{n}\sum_{i=1}^n(y_i-L_if)^2 + \lambda \vert\vert P_1f\vert\vert^2,$     (3)

where $P_1$ is the orthogonal projection operator onto ${\cal H}_1$ in ${\cal H}$, and $\lambda$ is a smoothing parameter controlling the balance between the goodness-of-fit, measured by the least squares term, and the departure from the null space ${\cal H}_0$, measured by $\vert\vert P_1f\vert\vert^2$. Note that functions in ${\cal H}_0$ are not penalized.
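
As an illustration, consider a standard special case described in Wahba (1990), the cubic smoothing spline on ${\cal T}=[0,1]$ with evaluation functionals. This choice is an assumption made here for concreteness and is only one of many possible decompositions (2). Take ${\cal H}$ to be the space of functions on $[0,1]$ with absolutely continuous $f$ and $f'$ and square integrable $f''$, with inner product $f(0)g(0)+f'(0)g'(0)+\int_0^1 f''(u)g''(u)\,du$. Then

$\displaystyle {\cal H}_0=\mbox{span}\{1, t\}, \ \ \ \ R_1(s,t)=\int_0^1 (s-u)_+(t-u)_+\,du, \ \ \ \ \vert\vert P_1f\vert\vert^2=\int_0^1 (f''(u))^2\,du,$

where $(x)_+=\max(x,0)$, so that $M=2$, $\phi_1(t)=1$, $\phi_2(t)=t$, and the criterion (3) reduces to

$\displaystyle \frac{1}{n}\sum_{i=1}^n(y_i-f(t_i))^2 + \lambda \int_0^1 (f''(u))^2\,du .$

Evaluating the integral gives the closed form $R_1(s,t)=a^2(3b-a)/6$ with $a=\min(s,t)$ and $b=\max(s,t)$.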

Let $\mbox{\boldmath$y$}=(y_1,\cdots,y_n)^T$. Define $\xi_i(t)= L_{i(\cdot)} R_1 (t,\cdot)$, $T_{n\times M}={\{L_i\phi_v\}_{i=1}^n}_{v=1}^M$ and $\Sigma=\{\langle\xi_i, \xi_j\rangle\}_{i, j=1}^n$. Given $\lambda$, the solution to (3) has the form (Wahba, 1990)

$\displaystyle \hat{f}_{\lambda}(t)=\sum_{i=1}^M d_i \phi_i(t) +\sum_{j=1}^n c_j \xi_j(t),$     (4)

where the coefficients $\mbox{\boldmath$d$}=(d_1, \cdots, d_M)^T$ and $\mbox{\boldmath$c$}=(c_1, \cdots, c_n)^T$ are solutions to

$\displaystyle (\Sigma + n\lambda I)\mbox{\boldmath$c$}+ T\mbox{\boldmath$d$} = \mbox{\boldmath$y$},$
$\displaystyle T^T\mbox{\boldmath$c$} = 0 .$     (5)
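
The following Python sketch shows one way to form $T$ and $\Sigma$, solve the system (5) for a fixed $\lambda$, and evaluate the estimate (4). It rests on several assumptions that are not part of the text above: the cubic spline setting on $[0,1]$ with ${\cal H}_0=\mbox{span}\{1,t\}$ and $R_1(s,t)=\int_0^1 (s-u)_+(t-u)_+\,du$, evaluation functionals $L_if=f(t_i)$ (so that $\xi_i(t)=R_1(t_i,t)$ and $\Sigma_{ij}=R_1(t_i,t_j)$), and a direct solve of the stacked linear system. The function names `cubic_r1`, `fit_ssr` and `predict_ssr` are hypothetical, and the code is not a description of how RKPACK or ASSIST carry out the computation.

```python
import numpy as np

def cubic_r1(s, t):
    """Assumed kernel R_1(s,t) = int_0^1 (s-u)_+ (t-u)_+ du on [0,1],
    in closed form a^2 (3b - a) / 6 with a = min(s,t), b = max(s,t)."""
    a = np.minimum(s, t)
    b = np.maximum(s, t)
    return a**2 * (3.0 * b - a) / 6.0

def fit_ssr(t, y, lam):
    """Solve (Sigma + n*lam*I) c + T d = y, T^T c = 0 for (c, d)."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    n = t.size
    T = np.column_stack([np.ones(n), t])          # T[i, v] = phi_v(t_i), M = 2
    Sigma = cubic_r1(t[:, None], t[None, :])      # Sigma[i, j] = R_1(t_i, t_j)
    M = T.shape[1]
    # Stack both equations into one (n + M) x (n + M) linear system.
    A = np.block([[Sigma + n * lam * np.eye(n), T],
                  [T.T, np.zeros((M, M))]])
    rhs = np.concatenate([y, np.zeros(M)])
    sol = np.linalg.solve(A, rhs)
    return sol[:n], sol[n:]                       # c, d

def predict_ssr(t_new, t, c, d):
    """Evaluate f_hat(t) = d_1 + d_2 t + sum_j c_j R_1(t_j, t)."""
    t_new = np.asarray(t_new, dtype=float)
    t = np.asarray(t, dtype=float)
    Xi = cubic_r1(t_new[:, None], t[None, :])     # Xi[k, j] = xi_j(t_new[k])
    return d[0] + d[1] * t_new + Xi @ c

# Assumed example data: noisy sine curve at 30 design points.
rng = np.random.default_rng(0)
t_obs = np.linspace(0.02, 0.98, 30)
y_obs = np.sin(2.0 * np.pi * t_obs) + 0.1 * rng.normal(size=t_obs.size)
c_hat, d_hat = fit_ssr(t_obs, y_obs, lam=1e-4)
f_hat = predict_ssr(np.linspace(0.0, 1.0, 101), t_obs, c_hat, d_hat)
```

Under these assumptions the first equation of (5) implies that the fitted values at the design points equal $\mbox{\boldmath$y$}-n\lambda\mbox{\boldmath$c$}$, which provides a simple check on the solution. The direct solve is chosen only for brevity; it requires $\lambda>0$ and at least two distinct design points so that the stacked matrix is nonsingular.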

The Fortran subroutine dsidr.r in RKPACK was developed to solve equations (5) (Gu, 1989). In our ASSIST package, the S function dsidr serves as an intermediate interface between S and the driver dsidr.r.


Yuedong Wang 2004-05-19