The dataset, downloaded from the website
`http://www.psych.mcgill.ca/faculty/ramsay/` `fda.html`,
includes mean `monthly temperatures` at 35 Canadian weather `stations`.
The left panel in Figure shows the plots of mean
`temperature` from 4 selected weather `stations`.

Ramsay and Silverman (1997) and Ramsay and Li (1998) used this dataset to illustrate the usefulness of Functional Data Analysis (FDA). Among others, they addressed the following two questions: how to measure the departure from a sinusoidal model and how to register the curves for further analyses. In the following, we approach these questions with SNM models.

We assume that mean `temperatures` over `month` at all weather
`stations` share a common shape function. Variation between
`stations` are modeled by vertical shift, vertical scale and
horizontal shift parameters. That is, we assume that

where is the

> canada.fit <- snm(temp~b1+exp(b2)*f(month-alogit(b3)), func=f(u)~list(~sin(2*pi*u)+cos(2*pi*u)-1,lspline(u,type=''sine0'')), fixed=list(b1~1), random=list(b1+b2+b3~1), cor=corAR1(), groups=~station, data=canadaTemp.dat, control=list(rkpk.control=list(limnla=c(-4,0)),prec.out=0.005)) > summary(canada.fit) ... AIC BIC logLik 1529.772 1607.5 -745.6426 Random effects: Formula: list(b1 ~ 1, b2 ~ 1, b3 ~ 1) Level: station Structure: General positive-definite, Log-Cholesky parametrization StdDev Corr b1 5.94427125 b1 b2 b2 0.30031948 -0.745 b3 0.06308751 -0.013 -0.462 Residual 1.48044625 Correlation Structure: AR(1) Formula: ~1 | station Parameter estimate(s): Phi 0.7258356 Fixed effects: list(b1 ~ 1) Value Std.Error DF t-value p-value b1 1.690498 0.6188728 385 2.731576 0.0066 GCV estimate(s) of smoothing parameter(s): 0.0001000000 Equivalent Degrees of Freedom (DF): 10.24364 Converged after 5 iterationsThe resulting fits are plotted in the left panel of Figure for the four selected stations. Fits for other stations are similar. The estimated correlation coefficient is with an approximate confidence interval . Thus there is evidence that serial correlation does exist in the data. Figure shows the overall fit of the common curve and its projections on subspaces (sinusoidal part) and (remaining part) together with their Bayesian confidence intervals. The small departure from sinusoidal form is statistically significant, especially in Spring and Fall. Our conclusion here is comparable to that of Ramsay and Silverman (1997), but their approach is on individual stations and does not provide a formal test.

By removing the horizontal shift in each curve, we can easily align all curves. Therefore our method can be used for curve registration.

grid <- NULL for (i in 1:35) { grid <- c(grid,seq(0,1,len=40)+ alogit(canada.fit$coef$random$station[70+i])+.5) } p.canada.2 <- predict(canada.fit, newdata=data.frame(month=grid, station=rep(1:35,rep(40,35))))The right panel in Figure displays the fitted curves for all 35 stations after aligning them together by removing shifts.

**Below is new**

Ramsay and Silverman (1997) also fitted various functional linear
models (FLM) to this data set. The functional data can appear in the FLM
as (i) the dependent variable, (ii) the independent variable, or
(iii) both. We now show that those FLM are special cases of
general smoothing spline models. Therefore they can be fitted by
the `ssr` function.

To investigate geographical differences in the pattern of the
annual tempreture function, Ramsay and Silverman (1997)
divided Canada into four meteorological zones: Atlantic, Pacific,
Continental and Arctic. Then they considered the following FLM
(Model (9.1) in Ramsay and Silverman (1997))

where and correspond to climate zones Atlantic, Pacific, Continental and Arctic respectively, is the th

Now consider expected temperature as a function of both `zone`
and `time` :
.
Suppose that we want to model `zone` effect using an
one-way ANOVA model and `time` effect using a periodic spline. Then
we have the following SS ANOVA decomposition

where is a constant, and are the main effects of

> temps <- matrix(scan(``monthtemp.dat'',0), 35, 12, byrow=T) > atlindex <- c(1,2,3,4,5,6,7,8,9,10,11,13,14,16) > pacindex <- c(25,26,27,28,29) > conindex <- c(12,15,17,18,19,20,21,22,23,24,30,31,35) > artindex <- c(32,33,34) > t <- seq(0.5, 11.5, 1)/12 > tgrid <- seq(0,1,len=50) > x <- apply(temps,1,sum) > n <- 35*12 > temp <- as.vector(t(temps[c(atlindex,pacindex,conindex,artindex),])) > grp <- as.factor(rep(1:4,12*c(14,5,13,3))) > tempdata <- data.frame(temp=temp,t=rep(t,35),grp=grp) > canada4 <- ssr(temp ~ grp, spar="m",data=tempdata, rk=list(periodic(t),rk.prod(periodic(t),shrink1(grp)))) > summary(canada4) Smoothing spline regression fit by GML method Call: ssr(formula = temp ~ grp, rk = list(periodic(t), rk.prod(periodic(t), shrink1(grp))), data = tempdata, spar = "m") Coefficients (d): (Intercept) grp2 grp3 grp4 4.787500 2.809167 -5.204167 -16.651389 GML estimate(s) of smoothing parameter(s) : 0.2622870 0.9971152 Equivalent Degrees of Freedom (DF): 23.74366 Estimate of sigma: 3.762337 Number of Observations: 420 > grid <- list(grp=as.factor(rep(1:4,rep(50,4))),t=rep(tgrid,4)) > p.canada4 <- predict(canada4,newdata=expand.grid(t=tgrid,grp=as.factor(1:4)), terms=c(1,1,1,1,0,1))

Figures and display the estimated zone effects and the fits for four regions. They are very similar to Figures 9.1 and 9.2 in Ramsay and Silverman (1997). In addition to the estimates, we can provide confidence intervals.

To investigate the possible replationship between total annual
precipitation and the temperature function, Ramsay and Silverman (1997)
considered the following FLM (Model (10.3))

where is the logarithm of total annual precipitation at station , is a constant parameter, is the temperature function at station , is a unknown weight function, and are random errors. The goal is to estimate the weight function . Model () is an example of case (ii).

Without loss of the generality, suppose that the time interval has been transformed into . That it, . It is reasonable to assume that is a smooth periodic function. Specifically, we model using cubic periodic spline space . Define linear functionals . are bounded when which we assume to be true. Then model () is a partial spline model.

Denote
where
.
Let be the reproducing kernel (RK) matrix for . From the
definition in Section , the th elements of the matrix

where is the RK of , and represnet middle point of month , is the temperature of month at station , , and . We used simple summations to approximate the integrals. More accurate approximations can be used. We also ignored a scaling constant () since it can be absorbed by the smoothing parameter.

> prec <- matrix(scan(``monthprec.dat'',0), 35, 12, byrow=T) > y <- log(apply(prec,1,sum)) > R <- temps %*% periodic(t) %*% t(temps) > canada5 <- ssr(y ~ x, rk=R) > summary(canada5) Smoothing spline regression fit by GML method Call: ssr(formula = y ~ x, rk = R, spar = "m") Coefficients (d): (Intercept) x 6.433069274 0.006122233 GML estimate(s) of smoothing parameter(s) : 0.006772994 Equivalent Degrees of Freedom (DF): 4.062615 Estimate of sigma: 0.3588121 Number of Observations: 35

The estimated weight function is represented by

where is an estimate of . To compute at grid points of size ,

where is a vector of 1's of size , , , is the RK of , is a matrix of evaluated at all combinations of and , is a matrix with each row consists monthly tempretures at each station, and .

Bayesian confidence intervals are not available for non-evaluational functionals. Therefore we use the following bootstrap procedure to compute confidence intervals. We used simple percentile intervals. Other approaches may be used (Wang and Wahba, 1995).

> S <- periodic(tgrid,t)%*%t(temps) > f <- canada5$coef$d[2] + S%*%canada5$coef$c > nboot <- 999 > fb <- NULL > for (i in 1:nboot) { yb <- canada5$fit+sample(canada5$resi,35,replace=T) bfit <- ssr(yb ~ x, rk=R) fb <- cbind(fb,bfit$coef$d[2] + S%*%bfit$coef$c) } > lb <- apply(fb,1,quantile,prob=.025) > ub <- apply(fb,1,quantile,prob=.975)

We used the default GCV method to select the smoothing parameter in the above computation. We repeated the computation with the GML method to select the smoothing parameter. Figure displays the estimated regression weight functions. It is interesting to note that the estimates with GCV and GML choices of the smoothing parameter are similar to the upper left plot and the lower right plot in Figure 10.3 of Ramsay and Silverman (1997). As indicated in Figure 10.6 of Ramsay and Silverman (1997) that it is difficult to get a precise estimate of the smoothing parameter due to the small sample size. Interestingly that the GCV method picks an estimate of the smoothing parameter that is close to the lower bound and the GML method picks an estimate that is close to the upper bound. Wide confidence intervals are the consequence of the small sample size ().

To investigate the dependence of the complete log precipitation
profile on the complete tempreture profile,
Ramsay and Silverman (1997) considered the following FLM

where is the logarithm of annual precipitation profile at station , plays the part of an intercept in the standard regression, is the temperature function at station , is a unknown weight function at time , and are random errors processes. The goal is to estimate and . Model () is an example of case (iii).

The expected annual precipitation profile depends on two non-parametric
functions: and .
Again, we assume that , and and are
smooth periodic functions. Specifically, we assume that
,
and
. Equivalently,
we use the following SS ANOVA decomposition for and :

(92) |

Again, we discretize the annual precipitation profile by 12 months.
Model () becomes

where is the logarithm of precipitation in month at station , , , and . For simplicity, we assume random errors are independent. Models with correlated random errors can be fitted similarly.

Let
,
,
,
,
,

,
and
.

Then model () can be written in a matrix form

We have two parametric terms, and , and four non-parametric terms, , , and , in model (). To fit this model, we need to find RK matrices for four non-parametric terms. It is not difficult to see that the RK matrices for , and are , , , where is the RK of , , represents a matrix of evaluated at all combinations of , and is a matrix defined in (). To compute the RK matrix for , we define the linear functional and assume that it is bounded. Since the RK of a tensor product space is the product of the two RK's, the RK of is . Then elements of the repreducing kernel matrix for is

Thus the RK matrix for is , where rk.prod prepresents elementwise product of two matrices.

> prec <- matrix(scan(``monthprec.dat'',0), 35, 12, byrow=T) > y <- as.vector(t(log(prec))) > xx <- rep(x,rep(12,35)) > R1 <- kronecker(matrix(1,35,35),periodic(t)) > R2 <- kronecker(R,matrix(1,12,12)) > R3 <- kronecker(x%*%t(x),periodic(t)) > R4 <- rk.prod(R1,R2) > canada6 <- ssr(y ~ xx, rk=list(R1,R2,R3,R4)) > summary(canada6) Smoothing spline regression fit by GCV method Call: ssr(formula = y ~ xx, rk = list(R1, R2, R3, R4)) Coefficients (d): (Intercept) xx 4.745601623 0.003035296 GCV estimate(s) of smoothing parameter(s) : 1.727184e+03 3.630885e-02 1.416582e+07 4.597862e-04 Equivalent Degrees of Freedom (DF): 60.52553 Estimate of sigma: 0.3549322 Number of Observations: 420

We now show how to compute estimated functions evaluated at grid points. We stacked all observations as . Denote the corresponding months for as . Denote as the total number of observations. Then the estimated functions are represented by

where is the integer part of . To compute , , at grid points and for and respectively,

where , , and . The interaction is a bivariate function. Thus we evaluate it at a bivariate grid :

where is a matrix with . We also compute bootstrap confidence intervals.

> S1 <- kronecker(t(rep(1,35)),periodic(tgrid,t)) > S2 <- kronecker(S,t(rep(1,12))) > S3 <- kronecker(t(x),periodic(tgrid,t)) > S4 <- NULL > for (i in 1:50) {for (j in 1:50) S4 <- rbind(S4,S1[i,]*S2[j,])} > alpha <- canada6$coef$d[1] + n*canada6$lambda[1]*S1%*%canada6$coef$c > beta0 <- canada6$coef$d[2] > beta1 <- n*canada6$lambda[2]*S2%*%canada6$coef$c > beta2 <- n*canada6$lambda[3]*S3%*%canada6$coef$c > beta12 <- n*canada6$lambda[4]*S4%*%canada6$coef$c > beta <- beta0 + rep(beta1,rep(50,50)) + rep(beta2,50) + beta12 > nboot <- 99 > alphab <- betab0 <- betab1 <- betab2 <- betab12 <- betab <- NULL > for (i in 1:nboot) { yb <- canada6$fit+rnorm(n,sd=canada6$sig) bfit <- ssr(yb ~ xx, rk=list(R1,R2,R3,R4)) alphab <- cbind(alphab,bfit$coef$d[1] + n*bfit$lambda[1]*S1%*%bfit$coef$c) bb0 <- bfit$coef$d[2] bb1 <- n*bfit$lambda[2]*S2%*%bfit$coef$c bb2 <- n*bfit$lambda[3]*S3%*%bfit$coef$c bb12 <- n*bfit$lambda[4]*S4%*%bfit$coef$c betab0 <- c(betab0,bb0) betab1 <- cbind(betab1,bb1) betab2 <- cbind(betab2,bb2) betab12 <- cbind(betab12,bb12) betab <- cbind(betab, bb0 + rep(bb1,rep(50,50)) + rep(bb2,50) + bb12) } > lba <- apply(alphab,1,quantile,prob=.025) > uba <- apply(alphab,1,quantile,prob=.975) > lbb1 <- apply(betab1,1,quantile,prob=.025) > ubb1 <- apply(betab1,1,quantile,prob=.975) > lbb2 <- apply(betab2,1,quantile,prob=.025) > ubb2 <- apply(betab2,1,quantile,prob=.975) > lbb12 <- apply(betab12,1,quantile,prob=.025) > ubb12 <- apply(betab12,1,quantile,prob=.975) > lbb <- apply(betab,1,quantile,prob=.025) > ubb <- apply(betab,1,quantile,prob=.975)

Figure displays the estimates of and . The scale, as well as the GCV estimates of the smoothing parameters ( ), indicate that the interaction is not small. Figures and display splices of the estimated interaction function and their 95% bootstrap confidence intervals with one variable fixed approximately at the middle points of 12 months. Figures and display splices of the estimated function and their 95% bootstrap confidence intervals with one variable fixed approximately at the middle points of 12 months. Figure displays estimates of , and functions.

We also fit the data using the GML choice of the smoothing parameter.

> canada6.1 <- ssr(y ~ xx, rk=list(R1,R2,R3,R4),spar="m") > summary(canada6.1) Smoothing spline regression fit by GML method Call: ssr(formula = y ~ xx, rk = list(R1, R2, R3, R4), spar = "m") Coefficients (d): (Intercept) xx 4.7327087 0.0030902 GML estimate(s) of smoothing parameter(s) : 7.195469e-01 4.852421e-02 3.633548e+04 3.613019e+00 Equivalent Degrees of Freedom (DF): 27.24519 Estimate of sigma: 0.3733665 Number of Observations: 420

The GML estimates of smoothing parameters indicates that both the interaction and the main effect are small. Figure displays the estimates of and . Figure displays estimates of , and functions.