TESTING HOMOGENEITY IN CLUSTERED (LONGITUDINAL) COUNT DATA REGRESSION MODEL WITH OVER-DISPERSION

Event Date: 

Wednesday, February 10, 2010 - 3:15pm

Event Date Details: 

Refreshments served at 3:00 PM

Event Location: 

  • South Hall 5607F

Dr. Sudhir Paul (UCSB Pstat)

Title: TESTING HOMOGENEITY IN CLUSTERED (LONGITUDINAL) COUNT DATA REGRESSION MODEL WITH OVER-DISPERSION

Abstract: Clustered (longitudinal) count data arise in many biostatistical applications in which a number of repeated count responses are observed on a number of individuals. The repeated observations may also represent counts over time from a number of individuals. One important problem that arises in practice is to test homogeneity within clusters (individuals) and between clusters (individuals). As data within clusters are observations of repeated responses, the count data may be correlated and/or over-dispersed. Jacqmin-Gadda and Commenges (1995) derive a score test statistic HS by assuming a random intercept model within the framework of the generalized linear mixed model, and a score test statistic HT using the generalized estimating equation (GEE) approach (Liang and Zeger, 1986; Zeger and Liang, 1986). They further show that the two tests are identical when the covariance matrix assumed in the GEE approach is that of the random-effects model. In each of these cases they deal with (a) the situation in which the dispersion parameter is assumed to be known and (b) the situation in which the dispersion parameter is assumed to be unknown. The second situation, however, is more realistic, as the dispersion parameter will be unknown in practice. For over-dispersed count data with unknown over-dispersion parameter we derive two score tests by assuming a random intercept model within the framework of (i) the negative binomial mixed effects model, and (ii) the double extended quasi-likelihood mixed effects model (Lee and Nelder, 2001). These two statistics are much simpler than the one obtained from the statistic HS derived by Jacqmin-Gadda and Commenges (1995) under the framework of the over-dispersed generalized linear model. The first statistic takes the over-dispersion more directly into the model and is therefore expected to do well when the model assumptions are satisfied, while the other statistic is expected to be robust.
Simulations show superior level properties for the statistics derived under the negative binomial and double extended quasi-likelihood model assumptions. Further, a score test is developed to test for over-dispersion in the generalized linear mixed model and some simulations are conducted. A data set is analyzed and a discussion is given.