Seminars
 
 


Seminars 2007-2008    


Monday, July 30th @ 2:00pm
South Hall 5607F

Refreshments served 1:45 pm

Mathew Penrose, University of Bath, UK

A survey of Random Geometric Graphs

The random geometric graph G(n,r) has vertex set given by n independent uniformly distributed points in the d-dimensional unit cube, with an edge connecting each pair of points at distance at most r. We discuss asymptotic properties of these graphs as n becomes large with r=r(n) becoming small. Particular limit regimes of interest are those where the mean degree remains constant and those where it grows logarithmically. Topics we hope to cover (in varying levels of detail) include: clique number, chromatic number, connectivity, minimum degree, size of the largest component, spatial central limit theorems, and large deviations. Some of the material covered is in the book `Random Geometric Graphs' by M. Penrose, Oxford University Press 2003.
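
As an illustration of the definition of G(n,r) above, here is a minimal Python sketch that samples the points and joins every pair within distance r. The function names, the choice d = 2, and the target mean degree of 5 are assumptions made for this example only; they are not taken from the talk or the book.

```python
import itertools
import math
import random


def random_geometric_graph(n, r, d=2, seed=None):
    """Sample n uniform points in the unit cube [0, 1]^d and join pairs within distance r."""
    rng = random.Random(seed)
    points = [tuple(rng.random() for _ in range(d)) for _ in range(n)]
    edges = [
        (i, j)
        for i, j in itertools.combinations(range(n), 2)
        if math.dist(points[i], points[j]) <= r
    ]
    return points, edges


def mean_degree(n, edges):
    """Average number of neighbours per vertex: 2 * |E| / n."""
    return 2 * len(edges) / n


if __name__ == "__main__":
    n = 500
    # Constant-mean-degree regime in d = 2: keep n * pi * r^2 fixed
    # (here about 5), ignoring boundary effects.
    r = math.sqrt(5 / (math.pi * n))
    points, edges = random_geometric_graph(n, r, seed=0)
    print(f"n={n}, r={r:.4f}, mean degree={mean_degree(n, edges):.2f}")
```

The brute-force pair loop costs O(n^2) distance checks, which is fine for a few thousand points; a grid of cells of side r would be the usual next step for larger n.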


WEDNESDAY October 3, South Hall 5607F, 3:15 PM, Refreshments served at 3:00 PM

Stephane Villeneuve (Toulouse, France, visiting PSTAT)

Title: Optimal dividend policy and growth option

We analyse the interaction between dividend policy and investment decision in a growth opportunity of a liquidity-constrained firm. This leads us to study a mixed singular control/optimal stopping problem for a diffusion, which we solve quasi-explicitly by establishing a connection with an optimal stopping problem. We characterize situations where it is optimal to postpone dividend distribution in order to invest at a subsequent date in the growth opportunity. We show that uncertainty and liquidity shocks have an ambiguous effect on the investment decision.


WEDNESDAY October 10, South Hall 5607F, 3:15 PM, Refreshments served at 3:00 PM

Helgi Tomasson (University of Iceland, visiting PSTAT)

Some Computational Aspects for Inference on Diffusion Processes

The theory of diffusion processes is fundamental for modern mathematical finance. Real data are assumed to be observations of a continuous-time process at discrete time points. The statistical toolbox for financial data is briefly reviewed. A computer program, written in R, for approximation of the likelihood function for some simple processes is shown. The approximation is based on a Taylor expansion of the Kolmogorov forward equation in the spirit of Ait-Sahalia (1999, 2002). Properties of maximum-likelihood estimators are illustrated by simulation. Some aspects of applying the approximation to Bayesian inference and statistical surveillance (change-point detection) are discussed.


WEDNESDAY October 17, South Hall 5607F, 3:15 PM, Refreshments served at 3:00 PM

Hyekyung Min (postdoc, PSTAT)

Title: A Stochastic Control Model of Optimal Dividend and Capital Financing

The stochastic control model introduced by Peura and Keppo (2006) is considered for valuing a firm whose capital evolves according to Brownian motion with drift. The firm controls the flow of capital not only by paying out dividends but also by raising capital in the presence of a fixed cost (K) and a delay (D). A solution to this control problem is obtained by solving a system of quasi-variational inequalities. It is shown that a unique solution exists for all values of K >= 0 and D > 0. The asymptotic behavior of the optimal dividend and capital issue barriers, the ruin probability, and the expected lifetime of the firm following the optimal policy will be discussed.


Wednesday, Oct 24th
Dr. Giulia Barbati, Department of Public Health and Microbiology, University of Torino, Italy

Source Separation algorithms applied to cerebral signals

Physiological activity in the brain can be evaluated by means of non-invasive electrophysiological techniques such as the electroencephalogram (EEG) and the magnetoencephalogram (MEG); such instruments are able to obtain cerebral processing measures with optimal time resolution. The crucial problem is then to gain access to the inner neural code starting from the extra-cranial recordings: cerebral signals related to significant activity are mixed and embedded in unstructured noise and in other physiological artefacts not relevant to the desired observation. To deal with this problem, a statistically based approach has recently been introduced, which exploits statistical properties of the sources mixed in the observed signals without any assumption about the biophysical model underlying the recorded signals. In particular, the “Independent Component Analysis” (ICA) model assumes that sources are statistically mutually independent; to extract them from the mixture a measure of non-gaussianity (e.g. kurtosis) is maximized.
In this talk we discuss how the ICA assumptions fit the complex and interconnected activity of cerebral networks, and we describe our newly proposed “Functional Source Separation” (FSS) algorithm, conceived as a generalization of the basic ICA model. Some physiological constraints, defined on the expected temporal behaviour of the cerebral sources of interest, are added to the maximization of kurtosis, producing a multi-objective cost function that simultaneously exploits global statistical features and functional source properties. Some of the ICA and FSS applications from our previous studies are reviewed and discussed in order to provide application examples. Our conclusions are that the ICA algorithm can be successfully applied to specific issues, in particular artefact removal, but the proposed multi-objective FSS cost function provides a more general framework to estimate cerebral activity of interest.
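
To make the role of kurtosis concrete, the following Python sketch computes the sample excess kurtosis and shows that a mixture of two independent non-Gaussian sources is closer to Gaussian (excess kurtosis nearer zero) than either source alone, which is why maximizing non-gaussianity can help undo the mixing. This is only an illustration of the contrast measure, not an ICA or FSS implementation; the uniform sources, the equal-weight mixture, and the function names are assumptions for the example.

```python
import math
import random


def excess_kurtosis(xs):
    """Sample excess kurtosis m4 / m2^2 - 3 (zero for a Gaussian distribution)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0


def mixing_demo(n=20000, seed=0):
    """Return the excess kurtosis of two uniform sources and of their equal-weight mixture."""
    rng = random.Random(seed)
    s1 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    s2 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    mixture = [(a + b) / math.sqrt(2.0) for a, b in zip(s1, s2)]
    return excess_kurtosis(s1), excess_kurtosis(s2), excess_kurtosis(mixture)


if __name__ == "__main__":
    k1, k2, kmix = mixing_demo()
    # Uniform sources have excess kurtosis near -1.2; the mixture is near -0.6.
    print(f"source 1: {k1:.2f}, source 2: {k2:.2f}, mixture: {kmix:.2f}")
```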

References
Hyvarinen A., Karhunen J., Oja E. (2001). Independent Component Analysis. Wiley.
Makeig S, Debener S, Onton J, Delorme A. (2004). Mining event-related brain dynamics. Trends Cogn Sci. 2004 May;8(5):204-10.
Barbati G., Sigismondi R., Zappasodi F., Porcaro C., Graziadio S., Valente G., Balsi M., Rossini P.M., Tecchio F., (2006) Functional Source Separation from Magnetoencephalographic Signals, Hum Brain Mapp. Dec;27(12):925-34.


Wednesday, October 31, South Hall 5607F, 3:15 PM, Refreshments served at 3:00 PM
David Hinkley, Univ. of California, Santa Barbara.

BOOTSTRAP DIAGNOSTICS

Over the past two decades bootstrap methods have become increasingly widely used in statistical applications. Apparently "easy" solutions for a wide range of statistical problems, together with wide availability of cheap/fast computing, have seduced many researchers into using resampling (bootstrap) methods for just about anything. But the validity and accuracy of bootstrap methods do require some assumptions. This talk will describe methods for checking some of the assumptions for specific contexts, and for some cases will describe appropriate modifications of bootstrap methods to make them more reliable. [This is joint work with Angelo Canty, Anthony Davison and Valerie Ventura.]


CANCELLED
MONDAY, Nov 5th, South Hall 5607F, 3:15 PM, Refreshments served at 3:00 PM
Dr. Emanuele Taufer, University of Trento, Italy

Tests for Exponentiality Based on Characterizations

There is considerable literature on the problem of testing for exponentiality. The reasons are manifold, and chief among these are the watershed role played by the exponential distribution in reliability and survival analysis, its nice mathematical properties, and the availability of several characterizations. In the talk, some test statistics based on characterizations will be reviewed. Characterizations based on spacings and constancy of the mean residual life will be considered in detail. After discussing the construction of test statistics for the hypothesis of exponentiality, the relevant asymptotic theory will be discussed. To assess the performance of the proposed procedures, the approximate Bahadur slope as well as Monte Carlo power studies against several common alternatives to exponentiality will be provided. Power comparisons with other classical and recent tests for exponentiality show that in most cases the proposed test procedures compare very well with their competitors.


 

Wednesday, Nov 14, South Hall 5607F, 3:15 PM, Refreshments served at 3:00 PM
Zhen-Qing Chen, Mathematics, U. of Washington, Seattle.

Discrete Approximations to Reflected Brownian Motion

In this talk, we will present three discrete or semi-discrete approximation schemes for reflected Brownian motion on bounded Euclidean domains. For a class of bounded domains $D$ in $R^n$ that includes all bounded Lipschitz domains and the von Koch snowflake domain, we show that the laws of both discrete and continuous time simple random walks on $D\cap 2^{-k} Z^n$ moving at rate $2^{-2k}$ with stationary initial distribution converge weakly in the space $D([0, 1], R^n)$, equipped with the Skorokhod topology, to the law of the stationary reflected Brownian motion on $D$. We further show that the following ``myopic conditioning'' algorithm generates, in the limit, a reflected Brownian motion on any bounded domain $D$. For every integer $k\geq 1$, let $\{X^k_{j2^{-k}}, j=0, 1, 2, \dots \}$ be a discrete time Markov chain with one-step transition probabilities being the same as those for the Brownian motion in $D$ conditioned not to exit $D$ before time $2^{-k}$. We prove that the laws of $X^k$ converge to that of the reflected Brownian motion on $D$. These approximation schemes give not only new ways of constructing reflected Brownian motion but also implementable algorithms to simulate reflected Brownian motions.
Joint work with Krzysztof Burdzy.


Wednesday, Nov. 28, South Hall 5607F, 3:15 PM, Refreshments served at 3:00 PM

Dr. Phillip M. Feldman (Systems Performance, Northrop Grumman Space Technology)

A Random Sample of Applied Statistics Problems from Northrop Grumman

Abstract: The talk will begin with a brief overview of the Space Technology sector of Northrop Grumman and some background about the life cycle of space systems. This will be followed by discussion of two or three open statistical problems that arose in connection with engineering projects.


Monday, Dec 10, South Hall 5607F, 3:15 PM, Refreshments served at 3:00 PM
Huyên PHAM, Université Paris 7, Denis Diderot.

Hedging and pricing with execution delay.

We consider impulse control problems in finite horizon for diffusions with decision lag and execution delay. The new feature is that our general framework deals with the important case when several consecutive orders may be decided before the effective execution of the first one.
This is motivated by financial applications in the trading of illiquid assets such as hedge funds.
We show that the value functions for such control problems satisfy a suitable version of the dynamic programming principle in finite dimension, which takes into account the past dependence of the state process through the pending orders. The corresponding Bellman partial differential equations (PDE) system is derived, and exhibits some peculiarities in the coupled equations, domains and boundary conditions. We prove a unique characterization of the value functions to this nonstandard PDE system by means of viscosity solutions. We then provide an algorithm to find the value functions and the optimal control. This implementable algorithm involves backward and forward iterations on the domains and the value functions, which appear in turn as original arguments in the proofs for the boundary conditions and uniqueness results. Finally, we give several numerical experiments illustrating the impact of execution delay on trading strategies and on option pricing.


Monday, January 7th, 2:30pm, Refreshments served at 2:15pm
Bo Li, University Corporation for Atmospheric Research, UCAR, Boulder, Colorado

Nonparametric Assessment of Properties of Space-Time Covariance Functions and its Application in Paleoclimate Reconstruction

Abstract:
We propose a unified framework for testing a variety of assumptions commonly made for covariance functions of stationary spatio-temporal random fields. The methodology is based on the asymptotic normality of space-time covariance estimators. We focus on tests for full symmetry and separability in this talk, but our framework naturally covers testing for isotropy and Taylor's hypothesis. Our test successfully detects the asymmetric and nonseparable features in Irish wind speed data. We perform simulation experiments to evaluate our test and conclude that our method is reliable and powerful for assessing common assumptions on space-time covariance functions. An interesting application of these testing approaches is in paleoclimate reconstruction, a crucial problem for understanding climate change. We report our reconstruction based on hierarchical models in a Bayesian framework, and show the necessity of identifying the properties of covariance functions in a further study.


Friday, January 11th, 3:15-4:15PM, South Hall 5607F, refreshments served at 3:00PM
Cari Kaufman

Models for Models: Statistical Methods for Climate Model Output and Other Massive Datasets

I will present two novel statistical methods applicable to analyzing climate model output. The first, likelihood approximation using covariance tapering, is useful in analyzing the large spatial datasets that climate models often produce, for which traditional likelihood-based methods are computationally infeasible. In the tapering approach, covariance matrices are multiplied element-wise by a sparse correlation matrix. This produces matrices which can be manipulated using more efficient sparse matrix algorithms. I will present some theoretical results justifying the use of tapering and demonstrate its efficiency in practice.
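
The element-wise product described above can be sketched in a few lines of Python. The exponential covariance model, the spherical taper, the one-dimensional locations, and all parameter values below are assumptions chosen for illustration; the talk does not specify them. Dense lists are used only to keep the example self-contained, whereas in practice the tapered matrix would be stored in a sparse format.

```python
import math


def exponential_cov(h, sill=1.0, range_=0.5):
    """Exponential covariance at distance h (an assumed model for this example)."""
    return sill * math.exp(-h / range_)


def spherical_taper(h, theta):
    """Spherical correlation function: compactly supported, exactly zero for h >= theta."""
    if h >= theta:
        return 0.0
    u = h / theta
    return 1.0 - 1.5 * u + 0.5 * u ** 3


def tapered_covariance(locations, theta):
    """Element-wise product of the covariance matrix and the taper correlation matrix."""
    matrix = []
    for s in locations:
        row = []
        for t in locations:
            h = abs(s - t)
            row.append(exponential_cov(h) * spherical_taper(h, theta))
        matrix.append(row)
    return matrix


def count_zeros(matrix):
    """Number of exactly-zero entries, i.e. the sparsity gained by tapering."""
    return sum(1 for row in matrix for value in row if value == 0.0)


if __name__ == "__main__":
    locations = [i / 10 for i in range(10)]
    tapered = tapered_covariance(locations, theta=0.25)
    print("zero entries:", count_zeros(tapered), "of", len(locations) ** 2)
```

Because the taper equals 1 at distance zero, the variances on the diagonal are unchanged; only covariances between distant locations are shrunk or set to zero.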

The second method addresses the question of how we can attribute sources of variability in climate model output. In particular, I will consider regional climate models (RCMs). RCMs address smaller spatial regions than do global climate models (GCMs), but their higher resolution better captures the impact of local features such as lakes and mountains. GCM output is often used to provide boundary conditions for RCMs, and it is an open scientific question how much variability in the RCM output is attributable to the RCM itself, and how much is due simply to large-scale forcing from the GCM. I will consider data from the Prudence Project, in which RCMs were crossed with GCM forcings in a designed experiment. Using this dataset as a motivating example, I will present a framework for Bayesian functional ANOVA modeling using Gaussian process prior distributions. In this framework, inference can be carried out either in a summary fashion, by examining the joint posterior distribution of the covariance parameters in the corresponding Gaussian processes, or locally, by studying functional and fully Bayesian versions of the usual ANOVA decompositions. These decompositions can be used to create useful graphical displays summarizing the contributions of each factor across space.


MONDAY Jan 14, 3:15-4:15PM, South Hall 5607F, refreshments served at 3:00PM
Marek Rutkowski (U. of New South Wales)

Pricing and Hedging of Convertible Bonds with Credit Risk

In our works [3]-[6], we attempt to shed more light on mathematical modeling of convertible bonds, thus continuing the previous research presented, for instance, in Andersen and Buffum [1], Ayache et al. [2], Davis and Lischka [7], Kallsen and Kühn [8], and Kwok and Lau [9].

In [3], we consider the problem of the decomposition of a convertible bond into a bond component and an option component. This decomposition is indeed well established in the case of an `exchange option', when the conversion can only occur at maturity, and there are no put or call clauses. However, it was not previously studied in the general case of a defaultable convertible bond with call and/or put covenants.

In [4], we specify the valuation results for a defaultable game option (in particular, a convertible bond) to the context of default risk model based on the hazard process. The approach is based on the reduction of the information flow from the full filtration to the reference filtration. Our main existence result for hedging strategies in a hazard process set-up can be informally stated as follows: under the assumption that a related doubly reflected BSDE admits a solution under some risk-neutral measure, the state-process multiplied by the default indicator process is the minimal super-hedging price up to a sigma martingale cost process.

The associated hedging strategies are subsequently analyzed by means of a martingale decomposition of a solution to the related doubly reflected BSDE. It is worth stressing that these decompositions are by no means artificial. On the contrary, they arise naturally in the context of a Markovian framework, which is studied in some detail in the follow-up paper [5]. Under a rather general specification of the infinitesimal generator of a driving Markov factor process, we develop in [5] the variational inequality approach to pricing and hedging of a defaultable game option.

In [6], we consider a Markovian diffusion set-up with default. In this model, we show that a doubly reflected BSDE related to the convertible security has a solution, and we provide the related super-hedging strategy. Moreover, we characterize the price of a convertible security in terms of a viscosity solution of the associated variational inequality and we prove the convergence of a suitable approximation scheme.

References
[1] Andersen, L. and Buffum, L.: Calibration and implementation of convertible bond models. Journal of Computational Finance 7 (2004), 1-34
[2] Ayache, E., Forsyth, P. and Vetzal, K.: Valuation of convertible bonds with credit risk. The Journal of Derivatives, Fall 2003.
[3] Bielecki, T.R., Crepey, S., Jeanblanc, M. and Rutkowski, M.: Arbitrage pricing of defaultable game options with applications to convertible bonds. Forthcoming in Quantitative Finance.
[4] Bielecki, T., Crepey, S., Jeanblanc, M. and Rutkowski, M.: Valuation and hedging of defaultable options in a hazard process model. Submitted.
[5] Bielecki, T.R., Crepey, S., Jeanblanc, M. and Rutkowski, M.: Defaultable options in a Markovian intensity model of credit risk. Forthcoming in Mathematical Finance.
[6] Bielecki, T.R., Crepey, S., Jeanblanc, M. and Rutkowski, M.: Convertible bonds in a defaultable diffusion model. Submitted.
[7] Davis, M. and Lischka, F.: Convertible bonds with market risk and credit risk. In: Applied Probability, R. Chan et al., eds., American Mathematical Society/International Press, 2002, pp. 45-58.
[8] Kallsen, J. and Kühn, C.: Convertible bonds: financial derivatives of game type. In: Exotic Option Pricing and Advanced Levy Models, A. Kyprianou et al., eds., Wiley, 2005, pp. 277-288.
[9] Kwok, Y. and Lau, K.: Anatomy of option features in convertible bonds. Journal of Futures Markets 24 (2004), 513-532.


Wednesday, Jan. 16, South Hall 5607F, 3:15 PM, Refreshments served at 3:00 PM
Peter Jagers (Chalmers University of Technology, SWEDEN)

On the Path to Extinction

Short abstract: Populations can certainly die out for diverse reasons, the most basic probably being stably insufficient reproductive power (whatever the ground for that may be). Even in this case there is an abundance of paths to extinction. Still, if the starting population is large, a simple and beautiful pattern emerges, where random and deterministic effects are of roughly the same order of magnitude. We describe this path and the time to extinction of large subcritical branching populations, and discuss whether mathematically 'large' could be biologically 'small' (= threatened).
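
A path to extinction of a large subcritical branching population can be simulated with a few lines of Python. The sketch below uses a Galton-Watson process with binary splitting, which is an assumption made purely for simplicity (the abstract does not fix an offspring law); the function name and parameter values are likewise illustrative.

```python
import random


def simulate_extinction_path(x0, p, seed=None, max_generations=10_000):
    """Generation sizes of a Galton-Watson process with binary splitting.

    Each individual independently leaves 2 children with probability p and
    0 children otherwise, so the mean offspring number is m = 2 * p.
    The process is subcritical when m < 1.
    """
    rng = random.Random(seed)
    path = [x0]
    while path[-1] > 0 and len(path) <= max_generations:
        parents = path[-1]
        children = sum(2 for _ in range(parents) if rng.random() < p)
        path.append(children)
    return path


if __name__ == "__main__":
    path = simulate_extinction_path(x0=1000, p=0.4, seed=0)  # m = 0.8
    print("generations until extinction:", len(path) - 1)
    print("first generation sizes:", path[:8])
```

With a large starting size the early generations shrink almost deterministically by the factor m, while the last few generations before extinction are dominated by chance.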


Friday, January 18th, South Hall 5607F, 2:00PM, refreshments served at 1:45PM.
Kobi Abayomi from Duke University, NC

Copula Based Independent Component Analysis

We propose a parametric version of Independent Component Analysis (ICA) via copulas: families of multivariate distributions that join univariate margins to multivariate distributions. Our procedure exploits the role of copula models in information theory and in measures of association, specifically the use of copula densities as parametric mutual information and as measures of association on the rank statistics.


Friday, January 18th, South Hall 5607F, 3:15PM, refreshments served at 3:00PM
Jan Vecer, Department of Statistics, Columbia University

Tradeable Measures of Risk

The main idea of this talk is to introduce Tradeable Measures of Risk as an objective and model-independent way of measuring risk. The present methods of risk measurement, such as the standard Value-at-Risk supported by Basel II, are based on subjective assumptions of future returns. In order to achieve an objective measurement of risk, we introduce a concept of Realized Risk which we define as a directly observable function of realized returns. Predictive assessment of the future risk is given by the Tradeable Measure of Risk - the price of a contract which pays its holder the Realized Risk for a certain period. Our definition of the Realized Risk payoff includes a Weighted Average of Ordered Returns, with the following special cases: the worst return, the empirical Value-at-Risk, and the empirical mean shortfall. When Tradeable Measures of Risk of this type are priced and quoted by the market (even over-the-counter, or traded internally within a financial institution), one does not need a model to calculate values of a risk measure since it will be observed directly from the market. We use an option pricing approach to obtain dynamic pricing formulas for these contracts, where we make an assumption about the distribution of the returns. We also discuss the connection between Tradeable Measures of Risk and the axiomatic definition of Coherent Measures of Risk, and provide some convergence results.
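
The Weighted Average of Ordered Returns and its three special cases can be sketched as follows. The weight conventions (returns sorted from worst to best, the k-th smallest return taken as the empirical Value-at-Risk, values reported in return units rather than as positive losses) and all function names are assumptions for this example; the talk may use different sign or indexing conventions.

```python
def weighted_average_of_ordered_returns(returns, weights):
    """Sum of weights[i] times the i-th smallest realized return."""
    ordered = sorted(returns)
    if len(weights) != len(ordered):
        raise ValueError("need exactly one weight per return")
    return sum(w * r for w, r in zip(weights, ordered))


def worst_return_weights(n):
    """All weight on the smallest return."""
    return [1.0] + [0.0] * (n - 1)


def empirical_var_weights(n, k):
    """All weight on the k-th smallest return (k = 1 is the worst return)."""
    return [1.0 if i == k - 1 else 0.0 for i in range(n)]


def mean_shortfall_weights(n, k):
    """Equal weights on the k smallest returns."""
    return [1.0 / k if i < k else 0.0 for i in range(n)]


if __name__ == "__main__":
    realized = [0.02, -0.05, 0.01, -0.01, 0.03]
    n = len(realized)
    for label, w in [
        ("worst return", worst_return_weights(n)),
        ("empirical VaR (k=2)", empirical_var_weights(n, 2)),
        ("mean shortfall (k=2)", mean_shortfall_weights(n, 2)),
    ]:
        print(label, weighted_average_of_ordered_returns(realized, w))
```

Each payoff depends only on realized returns, which is the sense in which the Realized Risk is directly observable.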


Tuesday, January 22nd, South Hall 5607F, 10:00AM, refreshments served at 9:45am.
Mike Ludkovski

Optimal Stopping and Optimal Switching for Hidden Markov Models

We study optimal stopping and optimal switching problems for hidden Markov chains with Poissonian information structures. In our model, the controller maximizes expected rewards that depend on an unobserved Markovian environment with information collected through a (compound) Poisson observation process. Examples of such systems arise in investment timing, reliability theory, sequential tracking, and economic policy making. We solve the problem by performing Bayesian updates of the posterior likelihoods of the unobservable and studying the resulting optimization problem for a piecewise-deterministic process. We then prove the dynamic programming principle and explicitly characterize an optimal strategy. We also provide an efficient numerical scheme and illustrate our results with several computational examples.

This is based on joint work with Semih Sezer and Erhan Bayraktar (U of M).


Wednesday, January 23rd, South Hall 5607F, 3:15 PM, Refreshments served at 3:00 PM
Tanzy Love

Discovery of Latent Patterns with Hierarchical Bayesian Mixed-Membership Models and the Issue of Model Choice

Model choice is a major methodological issue in the explosive growth of data-mining models involving latent structure for clustering and classification, especially because models often have different parameterizations and very different specifications and constraints. Here, we work from a general formulation of hierarchical Bayesian mixed-membership models and present several model specifications and variations, both parametric and nonparametric, in the context of learning the number of latent groups and associated patterns for clustering units. We elucidate strategies for comparing models and specifications by producing novel analyses of the following two data sets: (1) a corpus of scientific publications from the Proceedings of the National Academy of Sciences; (2) data on functionally disabled American seniors from the National Long Term Care Survey.
Our specifications make use of both text and references to narrow the choice of the number of latent topics in our publications data, in both parametric and nonparametric settings. Our findings also bring new insights regarding latent topics compared with earlier analyses.


Friday, January 25th, South Hall 5607F, 2:15 PM, Refreshments served at 2:00 PM
Guilherme V. Rocha

Designing Penalty Functions for Grouped and Hierarchical Selection

Extracting useful information from high-dimensional data is an important focus of today's statistical research and practice.
Penalized loss function minimization has been shown to be effective for this task. Quasi-norms on model parameters are frequently used as a penalty. Classical examples are AIC and BIC where the L0 quasi-norm (model dimension) is used as a penalty.
More recently, penalization by the L1-norm (lasso) has enjoyed a lot of attention. L1-penalized estimates are cheaper to compute (convex optimization) and lead to more stable model estimates than their L0 counterparts.
In this talk, I will present the Composite Absolute Penalties (CAP) family of penalties. CAP penalties allow given grouping and hierarchical relationships between the predictors to be expressed. They are built by defining groups of variables and combining the properties of norm penalties at the across-group and within-group levels. Grouped selection occurs for nonoverlapping groups. Hierarchical variable selection is reached by defining groups with particular overlapping patterns.
Under easily verifiable assumptions, CAP penalties are convex, an attractive property from a computational standpoint. Within this subfamily, unbiased estimates of the degrees of freedom (df) exist, so the regularization parameter can be selected without cross-validation.
Simulation results show that CAP improves on the predictive performance of the LASSO for cases with $p \gg n$ and misspecified groupings.
This is joint work with Peng Zhao and Bin Yu.


Wednesday, February 13th, South Hall 5607F, 3:15 PM, Refreshments served at 3:00 PM
Christopher Paciorek (Harvard School of Public Health)

Mapping Ancient Forests: Bayesian Inference for Forest Composition Using the Fossil Pollen Proxy Record

Ecologists are interested in understanding changes in tree species abundances and spatial distributions over thousands of years since the last glacial maximum. To estimate forest composition and investigate how much information is available from fossil pollen deposited in lake sediments, we build a Bayesian spatio-temporal hierarchical model that predicts forest composition in southern New England, USA, based on fossilized pollen. The critical relationships between abundances of taxa in the pollen record and abundances in actual vegetation are estimated using modern data and data from colonial records, for which both pollen and direct vegetation data are available. For these time periods, the model relates pollen and vegetation data to a latent multivariate spatial process representing forest composition, which allows estimation of several key parameters. For time periods in the past, we use only pollen data and the estimated model parameters to make predictions and assess uncertainty about the latent spatio-temporal process over the last 2000 years. A new graphical assessment of feature significance helps to infer which spatial patterns are reliably estimated. The modeling involves a complex hierarchical model that integrates disparate data sources. I will discuss a variety of issues arising in such models and the practical strategies we used to address them. I will also emphasize the importance of understanding which aspects of the data inform which aspects of the model.


WEDNESDAY Feb. 20, South Hall 5607F, 3:15 PM, Refreshments served at 3:00 PM
Kiros Berhane (Keck School of Medicine, University of Southern California)

Functional-based Multi-level Modeling of Multiple Longitudinal Outcomes: with applications to environmental epidemiology

Flexible multi-level models are proposed to allow for cluster specific smooth estimation of growth curves, in a mixed-effects modeling format that includes subject-specific random effects on the growth parameters. Attention is then focused on models that examine between-cluster comparisons of the effects of an ecologic covariate of interest (e.g., air pollution) on nonlinear functionals of growth curves (e.g. maximum rate of growth). A Gibbs sampling approach is used to get posterior mean estimates of nonlinear functionals along with their uncertainty estimates. A second-stage ecologic random effects model is used to examine the association between a covariate of interest (e.g., air pollution) and the nonlinear functionals. A unified estimation procedure is presented along with its computational and theoretical details. This work is further extended to allow for modeling of multiple outcomes via a latent variable approach in order to connect several outcomes from a subject. The models are motivated by, and illustrated with, lung function, asthma and air pollution data from the Southern California Children's Health Study.


WEDNESDAY Feb. 27, South Hall 5607F, 3:15 PM, Refreshments served at 3:00 PM
Amy Braverman (Jet Propulsion Laboratory, California Institute of Technology)

Massive Data Set Analysis for NASA's Atmospheric Infrared Sounder

NASA's Atmospheric Infrared Sounder (AIRS) has been collecting large quantities of remote sensing data about the vertical structure of Earth's atmosphere since AIRS was launched aboard the Aqua spacecraft in mid-2002. These data pose a classic problem in the analysis of massive data sets: how do we understand the relationships among fine-scale phenomena within their global context? We answer that question here by partitioning the data on a coarse spatio-temporal grid, and estimating the multivariate distribution of the data within each grid cell. Then, we look for patterns in the evolution of those distributions as functions of space and time, and ultimately tie them back to physical phenomena generating the data sets. Quantifying this evolution is challenging because the data are high dimensional, and the distributions are complex. We attack the problem using the Wasserstein distance between distributions as a measure of similarity among grid cells' data, and therefore as a measure of similarity between the underlying physical processes. We close with some thoughts on how this strategy might be applied in other problems where massive data sets arise.
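
As a small illustration of comparing grid cells through the Wasserstein distance, the sketch below handles the simplest case: one-dimensional samples of equal size, where the optimal coupling matches sorted values. The talk concerns multivariate distributions, for which this shortcut does not apply; the cell names and data values are invented for the example.

```python
def wasserstein_1d(xs, ys, p=1):
    """p-Wasserstein distance between two equal-size 1-D empirical samples.

    In one dimension the optimal coupling pairs the i-th smallest value of
    one sample with the i-th smallest value of the other.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("samples must be non-empty and of equal size")
    total = sum(abs(a - b) ** p for a, b in zip(sorted(xs), sorted(ys)))
    return (total / len(xs)) ** (1.0 / p)


def pairwise_distances(cells, p=1):
    """Distance between every pair of grid cells, keyed by (name_a, name_b)."""
    names = sorted(cells)
    return {
        (a, b): wasserstein_1d(cells[a], cells[b], p)
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


if __name__ == "__main__":
    # Hypothetical per-cell measurements (e.g. temperatures); not real AIRS data.
    cells = {
        "cell_A": [280.0, 281.5, 279.0, 282.0],
        "cell_B": [281.0, 282.5, 280.0, 283.0],
        "cell_C": [295.0, 297.0, 296.0, 294.0],
    }
    for pair, dist in pairwise_distances(cells).items():
        print(pair, round(dist, 3))
```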

 


WEDNESDAY March 5, South Hall 5607F, 3:15 PM, Refreshments served at 3:00 PM

Christopher Barr

Voronoi-type estimators for spatial intensity

A wide range of methods use Voronoi diagrams to estimate conditional intensity of an inhomogeneous Poisson point process. The inverse cell area (herein referred to as the Voronoi estimator) has been used as a simple, fully non-parametric estimator in neuroscience and astrophysics. Voronoi diagrams have also been used to build flexible prior distributions, and to develop optimal quadrature approximations for pseudo-likelihood based approaches. The present work systematically investigates fundamental properties of the Voronoi estimator for inhomogeneous intensity. The Voronoi estimator is known to be unbiased in the homogeneous case; we prove that it is also approximately ratio unbiased in the inhomogeneous case, and that its bias goes to zero exponentially as conditional intensity increases. Simulation studies show the sampling distribution is well approximated by the inverse gamma model, but generally has high variance. Two additional Voronoi-type estimators (one based on the centroidal Voronoi diagram, the other using k-means clustering) are presented and offer more stable results.
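
The inverse-cell-area idea is easiest to see in one dimension, where the Voronoi cell of a point is the interval between the midpoints to its two neighbours. The sketch below is that one-dimensional analogue only; the spatial case treated in the talk needs polygonal cells. Clipping the outer cells to the observation window and the function name are assumptions for this example.

```python
def voronoi_intensity_1d(points, lower, upper):
    """Inverse-cell-length intensity estimate at each point of a 1-D pattern.

    Cell boundaries are the midpoints between neighbouring points; the first
    and last cells are clipped to the observation window [lower, upper].
    """
    xs = sorted(points)
    if not xs:
        raise ValueError("need at least one point")
    if len(set(xs)) != len(xs):
        raise ValueError("points must be distinct")
    if xs[0] <= lower or xs[-1] >= upper:
        raise ValueError("points must lie strictly inside the window")
    midpoints = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    bounds = [lower] + midpoints + [upper]
    return [
        (x, 1.0 / (right - left))
        for x, left, right in zip(xs, bounds, bounds[1:])
    ]


if __name__ == "__main__":
    for x, intensity in voronoi_intensity_1d([1.0, 2.0, 5.0], 0.0, 6.0):
        print(f"point {x}: estimated intensity {intensity:.3f}")
```

Closely spaced points produce short cells and therefore high estimated intensity, which also hints at why the raw estimator has high variance.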


WEDNESDAY April 9, South Hall 5607F, 3:15 PM, Refreshments served at 3:00 PM

Dr. Richard Sowers, Department of Mathematics at University of Illinois at Urbana-Champaign

A propagation-of-chaos type result in stochastic averaging

Stochastic averaging goes back to Khasminskii in the 1960's. The standard result is that, given a separation of scales, one can find effective dynamics for slow components. We investigate the motion of two particles in such a system, in particular in a randomly perturbed twist map. The nub of the issue is how two points escape from a 1-1 resonance zone. Results of Pinsky and Wihstutz indicate that there is a third scale at work, which we can use to study the escape from resonance.


WEDNESDAY April 16, South Hall 5607F, 3:15 PM, Refreshments served at 3:00 PM

Dr. Suojin Wang, Department of Statistics at Texas A&M University

A New Semiparametric Procedure for Matched Case-Control Studies with Missing Covariate Data

In this talk we consider an easy-to-use semiparametric method for analyzing matched case-control data when one of the covariates of interest is partially missing. Missing covariate information in a matched case-control study may create bias and reduce the efficiency of the parameter estimates. To cope with this situation, we propose a robust approach that estimates some functionals of the distribution of the partially missing covariate using a kernel regression technique in a conditional likelihood framework. The large sample properties of the proposed estimator are investigated and asymptotic normality is obtained. A simulation study is carried out to assess the performance of the proposed method in terms of robustness and efficiency. The proposed method is also applied to the real dataset that motivated this work.


THURSDAY April 17, South Hall 5607F, 3:30 PM-4:45PM

Steve Snapinn,  Vice President, Global Biostatistics and Epidemiology, Amgen

Some Statistical Problems in the Pharmaceutical Industry

It is certainly an interesting time in the pharmaceutical industry. There are greater concerns regarding healthcare costs and drug safety than ever before, and the productivity of the industry with respect to developing new drugs is considerably lower than it was a decade ago. Statisticians can play a major role in addressing these issues. In this presentation I'll begin by giving an overview of Amgen, drug development, and the role of a statistician in the drug industry. Next, I'll describe a few specific statistical issues of interest: 1) Evaluation of safety in clinical trials and in post-marketing surveillance. 2) Responder analyses, or dichotomization of a continuous variable. 3) Non-inferiority trials.
 


WEDNESDAY April 23, South Hall 5607F, 3:15 PM, Refreshments served at 3:00 PM

Annie Qu, Department of Statistics at Oregon State University

Efficient aggregate unbiased estimating functions approach for correlated data with missing at random

We develop a consistent and highly efficient marginal model for missing at random data using an estimating function approach. Our approach differs from inverse weighted estimating equations and the imputation method in that it requires neither estimating the probability of missingness nor imputing the missing response based on assumed models. The proposed method is based on an aggregate unbiased estimating function approach which does not require the likelihood function; however, it is equivalent to the score equation if the likelihood is known. The aggregate unbiased approach is based on a larger class of estimating functions than the pattern-unbiased approach. Therefore, the most efficient estimating function based on the aggregate unbiased approach is more efficient than the one based on the pattern-unbiased approach. We provide comparisons of the three approaches using simulated data and also an HIV data example. This is joint work with Bruce Lindsay and Lin Lu.


WEDNESDAY April 30, 2008 - SOBEL LECTURE
South Hall 5607F
3:15 PM, Refreshments served at 3:00 PM


Dr. Michael Newton, Departments of Statistics and of Biostatistics and Medical Informatics,  University of Wisconsin-Madison

Dirichlet orderings, differential expression, and gene sets

In genomics, and possibly other domains of high-dimensional statistics, it can be useful to know the probabilities that a length-n Dirichlet distributed random vector attains each of its n! possible orderings. Each ordering event is equivalent to an event regarding independent negative-binomial random variables, and this finding guides a computational approach via dynamic programming. Dirichlet ordering probabilities are central to a new clustering method for multi-group microarray data analysis, which I will discuss and demonstrate in several examples. Time permitting, I will also discuss statistical elements in the related problem of gene set enrichment.
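To make the notion of a Dirichlet ordering probability concrete, the sketch below estimates one by plain Monte Carlo. This is a swapped-in technique for illustration only, not the negative-binomial dynamic programming method of the talk. It relies on the standard fact that a Dirichlet vector can be generated by normalizing independent gamma variables, so the ordering of the gammas is the ordering of the Dirichlet components. The function name `ordering_probability` and its parameters are hypothetical.

```python
import random


def ordering_probability(alpha, order, draws=20000, seed=0):
    """Monte Carlo estimate of P(X[order[0]] < X[order[1]] < ...) for
    X ~ Dirichlet(alpha).

    Illustration only: simulation, not the exact dynamic-programming
    computation described in the abstract.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        # Independent Gamma(alpha_i, 1) draws; dividing by their sum would
        # give a Dirichlet vector without changing the ordering.
        g = [rng.gammavariate(a, 1.0) for a in alpha]
        if all(g[i] < g[j] for i, j in zip(order, order[1:])):
            hits += 1
    return hits / draws


# With equal parameters every one of the 3! orderings is equally likely,
# so the estimate should be close to 1/6.
print(ordering_probability([2.0, 2.0, 2.0], [0, 1, 2]))
```

Simulation of this kind becomes impractical when many orderings with small probabilities are needed, which is one motivation for an exact computation.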


[CANCELLED] WEDNESDAY May 14, South Hall 5607F, 3:15 PM, Refreshments served at 3:00 PM

Dorota M. Dabrowska, University of California, Los Angeles

Bivariate proportional hazard regression models

In this talk I will discuss application of marked point processes in real time towards defining bivariate proportional hazard models and estimation of their parameters in the presence of monotone and non-monotone censoring. The approach applies to both single and multi-type models.


WEDNESDAY May 21, South Hall 5607F, 3:15 PM, Refreshments served at 3:00 PM

Chi-hong Tseng, University of California, Los Angeles

Non-parametric Estimation of a Survival Function with Two-stage Design Studies

The two-stage design is popular in epidemiology studies and clinical trials due to its cost effectiveness. Typically, the first stage sample contains cheaper and possibly biased information, while the second stage validation sample consists of a subset of subjects with accurate and complete information. In this paper, we study estimation of a survival function with right-censored survival data from a two-stage design. A non-parametric estimator is derived by combining data from both stages. We also study its large sample properties and derive pointwise and simultaneous confidence intervals for the survival function. The proposed estimator effectively reduces the variance and finite-sample bias of the Kaplan-Meier estimator solely based on the second stage validation sample. Finally, we apply our method to a real data set from a medical device postmarketing surveillance study.
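For reference, the baseline that the abstract compares against, the Kaplan-Meier (product-limit) estimator for right-censored data, can be sketched as follows. This is the standard single-sample estimator, not the two-stage estimator proposed in the talk; the function name `kaplan_meier` and the toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : observed follow-up times
    events : 1 if the event was observed at that time, 0 if right-censored
    Returns a list of (t, S(t)) at each distinct observed event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


# Toy data: the subject with time 2 is censored.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Censored subjects contribute to the risk set up to their censoring time but never trigger a drop in the curve, which is why a small validation sample can yield a noisy estimate.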


 
 
Statistics & Applied Probability
University of California
Santa Barbara, California 93106-3110
(805) 893-2129
South Hall 5607A