Yuedong Wang (UCSB)
Title:
Nonparametric Nonlinear Regression Models
Abstract: Almost all of the current nonparametric regression
methods such as smoothing splines, generalized additive models and
varying coefficients models assume a linear relationship when
nonparametric functions are regarded as parameters. In this talk
we present a general class of nonparametric nonlinear models that
allow nonparametric functions to act nonlinearly. They arise in
many fields as either theoretical or empirical models. We propose
new estimation methods based on an extension of the Gauss-Newton
method to infinite dimensional spaces and the backfitting
procedure. We extend the generalized cross validation and the
generalized maximum likelihood methods to estimate smoothing
parameters. Connections between nonlinear nonparametric models and
nonlinear mixed effects models are established. Approximate
Bayesian confidence intervals are derived for inference. We will
also present a user-friendly R function for fitting these models.
The methods will be illustrated using two real data
examples.
WEDNESDAY
October 15, South Hall 5607F, 3:15 PM, Refreshments served at 3:00
PM
Professor Sreenivas Jammalamadaka (UCSB)
Title:
Directional Statistics - what is it?
Abstract: The talk will provide a general introduction to this novel area of
statistics, where the observations are directions. After discussing some
applications, new descriptive measures as well as statistical models will be
introduced for such data. Problems of inference will be briefly outlined.
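As a small illustration of why directions need their own descriptive measures, the sketch below computes the standard circular mean direction and mean resultant length. It is not taken from the talk; the function name and the sample angles are hypothetical, chosen only to show that the arithmetic mean of angles can point the wrong way.

```python
import math


def circular_mean(angles_deg):
    """Return (mean direction in degrees, mean resultant length).

    Each angle is treated as a unit vector; the mean direction is the
    direction of the vector sum, reduced modulo 360.
    """
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    mean_dir = math.degrees(math.atan2(s, c)) % 360.0
    resultant_length = math.hypot(s, c) / len(angles_deg)
    return mean_dir, resultant_length


# Hypothetical sample: two directions on either side of north (0 degrees).
angles = [350.0, 20.0]
arithmetic_mean = sum(angles) / len(angles)  # points roughly south
mean_dir, resultant_length = circular_mean(angles)
print(arithmetic_mean, mean_dir, resultant_length)
```

A mean resultant length near 1 indicates tightly clustered directions, while a value near 0 indicates directions spread around the circle.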
WEDNESDAY
October 29, South Hall 5607F, 3:15 PM, Refreshments served at 3:00
PM
Greg Ridgeway
Senior Statistician
Acting Director, RAND Safety & Justice Research Program
RAND Corporation, Santa Monica, CA
Title:
Racial Profiling Analysis
Abstract: Several studies and high profile incidents around the
nation involving police and minorities have brought the issue of
racial profiling to national attention. While civil rights issues
continue to arise in other areas such as offers of employment, job
promotions, and school admissions, the issue of race disparities
in traffic stops seems to have garnered much attention in recent
years. Many communities, and at times the U.S. Department of
Justice, have asked law enforcement agencies to collect and
analyze data on all traffic stops. Data collection efforts,
however, so far have outpaced the development of methods that can
isolate the effect of race bias on officers' decisions to stop,
cite, or search motorists.
In this talk I will describe a test for detecting race bias in the
decision to stop a driver that does not require explicit, external
estimates of the driver risk set. Second, I'll describe an
internal benchmarking methodology for identifying potential
problem officers.
Lastly, I will describe methods for assessing racial disparities
in citations, searches, and stop duration. I will present results
from my studies of the Oakland (CA), Cincinnati, and New York City
Police Departments.
Bio: Greg Ridgeway (Ph.D. Statistics, University of Washington,
Seattle) is a Senior Statistician at the RAND Corporation in Santa
Monica, CA and is the Acting Director of RAND's Safety and Justice
Research Program and Director of RAND's Center on Quality
Policing, charged with managing RAND's portfolio of work on
policing, crime prevention, courts, corrections, and public and
occupational safety. His applied research has addressed illegal
firearm markets, gang formation, drug treatment programs, racial
profiling, and policing. In 2005, he received a commendation from
the ATF Los Angeles Field Division and the Attorney General of
California for "Contributions to Reducing Firearms Related
Crimes in Los Angeles." In 2007 his paper with Jeff Grogger on a test
for racial bias in traffic stops received the American Statistical
Association's Outstanding Statistical Application Award.
WEDNESDAY
November 5, South Hall 5607F, 3:15 PM, Refreshments served at 3:00
PM
Michael Ludkovski (UCSB)
Title: Optimal Risk Sharing under Distorted Probabilities
Abstract: We study optimal risk sharing among n agents endowed with distortion
risk measures. Risk sharing under third-party constraints is also considered. We obtain an explicit formula for Pareto optimal
allocations. In particular, we find that a stop-loss or deductible risk
sharing is optimal in the case of two agents and several common distortion functions. This extends a recent result of Jouini et al. (2006)
to the problem with unbounded risks and market frictions.
In the first part of my talk I will give a brief survey of distortion risk measures and their relation to other risk preferences. I will then
discuss recent research and open problems.
WEDNESDAY
November 12, South Hall 5607F, 3:15 PM, Refreshments served at 3:00
PM
Qing Zhou (UCLA)
Title: Reconstructing the Energy Landscape of a Distribution from Monte
Carlo Samples
Abstract:
Defining the energy function as the negative logarithm of the density, we
explore the energy landscape of a distribution via the tree of sublevel
sets of its energy. This tree represents the hierarchy among the connected
components of the sublevel sets. We propose ways to annotate the tree so
that it provides information on both topological and statistical aspects
of the distribution, such as the local energy minima (local modes), their
local domains and volumes, and the barriers between them. We develop a
computational method to estimate the tree and reconstruct the energy
landscape from Monte Carlo samples simulated at a wide energy range of a
distribution. This method can be applied to any arbitrary distribution on
a space with defined connectedness. We test the method on multimodal
distributions and posterior distributions to show that our estimated trees
are accurate compared to theoretical values. When used to perform Bayesian
inference of DNA sequence segmentation, this approach reveals much more
information than the standard approach based on marginal posterior
distributions.
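The idea of sublevel sets of the energy can be made concrete with a one-dimensional toy case. The sketch below is not the speaker's method: it uses a hypothetical, unnormalized two-component Gaussian mixture evaluated on a grid (rather than Monte Carlo samples) and counts the connected components of the sublevel set at two energy levels, one below and one above the barrier between the modes.

```python
import math


def energy(x):
    """Energy = negative log of an unnormalized two-mode mixture density."""
    density = (0.5 * math.exp(-0.5 * (x + 2.0) ** 2)
               + 0.5 * math.exp(-0.5 * (x - 2.0) ** 2))
    return -math.log(density)


def count_components(energies, level):
    """Count maximal runs of consecutive grid points with energy <= level.

    On a 1-D grid these runs are the connected components of the
    sublevel set {x : energy(x) <= level}.
    """
    count = 0
    inside = False
    for e in energies:
        if e <= level:
            if not inside:
                count += 1
            inside = True
        else:
            inside = False
    return count


grid = [-6.0 + 0.01 * i for i in range(1201)]  # covers [-6, 6]
energies = [energy(x) for x in grid]
barrier = energy(0.0)        # energy at the saddle between the two modes
min_energy = min(energies)   # energy at the local minima (the modes)

below_barrier = count_components(energies, (min_energy + barrier) / 2.0)
above_barrier = count_components(energies, barrier + 0.5)
print(below_barrier, above_barrier)
```

Below the barrier the sublevel set splits into one component per mode; above it the components merge, which is the branching recorded by the tree.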
WEDNESDAY
November 19, South Hall 5607F, 3:15 PM, Refreshments served at 3:00
PM
Ping Ma
Assistant Professor, Department of Statistics
Beckman Fellow, Center for Advanced Study
Faculty Member, Institute for Genomic Biology
University of Illinois at Urbana-Champaign
Title:
A Journey to the Center of the Earth
Abstract: At a depth of ~2890 km, the core-mantle boundary (CMB) separates turbulent
flow of liquid metals in the outer core from slowly convecting, highly
viscous mantle silicates. The CMB marks the most dramatic change in dynamic
processes and material properties in our planet, and accurate images of the
structure at or near the CMB -- over large areas -- are crucially important
for our understanding of present day geodynamical processes and the
thermo-chemical structure and history of the mantle and mantle-core system.
In addition to mapping the CMB we need to know if other structures exist
directly above or below it, what they look like, and what they mean (in
terms of physical and chemical material properties and geodynamical
processes). Detection, imaging, (multi-scale) characterization, and
understanding of structure (e.g., interfaces) in this remote region have
been -- and are likely to remain -- a frontier in cross-disciplinary
geophysics research. We will discuss the statistical problems and challenges
in imaging the CMB through the generalized Radon transform.
THURSDAY
November 20, South Hall 5607F, 12:30 PM, Refreshments served at 12:15 PM
James Hardin, Department of Epidemiology and Biostatistics at the University of
South Carolina
Title:
An Overview of the Sandwich Variance Estimator
Abstract: We will examine the history, development, players, and
application of the so-called sandwich estimate of the variance. In
describing this estimator, we pay attention to the applications that
have appeared in the literature and examine the nature of the problems
for which this estimator is used. Examples will be shown to highlight
the robustness property and a detailed derivation for two stage models
will highlight the relationship to the Murphy-Topel variance estimate.
We briefly discuss various adjustments to the estimate for use with
small samples and the interpretation of results. This
discussion will include some mathematical details, but will focus
largely on the interpretation for application in research.
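For readers unfamiliar with the estimator, the sketch below shows its simplest form for a straight-line least-squares fit: the "bread" is the inverse of X'X and the "meat" is the sum of squared residuals times the outer products of the design rows (often called HC0). This is an illustrative assumption about the basic case, not material from the talk; the function names and the three data points are hypothetical.

```python
def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def ols_with_sandwich(x, y):
    """Fit y = b0 + b1*x by least squares.

    Returns (b0, b1, classical covariance, sandwich covariance), where
    the sandwich covariance is bread * meat * bread (HC0, no
    small-sample adjustment).
    """
    n = len(x)
    sx = sum(x)
    sxx = sum(xi * xi for xi in x)
    det = n * sxx - sx * sx
    # Bread: inverse of X'X for design rows (1, x_i).
    bread = [[sxx / det, -sx / det], [-sx / det, n / det]]
    sy = sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    b0 = bread[0][0] * sy + bread[0][1] * sxy
    b1 = bread[1][0] * sy + bread[1][1] * sxy
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    # Meat: sum of e_i^2 * (1, x_i)(1, x_i)'.
    m00 = sum(e * e for e in resid)
    m01 = sum(e * e * xi for e, xi in zip(resid, x))
    m11 = sum(e * e * xi * xi for e, xi in zip(resid, x))
    meat = [[m00, m01], [m01, m11]]
    sandwich = matmul2(matmul2(bread, meat), bread)
    # Classical estimate assumes a common error variance.
    sigma2 = m00 / (n - 2)
    classical = [[sigma2 * v for v in row] for row in bread]
    return b0, b1, classical, sandwich


# Hypothetical toy data.
b0, b1, classical, sandwich = ols_with_sandwich([0.0, 1.0, 2.0],
                                                [0.0, 1.0, 5.0])
print(b1, classical[1][1], sandwich[1][1])
```

The two variance estimates for the slope differ because the sandwich form does not assume a common error variance across observations.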
WEDNESDAY
January 7, South Hall 5607F, 3:15 PM, Refreshments served at 3:00
PM
Dr. Daniel Merl (Duke University Department of Statistical Science)
Title:
Nonparametric mixtures of nonparametric mixtures for detecting cell subtypes in flow cytometry
Abstract: Flow cytometry is a high throughput experimental methodology for
measuring the expression of surface proteins on hundreds of thousands
to millions of individual cells. Identification of distinct cellular
subtypes on the basis of these multivariate expression patterns plays
an important role in adjuvant vaccine design, for which the goal is to
elicit the strongest possible immune response. Due to the sparse and
highly non-Gaussian nature of flow cytometric data, identification and
quantification of cellular subtypes has traditionally (and perhaps
astonishingly) been accomplished through manual gating based on 2-d
projections. Bayesian nonparametrics provides a flexible,
model-based, predictive framework for multivariate non-Gaussian
density estimation and classification. However, most existing
nonparametric methods assume the fundamental mixture components to be
of standard distributional forms that are individually
insufficient to describe variation in the cell subtypes. I present a
novel hierarchical mixture model, a nonparametric mixture of
nonparametric mixtures, that enables automatic registration of an
unknown number of non-Gaussian components, each of which is itself a
mixture of an unknown number of basis distributions. I will discuss
inferential methods capable of exploiting high performance computing
clusters, and apply the methodology to assess treatment efficacy in an
adjuvant vaccine trial data set.
FRIDAY January 9, South Hall 5607F, 2-3 PM, Refreshments served at 3:00
PM
Dr. Donatello Telesca (Department of Biostatistics at the University of Texas, M.D. Anderson Cancer Center)
Title:
Modeling Dependent Expression Data
Abstract: We consider modeling dependent high throughput expression data arising
from different molecular interrogation technologies. Dependence
between molecules is introduced via the explicit consideration of
informative prior information associated with available pathways,
representing known biochemical regulatory processes. The important
features of the proposed methodology are the ease of representing typical
prior information on the nature of dependencies, model-based parsimonious
representation of the signal as an ordinal outcome, and the use of
coherent probabilistic schemes over both the structure and strength of
the conjectured dependencies. As part of the inference we reduce the
recorded data to a trinary response representing underexpression,
average expression and overexpression. Inference in the described
model is implemented through Markov chain Monte Carlo (MCMC)
simulation, including posterior simulation over conditional dependence
and independence. The latter involves a variable-dimensional parameter
space, for which we use a reversible jump MCMC scheme. The motivating
example is data from ovarian cancer patients.
MONDAY
January 12, South Hall 5607F, 2PM-3PM, Refreshments served at 3:00
PM
Mr. Jarad Niemi (Duke University)
Title:
Computational approaches for general state-space models
Abstract: State-space models are widely used for analysis of time
series data in fields such as biology, finance, epidemiology, and
others. In a Bayesian context, simultaneous state and fixed parameter
estimation are performed using either Markov chain Monte Carlo if the
data are collected in batch or Sequential Monte Carlo when the data
are analyzed in real-time. I will discuss developments in these fields
that exploit the use of mixtures of distributions. In the MCMC
context, filtering and smoothing are accomplished using mixtures that
provide Metropolis proposals for the entire latent state series. In
the SMC context, approximating filtered distributions for fixed
parameters provides a means to regenerate parameter draws and
combining this with sufficient statistic methods enriches the class
for which those methods can be used. We draw on motivating examples
from biology and finance to illustrate the methodologies.
WEDNESDAY
January 14, South Hall 5607F, 2PM-3PM, Refreshments served at 3:00
PM
Dr. Elizabeth C. Mannshardt-Shamseldin (Duke University)
Title:
Asymptotic Multivariate Kriging Using Estimated Parameters with Bayesian
Prediction Methods for Non-linear Predictands
Abstract: The need often arises in spatial settings to perform a data transformation to
achieve a stationary process and/or variance stabilization. The transformation
may be a non-linear transformation, and the desired predictand may be
multivariate in that it is necessary to interpolate predictions at multiple
sites. We assume the underlying spatial model is a Gaussian random field with
a parametrically specified covariance structure, but that the predictions of
interest are for multivariate nonlinear functions of the Gaussian field. This
induces new complications in the spatial interpolation known as kriging. For
instance, it is no longer possible to derive the predictive distribution
function in closed form. A difficulty that arises with traditional kriging
methods is the fact that the standard formula for the mean squared prediction
error does not take into account the estimation of the covariance parameters.
This generally leads to underestimated prediction errors, even if the model is
correct. Smith and Zhu (2004) establish a second-order expansion for
predictive distributions in Gaussian processes with estimated covariances.
Here, we establish a similar expansion for multivariate kriging with non-linear
predictands.
Bayesian methods provide a possible resolution to errors encountered through
employing frequentist estimation techniques for obtaining spatial parameters.
An important property of Bayesian methods is the ability to deal with the
uncertainty in a particular model. Here we explore a Laplace approximation to
Bayesian techniques that provides an alternative to common iterative Bayesian
methods, such as Markov Chain Monte Carlo. The main results are asymptotic
formulae for a general, non-linear predictand for the expected length of a
Bayesian prediction interval, which has possible applications in network
design, and for the coverage probability bias, which can lead to the
development of a matching prior.
WEDNESDAY
January 14, South Hall 5607F, 3:15 PM, Refreshments served at 3:00
PM
Prof. David Aldous (UC Berkeley)
Title:
When Knowing Early Matters: Gossip, Percolation and Nash Equilibria
Abstract: Continually arriving information is communicated through a network of $n$ agents, with the value of information to the $j$'th recipient being a decreasing function of $j/n$, and communication costs paid by recipient. Regardless of details of network topology and communication costs, the social optimum policy is to communicate arbitrarily slowly. But selfish agent behavior leads to Nash equilibria which (in the $n \to \infty$ limit) may be efficient (Nash payoff $=$ social optimum payoff) or wasteful ($0 < $ Nash payoff $<$ social optimum payoff) or totally wasteful (Nash payoff $=0$). We study the cases of the complete network (constant communication costs between all agents), the grid with only nearest-neighbor communication, and the grid with communication cost a function of distance. Many variant problems suggest themselves.
The main technical tool is analysis of the associated first passage percolation process (representing spread of one item of information) and in particular its "window width", the time interval during which most agents learn the item.
THURSDAY
January 15, South Hall 5607F, 3:30-4:30 PM, Refreshments served at 3:15
PM
Dr. Michael Lavine (University of Massachusetts Amherst)
Title:
Subjective Likelihood
Abstract: We describe a problem in physical oceanography
in which we want to create a spatio-temporal model. However, there is no
plausible sampling density p(data|parameter). We solved the problem by
presenting simple data sets to the expert to learn how she changes her
prior into her posterior. From these simple data sets we infer the
likelihood function. (There is still no p(data|parameter).) Then we
apply that likelihood function to the large, spatio-temporal data set.
Reference: "Subjective Likelihood for the Assessment of Trends in the
Ocean's Mixed-Layer Depth, with Comments and Rejoinder", JASA (2007), 102,
771-787.
The interest lies in the foundations: how we handled the problem of no
p(data|parameter).
WEDNESDAY
January 21, South Hall 5607F, 3:15 PM, Refreshments served at 3:00
PM
Dr. Amanda Hering (Texas A&M University)
Title:
Powering up with Space-Time Wind Forecasting
Abstract: The technology to harvest electricity from wind energy is now
advanced enough to make entire cities powered by it a reality.
High-quality short-term forecasts of wind speed are vital to making this
a more reliable energy source. Gneiting et al. (2006) have introduced a
model for the average wind speed two hours ahead based on both spatial
and temporal information. The forecasts produced by this model are
accurate, and subject to accuracy,
the predictive distribution is sharp, i.e., highly concentrated around its
center. However, this model is split into nonunique regimes based on the
wind direction at an off-site location. This paper both generalizes
and improves upon this model by treating wind direction as a circular
variable and including it in the model. It is robust in many experiments,
such as predicting at new locations. We compare this with the more common
approach of modeling wind speeds and directions in the Cartesian space and
use a skew-t distribution for the errors. The quality of the predictions
from all of these models can be more realistically assessed with a loss
measure that depends upon the power curve relating wind speed to power
output. This proposed loss measure yields more insight into the true value
of each model’s predictions.
TUESDAY Feb. 17, South Hall 5607F, 3:30PM-4:30PM, Refreshments served at 3:15
PM
Prof. Haya Kaspi (Technion, Israel Institute of Technology, Haifa, Israel)
Title:
Measure Valued Processes in the Asymptotic Approximation of Many Servers Queues
Abstract: The lecture focuses on queueing systems with many servers serving in parallel, where
the arrival process into the system is a quite general counting process, the service times of various
customers are i.i.d. random variables with general distribution and are independent of the arrival process,
and the number of servers N is large. A primary motivation for studying such systems is that they arise
as models for telephone call centers. While most research to date on such systems assumes that the
service time is exponentially distributed, a fact which makes the number of customers in the system
a Markov process, statistical analyses of large service stations performed recently have shown that the
service times are typically non-exponential but rather lognormal or Weibull distributed. An extension of
the exponentially distributed service times to phase-type service distributions by Puhalskii and Reiman
leads to a Markov process with a finite dimensional state descriptor. The general service time assumption
leads us to represent the Markovian dynamics of the system in terms of a process that describes the total
number of customers in the system, as well as a measure-valued process that keeps track of the "ages"
(the time in service) of the various customers in service. In the call center application, it is natural
to consider an asymptotic approximation in the limit, as the number of servers and the arrival rate go
to infinity and the mean traffic intensity increases to 1, in such a way that the limiting probability of
a positive queue is strictly between 0 and 1. This asymptotic regime is often referred to as the QED
(Quality and Efficiency Driven) regime that was introduced in the seminal paper by Halfin and Whitt
in 1981, and dealt with such systems with exponentially distributed service time. Fluid (first order) and
diffusion (second order) approximations of the pair consisting of the number of customers in the system
and the measure-valued process described above, in heavy traffic as $N \to \infty$, will be discussed in this
lecture.
WEDNESDAY March 4, South Hall 5607F, 3:15 PM, Refreshments
served at 3:00 PM
Prof. Henry Schellhorn (Claremont Graduate University)
Title:
An Algorithm for the Pricing of Path-Dependent American Options Using
Malliavin Calculus
Abstract: We propose a recursive scheme to calculate backward the values of conditional
expectations of functions of path values of Brownian motion. This scheme is
based on the Clark-Ocone formula in discrete time. We construct an algorithm
based on our scheme to efficiently calculate the price of American options
on securities with path-dependent payoffs. Our algorithm can be combined
with regression-based Monte Carlo methods, like the Longstaff-Schwartz
algorithm. In this case, our algorithm remedies the decrease of performance
experienced by regression-based methods when the number of basis functions,
or regressands, needs to be quite large, because of path-dependence.
WEDNESDAY April 15, South Hall 5607F, 3:15 PM, Refreshments
served at 3:00 PM
Prof. Guillaume Bonnet (University of California Santa Barbara)
Title: Nonlinear Stochastic PDEs for Highway Traffic Flows: Theory and Calibration to Traffic Data
Abstract:
Highway traffic flows are generally modeled by partial differential
equations (PDEs). These models are used by traffic engineers for road
design, planning or management. However, they often fail to capture
important features of empirical traffic flow studies, particularly at
small scales. In this talk, I will propose a fairly simple
stochastic model in the form of a nonlinear stochastic partial
differential equation (SPDE) with random coefficients driven by a Poisson
random measure. I will discuss the well-posedness of the proposed
equation as well as the corresponding inverse problem that I will
illustrate by its calibration to high resolution traffic data from
Highway 101 in Los Angeles.
WEDNESDAY April 22, South Hall 5607F, 3:15 PM, Refreshments
served at 3:00 PM
Prof. Douglas Steigerwald (UCSB Economics Department)
Title: SUBSAMPLE TESTS FOR REGIME SWITCHING
Abstract:
Models of regime switching do not satisfy the standard
assumptions that govern the large sample behavior of
test statistics. Research focuses on likelihood ratio
tests and the most recent advances, due to Cho and
White (2007), yield a limit distribution for the
likelihood ratio test that depends on specified
intervals for the coefficients that vary over regimes.
As the limit distribution is not standard, Cho and
White obtain critical values from a numeric
approximation that requires explicit specification
of these coefficient intervals. As researchers
may lack knowledge of the correct coefficient interval,
we study how misspecification of the interval impacts
numerically approximated critical values and the
resultant power of likelihood ratio tests. We find
that the power of likelihood ratio tests is sensitive
to the coefficient interval specified in the numeric
approximation from Cho and White. To eliminate power
losses that arise from coefficient interval
misspecification we use subsamples, which do not
require explicit specification of the coefficient
interval. We compare the likelihood ratio test,
based on subsampled critical values, with two other
tests and find large size-adjusted power gains for
the likelihood ratio test. (Joint work between Prof. Douglas Steigerwald and Benjamin Hansen.)
WEDNESDAY May 6, South Hall 5607F, 3:15 PM, Refreshments
served at 3:00 PM
Prof. F. Gregory Ashby (Department of Psychology,
University of California Santa Barbara)
Title:
A Neurocomputational Theory of Context Learning During Skill Acquisition
Abstract: When learning a new skill, it is vital that we also learn the context in which that skill is relevant. In this talk I will describe a neurocomputational theory of how such context learning is mediated in the brain. Skill learning is known to depend on a major subcortical structure called the striatum. The new theory proposes that a key component of context learning during skill acquisition is provided by cholinergic interneurons in the striatum known as TANs (i.e., Tonically Active Neurons). Evidence suggests that the TANs exert a tonic inhibitory influence over striatal output neurons that prevents the execution of any striatal-dependent action. The TANs learn to pause to rewarding contexts, and this pause releases the striatal output neurons from inhibition, thereby facilitating the learning and expression of striatal-dependent behaviors. When the context changes, the TANs cease to pause, thereby protecting striatal learning from decay in non-rewarding environments. In the computational version of this theory, neural units in the relevant brain regions are each modeled by two coupled differential equations: one that models fast changes in membrane potential and a second that models slow changes in the activation and inactivation of various intracellular ion channels. Learning is modeled via a biologically detailed form of reinforcement learning. The model accounts for some key single-cell recording and behavioral results. For example, the model accounts for a number of well-known learning phenomena (e.g., fast reacquisition following extinction, spontaneous recovery), and offers new interpretations of some classic societal problems (e.g., why bad habits are so difficult to break; why recidivism from drug-dependency treatment programs is so high).
WEDNESDAY May 20, South Hall 5607F, 3:15 PM, Refreshments
served at 3:00 PM
Prof. Marco Frittelli (University of Milano, Italy)
Title: Conditional certainty equivalent and representation of risk measures
Abstract: In the framework of dynamic indifference pricing, we study the conditional version of the classical notion of the certainty equivalent. This concept leads to the investigation of quasi convex maps and their dual representation.