Wednesday, May 18, Engineering Sciences Building (ESB) 1001, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: Emmanuel Candès (Stanford University)
Title: Around the Reproducibility of Scientific Research in the Big Data Era: The Knockoff Filter
Abstract: The big data era has created a new scientific paradigm: collect data first, ask questions later. When the universe of scientific hypotheses that are being examined simultaneously is not taken into account, inferences are likely to be false. The consequence is that follow-up studies are likely not to be able to reproduce earlier reported findings or discoveries. This reproducibility failure bears a substantial cost, and this talk is about new statistical tools to address this issue. In the last two decades, statisticians have developed many techniques for addressing this look-everywhere effect, whose proper use would help in alleviating the problems discussed above. This lecture will discuss some of these proposed solutions, including the Benjamini-Hochberg procedure for false discovery rate (FDR) control and the knockoff filter, a method which reliably selects which of the many potentially explanatory variables of interest (e.g. the absence or not of a mutation) are indeed truly associated with the response under study (e.g. the log fold increase in HIV-drug resistance). This is joint work with Rina Foygel Barber (University of Chicago).
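As a rough illustration of the FDR-control idea mentioned in the abstract, the following is a minimal sketch of the standard Benjamini-Hochberg step-up rule. It is not taken from the talk; the function name `benjamini_hochberg` and the default level `q=0.1` are illustrative assumptions, and the usual FDR guarantee relies on conditions (such as independent p-values) that the sketch does not check.

```python
import numpy as np

def benjamini_hochberg(p_values, q=0.1):
    """Return a boolean mask of hypotheses rejected at target FDR level q.

    Hypothetical helper for illustration: sort the p-values, find the
    largest k with p_(k) <= q * k / m, and reject the k smallest.
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)              # indices that sort the p-values
    sorted_p = p[order]
    thresholds = q * np.arange(1, m + 1) / m   # step-up thresholds q*k/m
    below = np.nonzero(sorted_p <= thresholds)[0]
    rejected = np.zeros(m, dtype=bool)
    if below.size > 0:
        k = below[-1]                  # largest (0-based) index passing its threshold
        rejected[order[: k + 1]] = True
    return rejected

if __name__ == "__main__":
    pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(pvals, q=0.05))
```

Note that the rule is "step-up": a p-value that misses its own threshold can still be rejected if a larger p-value further down the sorted list passes its threshold.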
Wednesday, May 11, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: Sang-Yun Oh (PSTAT-UCSB)
Title: Extreme scale graphical model selection and an application in data-driven discovery
Abstract: Graphical models are a useful framework for representing conditional dependencies in multivariate data. Model selection in this setting, also called structure learning in computer science, refers to recovering the conditional dependency structure, often by solving an optimization problem. This talk will give an overview of two ongoing efforts related to graphical model selection. The first part of the talk will describe our proposed approach to “scale up” an existing model selection method to work in extremely high-dimensional settings. Our approach is a combination of a proximal gradient-based optimization algorithm and a communication-avoiding sparse-dense matrix multiplication algorithm. The linear algebra routine is specifically designed for distributed-memory parallel computing environments to further increase scalability. The second part will describe a data-driven discovery process for parcellating the brain into functional regions, in which the proposed algorithm plays an important role. Preliminary results from analysis of resting-state functional MRI scans will also be discussed.
Wednesday, May 4, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: Michael Ludkovski & Jimmy Risk (PSTAT-UCSB)
Title: Gaussian Process Models in Actuarial Science
Abstract: Gaussian processes (GP) offer a flexible framework for nonparametric regression. Originating in the machine learning context, they are quickly becoming the tool of choice for a variety of response surface modeling problems. This introductory talk will consist of two parts and two speakers. In the first part, Mike will give an introduction to GP regression and discuss its advantages compared to traditional tools. In particular, we will discuss smoothing of noisy observations, and trend/residuals decomposition. A motivating example will be presented for mortality rate estimation and forecasting. In the second part, Jimmy will focus on the problem of efficient valuation of deferred annuities under stochastic mortality. This is an example of a nested simulation problem, where GP surrogates offer an effective, data-driven approximation strategy.
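For readers unfamiliar with GP regression, the smoothing of noisy observations mentioned above can be sketched with the standard posterior formulas. This is a generic textbook-style example, not the speakers' code; the squared-exponential kernel, the names `rbf_kernel` and `gp_posterior`, and all hyperparameter defaults are assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(x1, x2, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    diff = x1[:, None] - x2[None, :]
    return signal_var * np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise_var=0.1,
                 length_scale=1.0, signal_var=1.0):
    """Posterior mean and pointwise variance of the latent function at x_test."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_test = np.asarray(x_test, dtype=float)
    # Covariance of the noisy training observations
    K = rbf_kernel(x_train, x_train, length_scale, signal_var)
    K += noise_var * np.eye(x_train.size)
    K_star = rbf_kernel(x_test, x_train, length_scale, signal_var)
    # Cholesky factorization for a numerically stable solve
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_star @ alpha
    v = np.linalg.solve(L, K_star.T)
    var = signal_var - np.sum(v ** 2, axis=0)
    return mean, var

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 5.0, 20)
    y = np.sin(x) + 0.1 * rng.standard_normal(x.size)   # noisy observations
    mean, var = gp_posterior(x, y, np.array([2.5, 10.0]), noise_var=0.01)
    print(mean, var)
```

The posterior variance grows toward the prior variance `signal_var` away from the data, which is one way a GP surrogate reports where it is extrapolating.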
Wednesday, April 27, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: Emanuele Taufer (University of Trento)
Title: Goodness-of-fit tests for multivariate stable distributions based on the empirical characteristic function
Abstract: We consider goodness-of-fit testing for multivariate stable distributions. The proposed test statistics exploit a characterizing property of the characteristic function of these distributions and are affine invariant and consistent under some conditions. The asymptotic distribution is derived under the null hypothesis as well as under local and fixed alternatives. Conditions for an asymptotic null distribution free of parameters are provided. Affine invariance of the test statistics and computational issues are discussed in detail. Simulations show that with proper choice of the user parameters involved, the new tests lead to powerful omnibus procedures for the problem at hand. Joint work with Simos Meintanis and Joseph Ngatchou-Wandji.
Wednesday, April 20, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: Sudipto Banerjee (UCLA)
Title: On Dynamic Nearest-Neighbor Gaussian Process Models for High-Dimensional Spatial-Temporal Datasets
Abstract: With the growing capabilities of Geographic Information Systems (GIS) and user-friendly software, statisticians today routinely encounter geographically referenced data containing observations from a large number of spatial locations and time points. Over the last decade, hierarchical spatial-temporal process models have become widely deployed statistical tools for researchers to better understand the complex nature of spatial and temporal variability. However, fitting hierarchical spatial-temporal models often involves expensive matrix computations with complexity increasing in cubic order for the number of spatial locations and temporal points. This renders such models infeasible for large data sets. In this talk, I will present two approaches for constructing well-defined spatial-temporal stochastic processes that accrue substantial computational savings. Both these processes can be used as "priors" for spatial-temporal random fields. The first approach constructs a low-rank process operating on a lower-dimensional subspace. The second approach constructs a Nearest-Neighbor Gaussian Process (NNGP) that can be exploited as a dimension-reducing prior embedded within a rich and flexible hierarchical modeling framework to deliver exact Bayesian inference. Both these approaches lead to Markov chain Monte Carlo algorithms with floating point operations (flops) that are linear in the number of spatial locations (per iteration). We compare these methods and demonstrate their use in inferring the spatial-temporal distribution of ambient air pollution in continental Europe using spatial-temporal regression models with chemistry transport models. Joint work with Abhirup Datta and Andrew O. Finley.
Wednesday, April 13, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: Mathieu Laurière (Université Paris Diderot)
Title: Mean field type control with congestion
Abstract: The theory of mean field type control, developed mainly by Bensoussan, Frehse and Yam, aims at describing the behaviour of a large number of interacting agents using a common feedback. A type of problem that has raised a lot of interest recently concerns congestion effects. These problems model situations in which the cost of displacement of the agents increases in the regions where the density is large (as, for instance, in crowd motion). I will explain how we obtain existence and uniqueness of suitably defined weak solutions for a system of partial differential equations arising in this setting. The solutions are characterized as the optima of two optimal control problems in duality. Moreover, to solve this problem numerically, a method based on an augmented Lagrangian approach will be presented, and numerical results will be provided.
Wednesday, April 6, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: György Terdik (University of Debrecen, Hungary)
Title: A new covariance function for spatio-temporal data analysis with applications
Abstract: Time series methods are often used for the analysis of data coming from various fields, such as economics, finance, environmental and medical sciences, atmospheric pollution, and sensor networking. One of the important objectives in time series analysis is to develop methods for understanding the underlying dynamics of the given data and, based on this knowledge, to find suitable linear (or nonlinear) time series models which could be used for obtaining optimal forecasts. There is an extensive literature on model building and forecasting of time series. Many times the data we come across are not only a function of time t, but also a function of the spatial location. The problem of interest here is that of finding an estimate of unobservable data at a given location when spatial data at neighboring locations are available. The estimation requires knowledge of the spatial covariance function (or a suitable spatial model). Our objective here is to obtain a class of spatio-temporal covariance functions and use the function obtained to compute optimal forecasts. To achieve these objectives, we use Discrete Fourier Transforms of the data rather than the data itself, which substantially reduces the number of arithmetic operations required to compute the statistics. Let us denote the measurement at time t∈Z, at the location s, by Z(s,t). We assume s∈R^2. Here Z is the set of integers and R is the real line. Suppose we have observations regularly collected at several locations, say m locations, and at n time points. Our first objective is to validate the data at a location, using the data we collected in its neighborhood. To achieve this objective, we need a spatio-temporal covariance function, and in this paper we derive an expression for such a function when the process satisfies a partial second order stochastic differential equation. In order to obtain this covariance function we assume the process is spatially and temporally second order stationary and also isotropic. Joint work with T. Subba Rao.
Keywords: Complex Stochastic Partial Differential Equations, Covariance Functions, Discrete Fourier Transforms, Spatio-Temporal Processes, Prediction (Kriging), Frequency Variogram.
Wednesday, March 9, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: Frank Norbert Proske (UiO)
Title: Strong Existence and higher order differentiability of stochastic flows of fractional Brownian motion driven SDE's with singular drift
Abstract: In this talk we present a new method for the construction of strong solutions of SDE's with merely integrable drift coefficients driven by a multidimensional fractional Brownian motion with Hurst parameter H < 1/2. Furthermore, we prove the rather surprising result of the higher order Fréchet differentiability of stochastic flows of such SDE's in the case of a small Hurst parameter. In establishing these results we use techniques from Malliavin calculus combined with new ideas based on a "local time variational calculus". We expect that our general approach can also be applied to the study of certain types of stochastic partial differential equations, such as stochastic conservation laws driven by rough paths.
Friday, March 4, South Hall 5607F, 2:00-3:30 p.m.; refreshments served at 1:45 p.m.
Speaker: Martin Lysy (University of Waterloo)
Title: Model comparison and assessment for single particle tracking in biological fluids
Abstract: State-of-the-art techniques in passive particle-tracking microscopy provide high-resolution path trajectories of diverse foreign particles in biological fluids. For particles on the order of 1 micron diameter, these paths are generally inconsistent with simple Brownian motion. Yet, despite an abundance of data confirming these findings and their wide-ranging scientific implications, stochastic modeling of the complex particle motion has received comparatively little attention. Even among posited models, there is virtually no literature on likelihood-based inference, model comparisons, and other quantitative assessments. In this article, we develop a rigorous and computationally efficient Bayesian methodology to address this gap. We analyze two of the most prevalent candidate models for 30 second paths of 1 micron diameter tracer particles in human lung mucus: fractional Brownian motion (fBM) and a Generalized Langevin Equation (GLE) consistent with viscoelastic theory. Our model comparisons distinctly favor GLE over fBM, with the former describing the data remarkably well up to the timescales for which we have reliable information.
Wednesday, March 2, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: S. Rao Jammalamadaka (UCSB)
Title: On the Robustness of Bayes Predictions in Linear Models
Abstract: We consider the prediction problem for linear regression models with elliptical or spherically symmetric errors, a special case of which is the multivariate t-distribution with heavy tails. It is shown that the Bayes prediction density under the elliptical errors assumption is exactly the same as that for normally distributed errors when the prior information is objective or in the conjugate family. Thus, assuming that the errors have a normal distribution when the true distribution is in fact elliptical will not lead to incorrect predictive inferences. This extends some earlier work of Arnold Zellner and others.
Wednesday, February 17, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: Joong-Ho (Johann) Won (Seoul National University)
Title: Computational approaches in data mining and portfolio selection
Abstract: This talk consists of two parts. In the first part, I will share my experience with the use of high-performance computing (HPC) in high-dimensional data mining problems, which are well known to be difficult both theoretically and computationally. I advocate parallelization as a practical solution to mitigate the computational difficulties, and show that a fair amount of parallelism can be achieved with little effort by using commodity HPC systems. Success and failure stories of adopting graphics processing units (GPUs) for the fused lasso sparse regression and Hadoop MapReduce for graph algorithms are discussed. In the second part, I will discuss a use of numerical optimization in financial portfolio selection problems in the presence of parameter uncertainty. Robust optimization is employed to explicitly incorporate a model of parameter uncertainty in the problem formulation and to optimize for the worst-case scenario. This part of the talk considers robust mean-variance portfolio selection involving a trade-off between the worst-case utility and the worst-case regret, or the largest difference between the best utility achievable under the model and that achieved by a given portfolio. I will show that while optimizing for the worst-case utility may yield an overly pessimistic portfolio, optimizing for the worst-case regret may result in a complete loss of robustness. The robust trade-off portfolio is a compromise between these two extremes, enabling more informative selections. I will show that, under a widely used ellipsoidal uncertainty model, the entire optimal trade-off curve can be found via solving a series of semidefinite programs (SDPs), which are computationally tractable. I then extend the model to handle a union of finitely many ellipsoids, and show that trade-off analysis under this quite general uncertainty model also reduces to a series of SDPs. For more general uncertainties, I propose an iterative algorithm based on the cutting-set method.
Wednesday, February 10, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: Amir Dembo (Stanford University)
Title: Walking within growing domains: recurrence versus transience
Abstract: When is simple random walk on growing in time d-dimensional domains recurrent? For domain growth which is independent of the walk, we review recent progress and related universality conjectures about a sharp recurrence versus transience criterion in terms of the growth rate. We compare this with the question of recurrence/transience for time varying conductance models, where Gaussian heat kernel estimates and evolving sets play an important role. We also briefly contrast such expected universality with examples of the rich behavior encountered when monotone interaction enforces the growth as a result of visits by the walk to the current domain's boundary. This talk is based on joint works with Ruojun Huang, Ben Morris, Yuval Peres and Vladas Sidoravicius.
Wednesday, January 20, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: Davar Khoshnevisan (University of Utah)
Title: Dissipation and High Disorder
Abstract: We consider the "parabolic Anderson model" with delta initial function. This is a linear, infinite system of stochastic differential equations that arises in a vast number of physical models. The solution of this system models, among other things, the particle density of an infinite system of independent random walks, which replicate [give birth] to new random walks according to a common [independent] space-time white noise environment, starting with one particle at the origin at time 0. We show that, when the underlying random walks move in 1 or 2 dimensions, the total number of particles vanishes as time goes to infinity. By contrast, in dimensions 3 or greater there is a phase transition: If the variance of the noise is sufficiently high, then the total number of particles vanishes; and if the noise variance is not sufficiently high, then the total number of particles tends to a non-trivial random variable. This talk is based on joint work with Le Chen, Michael Cranston, and Kunwoo Kim.
Wednesday, January 6, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: Nils Detering (University of Munich)
Title: Bootstrap percolation in inhomogeneous, directed random graphs and financial contagion
Abstract: Bootstrap percolation is a process that is used to model the spread of an infection on a given graph. In the model considered each vertex is equipped with an individual threshold. As soon as the number of infected neighbors exceeds that threshold, the vertex gets infected as well and remains so forever. We perform a thorough analysis of bootstrap percolation on a novel model of directed and inhomogeneous random graphs, where the distribution of the edges is specified by assigning two distinct weights to each vertex, describing the tendency of it to receive edges from or to send edges to other vertices. Under the assumption that the limiting degree distribution of the graph is integrable we determine the typical fraction of infected vertices. Our model allows us to study settings that were outside the reach of current methods, in particular the prominent case in which the degree distribution has an unbounded variance. Among other results, we quantify the notion of "systemic risk", that is, to what extent local adverse shocks can propagate to large parts of the graph through a cascade, and discover novel features that make graphs prone/resilient to initially small infections. We show how our results can be used to study default contagion in a financial network. Furthermore, we discuss several statistical aspects related to our model.
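The infection dynamics described in this abstract can be sketched on a small deterministic directed graph. This is only an illustration of the threshold rule, not the random graph model of the talk; the function name `bootstrap_percolation`, the edge-list input format, and the reading of "exceeds" as a strict inequality are assumptions.

```python
from collections import deque

def bootstrap_percolation(n, edges, thresholds, initially_infected):
    """Run threshold infection dynamics on a directed graph with vertices 0..n-1.

    edges: iterable of (u, v) pairs, meaning u can help infect v.
    thresholds[v]: v becomes infected once its number of infected
    in-neighbors strictly exceeds this value (assumed convention).
    Returns the final set of infected vertices.
    """
    out_neighbors = [[] for _ in range(n)]
    for u, v in edges:
        out_neighbors[u].append(v)

    infected = [False] * n
    infected_in_count = [0] * n   # infected in-neighbors seen so far
    queue = deque()
    for v in initially_infected:
        if not infected[v]:
            infected[v] = True
            queue.append(v)

    # Each infected vertex is processed once; infection is permanent.
    while queue:
        u = queue.popleft()
        for v in out_neighbors[u]:
            if infected[v]:
                continue
            infected_in_count[v] += 1
            if infected_in_count[v] > thresholds[v]:
                infected[v] = True
                queue.append(v)
    return {v for v in range(n) if infected[v]}

if __name__ == "__main__":
    edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
    print(bootstrap_percolation(4, edges, [0, 0, 1, 5], {0}))
```

In the example, vertex 2 needs two infected in-neighbors and is reached only after vertex 1 falls, while vertex 3 has too high a threshold to be infected, which mirrors how a local shock may or may not cascade.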
Monday, January 4, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: Ibrahim Ekren (ETH Zurich)
Title: Viscosity Solutions for Path-dependent PDEs
Abstract: In this talk, we define derivatives of functionals on the space of continuous paths and give an introduction to path-dependent partial differential equations (PPDEs). Since the space of continuous paths is not locally compact, we cannot rely on the theory of viscosity solutions for PDEs and need to develop a new approach. We focus on the path-dependent heat equation and link it to a control problem. We will also mention new developments, challenges and applications in this field. This talk is based on joint works with Christian Keller, Nizar Touzi and Jianfeng Zhang.
Wednesday, December 9, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: Daniel Lacker (Brown University)
Title: Mean field limits for stochastic differential games
Abstract: Mean field game (MFG) theory generalizes classical models of interacting particle systems by replacing the particles with decision-makers, making the theory applicable in economics and other social sciences. Most research so far has focused on the existence and uniqueness of Nash equilibria in a model which arises intuitively as a continuum limit (i.e., an infinite-agent version) of a given large-population stochastic differential game of a certain symmetric type. This talk discusses some recent results in this direction, particularly for MFGs with common noise, but more attention is paid to recent progress on a less well-understood problem: Given for each n a Nash equilibrium for the n-player game, in what sense if any do these equilibria converge as n tends to infinity? The answer is somewhat unexpected, and certain forms of randomness can prevail in the limit which are well beyond the scope of the usual notion of MFG solution. A new notion of weak MFG solutions is shown to precisely characterize the set of possible limits of approximate Nash equilibria of n-player games, for a large class of models.
Wednesday, December 2, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: Ian Duncan (PSTAT-UCSB)
Title: The Affordable Care Act at 5 years: an actuarial perspective
Abstract: The Affordable Care Act was passed in 2010 and fully implemented in 2014. Prof. Duncan was on the board of the Massachusetts Health Connector Authority, predecessor of the ACA, and was involved in both Massachusetts reform and the ACA implementation in Massachusetts. He continues to be involved with risk adjustment, one of the important actuarial aspects of the law. He will discuss the evolution of the ACA, its successes and some of the issues likely to emerge in future years and their implications for actuaries.
Wednesday, November 18, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: Susan Cassels (Geography-UCSB)
Title: Mathematical models to inform effective home-use HIV testing strategies for men who have sex with men
Abstract: The U.S. Food and Drug Administration (FDA) approved the first over-the-counter home-use HIV test in 2012. Public health departments have started to implement programs to increase their use; however, the potential impact of these tests on the HIV epidemic among men who have sex with men (MSM) is unknown. Home-use HIV tests may reduce HIV incidence if used by MSM who would otherwise not test or if they increase rates of testing, diagnosis and treatment. However, home-use tests may increase transmission if men replace clinic-based tests with home-use tests because the relatively long window period of available tests can result in false-negative tests during acute infection when HIV-infected persons are most infectious. The aim of this research is to inform public health approaches to promote safe and effective home-use HIV testing strategies for diverse populations of MSM. Using dynamic HIV transmission modeling, we find that if home-use HIV tests replace clinic-based testing, HIV prevalence may increase among Seattle MSM, even if home-use tests result in increased testing. Using data from two different epidemiologic settings in the U.S., Seattle and Atlanta, future work will use stochastic network models to estimate how different strategies of home-use HIV testing at the individual and partnership levels affect HIV incidence.
Wednesday, November 4, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: Tomoyuki Ichiba (PSTAT-UCSB)
Title: Walsh semimartingales and diffusions on metric graphs
Abstract: In this talk we shall discuss diffusions on metric graphs. We start with a change-of-variable formula of Freidlin-Sheu type for Walsh semimartingales on a star graph. In the diffusion case we characterize such processes via a martingale problem. As a consequence of folding/unfolding semimartingales, we obtain a system of degenerate stochastic differential equations and examine its solution. The stationary distribution, strong Markov property and related statistical problems are also discussed. Then we extend our considerations to diffusions on metric graphs. This talk is based on joint work with I. Karatzas, V. Prokaj and M. Yan.
Wednesday, October 28, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: Tomasz J. Kozubowski (Mathematics and Statistics-University of Nevada, Reno)
Title: Wrapping, mixing, and estimation for directional data
Abstract: Directional statistics is an important area, with applications ranging from biology, through earth sciences, to meteorology and medicine. In the first part of the talk, we present a general scheme of generating circular distributions through wrapping linear distributions around a circle, and discuss its particular cases where the linear distribution is either Gaussian or exponential. We then introduce another scheme, where circular distributions are obtained by mixing, and study its relation to wrapping. We show that, in general, these two operations commute: wrapping a mixture of linear distributions corresponds to a mixture of wrapped distributions. We explore this in detail, and show that a large number of wrapped circular distributions introduced in the literature can be defined and studied through mixtures of wrapped Gaussian or wrapped exponential distributions. In the second part of the talk, we discuss computational issues arising in estimating circular parameters, where maximum likelihood estimators are rarely available in explicit forms. We present a new general methodology, which is based on likelihood and Bayesian principles and can be adapted to circular data.
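The wrapping construction mentioned in the abstract can be illustrated numerically: the wrapped density at an angle is the sum of the linear density over all points that map to that angle. The sketch below is not the speaker's methodology; the helper names `wrapped_density` and `exponential_pdf` and the truncation level `n_terms` are assumptions. It compares the truncated sum for an exponential distribution with the known closed form rate * exp(-rate * theta) / (1 - exp(-2 * pi * rate)) on [0, 2*pi).

```python
import math

def wrapped_density(linear_pdf, theta, n_terms=200):
    """Approximate the wrapped density at angle theta in [0, 2*pi).

    Sums linear_pdf(theta + 2*pi*k) over k = -n_terms..n_terms, a
    truncation of the infinite sum that defines the wrapped density.
    """
    return sum(linear_pdf(theta + 2.0 * math.pi * k)
               for k in range(-n_terms, n_terms + 1))

def exponential_pdf(x, rate=1.0):
    """Density of the exponential distribution on the real line."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

if __name__ == "__main__":
    theta = 1.0
    numeric = wrapped_density(exponential_pdf, theta)
    closed_form = math.exp(-theta) / (1.0 - math.exp(-2.0 * math.pi))
    print(numeric, closed_form)
```

Because the sum is linear in the density, passing a mixture of linear densities to `wrapped_density` gives the same value as mixing the individually wrapped densities, which is the commutation property the abstract describes.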
Wednesday, October 21, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: Michael Nava (PSTAT-UCSB)
Title: A Change-point Problem in Circular Statistics
Abstract: Change-point tests are meant to detect the point in time at which a sample of observations changes in the probability distribution from which they came. Suppose one has a set of independent vectors of measurements, observed in a time-ordered or space-ordered sequence. In our set-up, these observations are circular data and we are interested in the point in time at which the distribution changes from having one mode to having more than one mode. In this work we model unimodality or bimodality with a mixture of two Circular Normal distributions, which admits both possibilities, albeit for different parameter values. Tests for detecting the change-point are derived using the generalized likelihood ratio method. We obtain simulated distributions and critical values for the appropriate test statistics in finite samples, as well as provide the asymptotic distributions, under some regularity conditions. We also tackle this problem from a Bayesian perspective.
Wednesday, October 14, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: Howard Zail (Elucidor, LLC; New York, NY)
Title: Implementation of Bayesian Predictive Analytics for Insurance Product Pricing, Underwriting and Risk Management
Abstract: There has been an explosion of new and powerful Bayesian predictive analytics techniques and methodologies over the last twenty years, but the insurance industry has been very slow at adopting these methodologies in practice. This seminar will present a number of real problems faced by insurers or pension funds and show how these new techniques can be implemented to improve profitability and establish a more efficient capital management strategy. In particular, we will discuss utilizing the following methodologies in a cohesive framework: state space modeling, hierarchical models, efficient and large scale MCMC, feature selection, and probabilistic programming.
Wednesday, October 7, South Hall 5607F, 3:30-5:00 p.m.; refreshments served at 3:15 p.m.
Speaker: Andrey Sarantsev (PSTAT-UCSB)
Title: Approximation of reflected diffusions by solutions of SDE
Abstract: Consider a reflected diffusion on the positive half-line. It has a hard barrier at zero, which it cannot penetrate. We approximate it by a soft barrier created by drift, that is, by a solution to an SDE. We also consider a multidimensional version of this problem.