Modeling relational data using nested partition models

Event Date: Thursday, May 16, 2013 - 3:30pm to 5:00pm

Event Date Details: Refreshments served at 3:15 PM

Event Location: South Hall 5607F

Dr. Kaushik Ghosh (University of Nevada, Las Vegas)

Title: Modeling relational data using nested partition models

Abstract: We introduce a flexible class of models for relational data based on a hierarchical extension of the two-parameter Poisson-Dirichlet process. The models are motivated by two applications: (1) a study of cancer mortality rates in the U.S., where rates for different types of cancer are available for each state, and (2) the analysis of microarray data, where expression levels are available for a large number of genes in a sample of subjects. In both settings, we are interested in improving estimation by flexibly borrowing information across rows and columns while partitioning the data into homogeneous subpopulations. Our model allows for a novel nested partitioning structure not provided by existing nonparametric methods, in which rows are clustered while columns are simultaneously grouped within each cluster of rows. The number of partitions is assumed to be unknown and is estimated from the data. We illustrate our models using real data examples.

This is joint work with Abel Rodriguez (UC Santa Cruz).