Bin Yu (Statistics and EECS, UC Berkeley)

Event Date: 

Wednesday, May 10, 2017 - 3:30pm to 5:00pm

Event Date Details: 

Refreshments served at 3:15 p.m.

Event Location: 

  • Multi-Cultural Center
  • Department Seminar Series

Title: Three principles of data science: predictability, stability, and computability

Abstract: In this talk, I'd like to discuss the intertwined importance of, and connections among, the three principles of data science named in the title, in the context of data-driven decisions. With prediction as its central task and computation at its core, machine learning has enabled wide-ranging data-driven successes. Prediction is a useful way to check against reality. Good prediction implicitly assumes stability between past and future. Stability (relative to data and model perturbations) is also a minimum requirement for interpretability and reproducibility of data-driven results (cf. Yu, 2013). It is closely related to uncertainty assessment. Neither the prediction principle nor the stability principle can be employed without feasible computational algorithms, hence the importance of computability.

The three principles will be demonstrated in the context of two neuroscience projects and through analytical connections. In particular, the first project adds stability to predictive modeling used for reconstruction of movies from fMRI brain signals, to obtain interpretable models. The second project uses predictive transfer learning that combines AlexNet, GoogLeNet and VGG with single V4 neuron data for state-of-the-art prediction performance. Our results lend support, to a certain extent, to the resemblance of these CNNs to the brain and at the same time provide stable pattern interpretations of neurons in the difficult primate visual cortex area V4.