BIDSA Seminar: “Learning and exploiting low-dimensional structure in high-dimensional data”
Room: AS03

19/09/2019, 12:30
Speaker: David Dunson, Duke University

19 September 2019, 12:30 PM

Bocconi University, Room AS03
(Roentgen underground floor, access from the stairs in front of BBar)



ABSTRACT

This talk will focus on the problem of learning low-dimensional geometric structure in high-dimensional data. We allow the lower-dimensional subspace to be non-linear. A variety of algorithms are available for "manifold learning" and non-linear dimensionality reduction, but most rely on locally linear approximations and do not provide a likelihood-based approach to inference. We propose a new class of simple geometric dictionaries for characterizing the subspace, along with a simple optimization algorithm and a model-based approach to inference. We provide strong theoretical support, in terms of tight bounds on covering numbers, showing the advantages of our approach relative to local linear dictionaries. These advantages are shown to carry over to practical performance in a variety of settings, including manifold learning, manifold de-noising, data visualization (providing a competitor to the popular t-SNE), classification (providing a competitor to deep neural networks that requires fewer training examples), and geodesic distance estimation. We additionally provide a Bayesian nonparametric methodology for inference, using a new class of kernels, which is shown to outperform current methods such as mixtures of multivariate Gaussians.

SPEAKER

David Dunson is Arts and Sciences Distinguished Professor of Statistical Science and Mathematics at Duke University. He is known for his wide-ranging contributions to statistical methodology, with a particular focus on novel modeling frameworks and Bayesian approaches motivated by complex and high-dimensional data collected in the sciences. This includes latent factor, dimensionality reduction, nonparametric, and machine learning methodology. Primary areas of application include neuroscience and brain network modeling, environmental health, ecology, and human fertility, among others. He is a fellow of the ASA, IMS, and ISBA and has won numerous awards, most notably the 2010 COPSS Presidents' Award. His work is very widely cited, and he has an h-index over 70 on Google Scholar.