As datasets grow ever larger in scale, complexity, and variety, there is an increasing need for powerful machine learning and statistical techniques capable of learning from such data. Bayesian nonparametrics is a promising approach to data analysis that is increasingly popular in machine learning and statistics. Bayesian nonparametric models are highly flexible models with infinite-dimensional parameter spaces that can be used to directly parameterise and learn about functions, densities, conditional distributions, and more. This ERC-funded project aims to develop Bayesian nonparametric techniques for learning rich representations from structured data in a computationally efficient and scalable manner.
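As an illustration of the "infinite-dimensional parameter space" idea (not code from the project itself), the sketch below draws an approximate sample from a Dirichlet process, a canonical Bayesian nonparametric prior, via its stick-breaking construction. The base measure, concentration parameter, and truncation level are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def stick_breaking(alpha, base_sampler, truncation=1000):
    """Truncated draw from a Dirichlet process DP(alpha, H).

    Stick-breaking: v_k ~ Beta(1, alpha), and the k-th weight is
    w_k = v_k * prod_{j<k} (1 - v_j), attached to an atom drawn from H.
    """
    v = rng.beta(1.0, alpha, size=truncation)
    # Mass of the stick remaining before each break
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - v)[:-1]])
    weights = v * remaining
    atoms = base_sampler(truncation)  # atom locations drawn i.i.d. from H
    return atoms, weights

# Illustrative choices: base measure H = N(0, 1), concentration alpha = 5
atoms, weights = stick_breaking(alpha=5.0,
                                base_sampler=lambda n: rng.normal(0.0, 1.0, n))
```

Although the resulting random measure has infinitely many atoms in principle, the weights decay geometrically in expectation, so a finite truncation captures essentially all of the mass; larger `alpha` spreads mass over more atoms.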
This EPSRC project involving Yee Whye Teh (Oxford), Arnaud Doucet (Oxford), and Christophe Andrieu (Bristol) aims to develop both methodologies and theoretical foundations for scalable Markov chain Monte Carlo methods for big data. The starting point was stochastic gradient Langevin dynamics (SGLD) (Welling and Teh, 2011), for which we have provided theoretical analyses in terms of both asymptotic convergence (Teh et al., 2016) and a weak error expansion (Vollmer et al., 2016). We have also developed a range of novel scalable Monte Carlo algorithms based on different techniques.
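To make the SGLD starting point concrete, here is a minimal sketch on a toy problem (posterior over the mean of a Gaussian); the model, prior, step size, and batch size are illustrative assumptions, and a fixed step size is used where Welling and Teh (2011) use a decaying schedule. Each update combines a stochastic gradient of the log posterior, rescaled from a minibatch, with injected Gaussian noise of matching scale.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: unit-variance Gaussian observations with unknown mean
true_mean = 2.0
data = rng.normal(true_mean, 1.0, size=10_000)
N = len(data)

def grad_log_prior(theta):
    # N(0, 10^2) prior: d/dtheta log p(theta) = -theta / 100
    return -theta / 100.0

def grad_log_lik(theta, batch):
    # Unit-variance Gaussian likelihood, summed over the minibatch
    return np.sum(batch - theta)

theta = 0.0
batch_size = 100
eps = 1e-4  # fixed step size (illustrative; the paper uses a decaying schedule)
samples = []
for t in range(2000):
    batch = data[rng.integers(0, N, size=batch_size)]
    # Minibatch estimate of the full-data gradient of the log posterior
    grad = grad_log_prior(theta) + (N / batch_size) * grad_log_lik(theta, batch)
    # SGLD update: half-step along the gradient plus N(0, eps) noise
    theta = theta + 0.5 * eps * grad + rng.normal(0.0, np.sqrt(eps))
    samples.append(theta)

# Discard a burn-in period, then average the remaining iterates
posterior_mean = np.mean(samples[1000:])
```

Because only a minibatch is touched per iteration, the cost per update is independent of the full dataset size, which is what makes the method a candidate for big-data MCMC; the theoretical work cited above characterises the bias this approximation introduces.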
This project involving Yee Whye Teh and Dino Sejdinovic aims to explore novel methods for machine learning, with particular foci on deep generative models and hyperparameter optimization. There will be several topics of investigation in these fields that cut across various aspects of large-scale machine learning: efficient computation for working with large datasets; scalable and expressive models that can extract as much information as possible from the available data; large-scale and heterogeneous computational substrates (e.g. GPUs, multicore systems, networked clusters); and theoretical foundations. The project is funded by Tencent AI Lab.