Probabilistic inference, Deep learning, Generative models, Representation learning, Geometry

I am a second-year DPhil student in statistics at the University of Oxford, supervised by Prof. Yee Whye Teh and Ryota Tomioka from Microsoft Research. Previously, I received a joint MSc from Ecole des Ponts ParisTech and Ecole Normale Supérieure Paris-Saclay. My research interests lie in probabilistic generative models, representation learning and geometry.

Publications

2021

E. Mathieu, A. Foster, Y. W. Teh, On Contrastive Representations of Stochastic Processes, 35th Conference on Neural Information Processing Systems (NeurIPS 2021), 2021.

Learning representations of stochastic processes is an emerging problem in machine learning with applications from meta-learning to physical object models to time series. Typical methods rely on exact reconstruction of observations, but this approach breaks down as observations become high-dimensional or noise distributions become complex. To address this, we propose a unifying framework for learning contrastive representations of stochastic processes (CReSP) that does away with exact reconstruction. We dissect potential use cases for stochastic process representations, and propose methods that accommodate each. Empirically, we show that our methods are effective for learning representations of periodic functions, 3D objects and dynamical processes. Our methods tolerate noisy high-dimensional observations better than traditional approaches, and the learned representations transfer to a range of downstream tasks.

@inproceedings{mathieu2021contrastive,
title = {{On Contrastive Representations of Stochastic Processes}},
author = {Mathieu, Emile and Foster, Adam and Teh, Yee Whye},
booktitle = {35th Conference on Neural Information Processing Systems (NeurIPS 2021)},
year = {2021}
}

2020

E. Mathieu, M. Nickel, Riemannian Continuous Normalizing Flows, in Advances in Neural Information Processing Systems 33, 2020.

Normalizing flows have shown great promise for modelling flexible probability distributions in a computationally tractable way. However, whilst data is often naturally described on Riemannian manifolds such as spheres, tori, and hyperbolic spaces, most normalizing flows implicitly assume a flat geometry, making them either misspecified or ill-suited in these situations. To overcome this problem, we introduce Riemannian continuous normalizing flows, a model which admits the parametrization of flexible probability measures on smooth manifolds by defining flows as the solution to ordinary differential equations. We show that this approach can lead to substantial improvements on both synthetic and real-world data when compared to standard flows or previously introduced projected flows.

@inproceedings{mathieu2019Riemannian,
title = {Riemannian Continuous Normalizing Flows},
author = {Mathieu, Emile and Nickel, Maximilian},
booktitle = {Advances in Neural Information Processing Systems 33},
year = {2020},
publisher = {Curran Associates, Inc.}
}

2019

E. Mathieu, C. Le Lan, C. J. Maddison, R. Tomioka, Y. W. Teh, Continuous Hierarchical Representations with Poincaré Variational Auto-Encoders, in Advances in Neural Information Processing Systems 32, 2019, 12565–12576.

The Variational Auto-Encoder (VAE) is a popular method for learning a generative model and embeddings of the data. Many real datasets are hierarchically structured. However, traditional VAEs map data into a Euclidean latent space, which cannot efficiently embed tree-like structures. Hyperbolic spaces with negative curvature can. We therefore endow VAEs with a Poincaré ball model of hyperbolic geometry as a latent space and rigorously derive the necessary methods to work with two main Gaussian generalisations on that space. We empirically show that the resulting model generalises better to unseen data than its Euclidean counterpart, and that it recovers hierarchical structures better, both qualitatively and quantitatively.
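
To make the geometric claim above concrete, here is a minimal sketch of the geodesic distance on the Poincaré ball, using the standard closed form for a ball of curvature -c. It is illustrative only and is not taken from the paper's code; the function name `poincare_distance` and the parameter `c` are assumptions made for this example. Distances grow without bound as points approach the boundary, which is what leaves room for tree-like data.

```python
import math


def poincare_distance(x, y, c=1.0):
    """Geodesic distance between x and y on the Poincaré ball of curvature -c (c > 0).

    Illustrative helper (hypothetical name); x and y are sequences of floats
    lying strictly inside the ball of radius 1 / sqrt(c).
    """
    sq_norm_x = sum(a * a for a in x)
    sq_norm_y = sum(b * b for b in y)
    if c * sq_norm_x >= 1.0 or c * sq_norm_y >= 1.0:
        raise ValueError("points must lie strictly inside the ball of radius 1/sqrt(c)")
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    # Standard closed form: (1/sqrt(c)) * arcosh(1 + 2c|x-y|^2 / ((1-c|x|^2)(1-c|y|^2)))
    arg = 1.0 + 2.0 * c * sq_diff / ((1.0 - c * sq_norm_x) * (1.0 - c * sq_norm_y))
    return math.acosh(arg) / math.sqrt(c)


# Equal Euclidean steps cost more and more hyperbolic distance near the boundary.
origin = [0.0, 0.0]
near = poincare_distance(origin, [0.5, 0.0])
far = poincare_distance(origin, [0.99, 0.0])
```

With `c=1.0`, the distance from the origin to a point of Euclidean norm r reduces to 2·artanh(r), so `near` equals log(3).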

@inproceedings{mathieu2019Continuous,
title = {Continuous Hierarchical Representations with Poincar\'{e} Variational Auto-Encoders},
author = {Mathieu, Emile and Le Lan, Charline and Maddison, Chris J. and Tomioka, Ryota and Teh, Yee Whye},
booktitle = {Advances in Neural Information Processing Systems 32},
pages = {12565--12576},
year = {2019},
publisher = {Curran Associates, Inc.}
}

E. Mathieu, T. Rainforth, N. Siddharth, Y. W. Teh, Disentangling Disentanglement in Variational Autoencoders, in Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, USA, 2019, vol. 97, 4402–4412.

We develop a generalisation of disentanglement in VAEs—decomposition of the latent representation—characterising it as the fulfilment of two factors: a) the latent encodings of the data having an appropriate level of overlap, and b) the aggregate encoding of the data conforming to a desired structure, represented through the prior. Decomposition permits disentanglement, i.e. explicit independence between latents, as a special case, but also allows for a much richer class of properties to be imposed on the learnt representation, such as sparsity, clustering, independent subspaces, or even intricate hierarchical dependency relationships. We show that the β-VAE varies from the standard VAE predominantly in its control of latent overlap and that for the standard choice of an isotropic Gaussian prior, its objective is invariant to rotations of the latent representation. Viewed from the decomposition perspective, breaking this invariance with simple manipulations of the prior can yield better disentanglement with little or no detriment to reconstructions. We further demonstrate how other choices of prior can assist in producing different decompositions and introduce an alternative training objective that allows the control of both decomposition factors in a principled manner.

@inproceedings{pmlr-v97-mathieu19a,
title = {Disentangling Disentanglement in Variational Autoencoders},
author = {Mathieu, Emile and Rainforth, Tom and Siddharth, N. and Teh, Yee Whye},
booktitle = {Proceedings of the 36th International Conference on Machine Learning},
pages = {4402--4412},
year = {2019},
volume = {97},
series = {Proceedings of Machine Learning Research},
address = {Long Beach, California, USA},
month = {09--15 Jun},
publisher = {PMLR}
}

2018

B. Bloem-Reddy, A. Foster, E. Mathieu, Y. W. Teh, Sampling and Inference for Beta Neutral-to-the-Left Models of Sparse Networks, in Conference on Uncertainty in Artificial Intelligence, 2018.

@inproceedings{BloemReddy:etal:2018,
author = {Bloem-Reddy, Benjamin and Foster, Adam and Mathieu, Emile and Teh, Yee Whye},
booktitle = {Conference on Uncertainty in Artificial Intelligence},
title = {Sampling and Inference for Beta Neutral-to-the-Left Models of Sparse Networks},
month = aug,
year = {2018}
}

2017

B. Bloem-Reddy, E. Mathieu, A. Foster, T. Rainforth, H. Ge, M. Lomelí, Z. Ghahramani, Y. W. Teh, Sampling and inference for discrete random probability measures in probabilistic programs, NIPS Workshop on Advances in Approximate Bayesian Inference, 2017.

We consider the problem of sampling a sequence from a discrete random probability measure (RPM) with countable support, under (probabilistic) constraints of finite memory and computation. A canonical example is sampling from the Dirichlet Process, which can be accomplished using its stick-breaking representation and lazy initialization of its atoms. We show that efficient lazy initialization is possible if and only if a size-biased representation of the discrete RPM is used. For models constructed from such discrete RPMs, we consider the implications for generic particle-based inference methods in probabilistic programming systems. To demonstrate, we implement SMC for Normalized Inverse Gaussian Process mixture models in Turing.
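
The canonical example mentioned above, sampling from a Dirichlet Process via stick-breaking with lazily initialized atoms, can be sketched as follows. This is the textbook construction written in Python for illustration, not the paper's Turing implementation; the class name `LazyDirichletProcess` and the arguments `alpha` and `base_sampler` are assumptions made for this example.

```python
import random


class LazyDirichletProcess:
    """Draws from one realisation G ~ DP(alpha, base), creating atoms on demand.

    Hypothetical illustration: only the sticks needed to resolve each draw
    are ever instantiated, so memory stays finite.
    """

    def __init__(self, alpha, base_sampler, rng=None):
        self.alpha = alpha
        self.base_sampler = base_sampler  # callable: rng -> atom location
        self.rng = rng or random.Random()
        self.weights = []      # stick-breaking weights instantiated so far
        self.atoms = []        # atom locations, in the same order as weights
        self.remaining = 1.0   # mass of the stick not yet broken

    def _break_stick(self):
        # v ~ Beta(1, alpha) is the fraction broken off the remaining stick.
        v = self.rng.betavariate(1.0, self.alpha)
        self.weights.append(v * self.remaining)
        self.atoms.append(self.base_sampler(self.rng))
        self.remaining *= 1.0 - v

    def sample(self):
        # Inverse-CDF draw: walk the cumulative weights, breaking new sticks
        # only when u falls beyond the mass instantiated so far.
        u = self.rng.random()
        cumulative = 0.0
        i = 0
        while True:
            if i == len(self.weights):
                self._break_stick()
            cumulative += self.weights[i]
            # The second condition guards against floating-point exhaustion.
            if u < cumulative or self.remaining == 0.0:
                return self.atoms[i]
            i += 1


dp = LazyDirichletProcess(alpha=1.0,
                          base_sampler=lambda rng: rng.gauss(0.0, 1.0),
                          rng=random.Random(0))
draws = [dp.sample() for _ in range(200)]
```

Because the sticks are generated in stick-breaking order, which is a size-biased ordering of the weights, large atoms tend to appear early and most draws resolve after inspecting only a few sticks.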

@inproceedings{bloemreddy2017rpm,
title = {Sampling and inference for discrete random probability measures in probabilistic programs},
author = {Bloem-Reddy, Benjamin and Mathieu, Emile and Foster, Adam and Rainforth, Tom and Ge, Hong and Lomelí, María and Ghahramani, Zoubin and Teh, Yee Whye},
booktitle = {NIPS Workshop on Advances in Approximate Bayesian Inference},
year = {2017}
}