I am a DPhil student in Statistics at the University of Oxford, supervised by Yee Whye Teh and Tom Rainforth. I received my Bachelor’s and Master’s degrees in mathematics from Cambridge and worked as a machine learning engineer before joining the department.
I have a broad range of interests in statistical machine learning. A large part of my work in Oxford has been on optimal experimental design: how do we design experiments that will be most informative about the process being investigated, whilst minimizing cost? I contribute to the deep probabilistic programming language Pyro and I am the main author of Pyro’s experimental design support.
There is a deep mathematical connection between optimal experimental design, mutual information and contrastive representation learning; I also study contrastive learning from the perspectives of invariance and mutual information.
Other research interests of mine include the role of equivariance in machine learning, Bayesian modelling and probabilistic programming, and deep representation learning.
Publications
2024
F. Bickford Smith, A. Foster, T. Rainforth, Making better use of unlabelled data in Bayesian active learning, International Conference on Artificial Intelligence and Statistics, 2024.
@article{bickfordsmith2024making,
author = {Bickford Smith, Freddie and Foster, Adam and Rainforth, Tom},
year = {2024},
title = {Making better use of unlabelled data in {Bayesian} active learning},
journal = {International Conference on Artificial Intelligence and Statistics}
}
T. Rainforth, A. Foster, D. R. Ivanova, F. Bickford Smith, Modern Bayesian experimental design, Statistical Science, 2024.
@article{rainforth2023modern,
author = {Rainforth, Tom and Foster, Adam and Ivanova, Desi R. and Bickford Smith, Freddie},
year = {2024},
title = {Modern {Bayesian} experimental design},
journal = {Statistical Science}
}
2023
F. Bickford Smith, A. Kirsch, S. Farquhar, Y. Gal, A. Foster, T. Rainforth, Prediction-oriented Bayesian active learning, International Conference on Artificial Intelligence and Statistics, 2023.
@article{bickfordsmith2023prediction,
author = {Bickford Smith, Freddie and Kirsch, Andreas and Farquhar, Sebastian and Gal, Yarin and Foster, Adam and Rainforth, Tom},
year = {2023},
title = {Prediction-oriented {Bayesian} active learning},
journal = {International Conference on Artificial Intelligence and Statistics}
}
2021
A. Foster, R. Pukdee, T. Rainforth, Improving Transformation Invariance in Contrastive Representation Learning, International Conference on Learning Representations (ICLR), 2021.
We propose methods to strengthen the invariance properties of representations obtained by contrastive learning. While existing approaches implicitly induce a degree of invariance as representations are learned, we look to more directly enforce invariance in the encoding process. To this end, we first introduce a training objective for contrastive learning that uses a novel regularizer to control how the representation changes under transformation. We show that representations trained with this objective perform better on downstream tasks and are more robust to the introduction of nuisance transformations at test time. Second, we propose a change to how test-time representations are generated by introducing a feature averaging approach that combines encodings from multiple transformations of the original input, finding that this leads to across-the-board performance gains. Finally, we introduce the novel Spirograph dataset to explore our ideas in the context of a differentiable generative process with multiple downstream tasks, showing that our techniques for learning invariance are highly beneficial.
@article{foster2021improving,
title = {Improving Transformation Invariance in Contrastive Representation Learning},
author = {Foster, Adam and Pukdee, Rattana and Rainforth, Tom},
year = {2021},
journal = {International Conference on Learning Representations (ICLR)}
}
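The test-time feature averaging described in the abstract can be sketched in a few lines. Everything here is an illustrative assumption rather than the paper's implementation: `encoder` is a random linear map with a nonlinearity, and `augment` uses additive noise as a stand-in for realistic transformations such as crops or colour jitter.

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(x, W):
    # Hypothetical encoder: a linear map followed by a nonlinearity.
    return np.tanh(x @ W)

def augment(x, rng):
    # Hypothetical nuisance transformation: additive noise stands in for
    # crops, colour jitter, etc.
    return x + 0.1 * rng.standard_normal(x.shape)

def averaged_representation(x, W, rng, n_transforms=8):
    # Feature averaging: encode several transformed copies of the input
    # and average the encodings, rather than encoding x once.
    encodings = [encoder(augment(x, rng), W) for _ in range(n_transforms)]
    return np.mean(encodings, axis=0)

x = rng.standard_normal(16)          # a single input
W = rng.standard_normal((16, 4))     # encoder weights
z = averaged_representation(x, W, rng)
print(z.shape)  # (4,)
```

The averaged representation has the same dimension as a single encoding, so it can be dropped into any downstream task unchanged.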
A. Foster, D. R. Ivanova, I. Malik, T. Rainforth, Deep Adaptive Design: Amortizing Sequential Bayesian Experimental Design, International Conference on Machine Learning (ICML, long presentation), 2021.
We introduce Deep Adaptive Design (DAD), a general method for amortizing the cost of performing sequential adaptive experiments using the framework of Bayesian optimal experimental design (BOED). Traditional sequential BOED approaches require substantial computational time at each stage of the experiment. This makes them unsuitable for most real-world applications, where decisions must typically be made quickly. DAD addresses this restriction by learning an amortized design network upfront and then using this to rapidly run (multiple) adaptive experiments at deployment time. This network takes as input the data from previous steps, and outputs the next design using a single forward pass; these design decisions can be made in milliseconds during the live experiment. To train the network, we introduce contrastive information bounds that are suitable objectives for the sequential setting, and propose a customized network architecture that exploits key symmetries. We demonstrate that DAD successfully amortizes the process of experimental design, outperforming alternative strategies on a number of problems.
@article{foster2021deep,
title = {{Deep Adaptive Design: Amortizing Sequential Bayesian Experimental Design}},
author = {Foster, Adam and Ivanova, Desi R. and Malik, Ilyas and Rainforth, Tom},
year = {2021},
journal = {International Conference on Machine Learning (ICML, long presentation)}
}
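The idea of an amortized design network can be caricatured in a few lines: embed each past (design, outcome) pair, pool the embeddings permutation-invariantly, and map the pooled summary to the next design in a single forward pass. The dimensions, weights and pooling choice below are illustrative assumptions, not the architecture from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative weights for a tiny design policy network.
W_embed = rng.standard_normal((2, 8))  # embeds one (design, outcome) pair
W_head = rng.standard_normal(8)        # maps pooled history to next design

def next_design(history, W_embed, W_head):
    """One forward pass: history of (design, outcome) pairs -> next design.

    Sum-pooling the pair embeddings makes the policy invariant to the order
    of conditionally independent past experiments, a key symmetry that DAD
    exploits in its architecture.
    """
    if len(history) == 0:
        pooled = np.zeros(W_embed.shape[1])
    else:
        pairs = np.asarray(history, dtype=float)       # shape (t, 2)
        pooled = np.tanh(pairs @ W_embed).sum(axis=0)  # shape (8,)
    return float(np.tanh(pooled) @ W_head)

history = [(0.5, 1.2), (-0.3, 0.7)]
d_next = next_design(history, W_embed, W_head)
# Pooling is order-invariant: reversing the history gives the same design.
d_same = next_design(history[::-1], W_embed, W_head)
```

Because the forward pass is just a few matrix multiplications, design decisions at deployment time are essentially instantaneous; all the heavy computation happens when the weights are trained upfront.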
E. Mathieu, A. Foster, Y. W. Teh, On Contrastive Representations of Stochastic Processes, 35th Conference on Neural Information Processing Systems (NeurIPS 2021), 2021.
Learning representations of stochastic processes is an emerging problem in machine learning with applications from meta-learning to physical object models to time series. Typical methods rely on exact reconstruction of observations, but this approach breaks down as observations become high-dimensional or noise distributions become complex. To address this, we propose a unifying framework for learning contrastive representations of stochastic processes (CReSP) that does away with exact reconstruction. We dissect potential use cases for stochastic process representations, and propose methods that accommodate each. Empirically, we show that our methods are effective for learning representations of periodic functions, 3D objects and dynamical processes. Our methods tolerate noisy high-dimensional observations better than traditional approaches, and the learned representations transfer to a range of downstream tasks.
@article{mathieu2021contrastive,
title = {{On Contrastive Representations of Stochastic Processes}},
author = {Mathieu, Emile and Foster, Adam and Teh, Yee Whye},
year = {2021},
journal = {35th Conference on Neural Information Processing Systems (NeurIPS 2021)}
}
D. R. Ivanova, A. Foster, S. Kleinegesse, M. U. Gutmann, T. Rainforth, Implicit Deep Adaptive Design: Policy-Based Experimental Design without Likelihoods, 35th Conference on Neural Information Processing Systems (NeurIPS 2021), 2021.
We introduce implicit Deep Adaptive Design (iDAD), a new method for performing adaptive experiments in real-time with implicit models. iDAD amortizes the cost of Bayesian optimal experimental design (BOED) by learning a design policy network upfront, which can then be deployed quickly at the time of the experiment. The iDAD network can be trained on any model which simulates differentiable samples, unlike previous design policy work that requires a closed form likelihood and conditionally independent experiments. At deployment, iDAD allows design decisions to be made in milliseconds, in contrast to traditional BOED approaches that require heavy computation during the experiment itself. We illustrate the applicability of iDAD on a number of experiments, and show that it provides a fast and effective mechanism for performing adaptive design with implicit models.
@article{ivanova2021implicit,
title = {{Implicit Deep Adaptive Design: Policy-Based Experimental Design without Likelihoods}},
author = {Ivanova, Desi R. and Foster, Adam and Kleinegesse, Steven and Gutmann, Michael U. and Rainforth, Tom},
year = {2021},
journal = {35th Conference on Neural Information Processing Systems (NeurIPS 2021)}
}
2020
A. Foster, M. Jankowiak, M. O’Meara, Y. W. Teh, T. Rainforth, A Unified Stochastic Gradient Approach to Designing Bayesian-Optimal Experiments, International Conference on Artificial Intelligence and Statistics (AISTATS), 2020.
We introduce a fully stochastic gradient based approach to Bayesian optimal experimental design (BOED). This is achieved through the use of variational lower bounds on the expected information gain (EIG) of an experiment that can be simultaneously optimized with respect to both the variational and design parameters. This allows the design process to be carried out through a single unified stochastic gradient ascent procedure, in contrast to existing approaches that typically construct an EIG estimator on a pointwise basis, before passing this estimator to a separate optimizer. We show that this, in turn, leads to more efficient BOED schemes and provide a number of different variational objectives suited to different settings. Furthermore, we show that our gradient-based approaches are able to provide effective design optimization in substantially higher dimensional settings than existing approaches.
@article{foster2020unified,
title = {{A Unified Stochastic Gradient Approach to Designing Bayesian-Optimal Experiments}},
author = {Foster, Adam and Jankowiak, Martin and O'Meara, Matthew and Teh, Yee Whye and Rainforth, Tom},
journal = {International Conference on Artificial Intelligence and Statistics (AISTATS)},
year = {2020}
}
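A toy version of the unified stochastic-gradient idea can be worked through on a tractable linear-Gaussian model, y = dθ + ε: a single gradient loop updates the variational posterior parameters and the design d together, using the Barber–Agakov (posterior) lower bound on the EIG. The model choice, parameterization and step sizes are illustrative assumptions; in this model the analytic EIG, 0.5·log(1 + d²), is increasing in |d|, so the loop should drive the design to its budget limit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Model: theta ~ N(0, 1), y = d * theta + eps, eps ~ N(0, 1).
# Variational posterior: q(theta | y) = N(a * y, s^2), s = exp(log_s).
# The Barber-Agakov bound E[log q(theta | y)] + H(p(theta)) is tight when
# q is the true posterior; its design gradient flows through y = d*theta+eps.
a, log_s, d = 0.3, 0.0, 0.5
lr, d_max = 0.05, 2.0

for _ in range(2000):
    theta = rng.standard_normal(128)
    eps = rng.standard_normal(128)
    y = d * theta + eps
    s2 = np.exp(2 * log_s)
    r = theta - a * y                       # residual of the variational mean
    # Per-sample gradients of log q(theta | y), derived by hand:
    grad_a = np.mean(r * y) / s2
    grad_log_s = np.mean(-1.0 + r**2 / s2)
    grad_d = np.mean(r * a * theta) / s2    # reparameterized design gradient
    a += lr * grad_a
    log_s += lr * grad_log_s
    d = float(np.clip(d + lr * grad_d, -d_max, d_max))

# The design should sit at the budget boundary, and the variational mean
# coefficient should approach the true posterior value d / (d^2 + 1).
print(round(d, 2), round(a, 2))
```

The point the paper makes is visible even in this caricature: no inner loop ever estimates the EIG pointwise; the variational and design parameters improve simultaneously within one stochastic gradient ascent.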
2019
A. Foster, M. Jankowiak, E. Bingham, P. Horsfall, Y. W. Teh, T. Rainforth, N. Goodman, Variational Bayesian Optimal Experimental Design, Advances in Neural Information Processing Systems (NeurIPS, spotlight), 2019.
Bayesian optimal experimental design (BOED) is a principled framework for making efficient use of limited experimental resources. Unfortunately, its applicability is hampered by the difficulty of obtaining accurate estimates of the expected information gain (EIG) of an experiment. To address this, we introduce several classes of fast EIG estimators by building on ideas from amortized variational inference. We show theoretically and empirically that these estimators can provide significant gains in speed and accuracy over previous approaches. We further demonstrate the practicality of our approach on a number of end-to-end experiments.
@article{foster2019variational,
title = {{Variational Bayesian Optimal Experimental Design}},
author = {Foster, Adam and Jankowiak, Martin and Bingham, Eli and Horsfall, Paul and Teh, Yee Whye and Rainforth, Tom and Goodman, Noah},
journal = {Advances in Neural Information Processing Systems (NeurIPS, spotlight)},
year = {2019}
}
2018
B. Bloem-Reddy, A. Foster, E. Mathieu, Y. W. Teh, Sampling and Inference for Beta Neutral-to-the-Left Models of Sparse Networks, Conference on Uncertainty in Artificial Intelligence, 2018.
@inproceedings{BloemReddy:etal:2018,
author = {Bloem-Reddy, Benjamin and Foster, Adam and Mathieu, Emile and Teh, Yee Whye},
booktitle = {Conference on Uncertainty in Artificial Intelligence},
title = {Sampling and Inference for Beta Neutral-to-the-Left Models of Sparse Networks},
month = aug,
year = {2018}
}
A. Foster, M. Jankowiak, E. Bingham, Y. W. Teh, T. Rainforth, N. Goodman, Variational Optimal Experiment Design: Efficient Automation of Adaptive Experiments, NeurIPS Workshop on Bayesian Deep Learning, 2018.
Bayesian optimal experimental design (OED) is a principled framework for making efficient use of limited experimental resources. Unfortunately, the applicability of OED is hampered by the difficulty of obtaining accurate estimates of the expected information gain (EIG) for different experimental designs. We introduce a class of fast EIG estimators that leverage amortised variational inference and show that they provide substantial empirical gains over previous approaches. We integrate our approach into a deep probabilistic programming framework, thus making OED accessible to practitioners at large.
@article{foster2018voed,
title = {{Variational Optimal Experiment Design: Efficient Automation of Adaptive Experiments}},
author = {Foster, Adam and Jankowiak, Martin and Bingham, Eli and Teh, Yee Whye and Rainforth, Tom and Goodman, Noah},
journal = {NeurIPS Workshop on Bayesian Deep Learning},
year = {2018}
}
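The EIG that these estimators target can also be approximated by brute-force nested Monte Carlo, the slow baseline that variational estimators aim to beat. A sketch on a linear-Gaussian toy model (an illustrative choice, not taken from the paper), where the analytic answer 0.5·log(1 + d²) is available for comparison:

```python
import numpy as np

rng = np.random.default_rng(0)

def nmc_eig(d, n_outer=2000, m_inner=500, sigma=1.0, rng=rng):
    """Nested Monte Carlo estimate of expected information gain.

    EIG(d) = E_{theta, y} [ log p(y | theta, d) - log p(y | d) ],
    with the marginal log p(y | d) itself estimated by an inner average over
    fresh prior samples. Model: theta ~ N(0, 1), y = d*theta + N(0, sigma^2).
    """
    theta = rng.standard_normal(n_outer)
    y = d * theta + sigma * rng.standard_normal(n_outer)
    log_lik = (-0.5 * np.log(2 * np.pi * sigma**2)
               - (y - d * theta) ** 2 / (2 * sigma**2))
    # Inner estimate of the marginal likelihood for each outer sample,
    # computed stably in log space.
    theta_in = rng.standard_normal((m_inner, 1))
    log_p = (-0.5 * np.log(2 * np.pi * sigma**2)
             - (y[None, :] - d * theta_in) ** 2 / (2 * sigma**2))
    m = log_p.max(axis=0)
    log_marg = m + np.log(np.mean(np.exp(log_p - m), axis=0))
    return float(np.mean(log_lik - log_marg))

est = nmc_eig(d=1.0)
true_eig = 0.5 * np.log(1 + 1.0**2)  # analytic EIG for this model
```

Note the cost: every one of the `n_outer` outer samples requires `m_inner` inner likelihood evaluations, which is exactly the expense that amortized variational estimators avoid.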
2017
B. Bloem-Reddy, E. Mathieu, A. Foster, T. Rainforth, H. Ge, M. Lomelí, Z. Ghahramani, Y. W. Teh, Sampling and inference for discrete random probability measures in probabilistic programs, NIPS Workshop on Advances in Approximate Bayesian Inference, 2017.
We consider the problem of sampling a sequence from a discrete random probability measure (RPM) with countable support, under (probabilistic) constraints of finite memory and computation. A canonical example is sampling from the Dirichlet Process, which can be accomplished using its stick-breaking representation and lazy initialization of its atoms. We show that efficient lazy initialization is possible if and only if a size-biased representation of the discrete RPM is used. For models constructed from such discrete RPMs, we consider the implications for generic particle-based inference methods in probabilistic programming systems. To demonstrate, we implement SMC for Normalized Inverse Gaussian Process mixture models in Turing.
@article{bloemreddy2017rpm,
title = {Sampling and inference for discrete random probability measures in probabilistic programs},
author = {Bloem-Reddy, Benjamin and Mathieu, Emile and Foster, Adam and Rainforth, Tom and Ge, Hong and Lomelí, María and Ghahramani, Zoubin and Teh, Yee Whye},
journal = {NIPS Workshop on Advances in Approximate Bayesian Inference},
year = {2017}
}
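The canonical example from the abstract, sampling from a Dirichlet Process via stick-breaking with lazy atom initialization, can be sketched directly. The concentration `alpha` and the standard-normal base measure are illustrative choices; for the DP, the stick-breaking weights are exactly the size-biased representation that the abstract's result calls for.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_dp_sequence(n, alpha, rng):
    """Sample n observations from a Dirichlet Process via lazy stick-breaking.

    Sticks v_k ~ Beta(1, alpha); atoms are only instantiated when the walk
    along the broken sticks runs past them, so memory grows with the number
    of distinct atoms used rather than with a fixed truncation level.
    """
    sticks, weights, atoms = [], [], []
    remaining = 1.0  # unbroken stick mass
    draws = []
    for _ in range(n):
        u = rng.uniform()
        acc, k = 0.0, 0
        while True:
            if k == len(sticks):
                # Lazily break a new stick and draw its atom from the base
                # measure (N(0, 1) here as an illustrative choice).
                v = rng.beta(1.0, alpha)
                sticks.append(v)
                weights.append(remaining * v)
                remaining *= 1.0 - v
                atoms.append(rng.standard_normal())
            if u < acc + weights[k]:
                draws.append(atoms[k])
                break
            acc += weights[k]
            k += 1
    return draws, atoms

draws, atoms = sample_dp_sequence(200, alpha=2.0, rng=rng)
```

With `alpha = 2.0` the number of distinct atoms among 200 draws is roughly `alpha * log(n)`, of order ten, so the lazy representation stays small however long the sequence runs.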