Schedule 2025/26

Here is the schedule for the 2025/26 academic year.

Upcoming Talks
Past Talks (This Term)

All Terms

Summer Term

Date	Speaker	Title	Links
21 May 2026	Gabriel Diaz-Aylwin Lancaster University	Constrained Bayesian Optimisation of Field-Valued Constraints with an application to Fusion Reactors.
I will discuss an approach for constrained Bayesian optimisation which uses ideas from PDE-constrained optimisation to pre-compute an approximation of the feasible subset of the optimisation domain. This approximation is injected as a prior into the traditional optimisation loop. I will also introduce and discuss its application to divertor optimisation in tokamak fusion reactors via the Grad-Shafranov equation. This is work in progress, in collaboration with the UK Atomic Energy Authority.
28 May 2026	Edwin Fong University of Hong Kong	Quantile Martingale Posteriors
In this talk, we introduce a novel Bayesian nonparametric method for quantile estimation/regression based on the martingale posterior (MP) framework. The core idea of the MP is that posterior sampling is equivalent to predictive imputation, which allows us to break free of the stringent likelihood-prior specification. We demonstrate that a recursive estimate of a smooth quantile function, subject to a martingale condition, is entirely sufficient for full nonparametric Bayesian inference. We term the resulting posterior distribution as the quantile martingale posterior (QMP), which arises from an implicit generative predictive distribution. Associated with the QMP is an expedient, MCMC-free and parallelizable posterior computation scheme, which can be further accelerated with an asymptotic approximation based on a Gaussian process. Furthermore, the well-known issue of monotonicity in quantile estimation is naturally alleviated through increasing rearrangement due to the connections to the Bayesian bootstrap, and the QMP has a particularly tractable form that allows for comprehensive theoretical study.
5 Jun 2026	Matti Vihola University of Jyväskylä	Mixing time of the conditional backward sampling particle filter	JRSSB
The conditional backward sampling particle filter (CBPF; also known as the particle Gibbs with ancestor sampling, PGAS) is a powerful Markov chain Monte Carlo sampler for general state space hidden Markov model smoothing. It was proposed as an improvement over the conditional particle filter, which is known to have an O(T^2) computational time complexity under a general ‘strong mixing’ assumption of the model, where T is the time horizon. We provide the first proof that the CBPF admits an O(T log T) computational complexity under strong mixing, complementing strong empirical evidence of the superiority of the CBPF in practice. In particular, the CBPF’s mixing time is upper bounded by O(log T), for any sufficiently large number of particles N that depends only on the mixing constants and not T. We show that an O(log T) mixing time is optimal. The proof involves the analysis of a novel coupling of two CBPFs, which involves a maximal coupling of two particle systems at each time instant. The coupling is implementable, and thus can also be used to construct unbiased, finite-variance estimates of functionals which have arbitrary dependence on the latent state’s path, with a total expected cost of O(T log T). The talk is based on joint work with Joona Karjalainen, Sumeetpal S. Singh and Anthony Lee.
11 Jun 2026	Tiffany Vlaar University of Glasgow
18 Jun 2026	Alexander Heinlein Delft University of Technology
25 Jun 2026	Matthias Sachs Lancaster University

Summer Term

Date	Speaker	Title	Links
14 May 2026	Chris Nemeth Lancaster University	Hypergraph Generation via Structured Stochastic Diffusion
Hypergraphs model higher-order interactions, but realistic hypergraph generation remains difficult because incidence, hyperedge-size heterogeneity, and overlap structure are not faithfully captured by pairwise reductions. We propose HEDGE, a generative model defined directly on relaxed incidence matrices via a structured stochastic diffusion. The forward process combines a hypergraph-specific two-sided heat operator with an Ornstein–Uhlenbeck component, preserving structure-aware noising near the data while yielding an explicit Gaussian terminal law. Conditional on an observed hypergraph, this forward process is linear-Gaussian, so conditional means, covariances, scores, and reverse-drift targets are available in closed form. We therefore learn a permutation-equivariant state-only reverse-drift field in incidence space by regressing onto exact conditional targets, and generate samples by simulating a learned reverse-time SDE from the Gaussian base law. We establish exactness in the ideal state-only setting together with finite-horizon stability guarantees, and empirically show improved hypergraph generation quality relative to strong baselines.
30 Apr 2026	Liam Llamazares-Elias Lancaster University	Non-stationary Gaussian fields and Penalized Complexity Priors	Slides
Gaussian random fields (GFs) are fundamental tools in spatial modeling and can be represented flexibly and efficiently as solutions to stochastic partial differential equations (SPDEs). The SPDEs depend on specific parameters, which govern various field behaviors and can be estimated using Bayesian inference. Informative priors are essential to ensure meaningful posterior covariance structures. This study builds on previous work by constructing penalized complexity (PC) priors for a smooth, invertible parameterization of the correlation range, diffusion matrix, and variance of a non-stationary GF. The formulated prior is weakly informative, effectively penalizing complexity by pushing the model towards stationarity while allowing for enough flexibility to capture non-stationary behavior. The model is applied to model precipitation in Spain, particulate matter in California, and electoral data in France with promising results.
23 Apr 2026	Rui Zhang Lancaster University	Why Should We Care About Wasserstein Gradient Flows?	Slides
Wasserstein gradient flow (WGF) has emerged as a useful tool in computational statistics and machine learning from both a theoretical and a methodological point of view. From the theoretical side, WGF helps to interpret the bias of unadjusted Langevin, as well as establish new convergence bounds. From the methodological side, WGF formulation fosters novel sampling algorithms such as Stein variational gradient descent. In addition, the WGF formulation allows us to sample from posteriors that are unattainable via classical methods like MCMC, such as the post-Bayesian predictively-oriented posteriors. In this talk, we will survey some of these developments, especially those that are more relevant to BayesComp methodologies, and (hopefully) keep the mathematics approachable.

Lent Term

Date	Speaker	Title	Links
19 Mar 2026	Masha Naslidnyk University College London	Kernel Quantile Embeddings and Associated Probability Metrics
Embedding probability distributions into reproducing kernel Hilbert spaces (RKHS) has enabled powerful non-parametric methods such as the maximum mean discrepancy (MMD), a statistical distance with strong theoretical and computational properties. At its core, the MMD relies on kernel mean embeddings (KMEs) to represent distributions as mean functions in RKHS. However, it remains unclear if the mean function is the only meaningful RKHS representation. Inspired by generalised quantiles, we introduce the notion of kernel quantile embeddings (KQEs), along with a consistent estimator. We then use KQEs to construct a family of distances that (i) are probability metrics under weaker kernel conditions than MMD; (ii) recover a kernelised form of the sliced Wasserstein distance; and (iii) can be efficiently estimated with near-linear cost. Through hypothesis testing, we show that these distances offer a competitive alternative to MMD and its fast approximations. Our findings demonstrate the value of representing distributions in Hilbert space beyond simple mean functions, paving the way for new avenues of research.
12 Mar 2026	Chen Qi Aalto University	Theoretical understanding of generalization and memorization in generative models	Paper [1] Paper [2] Paper [3] Paper [4]
Generative models are able to create new and diverse samples, yet the mechanisms that let them generalize rather than memorize remain unclear. This talk presents recent theoretical insights into how autoencoders, adversarial models, and diffusion models learn structure from data while controlling model capacity and preventing overfitting. By drawing on ideas from information theory, statistical learning, and training dynamics, the talk offers a clear picture of why these models can generalize effectively and why some of them show a strong natural resistance to memorization. The talk is mainly based on my previous work [3] and will also introduce other closely related works [1, 2, 4].
5 Mar 2026	Paul Fearnhead Lancaster University	Feynman-Kac Correctors in Diffusions: In-painting
The talk will discuss the paper “Feynman-Kac Correctors in Diffusions: Annealing, Guidance and Product of Experts” by Skreta et al. In particular I will look at the relationship of their approach to sampling from reward-tilted targets to that of SMC methods for in-painting.
26 Feb 2026	Jonas Latz University of Manchester	Sparse Techniques for Regression in Deep Gaussian Processes	Paper 1 Paper 2
Gaussian processes (GPs) have gained popularity as flexible machine learning models for regression and function approximation with an in-built method for uncertainty quantification. GPs suffer when the amount of training data is large or when the underlying function contains multiscale features that are difficult to represent by an isotropic kernel. The training of GPs with large-scale data is often performed through inducing point approximations (also known as sparse GP regression), where the size of the covariance matrices in GP regression is reduced considerably through a greedy search on the data set. Deep Gaussian processes have recently been introduced as hierarchical models that resolve multiscale features by composing multiple Gaussian processes. Whilst GPs can be trained through a simple analytical formula, deep GPs require a sampling or, more usually, a variational approximation. Variational approximations lead to large-scale stochastic, non-convex optimisation problems and the resulting approximation tends to represent uncertainty incorrectly. In this work, we combine variational learning with MCMC to develop a particle-based expectation-maximisation method to simultaneously find inducing points within the large-scale data (variationally) and accurately train the Gaussian processes (sampling-based). The result is a highly efficient and accurate methodology for deep GP training on large-scale data. We test the method on standard benchmark problems. Joint work with Aretha Teckentrup and Simon Urbainczyk.
19 Feb 2026	Chris Sherlock Lancaster University	Robust, partially alive particle Metropolis-Hastings via the Frankenfilter
When a hidden Markov model permits the conditional likelihood of an observation given the hidden process to be zero, all particle simulations from one observation time to the next could produce zeros. If so, the filtering distribution cannot be estimated and the estimated parameter likelihood is zero. The alive particle filter addresses this by simulating a random number of particles for each inter-observation interval, stopping after a target number of non-zero conditional likelihoods. For outlying observations or poor parameter values, a non-zero result can be extremely unlikely, and computational costs prohibitive. We introduce the Frankenfilter, a principled, partially alive particle filter that targets a user-defined amount of success whilst fixing lower and upper bounds on the number of simulations. The Frankenfilter produces unbiased estimators of the likelihood, suitable for pseudo-marginal Metropolis–Hastings (PMMH). We demonstrate that PMMH with the Frankenfilter is more robust to outliers and mis-specified initial parameter values than PMMH using standard particle filters, and is typically at least 2-3 times more efficient. We also provide advice for choosing the amount of success. In the case of n exact observations, this is particularly simple: target n successes.
12 Feb 2026		Informal session
5 Feb 2026	Andre Menezes Maynooth University	Bayesian nonparametric models for zero-inflated count-compositional data using ensembles of regression trees.
Count-compositional data arise in many different fields, including high-throughput microbiome sequencing and palynology experiments, where a common, important goal is to understand how covariates relate to the observed compositions. Existing methods often fail to simultaneously address key challenges inherent in such data, namely: overdispersion, an excess of zeros, cross-sample heterogeneity, and nonlinear covariate effects. In this talk, we first present novel probabilistic portrayals of two multivariate models designed to handle zero-inflation in count-compositional data. Then, to address the above concerns, we propose novel Bayesian nonparametric models based on ensembles of regression trees. Specifically, we leverage the recently introduced zero-and-N-inflated multinomial distribution and assign independent nonparametric Bayesian additive regression tree (BART) priors to both the compositional and structural zero probability components of our model, to flexibly capture covariate effects. We further extend this by adding latent random effects to capture overdispersion and more general dependence structures among the categories. We develop an efficient inferential algorithm combining recent data augmentation schemes with established BART sampling routines. We evaluate our proposed models in simulation studies and illustrate their applicability with two case studies in microbiome and palaeoclimate modelling.
29 Jan 2026	William Laplante University College London	Conjugate Generalised Bayesian Inference for Discrete Doubly Intractable Problems
Doubly intractable problems occur when both the likelihood and the posterior are available only in unnormalised form, with computationally intractable normalisation constants. Bayesian inference then typically requires direct approximation of the posterior through specialised and typically expensive MCMC methods. In this paper, we provide a computationally efficient alternative in the form of a novel generalised Bayesian posterior that allows for conjugate inference within the class of exponential family models for discrete data. We derive theoretical guarantees to characterise the asymptotic behaviour of the generalised posterior, supporting its use for inference. The method is evaluated on a range of challenging intractable exponential family models, including the Conway-Maxwell-Poisson graphical model of multivariate count data, autoregressive discrete time series models, and Markov random fields such as the Ising and Potts models. The computational gains are significant; in our experiments, the method is between 10 and 6000 times faster than state-of-the-art Bayesian computational methods.
22 Jan 2026	Yuga Iguchi Lancaster University	Dynamical regimes of denoising diffusion models for sampling from multimodal distributions
I will discuss the mechanism of denoising diffusion models (DDMs) for sampling from multimodal distributions on $\mathbb{R}^d$. The first part of the talk will review the basics of DDMs — from discrete Markov chains to continuous-time formulations via SDEs. Then, using a mixture of two Gaussians as a canonical example of a multimodal target, I will describe how DDMs gradually transform the initial prior (a standard Gaussian) into the bimodal target distribution. In particular, I will show analytically that denoising trajectories dynamically change their behaviour during sampling, and that the denoising procedure can be characterised roughly by three stages: 1. Early stage — Contraction; 2. Intermediate stage — Expansion (contraction is lost); 3. Final stage — local attraction to a single mode, possibly contracting again locally. I will also clarify how these stages depend on properties of the target distribution, such as dimension, separation between modes, and the variances of the mixture components. This talk is based on ongoing joint work with Paul Fearnhead.
15 Jan 2026	Hefin Lambley University of Warwick	Autoencoders in function space	Slides
We propose function-space versions of autoencoders—machine-learning methods for dimension reduction and generative modelling—in both their deterministic (FAE) and variational (FVAE) forms. Formulating autoencoder objectives in function space enables training and evaluation with data discretised at arbitrary resolutions, leading to new applications such as inpainting, superresolution, and generative modelling. We discuss the technical challenges of formulating autoencoders in infinite dimension. A key issue is that FVAE’s variational inference is often ill defined, unlike in finite dimensions, limiting its applicability. We then explore specific problem classes where FVAE remains useful. We contrast this with the FAE objective, which remains well defined in many situations where FVAE fails, making it a robust and versatile alternative. We demonstrate both methods on scientific data sets, including Navier–Stokes fluid flow simulations. This is joint work with Justin Bunker and Mark Girolami (Cambridge), Andrew M. Stuart (Caltech) and T. J. Sullivan (Warwick).

Michaelmas Term

Date	Speaker	Title	Links
11 Dec 2025	Zheyang Shen Newcastle University	A Computable Measure of Suboptimality for Entropy-Regularised Variational Objectives	Slides
Several emerging post-Bayesian methods target a probability distribution for which an entropy-regularised variational objective is minimised. This increased flexibility introduces a computational challenge, as one loses access to an explicit unnormalised density for the target. To mitigate this difficulty, we introduce a novel measure of suboptimality called ‘gradient discrepancy’, and in particular a ‘kernel’ gradient discrepancy (KGD) that can be explicitly computed. In the standard Bayesian context, KGD coincides with the kernel Stein discrepancy (KSD), and we obtain a novel characterisation of KSD as measuring the size of a variational gradient. Outside this familiar setting, KGD enables novel sampling algorithms to be developed and compared, even when unnormalised densities cannot be obtained. To illustrate this point several novel algorithms are proposed and studied, including a natural generalisation of Stein variational gradient descent, with applications to mean-field neural networks and predictively oriented posteriors presented. On the theoretical side, our principal contribution is to establish sufficient conditions for desirable properties of KGD, such as continuity and convergence control.
4 Dec 2025	Giorgos Vasdekis	Sampling with time-changed Markov processes	Slides
We introduce a framework of time-changed Markov processes to speed up the convergence of Markov chain Monte Carlo (MCMC) algorithms in the context of multimodal distributions and rare event simulation. The time-changed process is defined by adjusting the speed of time of a base process via a user-chosen, state-dependent function. We apply this framework to several Markov processes from the MCMC literature, such as Langevin diffusions and piecewise deterministic Markov processes, obtaining novel modifications of classical algorithms and also re-discovering known MCMC algorithms. We prove theoretical properties of the time-changed process under suitable conditions on the base process, focusing on connecting the stationary distributions and qualitative convergence properties such as geometric and uniform ergodicity, as well as a functional central limit theorem. Time permitting, we will compare our approach with the framework of space transformations, clarifying the similarities between the approaches. This is joint work with Andrea Bertazzi.
20 Nov 2025	Lanya Yang Lancaster University	Exchangeable Particle Gibbs for Markov Jump Processes	Slides
Inference in stochastic reaction-network models—such as the SEIR epidemic model or the Lotka–Volterra predator–prey system—is crucial for understanding the dynamics of interacting systems in epidemiology, ecology, and systems biology. These models are typically represented as Markov jump processes (MJPs) with intractable likelihoods. As a result, particle Markov chain Monte Carlo (particle MCMC) methods, particularly the Particle Gibbs (PG) sampler, have become standard tools for Bayesian inference. However, PG suffers from severe particle degeneracy, especially in high-dimensional state spaces, leading to poor mixing and inefficiency. In this talk, I focus on improving the efficiency of particle MCMC methods for inference in reaction networks by addressing the degeneracy problem. Building on recent work on the Exchangeable Particle Gibbs (xPG) sampler for continuous-state diffusions, this project develops a novel version of xPG tailored to discrete-state reaction networks, where randomness is driven by Poisson processes rather than Brownian motion. The proposed method retains the exchangeability framework of xPG while adapting it to the structural and computational challenges specific to reaction networks.
30 Oct 2025	Rui Zhang Lancaster University	A Dynamic Perspective of Matern Gaussian Processes	Slides HTML Slides
The ubiquitous Gaussian process (GP) models in statistics and machine learning (Williams and Rasmussen; 2006) are static by default, under either the weight-space or the function-space view (Kanagawa et al.; 2025): the observation and test locations have no unilateral dependency order, which also explains the cubic scaling of computational cost in GP regression. On the other hand, the dynamic view of Gaussian processes, while only available for a class of GP models, reformulates the dependency structure unilaterally (Whittle; 1954) to enable sequential inference for GP regression with computational costs that can scale linearly (Hartikainen and Sarkka; 2010; Sarkka and Hartikainen; 2012) with little to no approximation. This talk explores this dynamic perspective of (Matern) Gaussian processes and some of its consequences.
16 Oct 2025	Henry Moss Lancaster University	GPGreen: Linear Operator Learning with Gaussian Processes
4 Sep 2025	Rafael Izbicki Federal University of São Carlos, Brazil	Simulation‑Based Calibration of Confidence Sets for Statistical Models
7 Aug 2025	Jixiang Qing Imperial College London	Bayesian Optimization Over Graphs With Shortest-Path Encodings

Date	Speaker	Title	Links
14 May 2026	Chris Nemeth Lancaster University	Hypergraph Generation via Structured Stochastic Diffusion
Hypergraphs model higher-order interactions, but realistic hypergraph generation remains difficult because incidence, hyperedge-size heterogeneity, and overlap structure are not faithfully captured by pairwise reductions. We propose HEDGE, a generative model defined directly on relaxed incidence matrices via a structured stochastic diffusion. The forward process combines a hypergraph-specific two-sided heat operator with an Ornstein–Uhlenbeck component, preserving structure-aware noising near the data while yielding an explicit Gaussian terminal law. Conditional on an observed hypergraph, this forward process is linear-Gaussian, so conditional means, covariances, scores, and reverse-drift targets are available in closed form. We therefore learn a permutation-equivariant state-only reverse-drift field in incidence space by regressing onto exact conditional targets, and generate samples by simulating a learned reverse-time SDE from the Gaussian base law. We establish exactness in the ideal state-only setting together with finite-horizon stability guarantees, and empirically show improved hypergraph generation quality relative to strong baselines.
30 Apr 2026	Liam Llamazares-Elias Lancaster University	Non-stationary Gaussian fields and Penalized Complexity Priors	Slides
Gaussian random fields (GFs) are fundamental tools in spatial modeling and can be represented flexibly and efficiently as solutions to stochastic partial differential equations (SPDEs). The SPDEs depend on specific parameters, which govern various field behaviors and can be estimated using Bayesian inference. Informative priors are essential to ensure meaningful posterior covariance structures. This study builds on previous work by constructing penalized complexity (PC) priors for a smooth, invertible parameterization of the correlation range, diffusion matrix, and variance of a non-stationary GF. The formulated prior is weakly informative, effectively penalizing complexity by pushing the model towards stationarity while allowing for enough flexibility to capture non-stationary behavior. The model is applied to model precipitation in Spain, particulate matter in California, and electoral data in France with promising results.
23 Apr 2026	Rui Zhang Lancaster University	Why Should We Care About Wasserstein Gradient Flows?	Slides
Wasserstein gradient flow (WGF) has emerged as a useful tool in computational statistics and machine learning from both a theoretical and a methodological point of view. From the theoretical side, WGF helps to interpret the bias of unadjusted Langevin, as well as establish new convergence bounds. From the methodological side, WGF formulation fosters novel sampling algorithms such as Stein variational gradient descent. In addition, the WGF formulation allows us to sample from posteriors that are unattainable via classical methods like MCMC, such as the post-Bayesian predictively-oriented posteriors. In this talk, we will survey some of these developments, especially those that are more relevant to BayesComp methodologies, and (hopefully) keep the mathematics approachable.

Date	Speaker	Title	Links
19 Mar 2026	Masha Naslidnyk University College London	Kernel Quantile Embeddings and Associated Probability Metrics
Embedding probability distributions into reproducing kernel Hilbert spaces (RKHS) has enabled powerful non-parametric methods such as the maximum mean discrepancy (MMD), a statistical distance with strong theoretical and computational properties. At its core, the MMD relies on kernel mean embeddings (KMEs) to represent distributions as mean functions in RKHS . However, it remains unclear if the mean function is the only meaningful RKHS representation. Inspired by generalised quantiles, we introduce the notion of kernel quantile embeddings (KQEs), along with a consistent estimator. We then use KQEs to construct a family of distances that (i) are probability metrics under weaker kernel conditions than MMD ;(ii) recover a kernelised form of the sliced Wasserstein distance; and(iii) can be efficiently estimated with near-linear cost. Through hypothesis testing, we show that these distances offer a competitive alternative to MMD and its fast approximations. Our findings demonstrate the value of representing distributions in Hilbert space beyond simple mean functions, paving the way for new avenues of research.
12 Mar 2026	Chen Qi Aalto University	Theoretical understanding of generalization and memorization in generative models	Paper [1] Paper [2] Paper [3] Paper [4]
Generative models are able to create new and diverse samples, yet the mechanisms that let them generalize rather than memorize remain unclear. This talk presents recent theoretical insights into how autoencoders, adversarial models, and diffusion models learn structure from data while controlling model capacity and preventing overfitting. By drawing on ideas from information theory, statistical learning, and training dynamics, the talk offers a clear picture of why these models can generalize effectively and why some of them show a strong natural resistance to memorization. The talk is mainly based on my previous work [3] and will cover introduction to other closely related works [1, 2, 4].
5 Mar 2026	Paul Fearnhead Lancaster University	Feynman-Kac Correctors in Diffusions: In-painting
The talk will discuss the paper “Feynman-Kac Correctors in Diffusions: Annealing, Guidance and Product of Experts” by Skreta et al. In particular I will look at the relationship of their approach to sampling from reward-tilted targets to that of SMC methods for in-painting.
26 Feb 2026	Jonas Latz University of Manchester	Sparse Techniques for Regression in Deep Gaussian Processes	Paper 1 Paper 2
Gaussian processes (GPs) have gained popularity as flexible machine learning models for regression and function approximation with an in-built method for uncertainty quantification. GPs suffer when the amount of training data is large or when the underlying function contains multiscale features that are difficult to represent by an isotropic kernel. The training of GPs with large-scale data is often performed through inducing point approximations (also known as sparse GP regression), where the size of the covariance matrices in GP regression is reduced considerably through a greedy search on the data set. Deep Gaussian processes have recently been introduced as hierarchical models that resolve multiscale features by composing multiple Gaussian processes. Whilst GPs can be trained through a simple analytical formula, deep GPs require a sampling or, more usually, a variational approximation. Variational approximations lead to large-scale stochastic, non-convex optimisation problems and the resulting approximation tends to represent uncertainty incorrectly. In this work, we combine variational learning with MCMC to develop a particle-based expectation-maximisation method to simultaneously find inducing points within the large-scale data (variationally) and accurately train the Gaussian processes (sampling-based). The result is a highly efficient and accurate methodology for deep GP training on large-scale data. We test the method on standard benchmark problems. Joint work with Aretha Teckentrup and Simon Urbainczyk.
19 Feb 2026	Chris Sherlock Lancaster University	Robust, partially alive particle Metropolis-Hastings via the Frankenfilter
When a hidden Markov model permits the conditional likelihood of an observation given the hidden process to be zero, all particle simulations from one observation time to the next could produce zeros. If so, the filtering distribution cannot be estimated and the estimated parameter likelihood is zero. The alive particle filter addresses this by simulating a random number of particles for each inter-observation interval, stopping after a target number of non-zero conditional likelihoods. For outlying observations or poor parameter values, a non-zero result can be extremely unlikely, and computational costs prohibitive. We introduce the Frankenfilter, a principled, partially alive particle filter that targets a user-defined amount of success whilst fixing lower and upper bounds on the number of simulations. The Frankenfilter produces unbiased estimators of the likelihood, suitable for pseudo-marginal Metropolis–Hastings (PMMH). We demonstrate that PMMH with the Frankenfilter is more robust to outliers and mis-specified initial parameter values than PMMH using standard particle filters, and is typically at least 2-3 times more efficient. We also provide advice for choosing the amount of success. In the case of n exact observations, this is particularly simple: target n successes.
12 Feb 2026		Informal session
5 Feb 2026	Andre Menezes Maynooth University	Bayesian nonparametric models for zero-inflated count-compositional data using ensembles of regression trees.
Count-compositional data arise in many different fields, including high-throughput microbiome sequencing and palynology experiments, where a common, important goal is to understand how covariates relate to the observed compositions. Existing methods often fail to simultaneously address key challenges inherent in such data, namely: overdispersion, an excess of zeros, cross-sample heterogeneity, and nonlinear covariate effects. In this talk, we first present novel probabilistic portrayals of two multivariate models designed to handle zero-inflation in count-compositional data. Then, to address the above concerns, we propose novel Bayesian nonparametric models based on ensembles of regression trees. Specifically, we leverage the recently introduced zero-and-N-inflated multinomial distribution and assign independent nonparametric Bayesian additive regression tree (BART) priors to both the compositional and structural zero probability components of our model, to flexibly capture covariate effects. We further extend this by adding latent random effects to capture overdispersion and more general dependence structures among the categories. We develop an efficient inferential algorithm combining recent data augmentation schemes with established BART sampling routines. We evaluate our proposed models in simulation studies and illustrate their applicability with two case studies in microbiome and palaeoclimate modelling.
29 Jan 2026	William Laplante University College London	Conjugate Generalised Bayesian Inference for Discrete Doubly Intractable Problems
Doubly intractable problems occur when both the likelihood and the posterior are available only in unnormalised form, with computationally intractable normalisation constants. Bayesian inference then typically requires direct approximation of the posterior through specialised and often expensive MCMC methods. In this paper, we provide a computationally efficient alternative in the form of a novel generalised Bayesian posterior that allows for conjugate inference within the class of exponential family models for discrete data. We derive theoretical guarantees to characterise the asymptotic behaviour of the generalised posterior, supporting its use for inference. The method is evaluated on a range of challenging intractable exponential family models, including the Conway-Maxwell-Poisson graphical model of multivariate count data, autoregressive discrete time series models, and Markov random fields such as the Ising and Potts models. The computational gains are significant; in our experiments, the method is between 10 and 6000 times faster than state-of-the-art Bayesian computational methods.
22 Jan 2026	Yuga Iguchi Lancaster University	Dynamical regimes of denoising diffusion models for sampling from multimodal distributions
I will discuss the mechanism of denoising diffusion models (DDMs) for sampling from multimodal distributions on $\mathbb{R}^d$. The first part of the talk will review the basics of DDMs — from discrete Markov chains to continuous-time formulations via SDEs. Then, using a mixture of two Gaussians as a canonical example of a multimodal target, I will describe how DDMs gradually transform the initial prior (a standard Gaussian) into the bimodal target distribution. In particular, I will show analytically that denoising trajectories dynamically change their behaviour during sampling, and that the denoising procedure can be characterised roughly by three stages: 1. Early stage — Contraction; 2. Intermediate stage — Expansion (contraction is lost); 3. Final stage — local attraction to a single mode, possibly contracting again locally. I will also clarify how these stages depend on properties of the target distribution, such as dimension, separation between modes, and the variances of the mixture components. This talk is based on ongoing joint work with Paul Fearnhead.
15 Jan 2026	Hefin Lambley University of Warwick	Autoencoders in function space	Slides
We propose function-space versions of autoencoders—machine-learning methods for dimension reduction and generative modelling—in both their deterministic (FAE) and variational (FVAE) forms. Formulating autoencoder objectives in function space enables training and evaluation with data discretised at arbitrary resolutions, leading to new applications such as inpainting, superresolution, and generative modelling. We discuss the technical challenges of formulating autoencoders in infinite dimension. A key issue is that FVAE’s variational inference is often ill defined, unlike in finite dimensions, limiting its applicability. We then explore specific problem classes where FVAE remains useful. We contrast this with the FAE objective, which remains well defined in many situations where FVAE fails, making it a robust and versatile alternative. We demonstrate both methods on scientific data sets, including Navier–Stokes fluid flow simulations. This is joint work with Justin Bunker and Mark Girolami (Cambridge), Andrew M. Stuart (Caltech) and T. J. Sullivan (Warwick).

Date	Speaker	Title	Links
11 Dec 2025	Zheyang Shen Newcastle University	A Computable Measure of Suboptimality for Entropy-Regularised Variational Objectives	Slides
Several emerging post-Bayesian methods target a probability distribution for which an entropy-regularised variational objective is minimised. This increased flexibility introduces a computational challenge, as one loses access to an explicit unnormalised density for the target. To mitigate this difficulty, we introduce a novel measure of suboptimality called ‘gradient discrepancy’, and in particular a ‘kernel’ gradient discrepancy (KGD) that can be explicitly computed. In the standard Bayesian context, KGD coincides with the kernel Stein discrepancy (KSD), and we obtain a novel characterisation of KSD as measuring the size of a variational gradient. Outside this familiar setting, KGD enables novel sampling algorithms to be developed and compared, even when unnormalised densities cannot be obtained. To illustrate this point, several novel algorithms are proposed and studied, including a natural generalisation of Stein variational gradient descent, and applications to mean-field neural networks and predictively oriented posteriors are presented. On the theoretical side, our principal contribution is to establish sufficient conditions for desirable properties of KGD, such as continuity and convergence control.
4 Dec 2025	Giorgos Vasdekis	Sampling with time-changed Markov processes	Slides
We introduce a framework of time-changed Markov processes to speed up the convergence of Markov chain Monte Carlo (MCMC) algorithms in the context of multimodal distributions and rare event simulation. The time-changed process is defined by adjusting the speed of time of a base process via a user-chosen, state-dependent function. We apply this framework to several Markov processes from the MCMC literature, such as Langevin diffusions and piecewise deterministic Markov processes, obtaining novel modifications of classical algorithms and also re-discovering known MCMC algorithms. We prove theoretical properties of the time-changed process under suitable conditions on the base process, focusing on connecting the stationary distributions and qualitative convergence properties such as geometric and uniform ergodicity, as well as a functional central limit theorem. Time permitting, we will compare our approach with the framework of space transformations, clarifying the similarities between the approaches. This is joint work with Andrea Bertazzi.
20 Nov 2025	Lanya Yang Lancaster University	Exchangeable Particle Gibbs for Markov Jump Processes	Slides
Inference in stochastic reaction-network models—such as the SEIR epidemic model or the Lotka–Volterra predator–prey system—is crucial for understanding the dynamics of interacting systems in epidemiology, ecology, and systems biology. These models are typically represented as Markov jump processes (MJPs) with intractable likelihoods. As a result, particle Markov chain Monte Carlo (particle MCMC) methods, particularly the Particle Gibbs (PG) sampler, have become standard tools for Bayesian inference. However, PG suffers from severe particle degeneracy, especially in high-dimensional state spaces, leading to poor mixing and inefficiency. In this talk, I focus on improving the efficiency of particle MCMC methods for inference in reaction networks by addressing the degeneracy problem. Building on recent work on the Exchangeable Particle Gibbs (xPG) sampler for continuous-state diffusions, this project develops a novel version of xPG tailored to discrete-state reaction networks, where randomness is driven by Poisson processes rather than Brownian motion. The proposed method retains the exchangeability framework of xPG while adapting it to the structural and computational challenges specific to reaction networks.
30 Oct 2025	Rui Zhang Lancaster University	A Dynamic Perspective of Matern Gaussian Processes	Slides HTML Slides
The ubiquitous Gaussian process (GP) models in statistics and machine learning (Williams and Rasmussen, 2006) are static by default, under either the weight-space or the function-space view (Kanagawa et al., 2025): the observation and test locations have no unilateral dependency order, which also explains the cubic scaling of computational cost in GP regression. The dynamic view of Gaussian processes, on the other hand, while only available for a class of GP models, reformulates the dependency structure unilaterally (Whittle, 1954) to enable sequential inference for GP regression with computational costs that can scale linearly (Hartikainen and Sarkka, 2010; Sarkka and Hartikainen, 2012) with little to no approximation. This talk explores this dynamic perspective of (Matern) Gaussian processes and some of its consequences.
16 Oct 2025	Henry Moss Lancaster University	GPGreen: Linear Operator Learning with Gaussian Processes
4 Sep 2025	Rafael Izbicki Federal University of São Carlos, Brazil	Simulation-Based Calibration of Confidence Sets for Statistical Models
7 Aug 2025	Jixiang Qing Imperial College London	Bayesian Optimization Over Graphs With Shortest-Path Encodings