61 research outputs found
Particle-MALA and Particle-mGRAD: Gradient-based MCMC methods for high-dimensional state-space models
State-of-the-art methods for Bayesian inference in state-space models are (a)
conditional sequential Monte Carlo (CSMC) algorithms; (b) sophisticated
'classical' MCMC algorithms like MALA, or mGRAD from Titsias and
Papaspiliopoulos (2018, arXiv:1610.09641v3 [stat.ML]). The former propose
N particles at each time step to exploit the model's 'decorrelation-over-time'
property and thus scale favourably with the time horizon, T, but break down
if the dimension of the latent states, D, is large. The latter leverage
gradient-/prior-informed local proposals to scale favourably with D but
exhibit sub-optimal scalability with T due to a lack of model-structure
exploitation. We introduce methods which combine the strengths of both
approaches. The first, Particle-MALA, spreads particles locally around the
current state using gradient information, thus extending MALA to T > 1 time
steps and N > 1 proposals. The second, Particle-mGRAD, additionally
incorporates (conditionally) Gaussian prior dynamics into the proposal, thus
extending the mGRAD algorithm to T > 1 time steps and N > 1 proposals. We
prove that Particle-mGRAD interpolates between CSMC and Particle-MALA,
resolving the 'tuning problem' of choosing between CSMC (superior for highly
informative prior dynamics) and Particle-MALA (superior for weakly informative
prior dynamics). We similarly extend other 'classical' MCMC approaches like
auxiliary MALA, aGRAD, and preconditioned Crank-Nicolson-Langevin (PCNL) to
T > 1 time steps and N > 1 proposals. In experiments, for both highly and
weakly informative prior dynamics, our methods substantially improve upon both
CSMC and sophisticated 'classical' MCMC approaches.
Comment: 29 pages + 31 pages appendix. 6 figures and tables (+ 7 in appendix).
Code available at https://github.com/AdrienCorenflos/particle_mala
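For readers unfamiliar with the 'classical' building block named in the abstract above, the following is a minimal sketch of plain MALA (not Particle-MALA) on a one-dimensional target. The standard-normal target, the step size of 1.0, and the function names are illustrative assumptions and are not taken from the paper or its repository.

```python
import math
import random


def log_target(x):
    # Illustrative target (assumption): standard normal, up to a constant.
    return -0.5 * x * x


def grad_log_target(x):
    # Gradient of the log target above.
    return -x


def log_proposal(to, frm, step):
    # Log density (up to a constant) of the Langevin proposal frm -> to.
    mean = frm + 0.5 * step * grad_log_target(frm)
    return -((to - mean) ** 2) / (2.0 * step)


def mala_step(x, step, rng):
    """One MALA update: gradient-informed proposal, then accept/reject."""
    noise = rng.gauss(0.0, 1.0)
    proposal = x + 0.5 * step * grad_log_target(x) + math.sqrt(step) * noise
    log_alpha = (
        log_target(proposal) + log_proposal(x, proposal, step)
        - log_target(x) - log_proposal(proposal, x, step)
    )
    # Metropolis-Hastings correction keeps the target invariant.
    if rng.random() < math.exp(min(0.0, log_alpha)):
        return proposal, True
    return x, False


def run_chain(n_steps, step=1.0, seed=0):
    """Run a MALA chain from x = 0; return samples and acceptance rate."""
    rng = random.Random(seed)
    x, samples, accepted = 0.0, [], 0
    for _ in range(n_steps):
        x, ok = mala_step(x, step, rng)
        samples.append(x)
        accepted += ok
    return samples, accepted / n_steps
```

Calling `run_chain(20000)` returns the chain and its acceptance rate; the sample mean and variance should be close to 0 and 1 for this toy target.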
Advancing continual lifelong learning in neural information retrieval: definition, dataset, framework, and empirical evaluation
Continual learning refers to the capability of a machine learning model to
learn and adapt to new information, without compromising its performance on
previously learned tasks. Although several studies have investigated continual
learning methods for information retrieval tasks, a well-defined task
formulation is still lacking, and it is unclear how typical learning strategies
perform in this context. To address this challenge, a systematic task
formulation of continual neural information retrieval is presented, along with
a multiple-topic dataset that simulates continuous information retrieval. A
comprehensive continual neural information retrieval framework consisting of
typical retrieval models and continual learning strategies is then proposed.
Empirical evaluations illustrate that the proposed framework can successfully
prevent catastrophic forgetting in neural information retrieval and enhance
performance on previously learned tasks. The results indicate that
embedding-based retrieval models experience a decline in their continual
learning performance as the topic shift distance and dataset volume of new
tasks increase. In contrast, pretraining-based models do not show any such
correlation. Adopting suitable learning strategies can mitigate the effects of
topic shift and data augmentation.
Comment: Submitted to Information Science
On extended state-space constructions for Monte Carlo methods
This thesis develops computationally efficient methodology in two areas. Firstly, we consider a particularly challenging class of discretely observed continuous-time point-process models. For these, we analyse and improve an existing filtering algorithm based on sequential Monte Carlo (SMC) methods. To estimate the static parameters in such models, we devise novel particle Gibbs samplers. One of these exploits a sophisticated non-centred parametrisation whose benefits in a Markov chain Monte Carlo (MCMC) context have previously been limited by the lack of blockwise updates for the latent point process. We apply this algorithm to a Lévy-driven stochastic volatility model. Secondly, we devise novel Monte Carlo methods – based around pseudo-marginal and conditional SMC approaches – for performing optimisation in latent-variable models and more generally. To ease the explanation of the wide range of techniques employed in this work, we describe a generic importance-sampling framework which admits virtually all Monte Carlo methods, including SMC and MCMC methods, as special cases. Indeed, hierarchical combinations of different Monte Carlo schemes such as SMC within MCMC or SMC within SMC can be justified as repeated applications of this framework.
Efficient sequential Monte Carlo algorithms for integrated population models
In statistical ecology, state-space models are commonly used to represent the biological mechanisms by which population counts—often subdivided according to characteristics such as age group, gender or breeding status—evolve over time. As the counts are only noisily or partially observed, they are typically not sufficiently informative about demographic parameters of interest and must be combined with additional ecological observations within an integrated data analysis. Fitting integrated models can be challenging, especially if the constituent state-space model is nonlinear/non-Gaussian. We first propose an efficient particle Markov chain Monte Carlo algorithm to estimate demographic parameters without a need for linear or Gaussian approximations. We then incorporate this algorithm into a sequential Monte Carlo sampler to perform model comparison. We also exploit the integrated model structure to enhance the efficiency of both algorithms. The methods are demonstrated on two real data sets: little owls and grey herons. For the owls, we find that the data do not support an ecological hypothesis found in the literature. For the herons, our methodology highlights the limitations of existing models which we address through a novel regime-switching model. Supplementary materials accompanying this paper appear online
Neural-based cross-modal search and retrieval of artwork
Creating an intelligent search and retrieval system for artwork images, particularly paintings, is crucial for documenting cultural heritage, fostering wider public engagement, and advancing artistic analysis and interpretation. Visual-Semantic Embedding (VSE) networks are deep learning models used for information retrieval, which learn joint representations of textual and visual data, enabling 1) cross-modal search and retrieval tasks, such as image-to-text and text-to-image retrieval; and 2) relation-focused retrieval to capture entity relationships and provide more contextually relevant search results. Although VSE networks have played a significant role in cross-modal information retrieval, their application to painting datasets, such as ArtUK, remains unexplored. This paper introduces BoonArt, a VSE-based cross-modal search engine that allows users to search for images using textual queries, and to obtain textual descriptions along with the corresponding images when using image queries. The performance of BoonArt was evaluated using the ArtUK dataset. Experimental evaluations revealed that BoonArt achieved 97% Recall@10 for image-to-text retrieval, and 97.4% Recall@10 for text-to-image retrieval. By bridging the gap between textual and visual modalities, BoonArt provides a much-improved search performance compared to traditional search engines, such as the one provided by the ArtUK website. BoonArt can be utilised to work with other artwork datasets.
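The Recall@10 figures quoted above follow a metric that can be sketched in a few lines. This sketch assumes the common cross-modal convention that a query counts as a hit when at least one of its relevant items appears in the top K results; the function names and the toy data are hypothetical and do not come from BoonArt.

```python
def hit_at_k(ranked_ids, relevant_ids, k):
    """Return 1.0 if any relevant item is among the top-k results, else 0.0."""
    return 1.0 if any(item in relevant_ids for item in ranked_ids[:k]) else 0.0


def recall_at_k(rankings, relevant_sets, k):
    """Average the per-query hits over all queries."""
    hits = sum(hit_at_k(r, rel, k) for r, rel in zip(rankings, relevant_sets))
    return hits / len(rankings)


# Toy example (hypothetical data): two queries, three candidates each.
rankings = [["a", "b", "c"], ["d", "e", "f"]]
relevant_sets = [{"b"}, {"f"}]
recall_top2 = recall_at_k(rankings, relevant_sets, k=2)
recall_top3 = recall_at_k(rankings, relevant_sets, k=3)
```

In the toy data, only the first query has its relevant item within the top two results, while both queries are satisfied within the top three.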
Morphological Image Analysis and Feature Extraction for Reasoning with AI-based Defect Detection and Classification Models
As the use of artificial intelligence (AI) models becomes more prevalent in
industries such as engineering and manufacturing, it is essential that these
models provide transparent reasoning behind their predictions. This paper
proposes the AI-Reasoner, which extracts the morphological characteristics of
defects (DefChars) from images and utilises decision trees to reason with the
DefChar values. Thereafter, the AI-Reasoner exports visualisations (i.e.
charts) and textual explanations to provide insights into outputs made by
mask-based defect detection and classification models. It also provides
effective mitigation strategies to enhance data pre-processing and overall
model performance. The AI-Reasoner was tested on explaining the outputs of an
IE Mask R-CNN model using a set of 366 images containing defects. The results
demonstrated its effectiveness in explaining the IE Mask R-CNN model's
predictions. Overall, the proposed AI-Reasoner provides a solution for
improving the performance of AI models in industrial applications that require
defect analysis.
Comment: 8 pages, 3 figures, 5 tables; submitted to 2023 IEEE symposium series
on computational intelligence (SSCI)
Limit theorems for sequential MCMC methods
Both sequential Monte Carlo (SMC) methods (a.k.a. ‘particle filters’) and sequential Markov chain Monte Carlo (sequential MCMC) methods constitute classes of algorithms which can be used to approximate expectations with respect to (a sequence of) probability distributions and their normalising constants. While SMC methods sample particles conditionally independently at each time step, sequential MCMC methods sample particles according to a Markov chain Monte Carlo (MCMC) kernel. Introduced over twenty years ago in [6], sequential MCMC methods have attracted renewed interest recently as they empirically outperform SMC methods in some applications. We establish an L_r-inequality (which implies a strong law of large numbers) and a central limit theorem for sequential MCMC methods and provide conditions under which errors can be controlled uniformly in time. In the context of state-space models, we also provide conditions under which sequential MCMC methods can indeed outperform standard SMC methods in terms of asymptotic variance of the corresponding Monte Carlo estimators.
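As a point of reference for the standard SMC baseline described above, here is a minimal sketch of a bootstrap particle filter in which particles are resampled and propagated conditionally independently at each time step. The linear-Gaussian model, its parameter values, and all function names are illustrative assumptions; this is not the sequential MCMC method analysed in the paper.

```python
import math
import random


def bootstrap_particle_filter(observations, n_particles, rng,
                              phi=0.9, sigma_x=1.0, sigma_y=1.0):
    """Bootstrap particle filter for the assumed toy model
    x_t = phi * x_{t-1} + sigma_x * e_t,  y_t = x_t + sigma_y * v_t.
    Returns filtering means and a log normalising-constant estimate."""
    # Draw initial particles from the stationary distribution of the state.
    init_sd = sigma_x / math.sqrt(1.0 - phi ** 2)
    particles = [rng.gauss(0.0, init_sd) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    log_norm_const = 0.0
    means = []
    for t, y in enumerate(observations):
        if t > 0:
            # Multinomial resampling, then independent propagation.
            ancestors = rng.choices(particles, weights=weights, k=n_particles)
            particles = [rng.gauss(phi * a, sigma_x) for a in ancestors]
        # Weight each particle by the Gaussian observation density.
        log_w = [
            -0.5 * ((y - x) / sigma_y) ** 2
            - math.log(sigma_y) - 0.5 * math.log(2.0 * math.pi)
            for x in particles
        ]
        max_log_w = max(log_w)  # subtract the maximum for numerical stability
        unnorm = [math.exp(lw - max_log_w) for lw in log_w]
        total = sum(unnorm)
        log_norm_const += max_log_w + math.log(total / n_particles)
        weights = [u / total for u in unnorm]
        means.append(sum(w * x for w, x in zip(weights, particles)))
    return means, log_norm_const
```

For example, `bootstrap_particle_filter([0.5, -0.2, 1.0], 500, random.Random(1))` returns three filtering means and an estimate of the log-likelihood of the three observations under the toy model.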
Prognostic Factors Affecting Outcome after Allogeneic Transplantation for Hematological Malignancies from Unrelated Donors: Results from a Randomized Trial
Several prognostic factors for the outcome after allogeneic hematopoietic stem-cell transplant (HSCT) from matched unrelated donors have been postulated from registry data; however, data from randomized trials are lacking. We present analyses on the effects of patient-related, donor-related, and treatment-related prognostic factors on acute GVHD (aGVHD), chronic GVHD (cGVHD), relapse, nonrelapse mortality (NRM), disease-free survival (DFS), and overall survival (OS) in a randomized, multicenter, open-label, phase III trial comparing standard graft-versus-host-disease (GVHD) prophylaxis with and without pretransplantation ATG-Fresenius (ATG-F) in 201 adult patients receiving myeloablative conditioning before HSCT from HLA-A, HLA-B antigen, HLA-DRB1, HLA-DQB1 allele matched unrelated donors. High-resolution (allele) testing results for HLA-A, HLA-B, and HLA-C were obtained after study closure, and the impact of an HLA 10/10 4-digit mismatch on outcome and on the treatment effect of ATG-F versus control was investigated. Advanced disease was a negative factor for relapse, DFS, and OS. Donor age ≥40 adversely affected the risk of aGVHD III-IV, extensive cGVHD, and OS. Younger donors are to be preferred in unrelated donor transplantation. Advanced disease patients need special precautions to improve outcome. The degree of mismatch had no major influence on the positive effect of ATG-F on the reduction of aGVHD and cGVHD.
- …