61 research outputs found

    Particle-MALA and Particle-mGRAD: Gradient-based MCMC methods for high-dimensional state-space models

    State-of-the-art methods for Bayesian inference in state-space models are (a) conditional sequential Monte Carlo (CSMC) algorithms; (b) sophisticated 'classical' MCMC algorithms like MALA, or mGRAD from Titsias and Papaspiliopoulos (2018, arXiv:1610.09641v3 [stat.ML]). The former propose N particles at each time step to exploit the model's 'decorrelation-over-time' property and thus scale favourably with the time horizon, T, but break down if the dimension of the latent states, D, is large. The latter leverage gradient-/prior-informed local proposals to scale favourably with D but exhibit sub-optimal scalability with T due to a lack of model-structure exploitation. We introduce methods which combine the strengths of both approaches. The first, Particle-MALA, spreads N particles locally around the current state using gradient information, thus extending MALA to T > 1 time steps and N > 1 proposals. The second, Particle-mGRAD, additionally incorporates (conditionally) Gaussian prior dynamics into the proposal, thus extending the mGRAD algorithm to T > 1 time steps and N > 1 proposals. We prove that Particle-mGRAD interpolates between CSMC and Particle-MALA, resolving the 'tuning problem' of choosing between CSMC (superior for highly informative prior dynamics) and Particle-MALA (superior for weakly informative prior dynamics). We similarly extend other 'classical' MCMC approaches like auxiliary MALA, aGRAD, and preconditioned Crank-Nicolson-Langevin (PCNL) to T > 1 time steps and N > 1 proposals. In experiments, for both highly and weakly informative prior dynamics, our methods substantially improve upon both CSMC and sophisticated 'classical' MCMC approaches.
    Comment: 29 pages + 31 pages appendix. 6 figures and tables (+ 7 in appendix). Code available at https://github.com/AdrienCorenflos/particle_mala
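    The gradient-informed local proposal that Particle-MALA spreads N particles around is the classical MALA kernel. As a point of reference only (this is the textbook single-proposal, single-time-step algorithm, not the paper's method), a minimal sketch:

    ```python
    import numpy as np

    def mala_step(x, log_target, grad_log_target, step_size, rng):
        """One Metropolis-adjusted Langevin (MALA) step: a local proposal
        drifted along the gradient of the log-target, with an MH correction."""
        # Langevin proposal: gradient drift plus Gaussian noise.
        mean_fwd = x + 0.5 * step_size * grad_log_target(x)
        prop = mean_fwd + np.sqrt(step_size) * rng.standard_normal(x.shape)
        # Mean of the reverse move, needed for the acceptance ratio.
        mean_rev = prop + 0.5 * step_size * grad_log_target(prop)
        log_q_fwd = -np.sum((prop - mean_fwd) ** 2) / (2.0 * step_size)
        log_q_rev = -np.sum((x - mean_rev) ** 2) / (2.0 * step_size)
        log_alpha = log_target(prop) - log_target(x) + log_q_rev - log_q_fwd
        if np.log(rng.uniform()) < log_alpha:
            return prop, True
        return x, False
    ```

    Particle-MALA generalises this kernel by proposing N such candidates per time step across T > 1 steps, so that the favourable scaling in D is retained while the model's decorrelation over time is exploited.
    
    
    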

    Advancing continual lifelong learning in neural information retrieval: definition, dataset, framework, and empirical evaluation

    Continual learning refers to the capability of a machine learning model to learn and adapt to new information, without compromising its performance on previously learned tasks. Although several studies have investigated continual learning methods for information retrieval tasks, a well-defined task formulation is still lacking, and it is unclear how typical learning strategies perform in this context. To address this challenge, a systematic task formulation of continual neural information retrieval is presented, along with a multiple-topic dataset that simulates continuous information retrieval. A comprehensive continual neural information retrieval framework consisting of typical retrieval models and continual learning strategies is then proposed. Empirical evaluations illustrate that the proposed framework can successfully prevent catastrophic forgetting in neural information retrieval and enhance performance on previously learned tasks. The results indicate that embedding-based retrieval models experience a decline in their continual learning performance as the topic shift distance and dataset volume of new tasks increase. In contrast, pretraining-based models do not show any such correlation. Adopting suitable learning strategies can mitigate the effects of topic shift and data augmentation.
    Comment: Submitted to Information Science

    On extended state-space constructions for Monte Carlo methods

    This thesis develops computationally efficient methodology in two areas. Firstly, we consider a particularly challenging class of discretely observed continuous-time point-process models. For these, we analyse and improve an existing filtering algorithm based on sequential Monte Carlo (SMC) methods. To estimate the static parameters in such models, we devise novel particle Gibbs samplers. One of these exploits a sophisticated non-centred parametrisation whose benefits in a Markov chain Monte Carlo (MCMC) context have previously been limited by the lack of blockwise updates for the latent point process. We apply this algorithm to a Lévy-driven stochastic volatility model. Secondly, we devise novel Monte Carlo methods – based around pseudo-marginal and conditional SMC approaches – for performing optimisation in latent-variable models and more generally. To ease the explanation of the wide range of techniques employed in this work, we describe a generic importance-sampling framework which admits virtually all Monte Carlo methods, including SMC and MCMC methods, as special cases. Indeed, hierarchical combinations of different Monte Carlo schemes, such as SMC within MCMC or SMC within SMC, can be justified as repeated applications of this framework.
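    The building block that such a generic framework composes into SMC and MCMC schemes is ordinary importance sampling. A minimal self-normalised sketch (an illustration of the standard estimator, not the thesis's full construction):

    ```python
    import numpy as np

    def importance_sampling(log_target, log_proposal, sample_proposal, n, rng):
        """Self-normalised importance sampling: draw from a proposal, weight
        by the (unnormalised) target-to-proposal density ratio."""
        xs = sample_proposal(n, rng)
        log_w = log_target(xs) - log_proposal(xs)
        w = np.exp(log_w - log_w.max())  # stabilise before normalising
        w /= w.sum()
        return xs, w  # weighted sample approximating the target
    ```

    Expectations under the target are then estimated as weighted averages, e.g. `np.sum(w * xs)` for the mean; SMC applies this recursively over a sequence of targets, with resampling.
    
    
    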

    Efficient sequential Monte Carlo algorithms for integrated population models

    In statistical ecology, state-space models are commonly used to represent the biological mechanisms by which population counts—often subdivided according to characteristics such as age group, gender or breeding status—evolve over time. As the counts are only noisily or partially observed, they are typically not sufficiently informative about demographic parameters of interest and must be combined with additional ecological observations within an integrated data analysis. Fitting integrated models can be challenging, especially if the constituent state-space model is nonlinear/non-Gaussian. We first propose an efficient particle Markov chain Monte Carlo algorithm to estimate demographic parameters without a need for linear or Gaussian approximations. We then incorporate this algorithm into a sequential Monte Carlo sampler to perform model comparison. We also exploit the integrated model structure to enhance the efficiency of both algorithms. The methods are demonstrated on two real data sets: little owls and grey herons. For the owls, we find that the data do not support an ecological hypothesis found in the literature. For the herons, our methodology highlights the limitations of existing models which we address through a novel regime-switching model. Supplementary materials accompanying this paper appear online.

    Neural-based cross-modal search and retrieval of artwork

    Creating an intelligent search and retrieval system for artwork images, particularly paintings, is crucial for documenting cultural heritage, fostering wider public engagement, and advancing artistic analysis and interpretation. Visual-Semantic Embedding (VSE) networks are deep learning models used for information retrieval, which learn joint representations of textual and visual data, enabling 1) cross-modal search and retrieval tasks, such as image-to-text and text-to-image retrieval; and 2) relation-focused retrieval to capture entity relationships and provide more contextually relevant search results. Although VSE networks have played a significant role in cross-modal information retrieval, their application to painting datasets, such as ArtUK, remains unexplored. This paper introduces BoonArt, a VSE-based cross-modal search engine that allows users to search for images using textual queries, and to obtain textual descriptions along with the corresponding images when using image queries. The performance of BoonArt was evaluated using the ArtUK dataset. Experimental evaluations revealed that BoonArt achieved 97% Recall@10 for image-to-text retrieval, and 97.4% Recall@10 for text-to-image retrieval. By bridging the gap between textual and visual modalities, BoonArt provides a much-improved search performance compared to traditional search engines, such as the one provided by the ArtUK website. BoonArt can be utilised to work with other artwork datasets.

    Morphological Image Analysis and Feature Extraction for Reasoning with AI-based Defect Detection and Classification Models

    As the use of artificial intelligence (AI) models becomes more prevalent in industries such as engineering and manufacturing, it is essential that these models provide transparent reasoning behind their predictions. This paper proposes the AI-Reasoner, which extracts the morphological characteristics of defects (DefChars) from images and utilises decision trees to reason with the DefChar values. Thereafter, the AI-Reasoner exports visualisations (i.e. charts) and textual explanations to provide insights into outputs made by mask-based defect detection and classification models. It also provides effective mitigation strategies to enhance data pre-processing and overall model performance. The AI-Reasoner was tested on explaining the outputs of an IE Mask R-CNN model using a set of 366 images containing defects. The results demonstrated its effectiveness in explaining the IE Mask R-CNN model's predictions. Overall, the proposed AI-Reasoner provides a solution for improving the performance of AI models in industrial applications that require defect analysis.
    Comment: 8 pages, 3 figures, 5 tables; submitted to 2023 IEEE symposium series on computational intelligence (SSCI)

    Limit theorems for sequential MCMC methods

    Both sequential Monte Carlo (SMC) methods (a.k.a. 'particle filters') and sequential Markov chain Monte Carlo (sequential MCMC) methods constitute classes of algorithms which can be used to approximate expectations with respect to (a sequence of) probability distributions and their normalising constants. While SMC methods sample particles conditionally independently at each time step, sequential MCMC methods sample particles according to a Markov chain Monte Carlo (MCMC) kernel. Introduced over twenty years ago in [6], sequential MCMC methods have attracted renewed interest recently as they empirically outperform SMC methods in some applications. We establish an Lp-inequality (which implies a strong law of large numbers) and a central limit theorem for sequential MCMC methods and provide conditions under which errors can be controlled uniformly in time. In the context of state-space models, we also provide conditions under which sequential MCMC methods can indeed outperform standard SMC methods in terms of asymptotic variance of the corresponding Monte Carlo estimators.
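    To make the distinction concrete, here is a toy sequential MCMC filter for a hypothetical linear-Gaussian model (x_t = 0.9 x_{t-1} + N(0,1), y_t = x_t + N(0,1)); the model, step size, and initialisation are illustrative choices, not taken from the paper. At each time step the particle set is one long Markov chain targeting an approximation of the filtering distribution, rather than a conditionally independent sample as in SMC:

    ```python
    import numpy as np

    def sequential_mcmc_filter(ys, n_particles, rng, step=0.5):
        """Toy sequential MCMC filter: at each time step, run a random-walk
        Metropolis chain whose target is the previous particle cloud's
        predictive mixture times the current likelihood."""
        prev = rng.standard_normal(n_particles)  # particles at time t-1
        for y in ys:
            def log_target(x):
                pred = -0.5 * (x - 0.9 * prev) ** 2   # predictive mixture terms
                lik = -0.5 * (y - x) ** 2             # Gaussian likelihood
                return np.logaddexp.reduce(pred) + lik
            x = 0.9 * prev[rng.integers(n_particles)]  # chain starting point
            new = np.empty(n_particles)
            for i in range(n_particles):               # random-walk MH kernel
                prop = x + step * rng.standard_normal()
                if np.log(rng.uniform()) < log_target(prop) - log_target(x):
                    x = prop
                new[i] = x
            prev = new
        return prev  # particles approximating the final filtering distribution
    ```

    The successive particles here are Markov-chain-correlated, which is exactly why the limit theorems for SMC do not carry over directly and the paper's dedicated analysis is needed.
    
    
    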

    Prognostic Factors Affecting Outcome after Allogeneic Transplantation for Hematological Malignancies from Unrelated Donors: Results from a Randomized Trial

    Several prognostic factors for the outcome after allogeneic hematopoietic stem-cell transplant (HSCT) from matched unrelated donors have been postulated from registry data; however, data from randomized trials are lacking. We present analyses on the effects of patient-related, donor-related, and treatment-related prognostic factors on acute GVHD (aGVHD), chronic GVHD (cGVHD), relapse, nonrelapse mortality (NRM), disease-free survival (DFS), and overall survival (OS) in a randomized, multicenter, open-label, phase III trial comparing standard graft-versus-host-disease (GVHD) prophylaxis with and without pretransplantation ATG-Fresenius (ATG-F) in 201 adult patients receiving myeloablative conditioning before HSCT from HLA-A, HLA-B antigen, HLA-DRB1, HLA-DQB1 allele matched unrelated donors. High-resolution (allele-level) typing results for HLA-A, HLA-B, and HLA-C were obtained after study closure, and the impact of an HLA 10/10 4-digit mismatch on outcome and on the treatment effect of ATG-F versus control was investigated. Advanced disease was a negative factor for relapse, DFS, and OS. Donor age ≥40 years adversely affected the risk of aGVHD III-IV, extensive cGVHD, and OS. Younger donors are to be preferred in unrelated donor transplantation. Patients with advanced disease need special precautions to improve outcome. The degree of mismatch had no major influence on the positive effect of ATG-F on the reduction of aGVHD and cGVHD.