34 research outputs found

    Pre-processing for approximate Bayesian computation in image analysis

    Most of the existing algorithms for approximate Bayesian computation (ABC) assume that it is feasible to simulate pseudo-data from the model at each iteration. However, the computational cost of these simulations can be prohibitive for high-dimensional data. An important example is the Potts model, which is commonly used in image analysis. Images encountered in real-world applications can have millions of pixels, so scalability is a major concern. We apply ABC with a synthetic likelihood to the hidden Potts model with additive Gaussian noise. Using a pre-processing step, we fit a binding function to model the relationship between the model parameters and the synthetic likelihood parameters. Our numerical experiments demonstrate that the precomputed binding function dramatically improves the scalability of ABC, reducing the average runtime required for model fitting from 71 hours to only 7 minutes. We also illustrate the method by estimating the smoothing parameter for remotely sensed satellite imagery. Without precomputation, Bayesian inference is impractical for datasets of that scale. Comment: 5th IMS-ISBA joint meeting (MCMSki IV).
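    The computational gain comes from moving the expensive forward simulations into a one-off pre-processing step: once a binding function mapping the smoothing parameter to the mean and standard deviation of the summary statistic has been fitted, the Gaussian synthetic likelihood can be evaluated cheaply at every MCMC iteration. The Python sketch below illustrates that pattern under simplifying assumptions (a single parameter, a scalar summary, a spline-based binding function, a random-walk sampler); the names are illustrative and this is not the authors' implementation.

```python
# Minimal sketch of ABC with a precomputed binding function (illustrative only).
import numpy as np
from scipy.interpolate import UnivariateSpline
from scipy.stats import norm

def fit_binding_function(beta_grid, summary_draws):
    """Smooth curves for the mean and sd of the summary statistic.

    summary_draws[i] holds summaries of pseudo-data simulated at beta_grid[i]
    during the expensive, one-off pre-processing step.
    """
    mu = np.array([np.mean(s) for s in summary_draws])
    sd = np.array([np.std(s) for s in summary_draws])
    return (UnivariateSpline(beta_grid, mu, s=0),
            UnivariateSpline(beta_grid, sd, s=0))

def synthetic_log_lik(beta, observed_summary, mu_fn, sd_fn):
    """Gaussian synthetic likelihood evaluated via the binding function."""
    return norm.logpdf(observed_summary, loc=mu_fn(beta), scale=sd_fn(beta))

def abc_mcmc(observed_summary, mu_fn, sd_fn, n_iter=10_000,
             beta0=0.5, proposal_sd=0.05, beta_max=2.0):
    """Random-walk Metropolis for beta; no pseudo-data simulation at run time."""
    beta = beta0
    log_lik = synthetic_log_lik(beta, observed_summary, mu_fn, sd_fn)
    chain = np.empty(n_iter)
    for i in range(n_iter):
        prop = beta + proposal_sd * np.random.randn()
        if 0.0 < prop < beta_max:  # uniform prior on (0, beta_max)
            log_lik_prop = synthetic_log_lik(prop, observed_summary, mu_fn, sd_fn)
            if np.log(np.random.rand()) < log_lik_prop - log_lik:
                beta, log_lik = prop, log_lik_prop
        chain[i] = beta
    return chain
```

    Because the expensive step in the paper is the forward simulation of the Potts model itself, skipping it at every iteration is what reduces the runtime from hours to minutes.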

    Bayesian quantification for coherent anti-Stokes Raman scattering spectroscopy

    We propose a Bayesian statistical model for analyzing coherent anti-Stokes Raman scattering (CARS) spectra. Our quantitative analysis includes statistical estimation of the constituent line-shape parameters, the underlying Raman signal, the error-corrected CARS spectrum, and the measured CARS spectrum. As such, this work enables extensive uncertainty quantification in the context of CARS spectroscopy. Furthermore, we present an unsupervised method for improving the spectral resolution of Raman-like spectra that requires little to no a priori information. Finally, the recently proposed wavelet prism method for correcting experimental artefacts in CARS is enhanced by using interpolation techniques for wavelets. The method is validated using CARS spectra of adenosine mono-, di-, and triphosphate in water, as well as equimolar aqueous solutions of D-fructose, D-glucose, and their disaccharide combination, sucrose.
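    As a deliberately simplified illustration of one ingredient of such an analysis, the sketch below models an underlying Raman signal as a sum of Lorentzian lines with a Gaussian measurement-error log-likelihood. This is not the authors' CARS model, which also treats the non-resonant background and error correction; all names here are assumptions for exposition.

```python
# Toy line-shape model: sum of Lorentzians plus Gaussian noise (illustrative only).
import numpy as np

def lorentzian_lines(nu, amplitudes, centres, widths):
    """Sum of Lorentzian line shapes on the wavenumber grid nu."""
    nu = np.asarray(nu, dtype=float)[:, None]
    return np.sum(amplitudes * widths**2 / ((nu - centres)**2 + widths**2), axis=1)

def log_likelihood(observed, nu, amplitudes, centres, widths, noise_sd):
    """Gaussian log-likelihood of an observed spectrum, up to an additive constant."""
    observed = np.asarray(observed, dtype=float)
    model = lorentzian_lines(nu, np.asarray(amplitudes),
                             np.asarray(centres), np.asarray(widths))
    resid = observed - model
    return -0.5 * np.sum((resid / noise_sd) ** 2) - observed.size * np.log(noise_sd)
```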

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that covers a variety of research fields, such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium, consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performance. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to capture all relevant articles completely. The established database server, located at https://relishdb.ict.griffith.edu.au, is freely available for downloading the annotation data and for blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new, powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research. Peer reviewed.
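    For concreteness, the sketch below shows the simplest of the baseline families named above: ranking candidate articles by TF-IDF cosine similarity to a seed article's title/abstract. It uses scikit-learn for brevity; the benchmark's own baselines (BM25, TF-IDF, PubMed Related Articles) are separate, tuned implementations, so treat this only as an illustration of the idea.

```python
# Toy TF-IDF ranking baseline (illustrative only).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_by_tfidf(seed_text, candidate_texts):
    """Return candidate indices ordered by decreasing similarity to the seed."""
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform([seed_text] + list(candidate_texts))
    scores = cosine_similarity(matrix[0], matrix[1:]).ravel()
    order = scores.argsort()[::-1]
    return order, scores[order]
```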

    Integrating XSL-FO with Enterprise Reporting

    This paper discusses a project to integrate the processing of XSL Formatting Objects (XSL-FO) within an enterprise reporting solution. The software components utilised in the implementation form part of Oracle eBusiness Suite. However, the findings from this project are applicable to a range of XML-based technologies, independent of vendor. The Report Manager project is unusual in a number of ways, the main one being the use of Microsoft Excel spreadsheets as a medium for XSL-FO output, as well as for editing the XSL-FO templates. Excel is ubiquitous in business, and it is the expected familiarity of our target users with this tool that motivates this approach. The spreadsheet medium also provides users with additional means for interacting with the data and performing further analysis. It has clear advantages over PDF or HTML output for this purpose. XSL-FO provides a high degree of control over the visual representation of an XML document, in an output-independent manner. The same template can be used to render a document to PDF, HTML and Excel (as well as any other output medium supported by the XSL-FO rendering engine). This enables users to select the output medium that is most appropriate for the task at hand. HTML is useful for previewing the report in a browser without loading any external applications. PDF output gives the most accurate representation of the printed document, and is platform-independent. XSL-FO also meets the need for high-fidelity presentation of published reports. The end goal of such a project is to achieve pixel-perfect reproduction of the document on all of the available output media. This paper will discuss the extent to which we believe we achieved that goal, and the challenges that we have faced in doing so.

    Analysis of Cone-Beam CT using prior information

    Treatment plans for conformal radiotherapy are based on an initial CT scan. The aim is to deliver the prescribed dose to the tumour, while minimising exposure to nearby organs. Recent advances make it possible also to obtain a Cone-Beam CT (CBCT) scan once the patient has been positioned for treatment. A statistical model will be developed to compare these CBCT scans with the initial CT scan. Changes in the size, shape and position of the tumour and organs will be detected and quantified. Some progress has already been made in the segmentation of prostate CBCT scans [1], [2], [3]. However, none of the existing approaches have taken full advantage of the prior information that is available. The planning CT scan is expertly annotated with contours of the tumour and nearby sensitive objects. This data is specific to the individual patient and can be viewed as a snapshot of spatial information at a point in time. There is an abundance of studies in the radiotherapy literature that describe the amount of variation in the relevant organs between treatments. The findings from these studies can form a basis for estimating the degree of uncertainty. All of this information can be incorporated as an informative prior into a Bayesian statistical model. This model will be developed using scans of CT phantoms, which are objects with known geometry. Thus, the accuracy of the model can be evaluated objectively. This will also enable comparison between alternative models.

    Bayesian approaches to spatial inference: Modelling and computational challenges and solutions

    We discuss a range of Bayesian modelling approaches for spatial data and investigate some of the associated computational challenges. This paper commences with a brief review of Bayesian mixture models and Markov random fields, with enabling computational algorithms including Markov chain Monte Carlo (MCMC) and integrated nested Laplace approximation (INLA). Following this, we focus on the Potts model as a canonical approach, and discuss the challenge of estimating the inverse temperature parameter that controls the degree of spatial smoothing. We compare three approaches to addressing the doubly intractable nature of the likelihood, namely pseudo-likelihood, path sampling and the exchange algorithm. These techniques are applied to satellite data used to analyse water quality in the Great Barrier Reef.
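    The sketch below illustrates the simplest of the three approaches named above, pseudo-likelihood estimation of the Potts inverse temperature beta: each pixel's full conditional has a tractable normalising constant over the k labels, so the product of conditionals avoids the intractable partition function. The grid-search maximiser and free-boundary handling are illustrative choices, not the paper's implementation.

```python
# Minimal pseudo-likelihood sketch for the Potts smoothing parameter (illustrative only).
import numpy as np

def neighbour_counts(labels, k):
    """counts[i, j, c] = number of 4-neighbours of pixel (i, j) with label c."""
    counts = np.zeros(labels.shape + (k,))
    padded = np.pad(labels, 1, mode="constant", constant_values=-1)  # free boundary
    for di, dj in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        shifted = padded[1 + di:labels.shape[0] + 1 + di,
                         1 + dj:labels.shape[1] + 1 + dj]
        for c in range(k):
            counts[..., c] += (shifted == c)
    return counts

def log_pseudo_likelihood(beta, labels, k):
    """Sum of log full-conditional probabilities; labels take integer values 0..k-1."""
    counts = neighbour_counts(labels, k)
    own = np.take_along_axis(counts, labels[..., None], axis=-1)[..., 0]
    log_norm = np.log(np.exp(beta * counts).sum(axis=-1))
    return np.sum(beta * own - log_norm)

def estimate_beta(labels, k, grid=np.linspace(0.01, 2.0, 200)):
    """Maximum pseudo-likelihood estimate of beta by a simple grid search."""
    return grid[np.argmax([log_pseudo_likelihood(b, labels, k) for b in grid])]
```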

    Bayesian computational methods for spatial analysis of images

    This thesis introduces a new way of using prior information in a spatial model and develops scalable algorithms for fitting this model to large imaging datasets. These methods are employed for image-guided radiation therapy and satellite-based classification of land use and water quality. This study has utilized a pre-computation step to achieve a hundredfold improvement in the elapsed runtime for model fitting. This makes it much more feasible to apply these models to real-world problems, and enables full Bayesian inference for images with a million or more pixels.

    Accelerating pseudo-marginal MCMC using Gaussian processes

    The grouped independence Metropolis–Hastings (GIMH) and Markov chain within Metropolis (MCWM) algorithms are pseudo-marginal methods used to perform Bayesian inference in latent variable models. These methods replace intractable likelihood calculations with unbiased estimates within Markov chain Monte Carlo algorithms. The GIMH method has the posterior of interest as its limiting distribution, but suffers from poor mixing if it is too computationally intensive to obtain high-precision likelihood estimates. The MCWM algorithm has better mixing properties, but tends to give conservative approximations of the posterior and is still expensive. A new method is developed to accelerate the GIMH method by using a Gaussian process (GP) approximation to the log-likelihood, with the GP trained on a short pilot run of the MCWM algorithm. This new method, called GP-GIMH, is illustrated on simulated data from a stochastic volatility model and a gene network model. The new approach produces reasonable posterior approximations in these examples with at least an order of magnitude improvement in computing time. Code to implement the method for the gene network example can be found at http://www.runmycode.org/companion/view/266
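    The sketch below illustrates the acceleration strategy described above, with assumed names: noisy log-likelihood estimates recorded during a short pilot MCWM run are used to train a Gaussian process surrogate, which then replaces the expensive unbiased estimator inside a random-walk Metropolis-Hastings sampler. This is an illustration of the idea only, not the authors' GP-GIMH implementation.

```python
# Sketch of a GP surrogate for the log-likelihood inside Metropolis-Hastings (illustrative only).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def train_surrogate(pilot_thetas, pilot_loglik_estimates):
    """Fit a GP to (parameter, estimated log-likelihood) pairs from the pilot MCWM run."""
    X = np.asarray(pilot_thetas, dtype=float)
    X = X.reshape(len(X), -1)  # one row per pilot iteration
    kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=1.0)
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    gp.fit(X, np.asarray(pilot_loglik_estimates, dtype=float))
    return gp

def surrogate_mh(gp, log_prior, theta0, n_iter=20_000, proposal_sd=0.1):
    """Random-walk Metropolis-Hastings using the GP posterior mean as the log-likelihood."""
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    log_post = log_prior(theta) + gp.predict(theta.reshape(1, -1))[0]
    chain = np.empty((n_iter, theta.size))
    for i in range(n_iter):
        prop = theta + proposal_sd * np.random.randn(theta.size)
        log_post_prop = log_prior(prop) + gp.predict(prop.reshape(1, -1))[0]
        if np.log(np.random.rand()) < log_post_prop - log_post:
            theta, log_post = prop, log_post_prop
        chain[i] = theta
    return chain
```

    In practice the surrogate would need to be checked or refitted in regions the pilot run has not explored; the sketch omits such refinements.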