8,772 research outputs found
Approximate Bayesian Computation by Modelling Summary Statistics in a Quasi-likelihood Framework
Approximate Bayesian Computation (ABC) is a useful class of methods for
Bayesian inference when the likelihood function is computationally intractable.
In practice, the basic ABC algorithm may be inefficient in the presence of
discrepancy between prior and posterior. Therefore, more elaborate methods,
such as ABC with the Markov chain Monte Carlo algorithm (ABC-MCMC), should be
used. However, the elaboration of a proposal density for MCMC is a sensitive
issue and very difficult in the ABC setting, where the likelihood is
intractable. We discuss an automatic proposal distribution useful for ABC-MCMC
algorithms. This proposal is inspired by the theory of quasi-likelihood (QL)
functions and is obtained by modelling the distribution of the summary
statistics as a function of the parameters. Essentially, given a real-valued
vector of summary statistics, we reparametrize the model by means of a
regression function of the statistics on parameters, obtained by sampling from
the original model in a pilot-run simulation study. The QL theory is well
established for a scalar parameter, and it is shown that when the conditional
variance of the summary statistic is assumed constant, the QL has a closed-form
normal density. This idea of constructing proposal distributions is extended to
non-constant variance and to real-valued parameter vectors. The method is
illustrated by several examples and by an application to a real problem in
population genetics.
Comment: Published at http://dx.doi.org/10.1214/14-BA921 in the Bayesian
Analysis (http://projecteuclid.org/euclid.ba) by the International Society of
Bayesian Analysis (http://bayesian.org/)
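The basic ABC algorithm mentioned in this abstract can be sketched as a rejection sampler. The following is a minimal illustration only, not the quasi-likelihood proposal of the paper: the toy normal-mean model, the uniform prior, the sample mean as summary statistic, the tolerance and the function name `abc_rejection` are all assumptions made for the example.

```python
import random
import statistics


def abc_rejection(observed, prior_sampler, simulator, summary, tolerance, n_draws, seed=0):
    """Basic ABC rejection: keep prior draws whose simulated summary is close to the observed one."""
    rng = random.Random(seed)
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)               # draw a candidate from the prior
        s_sim = summary(simulator(theta, rng))   # simulate data and reduce to a summary
        if abs(s_sim - s_obs) <= tolerance:      # accept if the summaries are close
            accepted.append(theta)
    return accepted


# Toy problem (assumed for illustration): unknown mean of a unit-variance normal.
data_rng = random.Random(1)
observed = [data_rng.gauss(2.0, 1.0) for _ in range(50)]

posterior_draws = abc_rejection(
    observed,
    prior_sampler=lambda r: r.uniform(-10.0, 10.0),   # deliberately wide prior
    simulator=lambda theta, r: [r.gauss(theta, 1.0) for _ in range(50)],
    summary=statistics.fmean,
    tolerance=0.2,
    n_draws=5000,
)
acceptance_rate = len(posterior_draws) / 5000
```

With a prior this much wider than the posterior, only a small fraction of draws is accepted, which is the inefficiency that motivates ABC-MCMC and a well-chosen proposal.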
A Likelihood-Free Inference Framework for Population Genetic Data using Exchangeable Neural Networks
An explosion of high-throughput DNA sequencing in the past decade has led to
a surge of interest in population-scale inference with whole-genome data.
Recent work in population genetics has centered on designing inference methods
for relatively simple model classes, and few scalable general-purpose inference
techniques exist for more realistic, complex models. To achieve this, two
inferential challenges need to be addressed: (1) population data are
exchangeable, calling for methods that efficiently exploit the symmetries of
the data, and (2) computing likelihoods is intractable as it requires
integrating over a set of correlated, extremely high-dimensional latent
variables. These challenges are traditionally tackled by likelihood-free
methods that use scientific simulators to generate datasets and reduce them to
hand-designed, permutation-invariant summary statistics, often leading to
inaccurate inference. In this work, we develop an exchangeable neural network
that performs summary statistic-free, likelihood-free inference. Our framework
can be applied in a black-box fashion across a variety of simulation-based
tasks, both within and outside biology. We demonstrate the power of our
approach on the recombination hotspot testing problem, outperforming the
state-of-the-art.
Comment: 9 pages, 8 figures
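The exchangeability requirement in this abstract means the output must not change when the individuals in a sample are reordered. A generic way to get that property, sketched below, is to apply the same function to every row and then pool with a symmetric operation. This is an assumed toy construction with random, untrained weights and hypothetical names (`make_exchangeable_net`, `phi`), not the architecture of the paper.

```python
import math
import random


def make_exchangeable_net(n_features, n_hidden, seed=0):
    """Build a tiny permutation-invariant function: shared per-row map, mean pooling, linear readout."""
    rng = random.Random(seed)
    w_phi = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_hidden)]
    w_rho = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]

    def phi(row):
        # The same weights are applied to every individual (row).
        return [math.tanh(sum(w * x for w, x in zip(ws, row))) for ws in w_phi]

    def net(rows):
        embedded = [phi(row) for row in rows]
        # Mean over individuals is symmetric, so row order cannot matter.
        pooled = [math.fsum(col) / len(rows) for col in zip(*embedded)]
        return math.fsum(w * p for w, p in zip(w_rho, pooled))

    return net


net = make_exchangeable_net(n_features=3, n_hidden=4)
sample = [[0, 1, 1], [1, 0, 0], [1, 1, 0]]   # rows are individuals, columns are sites
shuffled = [sample[2], sample[0], sample[1]]
same_output = math.isclose(net(sample), net(shuffled))
```

Because the pooling step is symmetric, no hand-designed permutation-invariant summary statistic is needed to make the function respect the symmetry of the data.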
Chaste: a test-driven approach to software development for biological modelling
Chaste (‘Cancer, heart and soft-tissue environment’) is a software library and a set of test suites for computational simulations in the domain of biology. Current functionality has arisen from modelling in the fields of cancer, cardiac physiology and soft-tissue mechanics. It is released under the LGPL 2.1 licence.
Chaste has been developed using agile programming methods. The project began in 2005 when it was reasoned that the modelling of a variety of physiological phenomena required both a generic mathematical modelling framework, and a generic computational/simulation framework. The Chaste project evolved from the Integrative Biology (IB) e-Science Project, an inter-institutional project aimed at developing a suitable IT infrastructure to support physiome-level computational modelling, with a primary focus on cardiac and cancer modelling.
Biomedical information extraction for matching patients to clinical trials
Digital medical information has grown astonishingly over the last decades, driven
by an unprecedented number of medical writers, which has led to a complete revolution in
what and how much information is available to health professionals.
The problem with this wave of information is that precisely selecting
the information retrieved from medical information repositories is very exhausting and
time-consuming for physicians. This is one of the biggest challenges for physicians in the
new digital era: how to reduce the time spent finding the best-matching document for a
patient (e.g. intervention articles, clinical trials, prescriptions).
Precision Medicine (PM) 2017 is a track of the Text REtrieval Conference (TREC)
that focuses on this type of challenge, exclusively for oncology. Using a dataset with a
large number of clinical trials, this track is a good real-life example of how information
retrieval solutions can be used to solve these types of problems. This track can be a very
good starting point for applying information extraction and retrieval methods in a very
complex domain.
The purpose of this thesis is to improve a system designed by the NovaSearch team
for the TREC PM 2017 Clinical Trials task, which ranked among the top five systems of 2017.
The NovaSearch team also participated in the 2018 track and achieved a 15% increase in
precision compared with 2017. Multiple IR techniques were used for information
extraction and data processing, including rank fusion, query expansion (e.g. pseudo-relevance
feedback, MeSH term expansion) and experiments with Learning to Rank
(LETOR) algorithms. Our goal is to retrieve the best possible set of trials for a given
patient, using precise document filters to exclude the unwanted clinical trials. This work
can open doors in searching and interpreting the criteria to exclude or
include trials, helping physicians even with the more complex and difficult information
retrieval tasks.
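Rank fusion, one of the techniques listed in this abstract, combines several ranked lists into one. The abstract does not say which fusion method was used, so the sketch below assumes reciprocal rank fusion, a common choice; the run names, trial identifiers and the constant `k=60` are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each document scores the sum of 1 / (k + rank) over the lists it appears in."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by identifier for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Two hypothetical retrieval runs for the same patient query.
run_a = ["trial_A", "trial_B", "trial_C"]
run_b = ["trial_B", "trial_C", "trial_A"]

fused = reciprocal_rank_fusion([run_a, run_b])
# fused == ["trial_B", "trial_A", "trial_C"]
```

A trial that appears near the top of several runs, such as `trial_B` here, outranks one that is first in a single run but low in the others.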