
    Effect of selection on ancestry: an exactly soluble case and its phenomenological generalization

    We consider a family of models describing the evolution under selection of a population whose dynamics can be related to the propagation of noisy traveling waves. For one particular model, which we shall call the exponential model, the properties of the traveling wave front can be calculated exactly, as can the statistics of the genealogy of the population. One striking result is that, for this particular model, the genealogical trees have the same statistics as the trees of replicas in the Parisi mean-field theory of spin glasses. We also find that in the exponential model the coalescence times along these trees grow like the logarithm of the population size. A phenomenological picture of the propagation of wave fronts that we introduced in a previous work, as well as our numerical data, suggests that these statistics remain valid for a larger class of models, while the coalescence times grow like the cube of the logarithm of the population size.
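
    As a rough illustration of the kind of model discussed above, the sketch below simulates a simplified branching-selection model (each individual produces a few offspring displaced by exponential jumps, and only the N rightmost survive) while recording parent indices, then measures the mean pairwise coalescence time of the final population. The offspring rule, parameter values, and population sizes are illustrative assumptions, not the paper's exact exponential model.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ancestry(N, k_offspring=3, generations=2000):
    """Toy branching-selection model: each of N individuals produces
    k_offspring offspring displaced by Exp(1) jumps; the N rightmost
    offspring survive.  Parent indices are recorded so that pairwise
    coalescence times can be measured on the final population."""
    x = np.zeros(N)                       # positions (log-fitness)
    ancestry = []                         # parent index of each survivor, per generation
    for _ in range(generations):
        parents = np.repeat(np.arange(N), k_offspring)
        offspring = x[parents] + rng.exponential(1.0, size=N * k_offspring)
        keep = np.argsort(offspring)[-N:]             # selection: keep the N rightmost
        x = offspring[keep]
        ancestry.append(parents[keep])
    return ancestry

def mean_pairwise_coalescence(ancestry, n_pairs=200):
    """Follow random pairs of lineages backwards in time until they merge."""
    N = len(ancestry[-1])
    rng_local = np.random.default_rng(1)
    times = []
    for _ in range(n_pairs):
        i, j = rng_local.choice(N, size=2, replace=False)
        t = 0
        for parents in reversed(ancestry):
            if i == j:
                break
            i, j = parents[i], parents[j]
            t += 1
        times.append(t)
    return np.mean(times)

for N in (50, 100, 200, 400):
    anc = simulate_ancestry(N)
    print(N, mean_pairwise_coalescence(anc))
```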

    The Time Machine: A Simulation Approach for Stochastic Trees

    In this paper we consider a simulation technique for stochastic trees. One of the most important tasks in computational genetics is the calculation, and subsequent maximization, of the likelihood function associated with such models, which is typically carried out using importance sampling (IS) and sequential Monte Carlo (SMC) techniques. The approach proceeds by simulating the tree backward in time, from the observed data to a most recent common ancestor (MRCA). In many cases, however, the computational time and the variance of the estimators are too high for standard approaches to be useful. In this paper we propose to stop the simulation before the MRCA is reached, which yields biased estimates of the likelihood surface. The bias is investigated from a theoretical point of view, and results from simulation studies are given to examine the balance between loss of accuracy, savings in computing time, and variance reduction.
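
    A minimal sketch of the general idea of stopping a backward-in-time simulation early (not the paper's estimator): here the "likelihood" is the probability of observing a given number of mutations on a Kingman coalescent tree, estimated by Monte Carlo either by simulating all the way to the MRCA or by stopping once a chosen number of lineages remain and replacing the untracked branch length with its expectation. The coalescent rates and Poisson mutation model are standard; the stopping rule and all parameter values are illustrative assumptions.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(0)

def likelihood_estimate(n, s_obs, theta, n_sims=5000, stop_at=1):
    """Monte Carlo estimate of P(S = s_obs), where S ~ Poisson(theta/2 * L)
    and L is the total branch length of a Kingman coalescent tree on n
    lineages.  If stop_at > 1, the simulation is stopped once stop_at
    lineages remain and the remaining branch length is replaced by its
    expectation, giving a cheaper but biased estimate."""
    estimates = np.empty(n_sims)
    for m in range(n_sims):
        length = 0.0
        for j in range(n, stop_at, -1):
            t_j = rng.exponential(2.0 / (j * (j - 1)))    # time while j lineages remain
            length += j * t_j
        # expected length contributed by the untracked part of the tree
        length += sum(j * 2.0 / (j * (j - 1)) for j in range(stop_at, 1, -1))
        estimates[m] = poisson.pmf(s_obs, theta / 2.0 * length)
    return estimates.mean(), estimates.std() / np.sqrt(n_sims)

# full simulation vs. stopping the backward simulation when 5 lineages remain
print(likelihood_estimate(n=20, s_obs=12, theta=2.0, stop_at=1))
print(likelihood_estimate(n=20, s_obs=12, theta=2.0, stop_at=5))
```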

    Evolution of the most recent common ancestor of a population with no selection

    We consider the evolution of a population of fixed size with no selection. The number of generations $G$ to reach the first common ancestor evolves in time. This evolution can be described by a simple Markov process which allows one to calculate several characteristics of the time dependence of $G$. We also study how $G$ is correlated to the genetic diversity.
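
    A small sketch of the quantity being tracked, under the usual neutral Wright-Fisher model with fixed population size (an assumption here; the paper's model may differ in detail): each generation every individual picks a parent uniformly at random, and G is the number of generations one must go back from the current generation before all lineages have merged into a single common ancestor. Re-evaluating G at successive time points shows how it evolves as the population advances.

```python
import numpy as np

rng = np.random.default_rng(0)

def generations_to_mrca(history):
    """history[t] gives the parent index of each individual at generation t.
    Returns the number of generations G back from the last generation at
    which all current lineages first share a single common ancestor."""
    lineages = np.arange(len(history[-1]))
    for g, parents in enumerate(reversed(history), start=1):
        lineages = np.unique(parents[lineages])
        if len(lineages) == 1:
            return g
    return None                       # MRCA older than the recorded history

N, T = 100, 2000
history = [rng.integers(0, N, size=N) for _ in range(T)]

# G re-evaluated at successive time points along the simulated history
for t in range(500, T + 1, 250):
    print(t, generations_to_mrca(history[:t]))
```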

    The cost of reducing starting RNA quantity for Illumina BeadArrays: a bead-level dilution experiment.

    BACKGROUND: The quantities of RNA demanded by microarray expression technologies place a limit on the questions they can address, and as a consequence the RNA requirements have been reduced over time as technologies have improved. In this paper we investigate the costs of reducing the starting quantity of RNA for the Illumina BeadArray platform. We do this via a dilution data set generated from two reference RNA sources that have become the standard for investigations into microarray and sequencing technologies. RESULTS: We find that the starting quantity of RNA has an effect on observed intensities despite the fact that the quantity of cRNA being hybridized remains constant. We see a loss of sensitivity when using lower quantities of RNA, but no great rise in the false positive rate. Even with 10 ng of starting RNA, the positive results are reliable, although many differentially expressed genes are missed. There is some scope for combining data from samples that have contributed differing quantities of RNA, but sample sizes should increase to compensate for the loss of signal-to-noise when using low quantities of starting RNA. CONCLUSIONS: The BeadArray platform maintains a low false discovery rate even when small amounts of starting RNA are used. In contrast, the sensitivity of the platform drops off noticeably over the same range. Thus, those conducting experiments should not opt for low quantities of starting RNA without considering the costs of doing so. The implications for experimental design, and for the integration of data from different starting quantities, are complex.

    Spike-in validation of an Illumina-specific variance-stabilizing transformation

    BACKGROUND: Variance-stabilizing techniques have been used for some time in the analysis of gene expression microarray data. A new adaptation, the variance-stabilizing transformation (VST), has recently been developed to take advantage of the unique features of Illumina BeadArrays. VST has been shown to perform well in comparison with the widely used approach of taking a log2 transformation, but has not been validated on a spike-in experiment. We apply VST to the data from a recently published spike-in experiment and compare it both to a regular log2 analysis and to a recently recommended analysis that can be applied when all raw data are available. FINDINGS: VST provides more power to detect differentially expressed genes than a log2 transformation. However, the gain in power is roughly the same as that obtained by using the raw data from an experiment and weighting observations accordingly. VST is still advantageous when large changes in expression are anticipated, while a weighted log2 approach performs better for smaller changes. CONCLUSION: VST can be recommended for summarized Illumina data regardless of which Illumina pre-processing options have been used. However, using the raw data is still encouraged whenever possible.
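
    To illustrate what a variance-stabilizing transformation does in this setting, the sketch below compares how the spread of simulated intensities depends on the mean under log2 versus a generic generalized-log (glog) transform. The glog is a stand-in, not the Illumina-specific VST evaluated in the paper, and the additive-plus-multiplicative noise model and its parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def glog(x, c):
    """Generalized-log transform: behaves like log2 for large x but stays
    finite and roughly linear near zero, which stabilizes the variance
    contributed by additive background noise."""
    return np.log2(x + np.sqrt(x ** 2 + c ** 2))

# additive + multiplicative error model often used for array intensities (assumed)
true_means = np.array([50, 200, 1000, 5000, 20000], dtype=float)
c_additive = 100.0        # scale of additive background noise
sd_mult = 0.1             # multiplicative (proportional) noise

for mu in true_means:
    x = mu * np.exp(rng.normal(0, sd_mult, 10000)) + rng.normal(0, c_additive, 10000)
    x = np.clip(x, 1e-3, None)              # avoid log of non-positive values
    print(f"mean {mu:>7.0f}   sd(log2) = {np.log2(x).std():.3f}"
          f"   sd(glog) = {glog(x, 2 * c_additive).std():.3f}")
```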

    Numbers of mutations to different types of colorectal cancer

    BACKGROUND: The numbers of oncogenic mutations required for transformation are uncertain but may be inferred from how cancer frequencies increase with aging: cancers requiring more mutations will tend to appear later in life. This type of approach may be confounded by biologic heterogeneity, because different cancer subtypes may require different numbers of mutations. For example, a sporadic cancer should require at least one more somatic mutation than its hereditary counterpart. METHODS: To better estimate the numbers of mutations before transformation, 1,022 colorectal cancers were classified with respect to microsatellite instability (MSI) and germline DNA mismatch repair mutations characteristic of hereditary nonpolyposis colorectal cancer (HNPCC). MSI- cancers were also classified with respect to clinical stage. Ages at cancer and a Bayesian algorithm were used to estimate the number of oncogenic mutations required for transformation for each cancer subtype. RESULTS: Ages at MSI+ cancers were consistent with five or six oncogenic mutations for hereditary (HNPCC) cancers, and seven or eight mutations for their sporadic counterparts. Ages at cancer were consistent with seven mutations for sporadic MSI- cancers, and were similar (six to eight mutations) regardless of clinical cancer stage. CONCLUSION: Different biologic subtypes of colorectal cancer appear to require different numbers of oncogenic mutations before transformation. Sporadic MSI+ cancers may require more than a single additional somatic alteration compared to hereditary MSI+ cancers, because the epigenetic inactivation of MLH1 commonly observed in sporadic MSI+ cancers may be a multistep process. Interestingly, the estimated numbers of MSI- cancer mutations were similar (six to eight) regardless of clinical cancer stage, suggesting that a propensity to spread or metastasize does not require additional mutations after transformation. Estimates of oncogenic mutation numbers may help explain some of the biology underlying different cancer subtypes.
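
    A toy version of the inference idea, not the paper's algorithm: if transformation requires n rate-limiting mutations, each arriving at an (assumed) constant rate, the age at cancer is roughly Gamma(n, rate)-distributed, and a posterior over n can be computed on a grid from observed ages. The rate, the flat prior, and the simulated ages below are illustrative assumptions.

```python
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(0)

# simulated ages at diagnosis for a subtype that (in this toy) needs 7 hits
true_n, rate = 7, 0.12                        # ~0.12 rate-limiting events per year (assumed)
ages = rng.gamma(shape=true_n, scale=1.0 / rate, size=300)

n_grid = np.arange(1, 16)                     # candidate numbers of mutations
log_post = np.array([gamma.logpdf(ages, a=n, scale=1.0 / rate).sum() for n in n_grid])
log_post -= log_post.max()                    # flat prior over n_grid
post = np.exp(log_post) / np.exp(log_post).sum()

for n, p in zip(n_grid, post):
    if p > 0.01:
        print(f"n = {n:2d}   posterior = {p:.3f}")
```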

    Noisy traveling waves: effect of selection on genealogies

    For a family of models of populations evolving under selection, which can be described by noisy traveling wave equations, the coalescence times along the genealogical tree scale like $\log^\alpha N$, where $N$ is the size of the population, in contrast with neutral models for which they scale like $N$. An argument relating this time scale to the diffusion constant of the noisy traveling wave leads to a prediction for $\alpha$ which agrees with our simulations. An exactly soluble case gives trees with statistics identical to those predicted for mean-field spin glasses in Parisi's theory.
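
    A sketch of how an exponent like $\alpha$ could be read off from simulation output: regress log of the coalescence time against log log N. The (N, T) pairs below are synthetic, generated from an assumed $(\log N)^3$ law purely to demonstrate the fitting step; real values would come from simulations such as the one sketched earlier for the exponential model, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic (N, T_coal) pairs drawn around T ~ (log N)^3, for illustration only
N_values = np.array([1e2, 1e3, 1e4, 1e5, 1e6])
T_values = 3.0 * np.log(N_values) ** 3 * np.exp(rng.normal(0, 0.05, N_values.size))

# fit T = A * (log N)^alpha  <=>  log T = log A + alpha * log(log N)
alpha, logA = np.polyfit(np.log(np.log(N_values)), np.log(T_values), 1)
print(f"fitted alpha = {alpha:.2f}")          # close to 3 for this synthetic input
```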

    Transcriptional dynamics elicited by a short pulse of notch activation involves feed-forward regulation by E(spl)/Hes genes.

    Dynamic activity of signaling pathways, such as Notch, is vital to achieve correct development and homeostasis. However, most studies assess output many hours or days after initiation of signaling, once the outcome has been consolidated. Here we analyze genome-wide changes in transcript levels, binding of the Notch pathway transcription factor CSL [Suppressor of Hairless, Su(H), in Drosophila], and RNA Polymerase II (Pol II) immediately following a short pulse of Notch stimulation. A total of 154 genes showed significant differential expression (DE) over time, and their expression profiles stratified into 14 clusters based on the timing, magnitude, and direction of DE. E(spl) genes were the most rapidly upregulated, with Su(H), Pol II, and transcript levels increasing within 5-10 minutes. Other genes had a more delayed response, the timing of which was largely unaffected by more prolonged Notch activation. Neither Su(H) binding nor poised Pol II could fully explain the differences between profiles. Instead, our data indicate that regulatory interactions, driven by the early-responding E(spl)bHLH genes, are required. Proposed cross-regulatory relationships were validated in vivo and in cell culture, supporting the view that feed-forward repression by E(spl)bHLH/Hes shapes the response of late-responding genes. Based on these data, we propose a model in which Hes genes are responsible for co-ordinating the Notch response of a wide spectrum of other targets, explaining the critical functions these key regulators play in many developmental and disease contexts.
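
    The stratification of expression profiles by timing, magnitude, and direction of DE can be illustrated with a generic k-means sketch on synthetic time-course data. The data, number of clusters, and time points below are assumptions for illustration; the paper's own clustering of the 154 DE genes may use different features and methods.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# synthetic log2 fold-change profiles over a short time course (minutes),
# standing in for the measured response of DE genes after a Notch pulse
timepoints = [0, 5, 10, 20, 30, 60]
early_up = np.outer(np.ones(40), [0, 2.0, 2.5, 2.0, 1.0, 0.5])
late_up  = np.outer(np.ones(40), [0, 0.2, 0.5, 1.5, 2.0, 2.0])
down     = np.outer(np.ones(40), [0, -0.2, -0.8, -1.5, -1.5, -1.0])
profiles = np.vstack([early_up, late_up, down]) + rng.normal(0, 0.3, (120, 6))

# cluster genes by the shape of their response over time
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(profiles)
for k in range(3):
    mean_profile = profiles[km.labels_ == k].mean(axis=0)
    print(f"cluster {k}: " + "  ".join(f"{v:+.1f}" for v in mean_profile))
```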

    Non-linear regression models for Approximate Bayesian Computation

    Approximate Bayesian inference on the basis of summary statistics is well suited to complex problems for which the likelihood is either mathematically or computationally intractable. However, methods based on rejection suffer from the curse of dimensionality as the number of summary statistics increases. Here we propose a machine-learning approach to the estimation of the posterior density that introduces two innovations: the new method fits a nonlinear conditional heteroscedastic regression of the parameter on the summary statistics, and then adaptively improves the estimate using importance sampling. The new algorithm is compared to state-of-the-art approximate Bayesian methods and achieves a considerable reduction of the computational burden in two examples of inference, in statistical genetics and in a queueing model.
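
    A compact sketch of regression-adjusted ABC in the spirit described above: rejection on summary statistics followed by a regression correction of the accepted parameters. For brevity the adjustment below is local-linear (Beaumont-style) rather than the nonlinear heteroscedastic regression with importance sampling proposed in the paper; the toy model (estimating a normal mean from its sample mean and standard deviation), the prior, and the acceptance numbers are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy problem: infer mu from data summarized by (sample mean, sample sd)
def summaries(data):
    return np.array([data.mean(), data.std()])

obs = rng.normal(2.0, 1.0, 50)
s_obs = summaries(obs)

# 1. rejection ABC: simulate from the prior, keep the parameters whose
#    summaries fall closest to the observed summaries
n_sims, keep = 20000, 500
mu_prior = rng.uniform(-10, 10, n_sims)
sim_stats = np.array([summaries(rng.normal(mu, 1.0, 50)) for mu in mu_prior])
dist = np.linalg.norm(sim_stats - s_obs, axis=1)
accepted = np.argsort(dist)[:keep]

# 2. regression adjustment: regress accepted mu on (summaries - s_obs), then
#    shift each accepted mu to its predicted value at the observed summaries
X = np.column_stack([np.ones(keep), sim_stats[accepted] - s_obs])
beta, *_ = np.linalg.lstsq(X, mu_prior[accepted], rcond=None)
mu_adjusted = mu_prior[accepted] - (sim_stats[accepted] - s_obs) @ beta[1:]

print("rejection only :", mu_prior[accepted].mean(), mu_prior[accepted].std())
print("adjusted       :", mu_adjusted.mean(), mu_adjusted.std())
```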