Ancient Admixture into Africa from the ancestors of non-Africans

Author: Cole C.
Lunter G.
Mathieson I.
Prüfer K.
Zhu S.
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 01/06/2020

Accurate reconstruction of insertion-deletion histories by statistical phylogenetics

Author: A Heger
A Löytynoja
A Siepel
AG Clark
AM Moses
Art F. Y. Poon
B Knudsen
B Paten
B Rannala
Benedict Paten
C Lee
C Strope
DG Higgins
EF Moore
FA Matsen
FR Kschischang
G Lunter
Gerton Lunter
I Holmes
I Miklós
Ian Holmes
J Felsenstein
JD Thompson
JL Thorne
JS Pedersen
K Katoh
K Liu
KM Wong
KS Pollard
L Gomez-Valero
L Zhu
M Larkin
M Mohri
MA Suchard
N de la Chaux
O Kamneva
O Westesson
Oscar Westesson
P Markova-Raina
R Mills
RA Cartwright
RC Edgar
RK Bradley
S Nelesen
S Saccone
S Sinha
T Beissbarth
X Qu
Z Wang
Z Yang
Z Zhang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2012

The Multiple Sequence Alignment (MSA) is a computational abstraction that represents a partial summary either of indel history, or of structural similarity. Taking the former view (indel history), it is possible to use formal automata theory to generalize the phylogenetic likelihood framework for finite substitution models (Dayhoff's probability matrices and Felsenstein's pruning algorithm) to arbitrary-length sequences. In this paper, we report results of a simulation-based benchmark of several methods for reconstruction of indel history. The methods tested include a relatively new algorithm for statistical marginalization of MSAs that sums over a stochastically-sampled ensemble of the most probable evolutionary histories. For mammalian evolutionary parameters on several different trees, the single most likely history sampled by our algorithm appears less biased than histories reconstructed by other MSA methods. The algorithm can also be used for alignment-free inference, where the MSA is explicitly summed out of the analysis. As an illustration of our method, we discuss reconstruction of the evolutionary histories of human protein-coding genes. Comment: 28 pages, 15 figures. arXiv admin note: text overlap with arXiv:1103.434

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Oxford University Research Archive

FigShare

Reliability of panel-based mutational signatures for immune-checkpoint-inhibition efficacy prediction in non-small cell lung cancer

Author: Cuppens K.
Donker H. C.
Froyen G.
Groen H. J.M.
Hiltermann T. J.N.
Lunter G. A.
Maes B.
Schuuring E.
van Es B.
Volders P. J.
Publication date: 01/08/2023

OBJECTIVES: Mutational signatures (MS) are gaining traction for deriving therapeutic insights for immune checkpoint inhibition (ICI). We asked if MS attributions from comprehensive targeted sequencing assays are reliable enough for predicting ICI efficacy in non-small cell lung cancer (NSCLC). METHODS: Somatic mutations of m = 126 patients were assayed using panel-based sequencing of 523 cancer-related genes. In silico simulations of MS attributions for various panels were performed on a separate dataset of m = 101 whole genome sequenced patients. Non-synonymous mutations were deconvoluted using COSMIC v3.3 signatures and used to test a previously published machine learning classifier. RESULTS: The ICI efficacy predictor performed poorly with an accuracy of 0.51 (-0.09/+0.09), average precision of 0.52 (-0.11/+0.11), and an area under the receiver operating characteristic curve of 0.50 (-0.09/+0.10). Theoretical arguments, experimental data, and in silico simulations pointed to false negative rates (FNR) related to panel size. A secondary effect was observed, where deconvolution of small ensembles of point mutations led to reconstruction errors and misattributions. CONCLUSION: MS attributions from current targeted panel sequencing are not reliable enough to predict ICI efficacy. We suggest that, for downstream classification tasks in NSCLC, signature attributions be based on whole exome or genome sequencing instead.

Proceedings - University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

The variant call format and VCFtools

Author: A. Auton
C. A. Albers
Durbin
E. Banks
G. Abecasis
G. Lunter
G. McVean
G. T. Marth
M. A. DePristo
P. Danecek
R. Durbin
R. E. Handsaker
S. T. Sherry
Publication venue: Oxford University Press
Publication date: 01/01/2011

Summary: The variant call format (VCF) is a generic format for storing DNA polymorphism data such as SNPs, insertions, deletions and structural variants, together with rich annotations. VCF is usually stored in a compressed manner and can be indexed for fast data retrieval of variants from a range of positions on the reference genome. The format was developed for the 1000 Genomes Project, and has also been adopted by other projects such as UK10K, dbSNP and the NHLBI Exome Project. VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging and comparing, and also provides a general Perl API.
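
As a rough illustration of the kind of record this summary describes, the Python sketch below splits one tab-separated VCF data line into its eight fixed columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). The names `VcfRecord` and `parse_vcf_line` are hypothetical, the sample line is made up for illustration, and nothing here is part of VCFtools or its Perl API. Header lines, genotype columns, and compressed or indexed files are deliberately ignored.

```python
from typing import Dict, List, NamedTuple, Optional, Union


class VcfRecord(NamedTuple):
    """One VCF data line, restricted to the eight fixed columns (hypothetical type)."""
    chrom: str
    pos: int
    variant_id: str
    ref: str
    alt: List[str]
    qual: Optional[float]
    filter_status: str
    info: Dict[str, Union[str, bool]]


def parse_vcf_line(line: str) -> VcfRecord:
    """Parse a single tab-separated VCF data line (not a '#' header line)."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, variant_id, ref, alt, qual, filter_status, info = fields[:8]

    # INFO is a semicolon-separated list of KEY=VALUE pairs or bare flags.
    info_dict: Dict[str, Union[str, bool]] = {}
    if info != ".":
        for item in info.split(";"):
            key, sep, value = item.partition("=")
            info_dict[key] = value if sep else True

    return VcfRecord(
        chrom=chrom,
        pos=int(pos),
        variant_id=variant_id,
        ref=ref,
        alt=alt.split(","),                       # ALT may list several alleles
        qual=None if qual == "." else float(qual),  # "." marks a missing value
        filter_status=filter_status,
        info=info_dict,
    )


# Made-up example line, for illustration only.
record = parse_vcf_line("20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;DB\n")
```

For real files, an established VCF parser is likely a better choice than hand-rolled splitting, since this sketch does no validation.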

Oxford University Research Archive

Resonances in a spring-pendulum: algorithms for equivariant singularity theory

Author: Arnol'd V I
Bridges T J
Broer H W
Bröcker Th
Callahan T K
Cowell R G
Cox D
Deprit A
Duistermaat J J
G A Lunter
G Vegter
Gatermann K
Golubitsky M
H W Broer
I Hoveijn
Kas A
Lichtenberg A J
Martinet J
Meyer K R
Montaldi J
Mora F
Poston T
Poènaru V
Ruijgrok M
Sanders J A
Sturmfels B
van der Meer J-C
Wassermann G
Publication date: 01/01/1998

A spring-pendulum in resonance is a time-independent Hamiltonian model system for formal reduction to one degree of freedom, where some symmetry (reversibility) is maintained. The reduction is handled by equivariant singularity theory with a distinguished parameter, yielding an integrable approximation of the Poincaré map. This makes a concise description of certain bifurcations possible. The computation of reparametrizations from normal form to the actual system is performed by Gröbner basis techniques.

Crossref

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Oxford University Research Archive

University of Groningen Digital Archive

Dissertations of the University of Groningen

Short and long-read genome sequencing methodologies for somatic variant detection; genomic analysis of a patient with diffuse large B-cell lymphoma

Author: Bowden Rory
Buck David
Dreau Helene
Freeman Colin
Knight Samantha J. L.
Lockstone Helen
Lonie Lorne
Lopopolo Maria
Lunter Gerton
Pagnamenta Alistair T.
Parkes Duncan
Roberts Hannah E.
Schuh Anna
Sharma Eshita
Taylor Jenny C.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 19/03/2021

Recent advances in throughput and accuracy mean that the Oxford Nanopore Technologies PromethION platform is now a viable solution for genome sequencing. Much of the validation of bioinformatic tools for this long-read data has focussed on calling germline variants (including structural variants). Somatic variants are outnumbered many-fold by germline variants and their detection is further complicated by the effects of tumour purity/subclonality. Here, we evaluate the extent to which Nanopore sequencing enables detection and analysis of somatic variation. We do this through sequencing tumour and germline genomes for a patient with diffuse B-cell lymphoma and comparing results with 150 bp short-read sequencing of the same samples. Calling germline single nucleotide variants (SNVs) from specific chromosomes of the long-read data achieved good specificity and sensitivity. However, results of somatic SNV calling highlight the need for the development of specialised joint calling algorithms. We find the comparative genome-wide performance of different tools varies significantly between structural variant types, and suggest long reads are especially advantageous for calling large somatic deletions and duplications. Finally, we highlight the utility of long reads for phasing clinically relevant variants, confirming that a somatic 1.6 Mb deletion and a p.(Arg249Met) mutation involving TP53 are oriented in trans.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Whole-genome sequencing of bladder cancers reveals somatic CDKN1A mutations and clinicopathological associations with mutation burden

Author: A Roth
AL Gartel
C Balbas-Martinez
C Yau
D Cappellen
D Sidransky
DA Solomon
G Gundem
G Guo
G Lunter
JB Cazier
L Lacombe
LB Alexandrov
ML Lu
MS Lawrence
P Lianes
PJ Goebell
S Denzinger
S Lise
T Abbas
Y Gui
Y Liu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Bladder cancers are a leading cause of death from malignancy. Molecular markers might predict disease progression and behaviour more accurately than the available prognostic factors. Here we use whole-genome sequencing to identify somatic mutations and chromosomal changes in 14 bladder cancers of different grades and stages. As well as detecting the known bladder cancer driver mutations, we report the identification of recurrent protein-inactivating mutations in CDKN1A and FAT1. The former are not mutually exclusive with TP53 mutations or MDM2 amplification, showing that CDKN1A dysfunction is not simply an alternative mechanism for p53 pathway inactivation. We find strong positive associations between higher tumour stage/grade and greater clonal diversity, the number of somatic mutations and the burden of copy number changes. In principle, the identification of sub-clones with greater diversity and/or mutation burden within early-stage or low-grade tumours could identify lesions with a high risk of invasive progression.

Crossref

University of Birmingham Research Portal

PubMed Central

Oxford University Research Archive

White Rose Research Online

University of Melbourne Institutional Repository

Multi-level evidence of an allelic hierarchy of USH2A variants in hearing, auditory processing and speech/language outcomes.

Author: 1000 Genomes Project C.
A Adato
A Boyd
A Gialluisi
A Rimmer
A Van Aarem
A Vouloumanos
AA Benasich
AL Barabasi
BJ Keats
BR Shrestha
C Kilkenny
C Witton
CA Anderson
CC Brewer
CF Norbury
CF Reisser
CS Lai
D Szklarczyk
DF Newbury
DR Moore
DV Bishop
E Eising
E Lenassi
G Conti-Ramsden
G Dehaene-Lambertz
G Lunter
GR Abecasis
J Golding
J Heckman
J Hornickel
JA Boughman
JC Barrett
JC Taylor
JM Ellingford
K Walter
K Wang
K Watanabe
L Huang
M Kircher
M Lek
M Luciano
M Van Segbroeck
MC Liberman
MEK Niemi
MG Filipe
MJ Henry
MR Bowl
N Pearsall
P Cingolani
P Danecek
P Le Quesne Stabej
R Mora
R Nudel
RH Fitch
RM Rosenfeld
S Colella
S Lee
S Purcell
S Richards
S Shultz
SW Threlkeld
WJ Kent
WJ Kimberling
X Liu
X Zhan
XS Chen
Y Zhou
Y-M Tien
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

Language development builds upon a complex network of interacting subservient systems. It therefore follows that variations in, and subclinical disruptions of, these systems may have secondary effects on emergent language. In this paper, we consider the relationship between genetic variants, hearing, auditory processing and language development. We employ whole genome sequencing in a discovery family to target association and gene x environment interaction analyses in two large population cohorts: the Avon Longitudinal Study of Parents and Children (ALSPAC) and UK10K. These investigations indicate that USH2A variants are associated with altered low-frequency sound perception which, in turn, increases the risk of developmental language disorder. We further show that Ush2a heterozygote mice have low-level hearing impairments, persistent higher-order acoustic processing deficits and altered vocalizations. These findings provide new insights into the complexity of genetic mechanisms serving language development and disorders and the relationships between developmental auditory and neural systems.

Crossref

Edinburgh Research Explorer

Oxford University Research Archive

St George's Online Research Archive

University of Melbourne Institutional Repository

Oxford Brookes University: RADAR

Explore Bristol Research

Alignment and Prediction of cis-Regulatory Modules Based on a Probabilistic Model of Evolution

Author: A Bais
A Halpern
A Lifanov
A Moses
A Siepel
B Berman
B Knudsen
C Bergman
C Dewey
D Halligan
D Karolchik
D Pollard
D Raijman
E Berezikov
E Birney
E Blackwood
E Davidson
E Dermitzakis
F Gao
G Lunter
G Stormo
G Wray
I Holmes
I Miklos
J Berg
J Stone
J Thorne
J Warner
K Wong
M Brudno
M Frith
M Hasegawa
M Ludwig
M Noyes
O Hallikas
P Andolfatto
P Keightley
P Kheradpour
P Ray
P Tomancak
R Cartwright
R Durrett
R Satija
R Siddharthan
R Waterston
S Aerts
S Doniger
S Gallo
S MacArthur
S Sinha
Saurabh Sinha
V Mustonen
W Huang
W Wasserman
W Wong
Wyeth W. Wasserman
X Li
Xin He
Xu Ling
Z Hu
Publication venue: Public Library of Science
Publication date: 01/03/2009

Cross-species comparison has emerged as a powerful paradigm for predicting cis-regulatory modules (CRMs) and understanding their evolution. The comparison requires reliable sequence alignment, which remains a challenging task for less conserved noncoding sequences. Furthermore, the existing models of DNA sequence evolution generally do not explicitly treat the special properties of CRM sequences. To address these limitations, we propose a model of CRM evolution that captures different modes of evolution of functional transcription factor binding sites (TFBSs) and the background sequences. A particularly novel aspect of our work is a probabilistic model of gains and losses of TFBSs, a process being recognized as an important part of regulatory sequence evolution. We present a computational framework that uses this model to solve the problems of CRM alignment and prediction. Our alignment method is similar to existing methods of statistical alignment but uses the conserved binding sites to improve alignment. Our CRM prediction method deals with the inherent uncertainties of binding site annotations and sequence alignment in a probabilistic framework. In simulated as well as real data, we demonstrate that our program is able to improve both alignment and prediction of CRM sequences over several state-of-the-art methods. Finally, we used alignments produced by our program to study binding site conservation in genome-wide binding data of key transcription factors in the Drosophila blastoderm, with two intriguing results: (i) the factor-bound sequences are under strong evolutionary constraints even if their neighboring genes are not expressed in the blastoderm and (ii) binding sites in distal bound sequences (relative to transcription start sites) tend to be more conserved than those in proximal regions. Our approach is implemented as software, EMMA (Evolutionary Model-based cis-regulatory Module Analysis), ready to be applied in a broad biological context.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Unlocking the bottleneck in forward genetics using whole-genome sequencing and identity by descent to isolate causative mutations

Author: Bertram Edward M
Beutler Bruce
Brown Steve D M
Bull Katherine R
Cornall Richard J
Crockford Tanya L
Enders Anselm
Goodnow Christopher C
Lunter Gerton
Mallon Ann-Marie
Miosge Lisa A
Potter Paul K
Rimmer Andrew J
Roots Carla M
Siggs Owen M
Simon Michelle M
Whittle Belinda
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 11/12/2015

Forward genetics screens with N-ethyl-N-nitrosourea (ENU) provide a powerful way to illuminate gene function and generate mouse models of human disease; however, the identification of causative mutations remains a limiting step. Current strategies depend on conventional mapping, so the propagation of affected mice requires non-lethal screens; accurate tracking of phenotypes through pedigrees is complex and uncertain; out-crossing can introduce unexpected modifiers; and Sanger sequencing of candidate genes is inefficient. Here we show how these problems can be efficiently overcome using whole-genome sequencing (WGS) to detect the ENU mutations and then identify regions that are identical by descent (IBD) in multiple affected mice. In this strategy, we use a modification of the Lander-Green algorithm to isolate causative recessive and dominant mutations, even at low coverage, on a pure strain background. Analysis of the IBD regions also allows us to calculate the ENU mutation rate (1.54 mutations per Mb) and to model future strategies for genetic screens in mice. The introduction of this approach will accelerate the discovery of causal variants, permit broader and more informative lethal screens to be used, reduce animal costs, and herald a new era for ENU mutagenesis. The High-Throughput Genomics Group at the Wellcome Trust Centre for Human Genetics is funded by Wellcome Trust grant reference 090532/Z/09/Z and MRC Hub grant G0900747 91070. This study was supported by Wellcome Trust Strategic Award 082030 (CCG), Wellcome Trust Studentship 094446/Z/10/Z (KRB), the Oxford NIHR Biomedical Research Centre, and the MRC Human Immunology Unit (RJC). AJR and GL were supported by Wellcome Trust grant 090532/Z/09/Z, CCG and AE by a Major Initiative Award from the Clive and Vera Ramaciotti Foundation, and AE by an NHMRC Career Development Award. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

The Australian National University