
    Haplotype-aware Diplotyping from Noisy Long Reads


    An Average-Case Sublinear Exact Li and Stephens Forward Algorithm

    Hidden Markov models of haplotype inheritance such as the Li and Stephens model allow for computationally tractable probability calculations using the forward algorithm as long as the representative reference panel used in the model is sufficiently small. Specifically, the monoploid Li and Stephens model and its variants are linear in reference panel size unless heuristic approximations are used. However, sequencing projects numbering in the thousands to hundreds of thousands of individuals are underway, and others numbering in the millions are anticipated. To make the Li and Stephens forward algorithm computationally tractable for these datasets, we have created a numerically exact version of the algorithm with an observed average-case O(nk^{0.35}) runtime in the number of genetic sites n and reference panel size k. This avoids any tradeoff between runtime and model complexity. We demonstrate that our approach also provides a succinct data structure for general-purpose haplotype data storage. We discuss generalizations of our algorithmic techniques to other hidden Markov models.
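
    For context, the baseline this work improves on is the standard forward recursion for the monoploid Li and Stephens copying model, which does O(k) work per site. The sketch below (Python) shows only that standard O(nk) recursion, not the sublinear algorithm the abstract describes; the parameter names rho and mu and the uniform-recombination transition are illustrative assumptions.

        # Standard O(nk) forward algorithm for a monoploid Li and Stephens-style
        # haplotype copying model. Baseline sketch only, not the paper's method.
        import numpy as np

        def li_stephens_forward(panel, query, rho=1e-3, mu=1e-4):
            """P(query | panel) under a simple haplotype-copying HMM.

            panel : (k, n) 0/1 array of k reference haplotypes over n sites
            query : length-n 0/1 array, the haplotype being evaluated
            rho   : per-site recombination probability (assumed constant)
            mu    : per-site mismatch (mutation) emission probability
            """
            k, n = panel.shape
            emit = lambda j: np.where(panel[:, j] == query[j], 1.0 - mu, mu)

            f = np.full(k, 1.0 / k) * emit(0)          # initialise at site 0
            for j in range(1, n):
                # Stay on the same panel haplotype with probability 1 - rho,
                # or recombine to a uniformly chosen haplotype with probability rho.
                f = ((1.0 - rho) * f + rho * f.sum() / k) * emit(j)
            return f.sum()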

    A Unifying Model of Genome Evolution Under Parsimony

    We present a data structure called a history graph that offers a practical basis for the analysis of genome evolution. It conceptually simplifies the study of parsimonious evolutionary histories by representing both substitutions and double cut and join (DCJ) rearrangements in the presence of duplications. The problem of constructing parsimonious history graphs thus subsumes related maximum parsimony problems in the fields of phylogenetic reconstruction and genome rearrangement. We show that tractable functions can be used to define upper and lower bounds on the minimum number of substitutions and DCJ rearrangements needed to explain any history graph. These bounds become tight for a special type of unambiguous history graph called an ancestral variation graph (AVG), whose combinatorial structure constrains the number of operations required. We finally demonstrate that, for a given history graph G, a finite set of AVGs describes all parsimonious interpretations of G, and that this set can be explored with a few sampling moves.
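
    To make the DCJ operation mentioned above concrete, here is a minimal Python sketch on the standard representation of a genome as a set of adjacencies between segment extremities (e.g. "2h" = head of segment 2, "2t" = its tail). This illustrates the rearrangement operation the model counts, not the history graph structure itself; the names are assumptions for illustration.

        # A double cut and join (DCJ) step: cut two adjacencies and rejoin
        # their four extremities in a new pairing.
        def dcj(adjacencies, adj1, adj2, new1, new2):
            """Return the genome with adj1 and adj2 replaced by new1 and new2.

            adjacencies : set of frozensets, each pairing two segment extremities
            """
            assert adj1 in adjacencies and adj2 in adjacencies
            assert new1 | new2 == adj1 | adj2   # rejoin exactly the same four extremities
            return (adjacencies - {adj1, adj2}) | {new1, new2}

        # Example: invert segment 2 by cutting its flanking adjacencies
        # and rejoining them crosswise.
        genome = {frozenset({"1h", "2t"}), frozenset({"2h", "3t"})}
        genome = dcj(genome,
                     frozenset({"1h", "2t"}), frozenset({"2h", "3t"}),
                     frozenset({"1h", "2h"}), frozenset({"2t", "3t"}))
        print(genome)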

    Haplotype-aware graph indexes

    The variation graph toolkit (VG) represents genetic variation as a graph. Each path in the graph is a potential haplotype, though most paths are unlikely recombinations of true haplotypes. We augment the VG model with haplotype information to identify which paths are more likely to be correct. For this purpose, we develop a scalable implementation of the graph extension of the positional Burrows-Wheeler transform. We demonstrate the scalability of the new implementation by indexing the 1000 Genomes Project haplotypes. We also develop an algorithm for simplifying variation graphs for k-mer indexing without losing any k-mers in the haplotypes.
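
    The core idea of augmenting a variation graph with haplotype information can be illustrated with a much simpler structure than the index developed here: store each panel haplotype as a node path and rank candidate paths by how many haplotypes support their edges. The Python sketch below is an assumption-laden toy, not the graph extension of the positional Burrows-Wheeler transform.

        # Rank variation-graph paths by haplotype support (toy sketch).
        from collections import defaultdict

        def edge_support(haplotype_paths):
            """Count, for each directed edge (u, v), how many haplotypes traverse it."""
            support = defaultdict(int)
            for path in haplotype_paths:           # each path is a list of node ids
                for u, v in zip(path, path[1:]):
                    support[(u, v)] += 1
            return support

        def path_support(path, support):
            """Minimum haplotype support over the edges of a candidate path."""
            return min(support.get((u, v), 0) for u, v in zip(path, path[1:]))

        # Two panel haplotypes through a small four-node graph.
        panel = [[1, 2, 4], [1, 3, 4]]
        s = edge_support(panel)
        print(path_support([1, 2, 4], s))   # 1: matches a panel haplotype
        print(path_support([1, 2, 3], s))   # 0: an unsupported recombination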

    Beyond Nanopore Sequencing in Space: Identifying the Unknown

    Astronaut Kate Rubins sequenced DNA on the International Space Station (ISS) for the first time in August 2016 (Figure 1A). A 2D sequencing library containing an equal mixture of lambda bacteriophage, Escherichia coli, and Mus musculus DNA was prepared on the ground with a SQK_MAP006 kit, sent to the ISS frozen, and loaded into R7.3 flow cells. After a total of 9 on-orbit sequencing runs over 6 months, it was determined that there was no decrease in sequencing performance on-orbit compared to ground controls (1). Of the ~280,000 reads generated on-orbit and ~130,000 generated on the ground, 90% were attributed to the input organisms in roughly equal proportions: 30% lambda bacteriophage, 30% Escherichia coli, and 30% M. musculus (Figure 1B). Extensive bioinformatics analysis determined comparable 2D and 1D read accuracies between flight and ground runs (Figure 1C), and data collected on the ISS supported directed assemblies covering 100% of the E. coli and lambda genomes and 96.7% of the M. musculus mitochondrial genome. These findings validate sequencing as a viable option for potential on-orbit applications such as environmental microbial monitoring and disease diagnosis. Current microbial monitoring of the ISS applies culture-based techniques that provide colony-forming unit (CFU) data for air, water, and surface samples. The identity of the cultured microorganisms is unknown until sample return and ground-based analysis, a process that can take up to 60 days. For sequencing to benefit ISS applications, spaceflight-compatible sample preparation techniques are required. Subsequent to the testing of the MinION on-orbit, a sample-to-sequence method was developed using miniPCR and basic pipetting, which was only recently proven to be effective in microgravity. The work presented here details the in-flight sample preparation process and the first application of DNA sequencing on the ISS to identify unknown ISS-derived microorganisms.

    MinION Analysis and Reference Consortium: Phase 1 data release and analysis

    The advent of a miniaturized DNA sequencing device with a high-throughput contextual sequencing capability embodies the next generation of large-scale sequencing tools. The MinION™ Access Programme (MAP) was initiated by Oxford Nanopore Technologies™ in April 2014, giving public access to their USB-attached miniature sequencing device. The MinION Analysis and Reference Consortium (MARC) was formed by a subset of MAP participants, with the aim of evaluating and providing standard protocols and reference data to the community. Envisaged as a multi-phased project, this study provides the global community with the Phase 1 data from MARC, in which the reproducibility of MinION performance was evaluated at multiple sites. Five laboratories on two continents generated data using a control strain of Escherichia coli K-12, preparing and sequencing samples according to a revised ONT protocol. Here, we provide the details of the protocol used, along with a preliminary analysis of the characteristics of typical runs, including the consistency, rate, volume and quality of data produced. Further analysis of the Phase 1 data presented here, and additional Phase 2 experiments on E. coli from MARC, are already underway to identify ways to improve and enhance MinION performance.

    Meta-Alignment with Crumble and Prune: Partitioning very large alignment problems for performance and parallelization

    <p>Abstract</p> <p>Background</p> <p>Continuing research into the global multiple sequence alignment problem has resulted in more sophisticated and principled alignment methods. Unfortunately these new algorithms often require large amounts of time and memory to run, making it nearly impossible to run these algorithms on large datasets. As a solution, we present two general methods, Crumble and Prune, for breaking a phylogenetic alignment problem into smaller, more tractable sub-problems. We call Crumble and Prune <it>meta-alignment </it>methods because they use existing alignment algorithms and can be used with many current alignment programs. Crumble breaks long alignment problems into shorter sub-problems. Prune divides the phylogenetic tree into a collection of smaller trees to reduce the number of sequences in each alignment problem. These methods are orthogonal: they can be applied together to provide better scaling in terms of sequence length and in sequence depth. Both methods partition the problem such that many of the sub-problems can be solved independently. The results are then combined to form a solution to the full alignment problem.</p> <p>Results</p> <p>Crumble and Prune each provide a significant performance improvement with little loss of accuracy. In some cases, a gain in accuracy was observed. Crumble and Prune were tested on real and simulated data. Furthermore, we have implemented a system called Job-tree that allows hierarchical sub-problems to be solved in parallel on a compute cluster, significantly shortening the run-time.</p> <p>Conclusions</p> <p>These methods enabled us to solve gigabase alignment problems. These methods could enable a new generation of biologically realistic alignment algorithms to be applied to real world, large scale alignment problems.</p

    Accurate reconstruction of insertion-deletion histories by statistical phylogenetics

    The Multiple Sequence Alignment (MSA) is a computational abstraction that represents a partial summary either of indel history or of structural similarity. Taking the former view (indel history), it is possible to use formal automata theory to generalize the phylogenetic likelihood framework for finite substitution models (Dayhoff's probability matrices and Felsenstein's pruning algorithm) to arbitrary-length sequences. In this paper, we report results of a simulation-based benchmark of several methods for reconstruction of indel history. The methods tested include a relatively new algorithm for statistical marginalization of MSAs that sums over a stochastically sampled ensemble of the most probable evolutionary histories. For mammalian evolutionary parameters on several different trees, the single most likely history sampled by our algorithm appears less biased than histories reconstructed by other MSA methods. The algorithm can also be used for alignment-free inference, where the MSA is explicitly summed out of the analysis. As an illustration of our method, we discuss reconstruction of the evolutionary histories of human protein-coding genes.
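
    The substitution-model building block referenced above, Felsenstein's pruning algorithm, is easy to state for a single site on a binary tree: each node's conditional likelihood vector is the elementwise product of its children's vectors propagated across their branches. The Python sketch below shows that finite-alphabet case only (the quantity the paper generalizes to arbitrary-length sequences); the tuple-based tree encoding and shared branch matrix are illustrative assumptions.

        # Felsenstein pruning for one site under a finite substitution model.
        import numpy as np

        ALPHABET = "ACGT"

        def conditional_likelihoods(node, P):
            """Return L[x] = P(observed leaves below node | state x at node).

            node : ("leaf", base) or ("internal", left_child, right_child)
            P    : 4x4 matrix, P[x, y] = probability of x -> y along a branch
                   (every branch shares the same matrix in this toy example).
            """
            if node[0] == "leaf":
                vec = np.zeros(4)
                vec[ALPHABET.index(node[1])] = 1.0
                return vec
            left = conditional_likelihoods(node[1], P)
            right = conditional_likelihoods(node[2], P)
            # Prune: sum over each child's states along its branch, then combine.
            return (P @ left) * (P @ right)

        # Toy tree ((A, C), G) with a Jukes-Cantor-like branch matrix.
        P = np.full((4, 4), 0.05) + np.eye(4) * 0.80
        tree = ("internal", ("internal", ("leaf", "A"), ("leaf", "C")), ("leaf", "G"))
        root = conditional_likelihoods(tree, P)
        print(root @ np.full(4, 0.25))   # site likelihood under a uniform root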