266 research outputs found

Haplotype-aware Diplotyping from Noisy Long Reads

Author: Ebler J.
Haukness M.
Marschall T.
Paten B.
Pesout T.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

Crossref

MPG.PuRe

A Unifying Model of Genome Evolution Under Parsimony

Author: A Bergeron
A Caprara
AE Darling
AW Xu
B Paten
B Raphael
Benedict Paten
C Chauve
D Bienstock
Daniel R Zerbino
David Haussler
E Tannier
G Bourque
Glenn Hickey
I Elias
J Edmonds
J Felsenstein
J Kim
J Ma
L Chindelevitch
LL Wang
M Alekseyev
M Bader
M Blanchette
M Shao
MD Braga
N El-Mabrouk
O Westesson
P Medvedev
S Hannenhalli
S Yancopoulos
W Day
W Miller
YS Song
Publication date: 12/05/2014

We present a data structure called a history graph that offers a practical basis for the analysis of genome evolution. It conceptually simplifies the study of parsimonious evolutionary histories by representing both substitutions and double cut and join (DCJ) rearrangements in the presence of duplications. The problem of constructing parsimonious history graphs thus subsumes related maximum parsimony problems in the fields of phylogenetic reconstruction and genome rearrangement. We show that tractable functions can be used to define upper and lower bounds on the minimum number of substitutions and DCJ rearrangements needed to explain any history graph. These bounds become tight for a special type of unambiguous history graph called an ancestral variation graph (AVG), which constrains in its combinatorial structure the number of operations required. We finally demonstrate that for a given history graph G, a finite set of AVGs describes all parsimonious interpretations of G, and this set can be explored with a few sampling moves. Comment: 52 pages, 24 figures

arXiv.org e-Print Archive

Crossref

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Lossless Pangenome Indexing Using Tag Arrays

Author: Eskandar P
Paten B
Sirén J
Publication venue: eScholarship, University of California
Publication date: 15/08/2025

Pangenome graphs represent genomic variation by encoding multiple haplotypes within a unified graph structure. However, efficient and lossless indexing of such structures remains challenging due to the scale and complexity of pangenomic data. We present a practical and scalable indexing framework based on tag arrays, which annotate positions in the Burrows–Wheeler transform (BWT) with graph coordinates. Our method extends the FM-index with a run-length compressed tag structure that enables efficient retrieval of all unique graph locations where a query pattern appears. We introduce a novel construction algorithm that combines unique k-mers, graph-based extensions, and haplotype traversal to compute the tag array in a memory-efficient manner. To support large genomes, we process each chromosome independently and then merge the results into a unified index using properties of the multi-string BWT and r-index. Our evaluation on the HPRC graphs demonstrates that the tag array structure compresses effectively, scales well with added haplotypes, and preserves accurate mapping information across diverse regions of the genome. This indexing method enables lossless and haplotype-aware querying in complex pangenomes and offers a practical indexing layer for developing scalable aligners and downstream graph-based analysis tools.
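
The idea of annotating BWT positions with graph coordinates can be illustrated with a toy sketch. It is not the authors' construction algorithm: it assumes a naive suffix sort instead of an FM-index, uses binary search instead of backward search, and treats a bare node id as the "graph coordinate". The function names, the node ids, and the two example haplotypes are all hypothetical.

```python
from bisect import bisect_left, bisect_right
from itertools import groupby


def build_tag_index(haplotypes):
    """Build a toy index from (sequence, tags) pairs.

    tags[i] is a hypothetical graph node id for sequence[i].
    """
    text, coords = "", []
    for seq, tags in haplotypes:
        if len(seq) != len(tags):
            raise ValueError("each base needs exactly one tag")
        text += seq + "$"              # "$" terminates each haplotype
        coords += list(tags) + [None]  # terminators carry no coordinate
    # Naive suffix sorting: fine for a toy, far too slow for real genomes.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT character = character preceding each sorted suffix
    # (index -1 wraps around to the final "$").
    bwt = "".join(text[i - 1] for i in sa)
    # Tag array: graph coordinate of the first base of each sorted suffix.
    tag_array = [coords[i] for i in sa]
    # Run-length encode the tag array as (tag, run_length) pairs.
    runs = [(tag, sum(1 for _ in group)) for tag, group in groupby(tag_array)]
    return {"text": text, "sa": sa, "bwt": bwt, "runs": runs}


def locate_tags(index, pattern):
    """Return the set of distinct tags at which `pattern` starts."""
    text, sa = index["text"], index["sa"]
    # Prefixes of sorted suffixes are themselves sorted, so binary search works.
    prefixes = [text[i:i + len(pattern)] for i in sa]
    lo, hi = bisect_left(prefixes, pattern), bisect_right(prefixes, pattern)
    found, start = set(), 0
    for tag, length in index["runs"]:
        end = start + length
        # Keep the tag only if this run overlaps the matching range [lo, hi).
        if max(start, lo) < min(end, hi) and tag is not None:
            found.add(tag)
        start = end
    return found


# Two haplotypes through a tiny graph: node 1 = "GA", node 2 = "TT",
# node 4 = "TC", node 3 = "ACA" (all ids are made up for illustration).
haps = [("GATTACA", [1, 1, 2, 2, 3, 3, 3]),
        ("GATCACA", [1, 1, 4, 4, 3, 3, 3])]
index = build_tag_index(haps)
print(locate_tags(index, "ACA"))
print(locate_tags(index, "T"))
```

Because suffixes that share a prefix sit next to each other in sorted order, their tags tend to form runs, which is what makes run-length compression of the tag array plausible; a query then only needs to scan the runs overlapping its match range.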

eScholarship - University of California

The landscape of Neandertal ancestry in present-day humans

Author: A Keinan
B Paten
C Sutton
C-I Wu
D Reich
DB Percival
DC Presgraves
FL Mendez
G Hellenthal
G McVicker
HA Orr
HR Kunsch
J Lachance
JAOHA Coyne
JD Wall
JM Good
K Prüfer
KD Pruitt
L Abi-Rached
LA Hindorff
M Ashburner
M Meyer
PK Tucker
RE Green
RH Byrd
RR Hudson
S Anders
S Gravel
S Myers
S Sankararaman
T Derrien
The 1000 Genomes Project Consortium
V Yotova
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Analyses of Neandertal genomes have revealed that Neandertals have contributed genetic variants to modern humans [1–2]. The antiquity of Neandertal gene flow into modern humans means that regions that derive from Neandertals in any one human today are usually less than a hundred kilobases in size. However, Neandertal haplotypes are also distinctive enough that several studies have been able to detect Neandertal ancestry at specific loci [1,3–8]. Here, we have systematically inferred Neandertal haplotypes in the genomes of 1,004 present-day humans [12]. Regions that harbor a high frequency of Neandertal alleles in modern humans are enriched for genes affecting keratin filaments, suggesting that Neandertal alleles may have helped modern humans adapt to non-African environments. Neandertal alleles also continue to shape human biology, as we identify multiple Neandertal-derived alleles that confer risk for disease. We also identify regions of millions of base pairs that are nearly devoid of Neandertal ancestry and enriched in genes, implying selection to remove genetic material derived from Neandertals. Neandertal ancestry is significantly reduced in genes specifically expressed in testis, and there is an approximately 5-fold reduction of Neandertal ancestry on chromosome X, which is known to harbor a disproportionate fraction of male hybrid sterility genes [20–22]. These results suggest that part of the reduction in Neandertal ancestry near genes is due to Neandertal alleles that reduced fertility in males when moved to a modern human genetic background.

Crossref

Harvard University - DASH

PubMed Central

eScholarship - University of California

MPG.PuRe

Genetic effects on gene expression across human tissues

Author: Abell Nathan S.
Addington Anjene
Addington Anjene M.
Aguet François
Akey Joshua M.
Ardlie Kristin G.
Balliu Brunilda
Barcus Mary E.
Barker Laura K.
Barshir Ruth
Basha Omer
Bates Daniel
Battle Alexis
Billy Li Jin
Bogu Gireesh K.
Branton Philip A.
Bridge Jason
Brigham Lori E.
Brown Andrew
Brown Andrew A.
Brown Christopher D.
Bustamante Carlos D.
Carithers Latarsha J.
Castel Stephane E.
Chan Joanne
Chen Lin S.
Chiang Colby
Claussnitzer Melina
Conrad Donald F.
Cox Nancy J.
Craft Brian
Cummings Beryl B.
Damani Farhan N.
Davis David A.
Davis Joe R.
Delaneau Olivier
Demanelis Kathryn
Dermitzakis Emmanouil T.
Diegel Morgan
Doherty Jennifer A.
Engelhardt Barbara E.
Eskin Eleazar
Feinberg Andrew P.
Fernando Marian S.
Ferreira Pedro G.
Flicek Paul
Foster Barbara A.
Frésard Laure
Gamazon Eric R.
Garrido-Martín Diego
Gelfand Ellen T.
Getz Gad
Gewirtz Ariel D.H.
Gillard Bryan M.
Gliner Genna
Gloudemans Michael J.
Goldman Mary
Gould Sarah E.
Guan Ping
Guigo Roderic
Guigó Roderic
Hadley Kane
Haeussler Maximilian
Hall Ira M.
Halow Jessica
Han Buhm
Handsaker Robert E.
Hansen Kasper D.
Hariharan Pushpa
Hasz Richard
Haugen Eric
He Amy Z.
He Yuan
Hickey Peter F.
Hormozdiari Farhad
Hou Lei
Howald Cedric
Huang Katherine H.
Hunter Marcus
Hunter Steven
Jasmine Farzana
Jewell Scott D.
Jian Ruiqi
Jiang Lihua
Jo Brian
Johns Christopher
Johnson Audra
Johnson Mark
Juettemann Thomas
Kang Eun Yong
Karasik Ellen
Karczewski Konrad J.
Kashin Seva
Kaul Rajinder
Kellis Manolis
Kent W. James
Kibriya Muhammad G.
Kim Yungil
Kim-Hellmuth Sarah
Koester Susan
Koester Susan E.
Kopen Gene
Kumar Rachna
Kyung Im Hae
Lappalainen Tuuli
Lee Christopher M.
Lee Kristen
Leinweber William F.
Lek Monkol
Li Gen
Li Qin
Li Xiao
Li Xin
Lin Jessica
Lin Shin
Linder Sandra
Linke Caroline
Little A. Roger
Liu Boxiang
Liu Yaping
Lockart Nicole C.
Lockhart Nicole C.
Lonsdale John T.
MacArthur Daniel G.
Mangul Serghei
Martin Casey
Mash Deborah C.
Matose Takunda
Maurano Matthew T.
McCarthy Mark I.
McDonald Alisa
McDowell Ian C.
McLean Jeffrey A.
Mestichelli Bernadette
Miklos Mark
Mohammadi Pejman
Molinie Benoit
Monlong Jean
Montgomery Stephen B.
Montroy Robert G.
Moore Helen M.
Mosavel Maghboeba
Moser Michael T.
Muñoz-Aguirre Manuel
Myer Kevin
Ndungu Anne W.
Nedzel Jared L.
Nelson Jemma
Neri Fidencio J.
Nguyen Duyen T.
Nguyen Duyen Y.
Nicolae Dan L.
Nierras Concepcion R.
Nobel Andrew B.
Noble Michael S.
Oliva Meritxell
Ongen Halit
Palowitch John J.
Panousis Nikolaos
Papasaikas Panagiotis
Park Yongjin
Park YoSon
Parsana Princy
Paten Benedict
Payne Anthony J.
Peterson Christine B.
Pierce Brandon L.
Qi Liqun
Quan Jie
Quon Gerald
Rao Abhi
Reverter Ferran
Rinaldi Nicola J.
Ripke Stephan
Rizzardi Lindsay F.
Robinson Karna L.
Roche Nancy V.
Roe Brian
Roe Bryan
Rohrer Daniel C.
Rosenbloom Kate R.
Ruffier Magali
Sabatti Chiara
Saha Ashis
Salvatore Michael
Sammeth Michael
Sandstrom Richard
Scott Alexandra J.
Segrè Ayellet V.
Shabalin Andrey A.
Shad Saboor
Sheppard Dan
Shimko Tyler C.
Siminoff Laura A.
Singh Shilpi
Skol Andrew
Smith Anna M.
Smith Kevin S.
Snyder Michael P.
Sobin Leslie
Sodaei Reza
Stamatoyannopoulos John
Stephens Matthew
Stranger Barbara E.
Strober Benjamin J.
Struewing Jeffery P.
Sul Jae Hoon
Sullivan Timothy J.
Tabor David E.
Tang Hua
Taylor Kieron
Teran Nicole A.
Thomas Jeffrey A.
Tomaszewski Maria M.
Traino Heather M.
Trevanion Stephen J.
Trowbridge Casandra A.
Tsang Emily K.
Tukiainen Taru
Um Ki Sung
Undale Anita H.
Urbut Sarah
Valentino Kimberly M.
Valley Dana
Valley Dana R.
van de Bunt Martijn
Van Wittenberghe Nicholas
Vatanian Negin
Vaught Jimmie B.
Vivian John
Volpi Simona
Walters Gary
Wang Gao
Wang Li
Wang Meng
Washington Michael
Wen Xiaoquan
Wheeler Joseph
Wright Fred A.
Wu Fan
Xi Hualin S.
Yeger-Lotem Esti
Yong Kang Eun
Zappala Zachary
Zaugg Judith B.
Zerbino Daniel R.
Zhang Hailei
Zhang Rui
Zhou Yi-Hui
Zhu Jingchun
Publication venue: Macmillan Publishers
Publication date: 01/01/2017

Characterization of the molecular function of the human genome and its variation across individuals is essential for identifying the cellular mechanisms that underlie human genetic traits and diseases. The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression levels across individuals and diverse tissues of the human body, many of which are not easily accessible. Here we describe genetic effects on gene expression levels across 44 human tissues. We find that local genetic variation affects gene expression levels for the majority of genes, and we further identify inter-chromosomal genetic effects for 93 genes and 112 loci. On the basis of the identified genetic effects, we characterize patterns of tissue specificity, compare local and distal effects, and evaluate the functional properties of the genetic effects. We also demonstrate that multi-tissue, multi-individual data can be used to identify genes and pathways affected by human disease-associated variation, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.

Princeton University Open Access Repository

UNIL IRIS | Institutional Research Information System

University of Miami: Scholarship@Miami

UPF Digital Repository

Archive ouverte UNIGE

DSpace@MIT

UPCommons. Portal del coneixement obert de la UPC

Oxford University Research Archive

Discovery Research Portal

UPCommons (Universitat Politècnica de Catalunya)

Large scale variation in the rate of germ-line de novo mutation, base composition, divergence and diversity in humans

Author: A Eyre-Walker
A Hodgkinson
A Kong
Adam Eyre-Walker
B Arbeithuber
B Paten
B Schuster-Bockler
C Seoighe
C TEP
DF Conrad
DL Bodian
E Kenigsberg
F Chiaromonte
F Pratto
F Supek
G Bernardi
G McVicker
GP Holmquist
H Jonsson
I Hellmann
J Filipski
J Meunier
JB Haldane
JC Dohm
JJ Cai
JJ Michaelson
K Harris
K Wolfe
KE Lohmueller
KH Wolfe
L Duret
LC Francioli
M Blanchette
MJ Lercher
MW Nachman
NV Terekhanova
P Moorjani
P Polak
Peter F. Arndt
R Burgess
RE Thurman
RS Hansen
S Besenbacher
S Glemin
S Katzman
S Tyekucheva
Shamil R. Sunyaev
Thomas C. A. Smith
TI Gossmann
TN Phung
V Aggarwala
VM Schaibley
WS Wong
Y Benjamini
YH Woo
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/03/2018

It has long been suspected that the rate of mutation varies across the human genome at a large scale based on the divergence between humans and other species. However, it is now possible to directly investigate this question using the large number of de novo mutations (DNMs) that have been discovered in humans through the sequencing of trios. We investigate a number of questions pertaining to the distribution of mutations using more than 130,000 DNMs from three large datasets. We demonstrate that the amount and pattern of variation differ between datasets at the 1MB and 100KB scales, probably as a consequence of differences in sequencing technology and processing. In particular, datasets show different patterns of correlation to genomic variables such as replication time. Nevertheless, there are many commonalities between datasets, which likely represent true patterns. We show that there is variation in the mutation rate at the 100KB, 1MB and 10MB scales that cannot be explained by variation at smaller scales; however, the level of this variation is modest at large scales: at the 1MB scale we infer that ~90% of regions have a mutation rate within 50% of the mean. Different types of mutation show similar levels of variation and appear to vary in concert, which suggests the pattern of mutation is relatively constant across the genome. We demonstrate that variation in the mutation rate does not generate large-scale variation in GC-content, and hence that mutation bias does not maintain the isochore structure of the human genome. We find that genomic features explain less than 40% of the explainable variance in the rate of DNM. As expected, the rate of divergence between species is correlated to the rate of DNM. However, the correlations are weaker than expected if all the variation in divergence was due to variation in the mutation rate. We provide evidence that this is due to the effect of biased gene conversion on the probability that a mutation will become fixed. In contrast to divergence, we find that most of the variation in diversity can be explained by variation in the mutation rate. Finally, we show that the correlation between divergence and DNM density declines as increasingly divergent species are considered.

Crossref

ZENODO

Directory of Open Access Journals

Dryad Digital Repository

Electronic Archiving System

Sussex Research Online

MPG.PuRe

The Francis Crick Institute

Meta-Alignment with Crumble and Prune: Partitioning very large alignment problems for performance and parallelization

Author: A Siepel
AS Schwartz
B Paten
B Rhead
Benedict Paten
C Lee
CN Dewey
David Haussler
DF Feng
G Myers
I Lumb
J Ma
JE Stajich
JS Pedersen
K Katoh
K Kryukov
K Liu
K Reinert
KM Roskin
Krishna M Roskin
M Blanchette
M Hasegawa
M Waterman
N Bray
P Di Tommaso
RC Edgar
RK Bradley
S Griffiths-Jones
S Schwartz
T Kim
U Tönges
W Gentzsch
WJ Kent
Z Yang
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Continuing research into the global multiple sequence alignment problem has resulted in more sophisticated and principled alignment methods. Unfortunately these new algorithms often require large amounts of time and memory to run, making it nearly impossible to run these algorithms on large datasets. As a solution, we present two general methods, Crumble and Prune, for breaking a phylogenetic alignment problem into smaller, more tractable sub-problems. We call Crumble and Prune meta-alignment methods because they use existing alignment algorithms and can be used with many current alignment programs. Crumble breaks long alignment problems into shorter sub-problems. Prune divides the phylogenetic tree into a collection of smaller trees to reduce the number of sequences in each alignment problem. These methods are orthogonal: they can be applied together to provide better scaling in terms of sequence length and in sequence depth. Both methods partition the problem such that many of the sub-problems can be solved independently. The results are then combined to form a solution to the full alignment problem. Results: Crumble and Prune each provide a significant performance improvement with little loss of accuracy. In some cases, a gain in accuracy was observed. Crumble and Prune were tested on real and simulated data. Furthermore, we have implemented a system called Job-tree that allows hierarchical sub-problems to be solved in parallel on a compute cluster, significantly shortening the run-time. Conclusions: These methods enabled us to solve gigabase alignment problems. These methods could enable a new generation of biologically realistic alignment algorithms to be applied to real world, large scale alignment problems.
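
The "break a long alignment into shorter, independent sub-problems, then combine" idea attributed to Crumble can be sketched for the simplest case of two sequences. This is an illustrative assumption-laden toy, not the published method: the abstract does not say how cut points are chosen, so here they are simply passed in by the caller, the inner aligner is a plain Needleman-Wunsch, and the function names and scoring values are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Plain global alignment of two strings; returns two gapped rows."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal path.
    row_a, row_b, i, j = [], [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                row_a.append(a[i - 1])
                row_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            row_a.append(a[i - 1])
            row_b.append("-")
            i -= 1
        else:
            row_a.append("-")
            row_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(row_a)), "".join(reversed(row_b))


def crumble_align(a, b, cuts):
    """Align a and b block by block.

    cuts is a list of (i, j) pairs (hypothetical input) meaning that
    a[:i] should be aligned against b[:j]; pairs must be non-decreasing.
    """
    rows_a, rows_b = [], []
    prev_i = prev_j = 0
    for i, j in list(cuts) + [(len(a), len(b))]:
        if i < prev_i or j < prev_j:
            raise ValueError("cut points must be non-decreasing")
        # Each block is independent, so these calls could run in parallel.
        block_a, block_b = needleman_wunsch(a[prev_i:i], b[prev_j:j])
        rows_a.append(block_a)
        rows_b.append(block_b)
        prev_i, prev_j = i, j
    return "".join(rows_a), "".join(rows_b)


seq_a = "ACGTACGTTTGACC"
seq_b = "ACGACGTTTGGACC"
aligned_a, aligned_b = crumble_align(seq_a, seq_b, cuts=[(4, 3), (9, 8)])
print(aligned_a)
print(aligned_b)
```

Each block costs time proportional to the product of its two lengths, so splitting into k roughly equal blocks cuts the dynamic-programming work by about a factor of k; the price is that a poorly placed cut can force a worse alignment than the unconstrained optimum.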

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Bioreactors as engineering support to treat cardiac muscle and vascular disease

Author: A. Hansen
A. Malek
A. Ratcliffe
A. Sen
B. Learoyd
B. Nasseri
C. Bouten
C. Fink
C. Holubarsch
C. Mun
C. Weinberg
C. Williams
D. Adams
D. Durand
D. Orlic
D. Schneck
D. Wendt
E. Figallo
E. Porrello
F. Consolo
F. Couet
F. Esposito
F. Lyons
G. Buckberg
G. Kensah
G. Konig
G. Matsumura
G. Vunjak-Novakovic
H. Jawad
H. Mayrovitz
H. Mertsching
H. Song
I. Martin
J. Chlupác
J. Dahlmann
J. Doyle
J. Karam
J. Krawiec
J. Leor
J. Paten
K. Bilodeau
K. Dumont
K. Eagle
K. Irani
K. Iwasaki
K. Kikuchi
K. Pasumarthi
L. Field
L. Freed
L. Hidalgo-Bastida
L. Mahoney
L. Mortati
L. Mulieri
L. Niklason
L. Ptaszek
L. Yap
M. Brown
M. Cleary
M. Geeslin
M. Gonen-Wadmany
M. Israelowitz
M. Klingensmith
M. Lovett
M. Oz
M. Papadaki
M. Punchard
M. Radisic
M. Shachar
M. Slaughter
M. Strüber
N. Bursac
N. Fortuin
N. L'Heureux
N. Plunkett
N. Tandon
O. Teebken
P. Akhyari
R. Akins
R. Archer
R. Birla
R. Carrier
R. Devereux
R. Egli
R. Gauvin
R. Hassink
R. Loverde
R. Maidhof
R. Nuccitelli
R. Ogawa
R. Olmer
R. Pörtner
R. Shadwick
S. Amensag
S. Hoerstrup
S. Rashid
S. Schaaf
S. Yazdani
T. Boudou
T. Brott
T. Dvir
T. Eschenhagen
T. Kuznetsova
T. Zhao
V. Barron
V. Mironov
V. Roger
W. Grayson
W. Sheridan
W. Smotherman
W. Zimmermann
X. Zhang
Y. Barash
Y. Narita
Publication venue: Multi-Science Publishing Co Ltd.
Publication date: 01/01/2013

Cardiovascular disease is the leading cause of morbidity and mortality in the Western world. The inability of fully differentiated, load-bearing cardiovascular tissues to regenerate in vivo and the limitations of current therapies strongly motivate efforts to make cardiovascular tissue engineering an effective clinical strategy for injured heart and vessels. For the effective production of organized and functional cardiovascular engineered constructs in vitro, a suitable dynamic environment is essential, and can be achieved and maintained within bioreactors. Bioreactors are technological devices that, while monitoring and controlling the culture environment and stimulating the construct, attempt to mimic the physiological milieu. In this study, a review of the current state of the art of bioreactor solutions for cardiovascular tissue engineering is presented, with emphasis on bioreactors and biophysical stimuli adopted for investigating the mechanisms influencing cardiovascular tissue development, and for eventually generating suitable cardiovascular tissue replacements.

Crossref

Directory of Open Access Journals

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Computational solutions for omics data

Author: A Butte
A Chatr-aryamontri
A Franceschini
A Joshi
A Lan
A Mortazavi
A Subramanian
A Tanay
AC Jungkamp
AJ Pinho
AK Wong
AR Whitney
B Langmead
B Paten
Bonnie Berger
BP Kelley
C Huttenhower
C Kingsford
C Trapnell
C Wang
CH Yeang
CJ Vaske
CS Liao
D Croft
D Earl
D Kim
D Park
DB Allison
DB Jaffe
DR Zerbino
E Banks
E Cerami
E Nabieva
E Segal
E Yeger-Lotem
EJ Rossin
ER Mardis
ES Lander
ET Wang
F Hach
F Markowetz
F Ozsolak
F Vandin
F Vezzi
GE Zinman
H Li
I Ulitsky
IA Adzhubei
J Butler
J Clarke
J Flannick
J Goecks
J Lamb
J Pandey
JC Marioni
JC Venter
Jian Peng
JT Dudley
JT Leek
JT Simpson
K Rhrissorrakrai
KI Goh
KY Yeung
L Parts
LD Stein
LH Hartwell
LM Heiser
LR Meyer
M Ascano
M Burrows
M Garber
M Gross
M Gstaiger
M Hafner
M Hsi-Yang Fritz
M Kircher
M Koyuturk
M Narayanan
M Reich
M Schatz
M Schmid
M Sirota
M Steffen
M Yandell
MB Gerstein
MC Brandon
MC Schatz
MG Grabherr
MH Maathuis
ML Metzker
Mona Singh
N Atias
N de Souza
N Tuncbag
NP Palmer
NT Ingolia
O Hirose
O Litvin
O Ogasawara
O Stegle
O Vanunu
P Ferragina
P Flicek
P Jiang
P Kumar
P Lu
P Shannon
PA Pevzner
PE Compeau
PG Doyle
PO Brown
PR Loh
PR Schmid
R Colak
R Gaujoux
R Li
R Singh
RC Gentleman
S Anders
S Batzoglou
S Christley
S Deorowicz
S Erten
S Kohler
S Levy
S Navlakha
S Ng
S Suthram
SA Chowdhury
SD Kahn
SF Altschul
SG Tringe
SL Salzberg
SS Huang
SS Shen-Orr
T Barrett
T Ideker
T Michoel
TS Furey
U Manber
UD Akavia
W Ali
W Li
W Tembe
WJ Kent
X Liu
X Wang
X Zhou
Y Prat
Y Wang
Y Zhang
YA Kim
Z Tu
Z Wang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/04/2013

High-throughput experimental technologies are generating increasingly massive and complex genomic data sets. The sheer enormity and heterogeneity of these data threaten to make the arising problems computationally infeasible. Fortunately, powerful algorithmic techniques lead to software that can answer important biomedical questions in practice. In this Review, we sample the algorithmic landscape, focusing on state-of-the-art techniques, the understanding of which will aid the bench biologist in analysing omics data. We spotlight specific examples that have facilitated and enriched analyses of sequence, transcriptomic and network data sets. National Institutes of Health (U.S.) (Grant GM081871)

Princeton University Open Access Repository

DSpace@MIT

Crossref

PubMed Central

webPRANK: a phylogeny-aware multiple sequence aligner with interactive alignment browser

Author: A Löytynoja
Ari Löytynoja
B Paten
C Dessimoz
C Kosiol
D Maddison
H McWilliam
J Felsenstein
K Wong
M Hasegawa
Nick Goldman
R Development Core Team
S Whelan
W Fletcher
W Pearson
Z Yang
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Phylogeny-aware progressive alignment has been found to perform well in phylogenetic alignment benchmarks and to produce superior alignments for the inference of selection on codon sequences. Its implementation in the PRANK alignment program package also allows modelling of complex evolutionary processes and inference of posterior probabilities for sequence sites evolving under each distinct scenario, either simultaneously with the alignment of sequences or as a post-processing step for an existing alignment. This has led to software with many advanced features, and users may find it difficult to generate optimal alignments, visualise the full information in their alignment results, or post-process these results, e.g. by objectively selecting subsets of alignment sites. Results: We have created a web server called webPRANK that provides an easy-to-use interface to the PRANK phylogeny-aware alignment algorithm. The webPRANK server supports the alignment of DNA, protein and codon sequences as well as protein-translated alignment of cDNAs, and includes built-in structure models for the alignment of genomic sequences. The resulting alignments can be exported in various formats widely used in evolutionary sequence analyses. The webPRANK server also includes a powerful web-based alignment browser for the visualisation and post-processing of the results in the context of a cladogram relating the sequences, allowing (e.g.) removal of alignment columns with low posterior reliability. In addition to de novo alignments, webPRANK can be used for the inference of ancestral sequences with phylogenetically realistic gap patterns, and for the annotation and post-processing of existing alignments. The webPRANK server is freely available on the web at http://tinyurl.com/webprank. Conclusions: The webPRANK server incorporates phylogeny-aware multiple sequence alignment, visualisation and post-processing in an easy-to-use web interface. It widens the user base of phylogeny-aware multiple sequence alignment and allows the performance of all alignment-related activity for small sequence analysis projects using only a standard web browser.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central