41 research outputs found

Safe and complete contig assembly via omnitigs

Author: A Bankevich
A Guénoche
AR Rubinov
AS Motahari
C Kingsford
D Haussler
DR Zerbino
E Kapun
E Kapun
ES Lander
G Bresler
G Narzisi
I Lysov
JD Kececioglu
JR Miller
JT Simpson
JT Simpson
K Lam
K Sahlin
L Salmela
M Boetzer
M Boetzer
N Nagarajan
N Nagarajan
N Vyahhi
P Medvedev
P Medvedev
P Medvedev
PA Pevzner
PA Pevzner
R Chikhi
R Chikhi
R Luo
R Uricaru
RM Idury
SL Salzberg
Publication date: 16/08/2016

Contig assembly is the first stage that most assemblers solve when reconstructing a genome from a set of reads. Its output consists of contigs -- a set of strings that are promised to appear in any genome that could have generated the reads. From the introduction of contigs 20 years ago, assemblers have tried to obtain longer and longer contigs, but the following question was never solved: given a genome graph G (e.g. a de Bruijn, or a string graph), what are all the strings that can be safely reported from G as contigs? In this paper we finally answer this question, and also give a polynomial time algorithm to find them. Our experiments show that these strings, which we call omnitigs, are 66% to 82% longer on average than the popular unitigs, and 29% of dbSNP locations have more neighbors in omnitigs than in unitigs.

Comment: Full version of the paper in the proceedings of RECOMB 201
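The abstract above measures omnitigs against unitigs, so a small illustration of the baseline may help. The following is a minimal sketch of unitig extraction, not the paper's omnitig algorithm. It assumes a node-centric de Bruijn graph built from forward-strand k-mers only (no reverse complements, no error handling), and the function names `kmers` and `unitigs` are hypothetical, chosen for this example.

```python
def kmers(reads, k):
    """Collect the set of k-mers occurring in the reads (forward strand only)."""
    return {r[i:i + k] for r in reads for i in range(len(r) - k + 1)}


def unitigs(reads, k):
    """Return the maximal non-branching paths of the k-mer graph, spelled as strings.

    Simplification (assumption of this sketch): isolated cycles, in which no
    node qualifies as a path start, are not reported.
    """
    nodes = kmers(reads, k)
    # Two k-mers are adjacent when they overlap by k-1 characters.
    succ = {n: [n[1:] + c for c in "ACGT" if n[1:] + c in nodes] for n in nodes}
    pred = {n: [c + n[:-1] for c in "ACGT" if c + n[:-1] in nodes] for n in nodes}

    def starts_unitig(n):
        # n continues an existing path only if it has exactly one predecessor
        # and that predecessor has exactly one successor.
        return not (len(pred[n]) == 1 and len(succ[pred[n][0]]) == 1)

    result, seen = [], set()
    for n in sorted(nodes):
        if not starts_unitig(n):
            continue
        path, cur = n, n
        seen.add(n)
        # Extend while the step is unambiguous in both directions.
        while len(succ[cur]) == 1:
            nxt = succ[cur][0]
            if len(pred[nxt]) != 1 or nxt in seen:
                break
            path += nxt[-1]
            seen.add(nxt)
            cur = nxt
        result.append(path)
    return sorted(result)


if __name__ == "__main__":
    # Two reads that share a prefix and then diverge, so the graph branches.
    print(unitigs(["ACGGT", "ACGTT"], 3))
```

A unitig stops at every branching node, which is why it is a conservative answer to the "safe strings" question; the abstract's claim is that omnitigs extend past some of those stops while remaining safe.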

arXiv.org e-Print Archive

Crossref

Space-efficient and exact de Bruijn graph representation based on a Bloom filter

Author: A Bowe
A Kirsch
B Chazelle
C Kingsford
C Ye
G Marçais
G Rizk
G Rizk
G Sacomoto
Guillaume Rizk
J Pell
JR Miller
JT Simpson
MG Grabherr
P Peterlongo
P Peterlongo
R Chikhi
R Li
Rayan Chikhi
RL Warren
RM Idury
SL Salzberg
TC Conway
Y Peng
Z Iqbal
Publication venue: Springer Science and Business Media LLC

Crossref

Meraculous: De Novo Genome Assembly with Short Paired-End Reads

Author: A Edwards
A Edwards
B Ewing
D Hernandez
DA Wheeler
Daniel S. Rokhsar
DR Bentley
DR Bentley
DR Smith
DR Zerbino
DR Zerbino
ES Lander
EW Myers
EW Myers
EW Myers
Gary P. Schroth
GG Sutton
I Maccallum
Isaac Ho
J Butler
Jarrod A. Chapman
JC Roach
JL Weber
JT Simpson
K Hayashi
M Chaisson
M Margulies
M Pop
M Pop
MJ Chaisson
MJ Chaisson
ML Metzker
P Flicek
PA Pevzner
R Li
R Li
RL Warren
RM Idury
SC Schuster
SF Altschul
Shujun Luo
Sirisha Sunkara
Steven L. Salzberg
TW Jeffries
TW Jeffries
Publication venue: Public Library of Science
Publication date: 01/08/2011

We describe a new algorithm, meraculous, for whole genome assembly of deep paired-end short reads, and apply it to the assembly of a dataset of paired 75-bp Illumina reads derived from the 15.4 megabase genome of the haploid yeast Pichia stipitis. More than 95% of the genome is recovered, with no errors; half the assembled sequence is in contigs longer than 101 kilobases and in scaffolds longer than 269 kilobases. Incorporating fosmid ends recovers entire chromosomes. Meraculous relies on an efficient and conservative traversal of the subgraph of the k-mer (de Bruijn) graph of oligonucleotides with unique high quality extensions in the dataset, avoiding an explicit error correction step as used in other short-read assemblers. A novel memory-efficient hashing scheme is introduced. The resulting contigs are ordered and oriented using paired reads separated by ∼280 bp or ∼3.2 kbp, and many gaps between contigs can be closed using paired-end placements. Practical issues with the dataset are described, and prospects for assembling larger genomes are discussed.
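The idea of traversing only k-mers with "unique high quality extensions" can be sketched roughly as below. This is an illustration under stated assumptions, not Meraculous's actual implementation: it extends to the right only, ignores reverse complements, and the thresholds `min_qual` and `min_count`, like the function names, are hypothetical.

```python
from collections import defaultdict


def unique_extensions(reads, quals, k, min_qual=20, min_count=2):
    """Map each k-mer to its next base when exactly one base is well supported.

    reads: list of DNA strings; quals: list of per-base quality lists.
    min_qual and min_count are illustrative thresholds (assumptions).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for seq, q in zip(reads, quals):
        for i in range(len(seq) - k):
            if q[i + k] >= min_qual:  # count only confident base calls
                counts[seq[i:i + k]][seq[i + k]] += 1
    ext = {}
    for kmer, bases in counts.items():
        supported = [b for b, c in bases.items() if c >= min_count]
        if len(supported) == 1:  # ambiguous or unsupported k-mers are dropped
            ext[kmer] = supported[0]
    return ext


def extend_right(seed, ext):
    """Walk from a seed k-mer while the current k-mer has a unique extension."""
    contig, cur, seen = seed, seed, {seed}
    while cur in ext:
        nxt = cur[1:] + ext[cur]
        if nxt in seen:  # stop instead of looping around a repeat
            break
        contig += ext[cur]
        seen.add(nxt)
        cur = nxt
    return contig


if __name__ == "__main__":
    reads = ["ACGTA", "ACGTA", "ACGTC"]
    quals = [[30] * 5 for _ in reads]
    ext = unique_extensions(reads, quals, k=3)
    print(extend_right("ACG", ext))
```

In this toy input the single read ending in C falls below `min_count`, so it is ignored rather than corrected, which mirrors the abstract's point about avoiding an explicit error correction step.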

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

UNT Digital Library

Genetic diversity analysis of common beans based on molecular markers

Author: Beebe S
Belkhir K
Blair MW
Blair MW
Buntjer JB
Burle ML
Cattan-Toupance TL
Chacón SMI
Chacón SMI
Charcosset A
Dellaporta SL
Duarte JM
Díaz LM
Excoffier L
Excoffier LGL
Gaitán-Solís SE
Gepts P
Gepts P
Gepts P
Gepts P
Gepts P
Gómez O
Homar R. Gill-Langarica
Idury RM
José S. Muruaga-Martínez
Kwak M
Logozzo G
M.L. Patricia Vargas-Vázquez
Nei M
Netzahualcoyotl Mayek-Pérez
Newbury HJ
Papa R
Payró-de la Cruz CE
Peakall R
Perrier X
Rigoberto Rosales-Serna
Rosales-Serna R
Rossi M
Saitou N
Singh SP
Vargas MLP
Vargas-Vázquez MLP
Vos P
Voysest VO
Yu K
Publication venue: Sociedade Brasileira de Genética
Publication date: 01/01/2011

A core collection of the common bean (Phaseolus vulgaris L.), representing genetic diversity in the entire Mexican holding, is kept at the INIFAP (Instituto Nacional de Investigaciones Forestales, Agricolas y Pecuarias, Mexico) Germplasm Bank. After evaluation, the genetic structure of this collection (200 accessions) was compared with that of landraces from the states of Oaxaca, Chiapas and Veracruz (10 genotypes from each), as well as a further 10 cultivars, by means of four amplified fragment length polymorphisms (AFLP) +3/+3 primer combinations and seven simple sequence repeats (SSR) loci, in order to define genetic diversity, variability and mutual relationships. Data underwent cluster (UPGMA) and molecular variance (AMOVA) analyses. AFLP analysis produced 530 bands (88.5% polymorphic) while SSR primers amplified 174 alleles, all polymorphic (8.2 alleles per locus). AFLP indicated that the highest genetic diversity was to be found in ten commercial-seed classes from two major groups of accessions from Central Mexico and Chiapas, which seems to be an important center of diversity in the south. A third group included genotypes from Nueva Granada, Mesoamerica, Jalisco and Durango races. Here, SSR analysis indicated a reduced number of shared haplotypes among accessions, whereas the highest genetic components of AMOVA variation were found within accessions. Genetic diversity observed in the common-bean core collection represents an important sample of the total Phaseolus genetic variability at the main Germplasm Bank of INIFAP. Molecular marker strategies could contribute to a better understanding of the genetic structure of the core collection as well as to its improvement and validation.

Crossref

Directory of Open Access Journals

PubMed Central

Major prospects for exploring canine vector borne diseases and novel intervention methods using 'omic technologies

Author: A Bateman
A Benitez-Paez
A Chang
A Conesa
A Debrabant
A Krasky
A Stathopoulos
AL Hopkins
AM Wang
Andreas Hofmann
AR Jex
B Chevreux
BE Campbell
BO Villoutreix
Bronwyn E Campbell
C Cantacessi
C Cantacessi
C Cantacessi
C Cantacessi
C Cantacessi
C Dissous
C Iseli
C McInnes
C Soderlund
CE Suarez
Cinzia Cantacessi
CJ Bult
CL Barbieri
CQ Huang
CR Caffrey
D Hernandez
D Otranto
D Otranto
D Otranto
D Otranto
D Woods
DA Benson
DJ Woods
DL Wheeler
DM Lorber
Domenico Otranto
DR Bentley
DR Zerbino
E Vangrevelinghe
EA Lundquist
EM Schwarz
ER Mardis
ER Mardis
EW Myers
F Liotta
F Sanger
F Sanger
G Baneth
G Rastelli
G Stoesser
GG Sutton
H Nagarajan
H Peng
H Wieman
HP Price
I Lee
IJ Tsai
J Bajsa
J DeRisi
J Falgueras
J Gibbs
J Knobloch
JA Reinhardt
JB Gibbs
JC Engel
JJ Allocco
JL Siqueira-Neto
JM Cherry
JM Walker
JN Mills
JP McCarter
JR Miller
JS Gilleard
K Hofmann
K Lackovic
K Scheibye-Alsing
K van Golen
KL Seib
KY Chan
LE Lehtinen
LS Meena
M Ashburner
M Margulies
M Moran
M Rodriguez-Valle
M Shumway
MA Dorato
MA Doyle
MD Vibranovski
MJ Smout
ND Young
ND Young
ND Young
O Morozova
OA Asojo
OA Asojo
P Cohen
P Flicek
P Green
PA Konstantinopoulos
PD Karp
R Hammami
R Li
RL Warren
RM Idury
Robin B Gasser
S Coassin
S Hunter
S Marguerat
S Pepke
S Tweedie
SF Altschul
SG Gregory
SH Nagaraj
SH Nagaraj
SH Nagaraj
SS Virtanen
SW Clifton
T Jarvie
T Yarnitzky
TA de Beer
TD Harris
TG Geary
TK Attwood
TW Harris
V Gupta
V Pandey
W Zhong
World Health Organization
X Huang
X Huang
XJ Min
Y Belkaid
Y Fukunishi
Y Tateno
Z Wang
Z Wang
Publication venue: BioMed Central
Publication date: 01/01/2011

Canine vector-borne diseases (CVBDs) are of major socioeconomic importance worldwide. Although many studies have provided insights into CVBDs, there has been limited exploration of fundamental molecular aspects of most pathogens, their vectors, pathogen-host relationships and disease and drug resistance using advanced, 'omic technologies. The aim of the present article is to take a prospective view of the impact that next-generation, 'omics technologies could have, with an emphasis on describing the principles of transcriptomic/genomic sequencing as well as bioinformatic technologies and their implications in both fundamental and applied areas of CVBD research. Tackling key biological questions employing these technologies will provide a 'systems biology' context and could lead to radically new intervention and management strategies against CVBDs.

ResearchOnline@JCU

Crossref

Springer - Publisher Connector

ResearchOnline at James Cook University

PubMed Central

Archivio istituzionale della ricerca - Università di Bari

University of Melbourne Institutional Repository

A safe and complete algorithm for metagenomic assembly

Author: A Schrijver
Alexandru I. Tomescu
B Haider
C Kingsford
D Eppstein
DR Zerbino
E Boros
E Kapun
EW Myers
FM Pajouh
G Narzisi
GF Italiano
GW Tyson
HN Gabow
IP Lysov
J Butler
J Laserson
J Qin
JC Venter
JD Kececioglu
JR Miller
JT Simpson
JT Simpson
JT Simpson
K Cechlárová
K-M Chao
M Costa
M Crochemore
M Vingron
M Vingron
MC Costa
N Nagarajan
N Nagarajan
Nidia Obscura Acosta
P Medvedev
P Medvedev
P Veiga
PA Pevzner
PJ Turnbaugh
R Li
R Zenklusen
RM Idury
S Boisvert
S Koren
T Namiki
V Lacko
V Mäkinen
Veli Mäkinen
Y Peng
Y Peng
Z Iqbal
Publication venue: Springer Science and Business Media LLC

Crossref

Analysis of in situ diversity and population structure in Ethiopian cultivated Sorghum bicolor (L.) landraces using phenotypic traits and SSR markers

Author: A Adugna
A Adugna
A Adugna
A Ayana
A Ayana
A Ayana
A Barnaud
A Bekele
A Menkir
AH Abu-Assar
AJ Bohonak
AM Casa
B Ghebru
BVS Reddy
C Barro-Kondombo
C Schlotterer
CSA (Central Statistical Agency)
CSA (Central Statistical Agency)
D Bhattramakki
D Botstein
D Falush
DA Earl
DR Jordan
DT Rosenow
E Mutegi
ES Mace
F Rousset
F Rousset
F Sagnard
F Wilcoxon
FAOSTAT
G Ejeta
G Taramino
G Taramino
G Wilkes
GP Morris
H Nybom
H Shewayrga
HA Agrama
HE Cuevas
IBC (Institute of Biodiversity Conservation)
IBPGR/ ICRISAT
J Goudet
J Zongo
JA Anderson
JA Dahlberg
JE Erpelding
JK Pritchard
K Liu
K Ngugi
KB Ritter
KF Schertz
KJ Edwards
KK Nkongolo
L Excoffier
L Jin
LT Van Beuningen
M Deu
M Deu
M Geleta
M Li
MA Menz
ME Hellberg
ML Wang
N Mantel
P Ramu
PK Gupta
PR Aldrich
R Singh
R Uptmoor
RDM Page
RE Dean
RM Idury
RS Appa
S Evanno
S Kalinowski
S Wright
SH Hulbert
SM Brown
SP Wani
ST Kalinowski
T Shehzad
USAID
V Labeyrie
V Prasanth
VP Reddy
W Yang
WL Brown
Y Djé
YH Wang
YQ Wu
YX Cui
Publication venue: Springer Science and Business Media LLC

Crossref

Parameterized Pattern Matching

Author: A Amir
A Apostolico
BS Baker
BS Baker
C Hazay
D Harel
EM McCreight
GM Landau
K Fredriksson
RM Idury
Publication venue: Springer Science and Business Media LLC

Crossref

Forward-Backward Algorithm

Author: A Kundu
BH Juang
ES Lander
IL MacDonald
JP Hughes
L Kruglyak
L Kruglyak
LE Baum
LR Rabiner
MS Waterman
PA Devijver
RM Idury
SE Levinson
SY Kung
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2001

Crossref

Chloroplast DNA Microsatellites Reveal Contrasting Phylogeographic Structure in Mahogany (Swietenia macrophylla King, Meliaceae) from Amazonia and Central America

Author: AC Newton
AC Verissimo
Andrew J. Lowe
BA Schaal
BD Hardesty
BD Rodan
C Erickson
Carlos Navarro
Christopher W. Dick
CW Birky
CW Dick
CW Dick
CW Dick
DA Hodell
DA Hodell
DE McCauley
DR Piperno
FB Lamb
GA Islebe
HJ Bandelt
J Cornelius
J Grogan
J Grogan
J Grogan
J Haffer
J Provan
JJ Doyle
JL Whitmore
K Weising
KH Wolfe
L Dupanloup
L Excoffier
LK Snook
LR Holdrige
M Nei
M Parssinen
Maristerra R. Lemes
MB Bush
MB Bush
MF Deguilloux
MJ Heckenberger
MJ Heckenberger
MR Lemes
MR Lemes
PA Colinvaux
PA Colinvaux
RA Ennos
RE Gullison
RJ Nevle
RJ Petit
RM Idury
Rogério Gribel
RR Novick
RT Pennington
S Cavers
Stephen Cavers
TC Whitmore
TD Pennington
W Balée
WM Denevan
WM Denevan
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2010

Big-leaf mahogany (Swietenia macrophylla) is one of the most valuable and overharvested timber trees of tropical America. A description of the organization of genetic variation across its broad range would be useful for management of genetic diversity and for understanding its demographic history. Here we report on a phylogeographic analysis of mahogany based on six polymorphic cpDNA simple sequence repeat loci (cpSSRs) genotyped in 16 populations distributed across the Brazilian Amazon and Mesoamerica (N = 245 individuals). Of the 31 cpDNA haplotypes identified, 15 occurred in Amazonia and 16 in Mesoamerica with no single haplotype shared between the two regions. The populations from Central America showed moderate differentiation (FST = 0.36) while within population genetic diversity was generally high (mean Nei's HE = 0.639). In contrast, the Amazonian populations were strongly differentiated (FST = 0.95) and contained low haplotype diversity (mean HE = 0.176), with the exception of the highly diverse Marajoara population from the Eastern Amazon (HE = 0.925). SAMOVA identified a single Mesoamerican phylogroup and four Amazonian phylogroups, indicating stronger phylogeographic structure within Amazonia. The results demonstrate high levels of cpDNA variation and differentiation of regional S. macrophylla populations, and provide the first evidence of a major phylogeographic break between Mesoamerican and South American mahogany populations.

Crossref

Adelaide Research & Scholarship

Repositório do INPA

Deep Blue Documents at the University of Michigan

NERC Open Research Archive