62 research outputs found

Scaffold filling, contig fusion and comparative gene order inference

Author: Albert Victor A
Muñoz Adriana
Rounsley Steve
Sankoff David
Zheng Chunfang
Zhu Qian
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: There has been a trend in increasing the phylogenetic scope of genome sequencing without finishing the sequence of the genome. Increasing numbers of genomes are being published in scaffold or contig form. Rearrangement algorithms, however, including gene order-based phylogenetic tools, require whole genome data on gene order or syntenic block order. How then can we use rearrangement algorithms to compare genomes available in scaffold form only? Can the comparative evidence predict the location of unsequenced genes? Results: Our method involves optimally filling in genes missing from the scaffolds, while incorporating the augmented scaffolds directly into the rearrangement algorithms as if they were chromosomes. This is accomplished by an exact, polynomial-time algorithm. We then correct for the number of extra fusion/fission operations required to make scaffolds comparable to full assemblies. We model the relationship between the ratio of missing genes actually absent from the genome versus merely unsequenced ones, on one hand, and the increase of genomic distance after scaffold filling, on the other. We estimate the parameters of this model through simulations and by comparing the angiosperm genomes Ricinus communis and Vitis vinifera. Conclusions: The algorithm solves the comparison of genomes with 18,300 genes, including 4,500 missing from one genome, in less than a minute on a MacBook, putting virtually all genomes within range of the method.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The University of Arizona

The Cassava Genome: Current Progress, Future Directions

Author: AA Raji
AC Roa
AP Chan
B Boher
BL Patil
Brian Desany
Chinnappa Kodira
Claude Fauquet
D Edwards
Daniel S. Rokhsar
F Awoleye
Fausto Rodriguez
GA Tuskan
H Ceballos
HM Lam
J Schmutz
Joseph Tohme
K Reilly
M Balat
Mohammed Mohiuddin
N Gill
NL Quinn
NM Springer
Pablo D. Rabinowicz
PM Schmitz
Pradeep Reddy Marri
RJ Elshire
RJ Hillocks
S Sraphet
S Tangphatsornruang
Simon Prochnik
Steve Rounsley
Timothy Harkins
VV Kapitonov
Publication venue: Springer-Verlag
Publication date: 01/01/2012

The starchy swollen roots of cassava provide an essential food source for nearly a billion people, as well as possibilities for bioenergy, yet improvements to nutritional content and resistance to threatening diseases are currently impeded. A 454-based whole genome shotgun sequence has been assembled, which covers 69% of the predicted genome size and 96% of protein-coding gene space, with genome finishing underway. The predicted 30,666 genes and 3,485 alternate splice forms are supported by 1.4 M expressed sequence tags (ESTs). Maps based on simple sequence repeat (SSR)- and EST-derived single nucleotide polymorphisms (SNPs) already exist. Thanks to the genome sequence, a high-density linkage map is currently being developed from a cross between two diverse cassava cultivars: one susceptible to cassava brown streak disease, the other resistant. An efficient genotyping-by-sequencing (GBS) approach is being developed to catalog SNPs both within the mapping population and among diverse African farmer-preferred varieties of cassava. These resources will accelerate marker-assisted breeding programs, allowing improvements in disease resistance and nutrition, and will help us understand the genetic basis for disease resistance.

Crossref

Springer - Publisher Connector

PubMed Central

CGSpace (CGIAR)

Impact of index hopping and bias towards the reference allele on accuracy of genotype calls from low-coverage sequencing

Author: Alan J. Mileham
AM Bolger
B Paten
C Xu
D Aird
DYC Brandt
GL Owens
Gregor Gorjanc
H Li
H Li
J Aitchison
JD Wall
JJ Egozcue
JM Hickey
John M. Hickey
M Costello
MA DePristo
Mara Battagin
Martin Johnsson
MG Ross
P Danecek
R Ros-Freixedes
R Ros-Freixedes
Roger Ros-Freixedes
RW Davies
S Gonen
S Hoecke Van den
Steve D. Rounsley
TS Korneliussen
X Chen
Y Benjamini
Y Guo
Y Li
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

Background: Inherent sources of error and bias that affect the quality of sequence data include index hopping and bias towards the reference allele. The impact of these artefacts is likely greater for low-coverage data than for high-coverage data because low-coverage data has scant information and many standard tools for processing sequence data were designed for high-coverage data. With the proliferation of cost-effective low-coverage sequencing, there is a need to understand the impact of these errors and bias on resulting genotype calls from low-coverage sequencing. Results: We used a dataset of 26 pigs sequenced both at 2× with multiplexing and at 30× without multiplexing to show that index hopping and bias towards the reference allele due to alignment had little impact on genotype calls. However, pruning of alternative haplotypes supported by a number of reads below a predefined threshold, which is a default and desired step of some variant callers for removing potential sequencing errors in high-coverage data, introduced an unexpected bias towards the reference allele when applied to low-coverage sequence data. This bias reduced best-guess genotype concordance of low-coverage sequence data by 19.0 absolute percentage points. Conclusions: We propose a simple pipeline to correct the preferential bias towards the reference allele that can occur during variant discovery and we recommend that users of low-coverage sequence data be wary of unexpected biases that may be produced by bioinformatic tools that were designed for high-coverage sequence data.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Directory of Open Access Journals

Edinburgh Research Explorer

Repositori Obert UdL

HAL: Hyper Article en Ligne

A physical map for the Amborella trichopoda genome sheds light on the evolution of angiosperm genome structure

Author: Albert Victor A
Ayyampalayam Saravanaraj
Banks Jody
Barbazuk Brad
Bowers John E
Carlson John E
Chamala Srikar
Chanderbali Andre
Collura Kristi
dePamphilis Claude W
Duarte Jill
Estill James C
Goicoechea José Luis
Jiao Yuannian
Kudrna Dave
Leebens-Mack Jim
Luo Meizhong
Ma Hong
Mandoli Dina
Paterson Andrew H
Pires J Chris
Rounsley Steve
Sebastian Aswathy
Soltis Douglas E
Soltis Pamela S
Tang Haibao
Tomkins Jeffrey
Wing Rod A
Xiong Zhiyong
Yu Yeisoo
Zuccolo Andrea
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Recent phylogenetic analyses have identified Amborella trichopoda, an understory tree species endemic to the forests of New Caledonia, as sister to a clade including all other known flowering plant species. The Amborella genome is a unique reference for understanding the evolution of angiosperm genomes because it can serve as an outgroup to root comparative analyses. A physical map, BAC end sequences and sample shotgun sequences provide a first view of the 870 Mbp Amborella genome. Results: Analysis of Amborella BAC ends sequenced from each contig suggests that the density of long terminal repeat retrotransposons is negatively correlated with that of protein coding genes. Syntenic, presumably ancestral, gene blocks were identified in comparisons of the Amborella BAC contigs and the sequenced Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera and Oryza sativa genomes. Parsimony mapping of the loss of synteny corroborates previous analyses suggesting that the rate of structural change has been more rapid on lineages leading to Arabidopsis and Oryza compared with lineages leading to Populus and Vitis. The gamma paleohexaploidy event identified in the Arabidopsis, Populus and Vitis genomes is shown to have occurred after the divergence of all other known angiosperms from the lineage leading to Amborella. Conclusions: When placed in the context of a physical map, BAC end sequences representing just 5.4% of the Amborella genome have facilitated reconstruction of gene blocks that existed in the last common ancestor of all flowering plants. The Amborella genome is an invaluable reference for inferences concerning the ancestral angiosperm and subsequent genome evolution.

Crossref

Springer - Publisher Connector

PubMed Central

The University of Arizona

Archivio della ricerca della Scuola Superiore Sant'Anna

UQ eSpace (University of Queensland)

The Genome of Nectria haematococca: Contribution of Supernumerary Chromosomes to Gene Expansion

The ascomycetous fungus Nectria haematococca (asexual name Fusarium solani) is a member of a group of >50 species known as the “Fusarium solani species complex”. Members of this complex have diverse biological properties including the ability to cause disease on >100 genera of plants and opportunistic infections in humans. The current research analyzed the most extensively studied member of this complex, N. haematococca mating population VI (MPVI). Several genes controlling the ability of individual isolates of this species to colonize specific habitats are located on supernumerary chromosomes. Optical mapping revealed that the sequenced isolate has 17 chromosomes ranging from 530 kb to 6.52 Mb and that the physical size of the genome, 54.43 Mb, and the number of predicted genes, 15,707, are among the largest reported for ascomycetes. Two classes of genes have contributed to gene expansion: specific genes that are not found in other fungi including its closest sequenced relative, Fusarium graminearum; and genes that commonly occur as single copies in other fungi but are present as multiple copies in N. haematococca MPVI. Some of these additional genes appear to have resulted from gene duplication events, while others may have been acquired through horizontal gene transfer. The supernumerary nature of three chromosomes, 14, 15, and 17, was confirmed by their absence in pulsed field gel electrophoresis experiments of some isolates and by demonstrating that these isolates lacked chromosome-specific sequences found on the ends of these chromosomes. These supernumerary chromosomes contain more repeat sequences, are enriched in unique and duplicated genes, and have a lower G+C content in comparison to the other chromosomes. Although the origin(s) of the extra genes and the supernumerary chromosomes is not known, the gene expansion and its large genome size are consistent with this species' diverse range of habitats. Furthermore, the presence of unique genes on supernumerary chromosomes might account for individual isolates having different environmental niches.

Public Library of Science (PLOS)

HAL AMU

Directory of Open Access Journals

PubMed Central

University of Kentucky

Purdue E-Pubs

VTT Research System

ProdInra

International Migration, Integration and Social Cohesion online publications

UvA-DARE

The Future of Rice Genomics: Sequencing the Collective Oryza Genome

Author: A Coghlan
A Navarro
A Roulin
A Widmer
B Piegu
DF Conrad
Drosophila Sequencing Consortium
DS Brar
DS Brar
F Lu
G Bertoni
H Kim
H Kim
International Rice Genome Sequencing Project (IRGSP)
J Yu
Jetty Siva S. Ammiraju
Jose Luis Goicoechea
JS Ammiraju
JS Ammiraju
JSS Ammiraju
JSS Ammiraju
JW Thomas
K Livingstone
L Feuk
LH Rieseberg
LH Rieseberg
MAF Noor
Mingsheng Chen
Pradeep Reddy Marri
Q Zhang
QF Zhang
R Wing
Rice Chromosome 3 Sequencing Consortium
Rod A. Wing
S Ge
S Rounsley
SA Goff
Scott Jackson
Steve Rounsley
Yeisoo Yu
Publication venue: Springer Science and Business Media LLC

Crossref

High-resolution linkage map and chromosome-scale genome assembly for cassava (Manihot esculenta Crantz) from 10 populations

Cassava (Manihot esculenta Crantz) is a major staple crop in Africa, Asia, and South America, and its starchy roots provide nourishment for 800 million people worldwide. Although native to South America, cassava was brought to Africa 400–500 years ago and is now widely cultivated across sub-Saharan Africa, but it is subject to biotic and abiotic stresses. To assist in the rapid identification of markers for pathogen resistance and crop traits, and to accelerate breeding programs, we generated a framework map for M. esculenta Crantz from reduced representation sequencing [genotyping-by-sequencing (GBS)]. The composite 2412-cM map integrates 10 biparental maps (comprising 3480 meioses) and organizes 22,403 genetic markers on 18 chromosomes, in agreement with the observed karyotype. We used the map to anchor 71.9% of the draft genome assembly and 90.7% of the predicted protein-coding genes. The chromosome-anchored genome sequence will be useful for breeding improvement by assisting in the rapid identification of markers linked to important traits, and in providing a framework for genomic selection-enhanced breeding of this important crop.

Bill and Melinda Gates Foundation (BMGF) Grant OPPGD1493. University of Arizona. CGIAR Research Program on Roots, Tubers, and Bananas. Next Generation Cassava Breeding grant OPP1048542 from BMGF and the United Kingdom Department for International Development. BMGF grant OPPGD1016 to IITA. National Institutes of Health S10 Instrumentation Grants S10RR029668 and S10RR027303. http://www.g3journal.org

UPSpace at the University of Pretoria

Sequence variation, evolutionary constraint, and selection at the CD163 gene in pigs

Author: A McKenna
A Wilm
AB Gussow
BL Aken
BO Fabriek
D Bates
D Smedley
DJ Holtkamp
E Eisenberg
Gregor Gorjanc
H Yang
I Bartha
J Hermisson
JG Calvert
JM Hickey
JM Rodriguez
JM Smith
John M. Hickey
Jonathan Lightner
KD Wells
KE Eilertson
Kimberly Kelly
KM Whitworth
KM Whitworth
LC Bover
M Kristiansen
M Nei
MAM Groenen
Martin Johnsson
Matt A. Campbell
MM Heuvel Van den
NR Garud
P Cingolani
P Danecek
P Philippidis
R Ros-Freixedes
R Ros-Freixedes
RD Finn
Roger Ros-Freixedes
S Gonen
S Petrovski
Steve Rounsley
Sudhir Naswa
W McLaren
WJ Kent
Y Caì
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

Background: In this work, we investigated sequence variation, evolutionary constraint, and selection at the CD163 gene in pigs. A functional CD163 protein is required for infection by porcine reproductive and respiratory syndrome virus, which is a serious pathogen with major impacts on pig production. Results: We used targeted pooled sequencing of the exons of CD163 to detect sequence variants in 35,000 pigs of diverse genetic backgrounds and to search for potential stop-gain and frameshift indel variants. Then, we used whole-genome sequence data from three pig lines to calculate: a variant intolerance score that measures the tolerance of genes to protein coding variation; an estimate of selection on protein-coding variation over evolutionary time; and haplotype diversity statistics to detect recent selective sweeps during breeding. Conclusions: Using a deep survey of sequence variation in the CD163 gene in domestic pigs, we found no potential knockout variants. The CD163 gene was moderately intolerant to variation and showed evidence of positive selection in the pig lineage, but no evidence of recent selective sweeps during breeding.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Directory of Open Access Journals

Edinburgh Research Explorer

Repositori Obert UdL

HAL: Hyper Article en Ligne