Search CORE

3,368 research outputs found

Joint assembly and genetic mapping of the Atlantic horseshoe crab genome reveals ancient whole genome duplication

Author: Brockmann H Jane
Havlak Paul
Lv Jie
Nossa Carlos
Putnam Nicholas H
Vincent Kim
Yue Jia-Xing
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 27/09/2013
Field of study

Horseshoe crabs are marine arthropods with a fossil record extending back approximately 450 million years. They exhibit remarkable morphological stability over their long evolutionary history, retaining a number of ancestral arthropod traits, and are often cited as examples of "living fossils." As arthropods, they belong to the Ecdysozoa}, an ancient super-phylum whose sequenced genomes (including insects and nematodes) have thus far shown more divergence from the ancestral pattern of eumetazoan genome organization than cnidarians, deuterostomes, and lophotrochozoans. However, much of ecdysozoan diversity remains unrepresented in comparative genomic analyses. Here we use a new strategy of combined de novo assembly and genetic mapping to examine the chromosome-scale genome organization of the Atlantic horseshoe crab Limulus polyphemus. We constructed a genetic linkage map of this 2.7 Gbp genome by sequencing the nuclear DNA of 34 wild-collected, full-sibling embryos and their parents at a mean redundancy of 1.1x per sample. The map includes 84,307 sequence markers and 5,775 candidate conserved protein coding genes. Comparison to other metazoan genomes shows that the L. polyphemus genome preserves ancestral bilaterian linkage groups, and that a common ancestor of modern horseshoe crabs underwent one or more ancient whole genome duplications (WGDs) ~ 300 MYA, followed by extensive chromosome fusion

arXiv.org e-Print Archive

Springer - Publisher Connector

PubMed Central

DSpace at Rice University

The EM Algorithm and the Rise of Computational Biology

Author: Citable Link
Jun S. Liu
Xiaodan Fan
Yuan Yuan
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2010
Field of study

In the past decade computational biology has grown from a cottage industry with a handful of researchers to an attractive interdisciplinary field, catching the attention and imagination of many quantitatively-minded scientists. Of interest to us is the key role played by the EM algorithm during this transformation. We survey the use of the EM algorithm in a few important computational biology problems surrounding the "central dogma"; of molecular biology: from DNA to RNA and then to proteins. Topics of this article include sequence motif discovery, protein sequence alignment, population genetics, evolutionary models and mRNA expression microarray data analysis.Comment: Published in at http://dx.doi.org/10.1214/09-STS312 the Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

CiteSeerX

Crossref

Chromosomal-level assembly of the Asian Seabass genome using long sequence reads and multi-layered scaffolding

Author: A Bairoch
A Christoffels
A Gurevich
A Kozomara
A McKenna
A Mitchell
A Morgulis
A Morgulis
A Pradhan
A Reiner
A Rodriguez-Mari
A Stamatakis
A Yates
AI Makunin
AJ Enright
AL Price
AL Price
Alan Christoffels
Aleksey Komissarov
Alexey Tupikin
Amy Hin Yan Tong
Andrey A. Yurchenko
AR Quinlan
B Langmead
B Star
C Berthelot
C Camacho
C Holt
C Wang
Chen-Shan Chin
CS Chin
D Brawand
D Ellinghaus
DA Benson
Darrell Green
DC Hardie
Dean R. Jerry
DH Alexander
Doreen Lau
DR Kelley
DRS-K C. Jerry
E Casacuberta
E. TG Staristina
EW Myers
F Abascal
F Chen
F Yang
FC Jones
FJ Krsticevic
Fritz J. Sedlazeck
G Abrusan
G Benson
G Lin
G Marcais
G Parra
G Parra
G Tamazian
GH Yue
GH Yue
Gopikrishna Gopalapillai
Gregory W. Vurture
GS Slater
GT Valente
H Li
H Saiga
Heiner Kuhl
HH Kazazian Jr.
I Braasch
Inna S. Kuznetsova
IS Kuznetsova
J Castresana
J Eid
J Huerta-Cepas
J Jurka
J Lin
James P. Drake
JG Ruby
JN Volff
JN Volff
Jolly M. Saju
Jonas Korlach
JS Chew
Junhui Jiang
K Howe
K Katoh
K Prufer
Kathiresan Purushothaman
KD Pruitt
KJ Hoff
KP Koepfli
KW Tzung
Lawrence S. Hon
László Orbán
M Blanchette
M Kanehisa
M Kasahara
M Kolmogorov
M Krzywinski
M Martin
M Schartl
M Tarailoâ-Graovac
M Tine
MA Larkin
Mario Jonas
Marsel Kabilov
Matthew Boitano
MB Stocks
MG Grabherr
Michael C. Schatz
MJ Chaisson
MR Friedlander
N Siegel
Natascha M. Thevasagayam
NM Thevasagayam
O Jaillon
O Otero
P Cingolani
P Ravi
P Schattner
P Shannon
P Xu
Paul M. Richardson
PE Warburton
Peter Van Heusden
R Kajitani
R Lorenz
R Luo
R Moore
R Pethiyagoda
R Poulter
R She
R Sreenivasan
Ramkumar Lachumanan
RD Ward
RD Ward
Richard Hall
RJ Roberts
S Chen
S Guindon
S Hoegg
S Hoegg
S Koren
S Vij
S Zhou
Sai Rama Sridatta Prakki
Sarah Mwangi
SF Altschul
Shubha Vij
Si Lok
Si Yan Ngoh
Siddharth Singh
Simon Moxon
SM Kielbasa
Sridhar Sivasubbu
Stanley Kimbung Mbandi
Stephen J. O'Brien
Stephen W. Turner
T Anantharaman
Tamás Dalmay
Tansyn H. Noble
TD Wu
TF DeLuca
TH O'Hare
TLO Davis
TS Anantharaman
Tyler Garvin
U Consortium
U Grimholt
V Douard
V Ravi
Vinaya Kumar Katneni
Vinod Scaria
Vladimir Trifonov
W Xue
WC Liew
Woei Chang Liew
WS Davidson
X Huang
X Zheng
XG Wang
XG Wang
Xueyan Shen
Y Guiguen
Y Han
Y Hashiguchi
Y Moriya
Y Sato
Y Sato
Y Sato
Z Lai
Ø Hammer
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2016
Field of study

We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical and genetic mapping along with shared synteny from closely related fish species to derive a chromosome-level assembly with a contig N50 size over 1 Mb and scaffold N50 size over 25 Mb that span ~90% of the genome. The population structure of L. calcarifer species complex was analyzed by re-sequencing 61 individuals representing various regions across the species' native range. SNP analyses identified high levels of genetic diversity and confirmed earlier indications of a population stratification comprising three clades with signs of admixture apparent in the South-East Asian population. The quality of the Asian seabass genome assembly far exceeds that of any other fish species, and will serve as a new standard for fish genomics

Public Library of Science (PLOS)

ResearchOnline@JCU

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Directory of Open Access Journals

ResearchOnline at James Cook University

PubMed Central

Research Repository

Repository of the Academy's Library

University of East Anglia digital repository

NSU Works

MPG.PuRe

SUFFIX TREE, MINWISE HASHING AND STREAMING ALGORITHMS FOR BIG DATA ANALYSIS IN BIOINFORMATICS

Author: Behera Sairam
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 04/12/2020
Field of study

In this dissertation, we worked on several algorithmic problems in bioinformatics using mainly three approaches: (a) a streaming model, (b) sux-tree based indexing, and (c) minwise-hashing (minhash) and locality-sensitive hashing (LSH). The streaming models are useful for large data problems where a good approximation needs to be achieved with limited space usage. We developed an approximation algorithm (Kmer-Estimate) using the streaming approach to obtain a better estimation of the frequency of k-mer counts. A k-mer, a subsequence of length k, plays an important role in many bioinformatics analyses such as genome distance estimation. We also developed new methods that use sux tree, a trie data structure, for alignment-free, non-pairwise algorithms for a conserved non-coding sequence (CNS) identification problem. We provided two different algorithms: STAG-CNS to identify exact-matched CNSs and DiCE to identify CNSs with mismatches. Using our algorithms, CNSs among various grass species were identified. A different approach was employed for identification of longer CNSs ( 100 bp, mostly found in animals). In our new method (MinCNE), the minhash approach was used to estimate the Jaccard similarity. Using also LSH, k-mers extracted from genomic sequences were clustered and CNSs were identified. Another new algorithm (MinIsoClust) that also uses minhash and LSH techniques was developed for an isoform clustering problem. Isoforms are generated from the same gene but by alternative splicing. As the isoform sequences share some exons but in different combinations, regular sequencing clustering methods do not work well. Our algorithm generates clusters for isoform sequences based on their shared minhash signatures. Finally, we discuss de novo transcriptome assembly algorithms and how to improve the assembly accuracy using ensemble approaches. First, we did a comprehensive performance analysis on different transcriptome assemblers using simulated benchmark datasets. Then, we developed a new ensemble approach (Minsemble) for the de novo transcriptome assembly problem that integrates isoform-clustering using minhash technique to identify potentially correct transcripts from various de novo transcriptome assemblers. Minsemble identified more correctly assembled transcripts as well as genes compared to other de novo and ensemble methods. Adviser: Jitender S. Deogu

Recommended from our members

Genomic signatures of heterokaryosis in the oomycete pathogen Bremia lactucae.

Author: Bertier Lien D
Cavanaugh Keri
Fletcher Kyle
Gil Juliana
Kenefick Aubrey
Michelmore Richard
Reyes-Chin-Wo Sebastian
Tsuchida Cayla
Wong Joan
Wood Kelsey J
Zhang Lin
Publication venue: eScholarship, University of California
Publication date: 01/01/2019
Field of study

Lettuce downy mildew caused by Bremia lactucae is the most important disease of lettuce globally. This oomycete is highly variable and rapidly overcomes resistance genes and fungicides. The use of multiple read types results in a high-quality, near-chromosome-scale, consensus assembly. Flow cytometry plus resequencing of 30 field isolates, 37 sexual offspring, and 19 asexual derivatives from single multinucleate sporangia demonstrates a high incidence of heterokaryosis in B. lactucae. Heterokaryosis has phenotypic consequences on fitness that may include an increased sporulation rate and qualitative differences in virulence. Therefore, selection should be considered as acting on a population of nuclei within coenocytic mycelia. This provides evolutionary flexibility to the pathogen enabling rapid adaptation to different repertoires of host resistance genes and other challenges. The advantages of asexual persistence of heterokaryons may have been one of the drivers of selection that resulted in the loss of uninucleate zoospores in multiple downy mildews

eScholarship - University of California

Hiking in the energy landscape in sequence space: a bumpy road to good folders

Author: Broglia R. A.
Shakhnovich E. I.
Tiana G.
Publication venue
Publication date: 25/05/1999
Field of study

With the help of a simple 20 letters, lattice model of heteropolymers, we investigate the energy landscape in the space of designed good-folder sequences. Low-energy sequences form clusters, interconnected via neutral networks, in the space of sequences. Residues which play a key role in the foldability of the chain and in the stability of the native state are highly conserved, even among the chains belonging to different clusters. If, according to the interaction matrix, some strong attractive interactions are almost degenerate (i.e. they can be realized by more than one type of aminoacid contacts) sequence clusters group into a few super-clusters. Sequences belonging to different super-clusters are dissimilar, displaying very small (

\approx 10%

) similarity, and residues in key-sites are, as a rule, not conserved. Similar behavior is observed in the analysis of real protein sequences.Comment: 17 pages 5 figures Corrected typos added auxiliary informatio

arXiv.org e-Print Archive

AIR Universita degli studi di Milano