95 research outputs found

GeneMark: web software for gene finding in prokaryotes, eukaryotes and viruses

Author: Besemer John
Borodovsky Mark
Publication venue: Oxford University Press
Publication date: 01/01/2005

The task of gene identification, which frequently confronts researchers working with both novel and well-studied genomes, can be conveniently and reliably solved with the help of the GeneMark web software. The website provides interfaces to the GeneMark family of programs designed and tuned for gene prediction in prokaryotic, eukaryotic and viral genomic sequences. Currently, the server allows the analysis of nearly 200 prokaryotic and >10 eukaryotic genomes using species-specific versions of the software and pre-computed gene models. In addition, genes in prokaryotic sequences from novel genomes can be identified using models derived on the spot upon sequence submission, either by a relatively simple heuristic approach or by the full-fledged self-training program GeneMarkS. A database of reannotations of >1000 viral genomes by the GeneMarkS program is also available from the website. The GeneMark website is frequently updated to provide the latest versions of the software and gene models.

CiteSeerX

Crossref

PubMed Central

Protein secondary structure prediction for a single-sequence using hidden semi-Markov models

Author: Altunbasak Yucel
Aydin Zafer
Borodovsky Mark
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: The accuracy of protein secondary structure prediction has been improving steadily towards the 88% estimated theoretical limit. There are two types of prediction algorithms: single-sequence prediction algorithms assume that information about other (homologous) proteins is not available, while algorithms of the second type assume that information about homologous proteins is available and use it intensively. The single-sequence algorithms could make an important contribution to studies of proteins with no detected homologs; however, the accuracy of protein secondary structure prediction from a single sequence is not as high as when the additional evolutionary information is present. RESULTS: In this paper, we further refine and extend the hidden semi-Markov model (HSMM) initially considered in the BSPSS algorithm. We introduce an improved residue dependency model by considering the patterns of statistically significant amino acid correlation at structural segment borders. We also derive models that specialize on different sections of the dependency structure and incorporate them into HSMM. In addition, we implement an iterative training method to refine estimates of HSMM parameters. The three-state-per-residue accuracy and other accuracy measures of the new method, IPSSP, are shown to be comparable to or better than those for BSPSS as well as for PSIPRED, tested under the single-sequence condition. CONCLUSIONS: We have shown that new dependency models and training methods bring further improvements to single-sequence protein secondary structure prediction. The results are obtained under cross-validation conditions using a dataset with no pair of sequences having significant sequence similarity. As new sequences are added to the database, it is possible to augment the dependency structure and obtain even higher accuracy. Current and future advances should contribute to the improvement of function prediction for orphan proteins inscrutable to current similarity search methods.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

2K09 and thereafter : the coming era of integrative bioinformatics, systems biology and intelligent computing for functional genomics and personalized medicine research

Author: Arabnia Hamid R
Athey Brian D
Bajcsy Ruzena
Borodovsky Mark
Deng Youping
Dunker A Keith
Ersoy Okan K
Ghafoor Arif
Li Guo-zheng
Liu Yunlong
Niemierko Andrzej
Xu Dong
Yang Jack Y
Zhang Aidong
Zhang Joe C
Publication venue: BioMed Central
Publication date: 01/01/2010

Significant interest exists in establishing synergistic research in bioinformatics, systems biology and intelligent computing. Supported by the United States National Science Foundation (NSF), International Society of Intelligent Biological Medicine (http://www.ISIBM.org), International Journal of Computational Biology and Drug Design (IJCBDD) and International Journal of Functional Informatics and Personalized Medicine, the ISIBM International Joint Conferences on Bioinformatics, Systems Biology and Intelligent Computing (ISIBM IJCBS 2009) attracted more than 300 papers and 400 researchers and medical doctors world-wide. It was the only inter/multidisciplinary conference aimed to promote synergistic research and education in bioinformatics, systems biology and intelligent computing. The conference committee was very grateful for the valuable advice and suggestions from honorary chairs, steering committee members and scientific leaders including Dr. Michael S. Waterman (USC, Member of United States National Academy of Sciences), Dr. Chih-Ming Ho (UCLA, Member of United States National Academy of Engineering and Academician of Academia Sinica), Dr. Wing H. Wong (Stanford, Member of United States National Academy of Sciences), Dr. Ruzena Bajcsy (UC Berkeley, Member of United States National Academy of Engineering and Member of United States Institute of Medicine of the National Academies), Dr. Mary Qu Yang (United States National Institutes of Health and Oak Ridge, DOE), Dr. Andrzej Niemierko (Harvard), Dr. A. Keith Dunker (Indiana), Dr. Brian D. Athey (Michigan), Dr. Weida Tong (FDA, United States Department of Health and Human Services), Dr. Cathy H. Wu (Georgetown), Dr. Dong Xu (Missouri), Drs. Arif Ghafoor and Okan K Ersoy (Purdue), Dr. Mark Borodovsky (Georgia Tech, President of ISIBM), Dr. Hamid R. Arabnia (UGA, Vice-President of ISIBM), and other scientific leaders. The committee presented the 2009 ISIBM Outstanding Achievement Awards to Dr. Joydeep Ghosh (UT Austin), Dr. Aidong Zhang (Buffalo) and Dr. Zhi-Hua Zhou (Nanjing) for their significant contributions to the field of intelligent biological medicine.

Aquila Digital Community

Crossref

IUPUIScholarWorks

Harvard University - DASH

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Deep Blue Documents at the University of Michigan

The Chlorella variabilis NC64A Genome Reveals Adaptation to Photosymbiosis, Coevolution with Viruses, and Cryptic Sex

Author: Agarkova Irina
Blanc Guillaume
Borodovsky Mark
Claverie Jean-Michel
Duncan Garry A.
Dunigan David D
Grigoriev Igor V.
Gurnon James
Kuo Alan
Lindquist Erika
Lucas Susan
Pangilinan Jasmyn
Polle Juergen
Salamov Asaf
Terry Astrid
Van Etten James L
Yamada Takashi
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 01/09/2010

Chlorella variabilis NC64A, a unicellular photosynthetic green alga (Trebouxiophyceae), is an intracellular photobiont of Paramecium bursaria and a model system for studying virus/algal interactions. We sequenced its 46-Mb nuclear genome, revealing an expansion of protein families that could have participated in adaptation to symbiosis. NC64A exhibits variations in GC content across its genome that correlate with global expression level, average intron size, and codon usage bias. Although Chlorella species have been assumed to be asexual and nonmotile, the NC64A genome encodes all the known meiosis-specific proteins and a subset of proteins found in flagella. We hypothesize that Chlorella might have retained a flagella-derived structure that could be involved in sexual reproduction. Furthermore, a survey of phytohormone pathways in chlorophyte algae identified algal orthologs of Arabidopsis thaliana genes involved in hormone biosynthesis and signaling, suggesting that these functions were established prior to the evolution of land plants. We show that the ability of Chlorella to produce chitinous cell walls likely resulted from the capture of metabolic genes by horizontal gene transfer from algal viruses, prokaryotes, or fungi. Analysis of the NC64A genome substantially advances our understanding of the green lineage evolution, including the genomic interplay with viruses and symbiosis between eukaryotes.

DigitalCommons@University of Nebraska

The genome sequence and transcriptome of Potentilla micrantha and their comparison to Fragaria vesca (the woodland strawberry)

Author: Alonge Michael
Barghini Elena
Borodovsky Mark
Brilli Matteo
Buti Matteo
Cavallini Andrea
Cestaro Alessandro
Engelen Kristof
Giongo Lara
Sargent Daniel James
Lomsadze Alexandre
Mascagni Flavia
Moretto Marco
Natali Lucia
Sonego Paolo
Varotto Claudio
Velasco Riccardo
Ward Judson A.
Šurbanovski Nada
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2018

Background: The genus Potentilla is closely related to that of Fragaria, the economically important strawberry genus. Potentilla micrantha is a species that does not develop berries, but shares numerous morphological and ecological characteristics with F. vesca. These similarities make P. micrantha an attractive choice for comparative genomics studies with F. vesca. Findings: In this study, the Potentilla micrantha genome was sequenced and annotated, and RNA-Seq data from the different developmental stages of flowering and fruiting were used to develop a set of gene predictions. A 327 Mbp sequence and annotation of the genome of P. micrantha, spanning 2,674 sequence contigs, with an N50 size of 335,712, estimated to cover 80% of the total genome size of the species, was developed. The genus Potentilla has a characteristically larger genome size than Fragaria, but the recovered sequence scaffolds were remarkably collinear at the micro-syntenic level with the genome of F. vesca, its closest sequenced relative. A total of 33,602 genes were predicted, and 95.1% of BUSCO genes were complete within the presented sequence. Thus, we argue that the majority of the gene-rich regions of the genome have been sequenced. Conclusions: Comparisons of RNA-Seq data from the stages of floral and fruit development revealed genes differentially expressed between P. micrantha and F. vesca. The data presented are a valuable resource for future studies of berry development in Fragaria and the Rosaceae and they also shed light on the evolution of genome size and organization in this family.

Archivio istituzionale della ricerca - Fondazione Edmund Mach

AIR Universita degli studi di Milano

Florence Research

Archivio della Ricerca - Università di Pisa

A chromosome-length genome assembly and annotation of blackberry (Rubus argutus, cv. "Hillquist")

Author: Aiden Erez Lieberman
Andres Javier
Armour Mitchell
Aryal Rishi
Ashrafi Hamid
Bassil Nahla
Borodovsky Mark
Britton Caitlin
Bruna Tomas
Buti Matteo
Cavallini Andrea
Davik Jahn
Dudchenko Olga
Fernandez Gina E.
Hytönen Timo
Lomsadze Alexandre
Mascagni Flavia
Mead Daniel
Natali Lucia
Olukolu Bode
Pham Melanie
Poorten Thomas
Sargent Daniel James
Usai Gabriele
Weisz David
Worthington Margaret
Publication date: 04/11/2022

Blackberries (Rubus spp.) are the fourth most economically important berry crop worldwide. Genome assemblies and annotations have been developed for Rubus species in subgenus Idaeobatus, including black raspberry (R. occidentalis), red raspberry (R. idaeus), and R. chingii, but very few genomic resources exist for blackberries and their relatives in subgenus Rubus. Here we present a chromosome-length assembly and annotation of the diploid blackberry germplasm accession "Hillquist" (R. argutus). "Hillquist" is the only known source of primocane-fruiting (annual-fruiting) in tetraploid fresh-market blackberry breeding programs and is represented in the pedigree of many important cultivars worldwide. The "Hillquist" assembly, generated using Pacific Biosciences long reads scaffolded with high-throughput chromosome conformation capture sequencing, consisted of 298 Mb, of which 270 Mb (90%) was placed on 7 chromosome-length scaffolds with an average length of 38.6 Mb. Approximately 52.8% of the genome was composed of repetitive elements. The genome sequence was highly collinear with a novel maternal haplotype-resolved linkage map of the tetraploid blackberry selection A-2551TN and genome assemblies of R. chingii and red raspberry. A total of 38,503 protein-coding genes were predicted, of which 72% were functionally annotated. Eighteen flowering gene homologs within a previously mapped locus aligning to an 11.2 Mb region on chromosome Ra02 were identified as potential candidate genes for primocane-fruiting. The utility of the "Hillquist" genome has been demonstrated here by the development of the first genotyping-by-sequencing-based linkage map of tetraploid blackberry and the identification of possible candidate genes for primocane-fruiting. This chromosome-length assembly will facilitate future studies in Rubus biology, genetics, and genomics and strengthen applied breeding programs.

Helsingin yliopiston digitaalinen arkisto

NIBIO Brage

Metagenomics of the Svalbard Reindeer Rumen Microbiome Reveals Abundance of Polysaccharide Utilization Loci

Author: Alasdair K. Mackenzie
Alice C. McHardy
BJ Haas
BL Cantarel
C Lozupone
CG Orpin
CJ Duan
CP Rosewarne
D Dodd
D Hyatt
EA Bayer
EC Martens
F Warnecke
FD Ciccarelli
G Suen
H Noguchi
Ivan Gregor
J Vogel
JA Shipman
JG Caporaso
JM Brulc
JR Cole
KR Patil
LV Mello
M Borodovsky
M Hamady
M Hess
M Sekelja
MA Sundset
Mark Morrison
Mark R. Liles
MN Price
Monica A. Sundset
P Shannon
PB Pope
Phillip B. Pope
RC Edgar
SY Ding
TZ DeSantis
V Gomez-Alvarez
Vincent G.H. Eijsink
VM Markowitz
W Sørmo
Wendy Smith
ZA Popper
Publication venue: Public Library of Science
Publication date: 01/01/2012

Lignocellulosic biomass remains a largely untapped source of renewable energy predominantly due to its recalcitrance and an incomplete understanding of how this is overcome in nature. We present here a compositional and comparative analysis of metagenomic data pertaining to a natural biomass-converting ecosystem adapted to austere arctic nutritional conditions, namely the rumen microbiome of Svalbard reindeer (Rangifer tarandus platyrhynchus). Community analysis showed that deeply-branched cellulolytic lineages affiliated to the Bacteroidetes and Firmicutes are dominant, whilst sequence binning methods facilitated the assemblage of metagenomic sequence for a dominant and novel Bacteroidales clade (SRM-1). Analysis of unassembled metagenomic sequence as well as metabolic reconstruction of SRM-1 revealed the presence of multiple polysaccharide utilization loci-like systems (PULs) as well as members of more than 20 glycoside hydrolase and other carbohydrate-active enzyme families targeting various polysaccharides including cellulose, xylan and pectin. Functional screening of cloned metagenome fragments revealed high cellulolytic activity and an abundance of PULs that are rich in endoglucanases (GH5) but devoid of other common enzymes thought to be involved in cellulose degradation. Combining these results with known and partly re-evaluated metagenomic data strongly indicates that, much like the human distal gut, the digestive system of herbivores harbours high numbers of deeply branched and as-yet uncultured members of the Bacteroidetes that depend on PUL-like systems for plant biomass degradation.

Crossref

Directory of Open Access Journals

PubMed Central

Munin - Open Research Archive

NORA - Norwegian Open Research Archives

MPG.PuRe

University of Queensland eSpace

FigShare