Search CORE

171 research outputs found

PSP_MCSVM: brainstorming consensus prediction of protein secondary structures using two-stage multiclass support vector machines

Author: A Kloczkowski
AA Salamov
B Rost
C Cole
D Frishman
Dariusz Plewczynski
DG Kneller
H Lin
J Garnier
J Guo
JA Cuff
JF Gibrat
K Wu
LM Jonathon
M Ouali
Mahantapas Kundu
Mita Nasipuri
N Qian
P Chatterjee
Piyali Chatterjee
PY Chou
RD King
SF Altschul
Subhadip Basu
TD Jones
Publication venue: Springer-Verlag
Publication date: 01/01/2011

Secondary structure prediction is a crucial task for understanding the variety of protein structures and the biological functions they perform. Prediction of secondary structures for new proteins using their amino acid sequences is of fundamental importance in bioinformatics. We propose a novel technique to predict protein secondary structures based on position-specific scoring matrices (PSSMs) and physicochemical properties of amino acids. It is a two-stage approach involving multiclass support vector machines (SVMs) as classifiers for three different structural conformations, viz., helix, sheet and coil. In the first stage, PSSMs obtained from PSI-BLAST and five specially selected physicochemical properties of amino acids are fed into SVMs as features for sequence-to-structure prediction. Confidence values for forming helix, sheet and coil that are obtained from the first-stage SVM are then used in the second-stage SVM for performing structure-to-structure prediction. The two-stage cascaded classifiers (PSP_MCSVM) are trained with proteins from the RS126 dataset. The classifiers are finally tested on target proteins of the Critical Assessment of protein Structure Prediction experiment 9 (CASP9). PSP_MCSVM with the brainstorming consensus procedure performs better than prediction servers such as Predator, DSC and SIMPA96 for randomly selected proteins from CASP9 targets. The overall performance is found to be comparable with the current state of the art. PSP_MCSVM source code, train-test datasets and supplementary files are freely available in the public domain at: http://sysbio.icm.edu.pl/secstruct and http://code.google.com/p/cmater-bioinfo
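The two-stage data flow described in this abstract can be illustrated with a minimal sketch. It is not the PSP_MCSVM implementation: the sliding window, the zero padding, and the two stand-in scorers (`centre_row` and `smooth`) are assumptions made only for illustration, and in the real system each stage would be a trained multiclass SVM fed with PSSM and physicochemical features.

```python
from typing import Callable, List, Sequence

CLASSES = ("H", "E", "C")  # helix, sheet, coil

Vector = List[float]
Scorer = Callable[[Vector], Vector]  # feature vector -> one confidence per class


def windows(rows: Sequence[Vector], half_width: int) -> List[Vector]:
    """Concatenate each row with its neighbours, zero-padding at both ends."""
    pad = [0.0] * len(rows[0])
    out: List[Vector] = []
    for i in range(len(rows)):
        flat: Vector = []
        for j in range(i - half_width, i + half_width + 1):
            flat.extend(rows[j] if 0 <= j < len(rows) else pad)
        out.append(flat)
    return out


def predict_two_stage(features: Sequence[Vector], stage1: Scorer,
                      stage2: Scorer, half_width: int = 1) -> str:
    # Stage 1 (sequence-to-structure): per-residue features -> confidences.
    conf1 = [stage1(x) for x in windows(features, half_width)]
    # Stage 2 (structure-to-structure): neighbouring confidences -> confidences.
    conf2 = [stage2(x) for x in windows(conf1, half_width)]
    return "".join(CLASSES[max(range(3), key=c.__getitem__)] for c in conf2)


# Hypothetical stand-ins for the trained classifiers (window of 3 rows x 3 values).
def centre_row(x: Vector) -> Vector:
    return x[3:6]  # pass the central residue's values through unchanged


def smooth(x: Vector) -> Vector:
    return [sum(x[k::3]) / 3 for k in range(3)]  # average each class over the window


# Toy input: one row of three made-up values per residue.
toy = [[0.9, 0.05, 0.05], [0.4, 0.5, 0.1], [0.8, 0.1, 0.1], [0.05, 0.05, 0.9]]
prediction = predict_two_stage(toy, centre_row, smooth)
```

With these toy values the first stage alone would label the second residue as sheet, while the second stage sees its helix-rich neighbours and relabels it, which is the kind of correction a structure-to-structure stage is meant to provide.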

Crossref

Springer - Publisher Connector

PubMed Central

Identification and In Vivo Characterization of NvFP-7R, a Developmentally Regulated Red Fluorescent Protein of Nematostella vectensis

Author: A Kusserow
A Miyawaki
A Salih
AA Pakhomov
AA Salamov
AH Wikramanayake
Aissam Ikmi
C Hand
CH Mazel
CV Palmer
D Shcherbo
DJ Miller
DM Chudakov
DQ Matus
E Renfer
F Rentzsch
F Yang
G Genikhovich
HT Kao
IV Kelmanson
JD Thompson
JH Fritzenwanker
JR Finnerty
K Tamura
M Mavrakis
M Murate
M Ormo
M Saina
Matthew C. Gibson
ME Protas
MQ Martindale
NC Shaner
NH Putnam
NO Alieva
Patrick Callaerts
PM Burton
RC Edgar
RM Wachter
Y Sun
YA Labas
Publication venue: Public Library of Science
Publication date: 01/07/2010

In recent years, the sea anemone Nematostella vectensis has emerged as a critical model organism for comparative genomics and developmental biology. Although Nematostella is a member of the anthozoan cnidarians (known for producing an abundance of diverse fluorescent proteins (FPs)), endogenous patterns of Nematostella fluorescence have not been described and putative FPs encoded by the genome have not been characterized. We described the spatiotemporal expression of endogenous red fluorescence during Nematostella development. Spatially, there are two patterns of red fluorescence, both restricted to the oral endoderm in developing polyps. One pattern is found in long fluorescent domains associated with the eight mesenteries and the other is found in short fluorescent domains situated between tentacles. Temporally, the long domains appear simultaneously at the 12-tentacle stage. In contrast, the short domains arise progressively between the 12- and 16-tentacle stages. To determine the source of the red fluorescence, we used bioinformatic approaches to identify all possible putative Nematostella FPs and a Drosophila S2 cell culture assay to validate NvFP-7R, a novel red fluorescent protein. We report that both the mRNA expression pattern and spectral signature of purified NvFP-7R closely match those of the endogenous red fluorescence. Strikingly, the red fluorescent pattern of NvFP-7R exhibits asymmetric expression along the directive axis, indicating that the nvfp-7r locus senses the positional information of the body plan. At the tissue level, NvFP-7R exhibits an unexpected subcellular localization and a complex complementary expression pattern in apposed epithelial sheets comprising each endodermal mesentery. These experiments not only identify NvFP-7R as a novel red fluorescent protein that could be employed as a research tool; they also uncover an unexpected spatiotemporal complexity of gene expression in an adult cnidarian.
Perhaps most importantly, our results define Nematostella as a new model organism for understanding the biological function of fluorescent proteins in vivo.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Bovine Genome Database: supporting community annotation and analysis of the Bos taurus genome

Author: A Kasprzyk
AA Salamov
AV Zimin
Bovine Genome Sequencing and Analysis Consortium
C Michael Dickens
CG Elsik
Christine G Elsik
Christopher P Childers
CJ Mungall
Donald C Vile
G Parra
GS Slater
Jaideep P Sundaram
Justin T Reese
K Eilbeck
KD Pruitt
Kevin L Childs
LD Stein
MS Boguski
P Flicek
RJ Wilson
SE Lewis
SF Altschul
TD Wu
The UniProt Consortium
V Solovyev
Y Liu
Publication venue: BioMed Central
Publication date: 01/01/2010

Crossref

Springer - Publisher Connector

PubMed Central

Texas A&M Repository

Using ESTs to improve the accuracy of de novo gene prediction

Author: A Krogh
AA Salamov
AC Siepel
C Wei
Chaochun Wei
DR Maglott
E Birney
I Korf
JE Allen
KD Pruitt
KL Howe
L Stein
LW Hillier
M Stanke
MG Reese
Michael R Brent
MJ van Baren
MR Brent
MS Boguski
P Flicek
R Guigo
R Guigó
R Mott
RA Gibbs
RH Brown
RH Waterston
S Foissac
SS Gross
The MGC Project Team
TW Harris
VV Solovyev
WJ Kent
Publication venue: BioMed Central
Publication date: 01/07/2006

BACKGROUND: ESTs are a tremendous resource for determining the exon-intron structures of genes, but even extensive EST sequencing tends to leave many exons and genes untouched. Gene prediction systems based exclusively on EST alignments miss these exons and genes, leading to poor sensitivity. De novo gene prediction systems, which ignore ESTs in favor of genomic sequence, can predict such "untouched" exons, but they are less accurate when predicting exons to which ESTs align. TWINSCAN is the most accurate de novo gene finder available for nematodes and N-SCAN is the most accurate for mammals, as measured by exact CDS gene prediction and exact exon prediction. RESULTS: TWINSCAN_EST is a new system that successfully combines EST alignments with TWINSCAN. On the whole C. elegans genome, TWINSCAN_EST shows a 14% improvement in sensitivity and a 13% improvement in specificity in predicting exact gene structures compared to TWINSCAN without EST alignments. Not only are the structures revealed by EST alignments predicted correctly, but these also constrain the predictions without alignments, improving their accuracy. For the human genome, we used the same approach with N-SCAN, creating N-SCAN_EST. On the whole genome, N-SCAN_EST produced a 6% improvement in sensitivity and a 1% improvement in specificity of exact gene structure predictions compared to N-SCAN. CONCLUSION: TWINSCAN_EST and N-SCAN_EST are more accurate than TWINSCAN and N-SCAN, while retaining their ability to discover novel genes to which no ESTs align. Thus, we recommend using the EST versions of these programs to annotate any genome for which EST information is available. TWINSCAN_EST and N-SCAN_EST are part of the TWINSCAN open source software package.
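The sensitivity and specificity figures quoted in this abstract refer to exact matches between predicted and annotated structures. A minimal sketch of how such figures could be computed is given below. It assumes the usual gene-finding convention, in which sensitivity is the fraction of annotated structures predicted exactly and specificity is the fraction of predictions that are exactly right; the abstract itself does not spell out the definitions, and the `Exon` tuple layout and the example coordinates are hypothetical.

```python
from typing import Set, Tuple

Exon = Tuple[str, int, int]  # hypothetical layout: (sequence id, start, end)


def exact_match_accuracy(annotated: Set[Exon],
                         predicted: Set[Exon]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) counting only exact coordinate matches."""
    true_pos = len(annotated & predicted)
    # Sensitivity: share of annotated exons that were predicted exactly.
    sensitivity = true_pos / len(annotated) if annotated else 0.0
    # Specificity (gene-finding sense): share of predictions that are exact.
    specificity = true_pos / len(predicted) if predicted else 0.0
    return sensitivity, specificity


# Made-up coordinates: four annotated exons, three predictions, two exact matches.
annotated = {("chrI", 100, 200), ("chrI", 300, 450),
             ("chrI", 600, 700), ("chrI", 900, 990)}
predicted = {("chrI", 100, 200), ("chrI", 300, 450), ("chrI", 610, 700)}
sn, sp = exact_match_accuracy(annotated, predicted)
```

Note that the third prediction overlaps an annotated exon but differs by ten bases at its start, so under exact matching it counts as neither a hit for sensitivity nor a correct prediction for specificity.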

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Cross-species protein sequence and gene structure prediction with fine-tuned Webscipio 2.0 and Scipio

Author: AA Salamov
AG Clark
Björn Hammesfahr
BM Tyler
C Burge
C Wei
E Birney
E Picardi
E van Nimwegen
ER Mardis
F Odronitz
G Butler
GS Slater
Holger Pillmann
Klas Hatje
M Deutsch
M Srivastava
M Stanke
Martin Kollmar
MJ Benton
MJ Gardner
N Goto
O Keller
Oliver Keller
RF Yeh
SE Prochnik
SF Altschul
SJ Yoon
Stephan Waack
SW Roy
V Solovyev
VN Babenko
WJ Kent
Publication venue: BioMed Central
Publication date: 01/01/2011

BACKGROUND: Obtaining transcripts of homologs of closely related organisms and retrieving the reconstructed exon-intron patterns of the genes is a very important process during the analysis of the evolution of a protein family and the comparative analysis of the exon-intron structure of a certain gene from different species. Due to the ever-increasing speed of genome sequencing, the gap to genome annotation is growing. Thus, tools for the correct prediction and reconstruction of genes in related organisms become more and more important. The tool Scipio, which can also be used via the graphical interface WebScipio, performs significant hit processing of the output of the Blat program to account for sequencing errors, missing sequence, and fragmented genome assemblies. However, Scipio has so far been limited to high sequence similarity and unable to reconstruct short exons. RESULTS: Scipio and WebScipio have fundamentally been extended to better reconstruct very short exons and intron splice sites and to be better suited for cross-species gene structure predictions. The Needleman-Wunsch algorithm has been implemented for the search for short parts of the query sequence that were not recognized by Blat. Those regions might either be short exons, divergent sequence at intron splice sites, or very divergent exons. We have shown the benefit and use of new parameters with several protein examples from completely different protein families in searches against species from several kingdoms of the eukaryotes. The performance of the new Scipio version has been tested in comparison with several similar tools. CONCLUSIONS: With the new version of Scipio very short exons, terminal and internal, of even just one amino acid can correctly be reconstructed. Scipio is also able to correctly predict almost all genes in cross-species searches even if the ancestors of the species separated more than 100 Myr ago and if the protein sequence identity is below 80%.
For our test cases Scipio outperforms all other software tested. WebScipio has been restructured and provides easy access to the genome assemblies of about 640 eukaryotic species. Scipio and WebScipio are freely accessible at http://www.webscipio.org.
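The Needleman-Wunsch algorithm mentioned in this abstract is the standard dynamic-programming method for global sequence alignment. The sketch below computes only the optimal global alignment score, using two rows of the scoring table. The match, mismatch and gap values are illustrative assumptions and are not the parameters used by Scipio, which the abstract does not state.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Optimal global alignment score of a and b with a linear gap penalty."""
    # Row 0: aligning the empty prefix of a against prefixes of b costs only gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i, char_a in enumerate(a, start=1):
        # Column 0: aligning a prefix of a against the empty prefix of b.
        curr = [i * gap]
        for j, char_b in enumerate(b, start=1):
            diagonal = prev[j - 1] + (match if char_a == char_b else mismatch)
            gap_in_b = prev[j] + gap      # char_a aligned to a gap
            gap_in_a = curr[j - 1] + gap  # char_b aligned to a gap
            curr.append(max(diagonal, gap_in_b, gap_in_a))
        prev = curr
    return prev[-1]


# Three matches and one gap under the illustrative scores above.
score = needleman_wunsch_score("ACGT", "AGT")
```

Because the alignment is global, every residue of a short query fragment must be placed somewhere in the target region, which is one reason the method suits the search for short query segments that a heuristic aligner did not place.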

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MPG.PuRe

TOPAZ1, a Novel Germ Cell-Specific Expressed Gene Conserved during Evolution across Vertebrates

Author: A Aravin
A Baillet
A Girard
A Lingel
A McLaren
A Rajkovic
A Roy
A Suzuki
A Swain
AA Aravin
AA Salamov
Adrienne Baillet
Alix Luangpraseuth
B Capel
B Gondos
B Mandon-Pepin
B Petre-Lazar
Béatrice Mandon-Pépin
C Olesen
C Tingen
CL Small
Corinne Cotinot
CR Nicholas
D Wang
DB Menke
DJ Ballow
DJ Trombly
DK Bishop
DL Pittman
Dominique Thépot
E Trautmann
EL Anderson
Elodie Poumerol
Eric Pailhoux
F Bernex
F Yang
Gabriel Livera
H Li
H Wartenberg
HR Sawyer
I Konishi
J Bowles
J Brennecke
J Koubova
JD Thompson
Jean Nicolas Volff
JJ Song
JK Henderson
JM Berg
K Ohta
K Saito
K Yoshida
K Zheng
KS Yan
L Cerutti
L Herrera
L Ma
M Bullejos
M Mark
MA Carmell
MA Edson
ME Pepling
N Galtier
N Saitou
N Zamudio
NC Lau
PJ Wang
Renee A. Reijo Pera
RJ Frost
Ronan Le Bouffant
RS Brown
S Grimmond
S Joshi
S Kumar
S Kuramochi-Miyagawa
S Nef
SA Pangas
SL Michel
SM Soyal
SP Krotz
ST Grivna
TM Hall
VV Vagin
W Deng
WS Lai
Y Choi
Y Lin
Y Unhavaithaya
Publication venue: Public Library of Science
Publication date: 01/01/2011

BACKGROUND: We had previously reported that the Suppression Subtractive Hybridization (SSH) approach was relevant for the isolation of new mammalian genes involved in oogenesis and early follicle development. Some of these transcripts might be potential new oocyte and granulosa cell markers. We have now characterized one of them, named TOPAZ1 for the Testis and Ovary-specific PAZ domain gene. PRINCIPAL FINDINGS: Sheep and mouse TOPAZ1 mRNA have 4,803 bp and 4,962 bp open reading frames (20 exons), respectively, and encode putative TOPAZ1 proteins containing 1,600 and 1,653 amino acids. They possess PAZ and CCCH domains. In sheep, TOPAZ1 mRNA is preferentially expressed in females during fetal life with a peak during prophase I of meiosis, and in males during adulthood. In the mouse, Topaz1 is a germ cell-specific gene. TOPAZ1 protein is highly conserved in vertebrates and specifically expressed in mouse and sheep gonads. It is localized in the cytoplasm of germ cells from the sheep fetal ovary and mouse adult testis. CONCLUSIONS: We have identified a novel PAZ-domain protein that is abundantly expressed in the gonads during germ cell meiosis. The expression pattern of TOPAZ1, and its high degree of conservation, suggest that it may play an important role in germ cell development. Further characterization of TOPAZ1 may elucidate the mechanisms involved in gametogenesis, and particularly in the RNA silencing process in the germ line.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

Patterns of Sequence Divergence and Evolution of the S1 Orthologous Regions between Asian and African Cultivated Rice Species

Author: A Garavito
A Gusti
A Navarro
AA Hoffmann
AA Salamov
Alain Ghesquière
Andrea Garavito
C Chaparro
C Rizzon
CI Wu
CP Martinez
D Bikard
D Vaughan
DQ Fuller
E Bossolini
EL Sonnhammer
F Lu
F Thibaud-Nissen
Frédérick Gavory
G Second
G Xu
H Kim
HJ Muller
HY Lee
J Chen
J Jurka
J Ma
J Orjuela
JA Coyne
JA Pesin
JD Thompson
JF Abril
Joe Tohme
JP Masly
JS Ammiraju
K Doi
K Rutherford
LH Rieseberg
M Jain
M Sweeney
MA Noor
Mathias Lorieux
Miguel Blazquez
N Jiang
N Osada
N Sarla
NJ Brideau
P Rice
Q Zhu
R Portères
RJ Kulathinal
Romain Guyot
S Kikuchi
S Ouyang
SS Murray
Sylvie Samain
T Dobzhansky
T Tang
TJ Carver
TL Turner
W Bateson
Y Koide
Y Long
Y Mizuta
Y Sano
Y Wang
Y Yamagata
Publication venue: Public Library of Science
Publication date: 01/01/2011

A strong postzygotic reproductive barrier separates the recently diverged Asian and African cultivated rice species, Oryza sativa and O. glaberrima. Recently, a model of genetic incompatibilities between three adjacent, epistatically interacting loci, S1A, S1 and S1B (together called the S1 regions), was postulated to cause the allelic elimination of female gametes in interspecific hybrids. Two candidate factors for the S1 locus (including a putative F-box gene) were proposed, but candidates for S1A and S1B remained undetermined. Here, to better understand the basis of the evolution of regions involved in reproductive isolation, we studied the genic and structural changes accumulated in the S1 regions between orthologous sequences. First, we established an 813 kb genomic sequence in O. glaberrima, completely covering the S1A and S1 regions and the majority of the S1B region, and compared it with the orthologous regions of O. sativa. An overall strong structural conservation was observed, with the exception of three isolated regions of disturbed collinearity: (1) a local invasion of transposable elements around a putative F-box gene within S1, (2) the multiple duplication and subsequent divergence of the same F-box gene within S1A, (3) an interspecific chromosomal inversion in S1B, which restricts recombination in our O. sativa×O. glaberrima crosses. Besides these few structural variations, a uniform conservative pattern of coding sequence divergence was found all along the S1 regions. Hence, the S1 regions have undergone no drastic variation in their recent divergence and evolution between O. sativa and O. glaberrima, suggesting that a small accumulation of genic changes, following a Bateson-Dobzhansky-Muller (BDM) model, might be involved in the establishment of the sterility barrier.
In this context, genetic incompatibilities involving the duplicated F-box genes as putative candidates, and a possible strengthening step involving the chromosomal inversion, might contribute to the reproductive barrier between Asian and African rice species.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

CGSpace

Horizon / Pleins textes

Comparative analysis of information contents relevant to recognition of introns in many species

Author: A Levine
AA Patel
AA Salamov
AE Vinogradov
C Burge
C Guthrie
CB Burge
CJ Langford
DL Black
DV Lu
E Pruesse
EV Koonin
G Ast
G Kol
GJ Goodall
GS Slater
H Banerjee
Hiroaki Iwata
J Felsenstein
J MacQueen
JA Berglund
JD Beggs
JM Izquierdo
JU Pontius
K Katoh
K Wiebauer
KL Fox-Walsh
L Collins
LP Lim
M Borodovsky
M Chen
M Davila Lopez
M Lynch
M Marz
N Sheth
NF Kaufer
NL Harris
O Gotoh
Osamu Gotoh
P Puigbo
PA Sharp
PHA Sneath
PP Gardner
S Kullback
SH Schwartz
SL Salzberg
SP Lloyd
TR Gregory
V Anantharaman
V Brendel
V Douris
W Zhu
WH Majoros
Y Kapustin
Publication venue: Springer Science and Business Media LLC
Publication date

Crossref

Histoplasma capsulatum proteome response to decreased iron availability

Author: AA Salamov
Alan G Smulian
Brittany Catron
CC Philpott
CY Lan
D Becker
Daniel S Spellman
DH Howard
DL Aylor
E Birney
E Masse
Francisco J Gomez
George S Deepe
HL Allen
J Mattow
J Sieper
J Tamarit
KI Orsborn
L Ding
L Hwang
L Shi
LG Eissenberg
LH Hwang
LT Nakamura
Margarita Hernandez
Michael S Winters
MM Timmerman
MP Nittler
MV Cano
O Kniemeyer
PC Albuquerque
PJ Anderson
Qilin Chan
R Allendoerfer
R Zarnowski
SL Newman
SM Twine
T Lian
TE Lane
Thomas A Neubert
W Mandell
WW Fish
X Huang
XW Zhou
Publication venue: BioMed Central
Publication date: 01/01/2008

BACKGROUND: A fundamental pathogenic feature of the fungus Histoplasma capsulatum is its ability to evade innate and adaptive immune defenses. Once ingested by macrophages, the organism is faced with several hostile environmental conditions including iron limitation. H. capsulatum can establish a persistent state within the macrophage. A gap in knowledge exists because the identities and number of proteins regulated by the organism under host conditions have yet to be defined. Lack of such knowledge is an important problem because until these proteins are identified it is unlikely that they can be targeted in new and innovative treatments for histoplasmosis. RESULTS: To investigate the proteomic response by H. capsulatum to decreasing iron availability we have created H. capsulatum protein/genomic databases compatible with current mass spectrometric (MS) search engines. Databases were assembled from the H. capsulatum G217B strain genome using gene prediction programs and expressed sequence tag (EST) libraries. Searching these databases with MS data generated from two-dimensional (2D) in-gel digestions of proteins resulted in over 50% more proteins identified compared to searching the publicly available fungal databases alone. Using 2D gel electrophoresis combined with statistical analysis we discovered 42 H. capsulatum proteins whose abundance was significantly modulated when iron concentrations were lowered. Altered proteins were identified by mass spectrometry and database searching to be involved in glycolysis, the tricarboxylic acid cycle, lysine metabolism and protein synthesis, along with one protein sequence whose function was unknown. CONCLUSION: We have created a bioinformatics platform for H. capsulatum and demonstrated the utility of a proteomic approach by identifying a shift in metabolism the organism utilizes to cope with the hostile conditions provided by the host.
We have shown that enzyme transcripts regulated by other fungal pathogens in response to lowering iron availability are also regulated in H. capsulatum at the protein level. We also identified H. capsulatum proteins sensitive to iron level reductions which have yet to be connected to iron availability in other pathogens. These data also indicate the complexity of the response by H. capsulatum to nutritional deprivation. Finally, we demonstrate the importance of a strain-specific gene/protein database for H. capsulatum proteomic analysis.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Multiomics in the central Arctic Ocean for benchmarking biodiversity change

Author: Balmonte J-P
Barry K
Bertilsson S
Boulton W
Bowman J
Bratbak G
Buck M
Chamberlain EJ
Chen I-MA
Clum A
Copeland A
Creamean J
Cunliffe M
Daum C
Ebenhöh O
Eggers SL
Eloe-Fadrosh E
Fong AA
Foster B
Gardner J
Gradinger R
Granskog MA
Grigoriev IV
Havermans C
Hill T
Hoppe CJM
Huntemann M
Ivanova N
Korte K
Kuo A
Kyrpides NC
Larsen A
Leggett RM
Metfies K
Mock T
Moulton V
Mukherjee S
Müller O
Nicolaus A
Oldenburg E
Palaniappan K
Popa O
Reddy TBK
Rogge S
Roux S
Salamov A
Schäfer H
Shoemaker K
Snoeijs-Leijonmalm P
Torstensson A
Vader A
Valentin K
Varghese N
Woyke T
Wu D
Publication venue: Public Library of Science (PLoS)
Publication date: 01/10/2022

Multiomics approaches need to be applied in the central Arctic Ocean to benchmark biodiversity change and to identify novel species and their genes. As part of MOSAiC, EcoOmics will therefore be essential for conservation and sustainable bioprospecting in one of the least explored ecosystems on Earth.

Plymouth Electronic Archive and Research Library