PPStruct: a database of plant protein structures and annotations
Motivation: During the last ten years, the development of high-throughput sequencing has generated a huge amount of genome sequences. Giving biological meaning to these data depends entirely on the capacity to develop instruments for their interpretation and organization. Moreover, once the protein sequences have been identified, functional
annotation requires dedicated usage of an enormous amount of bioinformatics resources
and specialized databases. Sequence annotation is often inaccurate, and reliable predictions can only be obtained by using structure-based functional annotation methods. These methods require the three-dimensional structure of the identified proteins. The experimental solution of protein structures is very time consuming and cannot be applied to all proteins in a genome, so it has to be replaced with computational homology models. It is
currently estimated that well over half of the known protein sequences can be predicted in this way. Plant genomics, despite its importance, started later than animal genomics. Currently there are fewer than ten plant genomes available in genome browsers and a few more at the “draft genome” level. In light of this limited amount of available data, any conclusion regarding peculiar plant characteristics has to be considered provisional and treated with caution. Plant genomes have so far mostly been annotated by hand, with an enormous expenditure of financial and human resources. Genome annotation for plants has to transition
from prevalently manual towards fully automated annotation, with possible manual
supervision, and is in serious need of new tools to permit this transition.
Methods: The PPStruct database and website were designed with a multitier architecture, using separate modules for data management, data processing and presentation functions. To simplify development and maintenance, all tiers handle the common JSON (JavaScript Object Notation) format, thereby eliminating the need for data conversion. The MongoDB database engine is used for data storage and Node.js as middleware between data and presentation. PPStruct exposes its resources through RESTful web services, using the Restify library for Node.js. The Angular.js framework and Bootstrap library were selected to provide the overall look and feel. Additional information is added to entries by querying the PDB and UniProt web services. Currently, the genomes available in the database are annotated for the following features:
Domain assignment: InterPro tool set (Hunter et al., 2009)
Secondary structure: fastSS (Walsh et al., unpublished)
Disordered regions: MobiDB (Di Domenico et al., Bioinformatics 2012)
Homology modelling: HOMER (URL: http://protein.bio.unipd.it/homer/)
Results: Here we present PPStruct, a pipeline and a database dedicated to plant functional annotation. Our effort takes into account several specific aspects exploiting plant differences. The protein structure level is brought into play with the aim of better explaining the effects of phenotypic differences at the molecular level. Reliable models are built for each gene transcript identified, and the models will be used to better define the function of each protein. The PPStruct website is currently under development but will be available soon at http://ppstruct.bio.unipd.it
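The idea that every tier handles the same JSON document can be illustrated with a minimal sketch. It is written in Python purely for illustration (the abstract names MongoDB, Node.js and Restify as the actual stack), and every field name, the identifier `EXAMPLE_0001`, and the three helper functions are hypothetical stand-ins rather than PPStruct's real schema or API.

```python
import json

# Hypothetical PPStruct-style entry; all field names and values are assumptions.
entry = {
    "protein_id": "EXAMPLE_0001",
    "domains": [{"source": "InterPro", "start": 10, "end": 120}],
    "secondary_structure": "CCHHHHEEEC",
    "disordered_regions": [{"start": 1, "end": 9}],
    "homology_model": {"tool": "HOMER", "available": True},
}


def store(document: dict) -> str:
    """Data tier stand-in: persist the entry as JSON text."""
    return json.dumps(document, sort_keys=True)


def serve(stored: str) -> str:
    """Middleware stand-in: forward the stored JSON unchanged as a response body."""
    return stored


def render(body: str) -> str:
    """Presentation stand-in: parse the JSON and build a one-line summary."""
    doc = json.loads(body)
    return (
        f"{doc['protein_id']}: {len(doc['domains'])} domain(s), "
        f"{len(doc['disordered_regions'])} disordered region(s)"
    )


summary = render(serve(store(entry)))
print(summary)
```

Because the same structure passes through all three functions, no tier needs a format-specific converter, which is the maintenance benefit the abstract describes.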
Unfoldome variation upon plant-pathogen interactions: strawberry infection by Colletotrichum acutatum
Intrinsically disordered proteins (IDPs) are proteins that lack secondary and/or tertiary structure under physiological conditions. These proteins are very abundant in eukaryotic proteomes and play crucial roles in all molecular mechanisms underlying the response to environmental challenges. In plants, different IDPs involved in stress response have been identified and characterized. Nevertheless, a comprehensive evaluation of protein disorder in plant proteomes under abiotic or biotic stresses is not available so far. In the present work, the transcriptome dataset of strawberry (Fragaria x ananassa) fruits interacting with the fungal pathogen Colletotrichum acutatum was actualized onto the woodland strawberry (Fragaria vesca) genome. The obtained cDNA sequences were translated into protein sequences, which were subsequently subjected to disorder analysis. The results, providing the first estimation of disorder abundance associated with plant infection, showed that the proteome activated in the strawberry red fruit during active fungal propagation is remarkably depleted in disorder. On the other hand, in the resistant white fruit, no significant disorder reduction is observed in the proteins expressed in response to fungal infection. Four representative proteins, FvSMP, FvPRKRIP, FvPCD-4 and FvFAM32A-like, predicted as mainly disordered and never experimentally characterized before, were isolated, and the absence of structure was validated at the secondary and tertiary level using circular dichroism and differential scanning fluorimetry. Their quaternary structure was also established using light scattering. The results are discussed considering the role of protein disorder in plant defense.
A CRY FROM THE KRILL
Antarctic krill (Euphausia superba) inhabit a region with strong seasonality in several
parameters, such as photoperiod, light intensity, extent of sea ice, and food availability.
In particular, seasonal changes in environmental light regimes have been shown to
strongly influence krill metabolism, representing control signals for seasonal regulation
of physiology of this key Southern Ocean species. Here, we report the identification of
a cryptochrome gene, a cardinal component of the clockwork machinery in several
organisms. EsCRY appears to be an ortholog of mammalian-like CRYs and clusters
with the insect CRY2 subfamily. EsCRY has the canonical bipartite CRY structure, with
a conserved N-terminal domain and a highly divergent C-terminus that bears several
binding motifs, some of them shared with insect CRY2 and others peculiar to EsCRY.
We have evaluated the temporal expression of Escry both at mRNA and protein levels
in individuals harvested from the Ross Sea at different times throughout the 24 h cycle
during the Antarctic summer. We observed a daily fluctuation in abundance for Escry
mRNA in the head, with high levels around 06:00 h, which is not mirrored by a cycle
in the corresponding protein. Our findings represent a first step toward establishing
the presence of an endogenous circadian time-keeping mechanism that might allow
this organism to synchronize its physiology and behavior to the Antarctic light regimes.
Characterization of intellectual disability and autism comorbidity through gene panel sequencing
Aspromonte M.C.; Bellini M.; Gasparini A.; Carraro M.; Bettella E.; Polli R.; Cesca F.; Bigoni S.; Boni S.; Carlet O.; Negrin S.; Mammi I.; Milani D.; Peron A.; Sartori S.; Toldo I.; Soli F.; Turolla L.; Stanzial F.; Benedicenti F.; Marino-Buslje C.; Tosatto S.C.E.; Murgia A.; Leonardi E.
Performance of computational methods for the evaluation of pericentriolar material 1 missense variants in CAGI-5
The CAGI-5 pericentriolar material 1 (PCM1) challenge aimed to predict the effect of 38 transgenic human missense mutations in the PCM1 protein implicated in schizophrenia. Participants were provided with 16 benign variants (negative controls), 10 hypomorphic, and 12 loss-of-function variants. Six groups participated and were asked to predict the probability of effect and the standard deviation associated with each mutation. Here, we present the challenge assessment. Prediction performance was evaluated using different measures to arrive at a final ranking that highlights the strengths and weaknesses of each group. The results show a great variety of predictions, with some methods performing significantly better than others. Benign variants played an important role as negative controls, highlighting predictors biased to identify disease phenotypes. The best predictor, Bromberg lab, used a neural-network-based method able to discriminate between neutral and non-neutral single nucleotide polymorphisms. The CAGI-5 PCM1 challenge allowed us to evaluate the state-of-the-art techniques for interpreting the effect of novel variants for a difficult target protein.
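The abstract does not name the performance measures used in the assessment. As one plausible illustration, the sketch below computes the area under the ROC curve (AUC) for separating benign controls from variants with an effect, given predicted probabilities. The choice of AUC, the function name `roc_auc`, and the toy labels and scores are all assumptions; none of the numbers come from the challenge.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up toy data: 1 = variant with an effect, 0 = benign control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.5, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

A predictor biased towards calling every variant damaging would give benign controls high scores too, which lowers a measure of this kind; this is why the negative controls matter in the ranking.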
Modeling Structural Heterogeneity in Proteins From X-Ray Data
Abstract: In a crystallographic experiment, a protein is precipitated to obtain a crystalline sample (crystal) containing many copies of the molecule. An electron density map (edm) is calculated from diffraction images obtained by focusing X-rays through the sample at different angles. This involves iterative phase determination and density calculation. The protein conformation is modeled by placing the atoms in 3-D space to best match the electron density. In practice, the copies of a protein in a crystal are not exactly in the same conformation. Consequently, the obtained edm, which corresponds to the cumulative distribution of atomic positions over all conformations, is blurred. Existing modeling methods compute an “average” protein conformation by maximizing its fit with the edm and explain structural heterogeneity in the crystal with a harmonic distribution of the position of each atom. However, proteins undergo coordinated conformational variations leading to substantial correlated changes in atomic positions. These variations are biologically important. This paper presents a sample-select approach to model structural heterogeneity by computing an ensemble of conformations (along with occupancies) that, collectively, provide a near-optimal explanation of the edm. The focus is on deformable protein fragments, mainly loops and side-chains. Tests were successfully conducted on simulated and experimental edms.
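The select step can be illustrated with a deliberately small toy, which is not the paper's algorithm. It assumes that each sampled conformation has already been turned into a density vector on a shared grid, that the ensemble has exactly two members, and that fit is measured by squared error; the function names and all numbers are hypothetical.

```python
from itertools import combinations
from typing import List, Optional, Sequence, Tuple


def best_occupancy(obs: Sequence[float], a: Sequence[float],
                   b: Sequence[float]) -> Tuple[float, float]:
    """Occupancy w in [0, 1] minimizing ||obs - (w*a + (1-w)*b)||^2, plus the residual."""
    diff = [x - y for x, y in zip(a, b)]
    rest = [o - y for o, y in zip(obs, b)]
    denom = sum(d * d for d in diff)
    # Closed-form 1-D least squares; clamping is exact because the objective is convex in w.
    w = 0.5 if denom == 0 else sum(r * d for r, d in zip(rest, diff)) / denom
    w = min(1.0, max(0.0, w))
    residual = sum((o - (w * x + (1 - w) * y)) ** 2 for o, x, y in zip(obs, a, b))
    return w, residual


def select_pair(obs: Sequence[float],
                candidates: List[Sequence[float]]) -> Tuple[int, int, float, float]:
    """Pick the pair of sampled candidates (and occupancy) that best explains obs."""
    best: Optional[Tuple[int, int, float, float]] = None
    for i, j in combinations(range(len(candidates)), 2):
        w, residual = best_occupancy(obs, candidates[i], candidates[j])
        if best is None or residual < best[3]:
            best = (i, j, w, residual)
    if best is None:
        raise ValueError("need at least two candidates")
    return best


# Toy densities on a 4-point grid; the observed map mixes candidates 0 and 2 at 70/30.
candidates = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
observed = [0.7, 0.0, 0.0, 0.3]
i, j, w, residual = select_pair(observed, candidates)
print(i, j, round(w, 3))
```

The point of the toy is that a blurred map is explained better by a weighted mixture of distinct conformations than by any single one; a realistic implementation would need larger ensembles, real density calculation, and a proper constrained optimizer.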
DOME: recommendations for supervised machine learning validation in biology