Using Neural Networks for Relation Extraction from Biomedical Literature
Using different sources of information to support the automated extraction of
relations between biomedical concepts contributes to the development of our
understanding of biological systems. The primary comprehensive source of these
relations is biomedical literature. Several relation extraction approaches have
been proposed to identify relations between concepts in biomedical literature,
in particular using neural network algorithms. The use of multichannel architectures
composed of multiple data representations, as in deep neural networks, is
leading to state-of-the-art results. The right combination of data
representations can eventually lead us to even higher evaluation scores in
relation extraction tasks. Thus, biomedical ontologies play a fundamental role
by providing semantic and ancestry information about an entity. The
incorporation of biomedical ontologies has already been shown to improve on
previous state-of-the-art results.
Comment: Artificial Neural Networks book (Springer) - Chapter 1
FragKB: Structural and Literature Annotation Resource of Conserved Peptide Fragments and Residues
BACKGROUND: FragKB (Fragment Knowledgebase) is a repository of clusters of structurally similar fragments from proteins. Fragments are annotated with information at the level of sequence, structure and function, integrating biological descriptions derived from multiple existing resources and text mining.
METHODOLOGY: FragKB contains approximately 400,000 conserved fragments from 4,800 representative proteins from PDB. Literature annotations are extracted from more than 1,700 articles and are available for over 12,000 fragments. The underlying systematic annotation workflow of FragKB ensures efficient update and maintenance of this database. The information in FragKB can be accessed through a web interface that facilitates sequence and structural visualization of fragments together with known literature information on the consequences of specific residue mutations and functional annotations of proteins and fragment clusters. FragKB is accessible online at http://ubio.bioinfo.cnio.es/biotools/fragkb/.
SIGNIFICANCE: The information presented in FragKB can be used for modeling protein structures, for designing novel proteins and for functional characterization of related fragments. The current release is focused on functional characterization of proteins through inspection of conservation of the fragments.
Improved mutation tagging with gene identifiers applied to membrane protein stability prediction
Background
The automated retrieval and integration of information about protein point mutations in combination with structure, domain and interaction data from literature and databases promises to be a valuable approach to study structure-function relationships in biomedical data sets.
Results
We developed a rule- and regular expression-based protein point mutation retrieval pipeline for PubMed abstracts, which achieves an F-measure of 87% for the mutation retrieval task on a benchmark dataset. To link mutations to their proteins, we use a named entity recognition algorithm to identify gene names co-occurring in the abstract, and establish links based on sequence checks. Conversely, we show that gene recognition improved from 77% to 91% F-measure when mutation information given in the text is taken into account. To demonstrate practical relevance, we use mutation information from text to evaluate a novel solvation energy-based model for the prediction of stabilizing regions in membrane proteins. For five G protein-coupled receptors we identified 35 relevant single mutations and associated phenotypes, none of which had been annotated in the UniProt or PDB databases. In 71% of cases, the reported phenotypes were consistent with the model predictions, supporting a relation between mutations and stability issues in membrane proteins.
Conclusion
We present a reliable approach for the retrieval of protein mutations from PubMed abstracts for any set of genes or proteins of interest. We further demonstrate how amino acid substitution information from text can be utilized for protein structure stability studies on the basis of a novel energy model.
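The regular expression-based retrieval and the sequence check described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the pattern, the function names find_point_mutations and consistent_with_sequence, and the restriction to the "A123T" and "Ala123Thr" notations are all assumptions made for illustration. A real system would need many more patterns and would produce false positives on strings such as protein names that happen to look like mutations.

```python
import re

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE = "".join(sorted(THREE_TO_ONE.values()))
THREE = "|".join(THREE_TO_ONE)

# Hypothetical pattern: wild-type residue, 1-based position, mutant residue,
# written either as "S625A" or as "Ser625Ala".
PATTERN = re.compile(
    rf"\b(?:(?P<w1>[{ONE}])(?P<p1>[1-9]\d*)(?P<m1>[{ONE}])"
    rf"|(?P<w3>{THREE})(?P<p3>[1-9]\d*)(?P<m3>{THREE}))\b"
)


def find_point_mutations(text):
    """Return (wild_type, position, mutant) tuples in one-letter code."""
    hits = []
    for match in PATTERN.finditer(text):
        if match.group("w1"):
            wt, pos, mt = match.group("w1"), match.group("p1"), match.group("m1")
        else:
            wt = THREE_TO_ONE[match.group("w3")]
            pos = match.group("p3")
            mt = THREE_TO_ONE[match.group("m3")]
        if wt != mt:  # identical residues are not a substitution
            hits.append((wt, int(pos), mt))
    return hits


def consistent_with_sequence(mutation, sequence):
    """Sequence check: does the wild-type residue occur at that position?"""
    wt, pos, _ = mutation
    return 1 <= pos <= len(sequence) and sequence[pos - 1] == wt
```

For example, find_point_mutations("The S625A isoform and the Ala123Thr variant") returns [("S", 625, "A"), ("A", 123, "T")]. A candidate gene would then be linked to a mutation only if consistent_with_sequence holds for that gene's protein sequence.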
*-DCC: A platform to collect, annotate, and explore a large variety of sequencing experiments.
Background
Over the past few years the variety of experimental designs and protocols for sequencing experiments increased greatly. To ensure the wide usability of the produced data beyond an individual project, rich and systematic annotation of the underlying experiments is crucial.
Findings
We first developed an annotation structure that captures the overall experimental design as well as the relevant details of the steps from the biological sample to the library preparation, the sequencing procedure, and the sequencing and processed files. Through various design features, such as controlled vocabularies and different field requirements, we ensured a high annotation quality, comparability, and ease of annotation. The structure can be easily adapted to a large variety of species. We then implemented the annotation strategy in a user-hosted web platform with data import, query, and export functionality.
Conclusions
We present here an annotation structure and user-hosted platform for sequencing experiment data, suitable for lab-internal documentation, collaborations, and large-scale annotation efforts.
An integrative methodology based on protein-protein interaction networks for identification and functional annotation of disease-relevant genes applied to channelopathies
Biologically data-driven networks have become powerful analytical tools that handle massive,
heterogeneous datasets generated from biomedical fields. Protein-protein interaction networks can identify the
most relevant structures directly tied to biological functions. Functional enrichments can then be performed based
on these structural aspects of gene relationships for the study of channelopathies. Channelopathies refer to a
complex group of disorders resulting from dysfunctional ion channels with distinct polygenic manifestations. This
study presents a semi-automatic workflow using protein-protein interaction networks that can identify the most
relevant genes and their biological processes and pathways in channelopathies to better understand their
etiopathogenesis. In addition, the clinical manifestations that are strongly associated with these genes are also
identified as the most characteristic in this complex group of diseases. This research provides a systems biology approach to extract information from interaction networks
of gene expression. We show how large-scale computational integration of heterogeneous datasets, PPI network
analyses, functional databases and published literature may support the detection and assessment of
potential therapeutic targets in the disease. Applying our workflow makes it feasible to spot the most relevant
genes and unknown relationships in channelopathies and shows its potential as a first-step approach to identify
both genes and functional interactions in clinical-knowledge scenarios of target diseases.
This work was supported by funds from MINECO-FEDER (TIN2016–81041-R to E.R.), European Human Brain Project SGA2 (H2020-RIA 785907 to M.J.S.), Junta de Andalucía (BIO-302 to F.J.E.) and MEIC (Systems Medicine Excellence Network, SAF2015–70270-REDT to F.J.E.).
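As a rough illustration of how a protein-protein interaction network can be used to shortlist candidate genes, the sketch below ranks nodes by their number of distinct interaction partners. The abstract does not state which relevance measure the workflow uses, so degree is an assumption chosen for simplicity. The function name rank_by_degree and the placeholder gene labels are hypothetical.

```python
from collections import defaultdict


def rank_by_degree(edges):
    """Rank genes by number of distinct interaction partners (degree).

    `edges` is an iterable of (gene_a, gene_b) pairs from an undirected
    PPI network. Degree is an assumed stand-in for "relevance"; it is not
    taken from the source.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:  # ignore self-interactions
            continue
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Highest degree first; ties broken alphabetically for determinism.
    return sorted(
        ((gene, len(partners)) for gene, partners in neighbours.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Placeholder labels, not real interaction data.
example_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3")]
ranking = rank_by_degree(example_edges)
```

Here ranking begins with ("G1", 3), the node with the most partners. In practice the top-ranked genes would be passed on to functional enrichment rather than interpreted on their own.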
Updates in metabolomics tools and resources: 2014-2015
Data processing and interpretation represent the most challenging and time-consuming steps in high-throughput metabolomic experiments, regardless of the analytical platforms (MS or NMR spectroscopy based) used for data acquisition. Improved machinery in metabolomics generates increasingly complex datasets that create the need for more and better processing and analysis software and in silico approaches to understand the resulting data. However, a comprehensive source of information describing the utility of the most recently developed and released metabolomics resources (in the form of tools, software, and databases) is currently lacking. Thus, here we provide an overview of freely available and open-source tools, algorithms, and frameworks to make both upcoming and established metabolomics researchers aware of the recent developments in an attempt to advance and facilitate data processing workflows in their metabolomics research. The major topics include tools and resources for data processing, data annotation, and data visualization in MS and NMR-based metabolomics. Most of the tools described in this review are dedicated to untargeted metabolomics workflows; however, some more specialist tools are described as well. All tools and resources described, including their analytical and computational platform dependencies, are summarized in an overview table.
Challenges in the association of human single nucleotide polymorphism mentions with unique database identifiers
Thomas PE, Klinger R, Furlong LI, Hofmann-Apitius M, Friedrich CM. Challenges in the association of human single nucleotide polymorphism mentions with unique database identifiers. BMC Bioinformatics. 2011;12(Suppl 4): S4
Quantitative and functional post-translational modification proteomics reveals that TREPH1 plays a role in plant thigmomorphogenesis
Plants can sense both intracellular and extracellular mechanical forces and
can respond through morphological changes. The signaling components responsible
for mechanotransduction of the touch response are largely unknown. Here, we
performed a high-throughput SILIA (stable isotope labeling in
Arabidopsis)-based quantitative phosphoproteomics analysis to profile changes
in protein phosphorylation resulting from 40 seconds of force stimulation in
Arabidopsis thaliana. Of the 24 touch-responsive phosphopeptides identified,
many were derived from kinases, phosphatases, cytoskeleton proteins, membrane
proteins and ion transporters. TOUCH-REGULATED PHOSPHOPROTEIN1 (TREPH1) and MAP
KINASE KINASE 2 (MKK2) and/or MKK1 became rapidly phosphorylated in
touch-stimulated plants. Both TREPH1 and MKK2 are required for touch-induced
delayed flowering, a major component of thigmomorphogenesis. The treph1-1 and
mkk2 mutants also exhibited defects in touch-inducible gene expression. A
non-phosphorylatable site-specific isoform of TREPH1 (S625A) failed to restore
touch-induced flowering delay of treph1-1, indicating the necessity of S625 for
TREPH1 function and providing evidence consistent with the possible functional
relevance of the touch-regulated TREPH1 phosphorylation. Bioinformatic analysis
and biochemical subcellular fractionation of TREPH1 protein indicate that it is
a soluble protein. Altogether, these findings identify new protein players in
Arabidopsis thigmomorphogenesis regulation, suggesting that protein
phosphorylation may play a critical role in plant force responses.
Big Data Analytics in Immunology: A Knowledge-Based Approach
With the vast amount of immunological data available, immunology research is entering the big data era. These data vary in granularity, quality, and complexity and are stored in various formats, including publications, technical reports, and databases. The challenge is to make the transition from data to actionable knowledge and wisdom and bridge the knowledge gap and application gap. We report a knowledge-based approach based on a framework called KB-builder that facilitates data mining by enabling fast development and deployment of web-accessible immunological data knowledge warehouses. Immunological knowledge discovery relies heavily on both the availability of accurate, up-to-date, and well-organized data and the proper analytics tools. We propose the use of knowledge-based approaches by developing knowledgebases combining well-annotated data with specialized analytical tools and integrating them into analytical workflow. A set of well-defined workflow types with rich summarization and visualization capacity facilitates the transformation from data to critical information and knowledge. By using KB-builder, we enabled streamlining of normally time-consuming processes of database development. The knowledgebases built using KB-builder will speed up rational vaccine design by providing accurate and well-annotated data coupled with tailored computational analysis tools and workflow