19,969 research outputs found
Extraction of Transcript Diversity from Scientific Literature
Transcript diversity generated by alternative splicing and associated mechanisms contributes heavily to the functional complexity of biological systems. The numerous examples of the mechanisms and functional implications of these events are scattered throughout the scientific literature. Thus, it is crucial to have a tool that can automatically extract the relevant facts and collect them in a knowledge base that can aid the interpretation of data from high-throughput methods. We have developed and applied a composite text-mining method for extracting information on transcript diversity from the entire MEDLINE database in order to create a database of genes with alternative transcripts. It contains information on tissue specificity, number of isoforms, causative mechanisms, functional implications, and experimental methods used for detection. We have mined this resource to identify 959 instances of tissue-specific splicing. Our results in combination with those from EST-based methods suggest that alternative splicing is the preferred mechanism for generating transcript diversity in the nervous system. We provide new annotations for 1,860 genes with the potential for generating transcript diversity. We assign the MeSH term "alternative splicing" to 1,536 additional abstracts in the MEDLINE database and suggest new MeSH terms for other events. We have successfully extracted information about transcript diversity and semiautomatically generated a database, LSAT, that can provide a quantitative understanding of the mechanisms behind tissue-specific gene expression. LSAT (Literature Support for Alternative Transcripts) is publicly available at http://www.bork.embl.de/LSAT/
GPCR-OKB: the G protein coupled receptor oligomer knowledge base
Rapid expansion of available data about G Protein Coupled Receptor (GPCR) dimers/oligomers over the past few years requires an effective system to organize this information electronically. Based on an ontology derived from a community dialog involving colleagues using experimental and computational methodologies, we developed the GPCR-Oligomerization Knowledge Base (GPCR-OKB). GPCR-OKB is a system that supports browsing and searching for GPCR oligomer data. Such data were manually derived from the literature. While focused on GPCR oligomers, GPCR-OKB is seamlessly connected to GPCRDB, facilitating the correlation of information about GPCR protomers and oligomers
Burkholderia Hep_Hag autotransporter (BuHA) proteins elicit a strong antibody response during experimental glanders but not human melioidosis
Background
The bacterial biothreat agents Burkholderia mallei and Burkholderia pseudomallei are the cause of glanders and melioidosis, respectively. Genomic and epidemiological studies have shown that B. mallei is a recently emerged, host restricted clone of B. pseudomallei.
Results
Using bacteriophage-mediated immunoscreening we identified genes expressed in vivo during experimental equine glanders infection. A family of immunodominant antigens was identified that shares protein domain architectures with hemagglutinins and invasins. These have been designated Burkholderia Hep_Hag autotransporter (BuHA) proteins. A total of 110/207 positive clones (53%) of a B. mallei expression library screened with sera from two infected horses belonged to this family. This contrasted with 6/189 positive clones (3%) of a B. pseudomallei expression library screened with serum from 21 patients with culture-proven melioidosis.
Conclusion
Members of the BuHA protein family are found in other Gram-negative bacteria and have been shown to have important roles related to virulence. Compared with other bacterial species, the genomes of both B. mallei and B. pseudomallei contain a relative abundance of this family of proteins. The domain structures of these proteins suggest that they function as multimeric surface proteins that modulate interactions of the cell with the host and environment. Their effect on the cellular immune response to B. mallei and their potential as diagnostics for glanders require further study
Phenotypic responses to interspecies competition and commensalism in a naturally derived microbial co-culture
The fundamental question of whether different microbial species will co-exist or compete in a given environment depends on context, composition and environmental constraints. Model microbial systems can yield some general principles related to this question. In this study we employed a naturally occurring co-culture composed of heterotrophic bacteria, Halomonas sp. HL-48 and Marinobacter sp. HL-58, to ask two fundamental scientific questions: 1) how do the phenotypes of two naturally co-existing species respond to partnership as compared to axenic growth? and 2) how do growth and molecular phenotypes of these species change with respect to competitive and commensal interactions? We hypothesized, and confirmed, that co-cultivation with glucose as the sole carbon source would result in competitive interactions. When glucose was replaced with xylose, the interactions became commensal because Marinobacter HL-58 was supported by metabolites derived from Halomonas HL-48. Each species responded to partnership by changing both its growth and molecular phenotype as assayed via batch growth kinetics and global transcriptomics. These phenotypic responses depended on nutrient availability and so the environment ultimately controlled how they responded to each other. This simplified model community revealed that microbial interactions are context-specific and different environmental conditions dictate how interspecies partnerships will unfold
A text-mining system for extracting metabolic reactions from full-text articles
Background: Increasingly biological text mining research is focusing on the extraction of complex relationships
relevant to the construction and curation of biological networks and pathways. However, one important category of
pathway, metabolic pathways, has been largely neglected.
Here we present a relatively simple method for extracting metabolic reaction information from free text that scores
different permutations of assigned entities (enzymes and metabolites) within a given sentence based on the presence
and location of stemmed keywords. This method extends an approach that has proved effective in the context of the
extraction of protein-protein interactions.
Results: When evaluated on a set of manually-curated metabolic pathways using standard performance criteria, our
method performs surprisingly well. Precision and recall rates are comparable to those previously achieved for the
well-known protein-protein interaction extraction task.
Conclusions: We conclude that automated metabolic pathway construction is more tractable than has often been
assumed, and that (as in the case of protein-protein interaction extraction) relatively simple text-mining approaches can prove surprisingly effective. It is hoped that these results will provide an impetus to further research and act as a useful benchmark for judging the performance of more sophisticated methods that are yet to be developed
Kernel methods in genomics and computational biology
Support vector machines and kernel methods are increasingly popular in
genomics and computational biology, due to their good performance in real-world
applications and strong modularity that makes them suitable to a wide range of
problems, from the classification of tumors to the automatic annotation of
proteins. Their ability to work in high dimension, to process non-vectorial
data, and the natural framework they provide to integrate heterogeneous data
are particularly relevant to various problems arising in computational biology.
In this chapter we survey some of the most prominent applications published so
far, highlighting the particular developments in kernel methods triggered by
problems in biology, and mention a few promising research directions likely to
expand in the future
Modules in the photoreceptor RGS9-1•Gβ5L GTPase-accelerating protein complex control effector coupling, GTPase acceleration, protein folding, and stability
RGS (regulators of G protein signaling) proteins regulate G protein signaling by accelerating GTP hydrolysis, but little is known about regulation of GTPase-accelerating protein (GAP) activities or roles of domains and subunits outside the catalytic cores. RGS9-1 is the GAP required for rapid recovery of light responses in vertebrate photoreceptors and the only mammalian RGS protein with a defined physiological function. It belongs to an RGS subfamily whose members have multiple domains, including Gγ-like domains that bind Gβ5 proteins. Members of this subfamily play important roles in neuronal signaling. Within the GAP complex organized around the RGS domain of RGS9-1, we have identified a functional role for the Gγ-like-Gβ5L complex in regulation of GAP activity by an effector subunit, cGMP phosphodiesterase γ, and in protein folding and stability of RGS9-1. The C-terminal domain of RGS9-1 also plays a major role in conferring effector stimulation. The sequence of the RGS domain determines whether the sign of the effector effect will be positive or negative. These roles were observed in vitro using full-length proteins or fragments for RGS9-1, RGS7, Gβ5S, and Gβ5L. The dependence of RGS9-1 on Gβ5 co-expression for folding, stability, and function has been confirmed in vivo using transgenic Xenopus laevis. These results reveal how multiple domains and regulatory polypeptides work together to fine-tune Gtα inactivation
Text-mining assisted regulatory annotation
Text-mining technologies can be integrated with genome annotation systems, increasing the availability of annotated cis-regulatory data