11 research outputs found

    Literature-Based Enrichment Insights into Redox Control of Vascular Biology

    In cellular physiology and signaling, reactive oxygen species (ROS) play one of the most critical roles. ROS overproduction leads to cellular oxidative stress. This may lead to an irrecoverable imbalance of redox (oxidation-reduction reaction) function that deregulates redox homeostasis, which itself could lead to several diseases including neurodegenerative disease, cardiovascular disease, and cancers. In this study, we focus on the redox effects related to vascular systems in mammals. To support research in this domain, we developed an online knowledge base, DES-RedoxVasc, which enables exploration of information contained in the biomedical scientific literature. The DES-RedoxVasc system analyzed 233,399 documents consisting of PubMed abstracts and PubMed Central full-text articles related to different aspects of redox biology in vascular systems. It allows researchers to explore enriched concepts from 28 curated thematic dictionaries, as well as literature-derived potential associations of pairs of such enriched concepts, where the associations themselves are statistically enriched. For example, the system allows exploration of associations of pathways, diseases, mutations, genes/proteins, miRNAs, long ncRNAs, toxins, drugs, biological processes, molecular functions, etc. that allow for insights about different aspects of redox effects and control of processes related to the vascular system. Moreover, we deliver case studies about some existing or possibly novel knowledge regarding the redox control of vascular biology, demonstrating the usefulness of DES-RedoxVasc. DES-RedoxVasc is the first compiled knowledge base using text mining for the exploration of this topic.

    DES-Mutation: system for exploring links of mutations and diseases

    During cell division, DNA replicates, and this process is the basis for passing genetic information to the next generation. However, the DNA copy process sometimes produces a copy that is not perfect, that is, one with mutations. The collection of all such mutations in the DNA copy of an organism makes it unique and determines the organism's phenotype. However, mutations are often the cause of diseases. Thus, it is useful to have the capability to explore links between mutations and disease. We approached this problem by analyzing a vast amount of published information linking mutations to disease states. Based on such information, we developed the DES-Mutation knowledgebase, which allows for exploration of not only mutation-disease links, but also links between mutations and concepts from 27 topic-specific dictionaries such as human genes/proteins, toxins, pathogens, etc. This allows for a more detailed insight into mutation-disease links and context. On a sample of 600 mutation-disease associations predicted and curated, our system achieves a precision of 72.83%. To demonstrate the utility of DES-Mutation, we provide case studies related to known or potentially novel information involving disease mutations. To our knowledge, this is the first mutation-disease knowledgebase dedicated to the exploration of this topic through text-mining and data-mining of different mutation types and their associations with terms from multiple thematic dictionaries.

    Combining Position Weight Matrices and Document-Term Matrix for Efficient Extraction of Associations of Methylated Genes and Diseases from Free Text

    Background: In a number of diseases, certain genes are reported to be strongly methylated and thus can serve as diagnostic markers in many cases. Scientific literature in digital form is an important source of information about methylated genes implicated in particular diseases. The large volume of the electronic text makes it difficult and impractical to search for this information manually.
    Methodology: We developed a novel text mining methodology based on a new concept of position weight matrices (PWMs) for text representation and feature generation. We applied PWMs in conjunction with the document-term matrix to extract with high accuracy associations between methylated genes and diseases from free text. The performance results are based on large manually-classified data. Additionally, we developed a web-tool, DEMGD, which automates extraction of these associations from free text. DEMGD presents the extracted associations in summary tables and full reports in addition to evidence tagging of text with respect to genes, diseases and methylation words. The methodology we developed in this study can be applied to similar association extraction problems from free text.
    Conclusion: The new methodology developed in this study allows for efficient identification of associations between concepts. Our method applied to methylated genes in different diseases is implemented as a Web-tool, DEMGD, which is freely available at http://www.cbrc.kaust.edu.sa/demgd/. The data is available for online browsing and download.

    The structure of PWMs.

    We can generate six PWMs, and each matrix corresponds to a pattern order. For example, the first PWM to the left corresponds to the pattern order (, , ). Each row corresponds to a word, and each column corresponds to a segment, and cells of the matrix represent the frequency of words in each segment.

    DEMGD system architecture.

    The input to the system is the Input Text, and the output is Summary Tables and Full Reports. The system consists of four modules: Text Pre-processing, Structured Data Representation, Classification and Associations Extraction.

    Dataset representation using PWMs.

    Each pattern in a sentence is represented with twelve features and a class label. The first six features correspond to the scores generated from the positive PWMs, and the following six features correspond to the scores generated from the negative PWMs.
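A minimal Python sketch of how such a twelve-feature row might be assembled. The PWM layout (a dict mapping each word to one weight per segment), the scoring rule (largest weight per segment, summed), the names `pattern_score` and `to_features`, and all toy values are illustrative assumptions, not the DEMGD implementation.

```python
from typing import Dict, List, Sequence

PWM = Dict[str, List[float]]  # assumed layout: word -> weight per segment (column)


def pattern_score(pwm: PWM, segments: Sequence[Sequence[str]]) -> float:
    """Sum, over segments, the largest weight of any known word in that segment."""
    total = 0.0
    for col, words in enumerate(segments):
        # Words missing from the PWM contribute nothing (an assumption).
        total += max((pwm[w][col] for w in words if w in pwm), default=0.0)
    return total


def to_features(
    segments: Sequence[Sequence[str]],
    positive_pwms: Sequence[PWM],
    negative_pwms: Sequence[PWM],
) -> List[float]:
    """Return six positive-PWM scores followed by six negative-PWM scores."""
    if len(positive_pwms) != 6 or len(negative_pwms) != 6:
        raise ValueError("expected six positive and six negative PWMs")
    return [pattern_score(p, segments) for p in positive_pwms] + [
        pattern_score(n, segments) for n in negative_pwms
    ]


# Hypothetical toy data: one pattern with two segments and tiny PWMs.
segments = [["promoter"], ["methylated"]]
pos = [{"promoter": [0.1 * (i + 1), 0.0], "methylated": [0.0, 0.05]} for i in range(6)]
neg = [{"promoter": [0.01, 0.0]} for _ in range(6)]

features = to_features(segments, pos, neg)
row = features + ["positive"]  # twelve features plus the class label
```

Keeping the positive scores in the first six positions and the negative scores in the last six mirrors the ordering described in the caption, so a downstream classifier sees a fixed column layout.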

    Computing the scores.

    The figure shows an example of a normalized PWM. To compute the score, we sum the weights of one word from each column. For example, the word ‘promoter’ appears in the first segment, so we take its weight from the first column in the PWM. The same step is applied to the second and third segments. However, five words appear in the last segment, so we take the maximum weight among them. The score of the pattern is 0.2336+0.1619+0.1724+0.1315=0.6994.
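The scoring rule in this caption can be sketched in a few lines of Python. The four weights 0.2336, 0.1619, 0.1724 and 0.1315 and the word ‘promoter’ come from the caption; every other word, every other weight, the dict-based PWM layout and the name `score_pattern` are hypothetical placeholders.

```python
from typing import Dict, List, Sequence


def score_pattern(
    norm_pwm: Dict[str, List[float]], segments: Sequence[Sequence[str]]
) -> float:
    """Add one weight per segment: the maximum over the words in that segment."""
    score = 0.0
    for col, words in enumerate(segments):
        # Look up each word's weight in the column for this segment;
        # words absent from the PWM are skipped (an assumption).
        candidates = [norm_pwm[w][col] for w in words if w in norm_pwm]
        score += max(candidates, default=0.0)
    return score


# Normalized PWM: word -> weight in each of the four segments.
norm_pwm = {
    "promoter": [0.2336, 0.0, 0.0, 0.0],
    "word_b": [0.0, 0.1619, 0.0, 0.0],   # hypothetical second-segment word
    "word_c": [0.0, 0.0, 0.1724, 0.0],   # hypothetical third-segment word
    "w1": [0.0, 0.0, 0.0, 0.0400],       # five hypothetical last-segment words
    "w2": [0.0, 0.0, 0.0, 0.1315],
    "w3": [0.0, 0.0, 0.0, 0.0900],
    "w4": [0.0, 0.0, 0.0, 0.0100],
    "w5": [0.0, 0.0, 0.0, 0.0750],
}

segments = [["promoter"], ["word_b"], ["word_c"], ["w1", "w2", "w3", "w4", "w5"]]
score = score_pattern(norm_pwm, segments)
print(round(score, 4))
```

Using the maximum for a multi-word segment means a single strongly indicative word decides that segment's contribution, regardless of how many other words share the segment.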

    PWM generation.

    The PWM summarizes the frequency of words in each segment. For example, the words ‘CPG’ and ‘island’ appear in the first segment of the sentence, so the cells in the rows for these words and in the first column are each incremented by one. Similarly, the same step is applied to the words in the remaining three segments. The same matrix is updated using other sentences with the same pattern order.
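The counting step described here might look like the following Python sketch. It assumes each sentence has already been split into four word lists (how the segments are delimited is not specified in the caption), and the function name `build_pwm` and the example sentences are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def build_pwm(
    segmented_sentences: Sequence[Sequence[Sequence[str]]], n_segments: int = 4
) -> Dict[str, List[int]]:
    """Count how often each word occurs in each segment position.

    Rows are words, columns are segments, and cells are raw frequencies.
    """
    pwm: Dict[str, List[int]] = defaultdict(lambda: [0] * n_segments)
    for segments in segmented_sentences:
        if len(segments) != n_segments:
            raise ValueError("every sentence must have n_segments segments")
        for col, words in enumerate(segments):
            for word in words:
                pwm[word][col] += 1  # increment the cell (word row, segment column)
    return dict(pwm)


# Two hypothetical sentences that share the same pattern order.
sentences = [
    [["CPG", "island"], ["hypermethylation"], ["in"], ["patients"]],
    [["island"], ["methylation"], ["in"], ["tumor", "samples"]],
]

pwm = build_pwm(sentences)
print(pwm["CPG"], pwm["island"], pwm["in"])
```

Because the same dictionary is updated sentence by sentence, adding further sentences with the same pattern order only increases the existing counts or adds new word rows.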

    DES-ROD: Exploring Literature to Develop New Links between RNA Oxidation and Human Diseases

    Normal cellular physiology and biochemical processes require undamaged RNA molecules. However, RNAs are frequently subjected to oxidative damage. Overproduction of reactive oxygen species (ROS) leads to RNA oxidation and disturbs redox (oxidation-reduction reaction) homeostasis. When oxidation damage affects RNA carrying protein-coding information, this may result in the synthesis of aberrant proteins as well as a lower efficiency of translation. Both of these, as well as imbalanced redox homeostasis, may lead to numerous human diseases. The number of studies on the effects of RNA oxidative damage in mammals is increasing every year due to the understanding that this oxidation fundamentally leads to numerous human diseases. To enable researchers in this field to explore information relevant to RNA oxidation and its effects on human diseases, we developed DES-ROD, an online knowledgebase that contains processed information from 298,603 relevant documents that consist of PubMed abstracts and PubMed Central full-text articles. The system utilizes concepts/terms from 38 curated thematic dictionaries mapped to the analyzed documents. Researchers can explore enriched concepts, as well as enriched pairs of putatively associated concepts. In this way, one can explore mutual relationships between any combination of two concepts from the dictionaries used. Dictionaries cover a wide range of biomedical topics, such as human genes and proteins, pathways, Gene Ontology categories, mutations, noncoding RNAs, enzymes, toxins, metabolites, and diseases. This makes insights into different facets of the effects of RNA oxidation and the control of this process possible. The usefulness of the DES-ROD system is demonstrated by case studies on some known information, as well as potentially novel information involving RNA oxidation and diseases. DES-ROD is the first knowledgebase based on text and data mining focused on the exploration of RNA oxidation and human diseases.