9 research outputs found

    Discovery of protein-protein interactions using a combination of linguistic, statistical and graphical information

    BACKGROUND: The rapid publication of important research in the biomedical literature makes it increasingly difficult for researchers to keep current with significant work in their area of interest. RESULTS: This paper reports a scalable method for the discovery of protein-protein interactions in Medline abstracts, using a combination of text analytics, statistical and graphical analysis, and a set of easily implemented rules. Applying these techniques to 12,300 abstracts yielded a precision of 0.61 and a recall of 0.97 (f = 0.74); when two-hop and three-hop relations discovered by graphical analysis were allowed, the precision was 0.74 (f = 0.83). CONCLUSION: This combination of linguistic and statistical approaches appears to provide the highest precision and recall thus far reported in detecting protein-protein relations using text analytic approaches.
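
The f value quoted in this abstract is presumably the balanced F-measure, the harmonic mean of precision and recall. The abstract does not state which variant was used, so this is a minimal sketch under that assumption, and the function name `f1_score` is hypothetical. For a precision of 0.61 and a recall of 0.97 it gives roughly 0.749, close to the reported 0.74; the small gap may come from rounding of the published inputs.

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall.

    Assumption: the abstract's "f" is the balanced (beta = 1) F-measure.
    """
    if precision + recall == 0:
        # Avoid division by zero when both inputs are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall as quoted in the abstract.
    print(round(f1_score(0.61, 0.97), 3))
```

Because the harmonic mean is pulled toward the smaller input, the high recall of 0.97 cannot fully compensate for the lower precision.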

    PPLook: an automated data mining tool for protein-protein interaction

    BACKGROUND: Extracting and visualizing protein-protein interactions (PPI) from the literature is a meaningful topic in protein science, because it assists the identification of interactions among proteins. There is a lack of tools to extract PPI and to visualize and classify the results. RESULTS: We developed a PPI search system, termed PPLook, which automatically extracts and visualizes protein-protein interactions (PPI) from text. Given a query protein name, PPLook can search a dataset for other proteins interacting with it by using a keyword-dictionary pattern-matching algorithm, and display topological parameters such as the number of nodes, edges, and connected components. The visualization component of PPLook lets users view the interaction relationships among the proteins in three-dimensional space, based on the OpenGL graphics interface. PPLook can also select a protein semantic class, count the proteins of that semantic class which interact with the query protein, and count the articles that report interactions involving the query protein. Moreover, PPLook provides heterogeneous search and a user-friendly graphical interface. CONCLUSIONS: PPLook is an effective tool for biologists and biosystem developers who need to access PPI information from the literature. PPLook is freely available for non-commercial users at http://meta.usc.edu/softs/PPLook.
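
The keyword-dictionary matching and topological parameters mentioned in this abstract can be illustrated roughly as follows. This is only a sketch of the general idea, not PPLook's implementation: the protein and keyword dictionaries, the sample sentences, and the function names `extract_pairs` and `topology` are all hypothetical, and the co-occurrence rule (two known proteins plus one interaction keyword in the same sentence) is an assumed simplification.

```python
import re
from itertools import combinations

# Hypothetical dictionaries; the abstract does not list PPLook's own.
PROTEINS = {"RAD51", "BRCA2", "TP53", "MDM2", "ATM"}
KEYWORDS = {"interacts", "binds", "phosphorylates"}


def extract_pairs(sentences):
    """Return protein pairs that share a sentence with an interaction keyword."""
    pairs = set()
    for sentence in sentences:
        tokens = set(re.findall(r"[A-Za-z0-9]+", sentence))
        if tokens & KEYWORDS:
            found = sorted(tokens & PROTEINS)
            pairs.update(combinations(found, 2))
    return pairs


def topology(pairs):
    """Count nodes, edges and connected components of the interaction graph."""
    adjacency = {}
    for a, b in pairs:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, components = set(), 0
    for start in adjacency:
        if start in seen:
            continue
        components += 1
        stack = [start]  # depth-first traversal of one component
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(adjacency[node] - seen)
    return len(adjacency), len(pairs), components


# Made-up example sentences, for illustration only.
sentences = [
    "BRCA2 binds RAD51 during repair.",
    "MDM2 interacts with TP53.",
    "ATM is a kinase.",
]
print(topology(extract_pairs(sentences)))  # (4, 2, 2)
```

The third sentence contributes nothing because it contains no interaction keyword, so the graph has four nodes, two edges, and two connected components.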

    An Integrated Web-based System for MEDLINE Analysis: A Case Study of Chronic Kidney Disease

    In the era of big data, medical researchers apply analysis techniques such as machine learning and text mining to their large-scale corpora to save labor and time. Consequently, many data analysis platforms have been built to support medical professionals, such as PubTator, GeneWays, and BioContext. These platforms are helpful for medical entity recognition and relation extraction, but no integrated platform supports researchers' various needs, and medical projects are isolated from each other, which makes them hard to share and reuse. We therefore present an integrated system comprising 'named entity recognition', 'document categorization' and 'association extraction'. In addition, we introduce the concept of 'socialization', which makes projects reusable for further analyses. A case study of chronic kidney disease is used to demonstrate the effectiveness of the proposed system.

    RetroMine, or how to provide in-depth retrospective studies from Medline in a glance: the hepcidin use-case

    The rapid expansion of biomedical literature has driven the development of advanced text mining tools that rapidly extract relevant events from the continuously increasing amount of knowledge published in PubMed. However, bioinvestigators are still reluctant to use these tools for two reasons: i) a query often extracts a large volume of events that is hard to manage, and ii) background events dominate search results and overshadow more pertinent published information, especially for domain experts. In this paper, we propose an approach that incorporates the temporal dimension of published events into the information extraction process to improve data selection and prioritize more pertinent knowledge for scientists. Instead of providing the total knowledge associated with a PubMed query, which is usually a mix of trivial background information and non-background information, the method uses time to select non-background, highly relevant biological entities and events published over time. Before background events are excluded from the total knowledge extracted, their amount is also quantified. This work is illustrated by a case study of Hepcidin gene publications over a decade, a duration long enough to generate alternative views on the overall data extracted.

    GENE NETWORKS

    Research over the past decade indicates that the overwhelming majority of phenotypic traits of humans, animals, plants, and microorganisms (molecular, biochemical, cellular, physiological, morphological, behavioral, etc.) are controlled in a very complex manner, and that their formation is based on gene networks, i.e., groups of coordinately functioning genes that interact with one another both through their primary products (RNA and proteins) and through various metabolites and other secondary products of gene network functioning.

    Integration of Text Mining with Systems Biology Provides New Insight into the Pathogenesis of Diabetic Neuropathy.

    Diabetic neuropathy (DN) is the most common complication of diabetes affecting approximately 60% of all diabetic patients leading to significant mortality, morbidity, and poor quality of life. Though more than 50% of patients with DN develop substantial nerve damage prior to noticeable symptoms, no biomarkers for predicting the onset or progression of DN are currently available. Here we present a biomarker discovery platform integrating literature mining and a systems biology approach to identify potential DN biomarkers. A web-based target identification and functional analysis tool, SciMiner (http://jdrf.neurology.med.umich.edu/SciMiner), was developed that identifies targets using a context specific analysis of MEDLINE abstracts and full texts. A comprehensive list of 1,026 targets from diabetes and reactive oxygen species (ROS) related literature was compiled by SciMiner. The expression levels of nine genes, selected from the over-represented ROS-diabetes targets, were measured in the dorsal root ganglia (DRG) of diabetic and non-diabetic DBA/2J mice. Eight genes exhibited significant differential expression and the directions of expression change in six of those genes paralleled enhanced oxidative stress in the DRG, suggesting the involvement of ROS related targets in DN. A microarray analysis was also performed on sural nerve biopsies from two DN patient groups with fast or slow DN progression to identify gene expression profiles related to DN progression. In the fast progressing DN, defense response and inflammatory response related genes were up-regulated, while lipid metabolic process and peroxisome proliferator-activated receptor (PPAR) signaling pathway related genes were down-regulated. We also developed mRNA expression signatures that predict DN progression in humans with a high prediction accuracy. Ridge-regression based predictive models with 14 genes achieved a prediction accuracy of 92% (correct prediction of 11 out of 12 patients). 
    Our results identifying the unique gene signatures of progressive DN and compiling ROS-diabetes targets can facilitate the development of new mechanism-based therapies and predictive biomarkers of DN. Ph.D., Bioinformatics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/77941/1/juhur_1.pd