Extracting causal relations on HIV drug resistance from literature
<p>Abstract</p> <p>Background</p> <p>In HIV treatment it is critical to have up-to-date resistance data for applicable drugs, since HIV has a very high rate of mutation. These data are made available through scientific publications and must be extracted manually by experts before virologists and medical doctors can use them. Therefore, there is an urgent need for a tool that partially automates this process and can retrieve relations between drugs and virus mutations from the literature.</p> <p>Results</p> <p>In this work we present a novel method to extract and combine relationships between HIV drugs and mutations in viral genomes. Our extraction method is based on natural language processing (NLP), which produces grammatical relations, and applies a set of rules to these relations. We applied our method to a relevant set of PubMed abstracts and obtained 2,434 extracted relations with an estimated F-score of 84%. We then combined the extracted relations using logistic regression to generate resistance values for each (drug, mutation) pair. The results of this relation combination show more than 85% agreement with the Stanford HIVDB for the ten most frequently occurring mutations. The system is used in 5 hospitals from the Virolab project <url>http://www.virolab.org</url> to preselect the most relevant novel resistance data from the literature and present those to virologists and medical doctors for further evaluation.</p> <p>Conclusions</p> <p>The proposed relation extraction and combination method performs well at extracting HIV drug resistance data. It can be used in large-scale relation extraction experiments. The developed methods can also be applied to extract other types of relations, such as gene-protein, gene-disease, and disease-mutation.</p>
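A minimal sketch of drug-mutation relation extraction, to make the idea in the abstract above concrete. It is not the authors' method: the paper applies rules to grammatical relations produced by an NLP parser, whereas this sketch substitutes a much simpler sentence-level co-occurrence rule. The drug lexicon, the cue words, the example sentence, and the function name `extract_relations` are all assumptions made for illustration.

```python
import re

# Standard substitution notation: wild-type residue, position, mutant residue (e.g. M184V).
MUTATION = re.compile(r"\b[ACDEFGHIKLMNPQRSTVWY]\d{1,3}[ACDEFGHIKLMNPQRSTVWY]\b")

DRUGS = {"lamivudine", "efavirenz", "nevirapine"}  # tiny illustrative lexicon (assumption)
CUES = {"resistance", "resistant"}                 # hypothetical cue words (assumption)


def extract_relations(sentence):
    """Return (drug, mutation) pairs when the sentence contains a resistance cue word."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    if not (words & CUES):
        return []
    drugs = sorted(words & DRUGS)
    mutations = MUTATION.findall(sentence)
    return [(drug, mutation) for drug in drugs for mutation in mutations]


# Illustrative sentence, not taken from the paper's corpus.
print(extract_relations("The M184V mutation confers high-level resistance to lamivudine."))
```

A co-occurrence rule like this over-generates when a sentence mentions several drugs and mutations; using grammatical relations, as the abstract describes, is one way to tie each mutation to the drug it actually modifies.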
Graph-based learning model for detection of SMS spam on smart phones
Short Message Service (SMS) has been increasingly exploited through spam propagation schemes in recent years. This paper presents a new method for graph-based learning and classification of spam SMS on mobile devices and smart phones. Our approach models the content and patterns of SMS syntax as a directed weighted graph by exploiting the modern composition style of messages. The graph attributes are then used to classify spam messages in real time using the KL-divergence measure. Experimental results on two real-world datasets show that our proposed method achieves high detection accuracy with a lower false alarm rate when detecting spam messages. Moreover, our approach requires relatively little memory and processing power, making it suitable for deployment on resource-constrained mobile devices and smart phones.
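As a rough illustration of the approach this abstract describes, the sketch below builds a directed weighted graph of token-to-token transitions and labels a message by whichever reference graph (spam or ham) it is closer to under KL divergence. The abstract does not specify the graph construction, so the whitespace tokenization, the smoothing constant `eps`, the toy training messages, and all function names are assumptions.

```python
import math
from collections import Counter


def edge_counts(messages):
    """Count directed token-to-token transitions; the counts act as edge weights."""
    counts = Counter()
    for text in messages:
        tokens = text.lower().split()
        counts.update(zip(tokens, tokens[1:]))
    return counts


def kl_divergence(p_counts, q_counts, eps=1e-6):
    """KL(P || Q) over edge distributions, with additive smoothing (eps is an assumption)."""
    edges = set(p_counts) | set(q_counts)
    p_total = sum(p_counts.values()) + eps * len(edges)
    q_total = sum(q_counts.values()) + eps * len(edges)
    kl = 0.0
    for edge in edges:
        p = (p_counts[edge] + eps) / p_total
        q = (q_counts[edge] + eps) / q_total
        kl += p * math.log(p / q)
    return kl


def classify(message, spam_graph, ham_graph):
    """Label a message by the reference graph with the smaller divergence."""
    msg_graph = edge_counts([message])
    d_spam = kl_divergence(msg_graph, spam_graph)
    d_ham = kl_divergence(msg_graph, ham_graph)
    return "spam" if d_spam < d_ham else "ham"


# Toy reference data, invented for illustration only.
spam_graph = edge_counts(["win a free prize now", "free prize claim now"])
ham_graph = edge_counts(["see you at lunch", "are you free at lunch"])
print(classify("claim a free prize", spam_graph, ham_graph))
```

Storing only edge counts keeps the model small, which is in the spirit of the memory and processing constraints the abstract mentions.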
SIMOnt: A Security Information Management Ontology Framework
One of the major problems that plague all ontology designers is the absence of an unambiguous and exhaustive list of values that may be used to define concepts and their inter-relationships. This is mostly because the documents describing the concepts and their relationships are either unstructured or semi-structured in nature. Moreover, due to the dynamic nature of the document repository on the Web, any agent or system designed for reasoning with these concepts should be able to adapt to changes, and an ontology should be upgradeable with information extracted from text documents. In this paper, we have proposed the design of a Security Information Management Ontology (SIMOnto) which utilizes natural language processing and statistical analysis to mine an exhaustive list of concepts and their relationships in an automatic way. Concepts are extracted using TF-IDF and LSA techniques, whereas relations between them are mined using semantic and co-occurrence based analyses. The mined concepts and relations are presented to domain experts for validation before creation of the ontology using Protégé.
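The concept and relation mining steps named in this abstract can be sketched in a few lines. This is a simplified illustration only: it covers plain TF-IDF scoring and document-level co-occurrence counting, and leaves out LSA and the semantic analysis entirely. The whitespace tokenization, the `tf * log(N / df)` weighting variant, the toy documents, and the function names are assumptions, not details from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf(docs):
    """Return one {term: score} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores.append(
            {term: (count / len(tokens)) * math.log(n_docs / df[term])
             for term, count in tf.items()}
        )
    return scores


def cooccurrence(docs, concepts):
    """Count how often each pair of candidate concepts appears in the same document."""
    pairs = Counter()
    for doc in docs:
        present = sorted(concepts & set(doc.lower().split()))
        pairs.update(combinations(present, 2))
    return pairs


# Toy documents, invented for illustration only.
docs = [
    "the firewall blocks the intrusion",
    "the sensor raises the alert",
    "the firewall logs the alert",
]
print(tfidf(docs)[0])
print(cooccurrence(docs, {"firewall", "alert", "intrusion"}))
```

High-scoring terms would be the candidate concepts and frequently co-occurring pairs the candidate relations; in the workflow the abstract describes, both would still go to domain experts for validation.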