4,568 research outputs found

ChemTextMiner: An open source tool kit for mining medical literature abstracts

Authors: Deepak Pandit; Esha Jain; Ganesh Nainaru; Muthukumarasamy Karthikeyan; Renu Vyas; Sunil Nalwade; Yogesh Pandit
Publication date: 20/09/2011

Text mining involves recognizing patterns in the wealth of information latent in unstructured text and deducing explicit relationships among data entities by using data mining tools. Text mining of biomedical literature is essential for building biological networks that connect genes, proteins, drugs, therapeutic categories, side effects, etc. related to diseases of interest. We present an approach for text mining biomedical literature, mostly in terms of non-obvious hidden relationships, and build biological networks for important human diseases such as MTB, malaria, Alzheimer's and diabetes. The methods, tools and data used for building biological networks using a distributed computing environment previously used for the ChemXtreme [1] and ChemStar [2] applications are also described.
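The network-building idea in this abstract can be illustrated with a minimal sketch. It assumes a simple dictionary lookup for entity recognition and counts co-occurrence within one abstract as an edge; the `ENTITY_TYPES` dictionary, the function names, and the toy texts are all hypothetical and are not taken from ChemTextMiner itself.

```python
from collections import Counter
from itertools import combinations

# Hypothetical entity dictionary (assumption): term -> entity type.
ENTITY_TYPES = {
    "malaria": "disease",
    "diabetes": "disease",
    "insulin": "protein",
    "chloroquine": "drug",
}

def entities_in(text):
    """Return the sorted dictionary terms that occur in the text."""
    tokens = {tok.strip(".,;:()").lower() for tok in text.split()}
    return sorted(tok for tok in tokens if tok in ENTITY_TYPES)

def build_network(abstracts):
    """Count, for each entity pair, how many abstracts mention both."""
    edges = Counter()
    for abstract in abstracts:
        for pair in combinations(entities_in(abstract), 2):
            edges[pair] += 1
    return edges

# Toy input for illustration only.
TOY_ABSTRACTS = [
    "Chloroquine and malaria.",
    "Insulin therapy in diabetes.",
    "Malaria treated with chloroquine.",
]
NETWORK = build_network(TOY_ABSTRACTS)
```

Edge weights here are raw co-occurrence counts; a real pipeline would likely need proper named-entity recognition and some statistical filtering of edges.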

Crossref

Nature Precedings

Usability and acceptability of four systematic review automation software packages: A mixed method design

Authors: Beller Elaine; Cleo Gina; Islam Farhana; Julien Blair; Scott Anna Mae
Publication venue: Springer Science and Business Media LLC
Publication date: 20/06/2019

Bond University Research Portal

PubMed-Scale Event Extraction for Post-Translational Modifications, Epigenetics and Protein Structural Relations

Authors: Ananiadou S; Björne J; Ginter F; Ohta T; Pyysalo S; Salakoski T; Van de Peer Y; Van Landeghem S
Publication date: 01/01/2012

The University of Manchester - Institutional Repository

Drug prescription support in dental clinics through drug corpus mining

Authors: Goh Wee Pheng; Tao Xiaohui; Xie Haoran; Yong Jianming; Zhang Ji; Zhang Wenping
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

The rapid increase in the volume and variety of data poses a challenge to safe drug prescription for the dentist. The increasing number of patients who take multiple drugs puts further pressure on the dentist to make the right decision at the point of care. Hence, a robust decision support system will enable dentists to make decisions on drug prescription quickly and accurately. Based on the assumption that similar drug pairs have a higher similarity ratio, this paper suggests an innovative approach to obtain the similarity ratio between the drug that the dentist is going to prescribe and the drug that the patient is currently taking. We conducted experiments to obtain the similarity ratios of both positive and negative drug pairs, by using feature vectors generated from term similarities and word embeddings of a biomedical text corpus. This model can be easily adapted and implemented for use in a dental clinic to assist the dentist in deciding whether a drug is suitable for prescription, taking into consideration the medical profile of the patient. Experimental evaluation of our model’s association of the similarity ratio between two drugs yielded a superior F score of 89%. Hence, such an approach, when integrated within the clinical workflow, will reduce prescription errors and thereby improve the health outcomes of patients.
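As a rough illustration of comparing two drugs through their feature vectors, the sketch below uses cosine similarity. This is an assumption: the abstract does not say which similarity measure the authors use, and the `EMBEDDINGS` values, drug names, and function names are invented toy placeholders.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)

# Hypothetical 4-dimensional feature vectors (toy numbers, not real data).
EMBEDDINGS = {
    "drug_a": [0.9, 0.1, 0.3, 0.0],
    "drug_b": [0.8, 0.2, 0.4, 0.1],
    "drug_c": [0.0, 0.9, 0.1, 0.8],
}

def similarity_ratio(prescribed, current):
    """Similarity between the drug to prescribe and a drug already taken."""
    return cosine_similarity(EMBEDDINGS[prescribed], EMBEDDINGS[current])
```

With these toy vectors, `similarity_ratio("drug_a", "drug_b")` is much higher than `similarity_ratio("drug_a", "drug_c")`; a decision support tool would still need a validated threshold before acting on such a score.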

University of Southern Queensland Repository

Modeling text with generalizable Gaussian mixtures

Authors: Hansen Lars Kai; Kjems Ulrik; Kolenda Thomas; Larsen Jan; Nielsen Finn Årup; Sigurdsson Sigurdur
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/1999

We apply and discuss generalizable Gaussian mixture (GGM) models for text mining. The model automatically adapts model complexity for a given text representation. We show that the generalizability of these models depends on the dimensionality of the representation and the sample size. We discuss the relation between supervised and unsupervised learning in text data. Finally, we implement a novelty detector based on the density model.

1. INTRODUCTION

Information retrieval is a very active research field which is starting to adapt advanced machine learning techniques for solving hard real world problems [17, 18]. Text mining or pattern recognition in text data is used to categorize text according to topic, to spot new topics, and in a broader sense to create more intelligent searches, e.g., by WWW search engines [12, ?, 14]. Text mining proceeds by pattern recognition based on text features, typically document summary statistics. While there are numerous high-level language models for extr..
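A density-based novelty detector of the kind mentioned in the abstract can be sketched as follows. The sketch is deliberately simplified: it works in one dimension and takes the mixture parameters as given, whereas the GGM model described above fits them and adapts its own complexity. The `COMPONENTS` values and the threshold are arbitrary assumptions.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture_density(x, components):
    """components: list of (weight, mean, variance); weights sum to 1."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components)

def is_novel(x, components, threshold):
    """Flag x as novel when the mixture assigns it low density."""
    return mixture_density(x, components) < threshold

# Hypothetical, already-fitted two-component mixture (assumption).
COMPONENTS = [(0.6, 0.0, 1.0), (0.4, 5.0, 0.5)]
```

A point near one of the component means, such as `0.1`, has high density and is not flagged, while a distant point such as `20.0` falls below any reasonable threshold and is reported as novel.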

CiteSeerX

Online Research Database In Technology

Extracting protein-protein interaction based on discriminative training of the Hidden Vector State model

Authors: He Yulan; Zhou Deyu
Publication date: 01/06/2008

Knowledge about gene clusters and protein interactions is important for biological researchers seeking to unveil the mechanisms of life. However, a large quantity of this knowledge is hidden in the literature, such as journal articles, reports, books and so on. Many approaches to extracting information from unstructured text, such as pattern matching and shallow and deep parsing, have been proposed, especially for extracting protein-protein interactions (Zhou and He, 2008). A semantic parser based on the Hidden Vector State (HVS) model for extracting protein-protein interactions is presented in (Zhou et al., 2008). The HVS model is an extension of the basic discrete Markov model in which context is encoded as a stack-oriented state vector. Maximum likelihood estimation (MLE) is used to derive the parameters of the HVS model. In this paper, we propose a discriminative approach based on a parse error measure to train the HVS model. To adjust the HVS model to achieve the minimum parse error rate, the generalized probabilistic descent (GPD) algorithm (Kuo et al., 2002) is used. Experiments have been conducted on the GENIA corpus. The results demonstrate modest improvements: the discriminatively trained HVS model outperforms its MLE-trained counterpart by 2.5% in F-measure on the GENIA corpus.
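The F-measure used to compare the two training methods is the weighted harmonic mean of precision and recall. The sketch below shows the standard definition; the function name, argument names, and the zero-match convention are choices made here, not details from the paper.

```python
def f_measure(true_pos, false_pos, false_neg, beta=1.0):
    """F-beta score from raw counts; beta=1 weights precision and recall equally."""
    if true_pos == 0:
        return 0.0  # convention chosen here: no correct extractions scores zero
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

For example, a hypothetical extractor that finds 6 correct interactions, 2 spurious ones, and misses 4 has precision 0.75 and recall 0.6, giving an F-measure of about 0.667.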

Open Research Online (Open)

Topic Map Generation Using Text Mining

Authors: Böhm Karsten; Heyer Gerhard; Quasthoff Uwe; Wolff Christian
Publication venue: Springer Verlag
Publication date: 01/01/2002

Starting from text corpus analysis with linguistic and statistical analysis algorithms, an infrastructure for text mining is described which uses collocation analysis as a central tool. This text mining method may be applied to different domains as well as languages. Some examples taken from large reference databases motivate the applicability to knowledge management using declarative standards of information structuring and description. The ISO/IEC Topic Map standard is introduced as a candidate for rich metadata description of information resources, and it is shown how text mining can be used for automatic topic map generation.
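Collocation analysis, the central tool named in this abstract, can be sketched by scoring adjacent word pairs. The example below ranks pairs by pointwise mutual information (PMI), which is an assumption: the abstract does not state which significance measure the authors use, and the function name, `min_count` parameter, and toy sentences are hypothetical.

```python
import math
from collections import Counter

def collocations(sentences, min_count=2):
    """Rank adjacent word pairs by PMI, keeping pairs seen at least min_count times."""
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        word_counts.update(tokens)
        pair_counts.update(zip(tokens, tokens[1:]))
    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())
    scored = []
    for (w1, w2), n in pair_counts.items():
        if n < min_count:
            continue
        p_pair = n / total_pairs
        p1 = word_counts[w1] / total_words
        p2 = word_counts[w2] / total_words
        scored.append(((w1, w2), math.log2(p_pair / (p1 * p2))))
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy corpus for illustration only.
TOY_SENTENCES = [
    "topic map standard",
    "topic map generation",
    "text mining tools",
    "text mining method",
    "the map of the text",
]
RANKED = collocations(TOY_SENTENCES)
```

Pairs that survive the frequency cut-off and score highly are candidate associations between topics; how they would be mapped onto Topic Map constructs is not covered by this sketch.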

University of Regensburg Publication Server

ZENODO

ARPHA OAI-PMH Endpoint

ARPHA Preprints

Doing Things Twice (Or Differently): Strategies to Identify Studies for Targeted Validation

Author: Sarma Gopal P.
Publication date: 21/04/2018

The "reproducibility crisis" has been a highly visible source of scientific controversy and dispute. Here, I propose and review several avenues for identifying and prioritizing research studies for the purpose of targeted validation. Of the various proposals discussed, I identify scientific data science as being a strategy that merits greater attention among those interested in reproducibility. I argue that the tremendous potential of scientific data science for uncovering high-value research studies is a significant and rarely discussed benefit of the transition to a fully open-access publishing model.

arXiv.org e-Print Archive

MediArXiv Pre-print

Crossref

OSF Preprints

Conceptual biology, hypothesis discovery, and text mining: Swanson's legacy

Author: Bekhuis T
Publication venue: Springer Science and Business Media LLC
Publication date: 03/04/2006

Innovative biomedical librarians and information specialists who want to expand their roles as expert searchers need to know about profound changes in biology and parallel trends in text mining. In recent years, conceptual biology has emerged as a complement to empirical biology. This is partly in response to the availability of massive digital resources such as the network of databases for molecular biologists at the National Center for Biotechnology Information. Developments in text mining and hypothesis discovery systems based on the early work of Swanson, a mathematician and information scientist, are coincident with the emergence of conceptual biology. Very little has been written to introduce biomedical digital librarians to these new trends. In this paper, background for data and text mining, as well as for knowledge discovery in databases (KDD) and in text (KDT), is presented, then a brief review of Swanson's ideas, followed by a discussion of recent approaches to hypothesis discovery and testing. 'Testing' in the context of text mining involves partially automated methods for finding evidence in the literature to support hypothetical relationships. Concluding remarks follow regarding (a) the limits of current strategies for evaluation of hypothesis discovery systems and (b) the role of literature-based discovery in concert with empirical research. A report of an informatics-driven literature review for biomarkers of systemic lupus erythematosus is mentioned. Swanson's vision of the hidden value in the literature of science and, by extension, in biomedical digital databases, is still remarkably generative for information scientists, biologists, and physicians. © 2006 Bekhuis; licensee BioMed Central Ltd.
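Literature-based discovery in Swanson's sense is commonly described as an "ABC" pattern: if term A co-occurs with B in some documents and B co-occurs with C in others, but A and C never appear together, the A-C link is a candidate hypothesis. The abstract does not spell this out, so the sketch below is an illustration under that assumption; the function name and the placeholder terms are hypothetical.

```python
def abc_hypotheses(documents):
    """documents: list of sets of terms.

    Returns {(a, c): [linking b terms]} for pairs that never co-occur
    directly but share at least one intermediate term.
    """
    cooccur = {}
    for terms in documents:
        for term in terms:
            cooccur.setdefault(term, set()).update(terms - {term})
    hypotheses = {}
    for a in sorted(cooccur):
        for b in sorted(cooccur[a]):
            for c in sorted(cooccur[b]):
                # Keep each unordered pair once (a < c) and skip direct links.
                if a < c and c not in cooccur[a]:
                    hypotheses.setdefault((a, c), set()).add(b)
    return {pair: sorted(links) for pair, links in hypotheses.items()}

# Placeholder literature: each set stands for the terms of one document.
TOY_DOCUMENTS = [
    {"term_a", "term_b"},
    {"term_b", "term_c"},
    {"term_c", "term_d"},
]
CANDIDATES = abc_hypotheses(TOY_DOCUMENTS)
```

Each candidate pair is only a prompt for the "testing" step the abstract describes, that is, searching the literature for supporting evidence before any empirical work.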

Springer - Publisher Connector

PubMed Central

D-Scholarship@Pitt

Knowledge Organization Research in the last two decades: 1988-2008

Authors: Ibekwe-Sanjuan Fidelia; Sanjuan Eric
Publication date: 28/02/2010

We apply an automatic topic mapping system to records of publications in knowledge organization (KO) published between 1988 and 2008. The data was collected from journals publishing articles in the KO field, drawn from the Web of Science (WoS) database. The results showed that while topics in the first decade (1988-1997) were more traditional, the second decade (1998-2008) was marked by a more technological orientation and by the appearance of more specialized topics driven by the pervasiveness of the Web environment.

arXiv.org e-Print Archive

Portail HAL Lumière Lyon 2

HAL-Lyon 3