A comparison of machine learning techniques for detection of drug target articles
Important progress in treating diseases has been possible thanks to the identification of drug targets. Drug targets are the molecular structures whose abnormal activity, associated with a disease, can be modified by drugs, improving the health of patients. The pharmaceutical industry needs to prioritize their identification and validation in order to reduce long and costly drug development times. In the last two decades, our knowledge about drugs, their mechanisms of action and drug targets has rapidly increased. Nevertheless, most of this knowledge is hidden in millions of medical articles and textbooks. Extracting knowledge from this large amount of unstructured information is a laborious job, even for human experts. Identifying drug target articles, a crucial first step toward the automatic extraction of information from texts, is the aim of this paper. Several machine learning techniques have been compared in order to obtain a satisfactory classifier for detecting drug target articles using semantic information from biomedical resources such as the Unified Medical Language System. The best result was achieved by a Fuzzy Lattice Reasoning classifier, which reaches a ROC area of 98%.
This research is supported by projects TIN2007-67407-C03-01, S-0505/TIC-0267 and MICINN project TEXT-ENTERPRISE 2.0 TIN2009-13391-C04-03 (Plan I + D + i), as well as by the Juan de la Cierva program of the MICINN of Spain.
Global disease monitoring and forecasting with Wikipedia
Infectious disease is a leading threat to public health, economic stability,
and other key social structures. Efforts to mitigate these impacts depend on
accurate and timely monitoring to measure the risk and progress of disease.
Traditional, biologically-focused monitoring techniques are accurate but costly
and slow; in response, new techniques based on social internet data such as
social media and search queries are emerging. These efforts are promising, but
important challenges in the areas of scientific peer review, breadth of
diseases and countries, and forecasting hamper their operational usefulness.
We examine a freely available, open data source for this use: access logs
from the online encyclopedia Wikipedia. Using linear models, language as a
proxy for location, and a systematic yet simple article selection procedure, we
tested 14 location-disease combinations and demonstrate that these data
feasibly support an approach that overcomes these challenges. Specifically, our
proof-of-concept yields models with $r^2$ up to 0.92, forecasting value up to
the 28 days tested, and several pairs of models similar enough to suggest that
transferring models from one location to another without re-training is
feasible.
Based on these preliminary results, we close with a research agenda designed
to overcome these challenges and produce a disease monitoring and forecasting
system that is significantly more effective, robust, and globally comprehensive
than the current state of the art.
Comment: 27 pages; 4 figures; 4 tables. Version 2: Cite McIver & Brownstein
and adjust novelty claims accordingly; revise title; various revisions for
clarity
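The abstract above mentions linear models with forecasting value at a lag of some days. A minimal sketch of that general idea, assuming a single article's daily access counts as the only predictor and ordinary least squares as the fitting method, might look like the following. The function name `fit_lagged_linear`, the synthetic numbers, and the single-predictor setup are illustrative assumptions, not the authors' actual models or data.

```python
def fit_lagged_linear(views, cases, lag):
    """Fit cases[t] ~ a * views[t - lag] + b by ordinary least squares.

    Hypothetical helper for illustration; assumes len(views) == len(cases).
    Returns the slope a, the intercept b, and the coefficient of
    determination r^2 of the fit.
    """
    x = views[: len(views) - lag]   # predictor: access counts, `lag` days earlier
    y = cases[lag:]                 # response: case counts on the target day
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = mean_y - a * mean_x
    ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return a, b, 1 - ss_res / ss_tot


# Synthetic data (made up): cases follow views exactly, two days later.
views = [10, 14, 9, 20, 17, 25, 30, 22]
cases = [0, 0, 21, 29, 19, 41, 35, 51]
slope, intercept, r2 = fit_lagged_linear(views, cases, lag=2)
```

Because the synthetic series is an exact linear function of the lagged counts, the fit here is perfect; real access logs would give a lower r^2 that varies with the chosen lag.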
Forecasting the Progression of Alzheimer's Disease Using Neural Networks and a Novel Pre-Processing Algorithm
Alzheimer's disease (AD) is the most common neurodegenerative disease in
older people. Despite considerable efforts to find a cure for AD, there is a
99.6% failure rate of clinical trials for AD drugs, likely because AD patients
cannot easily be identified at early stages. This project investigated machine
learning approaches to predict the clinical state of patients in future years
to benefit AD research. Clinical data from 1737 patients was obtained from the
Alzheimer's Disease Neuroimaging Initiative (ADNI) database and was processed
using the "All-Pairs" technique, a novel methodology created for this project
involving the comparison of all possible pairs of temporal data points for each
patient. This data was then used to train various machine learning models.
Models were evaluated using 7-fold cross-validation on the training dataset and
confirmed using data from a separate testing dataset (110 patients). A neural
network model was effective (mAUC = 0.866) at predicting the progression of AD
on a month-by-month basis, both in patients who were initially cognitively
normal and in patients suffering from mild cognitive impairment. Such a model
could be used to identify patients at early stages of AD and who are therefore
good candidates for clinical trials for AD therapeutics.
Comment: 10 pages; updated acknowledgement
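The abstract describes the "All-Pairs" technique only as comparing all possible pairs of temporal data points for each patient. One plausible reading, sketched below, turns every (earlier visit, later visit) pair into a training example: features from the earlier visit, the time gap, and the later diagnosis as the target. The record layout, the field names (`mmse`, `months_ahead`, `target`), and the example values are assumptions made for illustration, not the paper's actual pre-processing.

```python
from itertools import combinations


def all_pairs(visits):
    """Build one example per (earlier, later) visit pair for one patient.

    `visits` is a list of (month, features, diagnosis) tuples; this layout
    is a hypothetical stand-in for the real clinical records.
    """
    ordered = sorted(visits, key=lambda v: v[0])  # chronological order
    examples = []
    for (m0, feats0, _dx0), (m1, _feats1, dx1) in combinations(ordered, 2):
        examples.append({
            "features": feats0,         # measurements at the earlier visit
            "months_ahead": m1 - m0,    # how far ahead the target lies
            "target": dx1,              # diagnosis at the later visit
        })
    return examples


# Made-up patient with three visits: cognitively normal, MCI, then AD.
patient = [
    (0, {"mmse": 29}, "CN"),
    (12, {"mmse": 27}, "MCI"),
    (24, {"mmse": 22}, "AD"),
]
pairs = all_pairs(patient)
```

A patient with n visits yields n(n-1)/2 examples under this reading, so three visits give three pairs with gaps of 12, 24, and 12 months.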
People on Drugs: Credibility of User Statements in Health Communities
Online health communities are a valuable source of information for patients
and physicians. However, such user-generated resources are often plagued by
inaccuracies and misinformation. In this work we propose a method for
automatically establishing the credibility of user-generated medical statements
and the trustworthiness of their authors by exploiting linguistic cues and
distant supervision from expert sources. To this end we introduce a
probabilistic graphical model that jointly learns user trustworthiness,
statement credibility, and language objectivity. We apply this methodology to
the task of extracting rare or unknown side-effects of medical drugs, this
being one of the problems where large-scale non-expert data has the potential
to complement expert medical knowledge. We show that our method can reliably
extract side-effects and filter out false statements, while identifying
trustworthy users that are likely to contribute valuable medical information
- …