Mining Frequency of Drug Side Effects Over a Large Twitter Dataset Using Apache Spark

Author: Hsu Dennis
Publication venue: SJSU ScholarWorks
Publication date: 01/04/2017

Despite clinical trials by pharmaceutical companies and current FDA reporting systems, some drug side effects still go undetected. One way to obtain a larger sample of reports is to mine online social media. Widely used platforms such as Twitter generate massive amounts of data that can serve as reports of drug side effects. Apache Spark has become a popular choice for fast, distributed batch processing of such large datasets. In this work, we improve on previous sentiment analysis-based pipelines for mining, processing, and extracting tweets that mention drug-caused side effects. We add a new ensemble classifier that combines sentiment analysis features to increase the accuracy of identifying drug-caused side effects, and we provide a frequency count for the side effects. We also implement the same pipeline in Apache Spark, which makes tweet processing 2.5 times faster and supports large tweet datasets. Because the frequency count of drug side effects opens the door to further analysis, we present a preliminary study of this issue, including the side effects of using two drugs simultaneously and the potential danger of less-common drug combinations. We believe the pipeline design and the results presented in this work have significant implications for the study of drug side effects and for big data analysis in general.
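The frequency-count step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the author's pipeline: the lexicons `DRUGS` and `SIDE_EFFECTS`, the `tokenize` helper, and the `count_side_effects` function are hypothetical names introduced here, and the sketch counts simple co-occurrence in plain Python rather than using the ensemble classifier or Apache Spark described in the abstract.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy lexicons (assumption): a real pipeline would use
# curated drug and side-effect vocabularies.
DRUGS = {"ibuprofen", "aspirin", "xanax"}
SIDE_EFFECTS = {"nausea", "headache", "dizziness"}


def tokenize(tweet):
    """Lowercase a tweet and strip common punctuation from each token."""
    return [tok.strip(".,!?;:\"'()#@").lower() for tok in tweet.split()]


def count_side_effects(tweets):
    """Count (drug, effect) and ((drug_a, drug_b), effect) co-occurrences.

    Each tweet contributes at most once per key, because tokens are
    reduced to a set before matching.
    """
    single = Counter()  # one drug mentioned together with one effect
    pairs = Counter()   # two drugs mentioned together with one effect
    for tweet in tweets:
        tokens = set(tokenize(tweet))
        drugs = sorted(tokens & DRUGS)
        effects = sorted(tokens & SIDE_EFFECTS)
        for effect in effects:
            for drug in drugs:
                single[(drug, effect)] += 1
            for drug_a, drug_b in combinations(drugs, 2):
                pairs[((drug_a, drug_b), effect)] += 1
    return single, pairs


if __name__ == "__main__":
    sample = [
        "Took ibuprofen and got nausea.",
        "Aspirin plus ibuprofen gave me a headache!",
    ]
    single_counts, pair_counts = count_side_effects(sample)
    print(single_counts.most_common())
    print(pair_counts.most_common())
```

Co-occurrence alone does not show that a drug caused an effect, which is presumably why the described pipeline filters tweets with a classifier first. In a distributed setting, the same per-tweet counting would typically be expressed as a map step followed by a per-key reduction.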

SJSU ScholarWorks

The benefits of in silico modeling to identify possible small-molecule drugs and their off-target interactions

Author: Blomberg N
Choi SH
Hastings J
Hirschey J
Mire Zloh
Stewart B Kirton
Wang X
Publication venue: 'Future Science Ltd'
Publication date: 30/01/2019

Accepted for publication in a future issue of Future Medicinal Chemistry. Research into the use of small molecules as drugs continues to be a key driver in the development of molecular databases, computer-aided drug design software and collaborative platforms. The evolution of computational approaches is driven by the essential criteria that a drug molecule has to fulfill, from affinity to its targets to minimal side effects, while having adequate absorption, distribution, metabolism, and excretion (ADME) properties. A combination of ligand- and structure-based drug development approaches is already used to obtain consensus predictions of small-molecule activities and their off-target interactions. Further integration of these methods into easy-to-use workflows informed by systems biology could realize the full potential of available data in drug discovery and reduce the attrition of drug candidates. Peer reviewed.

Crossref

University of Hertfordshire Research Archive

People on Drugs: Credibility of User Statements in Health Communities

Author: Danescu-Niculescu-Mizil Cristian
Mukherjee Subhabrata
Weikum Gerhard
Publication date: 06/05/2017

Online health communities are a valuable source of information for patients and physicians. However, such user-generated resources are often plagued by inaccuracies and misinformation. In this work we propose a method for automatically establishing the credibility of user-generated medical statements and the trustworthiness of their authors by exploiting linguistic cues and distant supervision from expert sources. To this end we introduce a probabilistic graphical model that jointly learns user trustworthiness, statement credibility, and language objectivity. We apply this methodology to the task of extracting rare or unknown side effects of medical drugs, one of the problems where large-scale non-expert data has the potential to complement expert medical knowledge. We show that our method can reliably extract side effects and filter out false statements, while identifying trustworthy users who are likely to contribute valuable medical information.

arXiv.org e-Print Archive

MPG.PuRe

Ontology of core data mining entities

Author: A Bernstein
A Golbraikh
A Karalic
B Smith
C Silla
C Vens
D Demšar
D Kocev
D Qi
D Young
DJ Hand
F Serban
G Madjarov
G Tsoumakas
GH Bakir
H Mannila
HP Kriegel
I Slavkov
J Vanschoren
K Button
Larisa Soldatova
LN Soldatova
M Courtot
M Ford
M Žáková
MA Avery
MF López
O Spjuth
P Robinson
Panče Panov
Q Yang
R Caruana
R Guha
RD King
RR Brinkman
Sašo Džeroski
T Dietterich
V Podpečan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/07/2014

In this article, we present OntoDM-core, an ontology of core data mining entities. OntoDM-core defines the most essential data mining entities in a three-layered ontological structure comprising a specification, an implementation and an application layer. It provides a representational framework for describing the mining of structured data, and in addition provides taxonomies of datasets, data mining tasks, generalizations, data mining algorithms and constraints, based on the type of data. OntoDM-core is designed to support a wide range of applications and use cases, such as semantic annotation of data mining algorithms, datasets and results; annotation of QSAR studies in the context of drug discovery investigations; and disambiguation of terms in text mining. The ontology has been thoroughly assessed following established practices in ontology engineering, is fully interoperable with many domain resources and is easy to extend.

Crossref

Brunel University Research Archive