Evaluation of linear classifiers on articles containing pharmacokinetic evidence of drug-drug interactions
Background. Drug-drug interaction (DDI) is a major cause of morbidity and
mortality. [...] Biomedical literature mining can aid DDI research by
extracting relevant DDI signals from either the published literature or large
clinical databases. However, though drug interaction is an ideal area for
translational research, the inclusion of literature mining methodologies in DDI
workflows is still very preliminary. One area that can benefit from literature
mining is the automatic identification of a large number of potential DDIs,
whose pharmacological mechanisms and clinical significance can then be studied
via in vitro pharmacology and in populo pharmaco-epidemiology. Experiments. We
implemented a set of classifiers for identifying published articles relevant to
experimental pharmacokinetic DDI evidence. These documents are important for
identifying causal mechanisms behind putative drug-drug interactions, an
important step in the extraction of large numbers of potential DDIs. We
evaluate performance of several linear classifiers on PubMed abstracts, under
different feature transformation and dimensionality reduction methods. In
addition, we investigate the performance benefits of including various
publicly-available named entity recognition features, as well as a set of
internally-developed pharmacokinetic dictionaries. Results. We found that
several classifiers performed well in distinguishing relevant and irrelevant
abstracts. We found that the combination of unigram and bigram textual features
gave better performance than unigram features alone, and also that
normalization transforms that adjusted for feature frequency and document
length improved classification. For some classifiers, such as linear
discriminant analysis (LDA), proper dimensionality reduction had a large impact
on performance. Finally, the inclusion of NER features and dictionaries was
found not to help classification.
Comment: Pacific Symposium on Biocomputing, 201
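The feature construction described in the abstract above (unigram plus bigram features, with a transform that adjusts for feature frequency and document length) can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the whitespace tokenizer, the smoothed TF-IDF weighting, the L2 length normalization, and the three toy documents are all assumptions made for the example.

```python
import math
from collections import Counter

def ngrams(text, n_max=2):
    """Return unigram and bigram tokens for a lowercased, whitespace-split text."""
    words = text.lower().split()
    feats = list(words)
    for n in range(2, n_max + 1):
        feats += [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    return feats

def tfidf_vectors(docs):
    """Map each document to a dict of L2-normalized TF-IDF weights."""
    counts = [Counter(ngrams(d)) for d in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for c in counts for term in c)
    vectors = []
    for c in counts:
        # Smoothed IDF down-weights terms that occur in many documents.
        w = {t: tf * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0)
             for t, tf in c.items()}
        # Dividing by the L2 norm adjusts for document length.
        norm = math.sqrt(sum(v * v for v in w.values())) or 1.0
        vectors.append({t: v / norm for t, v in w.items()})
    return vectors

# Hypothetical toy documents, for illustration only.
docs = [
    "drug clearance decreased",
    "drug clearance decreased with inhibitor",
    "image segmentation network",
]
vecs = tfidf_vectors(docs)
```

The resulting sparse vectors could then be passed to any linear classifier. A term that occurs in a single document (such as "inhibitor" here) receives a larger weight than one shared across documents (such as "drug").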
Enhanced Industrial Machinery Condition Monitoring Methodology based on Novelty Detection and Multi-Modal Analysis
This paper presents a condition-based monitoring methodology based on novelty detection applied to industrial machinery. The proposed approach includes both the classical classification of multiple a priori known scenarios and the detection of new operating modes not previously available. Condition-based monitoring methodologies that can isolate unexpected scenarios are currently a trending topic, since they address the demanding requirements of future industrial process monitoring systems. First, the method performs a temporal segmentation of the available physical magnitudes and estimates a set of time-based statistical features. Then, a double feature reduction stage based on Principal Component Analysis and Linear Discriminant Analysis is applied in order to optimize the classification and novelty detection performance. The subsequent combination of a Feed-forward Neural Network and a One-Class Support Vector Machine allows the proper interpretation of known and unknown operating conditions. The effectiveness of this novel condition monitoring scheme has been verified by experimental results obtained from an automotive industry machine.
Postprint (published version
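The first stage described above (temporal segmentation followed by time-based statistical features) might look roughly like the sketch below. The abstract does not list the features used, so the particular set here (mean, standard deviation, RMS, peak, crest factor), the window width, the step, and the synthetic sine signal are assumptions chosen for illustration.

```python
import math

def segment(signal, width, step):
    """Split a sequence into fixed-width windows, advancing by `step` samples."""
    return [signal[i:i + width] for i in range(0, len(signal) - width + 1, step)]

def time_features(window):
    """Compute a small, assumed set of time-domain statistics for one window."""
    n = len(window)
    mean = sum(window) / n
    var = sum((x - mean) ** 2 for x in window) / n
    rms = math.sqrt(sum(x * x for x in window) / n)
    peak = max(abs(x) for x in window)
    return {
        "mean": mean,
        "std": math.sqrt(var),
        "rms": rms,
        "peak": peak,
        "crest": peak / rms if rms else 0.0,  # crest factor: peak over RMS
    }

# Hypothetical signal: a sine wave with a period of 20 samples.
signal = [math.sin(2 * math.pi * k / 20) for k in range(200)]
windows = segment(signal, width=40, step=20)
features = [time_features(w) for w in windows]
```

Each window yields one feature vector. In the methodology summarized above, vectors of this kind would then go through the PCA and LDA reduction stages before classification and novelty detection.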
VizRank: Data Visualization Guided by Machine Learning
Data visualization plays a crucial role in identifying interesting patterns in exploratory data analysis. Its use is, however, made difficult by the large number of possible data projections showing different attribute subsets that must be evaluated by the data analyst. In this paper, we introduce a method called VizRank, which is applied to classified data to automatically select the most useful data projections. VizRank can be used with any visualization method that maps attribute values to points in a two-dimensional visualization space. It assesses possible data projections and ranks them by their ability to visually discriminate between classes. The quality of class separation is estimated by computing the predictive accuracy of a k-nearest neighbor classifier on the data set consisting of the x and y positions of the projected data points and their class information. The paper introduces the method and presents experimental results which show that VizRank's ranking of projections agrees closely with subjective rankings by data analysts. The practical use of VizRank is also demonstrated by an application in the field of functional genomics.
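The ranking idea in the abstract above can be illustrated with a short sketch: score each two-dimensional projection by the accuracy of a k-nearest neighbor classifier on the projected points, then sort. This is not the VizRank implementation. It assumes plain scatterplot projections (pairs of attributes), leave-one-out evaluation, k = 3, and a hypothetical toy data set with attributes named `a`, `b`, and `n`.

```python
import math
from collections import Counter
from itertools import combinations

def knn_loo_accuracy(points, labels, k=3):
    """Leave-one-out accuracy of a k-NN classifier on 2-D points."""
    correct = 0
    for i, p in enumerate(points):
        # Distances from point i to every other point, nearest first.
        neighbours = sorted(
            (math.dist(p, q), labels[j]) for j, q in enumerate(points) if j != i
        )[:k]
        vote = Counter(lab for _, lab in neighbours).most_common(1)[0][0]
        correct += vote == labels[i]
    return correct / len(points)

def rank_projections(rows, labels, names, k=3):
    """Score every attribute pair as a scatterplot projection, best first."""
    scored = []
    for a, b in combinations(range(len(names)), 2):
        points = [(r[a], r[b]) for r in rows]
        scored.append((knn_loo_accuracy(points, labels, k), names[a], names[b]))
    return sorted(scored, reverse=True)

# Hypothetical toy data: attributes "a" and "b" separate the classes.
names = ["a", "b", "n"]
rows = [
    (0.0, 0.1, 0.5), (0.1, 0.0, 0.1), (0.2, 0.2, 0.9), (0.1, 0.2, 0.3),
    (1.0, 0.9, 0.4), (0.9, 1.0, 0.2), (1.1, 1.1, 0.8), (1.0, 1.2, 0.0),
]
labels = ["neg"] * 4 + ["pos"] * 4
ranking = rank_projections(rows, labels, names)
```

Each entry of `ranking` is a `(score, attribute, attribute)` tuple, so the first entries name the projections in which the classes are easiest to tell apart under this scoring.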
Identifying hidden contexts
In this study we investigate how to identify hidden contexts from the data in classification tasks.
Contexts are artifacts in the data that do not predict the class label directly.
For instance, in a speech recognition task, speakers might have different accents, which do not directly discriminate between the spoken words.
Identifying hidden contexts is treated as a data preprocessing task, which can help to build more accurate classifiers tailored to particular contexts and give insight into the data structure.
We present three techniques for identifying hidden contexts; each hides class label information from the input data and partitions the data using clustering techniques.
We form a collection of performance measures to ensure that the resulting contexts are valid.
We evaluate the performance of the proposed techniques on thirty real datasets.
We present a case study illustrating how the identified contexts can be used to build specialized, more accurate classifiers.
An introduction to time-resolved decoding analysis for M/EEG
The human brain is constantly processing and integrating information in order
to make decisions and interact with the world, for tasks from recognizing a
familiar face to playing a game of tennis. These complex cognitive processes
require communication between large populations of neurons. The non-invasive
neuroimaging methods of electroencephalography (EEG) and magnetoencephalography
(MEG) provide population measures of neural activity with millisecond precision
that allow us to study the temporal dynamics of cognitive processes. However,
multi-sensor M/EEG data is inherently high dimensional, making it difficult to
parse important signal from noise. Multivariate pattern analysis (MVPA) or
"decoding" methods offer vast potential for understanding high-dimensional
M/EEG neural data. MVPA can be used to distinguish between different conditions
and map the time courses of various neural processes, from basic sensory
processing to high-level cognitive processes. In this chapter, we discuss the
practical aspects of performing decoding analyses on M/EEG data as well as the
limitations of the method, and then we discuss some applications for
understanding representational dynamics in the human brain.
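As a rough illustration of what a time-resolved decoding analysis involves, the sketch below trains and tests a classifier separately at each time sample and reports one accuracy per sample. It is not taken from the chapter: the nearest-centroid classifier, the leave-one-out cross-validation, the `data[trial][sensor][time]` layout, and the synthetic two-sensor data set (with a class difference that starts at sample `onset`) are all assumptions made for the example.

```python
import random

def centroid(rows):
    """Mean sensor pattern across a list of trials."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def decode_timepoint(patterns, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier at one time sample."""
    classes = sorted(set(labels))
    correct = 0
    for i, x in enumerate(patterns):
        # Class centroids are computed without the held-out trial i.
        cents = {
            c: centroid([p for j, p in enumerate(patterns)
                         if j != i and labels[j] == c])
            for c in classes
        }
        guess = min(classes, key=lambda c: sq_dist(x, cents[c]))
        correct += guess == labels[i]
    return correct / len(patterns)

def time_resolved_decoding(data, labels):
    """data[trial][sensor][time] -> list of accuracies, one per time sample."""
    n_times = len(data[0][0])
    return [
        decode_timepoint([[sensor[t] for sensor in trial] for trial in data], labels)
        for t in range(n_times)
    ]

# Synthetic data (hypothetical): the two classes differ only from `onset` on.
rng = random.Random(0)
n_trials, n_sensors, n_times, onset = 20, 2, 10, 5
labels = ["A" if k % 2 == 0 else "B" for k in range(n_trials)]
data = []
for lab in labels:
    sign = 1.0 if lab == "A" else -1.0
    trial = []
    for s in range(n_sensors):
        pattern = sign * (3.0 if s == 0 else -3.0)
        trial.append([(pattern if t >= onset else 0.0) + rng.uniform(-1, 1)
                      for t in range(n_times)])
    data.append(trial)

accuracy = time_resolved_decoding(data, labels)
```

Plotting `accuracy` against time gives the kind of decoding time course the chapter refers to. In this synthetic example the classes are only separable from `onset` onward, so accuracy before that point reflects noise alone.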