221,771 research outputs found
One-Class Classification: Taxonomy of Study and Review of Techniques
One-class classification (OCC) algorithms aim to build classification models
when the negative class is either absent, poorly sampled or not well defined.
This unique situation constrains the learning of efficient classifiers by
defining class boundary just with the knowledge of positive class. The OCC
problem has been considered and applied under many research themes, such as
outlier/novelty detection and concept learning. In this paper we present a
unified view of the general problem of OCC by presenting a taxonomy of study
for OCC problems, which is based on the availability of training data,
algorithms used and the application domains applied. We further delve into each
of the categories of the proposed taxonomy and present a comprehensive
literature review of the OCC algorithms, techniques and methodologies with a
focus on their significance, limitations and applications. We conclude our
paper by discussing some open research problems in the field of OCC and present
our vision for future research.Comment: 24 pages + 11 pages of references, 8 figure
Cross-Modal Data Programming Enables Rapid Medical Machine Learning
Labeling training datasets has become a key barrier to building medical
machine learning models. One strategy is to generate training labels
programmatically, for example by applying natural language processing pipelines
to text reports associated with imaging studies. We propose cross-modal data
programming, which generalizes this intuitive strategy in a
theoretically-grounded way that enables simpler, clinician-driven input,
reduces required labeling time, and improves with additional unlabeled data. In
this approach, clinicians generate training labels for models defined over a
target modality (e.g. images or time series) by writing rules over an auxiliary
modality (e.g. text reports). The resulting technical challenge consists of
estimating the accuracies and correlations of these rules; we extend a recent
unsupervised generative modeling technique to handle this cross-modal setting
in a provably consistent way. Across four applications in radiography, computed
tomography, and electroencephalography, and using only several hours of
clinician time, our approach matches or exceeds the efficacy of
physician-months of hand-labeling with statistical significance, demonstrating
a fundamentally faster and more flexible way of building machine learning
models in medicine
- …