Active Transfer Learning with Zero-Shot Priors: Reusing Past Datasets for Future Tasks
How can we reuse existing knowledge, in the form of available datasets, when
solving a new and apparently unrelated target task from a set of unlabeled
data? In this work we make a first contribution to answer this question in the
context of image classification. We frame this quest as an active learning
problem and use zero-shot classifiers to guide the learning process by linking
the new task to the existing classifiers. By revisiting the dual formulation of
the adaptive SVM, we derive two basic conditions for greedily choosing only the
most relevant samples for annotation. On this basis, we propose an effective active
learning algorithm which learns the best possible target classification model
with minimum human labeling effort. Extensive experiments on two challenging
datasets show the value of our approach compared to the state-of-the-art active
learning methodologies, as well as its potential to reuse past datasets with
minimal effort for future tasks.
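As a rough illustration of the greedy selection idea (not the paper's exact dual-SVM conditions, which are not reproduced here), the following minimal sketch ranks unlabeled samples by combining the current classifier's margin uncertainty with a zero-shot prior confidence; the combination rule and the stand-in scores are assumptions for illustration only.

```python
# Minimal sketch of zero-shot-guided active selection (illustrative, not the
# paper's adaptive-SVM dual conditions): prefer samples that are uncertain
# under the current classifier but confidently scored by the zero-shot prior.
import numpy as np
from sklearn.svm import LinearSVC

def select_queries(clf, X_unlabeled, zero_shot_scores, n_queries=5):
    """Greedy utility: hypothetical combination of margin uncertainty
    (low margin = uncertain) and zero-shot prior confidence."""
    margins = np.abs(clf.decision_function(X_unlabeled))
    prior_conf = np.abs(zero_shot_scores)
    utility = prior_conf - margins
    return np.argsort(utility)[::-1][:n_queries]

# toy usage with random data
rng = np.random.default_rng(0)
X_l, y_l = rng.normal(size=(20, 8)), rng.integers(0, 2, 20)
X_u = rng.normal(size=(100, 8))
zs = rng.normal(size=100)          # stand-in zero-shot classifier scores
clf = LinearSVC().fit(X_l, y_l)
print(select_queries(clf, X_u, zs))
```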
Mining Object Parts from CNNs via Active Question-Answering
Given a convolutional neural network (CNN) that is pre-trained for object
classification, this paper proposes to use active question-answering to
semanticize neural patterns in conv-layers of the CNN and mine part concepts.
For each part concept, we mine neural patterns in the pre-trained CNN, which
are related to the target part, and use these patterns to construct an And-Or
graph (AOG) to represent a four-layer semantic hierarchy of the part. As an
interpretable model, the AOG associates different CNN units with different
explicit object parts. We use active human-computer communication to
incrementally grow such an AOG on the pre-trained CNN, as follows. We allow the
computer to actively identify objects whose neural patterns cannot be
explained by the current AOG. Then, the computer asks a human about the
unexplained objects, and uses the answers to automatically discover certain CNN
patterns corresponding to the missing knowledge. We incrementally grow the AOG
to encode new knowledge discovered during the active-learning process. In
experiments, our method exhibits high learning efficiency. Our method uses
about 1/6-1/3 of the part annotations for training, but achieves similar or
better part-localization performance than Fast R-CNN methods. Comment: Published in CVPR 2017.
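The question-answering loop can be sketched schematically as below. The AOG, pattern mining, and CNN internals are stubbed out as placeholder callables, so only the query-selection logic (ask about the objects the current model explains worst) is shown; names and thresholds are assumptions.

```python
# Schematic sketch of the active question-answering loop described above.
# explain_score, ask_human, and grow_aog are placeholders for the paper's
# components (AOG scoring, human annotation, AOG growth).
def active_qa_loop(objects, explain_score, ask_human, grow_aog,
                   budget=10, tau=0.5):
    for _ in range(budget):
        # pick the object least explained by the current AOG
        obj = min(objects, key=explain_score)
        if explain_score(obj) >= tau:   # everything explained well enough
            break
        answer = ask_human(obj)         # human answers a part question
        grow_aog(obj, answer)           # encode the new knowledge in the AOG

# toy usage: growing the "AOG" just raises the explanation score
scores = {"cat": 0.2, "dog": 0.8, "bird": 0.1}
active_qa_loop(list(scores),
               explain_score=lambda o: scores[o],
               ask_human=lambda o: f"part annotation for {o}",
               grow_aog=lambda o, a: scores.__setitem__(o, 1.0))
print(scores)
```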
Learning in Unlabelled Networks – An Active Learning and Inference Approach
The task of determining the labels of all network nodes based on knowledge of the network structure and the labels of some training subset of nodes is called within-network classification. It may happen that none of the node labels is known and, additionally, there is no information about the number of classes to which the nodes can be assigned. In such a case, a subset of nodes has to be selected for initial label acquisition. The question that arises is: "labels of which nodes should be collected and used for learning in order to provide the best classification accuracy for the whole network?". Active learning and inference is a practical framework to study this problem.
A set of methods for active learning and inference for within-network classification is proposed and validated. The first step of the process is to compute a utility score for each node based on the network structure; the scores are then used to rank the nodes. Based on the ranking, a set of nodes for which the labels are acquired is selected (e.g. by taking the top or bottom N from the ranking). The new measure-neighbour methods proposed in the paper acquire labels not of the ranked nodes themselves but of their neighbours. The paper examines 29 distinct formulations of utility scores and selection methods, reporting their impact on the results of two collective classification algorithms: the Iterative Classification Algorithm and Loopy Belief Propagation.
We advocate that the accuracy of the presented methods depends on the structural properties of the examined network. We claim that measure-neighbour methods will work better than the regular methods for networks with a higher clustering coefficient, and worse for networks with a low clustering coefficient. Based on the clustering coefficient, our hypothesis thus allows us to recommend an appropriate active learning and inference method.
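A minimal sketch of the two selection families follows, using betweenness centrality as one possible utility score; the paper studies 29 formulations, so this specific score and the neighbour tie-breaking rule are illustrative assumptions.

```python
# Minimal sketch of structure-based node selection for label acquisition,
# including a "measure-neighbour" variant: query a neighbour of each
# top-ranked node instead of the node itself.
import networkx as nx

def top_n(G, score, n):
    """Rank nodes by a structural utility score and take the top N."""
    return sorted(G.nodes, key=lambda v: score[v], reverse=True)[:n]

def measure_neighbour_top_n(G, score, n):
    """For each top-ranked node, acquire the label of its best-scoring
    neighbour instead of the node itself."""
    selected = []
    for v in top_n(G, score, n):
        nbrs = list(G.neighbors(v))
        if nbrs:
            selected.append(max(nbrs, key=lambda u: score[u]))
    return selected

G = nx.karate_club_graph()
score = nx.betweenness_centrality(G)   # one possible utility score
print(top_n(G, score, 3))
print(measure_neighbour_top_n(G, score, 3))
```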
Active Learning with Partial Feedback
While many active learning papers assume that the learner can simply ask for
a label and receive it, real annotation often presents a mismatch between the
form of a label (say, one among many classes) and the form of an annotation
(typically yes/no binary feedback). To annotate example corpora for multiclass
classification, we might need to ask multiple yes/no questions, exploiting a
label hierarchy if one is available. To address this more realistic setting, we
propose active learning with partial feedback (ALPF), where the learner must
actively choose both which example to label and which binary question to ask.
At each step, the learner selects an example, asking if it belongs to a chosen
(possibly composite) class. Each answer eliminates some classes, leaving the
learner with a partial label. The learner may then either ask more questions
about the same example (until an exact label is uncovered) or move on
immediately, leaving the first example partially labeled. Active learning with
partial labels requires (i) a sampling strategy to choose (example, class)
pairs, and (ii) learning from partial labels between rounds. Experiments on
Tiny ImageNet demonstrate that our most effective method achieves a 26%
relative improvement in top-1 classification accuracy over i.i.d. baselines and
standard active learners, given 30% of the annotation budget that would
(naively) be required to annotate the dataset. Moreover, ALPF learners fully
annotate Tiny ImageNet at 42% lower cost. Surprisingly, we observe that
accounting for per-example annotation costs can alter the conventional wisdom
that active learners should solicit labels for hard examples. Comment: ICLR 2019.
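A minimal sketch of the partial-feedback annotation loop for a single example follows; the halving question policy is a simple stand-in for the learned strategies in the paper, which also actively choose which example to query.

```python
# Minimal sketch of partial-feedback annotation: each yes/no question about a
# (possibly composite) class eliminates labels until exactly one remains.
def annotate(candidates, oracle):
    """candidates: possible labels; oracle(subset) -> True iff the example's
    true label lies in the subset. Returns the exact label."""
    candidates = set(candidates)
    while len(candidates) > 1:
        # ask about half the remaining classes (binary-search-style policy)
        query = set(sorted(candidates)[: len(candidates) // 2])
        if oracle(query):          # "yes": label is inside the subset
            candidates = query
        else:                      # "no": eliminate the whole subset
            candidates -= query
    return candidates.pop()

labels = ["cat", "dog", "frog", "ship"]
print(annotate(labels, oracle=lambda s: "frog" in s))  # -> 'frog'
```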
Implicitly Constrained Semi-Supervised Linear Discriminant Analysis
Semi-supervised learning is an important and active topic of research in
pattern recognition. For classification using linear discriminant analysis
specifically, several semi-supervised variants have been proposed. Using any
one of these methods is not guaranteed to outperform the supervised classifier,
which does not take the additional unlabeled data into account. In this work, we
compare traditional Expectation Maximization type approaches for
semi-supervised linear discriminant analysis with approaches based on intrinsic
constraints and propose a new principled approach for semi-supervised linear
discriminant analysis using so-called implicit constraints. We explore the
relationships between these methods and consider whether, and in what sense,
we can expect an improvement in performance over the supervised procedure.
The constraint-based approaches are more robust to misspecification of the
model and, in terms of the log-likelihood of unseen objects, may outperform
alternatives that make stronger assumptions about the data. Comment: 6 pages, 3 figures, and 3 tables. International Conference on Pattern Recognition (ICPR) 2014, Stockholm, Sweden.
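For concreteness, here is a minimal sketch of the Expectation Maximization type baseline the paper compares against (not the implicitly constrained method itself): two Gaussian classes with a shared covariance, where the responsibilities of the unlabeled points are re-estimated each iteration.

```python
# Minimal sketch of EM-style semi-supervised LDA (the EM baseline, not the
# implicitly constrained variant): two classes, shared covariance matrix.
import numpy as np

def em_lda(X_l, y_l, X_u, n_iter=20):
    X = np.vstack([X_l, X_u])
    n_l = len(X_l)
    resp = np.zeros((len(X), 2))
    resp[np.arange(n_l), y_l] = 1.0        # labeled responsibilities are fixed
    resp[n_l:] = 0.5                       # unlabeled start uniform
    for _ in range(n_iter):
        # M-step: priors, class means, shared covariance from responsibilities
        pi = resp.sum(0) / len(X)
        mu = (resp.T @ X) / resp.sum(0)[:, None]
        diffs = [X - mu[k] for k in range(2)]
        cov = sum((resp[:, k, None] * diffs[k]).T @ diffs[k]
                  for k in range(2)) / len(X)
        cov += 1e-6 * np.eye(X.shape[1])   # numerical stabiliser
        # E-step: update responsibilities of the unlabeled points only
        inv = np.linalg.inv(cov)
        logp = np.stack([np.log(pi[k])
                         - 0.5 * np.einsum('ij,jk,ik->i', diffs[k], inv, diffs[k])
                         for k in range(2)], axis=1)
        post = np.exp(logp - logp.max(1, keepdims=True))
        post /= post.sum(1, keepdims=True)
        resp[n_l:] = post[n_l:]
    return pi, mu, cov

# toy usage with two overlapping Gaussian classes
rng = np.random.default_rng(1)
X_l = np.vstack([rng.normal(-1, 1, (5, 2)), rng.normal(1, 1, (5, 2))])
y_l = np.array([0] * 5 + [1] * 5)
X_u = rng.normal(0, 1.5, (50, 2))
pi, mu, cov = em_lda(X_l, y_l, X_u)
print(mu)
```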
Interpretation of Neural Networks is Fragile
In order for machine learning to be deployed and trusted in many
applications, it is crucial to be able to reliably explain why the machine
learning algorithm makes certain predictions. For example, if an algorithm
classifies a given pathology image to be a malignant tumor, then the doctor may
need to know which parts of the image led the algorithm to this classification.
How to interpret black-box predictors is thus an important and active area of
research. A fundamental question is: how much can we trust the interpretation
itself? In this paper, we show that interpretation of deep learning predictions
is extremely fragile in the following sense: two perceptually indistinguishable
inputs with the same predicted label can be assigned very different
interpretations. We systematically characterize the fragility of several
widely-used feature-importance interpretation methods (saliency maps, relevance
propagation, and DeepLIFT) on ImageNet and CIFAR-10. Our experiments show that
even small random perturbations can change the feature importance, and new
systematic perturbations can lead to dramatically different interpretations
without changing the label. We extend these results to show that
interpretations based on exemplars (e.g. influence functions) are similarly
fragile. Our analysis of the geometry of the Hessian matrix gives insight into
why fragility could be a fundamental challenge to current interpretation
approaches. Comment: Published as a conference paper at AAAI 2019.
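A minimal sketch of the fragility measurement itself (not an attack): compute a gradient-times-input saliency, apply a tiny random perturbation, and report the overlap of the top-k features. A linear model stands in for the deep networks studied in the paper, so high overlap is expected here; the paper's point is that deep networks behave very differently.

```python
# Minimal sketch of the fragility metric: top-k feature overlap between the
# saliency of an input and of a slightly perturbed copy with the same label.
import numpy as np

rng = np.random.default_rng(0)
d = 100
w = rng.normal(size=d)                 # stand-in "trained" linear classifier
x = rng.normal(size=d)

def saliency(x):
    # gradient of the class score w @ x w.r.t. x, times the input
    return w * x

def top_k(s, k=10):
    return set(np.argsort(np.abs(s))[::-1][:k])

x_pert = x + 0.01 * rng.normal(size=d)             # tiny random perturbation
same_label = np.sign(w @ x) == np.sign(w @ x_pert)
overlap = len(top_k(saliency(x)) & top_k(saliency(x_pert))) / 10
print(f"label unchanged: {same_label}, top-10 overlap: {overlap:.2f}")
```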
Interpretable machine learning for inferring the phase boundaries in a nonequilibrium system
Still under debate is the question of whether machine learning is capable of
going beyond black-box modeling for complex physical systems. We investigate
the generalization and interpretability properties of learning algorithms. To
this end, we use supervised and unsupervised learning to infer the phase
boundaries of the active Ising model, starting from an ensemble of
configurations of the system. We illustrate that unsupervised learning
techniques are powerful at identifying the phase boundaries in the control
parameter space, even in situations of phase coexistence. It is demonstrated
that supervised learning with neural networks is capable of learning the
characteristics of the phase diagram, such that the knowledge obtained at a
limited set of control variables can be used to determine the phase boundaries
across the phase diagram. In this way, we show that properly designed
supervised learning provides predictive power to regions in the phase diagram
that are not included in the training phase of the algorithm. We stress the
importance of introducing interpretability methods in order to perform a
physically relevant classification of the phases with deep learning.
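A minimal sketch of the unsupervised step on synthetic data: PCA over spin configurations sampled across a control parameter, where a drop in the leading-component magnitude marks the phase boundary. The toy ordered/disordered generator below is an assumption standing in for real active Ising model configurations.

```python
# Minimal sketch of unsupervised phase detection via PCA on synthetic spin
# configurations: the ordered phase yields large leading-component values,
# the disordered phase small ones, so the boundary shows up as a drop.
import numpy as np

rng = np.random.default_rng(0)
params = np.linspace(0, 1, 21)         # control parameter (e.g. noise level)
configs, labels = [], []
for p in params:
    for _ in range(20):
        if p < 0.5:                    # toy "ordered" phase: aligned spins
            s = np.ones(64) * rng.choice([-1, 1])
            flip = rng.random(64) < p * 0.2
            s[flip] *= -1
        else:                          # toy "disordered" phase: random spins
            s = rng.choice([-1, 1], 64).astype(float)
        configs.append(s)
        labels.append(p)
X = np.array(configs)
X -= X.mean(0)
# leading principal component via SVD
_, _, Vt = np.linalg.svd(X, full_matrices=False)
pc1 = np.abs(X @ Vt[0])
for p in params[::5]:
    mask = np.isclose(labels, p)
    print(f"param {p:.2f}: mean |PC1| = {pc1[mask].mean():.2f}")
```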