Active learning with committees: an approach to efficient learning in text categorization using linear threshold algorithms
We developed and investigated machine learning methods that require
minimal preprocessing of the input data, use few training examples, run fast, and
still obtain high levels of accuracy.
Most approaches to designing machine learning programs are based on the
supervised learning paradigm – training examples are chosen randomly and given
to the learner. We explore the "active learning" paradigm – the learner
automatically selects the most informative training examples. Our domain of
interest is text categorization, but most of the methods developed are quite general.
The purpose of text categorization is to assign each document in a collection
to appropriate categories. Most existing text categorization methods require large
amounts of time to prepare the documents for learning and large numbers of
examples for training. Humans must assign correct categories to documents before
they can be used for training; this costs time and money. Our goal is to develop
machine learning methods that, when compared to other methods currently available, are more efficient in time and space, use fewer training documents, and
are as accurate.
We developed the Active Learning with Committees (ALC) framework –
inspired by the Query by Committee approach of Freund, Seung, et al. A
"committee" is a group of learners that jointly participate in learning and in
predicting the classes of new examples. We perform minimal preprocessing of the
documents and thus the domain is noisy, high dimensional, and has large numbers
of irrelevant attributes. We use linear threshold learning algorithms to obtain
computational efficiency with respect to these large numbers of attributes, with
specific algorithms being chosen because they also generalize well when large
numbers of attributes are irrelevant.
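As an illustrative sketch (the particular learner and selection criterion here are stand-ins, not the thesis's exact algorithms), a committee of perceptrons, which are simple linear threshold learners, can query the unlabeled example on which its members disagree most:

```python
import numpy as np

def train_perceptron(X, y, epochs=10, seed=0):
    """Train a linear threshold unit (perceptron) on {-1, +1} labels."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            if y[i] * (X[i] @ w) <= 0:   # misclassified (or on the boundary)
                w += y[i] * X[i]
    return w

def select_query(committee, pool):
    """Return the index of the pool example the committee disagrees on most."""
    votes = np.array([np.sign(pool @ w) for w in committee])  # (members, n)
    agreement = np.abs(votes.sum(axis=0))   # small |sum| = split vote
    return int(np.argmin(agreement))

# Hypothetical usage: a 4-member committee trained with different shuffles
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 10))
y = np.where(X[:, 0] > 0, 1, -1)
labeled = rng.choice(200, 20, replace=False)
committee = [train_perceptron(X[labeled], y[labeled], seed=s) for s in range(4)]
query = select_query(committee, X)   # next example to send to a human labeler
```

The selected example would then be labeled and added to the training set, and the committee retrained, which is the loop that lets active learners reach supervised-level accuracy with far fewer labeled documents.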
We developed and analyzed several ALC systems. Our results show that it is
possible to design active learning systems that scale up to large numbers of features
and obtain accuracies comparable to the supervised learning methods while using
an order of magnitude fewer examples and an order of magnitude less time. The
ALC methods developed have run times on the order of seconds, typically use only
5–7% of the training documents, and are as accurate as their supervised
counterparts.
Nonparametric weighted stochastic block models
We present a Bayesian formulation of weighted stochastic block models that
can be used to infer the large-scale modular structure of weighted networks,
including their hierarchical organization. Our method is nonparametric, and
thus does not require prior knowledge of the number of groups or other
dimensions of the model, which are instead inferred from data. We give a
comprehensive treatment of different kinds of edge weights (i.e. continuous or
discrete, signed or unsigned, bounded or unbounded), as well as arbitrary
weight transformations, and describe an unsupervised model selection approach
to choose the best network description. We illustrate the application of our
method to a variety of empirical weighted networks, such as global migrations,
voting patterns in congress, and neural connections in the human brain.
Comment: 19 pages, 11 figures. Code is freely available as part of graph-tool
at https://graph-tool.skewed.de . See also the HOWTO at
https://graph-tool.skewed.de/static/doc/demos/inference/inference.htm
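As a toy illustration of the modelling idea only (not the paper's nonparametric Bayesian inference, which is implemented in graph-tool at the URL above), one can score a candidate partition of a weighted network by the likelihood of its edge weights under per-block distributions; here exponential weights, one rate per pair of groups, are assumed:

```python
import numpy as np

def block_exponential_loglik(W, b, B):
    """Log-likelihood of edge weights W (symmetric, zero = no edge) under a
    simple weighted SBM: weights between groups r and s are modelled as
    i.i.d. Exponential(lambda_rs), each rate fit by maximum likelihood."""
    ll = 0.0
    for r in range(B):
        for s in range(r, B):
            mask = np.outer(b == r, b == s)
            if r == s:
                mask = np.triu(mask, 1)   # count each within-group pair once
            w = W[mask]
            w = w[w > 0]                  # only existing edges carry weights
            if len(w) == 0:
                continue
            lam = 1.0 / w.mean()          # exponential-rate MLE
            ll += len(w) * np.log(lam) - lam * w.sum()
    return ll

# Toy check: the true two-block partition scores higher than a shuffled one
rng = np.random.default_rng(0)
n = 40
b_true = np.array([0] * 20 + [1] * 20)
W = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        mean = 5.0 if b_true[i] == b_true[j] else 0.5   # heavier within groups
        W[i, j] = W[j, i] = rng.exponential(mean)
ll_true = block_exponential_loglik(W, b_true, B=2)
ll_shuffled = block_exponential_loglik(W, rng.permutation(b_true), B=2)
```

The paper's method goes well beyond this sketch: it selects the number of groups, the weight model, and the hierarchical structure automatically via Bayesian model selection.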
Deep Structured Models for Large Scale Object Co-detection and Segmentation
Structured decisions are required for a wide variety of
image and scene understanding tasks in computer vision, among
them object detection, localization, and semantic
segmentation. Structured prediction addresses such tasks by
learning the inherent structure of the output, incorporating
contextual information from several images and multiple tasks.
However, it becomes very challenging on large-scale image
datasets, where performance is limited by high computational
costs and by the expressive power of the underlying
representation learning techniques. In this thesis,
we present efficient and effective deep structured models for
context-aware object detection, co-localization and
instance-level semantic segmentation.
First, we introduce a principled formulation for object
co-detection using a fully-connected conditional random field
(CRF). We build an explicit graph whose vertices represent object
candidates (instead of pixel values) and edges encode the object
similarity via simple, yet effective pairwise potentials. More
specifically, we design a weighted mixture of Gaussian kernels
for class-specific object similarity, and formulate kernel
weights estimation as a least-squares regression problem. Its
solution can therefore be obtained in closed-form. Furthermore,
in contrast with traditional co-detection approaches, it has been
shown that inference in such fully-connected CRFs can be
performed efficiently using an approximate mean-field method with
high-dimensional Gaussian filtering. This lets us effectively
leverage information in multiple images.
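The closed-form kernel-weight estimation described above can be sketched as follows (the bandwidths, feature dimensions, and target similarity here are illustrative assumptions, not the thesis's configuration): stacking the Gaussian kernels into a design matrix turns weight estimation into an ordinary least-squares problem.

```python
import numpy as np

def gaussian_kernels(F, bandwidths):
    """Stack of Gaussian kernels over pairwise distances of candidate
    features F (n, d); returns an (m, n, n) array, one kernel per bandwidth."""
    d2 = ((F[:, None, :] - F[None, :, :]) ** 2).sum(-1)
    return np.stack([np.exp(-d2 / (2.0 * h ** 2)) for h in bandwidths])

def fit_kernel_weights(K, S):
    """Least-squares fit of mixture weights w with sum_m w_m K_m ~ S,
    where S is a target similarity matrix; the solution is closed-form."""
    A = K.reshape(K.shape[0], -1).T        # (n*n, m) design matrix
    w, *_ = np.linalg.lstsq(A, S.ravel(), rcond=None)
    return w

# Toy recovery: a target built from known mixture weights is recovered
rng = np.random.default_rng(0)
F = rng.standard_normal((15, 4))
K = gaussian_kernels(F, bandwidths=[0.5, 2.0])
S = 0.3 * K[0] + 0.7 * K[1]
w = fit_kernel_weights(K, S)
```

Because the model is linear in the kernel weights, the normal equations give the exact minimizer in one solve, which is what makes the estimation step cheap inside the co-detection pipeline.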
Next, we extend our class-specific co-detection framework to
multiple object categories. We model object candidates with rich,
high-dimensional features learned using a deep convolutional
neural network. In particular, our max-margin and direct-loss
structural boosting algorithms enable us to learn the most
suitable features that best encode pairwise similarity
relationships within our CRF framework. Furthermore, this
formulation guarantees that the time and space complexity is
O(nt), where n is the total number of candidate boxes in the
pool and t is the number of mean-field iterations.
Moreover, our experiments evidence the importance of learning
rich similarity measures to account for the contextual relations
across object classes and instances. However, all these methods
rely on precomputed object candidates (or proposals), so
localization performance is limited by the quality of the
bounding boxes.
To address this, we present an efficient object proposal
co-generation technique that leverages the collective power of
multiple images. In particular, we design a deep neural network
layer that takes unary and pairwise features as input, builds a
fully-connected CRF and produces mean-field marginals as output.
It also lets us backpropagate the gradient through the entire network
by unrolling the iterations of CRF inference. Furthermore, this
layer simplifies the end-to-end learning, thus effectively
benefiting from multiple candidates to co-generate high-quality
object proposals.
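A minimal sketch of such an unrolled mean-field loop, assuming a simple attractive identity compatibility between labels (the thesis's layer learns its potentials and runs inside a deep network, so the concrete numbers and update rule below are placeholder assumptions):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def meanfield_crf(unary, pairwise, iters=5):
    """Unrolled mean-field inference for a fully-connected CRF.
    unary: (n, L) log-potentials; pairwise: (n, n) candidate similarity.
    Similar candidates pull each other toward the same label; because the
    loop has a fixed length, gradients can flow through every iteration
    when it is embedded as a network layer."""
    q = softmax(unary)
    for _ in range(iters):
        msg = pairwise @ q           # aggregate beliefs of similar candidates
        q = softmax(unary + msg)     # re-normalize per candidate
    return q

# Toy demo: weak unaries are pulled toward their cluster's confident label
unary = np.array([[2., 0.], [2., 0.], [.1, 0.], [0., 2.], [0., 2.], [0., .1]])
pairwise = np.zeros((6, 6))
pairwise[:3, :3] = 2.0               # candidates 0-2 look alike
pairwise[3:, 3:] = 2.0               # candidates 3-5 look alike
np.fill_diagonal(pairwise, 0.0)
q = meanfield_crf(unary, pairwise)
```

After a few iterations, the uncertain candidates (indices 2 and 5) adopt the label their similar, confident neighbours agree on, which is the effect that lets multiple images co-generate consistent proposals.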
Finally, we develop a multi-task strategy to jointly learn object
detection, localization and instance-level semantic segmentation
in a single network. In particular, we introduce a novel
representation based on the distance transform of the object
masks. To this end, we design a new residual-deconvolution
architecture that infers such a representation and decodes it
into the final binary object mask. We show that the predicted
masks can go beyond the scope of the bounding boxes and that the
multiple tasks can benefit from each other.
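The distance-transform representation can be illustrated with a brute-force encode/decode round trip (in the thesis the distance map is predicted by the residual-deconvolution network rather than computed from a known mask; this sketch only shows why the representation is decodable):

```python
import numpy as np

def distance_transform(mask):
    """Brute-force Euclidean distance transform: each foreground pixel gets
    its distance to the nearest background pixel (background stays 0)."""
    fg = np.argwhere(mask)
    bg = np.argwhere(~mask)
    dt = np.zeros(mask.shape)
    for i, j in fg:
        dt[i, j] = np.sqrt(((bg - (i, j)) ** 2).sum(axis=1)).min()
    return dt

def decode_mask(dt, thresh=0.0):
    """Decode a (possibly predicted, noisy) distance map to a binary mask."""
    return dt > thresh

# Round trip on a square object: decoding recovers the mask exactly, and
# interior pixels carry larger values than pixels near the boundary
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True
dt = distance_transform(mask)
```

Thresholding makes decoding robust to small prediction errors in the map, and since the map is defined on the whole image grid, the decoded mask is not confined to a candidate bounding box.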
In summary, in this thesis, we exploit the joint power of
multiple images as well as multiple tasks to improve
generalization performance of structured learning. Our novel deep
structured models, similarity learning techniques and
residual-deconvolution architecture can be used to make accurate
and reliable inference for key vision tasks. Furthermore, our
quantitative and qualitative experiments on large scale
challenging image datasets demonstrate the superiority of the
proposed approaches over the state-of-the-art methods.