Automatic annotation for weakly supervised learning of detectors
Object detection in images and action detection in videos are among the most widely studied
computer vision problems, with applications in consumer photography, surveillance, and automatic
media tagging. Typically, such detectors are fully supervised, that is, they require
a large body of training data where the locations of the objects/actions in images/videos have
been manually annotated. With the emergence of digital media, and the rise of high-speed internet,
raw images and video are available for little to no cost. However, the manual annotation
of object and action locations remains tedious, slow, and expensive. As a result, there has been
great interest in training detectors with weak supervision, where only the presence or absence
of an object/action in an image/video is needed, not its location. This thesis presents approaches for
weakly supervised learning of object/action detectors with a focus on automatically annotating
object and action locations in images/videos using only binary weak labels indicating the presence
or absence of an object/action in each image/video.
First, a framework for weakly supervised learning of object detectors in images is presented.
In the proposed approach, a variation of the multiple instance learning (MIL) technique is used to
automatically annotate object locations in weakly labelled data; unlike existing approaches, it
fuses inter-class and intra-class cues to obtain the initial annotation. The initial
annotation is then used to start an iterative process in which standard object detectors are used to
refine the location annotation. Finally, to ensure that the iterative training of detectors does not drift
from the object of interest, a scheme for detecting model drift is also presented. Furthermore,
unlike most other methods, our weakly supervised approach is evaluated on data without manual
pose (object orientation) annotation.
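For orientation, the overall loop described above can be caricatured in a few lines of Python. Everything below (the synthetic region features, the trivial difference-of-means "detector", and the monotonicity-based drift test) is a simplified stand-in chosen for illustration, not the thesis's actual components.

# Minimal sketch of an iterative weakly supervised localisation loop
# (illustration only; features, detector, and drift test are toy stand-ins).
import numpy as np

rng = np.random.default_rng(0)

def train_detector(positives, negatives):
    """Fit a trivial linear 'detector': normalised difference of class means."""
    w = positives.mean(axis=0) - negatives.mean(axis=0)
    return w / (np.linalg.norm(w) + 1e-9)

def relocalise(image_regions, w):
    """Pick the highest-scoring candidate region in each positive image."""
    return [regions[np.argmax(regions @ w)] for regions in image_regions]

# Synthetic weakly labelled data: each positive image is a set of candidate
# region feature vectors, one of which (unknown to us) contains the object.
n_pos, n_neg, n_regions, d = 20, 20, 15, 32
object_proto = rng.normal(size=d)
pos_images = []
for _ in range(n_pos):
    regions = rng.normal(size=(n_regions, d))
    regions[rng.integers(n_regions)] += 3.0 * object_proto   # hidden object region
    pos_images.append(regions)
neg_regions = rng.normal(size=(n_neg * n_regions, d))         # background only

# Initial annotation: here just a random region per positive image
# (the thesis instead derives it from inter-/intra-class cues or NegMine).
current = [regions[rng.integers(len(regions))] for regions in pos_images]

prev_score = -np.inf
for it in range(10):
    w = train_detector(np.stack(current), neg_regions)
    current = relocalise(pos_images, w)
    score = float(np.mean([r @ w for r in current]))
    if score < prev_score:          # crude drift check: objective stopped improving
        print(f"iteration {it}: possible drift, stopping")
        break
    prev_score = score
    print(f"iteration {it}: mean detection score {score:.3f}")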
Second, an analysis of the initial annotation of objects, using inter-class and intra-class cues,
is carried out. From the analysis, a new method based on negative mining (NegMine) is presented
for the initial annotation of both object and action data. The NegMine-based approach is a
much simpler formulation, using only an inter-class measure and requiring no complex combinatorial
optimisation, yet it can still match or outperform existing approaches, including the previously
presented inter-intra class cue fusion approach. Furthermore, NegMine can be fused with existing
approaches to boost their performance.
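To make the inter-class idea concrete: a generic negative-mining selection rule might score each candidate region in a positive image by its distance to regions pooled from negative images, keeping the candidate that looks least like anything seen in the negatives. The sketch below is an illustration under those assumptions, not the exact NegMine formulation.

# Minimal sketch of a negative-mining style initial annotation
# (illustration only; the distance measure and features are assumptions).
import numpy as np

def negmine_select(pos_regions, neg_regions, k=5):
    """For one weakly labelled positive image, pick the candidate region
    whose features are least like anything seen in negative images.

    pos_regions: (m, d) candidate region features from the positive image
    neg_regions: (n, d) region features pooled from negative images
    Score = mean distance to the k nearest negative regions; the region
    with the highest score is assumed to contain the object.
    """
    # Pairwise Euclidean distances between positive candidates and negatives.
    dists = np.linalg.norm(pos_regions[:, None, :] - neg_regions[None, :, :], axis=-1)
    knn = np.sort(dists, axis=1)[:, :k]          # k closest negatives per candidate
    scores = knn.mean(axis=1)                    # high = unlike any negative region
    return int(np.argmax(scores)), scores

# Toy usage
rng = np.random.default_rng(1)
neg = rng.normal(size=(200, 16))                 # background-only regions
pos = rng.normal(size=(10, 16))
pos[3] += 4.0                                    # one candidate stands out from background
idx, _ = negmine_select(pos, neg)
print("selected candidate:", idx)                # expected: 3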
Finally, the thesis will take a step back and look at the use of generic object detectors as prior
knowledge in weakly supervised learning of object detectors. These generic object detectors are
typically based on sampling saliency maps that indicate whether a pixel belongs to the background
or foreground. A new approach to generating saliency maps is presented that, unlike existing
approaches, looks beyond the current image of interest to images similar to it.
We show that our generic object proposal method can be used on its own to annotate the
weakly labelled object data with surprisingly high accuracy.
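A toy version of the "look at similar images" idea: treat colours that are common across retrieved neighbour images as background and rare colours as salient. The retrieval step, the colour-histogram background model, and the saliency score below are generic assumptions for illustration, not the method of the thesis.

# Minimal sketch of borrowing evidence from similar images when building a
# saliency map (illustration only; all modelling choices are assumptions).
import numpy as np

def saliency_from_neighbours(image, neighbour_images, bins=16):
    """Score each pixel by how rare its colour is in a pool of similar images.

    image: (H, W, 3) uint8 array; neighbour_images: list of (h, w, 3) arrays
    retrieved as visually similar to `image`. Colours common across the
    neighbours are treated as background; rare colours as salient.
    """
    pool = np.concatenate([im.reshape(-1, 3) for im in neighbour_images], axis=0)
    # Quantise colours into a joint histogram over the neighbour pool.
    q = (pool // (256 // bins)).astype(int)
    hist = np.zeros((bins, bins, bins))
    np.add.at(hist, (q[:, 0], q[:, 1], q[:, 2]), 1)
    hist /= hist.sum()
    # Saliency of a pixel = negative log-probability of its colour bin.
    qi = (image // (256 // bins)).astype(int)
    prob = hist[qi[..., 0], qi[..., 1], qi[..., 2]]
    return -np.log(prob + 1e-8)

# Toy usage with random "images"
rng = np.random.default_rng(2)
neighbours = [rng.integers(0, 128, size=(32, 32, 3), dtype=np.uint8) for _ in range(5)]
img = rng.integers(0, 128, size=(32, 32, 3), dtype=np.uint8)
img[10:20, 10:20] = 250                          # bright patch unseen in neighbours
sal = saliency_from_neighbours(img, neighbours)
print("mean saliency inside patch :", sal[10:20, 10:20].mean())
print("mean saliency outside patch:", sal.mean())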
Analytic approach for reflected Brownian motion in the quadrant
Random walks in the quarter plane are an important object both of
combinatorics and probability theory. Of particular interest for their study,
there is an analytic approach initiated by Fayolle, Iasnogorodski and Malyshev,
and further developed by the last two authors of this note. The outcomes of
this method are explicit expressions for the generating functions of interest,
asymptotic analysis of their coefficients, etc. Although there is an extensive
literature on reflected Brownian motion in the quarter plane (the continuous
counterpart of quadrant random walks), an analogue of the analytic approach has
not been fully developed in that context. The aim of this note is twofold: it
is first an extended abstract of two recent articles by the authors of this
paper, which propose such an approach; second, we compare various aspects of
the discrete and continuous analytic approaches.
Comment: 19 pages, 5 figures. Extended abstract of the papers arXiv:1602.03054
and arXiv:1604.02918, to appear in Proceedings of the 27th International
Conference on Probabilistic, Combinatorial and Asymptotic Methods for the
Analysis of Algorithms, Krakow, Poland, 4-8 July 2016. arXiv admin note: text
overlap with arXiv:1602.03054
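As background (not taken from the note itself): in a common formulation from the SRBM literature, the analytic approach starts from a kernel functional equation relating the Laplace transform of the stationary distribution to the transforms of the two boundary measures. A sketch, in the usual notation (covariance matrix \Sigma, drift \mu, reflection matrix R), valid on the domain of convergence and up to sign and normalisation conventions that vary between papers:

\[
  \gamma(\theta)\,\varphi(\theta) + \gamma_1(\theta)\,\varphi_1(\theta_2) + \gamma_2(\theta)\,\varphi_2(\theta_1) = 0,
\]
\[
  \gamma(\theta) = \tfrac{1}{2}\,\theta^{\top}\Sigma\,\theta + \mu\cdot\theta,\qquad
  \gamma_1(\theta) = R_{11}\theta_1 + R_{21}\theta_2,\qquad
  \gamma_2(\theta) = R_{12}\theta_1 + R_{22}\theta_2,
\]

where \(\varphi\) is the Laplace transform of the stationary distribution and \(\varphi_1,\varphi_2\) are the transforms of the boundary measures on the two faces of the quadrant. The discrete analytic approach manipulates the analogous equation with the kernel of the random walk in place of \(\gamma\).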
Analyzing Boltzmann Samplers for Bose-Einstein Condensates with Dirichlet Generating Functions
Boltzmann sampling is commonly used to uniformly sample objects of a
particular size from large combinatorial sets. For this technique to be
effective, one needs to prove that (1) the sampling procedure is efficient and
(2) objects of the desired size are generated with sufficiently high
probability. We use this approach to give a provably efficient sampling
algorithm for a class of weighted integer partitions related to Bose-Einstein
condensation from statistical physics. Our sampling algorithm is a
probabilistic interpretation of the ordinary generating function for these
objects, derived from the symbolic method of analytic combinatorics. Using the
Khintchine-Meinardus probabilistic method to bound the rejection rate of our
Boltzmann sampler through singularity analysis of Dirichlet generating
functions, we offer an alternative approach to analyze Boltzmann samplers for
objects with multiplicative structure.
Comment: 20 pages, 1 figure
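For the unweighted special case (ordinary integer partitions), a Boltzmann sampler with rejection takes only a few lines; the weighted Bose-Einstein class and the Dirichlet-generating-function analysis of the rejection rate treated in the paper are not reproduced here, and the tuning of the parameter x below is the classical heuristic for plain partitions.

# Minimal sketch of a Boltzmann sampler with rejection for ordinary integer
# partitions (the unweighted special case of the paper's setting).
import math
import random

def boltzmann_partition(n, x, max_tries=100000):
    """Sample a uniform random partition of n by rejection.

    Under the Boltzmann model at parameter x for the product generating
    function prod_k 1/(1 - x^k), the multiplicity of part k is an independent
    Geometric(x^k) variable; conditioning on total size n gives the uniform
    distribution on partitions of n.
    """
    for _ in range(max_tries):
        parts, total = [], 0
        for k in range(1, n + 1):          # parts larger than n always cause rejection
            p = x ** k
            m = 0
            while random.random() < p:     # multiplicity ~ Geometric(p)
                m += 1
            if m:
                parts += [k] * m
                total += k * m
            if total > n:
                break
        if total == n:
            return sorted(parts, reverse=True)
    raise RuntimeError("rejection sampler did not hit the target size")

# The classical tuning puts the expected size near n.
n = 30
x = math.exp(-math.pi / math.sqrt(6 * n))
print(boltzmann_partition(n, x))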
Training neural networks to encode symbols enables combinatorial generalization
Combinatorial generalization - the ability to understand and produce novel
combinations of already familiar elements - is considered to be a core capacity
of the human mind and a major challenge to neural network models. A significant
body of research suggests that conventional neural networks can't solve this
problem unless they are endowed with mechanisms specifically engineered for the
purpose of representing symbols. In this paper we introduce a novel way of
representing symbolic structures in connectionist terms - the vectors approach
to representing symbols (VARS), which allows training standard neural
architectures to encode symbolic knowledge explicitly at their output layers.
In two simulations, we show that neural networks not only can learn to produce
VARS representations, but in doing so they achieve combinatorial generalization
in their symbolic and non-symbolic output. This adds to other recent work that
has shown improved combinatorial generalization under specific training
conditions, and raises the question of whether specific mechanisms or training
routines are needed to support symbolic processing.
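As a purely hypothetical illustration of encoding a symbolic structure explicitly at an output layer: targets can be built by concatenating one role slot per constituent, each filled with a one-hot symbol code, and a model trained on some combinations can then be probed on a novel one. The slot scheme and the linear least-squares read-out below are assumptions for illustration, not the paper's VARS representation or its networks.

# Hypothetical illustration: explicit symbol slots as output targets, with a
# linear read-out standing in for a trained network.
import numpy as np

rng = np.random.default_rng(3)
RELATIONS = ["above", "below"]
SHAPES = ["circle", "square", "triangle"]
SYMBOLS = RELATIONS + SHAPES

def one_hot(sym):
    v = np.zeros(len(SYMBOLS))
    v[SYMBOLS.index(sym)] = 1.0
    return v

def encode(rel, a, b):
    """Target: relation(a, b) as three concatenated one-hot slots."""
    return np.concatenate([one_hot(rel), one_hot(a), one_hot(b)])

# Inputs: opaque distributed codes (fixed random embeddings) for the same tokens.
emb = {s: rng.normal(size=8) for s in SYMBOLS}
def observe(rel, a, b):
    return np.concatenate([emb[rel], emb[a], emb[b]])

all_facts = [(r, a, b) for r in RELATIONS for a in SHAPES for b in SHAPES if a != b]
held_out = ("below", "circle", "square")           # combination never seen in training
train = [f for f in all_facts if f != held_out]

X = np.stack([observe(*f) for f in train])
Y = np.stack([encode(*f) for f in train])
W, *_ = np.linalg.lstsq(X, Y, rcond=None)          # linear stand-in for a trained net

pred = observe(*held_out) @ W
slots = pred.reshape(3, len(SYMBOLS))
print([SYMBOLS[i] for i in slots.argmax(axis=1)])  # expect ['below', 'circle', 'square']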