Insights from Classifying Visual Concepts with Multiple Kernel Learning
Combining information from various image features has become a standard
technique in concept recognition tasks. However, the optimal way of fusing the
resulting kernel functions is usually unknown in practical applications.
Multiple kernel learning (MKL) techniques make it possible to determine an optimal linear
combination of such similarity matrices. Classical approaches to MKL promote
sparse mixtures. Unfortunately, so-called 1-norm MKL variants are often
observed to be outperformed by an unweighted sum kernel. The contribution of
this paper is twofold: We apply a recently developed non-sparse MKL variant to
state-of-the-art concept recognition tasks within computer vision. We provide
insights on benefits and limits of non-sparse MKL and compare it against its
direct competitors, the sum kernel SVM and the sparse MKL. We report empirical
results for the PASCAL VOC 2009 Classification and ImageCLEF2010 Photo
Annotation challenge data sets. About to be submitted to PLoS ONE.
Comment: 18 pages, 8 tables, 4 figures; format deviates from the PLoS ONE
submission format requirements for aesthetic reasons
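The linear kernel combination that the abstract above refers to can be illustrated with a minimal sketch. It assumes two toy base kernels (linear and RBF) and hand-picked weights; the weights are not learned here, and the p-norm normalization is only an assumption about how sparse (p = 1) and non-sparse (p > 1) variants constrain them. All function names are hypothetical.

```python
import numpy as np


def linear_kernel(X):
    """Gram matrix of the linear kernel k(x, y) = <x, y>."""
    return X @ X.T


def rbf_kernel(X, gamma=1.0):
    """Gram matrix of the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))  # clip tiny negative round-off


def normalize_lp(beta, p):
    """Rescale non-negative weights to unit p-norm (assumed constraint)."""
    beta = np.asarray(beta, dtype=float)
    return beta / np.linalg.norm(beta, ord=p)


def combine_kernels(kernels, beta):
    """Weighted sum K = sum_m beta_m * K_m with non-negative weights."""
    beta = np.asarray(beta, dtype=float)
    if np.any(beta < 0):
        raise ValueError("kernel weights must be non-negative")
    return sum(b * K for b, K in zip(beta, kernels))


# Toy data: 6 samples with 3 features each (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
base_kernels = [linear_kernel(X), rbf_kernel(X, gamma=0.5)]

# Uniform weights reproduce the unweighted sum kernel up to a scale factor.
uniform = normalize_lp([1.0, 1.0], p=1)
K_sum = combine_kernels(base_kernels, uniform)
```

With uniform weights the result is a rescaled sum kernel, the baseline the abstract compares against; an MKL solver would instead optimize the weights jointly with the classifier.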
Automating the Surveillance of Mosquito Vectors from Trapped Specimens Using Computer Vision Techniques
Among all animals, mosquitoes are responsible for the most deaths worldwide.
Interestingly, not all types of mosquitoes spread diseases, but rather, a
select few alone are competent enough to do so. In the case of any disease
outbreak, an important first step is surveillance of vectors (i.e., those
mosquitoes capable of spreading diseases). To do this today, public health
workers lay several mosquito traps in the area of interest. Hundreds of
mosquitoes will get trapped. Naturally, among these hundreds, taxonomists have
to identify only the vectors to gauge their density. This process today is
manual, requires complex expertise and training, and is based on visual inspection
of each trapped specimen under a microscope. It is long, stressful and
self-limiting. This paper presents an innovative solution to this problem. Our
technique assumes the presence of an embedded camera (similar to those in
smart-phones) that can take pictures of trapped mosquitoes. The proposed
techniques then process these images to automatically classify the
genus and species. Our CNN model, based on Inception-ResNet V2 and transfer
learning, yielded an overall accuracy of 80% in classifying mosquitoes when
trained on 25,867 images of 250 trapped mosquito vector specimens captured via
many smart-phone cameras. In particular, the accuracy of our model in
classifying Aedes aegypti and Anopheles stephensi mosquitoes (both of which are
deadly vectors) is amongst the highest. We present important lessons learned
and the practical impact of our techniques towards the end of the paper.
Multiple Instance Learning: A Survey of Problem Characteristics and Applications
Multiple instance learning (MIL) is a form of weakly supervised learning
where training instances are arranged in sets, called bags, and a label is
provided for the entire bag. This formulation is gaining interest because it
naturally fits various problems and makes it possible to leverage weakly labeled data.
Consequently, it has been used in diverse application fields such as computer
vision and document classification. However, learning from bags raises
important challenges that are unique to MIL. This paper provides a
comprehensive survey of the characteristics which define and differentiate the
types of MIL problems. Until now, these problem characteristics have not been
formally identified and described. As a result, the variations in performance
of MIL algorithms from one data set to another are difficult to explain. In
this paper, MIL problem characteristics are grouped into four broad categories:
the composition of the bags, the types of data distribution, the ambiguity of
instance labels, and the task to be performed. Methods specialized to address
each category are reviewed. Then, the extent to which these characteristics
manifest themselves in key MIL application areas is described. Finally,
experiments are conducted to compare the performance of 16 state-of-the-art MIL
methods on selected problem characteristics. This paper provides insight on how
the problem characteristics affect MIL algorithms, recommendations for future
benchmarking, and promising avenues for research.
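The bag-level labeling described in the abstract above can be illustrated with a minimal sketch. It assumes the common "standard" MIL assumption (a bag is positive if at least one of its instances is positive), which the abstract does not state, and it takes instance scores as given instead of learning them. The function names and the threshold are hypothetical.

```python
from typing import Sequence


def bag_score(instance_scores: Sequence[float]) -> float:
    """Max-pool instance scores into a single bag score.

    Under the assumed standard MIL assumption, the most positive
    instance determines how positive the whole bag is.
    """
    if len(instance_scores) == 0:
        raise ValueError("a bag must contain at least one instance")
    return max(instance_scores)


def predict_bag(instance_scores: Sequence[float], threshold: float = 0.5) -> bool:
    """Label a bag positive when its pooled score reaches the threshold."""
    return bag_score(instance_scores) >= threshold


# Two toy bags of instance scores (illustrative values only).
positive_bag = [0.1, 0.9, 0.2]   # one confident positive instance
negative_bag = [0.1, 0.2, 0.05]  # no instance reaches the threshold
```

Max pooling is only one choice; tasks where the bag label depends on several instances jointly would call for a different aggregation.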
Objects that Sound
In this paper our objectives are, first, networks that can embed audio and
visual inputs into a common space that is suitable for cross-modal retrieval;
and second, a network that can localize the object that sounds in an image,
given the audio signal. We achieve both these objectives by training from
unlabelled video using only audio-visual correspondence (AVC) as the objective
function. This is a form of cross-modal self-supervision from video.
To this end, we design new network architectures that can be trained for
cross-modal retrieval and localizing the sound source in an image, by using the
AVC task. We make the following contributions: (i) show that audio and visual
embeddings can be learnt that enable both within-mode (e.g. audio-to-audio) and
between-mode retrieval; (ii) explore various architectures for the AVC task,
including those for the visual stream that ingest a single image, or multiple
images, or a single image and multi-frame optical flow; (iii) show that the
semantic object that sounds within an image can be localized (using only the
sound, no motion or flow information); and (iv) give a cautionary tale on how
to avoid undesirable shortcuts in the data preparation.
Comment: Appears in: European Conference on Computer Vision (ECCV) 201
Topological exploration of artificial neuronal network dynamics
One of the paramount challenges in neuroscience is to understand the dynamics
of individual neurons and how they give rise to network dynamics when
interconnected. Historically, researchers have resorted to graph theory,
statistics, and statistical mechanics to describe the spatiotemporal structure
of such network dynamics. Our novel approach employs tools from algebraic
topology to characterize the global properties of network structure and
dynamics.
We propose a method based on persistent homology to automatically classify
network dynamics using topological features of spaces built from various
spike-train distances. We investigate the efficacy of our method by simulating
activity in three small artificial neural networks with different sets of
parameters, giving rise to dynamics that can be classified into four regimes.
We then compute three measures of spike train similarity and use persistent
homology to extract topological features that are fundamentally different from
those used in traditional methods. Our results show that a machine learning
classifier trained on these features can accurately predict the regime of the
network it was trained on and also generalize to other networks that were not
presented during training. Moreover, we demonstrate that using features
extracted from multiple spike-train distances systematically improves the
performance of our method.
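To give a sense of how topological features can be extracted from a distance matrix, here is a minimal sketch of zero-dimensional persistent homology only (connected components of a Vietoris-Rips filtration), computed with a union-find structure. It is not the authors' pipeline: the toy distance matrix stands in for a spike-train distance, higher-dimensional features are omitted, and the function name is hypothetical.

```python
import numpy as np


def h0_persistence(dist):
    """Death scales of connected components in a Rips filtration.

    Every point is born at scale 0. Edges are added in order of
    increasing distance, and a component dies whenever an edge merges
    two previously separate components. One component never dies, so
    n points yield n - 1 finite death values.
    """
    n = dist.shape[0]
    parent = list(range(n))

    def find(i):
        # Find the component root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (float(dist[i, j]), i, j) for i in range(n) for j in range(i + 1, n)
    )
    deaths = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # merge the two components
            deaths.append(weight)
    return deaths


# Toy example: three points on a line, two close together and one far away.
points = np.array([0.0, 1.0, 10.0])
toy_dist = np.abs(points[:, None] - points[None, :])
toy_deaths = h0_persistence(toy_dist)
```

In the toy example the two nearby points merge at scale 1 and the distant point joins at scale 9, so the long-lived component reflects the gap between the clusters. Summary statistics of such death values are one kind of feature that could be passed to a classifier.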