Weakly-Supervised Temporal Localization via Occurrence Count Learning
We propose a novel model for temporal detection and localization which allows
the training of deep neural networks using only counts of event occurrences as
training labels. This powerful weakly-supervised framework alleviates the
burden of the imprecise and time-consuming process of annotating event
locations in temporal data. Unlike existing methods, in which localization is
explicitly achieved by design, our model learns localization implicitly as a
byproduct of learning to count instances. This unique feature is a direct
consequence of the model's theoretical properties. We validate the
effectiveness of our approach in a number of experiments (drum hit and piano
onset detection in audio, digit detection in images) and demonstrate
performance comparable to that of fully-supervised state-of-the-art methods,
despite much weaker training requirements.
Comment: Accepted at ICML 201
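The abstract above does not spell out the training objective, so the following is only a minimal sketch of how a count label could supervise per-timestep predictions. It assumes the network outputs an independent event probability for each timestep, which makes the total count follow a Poisson-binomial distribution, and it minimizes the negative log-likelihood of the observed count. The function names `count_distribution` and `count_loss` are hypothetical and not taken from the paper.

```python
import math


def count_distribution(probs):
    """Distribution over the total number of events, assuming each
    timestep fires independently with the given probability
    (a Poisson-binomial distribution, built by dynamic programming)."""
    dist = [1.0]  # with zero timesteps, the count is 0 with probability 1
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)  # no event at this timestep
            nxt[k + 1] += mass * p      # one event at this timestep
        dist = nxt
    return dist


def count_loss(probs, observed_count):
    """Negative log-likelihood of the observed count (the only label used)."""
    dist = count_distribution(probs)
    return -math.log(dist[observed_count] + 1e-12)


# Hypothetical predictions for three timesteps, with one annotated event.
sharp = [0.99, 0.01, 0.01]    # probability mass concentrated on one timestep
diffuse = [0.33, 0.33, 0.34]  # probability mass spread over all timesteps
print(count_loss(sharp, 1), count_loss(diffuse, 1))
```

Under these assumptions, the concentrated prediction receives a lower loss than the spread-out one for the same count label, which illustrates how a counting objective can reward localized predictions without location annotations.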
ARIGAN: Synthetic Arabidopsis Plants using Generative Adversarial Network
In recent years, there has been an increasing interest in image-based plant
phenotyping, applying state-of-the-art machine learning approaches to tackle
challenging problems, such as leaf segmentation (a multi-instance problem) and
counting. Most of these algorithms need labelled data to learn a model for the
task at hand. Despite the recent release of a few plant phenotyping datasets,
large annotated plant image datasets for the purpose of training deep learning
algorithms are lacking. One common approach to alleviate the lack of training
data is dataset augmentation. Herein, we propose an alternative solution to
dataset augmentation for plant phenotyping, creating artificial images of
plants using generative neural networks. We propose the Arabidopsis Rosette
Image Generator (through) Adversarial Network: a deep convolutional network
that is able to generate synthetic rosette-shaped plants, inspired by DCGAN (a
recent adversarial network model using convolutional layers). Specifically, we
trained the network using A1, A2, and A4 of the CVPPP 2017 LCC dataset,
containing Arabidopsis thaliana plants. We show that our model is able to
generate realistic 128x128 colour images of plants. We train our network
conditioning on leaf count, such that it is possible to generate plants with a
given number of leaves, suitable, among other uses, for training regression-based
models. We propose a new Ax dataset of artificial plant images, obtained by
our ARIGAN. We evaluate this new dataset using a state-of-the-art leaf counting
algorithm, showing that the testing error is reduced when Ax is used as part of
the training data.
Comment: 8 pages, 6 figures, 1 table, ICCV CVPPP Workshop 201
Learning Visual Reasoning Without Strong Priors
Achieving artificial visual reasoning - the ability to answer image-related
questions which require a multi-step, high-level process - is an important step
towards artificial general intelligence. This multi-modal task requires
learning a question-dependent, structured reasoning process over images from
language. Standard deep learning approaches tend to exploit biases in the data
rather than learn this underlying structure, while leading methods learn to
visually reason successfully but are hand-crafted for reasoning. We show that a
general-purpose, Conditional Batch Normalization approach achieves
state-of-the-art results on the CLEVR Visual Reasoning benchmark with a 2.4%
error rate. We outperform the next best end-to-end method (4.5%) and even
methods that use extra supervision (3.1%). We probe our model to shed light on
how it reasons, showing it has learned a question-dependent, multi-step
process. Previous work has operated under the assumption that visual reasoning
calls for a specialized architecture, but we show that a general architecture
with proper conditioning can learn to visually reason effectively.
Comment: Full AAAI 2018 paper is at arXiv:1709.07871. Presented at ICML 2017's
Machine Learning in Speech and Language Processing Workshop. Code is at
http://github.com/ethanjperez/fil
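As a rough illustration of the conditioning idea named in the abstract above, the sketch below normalizes each feature across a batch and then applies a scale and shift predicted from a conditioning vector (for example, a question embedding). It is a minimal pure-Python sketch, not the authors' implementation: the linear predictors, parameter names (`w_gamma`, `b_gamma`, `w_beta`, `b_beta`), and data layout are all assumptions made for illustration.

```python
import math


def linear(x, weights, bias):
    """Plain affine map: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def conditional_batch_norm(features, conditions,
                           w_gamma, b_gamma, w_beta, b_beta, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift it with
    per-example gamma and beta predicted from the conditioning vector."""
    n, d = len(features), len(features[0])
    means = [sum(f[j] for f in features) / n for j in range(d)]
    variances = [sum((f[j] - means[j]) ** 2 for f in features) / n
                 for j in range(d)]
    out = []
    for f, c in zip(features, conditions):
        gamma = linear(c, w_gamma, b_gamma)  # conditioning-dependent scale
        beta = linear(c, w_beta, b_beta)     # conditioning-dependent shift
        out.append([
            gamma[j] * (f[j] - means[j]) / math.sqrt(variances[j] + eps)
            + beta[j]
            for j in range(d)
        ])
    return out


# Hypothetical batch: two examples, two features, one conditioning value each.
features = [[1.0, 2.0], [3.0, 6.0]]
conditions = [[1.0], [0.0]]
result = conditional_batch_norm(
    features, conditions,
    w_gamma=[[0.0], [0.0]], b_gamma=[1.0, 1.0],   # gamma fixed at 1
    w_beta=[[10.0], [10.0]], b_beta=[0.0, 0.0],   # beta = 10 * condition
)
print(result)
```

In this toy setup the two examples share the same normalization statistics but receive different shifts, because the shift depends only on each example's conditioning vector.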