1,392 research outputs found
Large Scale Evolution of Convolutional Neural Networks Using Volunteer Computing
This work presents a new algorithm called evolutionary exploration of
augmenting convolutional topologies (EXACT), which is capable of evolving the
structure of convolutional neural networks (CNNs). EXACT is in part modeled
after the neuroevolution of augmenting topologies (NEAT) algorithm, with
notable exceptions to allow it to scale to large scale distributed computing
environments and evolve networks with convolutional filters. In addition to
multithreaded and MPI versions, EXACT has been implemented as part of a BOINC
volunteer computing project, allowing large scale evolution. During a period of
two months, over 4,500 volunteered computers on the Citizen Science Grid
trained over 120,000 CNNs and evolved networks reaching 98.32% test data
accuracy on the MNIST handwritten digits dataset. These results are even
stronger as the backpropagation strategy used to train the CNNs was fairly
rudimentary (ReLU units, L2 regularization and Nesterov momentum) and these
were initial test runs done without refinement of the backpropagation
hyperparameters. Further, the EXACT evolutionary strategy is independent of the
method used to train the CNNs, so they could be further improved by advanced
techniques like elastic distortions, pretraining and dropout. The evolved
networks are also quite interesting, showing "organic" structures and
significant differences from standard human designed architectures.Comment: 17 pages, 13 figures. Submitted to the 2017 Genetic and Evolutionary
Computation Conference (GECCO 2017
Scalable Compression of Deep Neural Networks
Deep neural networks generally involve some layers with mil- lions of
parameters, making them difficult to be deployed and updated on devices with
limited resources such as mobile phones and other smart embedded systems. In
this paper, we propose a scalable representation of the network parameters, so
that different applications can select the most suitable bit rate of the
network based on their own storage constraints. Moreover, when a device needs
to upgrade to a high-rate network, the existing low-rate network can be reused,
and only some incremental data are needed to be downloaded. We first
hierarchically quantize the weights of a pre-trained deep neural network to
enforce weight sharing. Next, we adaptively select the bits assigned to each
layer given the total bit budget. After that, we retrain the network to
fine-tune the quantized centroids. Experimental results show that our method
can achieve scalable compression with graceful degradation in the performance.Comment: 5 pages, 4 figures, ACM Multimedia 201
A Deep Siamese Network for Scene Detection in Broadcast Videos
We present a model that automatically divides broadcast videos into coherent
scenes by learning a distance measure between shots. Experiments are performed
to demonstrate the effectiveness of our approach by comparing our algorithm
against recent proposals for automatic scene segmentation. We also propose an
improved performance measure that aims to reduce the gap between numerical
evaluation and expected results, and propose and release a new benchmark
dataset.Comment: ACM Multimedia 201
Optimization of Convolutional Neural Network ensemble classifiers by Genetic Algorithms
Breast cancer exhibits a high mortality rate and it is the most invasive cancer in women. An analysis from histopathological images could predict this disease. In this way, computational image processing might support this task. In this work a proposal which employes deep learning convolutional neural networks is presented. Then, an ensemble of networks is considered in order to obtain an enhanced recognition performance of the system by the consensus of the networks of the ensemble. Finally, a genetic algorithm is also considered to choose the networks that belong to the ensemble. The proposal has been tested by carrying out several experiments with a set of benchmark images.Universidad de Málaga. Campus de Excelencia Internacional AndalucĂa Tech
Spatio-Temporal Sentiment Hotspot Detection Using Geotagged Photos
We perform spatio-temporal analysis of public sentiment using geotagged photo
collections. We develop a deep learning-based classifier that predicts the
emotion conveyed by an image. This allows us to associate sentiment with place.
We perform spatial hotspot detection and show that different emotions have
distinct spatial distributions that match expectations. We also perform
temporal analysis using the capture time of the photos. Our spatio-temporal
hotspot detection correctly identifies emerging concentrations of specific
emotions and year-by-year analyses of select locations show there are strong
temporal correlations between the predicted emotions and known events.Comment: To appear in ACM SIGSPATIAL 201
Temporal Localization of Fine-Grained Actions in Videos by Domain Transfer from Web Images
We address the problem of fine-grained action localization from temporally
untrimmed web videos. We assume that only weak video-level annotations are
available for training. The goal is to use these weak labels to identify
temporal segments corresponding to the actions, and learn models that
generalize to unconstrained web videos. We find that web images queried by
action names serve as well-localized highlights for many actions, but are
noisily labeled. To solve this problem, we propose a simple yet effective
method that takes weak video labels and noisy image labels as input, and
generates localized action frames as output. This is achieved by cross-domain
transfer between video frames and web images, using pre-trained deep
convolutional neural networks. We then use the localized action frames to train
action recognition models with long short-term memory networks. We collect a
fine-grained sports action data set FGA-240 of more than 130,000 YouTube
videos. It has 240 fine-grained actions under 85 sports activities. Convincing
results are shown on the FGA-240 data set, as well as the THUMOS 2014
localization data set with untrimmed training videos.Comment: Camera ready version for ACM Multimedia 201
LoANs: Weakly Supervised Object Detection with Localizer Assessor Networks
Recently, deep neural networks have achieved remarkable performance on the
task of object detection and recognition. The reason for this success is mainly
grounded in the availability of large scale, fully annotated datasets, but the
creation of such a dataset is a complicated and costly task. In this paper, we
propose a novel method for weakly supervised object detection that simplifies
the process of gathering data for training an object detector. We train an
ensemble of two models that work together in a student-teacher fashion. Our
student (localizer) is a model that learns to localize an object, the teacher
(assessor) assesses the quality of the localization and provides feedback to
the student. The student uses this feedback to learn how to localize objects
and is thus entirely supervised by the teacher, as we are using no labels for
training the localizer. In our experiments, we show that our model is very
robust to noise and reaches competitive performance compared to a
state-of-the-art fully supervised approach. We also show the simplicity of
creating a new dataset, based on a few videos (e.g. downloaded from YouTube)
and artificially generated data.Comment: To appear in AMV18. Code, datasets and models available at
https://github.com/Bartzi/loan
Action Recognition Based on Joint Trajectory Maps Using Convolutional Neural Networks
Recently, Convolutional Neural Networks (ConvNets) have shown promising
performances in many computer vision tasks, especially image-based recognition.
How to effectively use ConvNets for video-based recognition is still an open
problem. In this paper, we propose a compact, effective yet simple method to
encode spatio-temporal information carried in skeleton sequences into
multiple images, referred to as Joint Trajectory Maps (JTM), and ConvNets
are adopted to exploit the discriminative features for real-time human action
recognition. The proposed method has been evaluated on three public benchmarks,
i.e., MSRC-12 Kinect gesture dataset (MSRC-12), G3D dataset and UTD multimodal
human action dataset (UTD-MHAD) and achieved the state-of-the-art results
Denoising Autoencoders for fast Combinatorial Black Box Optimization
Estimation of Distribution Algorithms (EDAs) require flexible probability
models that can be efficiently learned and sampled. Autoencoders (AE) are
generative stochastic networks with these desired properties. We integrate a
special type of AE, the Denoising Autoencoder (DAE), into an EDA and evaluate
the performance of DAE-EDA on several combinatorial optimization problems with
a single objective. We asses the number of fitness evaluations as well as the
required CPU times. We compare the results to the performance to the Bayesian
Optimization Algorithm (BOA) and RBM-EDA, another EDA which is based on a
generative neural network which has proven competitive with BOA. For the
considered problem instances, DAE-EDA is considerably faster than BOA and
RBM-EDA, sometimes by orders of magnitude. The number of fitness evaluations is
higher than for BOA, but competitive with RBM-EDA. These results show that DAEs
can be useful tools for problems with low but non-negligible fitness evaluation
costs.Comment: corrected typos and small inconsistencie
Joint Deep Modeling of Users and Items Using Reviews for Recommendation
A large amount of information exists in reviews written by users. This source
of information has been ignored by most of the current recommender systems
while it can potentially alleviate the sparsity problem and improve the quality
of recommendations. In this paper, we present a deep model to learn item
properties and user behaviors jointly from review text. The proposed model,
named Deep Cooperative Neural Networks (DeepCoNN), consists of two parallel
neural networks coupled in the last layers. One of the networks focuses on
learning user behaviors exploiting reviews written by the user, and the other
one learns item properties from the reviews written for the item. A shared
layer is introduced on the top to couple these two networks together. The
shared layer enables latent factors learned for users and items to interact
with each other in a manner similar to factorization machine techniques.
Experimental results demonstrate that DeepCoNN significantly outperforms all
baseline recommender systems on a variety of datasets.Comment: WSDM 201
- …