Cultural Event Recognition with Visual ConvNets and Temporal Models
This paper presents our contribution to the ChaLearn Challenge 2015 on
Cultural Event Classification. The challenge in this task is to automatically
classify images from 50 different cultural events. Our solution is based on the
combination of visual features extracted from convolutional neural networks
with temporal information using a hierarchical classifier scheme. We extract
visual features from the last three fully connected layers of both CaffeNet
(pretrained with ImageNet) and our fine tuned version for the ChaLearn
challenge. We propose a late fusion strategy that trains a separate low-level
SVM on each of the extracted neural codes. The class predictions of the
low-level SVMs form the input to a higher level SVM, which gives the final
event scores. We achieve our best result by adding a temporal refinement step
into our classification scheme, which is applied directly to the output of each
low-level SVM. Our approach penalizes high classification scores based on
visual features when the image time stamp does not match the event-specific
temporal distribution learned from the training and validation data. Our system
achieved the second best result in the ChaLearn Challenge 2015 on Cultural
Event Classification with a mean average precision of 0.767 on the test set.
Comment: Initial version of the paper accepted at the CVPR Workshop ChaLearn
Looking at People 201
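The temporal refinement step described in the abstract above can be illustrated with a minimal sketch. The abstract does not state the form of the event-specific temporal distribution, so the Gaussian over day of year, the multiplicative penalty, and the event names and statistics used here are all assumptions made for illustration, not the authors' method.

```python
import math

def temporal_weight(day_of_year, event_mean_day, event_std_days):
    """Weight in (0, 1]; equals 1 when the time stamp matches the event's typical day.

    Assumption: a Gaussian over the circular day-of-year distance (365-day year).
    """
    diff = abs(day_of_year - event_mean_day) % 365
    diff = min(diff, 365 - diff)  # wrap around the end of the year
    return math.exp(-0.5 * (diff / event_std_days) ** 2)

def refine_scores(visual_scores, day_of_year, event_stats):
    """Down-weight each non-negative visual score by its temporal weight.

    visual_scores: {event: score from a low-level classifier, assumed >= 0}
    event_stats:   {event: (mean_day, std_days)}, hypothetical learned statistics
    """
    refined = {}
    for event, score in visual_scores.items():
        mean_day, std_days = event_stats[event]
        refined[event] = score * temporal_weight(day_of_year, mean_day, std_days)
    return refined

# Hypothetical events and statistics, for illustration only.
stats = {"event_a": (45, 10.0), "event_b": (270, 10.0)}
scores = {"event_a": 0.9, "event_b": 0.8}
refined = refine_scores(scores, day_of_year=272, event_stats=stats)
```

In this sketch the visually stronger "event_a" is suppressed because day 272 is far from its assumed typical date, so "event_b" ends up with the higher refined score.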
Learning to Select Pre-Trained Deep Representations with Bayesian Evidence Framework
We propose a Bayesian evidence framework to facilitate transfer learning from
pre-trained deep convolutional neural networks (CNNs). Our framework is
formulated on top of a least squares SVM (LS-SVM) classifier, which is simple
and fast in both training and testing, and achieves competitive performance in
practice. The regularization parameters in LS-SVM are estimated automatically
without grid search and cross-validation by maximizing evidence, which is a
useful measure to select the best performing CNN out of multiple candidates for
transfer learning; the evidence is optimized efficiently by employing Aitken's
delta-squared process, which accelerates convergence of fixed point update. The
proposed Bayesian evidence framework also provides a good solution to identify
the best ensemble of heterogeneous CNNs through a greedy algorithm. Our
Bayesian evidence framework for transfer learning is tested on 12 visual
recognition datasets and consistently demonstrates state-of-the-art performance
in terms of prediction accuracy and modeling efficiency.
Comment: Appearing in CVPR-2016 (oral presentation
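Aitken's delta-squared process, mentioned in the abstract above as the way the fixed-point update is accelerated, can be sketched for a generic scalar fixed-point map. The function `g` below is a stand-in chosen for illustration; the evidence update equations of the paper are not given in the abstract and are not reproduced here.

```python
import math

def aitken_fixed_point(g, x0, tol=1e-12, max_iter=100):
    """Solve x = g(x) using Aitken's delta-squared extrapolation.

    Each pass takes two plain fixed-point steps and then extrapolates
    (this combination is also known as Steffensen's method).
    """
    x = x0
    for _ in range(max_iter):
        x1 = g(x)
        x2 = g(x1)
        denom = x2 - 2.0 * x1 + x
        if abs(denom) < 1e-15:
            # Second difference vanished: the plain iterates have converged.
            return x2
        x_new = x - (x1 - x) ** 2 / denom
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Stand-in example: the fixed point of cos(x), roughly 0.739085.
root = aitken_fixed_point(math.cos, 1.0)
```

Plain iteration of `x = cos(x)` converges only linearly, whereas the extrapolated iteration typically reaches the same tolerance in a handful of passes.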
Automated detection of block falls in the north polar region of Mars
We developed a change detection method for the identification of ice block
falls using NASA's HiRISE images of the north polar scarps on Mars. Our method
is based on a Support Vector Machine (SVM), trained using Histograms of
Oriented Gradients (HOG), and on blob detection. The SVM detects potential new
blocks between a set of images; the blob detection, then, confirms the
identification of a block inside the area indicated by the SVM and derives the
shape of the block. The results from the automatic analysis were compared with
block statistics from visual inspection. We tested our method in 6 areas
consisting of 1000x1000 pixels, where several hundreds of blocks were
identified. The results for the given test areas produced a true positive rate
of ~75% for blocks with sizes larger than 0.7 m (i.e., approx. 3 times the
available ground pixel size) and a false discovery rate of ~8.5%. Using blob
detection we also recover the size of each block to within 3 pixels of its
actual size
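The two rates quoted in the abstract above follow the standard definitions, which can be written out as a short sketch. The counts in the example are hypothetical, chosen only so the outputs land near the quoted values; they are not the paper's data.

```python
def detection_rates(true_positives, false_positives, false_negatives):
    """Return (true positive rate, false discovery rate).

    TPR = TP / (TP + FN): share of reference blocks that were detected.
    FDR = FP / (TP + FP): share of detections that are not real blocks.
    """
    reference_total = true_positives + false_negatives
    detected_total = true_positives + false_positives
    tpr = true_positives / reference_total if reference_total else float("nan")
    fdr = false_positives / detected_total if detected_total else float("nan")
    return tpr, fdr

# Hypothetical counts, for illustration only.
tpr, fdr = detection_rates(true_positives=75, false_positives=7, false_negatives=25)
```

Here the reference set plays the role of the blocks found by visual inspection, against which the automatic detections are compared.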
Visual Concepts and Compositional Voting
It is very attractive to formulate vision in terms of pattern theory
\cite{Mumford2010pattern}, where patterns are defined hierarchically by
compositions of elementary building blocks. But applying pattern theory to real
world images is currently less successful than discriminative methods such as
deep networks. Deep networks, however, are black boxes that are hard to
interpret and can easily be fooled by adding occluding objects. It is natural
to wonder whether by better understanding deep networks we can extract building
blocks which can be used to develop pattern theoretic models. This motivates us
to study the internal representations of a deep network using vehicle images
from the PASCAL3D+ dataset. We use clustering algorithms to study the
population activities of the features and extract a set of visual concepts
which we show are visually tight and correspond to semantic parts of vehicles.
To analyze this we annotate these vehicles by their semantic parts to create a
new dataset, VehicleSemanticParts, and evaluate visual concepts as unsupervised
part detectors. We show that visual concepts perform fairly well but are
outperformed by supervised discriminative methods such as Support Vector
Machines (SVM). We next give a more detailed analysis of visual concepts and
how they relate to semantic parts. Following this, we use the visual concepts
as building blocks for a simple pattern theoretical model, which we call
compositional voting. In this model several visual concepts combine to detect
semantic parts. We show that this approach is significantly better than
discriminative methods like SVM and deep networks trained specifically for
semantic part detection. Finally, we return to studying occlusion by creating
an annotated dataset with occlusion, called VehicleOcclusion, and show that
compositional voting outperforms even deep networks when the amount of
occlusion becomes large.
Comment: It is accepted by Annals of Mathematical Sciences and Application
Detection of the stroboscopic effect by young adults varying in sensitivity
The advent of LED lighting has renewed concern about the possible visual, neurobiological, and performance and cognition effects of cyclic variations in lighting system luminous flux (temporal light modulation). The stroboscopic visibility measure (SVM) characterises the temporal light modulation signal to predict the visibility of the stroboscopic effect, one of the visual perception effects of temporal light modulation. An SVM of 1 means that the average person would detect the phenomenon 50% of the time. There is little published data describing the population sensitivity to the stroboscopic effect in relation to the SVM, and none focusing on people subject to visual stress. This experiment, conducted in parallel in Canada and France, examined stroboscopic detection for horizontal and vertical moving targets when viewed under commercially available lamps varying in SVM conditions (SVM: ∼0; ∼0.4; ∼0.9; ∼1.4; ∼3.0). As expected, stroboscopic detection scores increased with increasing SVM. For the horizontal task, average scores were lower than the expected 4/8 at ∼0.90, but increased non-linearly with higher SVMs. Stroboscopic detection scores did not differ between people low and high in pattern glare sensitivity, but people in the high-pattern glare sensitivity group reported greater annoyance in the SVM ∼1.4 and ∼3.0 conditions
Application of support vector machine for classification of multispectral data
In this paper, a support vector machine (SVM) is used to classify remotely sensed multispectral satellite data. The data were recorded by the Landsat-5 TM satellite with a resolution of 30x30 m. The SVM finds the optimal separating hyperplane between classes by focusing on the training cases. The study area of Klang Valley has more than 10 land covers, and classification using SVM was completed without leaving any pixel unclassified. The training area is determined carefully by visual interpretation with the aid of the reference map of the study area. The result is then analysed for accuracy and visual performance. Accuracy is assessed using the Kappa coefficient and the overall and producer accuracy for each class (in pixels and percentage), while visual analysis is done by comparing the classification data with the reference map. Overall, the study shows that SVM is able to classify the land covers within the study area with high accuracy
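The accuracy measures named in the abstract above (overall accuracy, producer accuracy per class, and the Kappa coefficient) can all be computed from a confusion matrix. The following is a minimal sketch using the standard definitions; the two-class matrix is hypothetical and is not taken from the study.

```python
def accuracy_report(confusion):
    """Compute overall accuracy, Cohen's kappa, and producer accuracy per class.

    confusion[i][j] = number of pixels of reference class i classified as class j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    row_totals = [sum(row) for row in confusion]  # reference totals
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]  # classified totals
    correct = sum(confusion[i][i] for i in range(k))

    overall = correct / n
    # Agreement expected by chance, from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    kappa = (overall - expected) / (1.0 - expected)
    # Producer accuracy: correctly classified pixels / reference pixels of that class.
    producer = [
        confusion[i][i] / row_totals[i] if row_totals[i] else float("nan")
        for i in range(k)
    ]
    return overall, kappa, producer

# Hypothetical two-class confusion matrix, for illustration only.
overall, kappa, producer = accuracy_report([[50, 10], [5, 35]])
```

Kappa discounts the agreement that would be expected by chance, which is why it is lower than the overall accuracy in this example.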