Image Aesthetics Assessment Using Composite Features from off-the-Shelf Deep Models
Deep convolutional neural networks have recently achieved great success on the
image aesthetics assessment task. In this paper, we propose an efficient method
which takes the global, local and scene-aware information of images into
consideration and exploits the composite features extracted from corresponding
pretrained deep learning models to classify the derived features with support
vector machine. Contrary to popular methods that require fine-tuning or
training a new model from scratch, our training-free method directly takes the
deep features generated by off-the-shelf models for image classification and
scene recognition. We also analyze the factors that influence performance from
two aspects: the architecture of the deep neural network and the contribution
of local and scene-aware information. It turns out that deep residual networks
produce more aesthetics-aware image representations and that composite
features improve overall performance. Experiments
on common large-scale aesthetics assessment benchmarks demonstrate that our
method outperforms state-of-the-art results in photo aesthetics assessment.
Comment: Accepted by ICIP 201
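The training-free pipeline this abstract describes can be sketched in a few lines: descriptors from multiple pretrained models are concatenated into a composite feature and classified with an SVM. The sketch below is illustrative only; the feature extractors (`global_feat`, `scene_feat`) are stand-in placeholders on synthetic data, not the actual pretrained networks used in the paper.

```python
# Sketch of a training-free composite-feature pipeline: per-model image
# descriptors are concatenated and fed to an SVM. The extractors below are
# hypothetical placeholders for off-the-shelf deep models.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def global_feat(img):
    # Placeholder for e.g. a classification network's pooled activations.
    return img.mean(axis=(0, 1))

def scene_feat(img):
    # Placeholder for a scene-recognition model's penultimate features.
    return img.std(axis=(0, 1))

def composite(img):
    # Concatenate per-model descriptors into one composite feature vector.
    return np.concatenate([global_feat(img), scene_feat(img)])

# Synthetic two-class "aesthetics" data: small 3-channel images.
imgs = [rng.normal(loc=c, size=(8, 8, 3)) for c in (0.0, 1.0) for _ in range(20)]
labels = [c for c in (0, 1) for _ in range(20)]
X = np.stack([composite(im) for im in imgs])

clf = SVC(kernel="linear").fit(X, labels)
acc = clf.score(X, labels)
```

Because no deep model is fine-tuned, only the SVM is trained, which is what makes the approach cheap relative to end-to-end training.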
Harvesting Discriminative Meta Objects with Deep CNN Features for Scene Classification
Recent work on scene classification still makes use of generic CNN features
in a rudimentary manner. In this ICCV 2015 paper, we present a novel pipeline
built upon deep CNN features to harvest discriminative visual objects and parts
for scene classification. We first use a region proposal technique to generate
a set of high-quality patches potentially containing objects, and apply a
pre-trained CNN to extract generic deep features from these patches. Then we
perform both unsupervised and weakly supervised learning to screen these
patches and discover discriminative ones representing category-specific objects
and parts. We further apply discriminative clustering enhanced with local CNN
fine-tuning to aggregate similar objects and parts into groups, called meta
objects. A scene image representation is constructed by pooling the feature
response maps of all the learned meta objects at multiple spatial scales. We
have confirmed that the scene image representation obtained using this new
pipeline is capable of delivering state-of-the-art performance on two popular
scene benchmark datasets, MIT Indoor 67~\cite{MITIndoor67} and
Sun397~\cite{Sun397}.
Comment: To appear in ICCV 2015
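The meta-object idea above can be roughly sketched as clustering patch-level CNN features into groups and then pooling each image's responses to those groups. This is a simplified stand-in, assuming synthetic patch features in place of region proposals plus a pretrained CNN, and plain k-means in place of the paper's weakly supervised screening and discriminative clustering.

```python
# Rough sketch: cluster corpus patch features into k "meta objects", then
# represent an image by max-pooling each patch's response (negative
# distance) to every centroid. Patch features here are synthetic.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

# Synthetic patch descriptors for a training corpus.
corpus_patches = rng.normal(size=(200, 16))

# Discover k meta objects as cluster centroids.
k = 5
meta = KMeans(n_clusters=k, n_init=10, random_state=0).fit(corpus_patches)

def image_representation(patch_feats, centroids):
    # Distance from each patch to each meta object, max-pooled over the
    # image's patches -> one k-dimensional image descriptor.
    d = np.linalg.norm(patch_feats[:, None, :] - centroids[None, :, :], axis=2)
    return (-d).max(axis=0)

img_patches = rng.normal(size=(12, 16))  # patches from one image
rep = image_representation(img_patches, meta.cluster_centers_)
```

The paper additionally pools response maps at multiple spatial scales; the single-scale max-pool above only conveys the core representation step.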
A Discriminative Representation of Convolutional Features for Indoor Scene Recognition
Indoor scene recognition is a multi-faceted and challenging problem due to
the diverse intra-class variations and the confusing inter-class similarities.
This paper presents a novel approach which exploits rich mid-level
convolutional features to categorize indoor scenes. Traditionally used
convolutional features preserve the global spatial structure, which is a
desirable property for general object recognition. However, we argue that this
structure is far less helpful when scene layouts vary widely, as they do in
indoor scenes. We propose to transform the structured
convolutional activations to another highly discriminative feature space. The
representation in the transformed space not only incorporates the
discriminative aspects of the target dataset, but it also encodes the features
in terms of the general object categories that are present in indoor scenes. To
this end, we introduce a new large-scale dataset of 1300 object categories
which are commonly present in indoor scenes. Our proposed approach achieves a
significant performance boost over previous state-of-the-art approaches on
five major scene classification datasets.
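One simple way to illustrate mapping structured convolutional activations into a more discriminative space is a supervised linear projection such as LDA. This is not the paper's actual transformation, only a minimal sketch of the idea: flatten away the spatial structure and learn a projection aligned with the scene labels.

```python
# Illustrative sketch: flatten synthetic "conv activations" and project
# them into a label-aligned space with Linear Discriminant Analysis.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(2)

# Synthetic activations: 3 scene classes, 4x4x8 feature maps per image.
acts = np.stack([rng.normal(loc=c, size=(4, 4, 8))
                 for c in (0, 1, 2) for _ in range(15)])
y = np.array([c for c in (0, 1, 2) for _ in range(15)])

X = acts.reshape(len(acts), -1)               # discard spatial structure
lda = LinearDiscriminantAnalysis(n_components=2).fit(X, y)
Z = lda.transform(X)                          # discriminative feature space
```

With 3 classes, LDA yields at most 2 discriminative components, so `Z` has shape `(45, 2)` here.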