327 research outputs found
Local, Semi-Local and Global Models for Texture, Object and Scene Recognition
This dissertation addresses the problems of recognizing textures, objects, and scenes in photographs. We present approaches to these recognition tasks that combine salient local image features with spatial relations and effective discriminative learning techniques. First, we introduce a bag of features image model for recognizing textured surfaces under a wide range of transformations, including viewpoint changes and non-rigid deformations. We present results of a large-scale comparative evaluation indicating that bags of features can be effective not only for texture, but also for object categization, even in the presence of substantial clutter and intra-class variation. We also show how to augment the purely local image representation with statistical co-occurrence relations between pairs of nearby features, and develop a learning and classification framework for the task of classifying individual features in a multi-texture image. Next, we present a more structured alternative to bags of features for object recognition, namely, an image representation based on semi-local parts, or groups of features characterized by stable appearance and geometric layout. Semi-local parts are automatically learned from small sets of unsegmented, cluttered images. Finally, we present a global method for recognizing scene categories that works by partitioning the image into increasingly fine sub-regions and computing histograms of local features found inside each sub-region. The resulting spatial pyramid representation demonstrates significantly improved performance on challenging scene categorization tasks
Who is the director of this movie? Automatic style recognition based on shot features
We show how low-level formal features, such as shot duration, meant as length
of camera takes, and shot scale, i.e. the distance between the camera and the
subject, are distinctive of a director's style in art movies. So far such
features were thought of not having enough varieties to become distinctive of
an author. However our investigation on the full filmographies of six different
authors (Scorsese, Godard, Tarr, Fellini, Antonioni, and Bergman) for a total
number of 120 movies analysed second by second, confirms that these
shot-related features do not appear as random patterns in movies from the same
director. For feature extraction we adopt methods based on both conventional
and deep learning techniques. Our findings suggest that feature sequential
patterns, i.e. how features evolve in time, are at least as important as the
related feature distributions. To the best of our knowledge this is the first
study dealing with automatic attribution of movie authorship, which opens up
interesting lines of cross-disciplinary research on the impact of style on the
aesthetic and emotional effects on the viewers
Exemplar Based Deep Discriminative and Shareable Feature Learning for Scene Image Classification
In order to encode the class correlation and class specific information in
image representation, we propose a new local feature learning approach named
Deep Discriminative and Shareable Feature Learning (DDSFL). DDSFL aims to
hierarchically learn feature transformation filter banks to transform raw pixel
image patches to features. The learned filter banks are expected to: (1) encode
common visual patterns of a flexible number of categories; (2) encode
discriminative information; and (3) hierarchically extract patterns at
different visual levels. Particularly, in each single layer of DDSFL, shareable
filters are jointly learned for classes which share the similar patterns.
Discriminative power of the filters is achieved by enforcing the features from
the same category to be close, while features from different categories to be
far away from each other. Furthermore, we also propose two exemplar selection
methods to iteratively select training data for more efficient and effective
learning. Based on the experimental results, DDSFL can achieve very promising
performance, and it also shows great complementary effect to the
state-of-the-art Caffe features.Comment: Pattern Recognition, Elsevier, 201
An in-depth evaluation of multimodal video genre categorization
International audienceIn this paper we propose an in-depth evaluation of the performance of video descriptors to multimodal video genre categorization. We discuss the perspective of designing appropriate late fusion techniques that would enable to attain very high categorization accuracy, close to the one achieved with user-based text information. Evaluation is carried out in the context of the 2012 Video Genre Tagging Task of the MediaEval Benchmarking Initiative for Multimedia Evaluation, using a data set of up to 15.000 videos (3,200 hours of footage) and 26 video genre categories specific to web media. Results show that the proposed approach significantly improves genre categorization performance, outperforming other existing approaches. The main contribution of this paper is in the experimental part, several valuable interesting findings are reported that motivate further research on video genre classification
Recent Trends in Computational Intelligence
Traditional models struggle to cope with complexity, noise, and the existence of a changing environment, while Computational Intelligence (CI) offers solutions to complicated problems as well as reverse problems. The main feature of CI is adaptability, spanning the fields of machine learning and computational neuroscience. CI also comprises biologically-inspired technologies such as the intellect of swarm as part of evolutionary computation and encompassing wider areas such as image processing, data collection, and natural language processing. This book aims to discuss the usage of CI for optimal solving of various applications proving its wide reach and relevance. Bounding of optimization methods and data mining strategies make a strong and reliable prediction tool for handling real-life applications
- …