
    Automatic Facial Expression Recognition Using Features of Salient Facial Patches

    Extraction of discriminative features from salient facial patches plays a vital role in effective facial expression recognition. Accurate detection of facial landmarks improves the localization of the salient patches on face images. This paper proposes a novel framework for expression recognition using appearance features of selected facial patches. A few prominent facial patches, located according to the positions of the facial landmarks, are extracted; these patches are active during emotion elicitation. The active patches are further processed to obtain the salient patches, which contain discriminative features for classifying each pair of expressions, so that different facial patches are selected as salient for different pairs of expression classes. A one-against-one classification method is adopted using these features. In addition, an automated, learning-free facial landmark detection technique is proposed that achieves performance similar to that of other state-of-the-art landmark detection methods while requiring significantly less execution time. The proposed method is found to perform consistently well at different resolutions, hence providing a solution for expression recognition in low-resolution images. Experiments on the CK+ and JAFFE facial expression databases show the effectiveness of the proposed system.
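
    As a rough illustration of the one-against-one scheme described above, the sketch below trains pairwise binary classifiers over patch-level appearance features. The feature array, dimensions, and labels are hypothetical stand-ins, not the authors' code or data.

```python
# Minimal sketch of one-against-one expression classification on
# patch-level appearance features (hypothetical data, not the paper's code).
import numpy as np
from sklearn.multiclass import OneVsOneClassifier
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_samples, n_patches, feat_dim = 200, 19, 59   # e.g. one histogram per patch
X = rng.normal(size=(n_samples, n_patches * feat_dim))  # stand-in features
y = rng.integers(0, 6, size=n_samples)          # six basic expression labels

# One-against-one: one binary classifier per pair of expression classes.
clf = OneVsOneClassifier(LinearSVC()).fit(X, y)
print(clf.predict(X[:5]))
```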

    What is the right way to represent document images?

    In this article we study the problem of document image representation based on visual features. We propose a comprehensive experimental study that compares three types of visual document image representations: (1) traditional, so-called shallow features, such as the RunLength and Fisher-Vector descriptors; (2) deep features based on Convolutional Neural Networks; and (3) features extracted from hybrid architectures that take inspiration from the previous two. We evaluate these features on several tasks (i.e. classification, clustering, and retrieval) and in different setups (e.g. domain transfer), using several public and in-house datasets. Our results show that deep features generally outperform other types of features when there is no domain shift and the new task is closely related to the one used to train the model. However, when a large domain or task shift is present, the Fisher-Vector shallow features generalize better and often obtain the best results.
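
    The comparison the article describes can be pictured with a simple retrieval experiment: given any fixed document-image descriptor, rank a gallery by cosine similarity to a query. The harness below is a generic sketch under that assumption; the feature arrays are placeholders for, e.g., Fisher-Vector or CNN descriptors, not the article's actual pipeline.

```python
# Generic retrieval harness for comparing document-image descriptors
# (placeholder features; plug in Fisher-Vector or CNN embeddings).
import numpy as np

def retrieve(query_feats, gallery_feats, k=5):
    """Return indices of the top-k gallery items by cosine similarity."""
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = q @ g.T                          # (n_queries, n_gallery)
    return np.argsort(-sims, axis=1)[:, :k]

rng = np.random.default_rng(1)
gallery = rng.normal(size=(1000, 256))      # stand-in descriptors
queries = rng.normal(size=(10, 256))
print(retrieve(queries, gallery))
```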

    The Cross-Depiction Problem: Computer Vision Algorithms for Recognising Objects in Artwork and in Photographs

    The cross-depiction problem is that of recognising visual objects regardless of whether they are photographed, painted, drawn, etc. It is a potentially significant yet under-researched problem. Emulating the remarkable human ability to recognise objects in an astonishingly wide variety of depictive forms is likely to advance both the foundations and the applications of Computer Vision. In this paper we benchmark classification, domain adaptation, and deep learning methods, demonstrating that none performs consistently well on the cross-depiction problem. Despite the current interest in deep learning, such methods exhibit the same behaviour as all but one of the other methods: a significant fall in performance on inhomogeneous databases compared to their peak performance, which is always achieved on data comprising photographs only. We find instead that methods with strong models of the spatial relations between parts tend to be more robust, and we therefore conclude that such information is important in modelling object classes regardless of appearance details. (Comment: 12 pages, 6 figures)
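
    The performance drop the benchmark exposes can be sketched as a train-on-photos, test-on-artwork experiment. Everything below is a hypothetical stand-in (random features with a shifted artwork distribution), shown only to make the evaluation protocol concrete.

```python
# Sketch of a cross-depiction evaluation: train a classifier on photos,
# then test on both photos and artwork to expose the performance drop.
# Feature matrices here are hypothetical stand-ins for real descriptors.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(2)
X_photo, y_photo = rng.normal(size=(300, 128)), rng.integers(0, 5, 300)
X_art, y_art = rng.normal(size=(300, 128)) + 0.5, rng.integers(0, 5, 300)

clf = LogisticRegression(max_iter=1000).fit(X_photo, y_photo)
print("photo -> photo:", accuracy_score(y_photo, clf.predict(X_photo)))
print("photo -> art:  ", accuracy_score(y_art, clf.predict(X_art)))
```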

    Patterns of Activity in a Global Model of a Solar Active Region

    In this work we investigate the global activity patterns predicted by a model active region heated by distributions of nanoflares spanning a range of frequencies; the distributions differ in their average frequency. The activity patterns are manifested in time lag maps of pairs of narrow-band instrument channels. We combine hydrodynamic and forward modeling codes with a magnetic field extrapolation to create a model active region, and apply the time lag method to synthetic observations. Our aim is not to reproduce a particular set of observations in detail, but to recover some typical properties and patterns observed in active regions. Our key findings are the following. (1) Cooling dominates the time lag signature, and the time lags between the channel pairs are generally consistent with observed values. (2) Shorter coronal loops in the core cool more quickly than longer loops at the periphery. (3) All channel pairs show zero time lag when the line of sight passes through coronal loop foot-points. (4) There is strong evidence that plasma must be re-energized on a time scale comparable to the cooling time scale to reproduce the observed coronal activity, but it is likely that a relatively broad spectrum of heating frequencies is operating across active regions. (5) Due to their highly dynamic nature, nanoflare trains produce zero time lags along entire flux tubes in our model active region, as seen between the same channel pairs in observed active regions.
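
    The time lag method referred to above amounts to cross-correlating the light curves of two channels and recording the temporal offset that maximises the correlation; cooling plasma makes the cooler channel peak after the hotter one, giving a positive lag. A minimal sketch with synthetic light curves (not the authors' pipeline):

```python
# Minimal time-lag sketch: find the offset that maximises the
# cross-correlation of two synthetic channel light curves.
import numpy as np

def time_lag(curve_a, curve_b, cadence):
    """Seconds by which curve_a lags curve_b at the correlation peak."""
    a = (curve_a - curve_a.mean()) / curve_a.std()
    b = (curve_b - curve_b.mean()) / curve_b.std()
    corr = np.correlate(a, b, mode="full")
    offsets = np.arange(-len(a) + 1, len(a))
    return offsets[np.argmax(corr)] * cadence

t = np.arange(0, 3600, 12.0)                    # 12 s cadence, 1 hour
hot = np.exp(-((t - 1000) / 300) ** 2)          # hot channel peaks first
cool = np.exp(-((t - 1600) / 300) ** 2)         # cool channel peaks later
print(time_lag(cool, hot, cadence=12.0), "s")   # positive lag: cooling signature
```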

    Iterated Function System Models in Data Analysis: Detection and Separation

    We investigate the use of iterated function system (IFS) models for data analysis. An IFS is a discrete dynamical system in which each time step corresponds to the application of one of a finite collection of maps. The maps, which represent distinct dynamical regimes, may act in some pre-determined sequence or may be applied in random order. An algorithm is developed to detect the sequence of regime switches under the assumption of continuity. This method is tested on a simple IFS and applied to an experimental computer performance data set. The methodology has a wide range of potential uses, from change-point detection in time-series data to digital communications.
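
    To make the idea concrete, the sketch below simulates a one-dimensional IFS with two affine maps and then recovers the regime sequence by asking, at each step, which map best explains the observed transition. This prediction-error criterion is an illustrative stand-in, not the paper's exact continuity-based algorithm.

```python
# Sketch: simulate a two-map IFS, then detect which map produced each
# transition by picking the map that best predicts x[t+1] from x[t].
# Illustrative criterion only, not the paper's exact algorithm.
import numpy as np

maps = [lambda x: 0.5 * x + 0.1, lambda x: -0.8 * x + 0.9]  # two regimes

rng = np.random.default_rng(3)
true_regimes = rng.integers(0, 2, size=200)
x = np.empty(201); x[0] = 0.3
for t, r in enumerate(true_regimes):
    x[t + 1] = maps[r](x[t]) + rng.normal(scale=1e-3)       # small noise

detected = np.array([
    np.argmin([abs(x[t + 1] - f(x[t])) for f in maps])
    for t in range(len(true_regimes))
])
print("fraction recovered:", np.mean(detected == true_regimes))  # ~1.0
```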

    Visualizing Deep Convolutional Neural Networks Using Natural Pre-Images

    Image representations, from SIFT and bag of visual words to Convolutional Neural Networks (CNNs), are a crucial component of almost all computer vision systems. However, our understanding of them remains limited. In this paper we study several landmark representations, both shallow and deep, using a number of complementary visualization techniques. These visualizations are based on the concept of a "natural pre-image", namely a natural-looking image whose representation has some notable property. We study three such visualizations in particular: inversion, in which the aim is to reconstruct an image from its representation; activation maximization, in which we search for patterns that maximally stimulate a representation component; and caricaturization, in which the visual patterns that a representation detects in an image are exaggerated. We pose these as a regularized energy-minimization framework and demonstrate its generality and effectiveness. In particular, we show that this method can invert representations such as HOG more accurately than recent alternatives while also being applicable to CNNs. Among our findings, we show that several layers in CNNs retain photographically accurate information about the image, with different degrees of geometric and photometric invariance. (Comment: A substantially extended version of http://www.robots.ox.ac.uk/~vedaldi/assets/pubs/mahendran15understanding.pdf. arXiv admin note: text overlap with arXiv:1412.003)
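
    The regularized energy-minimization idea can be sketched in a few lines: start from noise and ascend the gradient of a chosen unit's activation while penalising unnatural images, here with a simple total-variation regularizer. This assumes a PyTorch setup with torchvision's pretrained AlexNet and is a rough illustration, not the authors' implementation or regularizer.

```python
# Activation maximization sketch: maximise one channel's mean activation
# minus a total-variation penalty that favours natural-looking images.
# Rough illustration (pretrained AlexNet via torchvision assumed).
import torch
from torchvision.models import alexnet

model = alexnet(weights="DEFAULT").features.eval()
img = torch.randn(1, 3, 224, 224, requires_grad=True)
opt = torch.optim.Adam([img], lr=0.05)

def total_variation(x):
    return ((x[..., 1:, :] - x[..., :-1, :]).abs().mean()
            + (x[..., :, 1:] - x[..., :, :-1]).abs().mean())

for _ in range(200):
    opt.zero_grad()
    act = model(img)[0, 42].mean()   # channel 42: arbitrary target unit
    loss = -act + 10.0 * total_variation(img)
    loss.backward()
    opt.step()
```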

    SemanticPaint: A Framework for the Interactive Segmentation of 3D Scenes

    We present an open-source, real-time implementation of SemanticPaint, a system for geometric reconstruction, object-class segmentation and learning of 3D scenes. Using our system, a user can walk into a room wearing a depth camera and a virtual reality headset, and both densely reconstruct the 3D scene and interactively segment the environment into object classes such as 'chair', 'floor' and 'table'. The user interacts physically with the real-world scene, touching objects and using voice commands to assign them appropriate labels. These user-generated labels are leveraged by an online random forest-based machine learning algorithm, which is used to predict labels for previously unseen parts of the scene. The entire pipeline runs in real time, and the user stays 'in the loop' throughout the process, receiving immediate feedback about the progress of the labelling and interacting with the scene as necessary to refine the predicted segmentation. (Comment: 33 pages. Project: http://www.semantic-paint.com, Code: https://github.com/torrvision/spain)
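
    The label-propagation step can be pictured as: fit a forest on the features of the voxels the user has labelled, then predict labels for the rest of the scene. The sketch below uses scikit-learn's batch random forest and random stand-in features in place of the paper's online forest and real voxel descriptors.

```python
# Sketch of SemanticPaint-style label propagation: fit a forest on the
# features of user-labelled voxels, then predict the rest of the scene.
# sklearn's batch forest stands in for the paper's online random forest.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)
voxel_feats = rng.normal(size=(10000, 16))    # stand-in geometry/colour features
labelled = rng.choice(10000, size=300, replace=False)
labels = rng.integers(0, 3, size=300)         # e.g. chair / floor / table

forest = RandomForestClassifier(n_estimators=50)
forest.fit(voxel_feats[labelled], labels)
scene_labels = forest.predict(voxel_feats)    # propagate to unseen voxels
print(np.bincount(scene_labels))
```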

    A Monocular Vision System for Playing Soccer in Low Color Information Environments

    Humanoid soccer robots perceive their environment exclusively through cameras. This paper presents a monocular vision system that was originally developed for use in the RoboCup Humanoid League, but is expected to be transferable to other soccer leagues. Recent changes in the Humanoid League rules resulted in a soccer environment with less color coding than in previous years, which makes perception of the game situation more challenging. The proposed vision system addresses these challenges by using brightness and texture for the detection of the required field features and objects. Our system is robust to changes in lighting conditions, and is designed for real-time use on a humanoid soccer robot. This paper describes the main components of the detection algorithms in use, and presents experimental results from the soccer field, using ROS and the igus Humanoid Open Platform as a testbed. The proposed vision system was used successfully at RoboCup 2015. (Comment: Proceedings of 10th Workshop on Humanoid Soccer Robots, International Conference on Humanoid Robots (Humanoids), Seoul, Korea, 201)
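
    A colour-free detection pass of the kind described above might use brightness to find candidate white field lines and local variance as a crude texture cue for candidate objects. The OpenCV sketch below, with a synthetic frame and illustrative thresholds, is only a stand-in for the published system.

```python
# Sketch of colour-free field feature detection: bright regions suggest
# white lines; local variance (a crude texture cue) suggests objects.
# Synthetic frame and illustrative thresholds, not the published system.
import cv2
import numpy as np

frame = np.full((480, 640, 3), 90, dtype=np.uint8)          # stand-in camera frame
cv2.line(frame, (0, 300), (639, 300), (255, 255, 255), 3)   # a white field line

gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)

bright = gray > 180                          # brightness cue: candidate lines
mean = cv2.blur(gray, (9, 9))
sq_mean = cv2.blur(gray * gray, (9, 9))
variance = sq_mean - mean * mean             # local variance: texture strength
textured = variance > 400                    # candidate textured objects

print(int(bright.sum()), int(textured.sum()))
```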

    A review of EO image information mining

    We analyze the state of the art of content-based retrieval in Earth observation (EO) image archives, focusing on complete systems that show promise for operational implementation. The different paradigms underlying the main system families are introduced. The approaches taken are analyzed, focusing in particular on the phases after primitive feature extraction. The solutions envisaged for feature simplification and synthesis, indexing, and semantic labeling are reviewed, and the methodologies for query specification and execution are analyzed.
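
    The post-extraction phases the review covers (feature synthesis, indexing, query execution) can be pictured with a toy content-based retrieval loop: build a spatial index over descriptors, then answer a query by nearest-neighbour search. A generic sketch, not any specific EO system:

```python
# Toy content-based retrieval loop: index image descriptors, then answer
# a query by nearest-neighbour search. Generic sketch, no specific EO system.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(5)
archive = rng.normal(size=(5000, 64))     # stand-in image-tile descriptors
index = cKDTree(archive)                  # the "indexing" phase

query = rng.normal(size=64)               # the "query execution" phase
dists, ids = index.query(query, k=10)
print(ids)                                # ten most similar archive tiles
```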

    Automatic Segmentation and Overall Survival Prediction in Gliomas using Fully Convolutional Neural Network and Texture Analysis

    In this paper, we use a fully convolutional neural network (FCNN) to segment gliomas in magnetic resonance images (MRI). Fully automatic, voxel-based classification is achieved by training a 23-layer-deep FCNN on 2-D slices extracted from patient volumes. The network was trained on slices extracted from 130 patients and validated on 50 patients. For the task of survival prediction, texture- and shape-based features were extracted from the T1 post-contrast volume to train an XGBoost regressor. On the BraTS 2017 validation set, the proposed scheme achieved mean Dice scores of 0.83, 0.69, and 0.69 for whole tumor, tumor core, and active tumor, respectively, and an accuracy of 52% for overall survival prediction. (Comment: 10 pages, 6 figures)
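
    The Dice score used to evaluate these segmentations is twice the overlap of prediction and ground truth divided by their combined size. A minimal implementation on binary masks:

```python
# Dice score on binary masks: 2|A intersect B| / (|A| + |B|).
import numpy as np

def dice(pred, truth, eps=1e-7):
    pred, truth = pred.astype(bool), truth.astype(bool)
    overlap = np.logical_and(pred, truth).sum()
    return 2.0 * overlap / (pred.sum() + truth.sum() + eps)

a = np.zeros((4, 4)); a[1:3, 1:3] = 1
b = np.zeros((4, 4)); b[2:4, 2:4] = 1
print(dice(a, b))   # 0.25: one overlapping voxel out of eight total
```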