Automatic Facial Expression Recognition Using Features of Salient Facial Patches
Extraction of discriminative features from salient facial patches plays a
vital role in effective facial expression recognition. The accurate detection
of facial landmarks improves the localization of the salient patches on face
images. This paper proposes a novel framework for expression recognition by
using appearance features of selected facial patches. A few prominent facial
patches, depending on the position of facial landmarks, are extracted which are
active during emotion elicitation. These active patches are further processed
to obtain the salient patches which contain discriminative features for
classification of each pair of expressions, thereby selecting different facial
patches as salient for different pairs of expression classes. A one-against-one
classification method is adopted using these features. In addition, an
automated, learning-free facial landmark detection technique is proposed,
which achieves performance similar to that of other state-of-the-art landmark
detection methods, yet requires significantly less execution time. The proposed
method performs consistently well across different resolutions, thus
providing a solution for expression recognition in low-resolution images.
Experiments on the CK+ and JAFFE facial expression databases show the
effectiveness of the proposed system.
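The one-against-one scheme described above can be sketched as follows. This is an illustrative toy, not the paper's actual classifier: each class pair gets its own binary classifier over its own pair-specific salient-patch feature, and the final label is decided by majority vote. The class names, thresholds, and scalar features below are all hypothetical.

```python
from collections import Counter

def one_vs_one_predict(classifiers, sample):
    """Majority vote over all pairwise classifiers.

    `classifiers` maps a class pair (a, b) to a function that, given
    the feature for that pair's salient patches, returns a or b.
    """
    votes = Counter(clf(sample[pair]) for pair, clf in classifiers.items())
    return votes.most_common(1)[0][0]

# Toy setup: three expression classes; each pair uses its own
# (hypothetical) salient-patch feature, here a single scalar.
def make_threshold_clf(a, b, threshold):
    return lambda x: a if x < threshold else b

classifiers = {
    ("anger", "joy"): make_threshold_clf("anger", "joy", 0.5),
    ("anger", "surprise"): make_threshold_clf("anger", "surprise", 0.5),
    ("joy", "surprise"): make_threshold_clf("joy", "surprise", 0.5),
}

# A sample stores one feature per class pair, since different salient
# patches are selected for different pairs of expression classes.
sample = {
    ("anger", "joy"): 0.9,        # votes "joy"
    ("anger", "surprise"): 0.8,   # votes "surprise"
    ("joy", "surprise"): 0.2,     # votes "joy"
}
print(one_vs_one_predict(classifiers, sample))  # → joy
```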
What is the right way to represent document images?
In this article we study the problem of document image representation based
on visual features. We propose a comprehensive experimental study that compares
three types of visual document image representations: (1) traditional so-called
shallow features, such as the RunLength and the Fisher-Vector descriptors, (2)
deep features based on Convolutional Neural Networks, and (3) features
extracted from hybrid architectures that take inspiration from the two previous
ones.
We evaluate these features in several tasks (i.e. classification, clustering,
and retrieval) and in different setups (e.g. domain transfer) using several
public and in-house datasets. Our results show that deep features generally
outperform other types of features when there is no domain shift and the new
task is closely related to the one used to train the model. However, when a
large domain or task shift is present, the Fisher-Vector shallow features
generalize better and often obtain the best results.
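The domain-transfer setup described above can be illustrated with a minimal sketch: fit a classifier on feature vectors extracted in one domain and measure its accuracy on vectors from a shifted domain. The nearest-centroid classifier and the 2-D descriptors below are stand-ins chosen for brevity, not the descriptors or classifiers evaluated in the study.

```python
def nearest_centroid_accuracy(train, test):
    """Fit per-class centroids on source-domain features and score
    accuracy on target-domain features, a minimal stand-in for the
    domain-transfer evaluation described above."""
    centroids = {
        label: [sum(f[i] for f in feats) / len(feats)
                for i in range(len(feats[0]))]
        for label, feats in train.items()
    }

    def sqdist(f, c):
        return sum((fi - ci) ** 2 for fi, ci in zip(f, c))

    correct = total = 0
    for label, feats in test.items():
        for f in feats:
            pred = min(centroids, key=lambda c: sqdist(f, centroids[c]))
            correct += pred == label
            total += 1
    return correct / total

# Hypothetical 2-D descriptors for two document classes, extracted
# from a source domain (train) and a shifted target domain (test).
source = {"letter": [[0.0, 0.0], [0.2, 0.1]],
          "form": [[1.0, 1.0], [0.9, 1.1]]}
target = {"letter": [[0.1, 0.3]], "form": [[0.8, 0.9]]}
print(nearest_centroid_accuracy(source, target))  # → 1.0
```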
The Cross-Depiction Problem: Computer Vision Algorithms for Recognising Objects in Artwork and in Photographs
The cross-depiction problem is that of recognising visual objects regardless
of whether they are photographed, painted, drawn, etc. It is a potentially
significant yet under-researched problem. Emulating the remarkable human
ability to recognise objects in an astonishingly wide variety of depictive
forms is likely to advance both the foundations and the applications of
Computer Vision.
In this paper we benchmark classification, domain adaptation, and deep
learning methods; demonstrating that none perform consistently well in the
cross-depiction problem. Given the current interest in deep learning, it is
notable that such methods exhibit the same behaviour as all but one other
method: they show a significant fall in performance on inhomogeneous databases
compared to their peak performance, which is always achieved on data comprising
photographs only.
Rather, we find the methods that have strong models of spatial relations
between parts tend to be more robust and therefore conclude that such
information is important in modelling object classes regardless of appearance
details.
Comment: 12 pages, 6 figures
Patterns of Activity in a Global Model of a Solar Active Region
In this work we investigate the global activity patterns predicted from a
model active region heated by distributions of nanoflares that have a range of
frequencies. What differs is the average frequency of the distributions. The
activity patterns are manifested in time lag maps of narrow-band instrument
channel pairs. We combine hydrodynamic and forward modeling codes with a
magnetic field extrapolation to create a model active region and apply the time
lag method to synthetic observations. Our aim is not to reproduce a particular
set of observations in detail, but to recover some typical properties and
patterns observed in active regions. Our key findings are the following. (1)
Cooling dominates the time lag signature, and the time lags between the channel
pairs are generally consistent with observed values. (2) Shorter coronal loops
in the core cool more quickly than longer loops at the periphery. (3) All
channel pairs show zero time lag when the line of sight passes through coronal
loop foot-points. (4) There is strong evidence that plasma must be re-energized
on a time scale comparable to the cooling timescale to reproduce the observed
coronal activity, but it is likely that a relatively broad spectrum of heating
frequencies operates across active regions. (5) Due to their highly dynamic
nature, nanoflare trains produce zero time lags along entire flux tubes in our
model active region, matching those seen between the same channel pairs in
observed active regions.
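At the core of the time lag method is finding, for a pair of channel light curves, the temporal offset that maximizes their cross-correlation. A minimal sketch, using a synthetic transient and its delayed copy (the actual analysis works on normalized narrow-band channel light curves, not the toy arrays used here):

```python
def peak_time_lag(a, b, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation
    of b against a. In `xcorr(lag)`, a[t] is paired with b[t + lag], so
    a positive result means b responds after a, as when plasma cools
    from the hotter channel's passband into the cooler channel's."""
    def xcorr(lag):
        if lag >= 0:
            pairs = zip(a[:len(a) - lag], b[lag:])
        else:
            pairs = zip(a[-lag:], b[:len(b) + lag])
        return sum(x * y for x, y in pairs)
    return max(range(-max_lag, max_lag + 1), key=xcorr)

a = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]  # transient in the hotter channel
b = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]  # same transient, delayed 3 samples
print(peak_time_lag(a, b, 5))  # → 3
```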
Iterated Function System Models in Data Analysis: Detection and Separation
We investigate the use of iterated function system (IFS) models for data
analysis. An IFS is a discrete dynamical system in which each time step
corresponds to the application of one of a finite collection of maps. The maps,
which represent distinct dynamical regimes, may act in some pre-determined
sequence or may be applied in random order. An algorithm is developed to detect
the sequence of regime switches under the assumption of continuity. This method
is tested on a simple IFS and applied to an experimental computer performance
data set. This methodology has a wide range of potential uses: from
change-point detection in time-series data to the field of digital
communications.
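The IFS setting above can be made concrete with a small sketch. Note one simplification: the paper's algorithm detects regime switches without knowing the maps, using a continuity assumption; the sketch below assumes the maps are known and simply attributes each transition to the map that produced it, to illustrate what a recovered regime sequence looks like.

```python
def detect_regimes(series, maps):
    """Attribute each transition x_t -> x_{t+1} to the map whose
    prediction f(x_t) lies closest to the observed x_{t+1}.
    A simplified stand-in for the continuity-based detection
    developed in the paper, which does not assume known maps."""
    labels = []
    for x, x_next in zip(series, series[1:]):
        errors = [abs(f(x) - x_next) for f in maps]
        labels.append(min(range(len(maps)), key=errors.__getitem__))
    return labels

# Two affine maps acting on [0, 1]; each represents one dynamical
# regime, applied in the (known, for this toy) sequence below.
f0 = lambda x: 0.5 * x          # contracts toward 0
f1 = lambda x: 0.5 * x + 0.5    # contracts toward 1
true_seq = [0, 0, 1, 1, 0, 1]
x, series = 0.3, [0.3]
for r in true_seq:
    x = (f0, f1)[r](x)
    series.append(x)
print(detect_regimes(series, [f0, f1]))  # → [0, 0, 1, 1, 0, 1]
```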
Visualizing Deep Convolutional Neural Networks Using Natural Pre-Images
Image representations, from SIFT and bag-of-visual-words to Convolutional
Neural Networks (CNNs), are a crucial component of almost all computer vision
systems. However, our understanding of them remains limited. In this paper we
study several landmark representations, both shallow and deep, by a number of
complementary visualization techniques. These visualizations are based on the
concept of "natural pre-image", namely a natural-looking image whose
representation has some notable property. We study in particular three such
visualizations: inversion, in which the aim is to reconstruct an image from its
representation, activation maximization, in which we search for patterns that
maximally stimulate a representation component, and caricaturization, in which
the visual patterns that a representation detects in an image are exaggerated.
We pose these as a regularized energy-minimization framework and demonstrate
its generality and effectiveness. In particular, we show that this method can
invert representations such as HOG more accurately than recent alternatives
while being applicable to CNNs too. Among our findings, we show that several
layers in CNNs retain photographically accurate information about the image,
with different degrees of geometric and photometric invariance.
Comment: A substantially extended version of
http://www.robots.ox.ac.uk/~vedaldi/assets/pubs/mahendran15understanding.pdf.
arXiv admin note: text overlap with arXiv:1412.003
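Activation maximization as a regularized energy problem can be sketched on a toy linear "representation component": maximize the component's response w . x minus an L2 regularizer by gradient ascent. This is an assumption-laden miniature of the framework described above; for a real CNN, w . x would be replaced by the chosen unit's activation and the gradient would come from backpropagation, and the paper uses richer natural-image regularizers than plain L2.

```python
def activation_maximization(w, lam=0.5, lr=0.1, steps=200):
    """Gradient ascent on  w . x - lam * ||x||^2  for a toy linear
    representation component w. The fixed point is x = w / (2 * lam),
    so with lam = 0.5 the iterate converges to w itself."""
    x = [0.0] * len(w)
    for _ in range(steps):
        # Ascent step along the gradient  w - 2 * lam * x.
        x = [xi + lr * (wi - 2 * lam * xi) for xi, wi in zip(x, w)]
    return x

w = [1.0, -2.0, 0.0]   # hypothetical component weights
x_star = activation_maximization(w)
print([round(v, 3) for v in x_star])  # → [1.0, -2.0, 0.0]
```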
SemanticPaint: A Framework for the Interactive Segmentation of 3D Scenes
We present an open-source, real-time implementation of SemanticPaint, a
system for geometric reconstruction, object-class segmentation and learning of
3D scenes. Using our system, a user can walk into a room wearing a depth camera
and a virtual reality headset, and both densely reconstruct the 3D scene and
interactively segment the environment into object classes such as 'chair',
'floor' and 'table'. The user interacts physically with the real-world scene,
touching objects and using voice commands to assign them appropriate labels.
These user-generated labels are leveraged by an online random forest-based
machine learning algorithm, which is used to predict labels for previously
unseen parts of the scene. The entire pipeline runs in real time, and the user
stays 'in the loop' throughout the process, receiving immediate feedback about
the progress of the labelling and interacting with the scene as necessary to
refine the predicted segmentation.
Comment: 33 pages, Project: http://www.semantic-paint.com, Code:
https://github.com/torrvision/spain
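The incremental-labelling idea above can be sketched with a deliberately simple stand-in: SemanticPaint uses an online random forest, whereas the toy below uses an incrementally updated nearest-centroid classifier, and the 2-D features and class names are hypothetical. What it shows is the loop structure: each user "touch" adds a labelled feature vector, and unseen scene points are then labelled immediately from the model so far.

```python
class OnlineNearestCentroid:
    """Incrementally updated nearest-centroid classifier: a simple
    stand-in for the online random forest used by the real system."""

    def __init__(self):
        self.sums, self.counts = {}, {}

    def add(self, label, feat):
        # One user-labelled sample; update that class's running sum.
        s = self.sums.setdefault(label, [0.0] * len(feat))
        for i, v in enumerate(feat):
            s[i] += v
        self.counts[label] = self.counts.get(label, 0) + 1

    def predict(self, feat):
        # Label an unseen point by its nearest class centroid.
        def sqdist(label):
            n = self.counts[label]
            return sum((feat[i] - self.sums[label][i] / n) ** 2
                       for i in range(len(feat)))
        return min(self.sums, key=sqdist)

clf = OnlineNearestCentroid()
# Hypothetical 2-D features (e.g. height above floor, flatness).
clf.add("floor", [0.0, 0.9]); clf.add("floor", [0.1, 0.8])
clf.add("table", [0.7, 0.9]); clf.add("chair", [0.45, 0.3])
print(clf.predict([0.65, 0.85]))  # → table
```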
A Monocular Vision System for Playing Soccer in Low Color Information Environments
Humanoid soccer robots perceive their environment exclusively through
cameras. This paper presents a monocular vision system that was originally
developed for use in the RoboCup Humanoid League, but is expected to be
transferable to other soccer leagues. Recent changes in the Humanoid League
rules resulted in a soccer environment with less color coding than in previous
years, which makes perception of the game situation more challenging. The
proposed vision system addresses these challenges by using brightness and
texture for the detection of the required field features and objects. Our
system is robust to changes in lighting conditions, and is designed for
real-time use on a humanoid soccer robot. This paper describes the main
components of the detection algorithms in use, and presents experimental
results from the soccer field, using ROS and the igus Humanoid Open Platform as
a testbed. The proposed vision system was used successfully at RoboCup 2015.
Comment: Proceedings of 10th Workshop on Humanoid Soccer Robots, International
Conference on Humanoid Robots (Humanoids), Seoul, Korea, 201
A review of EO image information mining
We analyze the state of the art of content-based retrieval in Earth
observation image archives focusing on complete systems showing promise for
operational implementation. The different paradigms at the basis of the main
system families are introduced. The approaches taken are analyzed, focusing in
particular on the phases after primitive feature extraction. The solutions
envisaged for the issues related to feature simplification and synthesis,
indexing, and semantic labeling are reviewed. The methodologies for query
specification and execution are analyzed.
Automatic Segmentation and Overall Survival Prediction in Gliomas using Fully Convolutional Neural Network and Texture Analysis
In this paper, we use a fully convolutional neural network (FCNN) for the
segmentation of gliomas from Magnetic Resonance Images (MRI). A fully
automatic, voxel based classification was achieved by training a 23 layer deep
FCNN on 2-D slices extracted from patient volumes. The network was trained on
slices extracted from 130 patients and validated on 50 patients. For the task
of survival prediction, texture and shape based features were extracted from T1
post-contrast volumes to train an XGBoost regressor. On the BraTS 2017
validation set, the proposed scheme achieved mean Dice scores of 0.83, 0.69,
and 0.69 for whole tumor, tumor core, and active tumor, respectively, and an
accuracy of 52% for overall survival prediction.
Comment: 10 pages, 6 figures
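The overlap metric reported above is the Dice coefficient, 2|A ∩ B| / (|A| + |B|) for a predicted and a ground-truth segmentation. A minimal sketch on flat binary voxel masks (the actual evaluation is over full 3-D MRI volumes):

```python
def dice(pred, truth):
    """Dice coefficient between two binary masks, given here as flat
    lists of 0/1 voxel labels: twice the intersection size divided by
    the sum of the two mask sizes."""
    inter = sum(p & t for p, t in zip(pred, truth))
    return 2 * inter / (sum(pred) + sum(truth))

pred  = [1, 1, 1, 0, 0, 1]   # toy predicted tumor mask
truth = [1, 1, 0, 0, 1, 1]   # toy ground-truth mask
print(dice(pred, truth))  # → 0.75
```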