Recasting Residual-based Local Descriptors as Convolutional Neural Networks: an Application to Image Forgery Detection

Author: Cozzolino Davide
Poggi Giovanni
Verdoliva Luisa
Publication date: 01/01/2017

Local descriptors based on the image noise residual have proven extremely effective for a number of forensic applications, like forgery detection and localization. Nonetheless, motivated by promising results in computer vision, the focus of the research community is now shifting to deep learning. In this paper we show that a class of residual-based descriptors can actually be regarded as a simple constrained convolutional neural network (CNN). Then, by relaxing the constraints, and fine-tuning the net on a relatively small training set, we obtain a significant performance improvement with respect to the conventional detector.

arXiv.org e-Print Archive

Archivio della ricerca - Università degli studi di Napoli Federico II

Improving Facial Analysis and Performance Driven Animation through Disentangling Identity and Expression

Author: Hasan Md Kamrul
Honari Sina
Pal Chris
Rim David
Publication date: 01/01/2015

We present techniques for improving performance driven facial animation, emotion recognition, and facial key-point or landmark prediction using learned identity-invariant representations. Established approaches to these problems can work well if sufficient examples and labels for a particular identity are available and factors of variation are highly controlled. However, labeled examples of facial expressions, emotions and key-points for new individuals are difficult and costly to obtain. In this paper we improve the ability of techniques to generalize to new and unseen individuals by explicitly modeling previously seen variations related to identity and expression. We use a weakly-supervised approach in which identity labels are used to learn the different factors of variation linked to identity separately from factors related to expression. We show how probabilistic modeling of these sources of variation allows one to learn identity-invariant representations for expressions which can then be used to identity-normalize various procedures for facial expression analysis and animation control. We also show how to extend the widely used techniques of active appearance models and constrained local models by replacing the underlying point distribution models, which are typically constructed using principal component analysis, with identity-expression factorized representations. We present a wide variety of experiments in which we consistently improve performance on emotion recognition, markerless performance-driven facial animation and facial key-point tracking. Comment: to appear in Image and Vision Computing Journal (IMAVIS)

arXiv.org e-Print Archive

PolyPublie

Good Features to Correlate for Visual Tracking

Author: Alatan A. Aydin
Gundogdu Erhan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 10/03/2018

In recent years, correlation filters have shown dominant and spectacular results for visual object tracking. The types of features employed in this family of trackers significantly affect the performance of visual tracking. The ultimate goal is to utilize robust features invariant to any kind of appearance change of the object, while predicting the object location as accurately as in the case of no appearance change. As deep learning based methods have emerged, the study of learning features for specific tasks has accelerated. For instance, discriminative visual tracking methods based on deep architectures have been studied with promising performance. Nevertheless, correlation filter based (CFB) trackers confine themselves to using pre-trained networks that were trained for the object classification problem. To this end, in this manuscript the problem of learning deep fully convolutional features for CFB visual tracking is formulated. In order to learn the proposed model, a novel and efficient backpropagation algorithm is presented based on the loss function of the network. The proposed learning framework enables the network model to be flexible for a custom design. Moreover, it alleviates the dependency on the network trained for classification. Extensive performance analysis shows the efficacy of the proposed custom design in the CFB tracking framework. By fine-tuning the convolutional parts of a state-of-the-art network and integrating this model into a CFB tracker, which is the top performing one of VOT2016, an 18% increase is achieved in terms of expected average overlap, and tracking failures are decreased by 25%, while maintaining the superiority over the state-of-the-art methods on the OTB-2013 and OTB-2015 tracking datasets. Comment: Accepted version of IEEE Transactions on Image Processing

arXiv.org e-Print Archive

OpenMETU (Middle East Technical University)

Mass Displacement Networks

Author: Kokkinos Iasonas
Neverova Natalia
Publication date: 12/08/2017

Despite the large improvements in performance attained by using deep learning in computer vision, one can often further improve results with some additional post-processing that exploits the geometric nature of the underlying task. This commonly involves displacing the posterior distribution of a CNN in a way that makes it more appropriate for the task at hand, e.g. better aligned with local image features, or more compact. In this work we integrate this geometric post-processing within a deep architecture, introducing a differentiable and probabilistically sound counterpart to the common geometric voting technique used for evidence accumulation in vision. We refer to the resulting neural models as Mass Displacement Networks (MDNs), and apply them to human pose estimation in two distinct setups: (a) landmark localization, where we collapse a distribution to a point, allowing for precise localization of body keypoints and (b) communication across body parts, where we transfer evidence from one part to the other, allowing for a globally consistent pose estimate. We evaluate on large-scale pose estimation benchmarks, such as MPII Human Pose and COCO datasets, and report systematic improvements when compared to strong baselines. Comment: 12 pages, 4 figures

arXiv.org e-Print Archive

UCL Discovery

Transcranial magnetic stimulation disrupts the perception and embodiment of facial expressions

Author: Duchaine BC
Garrido L
Pitcher D
Walsh V
Publication venue: 'Society for Neuroscience'
Publication date: 03/09/2008

Copyright © 2008 Society for Neuroscience and the authors. The Journal of Neuroscience uses a Creative Commons Attribution-NonCommercial-ShareAlike licence: http://creativecommons.org/licenses/by-nc-sa/4.0/. Theories of embodied cognition propose that recognizing facial expressions requires visual processing followed by simulation of the somatovisceral responses associated with the perceived expression. To test this proposal, we targeted the right occipital face area (rOFA) and the face region of right somatosensory cortex (rSC) with repetitive transcranial magnetic stimulation (rTMS) while participants discriminated facial expressions. rTMS selectively impaired discrimination of facial expressions at both sites but had no effect on a matched face identity task. Site specificity within the rSC was demonstrated by targeting rTMS at the face and finger regions while participants performed the expression discrimination task. rTMS targeted at the face region impaired task performance relative to rTMS targeted at the finger region. To establish the temporal course of visual and somatosensory contributions to expression processing, double-pulse TMS was delivered at different times to rOFA and rSC during expression discrimination. Accuracy dropped when pulses were delivered at 60–100 ms at rOFA and at 100–140 and 130–170 ms at rSC. These sequential impairments at rOFA and rSC support embodied accounts of expression recognition as well as hierarchical models of face processing. The results also demonstrate that nonvisual cortical areas contribute during early stages of expression processing. Biotechnology and Biological Sciences Research Council

Brunel University Research Archive