10,530 research outputs found
Visualization of Time-Series Data in Parameter Space for Understanding Facial Dynamics
Over the past decade, computer scientists and psychologists have made great efforts to collect and analyze facial dynamics data that exhibit different expressions and emotions. Such data is commonly captured as videos and are transformed into feature-based time-series prior to any analysis. However, the analytical tasks, such as expression classification, have been hindered by the lack of understanding of the complex data space and the associated algorithm space. Conventional graph-based time-series visualization is also found inadequate to support such tasks. In this work, we adopt a visual analytics approach by visualizing the correlation between the algorithm space and our goal – classifying facial dynamics. We transform multiple feature-based time-series for each expression in measurement space to a multi-dimensional representation in parameter space. This enables us to utilize parallel coordinates visualization to gain an understanding of the algorithm space, providing a fast and cost-effective means to support the design of analytical algorithms
What May Visualization Processes Optimize?
In this paper, we present an abstract model of visualization and inference
processes and describe an information-theoretic measure for optimizing such
processes. In order to obtain such an abstraction, we first examined six
classes of workflows in data analysis and visualization, and identified four
levels of typical visualization components, namely disseminative,
observational, analytical and model-developmental visualization. We noticed a
common phenomenon at different levels of visualization, that is, the
transformation of data spaces (referred to as alphabets) usually corresponds to
the reduction of maximal entropy along a workflow. Based on this observation,
we establish an information-theoretic measure of cost-benefit ratio that may be
used as a cost function for optimizing a data visualization process. To
demonstrate the validity of this measure, we examined a number of successful
visualization processes in the literature, and showed that the
information-theoretic measure can mathematically explain the advantages of such
processes over possible alternatives.Comment: 10 page
LOMo: Latent Ordinal Model for Facial Analysis in Videos
We study the problem of facial analysis in videos. We propose a novel weakly
supervised learning method that models the video event (expression, pain etc.)
as a sequence of automatically mined, discriminative sub-events (eg. onset and
offset phase for smile, brow lower and cheek raise for pain). The proposed
model is inspired by the recent works on Multiple Instance Learning and latent
SVM/HCRF- it extends such frameworks to model the ordinal or temporal aspect in
the videos, approximately. We obtain consistent improvements over relevant
competitive baselines on four challenging and publicly available video based
facial analysis datasets for prediction of expression, clinical pain and intent
in dyadic conversations. In combination with complimentary features, we report
state-of-the-art results on these datasets.Comment: 2016 IEEE Conference on Computer Vision and Pattern Recognition
(CVPR
Learning Bodily and Temporal Attention in Protective Movement Behavior Detection
For people with chronic pain, the assessment of protective behavior during
physical functioning is essential to understand their subjective pain-related
experiences (e.g., fear and anxiety toward pain and injury) and how they deal
with such experiences (avoidance or reliance on specific body joints), with the
ultimate goal of guiding intervention. Advances in deep learning (DL) can
enable the development of such intervention. Using the EmoPain MoCap dataset,
we investigate how attention-based DL architectures can be used to improve the
detection of protective behavior by capturing the most informative temporal and
body configurational cues characterizing specific movements and the strategies
used to perform them. We propose an end-to-end deep learning architecture named
BodyAttentionNet (BANet). BANet is designed to learn temporal and bodily parts
that are more informative to the detection of protective behavior. The approach
addresses the variety of ways people execute a movement (including healthy
people) independently of the type of movement analyzed. Through extensive
comparison experiments with other state-of-the-art machine learning techniques
used with motion capture data, we show statistically significant improvements
achieved by using these attention mechanisms. In addition, the BANet
architecture requires a much lower number of parameters than the state of the
art for comparable if not higher performances.Comment: 7 pages, 3 figures, 2 tables, code available, accepted in ACII 201
Automatic Analysis of Facial Expressions Based on Deep Covariance Trajectories
In this paper, we propose a new approach for facial expression recognition
using deep covariance descriptors. The solution is based on the idea of
encoding local and global Deep Convolutional Neural Network (DCNN) features
extracted from still images, in compact local and global covariance
descriptors. The space geometry of the covariance matrices is that of Symmetric
Positive Definite (SPD) matrices. By conducting the classification of static
facial expressions using Support Vector Machine (SVM) with a valid Gaussian
kernel on the SPD manifold, we show that deep covariance descriptors are more
effective than the standard classification with fully connected layers and
softmax. Besides, we propose a completely new and original solution to model
the temporal dynamic of facial expressions as deep trajectories on the SPD
manifold. As an extension of the classification pipeline of covariance
descriptors, we apply SVM with valid positive definite kernels derived from
global alignment for deep covariance trajectories classification. By performing
extensive experiments on the Oulu-CASIA, CK+, and SFEW datasets, we show that
both the proposed static and dynamic approaches achieve state-of-the-art
performance for facial expression recognition outperforming many recent
approaches.Comment: A preliminary version of this work appeared in "Otberdout N, Kacem A,
Daoudi M, Ballihi L, Berretti S. Deep Covariance Descriptors for Facial
Expression Recognition, in British Machine Vision Conference 2018, BMVC 2018,
Northumbria University, Newcastle, UK, September 3-6, 2018. ; 2018 :159."
arXiv admin note: substantial text overlap with arXiv:1805.0386
Linguistically-driven framework for computationally efficient and scalable sign recognition
We introduce a new general framework for sign recognition from monocular video using limited quantities of annotated data. The novelty of the hybrid framework we describe here is that we exploit state-of-the art learning methods while also incorporating features based on what we know about the linguistic composition of lexical signs. In particular, we analyze hand shape, orientation, location, and motion trajectories, and then use CRFs to combine this linguistically significant information for purposes of sign recognition. Our robust modeling and recognition of these sub-components of sign production allow an efficient parameterization of the sign recognition problem as compared with purely data-driven methods. This parameterization enables a scalable and extendable time-series learning approach that advances the state of the art in sign recognition, as shown by the results reported here for recognition of isolated, citation-form, lexical signs from American Sign Language (ASL)
Digital interaction: where are we going?
In the framework of the AVI 2018 Conference, the interuniversity center ECONA has organized a thematic workshop on "Digital Interaction: where are we going?". Six contributions from the ECONA members investigate different perspectives around this thematic
- …