2,598 research outputs found
Understanding the Perceived Quality of Video Predictions
The study of video prediction models is believed to be a fundamental approach
to representation learning for videos. While a plethora of generative models
for predicting the future frame pixel values given the past few frames exist,
the quantitative evaluation of the predicted frames has been found to be
extremely challenging. In this context, we study the problem of quality
assessment of predicted videos. We create the Indian Institute of Science
Predicted Videos Quality Assessment (IISc PVQA) Database consisting of 300
videos, obtained by applying different prediction models on different datasets,
and accompanying human opinion scores. We collected subjective ratings of
quality from 50 human participants for these videos. Our subjective study
reveals that human observers were highly consistent in their judgments of
quality of predicted videos. We benchmark several popularly used measures for
evaluating video prediction and show that they do not adequately correlate with
these subjective scores. We introduce two new features to effectively capture
the quality of predicted videos, motion-compensated cosine similarities of deep
features of predicted frames with past frames, and deep features extracted from
rescaled frame differences. We show that our feature design leads to state of
the art quality prediction in accordance with human judgments on our IISc PVQA
Database. The database and code are publicly available on our project website:
https://nagabhushansn95.github.io/publications/2020/pvqa
Comment: Project website: https://nagabhushansn95.github.io/publications/2020/pvqa.htm
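The core of the first proposed feature, cosine similarity between deep features of predicted and past frames, can be sketched as below. This is an illustrative toy, not the authors' implementation: random arrays stand in for CNN activations, and the motion-compensation step described in the abstract is deliberately omitted.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two flattened feature maps."""
    a, b = a.ravel(), b.ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def frame_quality_feature(pred_feats, past_feats):
    """Average cosine similarity of each predicted frame's deep features
    against the last observed frame's features. A faithful implementation
    would motion-compensate the past frame first; omitted in this sketch."""
    return float(np.mean([cosine_similarity(f, past_feats) for f in pred_feats]))

# Toy example: random arrays stand in for deep CNN features.
rng = np.random.default_rng(0)
past = rng.standard_normal((8, 8, 16))
preds = [past + 0.1 * rng.standard_normal(past.shape) for _ in range(4)]
score = frame_quality_feature(preds, past)
```

Predicted frames whose features drift away from the (motion-compensated) past frames would drive this score down, which is the intuition behind using it as a quality feature.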
A Large-Scale 3D Face Mesh Video Dataset via Neural Re-parameterized Optimization
We propose NeuFace, a 3D face mesh pseudo annotation method on videos via
neural re-parameterized optimization. Despite the huge progress in 3D face
reconstruction methods, generating reliable 3D face labels for in-the-wild
dynamic videos remains challenging. Using NeuFace optimization, we annotate the
per-view/-frame accurate and consistent face meshes on large-scale face videos,
called the NeuFace-dataset. We investigate how neural re-parameterization helps
to reconstruct image-aligned facial details on 3D meshes via gradient analysis.
By exploiting the naturalness and diversity of 3D faces in our dataset, we
demonstrate the usefulness of our dataset for 3D face-related tasks: improving
the reconstruction accuracy of an existing 3D face reconstruction model and
learning 3D facial motion prior. Code and datasets will be available at
https://neuface-dataset.github.io.
Comment: 9 pages, 7 figures, and 3 tables for the main paper. 8 pages, 6
figures and 3 tables for the appendix
Somatic ABC's: A Theoretical Framework for Designing, Developing and Evaluating the Building Blocks of Touch-Based Information Delivery
Situations of sensory overload are steadily becoming more frequent as the ubiquity of technology approaches reality, particularly with the advent of socio-communicative smartphone applications and pervasive, high-speed wireless networks. Although the ease of accessing information has improved our communication effectiveness and efficiency, our visual and auditory modalities, the modalities that today's computerized devices and displays largely engage, have become overloaded, creating possibilities for distractions, delays, and high cognitive load, which in turn can lead to a loss of situational awareness and increase the chances of life-threatening situations such as texting while driving.

Surprisingly, alternative modalities for information delivery have seen little exploration. Touch, in particular, is a promising candidate, given that the skin is our largest sensory organ, with impressive spatial and temporal acuity. Although some approaches have been proposed for touch-based information delivery, they are not without limitations, including high learning curves, limited applicability, and/or limited expression. This is largely due to the lack of a versatile, comprehensive design theory: specifically, a theory that addresses the design of touch-based building blocks for expandable, efficient, rich, and robust touch languages that are easy to learn and use. Moreover, beyond design, there is a lack of implementation and evaluation theories for such languages.

To overcome these limitations, a unified theoretical framework inspired by natural, spoken language is proposed, called Somatic ABC's, for Articulating (designing), Building (developing), and Confirming (evaluating) touch-based languages. To evaluate the usefulness of Somatic ABC's, its design, implementation, and evaluation theories were applied to create communication languages for two very distinct application areas: audio-described movies and motor learning.
These applications were chosen because they presented opportunities for complementing communication by offloading information, typically conveyed visually and/or aurally, to the skin. For both studies, it was found that Somatic ABC's aided the design, development, and evaluation of rich somatic languages with distinct and natural communication units.
Dissertation/Thesis, Ph.D. Computer Science 201
Neural theory for the perception of causal actions
The efficient prediction of the behavior of others requires the recognition of their actions and an understanding of their action goals. In humans, this process is fast and extremely robust, as demonstrated by classical experiments showing that human observers reliably judge causal relationships and attribute interactive social behavior to strongly simplified stimuli consisting of simple moving geometrical shapes. While psychophysical experiments have identified critical visual features that determine the perception of causality and agency from such stimuli, the underlying detailed neural mechanisms remain largely unclear, and it is an open question why humans developed this advanced visual capability at all.

We created pairs of naturalistic and abstract stimuli of hand actions that were exactly matched in terms of their motion parameters. We show that varying critical stimulus parameters for both stimulus types leads to very similar modulations of the perception of causality. However, the additional form information about the hand shape and its relationship with the object supports more fine-grained distinctions for the naturalistic stimuli. Moreover, we show that a physiologically plausible model for the recognition of goal-directed hand actions reproduces the observed dependencies of causality perception on critical stimulus parameters. These results support the hypothesis that selectivity for abstract action stimuli might emerge from the same neural mechanisms that underlie the visual processing of natural goal-directed action stimuli. Furthermore, the model proposes specific detailed neural circuits underlying this visual function, which can be evaluated in future experiments.
Seventh Framework Programme (European Commission) (Tango Grant FP7-249858-TP3 and AMARSi Grant FP7-ICT-248311); Deutsche Forschungsgemeinschaft (Grant GI 305/4-1); Hermann and Lilly Schilling Foundation for Medical Research
The Nomic Role Account of Carving Reality at the Joints
http://klinechair.missouri.edu/Vita_Revised.htm (#51)
Natural properties are those that carve reality at the joints. The notion of carving reality at the joints, however, is somewhat obscure. It is sometimes understood in terms of making for similarity, sometimes in terms of conferring causal powers, and sometimes in terms of figuring in the laws of nature. I develop and assess an account of the third sort, according to which carving reality at the joints is understood as having the right level of determinacy relative to nomic roles. The account has the attraction of involving only very weak metaphysical presuppositions, but it fails to capture several features that natural properties are presumed to have.
On human motion prediction using recurrent neural networks
Human motion modelling is a classical problem at the intersection of graphics
and computer vision, with applications spanning human-computer interaction,
motion synthesis, and motion prediction for virtual and augmented reality.
Following the success of deep learning methods in several computer vision
tasks, recent work has focused on using deep recurrent neural networks (RNNs)
to model human motion, with the goal of learning time-dependent representations
that perform tasks such as short-term motion prediction and long-term human
motion synthesis. We examine recent work, with a focus on the evaluation
methodologies commonly used in the literature, and show that, surprisingly,
state-of-the-art performance can be achieved by a simple baseline that does not
attempt to model motion at all. We investigate this result, and analyze recent
RNN methods by looking at the architectures, loss functions, and training
procedures used in state-of-the-art approaches. We propose three changes to the
standard RNN models typically used for human motion, which result in a simple
and scalable RNN architecture that obtains state-of-the-art performance on
human motion prediction.
Comment: Accepted at CVPR 1
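The "simple baseline that does not attempt to model motion at all" can be sketched as a zero-velocity predictor that just repeats the last observed pose for every future frame. The function name and array shapes below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def zero_velocity_baseline(observed, horizon):
    """Predict `horizon` future poses by repeating the last observed pose.
    `observed` has shape (T, D): T past frames of a D-dimensional pose."""
    last = observed[-1]
    return np.tile(last, (horizon, 1))

# Toy usage: 10 observed frames of a 54-dim pose vector, 25 predicted frames.
observed = np.random.default_rng(1).standard_normal((10, 54))
future = zero_velocity_baseline(observed, 25)
```

Because short-term human motion is often nearly static at the pose level, such a constant prediction can score surprisingly well under per-frame error metrics, which is the evaluation issue the abstract highlights.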