41,363 research outputs found
A Computational Approach to Hand Pose Recognition in Early Modern Paintings
Hands represent an important aspect of pictorial narration but have rarely been addressed as an object of study in art history and digital humanities. Although hand gestures play a significant role in conveying emotions, narratives, and cultural symbolism in the context of visual art, a comprehensive terminology for the classification of depicted hand poses is still lacking. In this article, we present the process of creating a new annotated dataset of pictorial hand poses. The dataset is based on a collection of European early modern paintings, from which hands are extracted using human pose estimation (HPE) methods. The hand images are then manually annotated based on art historical categorization schemes. From this categorization, we introduce a new classification task and perform a series of experiments using different types of features, including our newly introduced 2D hand keypoint features, as well as existing neural network-based features. This classification task represents a new and complex challenge due to the subtle and contextually dependent differences between depicted hands. The presented computational approach to hand pose recognition in paintings represents an initial attempt to tackle this challenge, which could potentially advance the use of HPE methods on paintings, as well as foster new research on the understanding of hand gestures in art
From images via symbols to contexts: using augmented reality for interactive model acquisition
Systems that perform in real environments need to bind the internal state to externally
perceived objects, events, or complete scenes. How to learn this correspondence has been a long
standing problem in computer vision as well as artificial intelligence. Augmented Reality provides
an interesting perspective on this problem because a human user can directly relate displayed
system results to real environments. In the following we present a system that is able to bootstrap
internal models from user-system interactions. Starting from pictorial representations it learns
symbolic object labels that provide the basis for storing observed episodes. In a second step, more
complex relational information is extracted from stored episodes that enables the system to react
on specific scene contexts
Discovering useful parts for pose estimation in sparsely annotated datasets
Our work introduces a novel way to increase pose estimation accuracy by discovering parts from unannotated regions of training images. Discovered parts are used to generate more accurate appearance likelihoods for traditional part-based models like Pictorial Structures and its derivatives. Our experiments on images of a hawkmoth in flight show that our proposed approach significantly improves over existing work for this application, while also being more generally applicable. Our proposed approach localizes landmarks at least twice as accurately as a baseline based on a Mixture of Pictorial Structures (MPS) model. Our unique High-Resolution Moth Flight (HRMF) dataset is made publicly available with annotations.https://arxiv.org/abs/1605.00707Accepted manuscrip
Mimesis and Metaphor
"Representation", "imitation" and "mirroring" have proved
to be insufficient translations of the concept of mimesis.
Walter Benjamin's notion of "mimetic potential" offers a
different view on the qualities of mimesis. Benjamin
stresses the importance of language and its mediality to
mimesis; for him language is the "höchste Stufe des
mimetischen Verhaltens und das vollkommenste Archiv
der unsinnlichen Ähnlichkeit" (Benjamin 1991, Bd. II/1,
213) He considers mimesis on the level of linguistic
mediality.
In the following I will try to outline the mimetic potential
of metaphors in literary texts which focus on their linguistic
mediality. As Paul Ricoeur suggests, "the possibility that
metaphorical discourse says something about reality
collides with the apparent constitution of poetic discourse,
which seems to be essentially non-referential and centred
on itself. To this non-referential conception of poetic
discourse I oppose the idea that the suspension of literal
reference is the condition for the release of a power of
second-degree reference which is properly poetic
reference. Thus, to use an expression borrowed from
Jakobson, one must not speak only of split sense but of
'split reference' as well." (Ricoeur 1977, 6
Towards Accurate Multi-person Pose Estimation in the Wild
We propose a method for multi-person detection and 2-D pose estimation that
achieves state-of-art results on the challenging COCO keypoints task. It is a
simple, yet powerful, top-down approach consisting of two stages.
In the first stage, we predict the location and scale of boxes which are
likely to contain people; for this we use the Faster RCNN detector. In the
second stage, we estimate the keypoints of the person potentially contained in
each proposed bounding box. For each keypoint type we predict dense heatmaps
and offsets using a fully convolutional ResNet. To combine these outputs we
introduce a novel aggregation procedure to obtain highly localized keypoint
predictions. We also use a novel form of keypoint-based Non-Maximum-Suppression
(NMS), instead of the cruder box-level NMS, and a novel form of keypoint-based
confidence score estimation, instead of box-level scoring.
Trained on COCO data alone, our final system achieves average precision of
0.649 on the COCO test-dev set and the 0.643 test-standard sets, outperforming
the winner of the 2016 COCO keypoints challenge and other recent state-of-art.
Further, by using additional in-house labeled data we obtain an even higher
average precision of 0.685 on the test-dev set and 0.673 on the test-standard
set, more than 5% absolute improvement compared to the previous best performing
method on the same dataset.Comment: Paper describing an improved version of the G-RMI entry to the 2016
COCO keypoints challenge (http://image-net.org/challenges/ilsvrc+coco2016).
Camera ready version to appear in the Proceedings of CVPR 201
DeepPose: Human Pose Estimation via Deep Neural Networks
We propose a method for human pose estimation based on Deep Neural Networks
(DNNs). The pose estimation is formulated as a DNN-based regression problem
towards body joints. We present a cascade of such DNN regressors which results
in high precision pose estimates. The approach has the advantage of reasoning
about pose in a holistic fashion and has a simple but yet powerful formulation
which capitalizes on recent advances in Deep Learning. We present a detailed
empirical analysis with state-of-art or better performance on four academic
benchmarks of diverse real-world images.Comment: IEEE Conference on Computer Vision and Pattern Recognition, 201
- …