215 research outputs found
A semantic feature for human motion retrieval
With the explosive growth of motion capture data, animation production urgently needs an efficient search engine for retrieving motions from large motion repositories. However, because of the high dimensionality of the data space and the complexity of matching methods, most existing approaches cannot return results in real time. This paper proposes a high-level semantic feature in a low-dimensional space that represents the essential characteristics of different motion classes. Based on statistical training of a Gaussian mixture model, this feature supports effective motion matching at both the global clip level and the local frame level. Experimental results show that our approach can retrieve and rank similar motions from a large motion database in real time and can also annotate motions automatically on the fly. Copyright © 2013 John Wiley & Sons, Ltd.
"Ein ächtes antipoeticum":Untersuchungen zur poetologischen Verwendung des Horns in der deutschen Literatur
The history of the horn in German literature reaches back to Old High German: in the Muspilli, the horn sounds for the Last Judgment. From the Olifant in the Rolandslied through Oberon's horn and Des Knaben Wunderhorn to its use in Wagner's operas, the instrument plays an important role, particularly in works of Romanticism. With its distinctive sound and its capacity for echo effects, it is ideally suited to evoking notions of the otherworldly. Yet the horn can also act antipoetically and as an instrument of the apocalypse. Using selected examples, this study shows the varied cosmos of meaning in the horn's literary use and examines the specific semantic content of the horn motif for each work. It also sheds light on how the once fixed spectrum of meaning dissolved, over the course of literary history, into a freer use of the horn motif.
Adding New Tasks to a Single Network with Weight Transformations using Binary Masks
Visual recognition algorithms are required today to exhibit adaptive
abilities. Given a deep model trained on a specific, given task, it would be
highly desirable to be able to adapt incrementally to new tasks, preserving
scalability as the number of new tasks increases, while at the same time
avoiding catastrophic forgetting issues. Recent work has shown that masking the
internal weights of a given original conv-net through learned binary variables
is a promising strategy. We build upon this intuition and take into account
more elaborate affine transformations of the convolutional weights that
include learned binary masks. We show that with our generalization it is
possible to achieve significantly higher levels of adaptation to new tasks,
enabling the approach to compete with fine tuning strategies by requiring
slightly more than 1 bit per network parameter per additional task. Experiments
on two popular benchmarks showcase the power of our approach, which achieves a
new state of the art on the Visual Decathlon Challenge.
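The idea of an affine weight transformation built around a learned binary mask can be illustrated with a minimal sketch. The form below (a scaled copy of the pretrained weights, a constant offset, and a scaled masked copy) is one plausible reading of the abstract, not a confirmed reproduction of the paper's formulation. The names `binarize`, `transform_weights`, `k0`, `k1`, and `k2` are hypothetical, and in practice the real-valued mask and scalars would be learned by gradient descent rather than set by hand.

```python
def binarize(real_mask, threshold=0.0):
    """Turn a real-valued mask into a 0/1 mask by hard thresholding (assumed scheme)."""
    return [[1.0 if m > threshold else 0.0 for m in row] for row in real_mask]


def transform_weights(weights, real_mask, k0, k1, k2):
    """Apply an assumed affine transformation: k0*W + k1 + k2*(M * W).

    `weights` stays frozen; per new task only the binary mask M
    (1 bit per parameter) and the scalars k0, k1, k2 are stored.
    """
    mask = binarize(real_mask)
    return [
        [k0 * w + k1 + k2 * m * w for w, m in zip(w_row, m_row)]
        for w_row, m_row in zip(weights, mask)
    ]


# Toy 2x2 "filter" and a hand-picked real-valued mask (illustrative values only).
pretrained = [[0.5, -1.0], [2.0, 0.0]]
real_mask = [[0.3, -0.2], [-0.7, 0.9]]

# With k0=0, k1=0, k2=1 the transformation reduces to plain binary masking.
masked_only = transform_weights(pretrained, real_mask, k0=0.0, k1=0.0, k2=1.0)

# With k0=1, k1=0, k2=0 the pretrained weights pass through unchanged.
identity = transform_weights(pretrained, real_mask, k0=1.0, k1=0.0, k2=0.0)
```

Under this reading, the per-task storage is the binary mask plus a handful of scalars, which is consistent with the "slightly more than 1 bit per network parameter" figure quoted in the abstract.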
Classification of Two Comic Books based on Convolutional Neural Networks
Non-photographic images are powerful representations that can describe a wide variety of situations. Understanding intellectual products such as comics and picture books is therefore an important topic in the field of artificial intelligence. To this end, a stepwise analysis of a comic story was pursued, covering features of parts of the image, information features, features relating to continuous scenes, and so on. In particular, the length and individual scenes of four-scene comics are limited so as to ensure a clear interpretation of the contents. In this study, as a first step in this direction, we focus on the problem of classifying two four-scene comics by the same artists. Several classifiers were constructed using a convolutional neural network (CNN), and the classification results of a human annotator and of the computational method were compared. These experiments show that a CNN is an efficient way to classify non-photographic grayscale images, and they reveal characteristic features of the images that were classified incorrectly.
Deep Shape Matching
We cast shape matching as metric learning with convolutional networks. We
break the end-to-end process of image representation into two parts. Firstly,
well established efficient methods are chosen to turn the images into edge
maps. Secondly, the network is trained with edge maps of landmark images, which
are automatically obtained by a structure-from-motion pipeline. The learned
representation is evaluated on a range of different tasks, providing
improvements on challenging cases of domain generalization, generic
sketch-based image retrieval or its fine-grained counterpart. In contrast to
other methods that learn a different model per task, object category, or
domain, we use the same network throughout all our experiments, achieving
state-of-the-art results in multiple benchmarks. Comment: ECCV 201
Development and implementation of a web-enabled 3D consultation tool for breast augmentation surgery based on 3D-image reconstruction of 2D pictures
Producing a rich, personalized Web-based consultation tool for plastic surgeons and patients is challenging
Aligning Salient Objects to Queries: A Multi-modal and Multi-object Image Retrieval Framework
This is the author accepted manuscript. The final version is available from Springer Verlag via the DOI in this record. ACCV 2018: 14th Asian Conference on Computer Vision, Perth, Australia, 2-6 December 2018.
In this paper we propose an approach for multi-modal image retrieval in multi-labelled images. A multi-modal deep network architecture is formulated to jointly model sketches and text as input query modalities in a common embedding space, which is then further aligned with the image feature space. Our architecture also relies on salient object detection through a supervised LSTM-based visual attention model learned from convolutional features. Both the alignment between the queries and the image and the supervision of the attention on the images are obtained by generalizing the Hungarian Algorithm using different loss functions. This permits encoding the object-based features and their alignment with the query irrespective of whether the co-occurrence of different objects is available in the training set. We validate the performance of our approach on standard single/multi-object datasets, showing state-of-the-art performance on every dataset. European Union Horizon 2020; CERCA Program of Generalitat de Cataluny
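The Hungarian Algorithm mentioned in the abstract solves the assignment problem: given a cost for pairing each query item with each detected object, find the one-to-one pairing with the lowest total cost. The sketch below illustrates only that underlying problem, and it uses a brute-force search over permutations as a stand-in for the Hungarian Algorithm (same optimum, but only practical for very small inputs). The function name `best_assignment` and the cost values are hypothetical, and the paper's generalization with different loss functions is not reproduced here.

```python
from itertools import permutations


def best_assignment(cost):
    """Return the minimum-cost one-to-one assignment for a square cost matrix.

    Brute force over all permutations: a stand-in for the Hungarian
    Algorithm that finds the same optimum on tiny matrices.
    """
    n = len(cost)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return list(best_perm), best_cost


# Rows: query items (e.g. sketch or text terms); columns: salient objects.
# Integer costs chosen by hand for illustration; lower means a better match.
cost = [
    [9, 1, 8],
    [2, 7, 6],
    [5, 4, 3],
]

assignment, total_cost = best_assignment(cost)
# assignment[i] is the object index matched to query item i.
```

For realistic sizes a polynomial-time solver would be the natural choice; the brute-force version is used here only to keep the sketch dependency-free.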
Inner Space Preserving Generative Pose Machine
Image-based generative methods, such as generative adversarial networks
(GANs) have already been able to generate realistic images with much context
control, especially when they are conditioned. However, most successful
frameworks share a common procedure which performs an image-to-image
translation with pose of figures in the image untouched. When the objective is
reposing a figure in an image while preserving the rest of the image, the
state-of-the-art mainly assumes a single rigid body with simple background and
limited pose shift, which can hardly be extended to the images under normal
settings. In this paper, we introduce an image "inner space" preserving model
that assigns an interpretable low-dimensional pose descriptor (LDPD) to an
articulated figure in the image. Figure reposing is then generated by passing
the LDPD and the original image through multi-stage augmented hourglass
networks in a conditional GAN structure, called inner space preserving
generative pose machine (ISP-GPM). We evaluated ISP-GPM on reposing human
figures, which are highly articulated with versatile variations. Testing a
state-of-the-art pose estimator on our reposed dataset gave an accuracy of over
80% on the PCK0.5 metric. The results also showed that our ISP-GPM is able to
preserve the background with high accuracy while reasonably recovering the area
blocked by the figure to be reposed. Comment: http://www.northeastern.edu/ostadabbas/2018/07/23/inner-space-preserving-generative-pose-machine
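For readers unfamiliar with the PCK0.5 figure quoted above, the following is a minimal sketch of how a PCK (Percentage of Correct Keypoints) score is commonly computed: a predicted keypoint counts as correct when it lies within a fraction `alpha` of a reference length from the ground truth. The abstract does not say which reference length was used (head segment and torso size are both common conventions), so `reference_length` is an assumed input, and the function name `pck` and the coordinates are hypothetical.

```python
import math


def pck(predicted, ground_truth, reference_length, alpha=0.5):
    """Fraction of keypoints within alpha * reference_length of the ground truth.

    `reference_length` is an assumption of this sketch: PCK variants
    normalize by different body measurements.
    """
    threshold = alpha * reference_length
    correct = sum(
        1
        for p, g in zip(predicted, ground_truth)
        if math.dist(p, g) <= threshold
    )
    return correct / len(ground_truth)


# Three hand-made keypoints in pixel coordinates (illustrative only).
predicted = [(10, 10), (20, 22), (40, 40)]
ground_truth = [(10, 12), (20, 20), (30, 30)]

# With reference_length=10 and alpha=0.5 the threshold is 5 pixels:
# the first two keypoints are within it, the third is not.
score = pck(predicted, ground_truth, reference_length=10, alpha=0.5)
```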
SketchyScene: Richly-Annotated Scene Sketches
We contribute the first large-scale dataset of scene sketches, SketchyScene,
with the goal of advancing research on sketch understanding at both the object
and scene level. The dataset is created through a novel and carefully designed
crowdsourcing pipeline, enabling users to efficiently generate large quantities
of realistic and diverse scene sketches. SketchyScene contains more than 29,000
scene-level sketches, 7,000+ pairs of scene templates and photos, and 11,000+
object sketches. All objects in the scene sketches have ground-truth semantic
and instance masks. The dataset is also highly scalable and extensible, easily
allowing augmenting and/or changing scene composition. We demonstrate the
potential impact of SketchyScene by training new computational models for
semantic segmentation of scene sketches and showing how the new dataset enables
several applications, including image retrieval, sketch colorization, editing,
and captioning. The dataset and code can be found at
https://github.com/SketchyScene/SketchyScene