A semantic feature for human motion retrieval
With the explosive growth of motion capture data, an efficient search engine for retrieving motions from large repositories has become essential in animation production. However, because of the high dimensionality of the data space and the complexity of matching methods, most existing approaches cannot return results in real time. This paper proposes a high-level semantic feature in a low-dimensional space that represents the essential characteristics of different motion classes. Based on statistical training of a Gaussian Mixture Model, this feature enables effective motion matching at both the global clip level and the local frame level. Experimental results show that our approach can retrieve similar motions, with rankings, from a large motion database in real time, and can also annotate motions automatically on the fly. Copyright © 2013 John Wiley & Sons, Ltd
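A minimal sketch of how such a semantic feature could be realized, assuming per-frame pose vectors and class-labeled training clips; the function names, diagonal covariances, and softmax-normalized class posteriors below are illustrative assumptions, not the paper's implementation:

```python
# Hypothetical sketch: per-class GMM log-likelihoods turned into a
# low-dimensional semantic feature (one dimension per motion class).
import numpy as np
from sklearn.mixture import GaussianMixture

def train_class_gmms(clips_by_class, n_components=8):
    """Fit one GMM per motion class on stacked per-frame pose vectors."""
    gmms = {}
    for label, clips in clips_by_class.items():
        frames = np.vstack(clips)  # (total_frames, pose_dim)
        gmms[label] = GaussianMixture(n_components=n_components,
                                      covariance_type="diag").fit(frames)
    return gmms

def semantic_feature(frames, gmms):
    """Map frames to class-posterior vectors: (n_frames, n_classes)."""
    loglik = np.stack([g.score_samples(frames) for g in gmms.values()], axis=1)
    loglik -= loglik.max(axis=1, keepdims=True)  # numerically stable softmax
    post = np.exp(loglik)
    return post / post.sum(axis=1, keepdims=True)

def clip_feature(frames, gmms):
    """Clip-level feature: mean of the frame-level semantic features."""
    return semantic_feature(frames, gmms).mean(axis=0)
```

At query time, clip_feature vectors can be compared with a simple distance (e.g., Euclidean) for clip-level retrieval, while semantic_feature supports local frame-level matching.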
Efficient sketch-based creation of detailed character models through data-driven mesh deformations
Creating detailed character models is a challenging task in animation production. Sketch-based character model creation from a 3D template offers a promising solution. However, quickly finding correct correspondences between the user's drawn sketches and the 3D template model, efficiently deforming the template to match those sketches exactly, and achieving real-time interactive modeling remain open problems. In this paper, we propose a new approach, and develop a user interface, to tackle this problem effectively. Our approach uses the user's drawn sketches to retrieve the most similar 3D template model from our dataset, and combines human perception and interaction with highly efficient computation to extract the occluding and silhouette contours of the template and find correct correspondences quickly. We then combine skeleton-based deformation with mesh editing to deform the template to fit the drawn sketches and create new, detailed 3D character models. The results presented in this paper demonstrate the effectiveness and advantages of our approach and the usefulness of our user interface.
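As one hedged illustration of the retrieval step only, the sketch below matches a drawn sketch against candidate template silhouette contours using a symmetric Chamfer distance over sampled 2D points; the point-set representation and both helper functions are assumptions for illustration, not the authors' method:

```python
# Illustrative retrieval sketch: pick the template whose silhouette
# contour points lie closest to the sketch points, and vice versa.
import numpy as np
from scipy.spatial import cKDTree

def chamfer(a, b):
    """Symmetric Chamfer distance between two 2D point sets of shape (n, 2)."""
    d_ab = cKDTree(b).query(a)[0].mean()  # mean nearest distance a -> b
    d_ba = cKDTree(a).query(b)[0].mean()  # mean nearest distance b -> a
    return d_ab + d_ba

def retrieve_template(sketch_pts, template_contours):
    """Return the index of the template contour most similar to the sketch."""
    dists = [chamfer(sketch_pts, c) for c in template_contours]
    return int(np.argmin(dists))
```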
Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models
Misalignment between the outputs of a vision-language (VL) model and the task goal hinders its deployment. This issue can worsen when there are distribution shifts between the training and test data. To address this problem, prevailing fully test-time adaptation (TTA) methods bootstrap themselves through entropy minimization. However, minimizing the entropy of the predictions makes the model overfit to its own incorrect output distributions. In this work, we propose TTA with feedback to avoid such overfitting and align the model with task goals. Specifically, we adopt CLIP as a reward model to provide feedback to VL models during test time on various tasks, including image classification, image-text retrieval, and image captioning. Given a single test sample, the model aims to maximize the CLIP reward through reinforcement learning. We adopt a reward design that uses the average CLIP score of sampled candidates as the baseline. This design is simple and surprisingly effective when combined with various task-specific sampling strategies. The entire system is flexible, allowing the reward model to be extended with multiple CLIP models. In addition, a momentum buffer can be used to memorize and leverage the knowledge learned from multiple test samples. Extensive experiments demonstrate that our method significantly improves different VL models after TTA.
Comment: preprint, work in progress; project URL https://github.com/mzhaoshuai/RLC
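A minimal sketch of the described reward design, assuming two hypothetical hooks the abstract does not specify: sample_candidates (task-specific sampling of k candidate outputs with their log-probabilities) and clip_score (the CLIP reward for each candidate). Centering rewards on their mean implements the average-score baseline:

```python
import torch

def tta_step(model, optimizer, x, sample_candidates, clip_score, k=8):
    """One test-time adaptation step on a single test sample x.

    REINFORCE with the mean CLIP score of the k sampled candidates as the
    baseline. `sample_candidates` and `clip_score` are hypothetical hooks,
    not the paper's API.
    """
    candidates, log_probs = sample_candidates(model, x, k)  # log_probs: (k,), with grad
    with torch.no_grad():
        rewards = clip_score(x, candidates)      # (k,) CLIP rewards, no grad
        advantages = rewards - rewards.mean()    # baseline = average CLIP score
    loss = -(advantages * log_probs).mean()      # policy-gradient objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The momentum buffer the abstract mentions could additionally accumulate knowledge across test samples; it is omitted from this sketch.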