3D Object Reconstruction from Hand-Object Interactions
Recent advances have enabled 3D object reconstruction approaches using a
single off-the-shelf RGB-D camera. Although these approaches are successful for
a wide range of object classes, they rely on stable and distinctive geometric
or texture features. Many objects like mechanical parts, toys, household or
decorative articles, however, are textureless and characterized by minimalistic
shapes that are simple and symmetric. Existing in-hand scanning systems and 3D
reconstruction techniques fail for such symmetric objects in the absence of
highly distinctive features. In this work, we show that extracting 3D hand
motion for in-hand scanning effectively facilitates the reconstruction of even
featureless and highly symmetric objects, and we present an approach that fuses
the rich additional information of hands into a 3D reconstruction pipeline,
significantly contributing to the state of the art of in-hand scanning.
Comment: International Conference on Computer Vision (ICCV) 2015,
http://files.is.tue.mpg.de/dtzionas/In-Hand-Scannin
Action Sets: Weakly Supervised Action Segmentation without Ordering Constraints
Action detection and temporal segmentation of actions in videos are topics of
increasing interest. While fully supervised systems have gained much attention
lately, full annotation of each action within the video is costly and
impractical for large amounts of video data. Thus, weakly supervised action
detection and temporal segmentation methods are of great importance. While most
works in this area assume an ordered sequence of occurring actions to be given,
our approach only uses a set of actions. Such action sets provide much less
supervision since neither action ordering nor the number of action occurrences
is known. In exchange, they can be easily obtained, for instance, from
meta-tags, while ordered sequences still require human annotation. We introduce
a system that automatically learns to temporally segment and label actions in a
video, where action sets are the only supervision used. An evaluation
on three datasets shows that our method still achieves good results although
the amount of supervision is significantly smaller than for other related
methods.
Comment: CVPR 201
Weakly Supervised Action Learning with RNN based Fine-to-coarse Modeling
We present an approach for weakly supervised learning of human actions. Given
a set of videos and an ordered list of the occurring actions, the goal is to
infer start and end frames of the related action classes within the video and
to train the respective action classifiers without any need for hand-labeled
frame boundaries. To address this task, we propose a combination of a
discriminative representation of subactions, modeled by a recurrent neural
network, and a coarse probabilistic model to allow for a temporal alignment and
inference over long sequences. While this system alone already generates good
results, we show that the performance can be further improved by adapting
the number of subactions to the characteristics of the different action
classes. To this end, we adapt the number of subaction classes by iterating
realignment and reestimation during training. The proposed system is evaluated
on two benchmark datasets, the Breakfast and the Hollywood Extended datasets,
showing competitive performance on various weak learning tasks such as
temporal action segmentation and action alignment.
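
The alignment task described in this abstract (inferring start and end frames from an ordered list of actions) can be illustrated with a plain forced alignment by dynamic programming. The sketch below is not the authors' RNN-based system. It assumes that per-frame log-scores for each action class are already available from some classifier, and the names align_transcript, frame_scores and transcript are hypothetical.

```python
def align_transcript(frame_scores, transcript):
    """Align an ordered action transcript to frames by dynamic programming.

    frame_scores[t][c] is an assumed log-score of class c at frame t.
    transcript is the ordered list of class ids occurring in the video.
    Returns a list of (class_id, start_frame, end_frame), end inclusive.
    """
    num_frames, num_steps = len(frame_scores), len(transcript)
    if num_steps == 0 or num_frames < num_steps:
        raise ValueError("need at least one frame per transcript entry")

    neg_inf = float("-inf")
    # best[t][n]: best total score with frame t assigned to transcript entry n
    best = [[neg_inf] * num_steps for _ in range(num_frames)]
    # advanced[t][n]: True if the best path moved from entry n-1 to n at frame t
    advanced = [[False] * num_steps for _ in range(num_frames)]
    best[0][0] = frame_scores[0][transcript[0]]

    for t in range(1, num_frames):
        for n in range(min(t + 1, num_steps)):
            stay = best[t - 1][n]
            move = best[t - 1][n - 1] if n > 0 else neg_inf
            score = frame_scores[t][transcript[n]]
            if move > stay:
                best[t][n] = move + score
                advanced[t][n] = True
            else:
                best[t][n] = stay + score

    # Backtrack from the last frame, which must belong to the last entry.
    step_of_frame = [0] * num_frames
    n = num_steps - 1
    for t in range(num_frames - 1, -1, -1):
        step_of_frame[t] = n
        if t > 0 and advanced[t][n]:
            n -= 1

    # Collapse the per-frame assignment into contiguous segments.
    segments = []
    start = 0
    for t in range(1, num_frames + 1):
        if t == num_frames or step_of_frame[t] != step_of_frame[t - 1]:
            segments.append((transcript[step_of_frame[t - 1]], start, t - 1))
            start = t
    return segments


# Toy example: 6 frames, 2 classes, transcript "0, then 1, then 0".
HI, LO = -0.1, -3.0
toy_scores = [[HI, LO], [HI, LO], [LO, HI], [LO, HI], [HI, LO], [HI, LO]]
toy_segments = align_transcript(toy_scores, [0, 1, 0])
```

In this toy example each transcript entry receives the two frames where its class scores highest. In a weakly supervised setting such an alignment would typically be alternated with retraining the frame classifier on the inferred boundaries, which is the general idea behind the realignment and reestimation mentioned above.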