9,016 research outputs found
Unsupervised Video Understanding by Reconciliation of Posture Similarities
Understanding human activity and being able to explain it in detail surpasses
mere action classification by far in both complexity and value. The challenge
is thus to describe an activity on the basis of its most fundamental
constituents, the individual postures and their distinctive transitions.
Supervised learning of such a fine-grained representation based on elementary
poses is very tedious and does not scale. Therefore, we propose a completely
unsupervised deep learning procedure based solely on video sequences, which
starts from scratch without requiring pre-trained networks, predefined body
models, or keypoints. A combinatorial sequence matching algorithm proposes
relations between frames from subsets of the training data, while a CNN is
reconciling the transitivity conflicts of the different subsets to learn a
single concerted pose embedding despite changes in appearance across sequences.
Without any manual annotation, the model learns a structured representation of
postures and their temporal development. The model not only enables retrieval
of similar postures but also temporal super-resolution. Additionally, based on
a recurrent formulation, next frames can be synthesized.Comment: Accepted by ICCV 201
Neural Task Programming: Learning to Generalize Across Hierarchical Tasks
In this work, we propose a novel robot learning framework called Neural Task
Programming (NTP), which bridges the idea of few-shot learning from
demonstration and neural program induction. NTP takes as input a task
specification (e.g., video demonstration of a task) and recursively decomposes
it into finer sub-task specifications. These specifications are fed to a
hierarchical neural program, where bottom-level programs are callable
subroutines that interact with the environment. We validate our method in three
robot manipulation tasks. NTP achieves strong generalization across sequential
tasks that exhibit hierarchal and compositional structures. The experimental
results show that NTP learns to generalize well to- wards unseen tasks with
increasing lengths, variable topologies, and changing objectives.Comment: ICRA 201
Generating Music Medleys via Playing Music Puzzle Games
Generating music medleys is about finding an optimal permutation of a given
set of music clips. Toward this goal, we propose a self-supervised learning
task, called the music puzzle game, to train neural network models to learn the
sequential patterns in music. In essence, such a game requires machines to
correctly sort a few multisecond music fragments. In the training stage, we
learn the model by sampling multiple non-overlapping fragment pairs from the
same songs and seeking to predict whether a given pair is consecutive and is in
the correct chronological order. For testing, we design a number of puzzle
games with different difficulty levels, the most difficult one being music
medley, which requiring sorting fragments from different songs. On the basis of
state-of-the-art Siamese convolutional network, we propose an improved
architecture that learns to embed frame-level similarity scores computed from
the input fragment pairs to a common space, where fragment pairs in the correct
order can be more easily identified. Our result shows that the resulting model,
dubbed as the similarity embedding network (SEN), performs better than
competing models across different games, including music jigsaw puzzle, music
sequencing, and music medley. Example results can be found at our project
website, https://remyhuang.github.io/DJnet.Comment: Accepted at AAAI 201
- …