3,648 research outputs found
Activity representation with motion hierarchies
International audienceComplex activities, e.g., pole vaulting, are composed of a variable number of sub-events connected by complex spatio-temporal relations, whereas simple actions can be represented as sequences of short temporal parts. In this paper, we learn hierarchical representations of activity videos in an unsupervised manner. These hierarchies of mid-level motion components are data-driven decompositions specific to each video. We introduce a spectral divisive clustering algorithm to efficiently extract a hierarchy over a large number of tracklets (i.e., local trajectories). We use this structure to represent a video as an unordered binary tree. We model this tree using nested histograms of local motion features. We provide an efficient positive definite kernel that computes the structural and visual similarity of two hierarchical decompositions by relying on models of their parent-child relations. We present experimental results on four recent challenging benchmarks: the High Five dataset [Patron-Perez et al, 2010], the Olympics Sports dataset [Niebles et al, 2010], the Hollywood 2 dataset [Marszalek et al, 2009], and the HMDB dataset [Kuehne et al, 2011]. We show that pervideo hierarchies provide additional information for activity recognition. Our approach improves over unstructured activity models, baselines using other motion decomposition algorithms, and the state of the art
Robust Temporally Coherent Laplacian Protrusion Segmentation of 3D Articulated Bodies
In motion analysis and understanding it is important to be able to fit a
suitable model or structure to the temporal series of observed data, in order
to describe motion patterns in a compact way, and to discriminate between them.
In an unsupervised context, i.e., no prior model of the moving object(s) is
available, such a structure has to be learned from the data in a bottom-up
fashion. In recent times, volumetric approaches in which the motion is captured
from a number of cameras and a voxel-set representation of the body is built
from the camera views, have gained ground due to attractive features such as
inherent view-invariance and robustness to occlusions. Automatic, unsupervised
segmentation of moving bodies along entire sequences, in a temporally-coherent
and robust way, has the potential to provide a means of constructing a
bottom-up model of the moving body, and track motion cues that may be later
exploited for motion classification. Spectral methods such as locally linear
embedding (LLE) can be useful in this context, as they preserve "protrusions",
i.e., high-curvature regions of the 3D volume, of articulated shapes, while
improving their separation in a lower dimensional space, making them in this
way easier to cluster. In this paper we therefore propose a spectral approach
to unsupervised and temporally-coherent body-protrusion segmentation along time
sequences. Volumetric shapes are clustered in an embedding space, clusters are
propagated in time to ensure coherence, and merged or split to accommodate
changes in the body's topology. Experiments on both synthetic and real
sequences of dense voxel-set data are shown. This supports the ability of the
proposed method to cluster body-parts consistently over time in a totally
unsupervised fashion, its robustness to sampling density and shape quality, and
its potential for bottom-up model constructionComment: 31 pages, 26 figure
Optimal Networks from Error Correcting Codes
To address growth challenges facing large Data Centers and supercomputing
clusters a new construction is presented for scalable, high throughput, low
latency networks. The resulting networks require 1.5-5 times fewer switches,
2-6 times fewer cables, have 1.2-2 times lower latency and correspondingly
lower congestion and packet losses than the best present or proposed networks
providing the same number of ports at the same total bisection. These advantage
ratios increase with network size. The key new ingredient is the exact
equivalence discovered between the problem of maximizing network bisection for
large classes of practically interesting Cayley graphs and the problem of
maximizing codeword distance for linear error correcting codes. Resulting
translation recipe converts existent optimal error correcting codes into
optimal throughput networks.Comment: 14 pages, accepted at ANCS 2013 conferenc
- …