4,924 research outputs found

    Learning and Refining of Privileged Information-based RNNs for Action Recognition from Depth Sequences

    Existing RNN-based approaches for action recognition from depth sequences require either skeleton joints or hand-crafted depth features as inputs. An end-to-end approach that maps raw depth maps to action classes is non-trivial to design for two reasons: 1) a single-channel depth map lacks texture, which weakens its discriminative power; 2) the available sets of depth training data are relatively small. To address these challenges, we propose to learn an RNN driven by privileged information (PI) in three steps. An encoder is pre-trained to learn a joint embedding of depth appearance and PI (i.e. skeleton joints). The learned embedding layers are then tuned in the learning step, aiming to optimize the network by exploiting PI in the form of a multi-task loss. However, exploiting PI as a secondary task does little to improve the performance of the primary task (i.e. classification) because of the gap between them. Finally, a bridging matrix is defined to connect the two tasks by discovering latent PI in the refining step. Our PI-based classification loss maintains consistency between the latent PI and the predicted distribution. The latent PI and the network are iteratively estimated and updated in an expectation-maximization procedure. The proposed learning process provides greater discriminative power to model subtle depth differences, while helping to avoid overfitting the scarce training data. Our experiments show significant performance gains over state-of-the-art methods on three public benchmark datasets and our newly collected Blanket dataset. Comment: conference cvpr 201

    Occlusion-aware Hand Pose Estimation Using Hierarchical Mixture Density Network

    Learning and predicting the pose parameters of a 3D hand model given an image, such as the locations of hand joints, is challenging due to large viewpoint changes and articulations, and to the severe self-occlusions exhibited particularly in egocentric views. Both feature learning and prediction modeling have been investigated to tackle the problem. Though effective, most existing discriminative methods yield a single deterministic estimate of the target pose. Because their mapping is intrinsically single-valued, they fail to adequately handle self-occlusion, where occluded joints present multiple modes. In this paper, we tackle the self-occlusion issue and provide a complete description of observed poses given an input depth image by a novel method called hierarchical mixture density networks (HMDN). The proposed method leverages state-of-the-art hand pose estimators based on Convolutional Neural Networks to facilitate feature learning, while it models the multiple modes in a two-level hierarchy to reconcile single-valued and multi-valued mapping in its output. The whole framework, with a mixture of two differentiable density functions, is naturally end-to-end trainable. In the experiments, HMDN produces interpretable and diverse candidate samples, significantly outperforms state-of-the-art methods on two benchmarks with occlusions, and performs comparably on another benchmark free of occlusions.
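
    As a rough illustration of a two-level mixture density, the sketch below scores one joint coordinate under a density that blends a single Gaussian (single-valued case) with a Gaussian mixture (multi-valued case). The abstract does not spell out the model, so the visibility weight `w_visible`, the 1-D setting, and all function names are assumptions made for illustration, not the authors' formulation.

```python
import math


def gaussian_pdf(y, mu, sigma):
    """Density of a 1-D Gaussian N(mu, sigma^2) evaluated at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def hierarchical_density(y, w_visible, single, mixture):
    """Two-level density for one joint coordinate (illustrative only).

    w_visible: assumed probability that the joint is visible, in [0, 1]
    single:    (mu, sigma) of the single Gaussian used for the visible case
    mixture:   list of (pi, mu, sigma) components for the occluded case;
               the pi values are expected to sum to 1
    """
    p_visible = gaussian_pdf(y, *single)
    p_occluded = sum(pi * gaussian_pdf(y, mu, sigma) for pi, mu, sigma in mixture)
    return w_visible * p_visible + (1.0 - w_visible) * p_occluded


def negative_log_likelihood(y, w_visible, single, mixture):
    """Training-style loss: negative log of the two-level density at y."""
    return -math.log(hierarchical_density(y, w_visible, single, mixture))


# Hypothetical occluded joint whose coordinate is plausible near -1 or +1.
bimodal = [(0.5, -1.0, 0.2), (0.5, 1.0, 0.2)]
loss_at_mode = negative_log_likelihood(1.0, 0.0, (0.0, 0.2), bimodal)
loss_between_modes = negative_log_likelihood(0.0, 0.0, (0.0, 0.2), bimodal)
```

    With the bimodal mixture, a target at either mode receives a lower loss than a target halfway between them, which a single deterministic estimate cannot express.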

    Deep Convolutional Decision Jungle for Image Classification

    We propose a novel method called deep convolutional decision jungle (CDJ) and its learning algorithm for image classification. The CDJ maintains the structure of standard convolutional neural networks (CNNs), i.e. multiple layers of multiple response maps, fully connected. Each response map, or node, in both the convolutional and fully-connected layers selectively responds to class labels such that each data sample travels via a specific soft route of the activated nodes. The proposed CDJ learns features automatically, whereas decision forests and jungles require pre-defined feature sets. Compared to CNNs, the method embeds the benefits of using data-dependent discriminative functions, which better handle multi-modal/heterogeneous data; further, the method offers more diverse sparse network responses, which in turn can be used for cost-effective learning/classification. The network is learnt by combining the conventional softmax loss and the proposed entropy loss in each layer. The entropy loss, as used in decision tree growing, measures the purity of data activation according to the class label distribution. The back-propagation rule for the proposed loss function is derived from stochastic gradient descent (SGD) optimization of CNNs. We show that our proposed method outperforms state-of-the-art methods on three public image classification benchmarks and one face verification dataset. We also demonstrate the use of auxiliary data labels, when available, which helps our method to learn more discriminative routing and representations and leads to improved classification.
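
    The purity measure mentioned above can be sketched as the Shannon entropy of the class distribution of the samples that activate a node, with each sample weighted by its activation. This is a minimal sketch under assumptions: non-negative activations (for example after a ReLU), integer class labels, and the hypothetical function name `activation_entropy`; it is not the exact loss defined in the paper.

```python
import math


def activation_entropy(activations, labels, num_classes):
    """Entropy (in nats) of the activation-weighted class distribution at one node.

    activations: non-negative activation of the node for each sample
    labels:      integer class label in [0, num_classes) for each sample
    Returns 0.0 for a pure node and log(num_classes) for a uniform one.
    """
    # Accumulate activation mass per class.
    mass = [0.0] * num_classes
    for a, c in zip(activations, labels):
        mass[c] += a
    total = sum(mass)
    if total == 0.0:
        return 0.0  # node never fires; treat as pure by convention
    entropy = 0.0
    for m in mass:
        if m > 0.0:
            p = m / total
            entropy -= p * math.log(p)
    return entropy


# A node that fires only for class 0 is pure; one that fires equally
# for both classes has maximal entropy for two classes.
pure = activation_entropy([1.0, 1.0, 0.0, 0.0], [0, 0, 1, 1], 2)
mixed = activation_entropy([1.0, 1.0, 1.0, 1.0], [0, 0, 1, 1], 2)
```

    Minimizing such a quantity pushes each node to respond to few classes, which is the same criterion used when choosing splits in decision tree growing.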

    Condensate of a charged boson fluid at non-integer dimensions

    The condensate of a charged boson fluid at non-integer dimensions between 2 and 3 is studied. The interaction between particles is assumed to be Coulombic, and the Bogoliubov approximation is applied in the weak-coupling regime. The condensate and the superfluid fraction at finite temperatures and at non-integer dimensions are calculated. The theoretical results are compared with the superfluid densities of superconducting films, which show a universal splitting behavior. The qualitative similarity between the charged boson fluid and the superconducting films gives a strong clue to the origin of the universality. Comment: 10 pages, 8 figures

    Transition Forests: Learning Discriminative Temporal Transitions for Action Recognition and Detection

    A human action can be seen as transitions between one's body poses over time, where a transition depicts a temporal relation between two poses. Recognizing actions thus involves learning a classifier sensitive to these pose transitions as well as to static poses. In this paper, we introduce a novel method called transition forests, an ensemble of decision trees that learn to discriminate both static poses and transitions between pairs of independent frames. During training, node splitting is driven by alternating between two criteria: the standard classification objective, which maximizes the discrimination power in individual frames, and the proposed objective on pairwise frame transitions. Growing the trees tends to group frames that have similar associated transitions and share the same action label, incorporating temporal information that was not otherwise available. Unlike conventional decision trees, where the best split in a node is determined independently of other nodes, transition forests try to find the best split of nodes jointly (within a layer) to incorporate distant node transitions. When inferring the class label of a new frame, the frame is passed down the trees and the prediction is made from the previous frame predictions and the current one in an efficient, online manner. We apply our method to varied skeleton action recognition and online detection datasets, showing its suitability over several baselines and state-of-the-art approaches. Comment: to appear in CVPR 201

    Decoherence from Internal Dynamics in Macroscopic Matter-Wave Interferometry

    Usually, decoherence is generated by coupling to an external environment. However, a macroscopic object generically possesses an environment within itself, namely the complicated dynamics of its internal degrees of freedom. We address the question of when and how the internal dynamics decohere the interference of the center-of-mass motion of a macroscopic object. We also show that weak localization of a macroscopic object in disordered potentials can be destroyed by such decoherence. Comment: 5 pages, 2 figures

    Atomic-scale simulations on the sliding of incommensurate surfaces: The breakdown of superlubricity

    Molecular dynamics simulations of frictional sliding in an Atomic Force Microscope (AFM) show a clear dependence of superlubricity between incommensurate surfaces on tip compliance and applied normal force. While the kinetic friction vanishes for rigid tips and low normal force, superlubric behavior breaks down for softer tips and high normal force. The simulations provide evidence that the Frenkel-Kontorova-Tomlinson (FKT) scaling applies equally to a more realistic 3-D incommensurate AFM model, except in the limit of very low stiffness and high normal load. Unlike the FKT model, in which the breakdown of superlubricity coincides with the emergence of meta-stable states, in the 3-D model some meta-stable states appear to reduce the frictional force, leading to a non-monotonic dependence of force on normal load and tip compliance. Meta-stable states vary with the slider position, and the relative stabilities of these meta-stable states result in transition mechanisms that vary with sliding velocity. Comment: 31 pages, 15 figures, 1 table

    Reformulating hyperdynamics without a transition state theory dividing surface

    Reformulating hyperdynamics without using a transition state theory (TST) dividing surface makes it possible to accelerate conventional molecular dynamics (MD) simulation using a broader range of bias potentials. A new scheme to calculate the boost factor is also introduced that makes the hyperdynamics method more accurate and efficient. Novel bias potentials that use the hyper-distance and the potential energy slope and curvature along the direction vector from a minimum to the current position can significantly reduce the computational overhead required. Results from simulating an atomic force microscope (AFM) system validate the new methodology.
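
    For context, the conventional hyperdynamics boost factor is the average of exp(ΔV / k_B T) over the steps of the biased trajectory, where ΔV is the bias energy added at each step. The sketch below computes that textbook estimator only; it is not the new scheme the abstract introduces, and the function name, the eV units, and the sample values are assumptions for illustration.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def boost_factor(bias_energies_ev, temperature_k):
    """Conventional hyperdynamics boost: mean of exp(dV / (k_B * T)).

    bias_energies_ev: bias potential value (eV) sampled at each MD step
                      of the biased trajectory
    temperature_k:    simulation temperature in kelvin
    """
    beta = 1.0 / (K_B_EV_PER_K * temperature_k)
    weights = [math.exp(beta * dv) for dv in bias_energies_ev]
    return sum(weights) / len(weights)


# Hypothetical bias samples at 300 K; the accelerated time is the
# plain MD time multiplied by this factor.
example_boost = boost_factor([0.00, 0.05, 0.10, 0.02], 300.0)
```

    A zero bias everywhere gives a boost of exactly 1, so the estimator reduces to ordinary MD when no bias potential is applied.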

    Learning Deep Convolutional Embeddings for Face Representation Using Joint Sample- and Set-based Supervision

    In this work, we investigate several methods and strategies to learn deep embeddings for face recognition, using joint sample- and set-based optimization. We explain our framework, which expands traditional learning with set-based supervision, together with the strategies used to maintain set characteristics. We then briefly review the related set-based loss functions, and subsequently propose a novel Max-Margin Loss which maximizes the maximum possible inter-class margin with the assistance of Support Vector Machines (SVMs). It implicitly pushes all the samples towards the correct side of the margin with a vector perpendicular to the hyperplane and a strength that grows exponentially towards the negative side of the hyperplane. We show that the introduced loss outperforms the previous sample-based and set-based ones in terms of face verification on two commonly used benchmarks. Comment: 8 pages, 5 figures, 2 tables, workshop paper

    Alpha invariants of birationally rigid Fano threefolds

    We compute global log canonical thresholds, or equivalently alpha invariants, of birationally rigid orbifold Fano threefolds embedded in weighted projective spaces as codimension two or three. As an important application, we prove that most of them are weakly exceptional, K-stable and admit a Kähler-Einstein metric. Comment: 38 pages