96,893 research outputs found

    Deep collective inference

    Get PDF
    Collective inference is widely used to improve classification in network datasets. However, despite recent advances in deep learning and the successes of recurrent neural networks (RNNs), researchers have only just recently begun to study how to apply RNNs to heterogeneous graph and network datasets. There has been recent work on using RNNs for unsupervised learning in networks (e.g., graph clustering, node embedding) and for prediction (e.g., link prediction, graph classification), but there has been little work on using RNNs for node-based relational classification tasks. In this paper, we provide an end-to-end learning framework using RNNs for collective inference. Our main insight is to transform a node and its set of neighbors into an unordered sequence (of varying length) and use an LSTM-based RNN to predict the class label as the output of that sequence. We develop a collective inference method, which we refer to as Deep Collective Inference (DCI), that uses semi-supervised learning in partially-labeled networks and two label distribution correction mechanisms for imbalanced classes. We compare to several alternative methods on seven network datasets. DCI achieves up to a 12% reduction in error compared to the best alternative and a 25% reduction in error on average over all methods, for all label proportions

    Social Scene Understanding: End-to-End Multi-Person Action Localization and Collective Activity Recognition

    Get PDF
    We present a unified framework for understanding human social behaviors in raw image sequences. Our model jointly detects multiple individuals, infers their social actions, and estimates the collective actions with a single feed-forward pass through a neural network. We propose a single architecture that does not rely on external detection algorithms but rather is trained end-to-end to generate dense proposal maps that are refined via a novel inference scheme. The temporal consistency is handled via a person-level matching Recurrent Neural Network. The complete model takes as input a sequence of frames and outputs detections along with the estimates of individual actions and collective activities. We demonstrate state-of-the-art performance of our algorithm on multiple publicly available benchmarks

    CERN: Confidence-Energy Recurrent Network for Group Activity Recognition

    Full text link
    This work is about recognizing human activities occurring in videos at distinct semantic levels, including individual actions, interactions, and group activities. The recognition is realized using a two-level hierarchy of Long Short-Term Memory (LSTM) networks, forming a feed-forward deep architecture, which can be trained end-to-end. In comparison with existing architectures of LSTMs, we make two key contributions giving the name to our approach as Confidence-Energy Recurrent Network -- CERN. First, instead of using the common softmax layer for prediction, we specify a novel energy layer (EL) for estimating the energy of our predictions. Second, rather than finding the common minimum-energy class assignment, which may be numerically unstable under uncertainty, we specify that the EL additionally computes the p-values of the solutions, and in this way estimates the most confident energy minimum. The evaluation on the Collective Activity and Volleyball datasets demonstrates: (i) advantages of our two contributions relative to the common softmax and energy-minimization formulations and (ii) a superior performance relative to the state-of-the-art approaches.Comment: Accepted to IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 201

    Latent Embeddings for Collective Activity Recognition

    Full text link
    Rather than simply recognizing the action of a person individually, collective activity recognition aims to find out what a group of people is acting in a collective scene. Previ- ous state-of-the-art methods using hand-crafted potentials in conventional graphical model which can only define a limited range of relations. Thus, the complex structural de- pendencies among individuals involved in a collective sce- nario cannot be fully modeled. In this paper, we overcome these limitations by embedding latent variables into feature space and learning the feature mapping functions in a deep learning framework. The embeddings of latent variables build a global relation containing person-group interac- tions and richer contextual information by jointly modeling broader range of individuals. Besides, we assemble atten- tion mechanism during embedding for achieving more com- pact representations. We evaluate our method on three col- lective activity datasets, where we contribute a much larger dataset in this work. The proposed model has achieved clearly better performance as compared to the state-of-the- art methods in our experiments.Comment: 6pages, accepted by IEEE-AVSS201

    A Collective Variational Autoencoder for Top-NN Recommendation with Side Information

    Full text link
    Recommender systems have been studied extensively due to their practical use in many real-world scenarios. Despite this, generating effective recommendations with sparse user ratings remains a challenge. Side information associated with items has been widely utilized to address rating sparsity. Existing recommendation models that use side information are linear and, hence, have restricted expressiveness. Deep learning has been used to capture non-linearities by learning deep item representations from side information but as side information is high-dimensional existing deep models tend to have large input dimensionality, which dominates their overall size. This makes them difficult to train, especially with small numbers of inputs. Rather than learning item representations, which is problematic with high-dimensional side information, in this paper, we propose to learn feature representation through deep learning from side information. Learning feature representations, on the other hand, ensures a sufficient number of inputs to train a deep network. To achieve this, we propose to simultaneously recover user ratings and side information, by using a Variational Autoencoder (VAE). Specifically, user ratings and side information are encoded and decoded collectively through the same inference network and generation network. This is possible as both user ratings and side information are data associated with items. To account for the heterogeneity of user rating and side information, the final layer of the generation network follows different distributions depending on the type of information. The proposed model is easy to implement and efficient to optimize and is shown to outperform state-of-the-art top-NN recommendation methods that use side information.Comment: 7 pages, 3 figures, DLRS workshop 201
    • …
    corecore