240 research outputs found

    Weakly-Supervised Action Segmentation with Iterative Soft Boundary Assignment

    Full text link
    In this work, we address the task of weakly-supervised human action segmentation in long, untrimmed videos. Recent methods have relied on expensive learning models, such as Recurrent Neural Networks (RNN) and Hidden Markov Models (HMM). However, these methods suffer from expensive computational cost, thus are unable to be deployed in large scale. To overcome the limitations, the keys to our design are efficiency and scalability. We propose a novel action modeling framework, which consists of a new temporal convolutional network, named Temporal Convolutional Feature Pyramid Network (TCFPN), for predicting frame-wise action labels, and a novel training strategy for weakly-supervised sequence modeling, named Iterative Soft Boundary Assignment (ISBA), to align action sequences and update the network in an iterative fashion. The proposed framework is evaluated on two benchmark datasets, Breakfast and Hollywood Extended, with four different evaluation metrics. Extensive experimental results show that our methods achieve competitive or superior performance to state-of-the-art methods.Comment: CVPR 201

    DiffuRec: A Diffusion Model for Sequential Recommendation

    Full text link
    Mainstream solutions to Sequential Recommendation (SR) represent items with fixed vectors. These vectors have limited capability in capturing items' latent aspects and users' diverse preferences. As a new generative paradigm, Diffusion models have achieved excellent performance in areas like computer vision and natural language processing. To our understanding, its unique merit in representation generation well fits the problem setting of sequential recommendation. In this paper, we make the very first attempt to adapt Diffusion model to SR and propose DiffuRec, for item representation construction and uncertainty injection. Rather than modeling item representations as fixed vectors, we represent them as distributions in DiffuRec, which reflect user's multiple interests and item's various aspects adaptively. In diffusion phase, DiffuRec corrupts the target item embedding into a Gaussian distribution via noise adding, which is further applied for sequential item distribution representation generation and uncertainty injection. Afterwards, the item representation is fed into an Approximator for target item representation reconstruction. In reversion phase, based on user's historical interaction behaviors, we reverse a Gaussian noise into the target item representation, then apply rounding operation for target item prediction. Experiments over four datasets show that DiffuRec outperforms strong baselines by a large margin

    Pair-Linking for Collective Entity Disambiguation: Two Could Be Better Than All

    Full text link
    Collective entity disambiguation aims to jointly resolve multiple mentions by linking them to their associated entities in a knowledge base. Previous works are primarily based on the underlying assumption that entities within the same document are highly related. However, the extend to which these mentioned entities are actually connected in reality is rarely studied and therefore raises interesting research questions. For the first time, we show that the semantic relationships between the mentioned entities are in fact less dense than expected. This could be attributed to several reasons such as noise, data sparsity and knowledge base incompleteness. As a remedy, we introduce MINTREE, a new tree-based objective for the entity disambiguation problem. The key intuition behind MINTREE is the concept of coherence relaxation which utilizes the weight of a minimum spanning tree to measure the coherence between entities. Based on this new objective, we design a novel entity disambiguation algorithms which we call Pair-Linking. Instead of considering all the given mentions, Pair-Linking iteratively selects a pair with the highest confidence at each step for decision making. Via extensive experiments, we show that our approach is not only more accurate but also surprisingly faster than many state-of-the-art collective linking algorithms
    • …
    corecore