
    Generating Video Descriptions with Topic Guidance

    Generating video descriptions in natural language (a.k.a. video captioning) is a more challenging task than image captioning, as videos are intrinsically more complicated than images in two respects. First, videos cover a broader range of topics, such as news, music, and sports. Second, multiple topics can coexist in the same video. In this paper, we propose a novel caption model, the topic-guided model (TGM), which exploits topic information to generate topic-oriented descriptions for videos in the wild. In addition to predefined topics, i.e., category tags crawled from the web, we also mine topics in a data-driven way from the training captions with an unsupervised topic mining model, and we show that the data-driven topics reflect a better topic schema than the predefined ones. To predict topics for test videos, we treat the topic mining model as a teacher that trains a student topic prediction model, which utilizes all modalities of the video, especially the speech modality. We propose a series of caption models that exploit topic guidance: implicitly, by using the topics as input features to generate topic-related words, and explicitly, by modifying the decoder weights with topics so that the decoder functions as an ensemble of topic-aware language decoders. Comprehensive experiments on MSR-VTT, currently the largest video caption dataset, demonstrate the effectiveness of our topic-guided model, which significantly surpasses the winning performance in the 2016 MSR video-to-language challenge.
    Comment: Appeared at ICMR 201
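The "ensemble of topic-aware language decoders" idea can be illustrated as a weighted mixture of per-topic next-word distributions, where the predicted topic weights mix the per-topic decoders into one distribution. This is a minimal stdlib-only sketch; the function name, the topic labels, and the toy probabilities are illustrative placeholders, not the paper's implementation (which operates on neural decoder weights).

```python
# Hypothetical sketch: each topic owns a next-word distribution, and the
# video's predicted topic weights mix them into a single distribution.
def mix_topic_decoders(topic_weights, topic_word_probs):
    """Combine per-topic word distributions using topic weights.

    topic_weights: dict topic -> weight (sums to 1.0)
    topic_word_probs: dict topic -> {word: probability}
    Returns a single {word: probability} distribution.
    """
    mixed = {}
    for topic, w in topic_weights.items():
        for word, p in topic_word_probs[topic].items():
            mixed[word] = mixed.get(word, 0.0) + w * p
    return mixed

# Toy example: a video predicted as 70% "sports", 30% "music".
weights = {"sports": 0.7, "music": 0.3}
probs = {
    "sports": {"player": 0.6, "song": 0.1, "ball": 0.3},
    "music":  {"player": 0.1, "song": 0.8, "ball": 0.1},
}
dist = mix_topic_decoders(weights, probs)  # e.g. "song": 0.7*0.1 + 0.3*0.8
```

The mixture shifts probability mass toward words of the dominant topic, which is the intuition behind topic-conditioned decoding.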

    Efficient Correlated Topic Modeling with Topic Embedding

    Correlated topic modeling has been limited to small model and problem sizes due to its high computational cost and poor scaling. In this paper, we propose a new model that learns compact topic embeddings and captures topic correlations through the closeness between the topic vectors. Our method enables efficient inference in the low-dimensional embedding space, reducing the previous cubic or quadratic time complexity to linear w.r.t. the topic size. We further speed up variational inference with a fast sampler that exploits the sparsity of topic occurrence. Extensive experiments show that our approach can handle model and data scales several orders of magnitude larger than existing correlated-topic results, without sacrificing modeling quality, providing competitive or superior performance in document classification and retrieval.
    Comment: KDD 2017 oral. The first two authors contributed equall
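The core idea, sketched under illustrative assumptions: topics live as low-dimensional vectors, topic correlation is the closeness between those vectors, and scoring a document against K topics costs O(K·d) dot products, linear in the topic count, rather than manipulating a K×K covariance matrix. The toy 2-D embeddings and function names below are hypothetical, not the paper's code.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def topic_scores(doc_vec, topic_vecs):
    """Softmax over doc-topic dot products: one pass, linear in K."""
    logits = [dot(doc_vec, t) for t in topic_vecs]
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def topic_correlation(t_i, t_j):
    """Correlated topics are those whose embeddings are close (cosine)."""
    return dot(t_i, t_j) / (math.sqrt(dot(t_i, t_i)) * math.sqrt(dot(t_j, t_j)))

topics = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]   # toy 2-D topic embeddings
scores = topic_scores([1.0, 0.2], topics)        # linear in the topic count
corr = topic_correlation(topics[0], topics[1])   # nearby vectors => correlated
```

Here the first two topics are close in embedding space and hence strongly correlated, while the third is orthogonal to the first; no explicit correlation matrix is ever stored.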

    Localized sampling for hospital re-admission prediction with imbalanced sample distributions

    © 2017 IEEE. Hospital re-admission refers to the event in which a patient previously discharged from the hospital is readmitted within a short period of time (say, 30 days). A re-admission not only downgrades the patient's quality of life, it also adds a significant financial burden to health care systems. To date, many systems use computational approaches to predict the likelihood of a patient being readmitted, for medical decision assistance. When building predictive models for hospital re-admission, one essential challenge is that the sample distribution in the data is severely imbalanced: typically, fewer than 10% of patients are likely to be readmitted in the near future. A predictive model that ignores this imbalance is unlikely to generate accurate predictions, yet no existing re-admission model has explicitly addressed such data imbalance. In this paper, we consider hospital re-admission prediction with imbalanced sample distributions and propose a localized sampling approach to help build accurate predictive models. Localized sampling emphasizes samples that are difficult to classify and biases the sampling process toward such instances. Because finding hard-to-classify instances requires computing distances between instances, and the high dimensionality of Electronic Health Record (EHR) data makes such distance calculation highly ineffective, we use latent topic embedding to project samples from the high-dimensional space into a low-dimensional topic space for effective and accurate distance calculation. By using localized sampling to build multiple balanced datasets, we train multiple predictive models and combine their results for prediction. Experiments and comparisons on data collected from several South Florida regional hospitals demonstrate the performance of our method.
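A hedged sketch of the localized-sampling step: each majority-class candidate is weighted by how hard it is to classify (here approximated by proximity to the minority class in a low-dimensional "topic" space), and a balanced dataset is drawn with the sampling biased toward those hard instances. The 2-D coordinates, function names, and the specific hardness formula are illustrative assumptions, not the paper's method.

```python
import random

def hardness(x, opposite_class):
    """Closer to the opposite class => harder to classify => larger weight."""
    d = min(sum((a - b) ** 2 for a, b in zip(x, o)) ** 0.5
            for o in opposite_class)
    return 1.0 / (1.0 + d)

def localized_balanced_sample(majority, minority, rng):
    """Draw |minority| majority samples, biased toward hard instances,
    yielding one balanced dataset; repeat with fresh draws to train an
    ensemble of predictive models."""
    weights = [hardness(x, minority) for x in majority]
    chosen = rng.choices(majority, weights=weights, k=len(minority))
    return chosen + minority

rng = random.Random(0)
majority = [(0.1, 0.1), (0.2, 0.2), (0.9, 0.9), (0.95, 0.85)]  # not readmitted
minority = [(1.0, 1.0)]                                        # readmitted
balanced = localized_balanced_sample(majority, minority, rng)
```

Repeating the draw with different random seeds produces multiple balanced datasets; training one model per dataset and combining their outputs gives the ensemble described in the abstract.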