795 research outputs found

    Video Question Answering via Attribute-Augmented Attention Network Learning

    Full text link
    Video Question Answering is a challenging problem in visual information retrieval, which provides the answer to the referenced video content according to the question. However, the existing visual question answering approaches mainly tackle the problem of static image question, which may be ineffectively for video question answering due to the insufficiency of modeling the temporal dynamics of video contents. In this paper, we study the problem of video question answering by modeling its temporal dynamics with frame-level attention mechanism. We propose the attribute-augmented attention network learning framework that enables the joint frame-level attribute detection and unified video representation learning for video question answering. We then incorporate the multi-step reasoning process for our proposed attention network to further improve the performance. We construct a large-scale video question answering dataset. We conduct the experiments on both multiple-choice and open-ended video question answering tasks to show the effectiveness of the proposed method.Comment: Accepted for SIGIR 201

    Multidimensional Membership Mixture Models

    Full text link
    We present the multidimensional membership mixture (M3) models where every dimension of the membership represents an independent mixture model and each data point is generated from the selected mixture components jointly. This is helpful when the data has a certain shared structure. For example, three unique means and three unique variances can effectively form a Gaussian mixture model with nine components, while requiring only six parameters to fully describe it. In this paper, we present three instantiations of M3 models (together with the learning and inference algorithms): infinite, finite, and hybrid, depending on whether the number of mixtures is fixed or not. They are built upon Dirichlet process mixture models, latent Dirichlet allocation, and a combination respectively. We then consider two applications: topic modeling and learning 3D object arrangements. Our experiments show that our M3 models achieve better performance using fewer topics than many classic topic models. We also observe that topics from the different dimensions of M3 models are meaningful and orthogonal to each other.Comment: 9 pages, 7 figure

    An Ontology-Based Recommender System with an Application to the Star Trek Television Franchise

    Full text link
    Collaborative filtering based recommender systems have proven to be extremely successful in settings where user preference data on items is abundant. However, collaborative filtering algorithms are hindered by their weakness against the item cold-start problem and general lack of interpretability. Ontology-based recommender systems exploit hierarchical organizations of users and items to enhance browsing, recommendation, and profile construction. While ontology-based approaches address the shortcomings of their collaborative filtering counterparts, ontological organizations of items can be difficult to obtain for items that mostly belong to the same category (e.g., television series episodes). In this paper, we present an ontology-based recommender system that integrates the knowledge represented in a large ontology of literary themes to produce fiction content recommendations. The main novelty of this work is an ontology-based method for computing similarities between items and its integration with the classical Item-KNN (K-nearest neighbors) algorithm. As a study case, we evaluated the proposed method against other approaches by performing the classical rating prediction task on a collection of Star Trek television series episodes in an item cold-start scenario. This transverse evaluation provides insights into the utility of different information resources and methods for the initial stages of recommender system development. We found our proposed method to be a convenient alternative to collaborative filtering approaches for collections of mostly similar items, particularly when other content-based approaches are not applicable or otherwise unavailable. Aside from the new methods, this paper contributes a testbed for future research and an online framework to collaboratively extend the ontology of literary themes to cover other narrative content.Comment: 25 pages, 6 figures, 5 tables, minor revision

    Latent Semantic Indexing (LSI) Based Distributed System and Search On Encrypted Data

    Get PDF
    Latent semantic indexing (LSI) was initially introduced to overcome the issues of synonymy and polysemy of the traditional vector space model (VSM). LSI, however, has challenges of its own, mainly scalability. Despite being introduced in 1990, there are few attempts that provide an efficient solution for LSI, most of the literature is focuses on LSI’s applications rather than improving the original algorithm. In this work we analyze the first framework to provide scalable implementation of LSI and report its performance on the distributed environment of RAAD. The possibility of adopting LSI in the field of searching over encrypted data is also investigated. The importance of that field is stemmed from the need for cloud computing as an effective computing paradigm that provides an affordable access to high computational power. Encryption is usually applied to prevent unauthorized access to the data (the host is assumed to be curious), however this limits accessibility to the data given that search over encryption is yet to catch with the latest techniques adopted by the Information Retrieval (IR) community. In this work we propose a system that uses LSI for indexing and free-query text for retrieving. The results show that the available LSI framework does scale on large datasets, however it had some limitations with respect to factors like dictionary size and memory limit. When replicating the exact settings of the baseline on RAAD, it performed relatively slower. This could be resulted by the fact that RAAD uses a distributed file system or because of network latency. The results also show that the proposed system for applying LSI on encrypted data retrieved documents in the same order as the baseline (unencrypted data)
    • …
    corecore