106 research outputs found

    Designing interfaces for exploratory content based image retrieval systems

    Content-based image retrieval (CBIR) systems have become the state-of-the-art image retrieval technique over the past few years, showing commendable retrieval performance over traditional annotation-based retrieval. CBIR systems use relevance feedback as the input query. However, CBIR systems developed so far have put little effort into user interfaces that accept relevance feedback efficiently, i.e. that place less cognitive load on the user and allow more exploration in a limited amount of time. In this study we propose a new interface, 'FutureView', which lets the user peek into the future, providing access to more images in less time than traditional interfaces. This helps the user choose more appropriate images without getting diverted. We used the Gaussian process upper confidence bound algorithm for recommending images. We compared this algorithm with Random and Exploitation algorithms, with positive results.
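
As a rough illustration of the Gaussian process upper confidence bound idea mentioned in this abstract, the sketch below scores candidate images by posterior mean plus an exploration bonus. It is not the authors' implementation: the RBF kernel, the feature vectors, and the names `rbf_kernel`, `gp_ucb_scores`, `beta`, and `noise` are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential kernel between the rows of a and the rows of b.
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gp_ucb_scores(x_seen, y_seen, x_cand, beta=2.0, noise=1e-2):
    # x_seen: features of images the user already rated (hypothetical input).
    # y_seen: relevance feedback for those images.
    # x_cand: features of candidate images to rank.
    k_ss = rbf_kernel(x_seen, x_seen) + noise * np.eye(len(x_seen))
    k_cs = rbf_kernel(x_cand, x_seen)
    # Solve once for both the posterior mean and the posterior variance terms.
    solved = np.linalg.solve(k_ss, np.column_stack([y_seen, k_cs.T]))
    mean = k_cs @ solved[:, 0]
    var = 1.0 - np.sum(k_cs * solved[:, 1:].T, axis=1)  # prior variance is 1 for this kernel
    # UCB score: exploit high predicted relevance, explore high uncertainty.
    return mean + np.sqrt(beta) * np.sqrt(np.maximum(var, 0.0))

x_seen = np.array([[0.0, 0.0], [1.0, 1.0]])
y_seen = np.array([1.0, 0.0])
x_cand = np.array([[0.0, 0.0], [10.0, 10.0]])
scores = gp_ucb_scores(x_seen, y_seen, x_cand)
```

A candidate far from all rated images keeps its full prior uncertainty, so its score is driven almost entirely by the exploration bonus `sqrt(beta)`.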

    Counterfactual Learning of Continuous Stochastic Policies

    Counterfactual reasoning from logged data has become increasingly important for many applications such as web advertising or healthcare. In this paper, we address the problem of counterfactual risk minimization (CRM) for learning a stochastic policy with continuous actions, whereas most existing work has focused on the discrete setting. Switching from discrete to continuous action spaces presents several difficulties, as naive discretization strategies have been shown to perform poorly. To deal with this issue, we first introduce an effective contextual modelling strategy that learns a joint representation of contexts and actions based on positive definite kernels. Second, we empirically show that the optimization perspective of CRM is more important than previously thought, and we demonstrate the benefits of proximal point algorithms and differentiable estimators. Finally, we propose an evaluation protocol for offline policies in real-world logged systems, which is challenging since policies cannot be replayed on test data, and we release a new large-scale dataset along with multiple synthetic, yet realistic, evaluation setups.
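
To make the counterfactual setting concrete, the sketch below estimates the risk of a new continuous-action policy from logged data with a clipped importance-weighted estimator. It is a generic illustration, not the paper's method: the Gaussian target policy, the clipping constant, and the names `gaussian_pdf` and `clipped_ips_risk` are assumptions, and the kernel-based contextual model and proximal point optimization described in the abstract are not shown.

```python
import numpy as np

def gaussian_pdf(a, mean, std):
    # Density of a 1-D Gaussian policy evaluated at action a.
    return np.exp(-0.5 * ((a - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def clipped_ips_risk(actions, losses, logging_density, target_mean, target_std, clip=10.0):
    # Importance weight: target policy density over logging policy density.
    weights = gaussian_pdf(actions, target_mean, target_std) / logging_density
    # Clipping caps large weights to limit the variance of the estimate.
    return float(np.mean(losses * np.minimum(weights, clip)))

# Hypothetical logged data: actions drawn under a standard normal logging policy.
actions = np.array([0.0, 1.0, -1.0])
losses = np.array([1.0, 2.0, 3.0])
logging_density = gaussian_pdf(actions, 0.0, 1.0)

# Evaluating the logging policy itself: all weights are 1, so the
# estimate reduces to the plain average of the logged losses.
risk = clipped_ips_risk(actions, losses, logging_density, 0.0, 1.0)
```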

    Master-slave Deep Architecture for Top-K Multi-armed Bandits with Non-linear Bandit Feedback and Diversity Constraints

    We propose a novel master-slave architecture to solve the top-K combinatorial multi-armed bandits problem with non-linear bandit feedback and diversity constraints, which, to the best of our knowledge, is the first combinatorial bandits setting considering diversity constraints under bandit feedback. Specifically, to efficiently explore the combinatorial and constrained action space, we introduce six slave models with distinguished merits to generate diversified samples that balance rewards, constraints, and efficiency. Moreover, we propose teacher-learning-based optimization and a policy co-training technique to boost the performance of the multiple slave models. The master model then collects the elite samples provided by the slave models and selects the best sample, as estimated by a neural contextual UCB-based network, to make a decision with a trade-off between exploration and exploitation. Thanks to the elaborate design of the slave models, the co-training mechanism among them, and the novel interactions between the master and slave models, our approach significantly surpasses existing state-of-the-art algorithms on both synthetic and real datasets for recommendation tasks. The code is available at: https://github.com/huanghanchi/Master-slave-Algorithm-for-Top-K-Bandits
    Comment: IEEE Transactions on Neural Networks and Learning Systems