106 research outputs found

Designing interfaces for exploratory content based image retrieval systems

Author: Hore Sayantan
Publication venue: Helsingin yliopisto
Publication date: 01/01/2015

Content-based image retrieval (CBIR) systems have become the state-of-the-art image retrieval technique over the past few years, showing commendable retrieval performance compared with traditional annotation-based retrieval. CBIR systems use relevance feedback as the input query. CBIR systems developed so far have put little effort into user interfaces that accept relevance feedback efficiently, i.e. that place less cognitive load on the user and allow more exploration in a limited amount of time. In this study we propose a new interface, 'FutureView', which allows peeking into the future and provides access to more images in less time than traditional interfaces. This helps the user choose more appropriate images without getting diverted. We used the Gaussian process upper confidence bound algorithm for recommending images, and compared it with Random and Exploitation algorithms, with positive results.

Helsingin yliopiston digitaalinen arkisto
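
The FutureView abstract above names the Gaussian process upper confidence bound (GP-UCB) algorithm as its recommender but gives no details. The following is a minimal sketch of the generic GP-UCB scoring rule, not the thesis's implementation: the RBF kernel, the `beta` and `noise` values, and all function names are assumptions chosen for illustration. Each candidate image (represented by a feature vector) is scored as posterior mean plus `beta` times posterior standard deviation, given the relevance feedback collected so far.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two feature vectors (assumed kernel)."""
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq / (2.0 * length_scale ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_ucb_scores(X_seen, y_seen, candidates, beta=2.0, noise=0.1):
    """Score each candidate as mean + beta * std under a zero-mean GP posterior."""
    n = len(X_seen)
    # Kernel matrix of rated images, with observation noise on the diagonal.
    K = [[rbf(X_seen[i], X_seen[j]) + (noise ** 2 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, y_seen)
    scores = []
    for c in candidates:
        k = [rbf(c, x) for x in X_seen]
        mean = sum(ki * ai for ki, ai in zip(k, alpha))
        v = solve(K, k)
        var = max(rbf(c, c) - sum(ki * vi for ki, vi in zip(k, v)), 0.0)
        scores.append(mean + beta * math.sqrt(var))
    return scores


def recommend(X_seen, y_seen, candidates, **kwargs):
    """Return the index of the candidate with the highest UCB score."""
    scores = gp_ucb_scores(X_seen, y_seen, candidates, **kwargs)
    return max(range(len(candidates)), key=scores.__getitem__)
```

With `beta=0.0` the rule reduces to pure exploitation of the posterior mean, which is one plausible reading of the "Exploitation" baseline mentioned in the abstract; a larger `beta` favours images far from anything the user has rated so far.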

Counterfactual Learning of Continuous Stochastic Policies

Authors: Bietti Alberto, Diemert Eustache, Mairal Julien, Martin Matthieu, Zenati Houssam
Publication date: 24/06/2020

Counterfactual reasoning from logged data has become increasingly important for many applications such as web advertising or healthcare. In this paper, we address the problem of counterfactual risk minimization (CRM) for learning a stochastic policy with continuous actions, whereas most existing work has focused on the discrete setting. Switching from discrete to continuous action spaces presents several difficulties, as naive discretization strategies have been shown to perform poorly. To deal with this issue, we first introduce an effective contextual modelling strategy that learns a joint representation of contexts and actions based on positive definite kernels. Second, we empirically show that the optimization perspective of CRM is more important than previously thought, and we demonstrate the benefits of proximal point algorithms and differentiable estimators. Finally, we propose an evaluation protocol for offline policies in real-world logged systems, which is challenging since policies cannot be replayed on test data, and we release a new large-scale dataset along with multiple synthetic, yet realistic, evaluation setups.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server
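
To make the counterfactual setting in the abstract above concrete, here is a minimal sketch of a clipped importance-weighted risk estimate for a policy with continuous actions. It is a textbook estimator under stated assumptions, not the paper's kernel-based model or its proximal point optimizer: the one-dimensional Gaussian policy, the log format `(context, action, cost, logging_density)`, the `clip` threshold, and all names are hypothetical.

```python
import math


def gaussian_pdf(a, mean, std):
    """Density of a 1-D Gaussian policy at action a."""
    z = (a - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def clipped_ips_risk(logs, policy_mean, policy_std, clip=10.0):
    """Estimate the expected cost of a new Gaussian policy from logged data.

    logs: list of (context, action, cost, logging_density) tuples, where
    logging_density is the density the logging policy assigned to the action
    it actually played (assumed to be recorded at logging time).
    policy_mean: function mapping a context to the new policy's mean action.
    """
    total = 0.0
    for x, a, cost, p0 in logs:
        # Importance weight: density ratio between new and logging policies.
        w = gaussian_pdf(a, policy_mean(x), policy_std) / p0
        # Clipping caps the weight to limit the variance of the estimate.
        total += cost * min(w, clip)
    return total / len(logs)
```

When the evaluated policy coincides with the logging policy, every weight equals 1 and the estimate reduces to the average logged cost, which is a quick sanity check for any implementation of this kind.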

Master-slave Deep Architecture for Top-K Multi-armed Bandits with Non-linear Bandit Feedback and Diversity Constraints

Authors: Huang Hanchi, Liu Wei, Shen Li, Ye Deheng
Publication date: 24/08/2023

We propose a novel master-slave architecture to solve the top-K combinatorial multi-armed bandits problem with non-linear bandit feedback and diversity constraints, which, to the best of our knowledge, is the first combinatorial bandits setting considering diversity constraints under bandit feedback. Specifically, to efficiently explore the combinatorial and constrained action space, we introduce six slave models with distinguished merits to generate diversified samples well balancing rewards and constraints as well as efficiency. Moreover, we propose teacher learning based optimization and the policy co-training technique to boost the performance of the multiple slave models. The master model then collects the elite samples provided by the slave models and selects the best sample estimated by a neural contextual UCB-based network to make a decision with a trade-off between exploration and exploitation. Thanks to the elaborate design of slave models, the co-training mechanism among slave models, and the novel interactions between the master and slave models, our approach significantly surpasses existing state-of-the-art algorithms in both synthetic and real datasets for recommendation tasks. The code is available at: https://github.com/huanghanchi/Master-slave-Algorithm-for-Top-K-Bandits

Comment: IEEE Transactions on Neural Networks and Learning Systems

arXiv.org e-Print Archive
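
As a point of reference for the top-K setting in the abstract above, the sketch below shows a simple greedy baseline: pick the K highest-scoring items while respecting a diversity constraint. It is not the paper's master-slave architecture. The abstract does not define its diversity constraints, so the per-group cap `max_per_group`, the `(item_id, group, ucb_score)` tuple layout, and the function name are assumptions, and the scores are taken as already computed by some UCB-style estimator.

```python
def greedy_top_k(items, k, max_per_group):
    """Greedily pick up to k items by descending score under a per-group cap.

    items: list of (item_id, group, ucb_score) tuples (hypothetical layout).
    max_per_group: assumed diversity constraint, the maximum number of
    selected items that may share a group.
    """
    chosen = []
    counts = {}
    for item_id, group, _score in sorted(items, key=lambda t: t[2], reverse=True):
        if len(chosen) == k:
            break
        if counts.get(group, 0) < max_per_group:
            chosen.append(item_id)
            counts[group] = counts.get(group, 0) + 1
    return chosen
```

A greedy pass like this can return fewer than K items when the cap is too tight for the candidate pool, which is one reason constrained combinatorial action spaces are harder to explore than unconstrained ones.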

Beam alignment for millimeter wave vehicular communications

Author: Va Vutha
Publication date: 04/04/2019

Millimeter wave (mmWave) has the potential to provide vehicles with high data rate communications that will enable a whole new range of applications. Its use, however, is not straightforward due to its challenging propagation characteristics. One approach to overcoming the propagation challenge is the use of directional beams, but this requires proper alignment and presents a challenging engineering problem, especially under high vehicular mobility. In this dissertation, fast and efficient beam alignment solutions suitable for vehicular applications are developed. To better quantify the problem, the impact of directional beams on the temporal variation of the channels is first investigated theoretically. The proposed model includes both the Doppler effect and the pointing error due to mobility. The channel coherence time is derived, and a new concept called the beam coherence time is proposed for capturing the overhead of mmWave beam alignment. Next, an efficient learning-based beam alignment framework is proposed. The core of this framework is a set of beam pair selection methods that use side information (position in this case) and past beam measurements to identify promising beam directions and eliminate unnecessary beam training. Three offline learning methods for beam pair selection are proposed: two statistics-based methods and one machine learning-based method. The two statistical learning methods consist of a heuristic and an optimal selection that minimizes the misalignment probability. The third uses a learning-to-rank approach from the recommender system literature. The proposed approach shows an order of magnitude lower overhead than the existing standard (IEEE 802.11ad), enabling it to support large arrays at high speed. Finally, an online version of the optimal statistical learning method is developed. The solution is based on the upper confidence bound algorithm with a newly introduced risk-aware feature that helps avoid severe misalignment during learning.
Along with the online beam pair selection, an online beam pair refinement is also proposed for learning to adapt the codebook to the environment to further maximize the beamforming gain. The combined solution shows fast learning behavior and can quickly achieve positive gain over the exhaustive search on the original (and unrefined) codebook. The results show that side information can help reduce mmWave link configuration overhead.

Electrical and Computer Engineering

Texas ScholarWorks
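
The beam alignment abstract above bases its online beam pair selection on the upper confidence bound algorithm. The sketch below shows plain UCB1 with each candidate beam pair treated as an arm. It is only an illustration under assumptions: rewards are taken to be normalized to [0, 1] (for example a scaled received power), the class and method names are hypothetical, and the dissertation's risk-aware feature and use of position information are not reproduced because the abstract does not specify them.

```python
import math


class UCB1:
    """Plain UCB1 over a fixed set of arms (here: candidate beam pairs)."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms   # number of times each arm was tried
        self.means = [0.0] * n_arms  # running mean reward per arm

    def select(self):
        """Return the arm to try next."""
        # Try every arm once before using confidence bounds.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        t = sum(self.counts)

        def ucb(i):
            bonus = math.sqrt(2.0 * math.log(t) / self.counts[i])
            return self.means[i] + bonus

        return max(range(len(self.counts)), key=ucb)

    def update(self, arm, reward):
        """Record an observed reward in [0, 1] for the chosen arm."""
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
```

In use, each round would call `select()`, measure the chosen beam pair, and pass the normalized measurement to `update()`. The exploration bonus shrinks as a pair is measured more often, so measurements concentrate on the pairs that have performed best.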