93 research outputs found

    Unimodal Thompson Sampling for Graph-Structured Arms

    We study, to the best of our knowledge, the first Bayesian algorithm for unimodal Multi-Armed Bandit (MAB) problems with graph structure. In this setting, each arm corresponds to a node of a graph and each edge provides a relationship, unknown to the learner, between two nodes in terms of expected reward. Furthermore, for any node of the graph there is a path leading to the unique node providing the maximum expected reward, along which the expected reward is monotonically increasing. Previous results on this setting describe the behavior of frequentist MAB algorithms. In our paper, we design a Thompson Sampling-based algorithm whose asymptotic pseudo-regret matches the lower bound for the considered setting. We show that, as happens in a large number of scenarios, Bayesian MAB algorithms dramatically outperform frequentist ones. In particular, we provide a thorough experimental evaluation of the performance of our algorithm and of state-of-the-art algorithms as the properties of the graph vary.
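
    The unimodality condition in this abstract can be illustrated with a small check. This is only a sketch, not the paper's algorithm: the function name `is_unimodal`, the adjacency-dictionary encoding of the graph, and the reading of "monotonically increasing" as strictly increasing are all assumptions made here for illustration.

```python
from typing import Dict, Hashable, List


def is_unimodal(graph: Dict[Hashable, List[Hashable]],
                mu: Dict[Hashable, float]) -> bool:
    """Check the unimodal structure on a finite undirected graph.

    graph: hypothetical adjacency lists, node -> list of neighbors.
    mu:    expected reward of each node (arm).
    """
    best = max(mu.values())
    optimal = [v for v in mu if mu[v] == best]
    if len(optimal) != 1:
        return False  # the optimal arm must be unique

    # On a finite graph, if every non-optimal node has a strictly better
    # neighbor, repeatedly moving to a better neighbor traces a strictly
    # increasing path that can only stop at the unique optimal node.
    return all(
        any(mu[u] > mu[v] for u in graph[v])
        for v in mu
        if v != optimal[0]
    )


# Line graph 0 - 1 - 2 - 3 with a single peak at node 2.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
peaked = {0: 0.1, 1: 0.5, 2: 0.9, 3: 0.3}
# Node 0 is a local maximum here, so no increasing path leaves it.
two_peaks = {0: 0.5, 1: 0.1, 2: 0.9, 3: 0.3}
```

    In the bandit problem itself the learner does not know `mu`; the check only makes the structural assumption concrete.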

    Stepsize Learning for Policy Gradient Methods in Contextual Markov Decision Processes

    Policy-based algorithms are among the most widely adopted techniques in model-free RL, thanks to their strong theoretical grounding and good properties in continuous action spaces. Unfortunately, these methods require precise and problem-specific hyperparameter tuning to achieve good performance, and tend to struggle when asked to accomplish a series of heterogeneous tasks. In particular, the selection of the step size has a crucial impact on their ability to learn a high-performing policy, affecting the speed and the stability of the training process, and is often the main culprit for poor results. In this paper, we tackle these issues with a Meta Reinforcement Learning approach, by introducing a new formulation, known as meta-MDP, that can be used to solve any hyperparameter selection problem in RL with contextual processes. After providing a theoretical Lipschitz bound on the difference in performance across different tasks, we adopt the proposed framework to train a batch RL algorithm to dynamically recommend the most adequate step size for different policies and tasks. In conclusion, we present an experimental campaign to show the advantages of selecting an adaptive learning rate in heterogeneous environments.

    Stochastic Rising Bandits

    This paper is in the field of stochastic Multi-Armed Bandits (MABs), i.e., those sequential selection techniques able to learn online using only the feedback given by the chosen option (a.k.a. arm). We study a particular case of the rested and restless bandits in which the arms' expected payoff is monotonically non-decreasing. This characteristic allows designing specifically crafted algorithms that exploit the regularity of the payoffs to provide tight regret bounds. We design an algorithm for the rested case (R-ed-UCB) and one for the restless case (R-less-UCB), providing a regret bound depending on the properties of the instance and, under certain circumstances, of $\widetilde{\mathcal{O}}(T^{\frac{2}{3}})$. We empirically compare our algorithms with state-of-the-art methods for non-stationary MABs over several synthetically generated tasks and an online model selection problem for a real-world dataset. Finally, using synthetic and real-world data, we illustrate the effectiveness of the proposed approaches compared with state-of-the-art algorithms for non-stationary bandits.

    Confocal laser imaging in neurosurgery: A comprehensive review of sodium fluorescein-based CONVIVO preclinical and clinical applications.

    Given the established direct correlation between extent of resection and postoperative survival in brain tumors, obtaining complete resections is of primary importance. Despite the various technological advancements that have been introduced in current clinical practice, histopathological study remains the gold standard for definitive diagnosis. Frozen section analysis is still the fastest and most widely used intraoperative histopathological method that allows for an intraoperative differential diagnosis. Nevertheless, this technique has some intrinsic limitations that restrict its overall potential for obtaining a real-time diagnosis during surgery. In this context, confocal laser technology has been suggested as a promising method to obtain near real-time intraoperative histological images in neurosurgery, thanks to the results of various studies performed in non-neurosurgical fields. Although the technology is still far from being routinely implemented in current neurosurgical practice, the pertinent literature is growing quickly, and various reports have recently demonstrated its utility in both preclinical and clinical settings in identifying brain tumors, microvasculature, and tumor margins, when coupled with the intravenous administration of sodium fluorescein. Specifically in neurosurgery, among the different available devices, the ZEISS CONVIVO system probably boasts the most recent and largest number of experimental studies assessing its usefulness, which has been confirmed for identifying brain tumors, offering a diagnosis and distinguishing between healthy and pathologic tissue, and studying brain vessels. The main objective of this systematic review is to present a state-of-the-art summary of sodium fluorescein-based preclinical and clinical applications of the ZEISS CONVIVO in neurosurgery.

    COVIDIAGNOSTIX: health technology assessment of serological tests for SARS-CoV-2 infection.

    Objective: In vitro diagnostic tests for SARS-CoV-2, also known as serological tests, have rapidly spread. However, to date, mostly single-center assessments of technical and diagnostic performance have been carried out, without an intralaboratory validation process or a systematic health technology assessment (HTA) approach. Therefore, a rapid HTA for evaluating antibody tests for SARS-CoV-2 was applied. Methods: The use of rapid HTA is an opportunity to test innovative technology. Unlike traditional HTA (which evaluates the benefits of new technologies after they have been tested in clinical trials or applied in practice for some time), the rapid HTA is performed during the early stages of developing new technology. A multidisciplinary team conducted the rapid HTA following the HTA Core Model® (version 3.0) developed by the European Network for Health Technology Assessment. Results: The three methodological and analytical steps used in the HTA applied to the evaluation of antibody tests for SARS-CoV-2 are reported: the selection of the tests to be evaluated; the research and collection of information to support the adoption and appropriateness of the technology; and the preparation of the final reports and their dissemination. Finally, the rapid HTA of serological tests for SARS-CoV-2 is summarized in a report that allows its dissemination and communication. Conclusions: The rapid HTA evaluation method, in addition to highlighting the characteristics that differentiate the tests from each other, guarantees a timely and appropriate evaluation, becoming a tool to create a direct link between science and health management.

    Best Arm Identification for Stochastic Rising Bandits

    Stochastic Rising Bandits is a setting in which the values of the expected rewards of the available options increase every time they are selected. This framework models a wide range of scenarios in which the available options are learning entities whose performance improves over time. In this paper, we focus on the Best Arm Identification (BAI) problem for stochastic rested rising bandits. In this scenario, we are asked, given a fixed budget of rounds, to provide a recommendation about the best option at the end of the selection process. We propose two algorithms to tackle this setting, namely R-UCBE, which resorts to a UCB-like approach, and R-SR, which employs a successive reject procedure. We show that they provide guarantees on the probability of properly identifying the optimal option at the end of the learning process. Finally, we numerically validate the proposed algorithms in synthetic and realistic environments and compare them with the currently available BAI strategies.
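
    The rested rising setting described in this abstract can be sketched as a toy environment. Everything specific below is an assumption for illustration only: the class name `RestedRisingBandit`, the saturating exponential curve used for the expected reward, and the Gaussian noise. The abstract only states that an option's expected reward increases each time that option is selected.

```python
import math
import random


class RestedRisingBandit:
    """Toy rested rising bandit: an arm's mean grows only when it is pulled."""

    def __init__(self, ceilings, rates, noise_std=0.1, seed=0):
        self.ceilings = list(ceilings)  # hypothetical asymptotic mean per arm
        self.rates = list(rates)        # hypothetical growth rate per arm
        self.noise_std = noise_std
        self.pulls = [0] * len(self.ceilings)
        self.rng = random.Random(seed)

    def expected_reward(self, arm):
        # Assumed curve: c * (1 - exp(-r * (n + 1))), increasing in the
        # number of pulls n of this arm and independent of other arms.
        n = self.pulls[arm]
        return self.ceilings[arm] * (1.0 - math.exp(-self.rates[arm] * (n + 1)))

    def pull(self, arm):
        # Noisy observation of the current mean, then advance this arm only.
        reward = self.expected_reward(arm) + self.rng.gauss(0.0, self.noise_std)
        self.pulls[arm] += 1
        return reward


env = RestedRisingBandit(ceilings=[1.0, 0.8], rates=[0.1, 0.5])
mean_before = env.expected_reward(0)
other_before = env.expected_reward(1)
env.pull(0)
```

    After the pull, `env.expected_reward(0)` is larger than `mean_before` while arm 1 is unchanged, which is what distinguishes the rested case from the restless one, where means evolve with time regardless of the selection.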

    Endoscopic Endonasal Odontoidectomy Preserving Atlantoaxial Stability: a Pediatric Case

    Objectives: We illustrate endoscopic endonasal odontoidectomy for the Chiari-I malformation respecting craniovertebral junction (CVJ) stability. Design: Case report of a 12-year-old girl affected by the Chiari-I malformation. Magnetic resonance imaging (MRI) showed tonsillar herniation, basilar invagination, and dental retroversion, causing angulation and compression of the bulbomedullary junction. The patient underwent endoscopic third ventriculostomy (ETV) with reduction of ventricular size and resolution of gait disturbances, but she complained of Valsalva-induced headaches, hiccups, and dysesthesias in the lower limbs. Endoscopic endonasal odontoidectomy was chosen to decompress the cervicomedullary junction. Setting: The research was conducted at University Hospital "Ospedale di Circolo," Department of Neurosurgery at Varese in Italy. Participants: Patients were from the neurosurgical and ENT (ear, nose, and throat) skull base team. Main Outcome Measures: A bilateral paraseptal approach was performed, using a four-hand technique. After resection of the posterior edge of the nasal septum, the choana is entered and a rhinopharynx muscle-mucosal flap is dissected subperiosteally and transposed into the oral cavity. The CVJ is exposed and, using neuronavigation and neuromonitoring, odontoidectomy is carried out until the dura is reached, preserving the anterior arch of C1. Reconstruction is obtained by suturing the previously harvested flap. Results: The postoperative course was unremarkable and the patient experienced improvement of symptoms. Postoperative MRI documented the appearance of a tight cerebrospinal fluid (CSF) film anterior to the bulbomedullary junction and in the retrotonsillar spaces, opening of the bulbomedullary angle, and slight tonsillar reduction. No CVJ instability occurred, and there was no need for posterior fixation. Conclusion: Endoscopic endonasal odontoidectomy is a feasible approach for CVJ malformation. In this case, bulbar decompression was achieved preserving CVJ stability and avoiding posterior fixation. The link to the video can be found at: https://youtu.be/VIobocHfCuc

    Autoregressive Bandits

    Autoregressive processes naturally arise in a large variety of real-world scenarios, including, e.g., stock markets, sales forecasting, weather prediction, advertising, and pricing. When addressing a sequential decision-making problem in such a context, the temporal dependence between consecutive observations should be properly accounted for in order to converge to the optimal decision policy. In this work, we propose a novel online learning setting, named Autoregressive Bandits (ARBs), in which the observed reward follows an autoregressive process of order $k$, whose parameters depend on the action the agent chooses, within a finite set of $n$ actions. Then, we devise an optimistic regret minimization algorithm, AutoRegressive Upper Confidence Bounds (AR-UCB), that suffers regret of order $\widetilde{\mathcal{O}} \left( \frac{(k+1)^{3/2}\sqrt{nT}}{(1-\Gamma)^2} \right)$, where $T$ is the optimization horizon and $\Gamma < 1$ is an index of the stability of the system. Finally, we present a numerical validation in several synthetic settings and one real-world setting, in comparison with general and specific purpose bandit baselines, showing the advantages of the proposed approach.
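
    The reward model in this abstract can be made concrete with a short simulation. This is a sketch under stated assumptions, not the AR-UCB algorithm: the function name `simulate_arb`, the parameter layout (an intercept followed by the $k$ autoregressive coefficients for each action), the zero initial history, and the Gaussian noise are all hypothetical choices made here.

```python
import random
from collections import deque


def simulate_arb(params, actions, k, noise_std=0.0, seed=0):
    """Simulate rewards from an action-dependent AR(k) process.

    params:  hypothetical mapping action -> [g0, g1, ..., gk], where g0 is
             an intercept and g1..gk weight the k most recent rewards.
    actions: sequence of actions chosen by the agent, one per round.
    """
    rng = random.Random(seed)
    # Most recent reward first; assumed to start from an all-zero history.
    history = deque([0.0] * k, maxlen=k)
    rewards = []
    for a in actions:
        gamma = params[a]
        x = gamma[0] + sum(g * h for g, h in zip(gamma[1:], history))
        x += rng.gauss(0.0, noise_std)
        history.appendleft(x)  # the oldest reward falls off the right end
        rewards.append(x)
    return rewards


# Noise-free AR(1) example with a single action: x_t = 1 + 0.5 * x_{t-1}.
trace = simulate_arb({0: [1.0, 0.5]}, actions=[0, 0, 0], k=1)
```

    Because each reward feeds into later ones, an action's value depends on the whole future trajectory it induces, which is why the abstract stresses accounting for temporal dependence.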