279 research outputs found

    Dynamical Linear Bandits

    In many real-world sequential decision-making problems, an action does not immediately reflect on the feedback and spreads its effects over a long time frame. For instance, in online advertising, investing in a platform produces an instantaneous increase of awareness, but the actual reward, i.e., a conversion, might occur far in the future. Furthermore, whether a conversion takes place depends on: how fast the awareness grows, its vanishing effects, and the synergy or interference with other advertising platforms. Previous work has investigated the Multi-Armed Bandit framework with the possibility of delayed and aggregated feedback, without a particular structure on how an action propagates in the future, disregarding possible dynamical effects. In this paper, we introduce a novel setting, the Dynamical Linear Bandits (DLB), an extension of the linear bandits characterized by a hidden state. When an action is performed, the learner observes a noisy reward whose mean is a linear function of the hidden state and of the action. Then, the hidden state evolves according to linear dynamics, affected by the performed action too. We start by introducing the setting, discussing the notion of optimal policy, and deriving an expected regret lower bound. Then, we provide an optimistic regret minimization algorithm, Dynamical Linear Upper Confidence Bound (DynLin-UCB), that suffers an expected regret of order $\widetilde{\mathcal{O}}\Big(\frac{d\sqrt{T}}{(1-\overline{\rho})^{3/2}}\Big)$, where $\overline{\rho}$ is a measure of the stability of the system, and $d$ is the dimension of the action vector. Finally, we conduct a numerical validation on a synthetic environment and on real-world data to show the effectiveness of DynLin-UCB in comparison with several baselines.
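
    The interaction described in the abstract above (a reward that is linear in a hidden state and in the action, followed by a linear state update) can be sketched as a small simulator. This is only an illustrative sketch, not the authors' code: the names `simulate_dlb`, `A`, `B`, `omega`, `theta`, the zero initial state, and the Gaussian reward noise are all assumptions introduced here, and DynLin-UCB itself is not implemented.

```python
import random


def simulate_dlb(A, B, omega, theta, actions, noise_std=0.0, seed=0):
    """Roll out a hidden linear state and return the noisy rewards.

    All argument names are hypothetical: A (n x n) and B (n x d) drive the
    state update, omega (n) and theta (d) define the reward mean.
    """
    rng = random.Random(seed)
    n = len(A)
    x = [0.0] * n  # hidden state (assumed to start at zero); never observed
    rewards = []
    for u in actions:
        # Reward mean is linear in the hidden state and in the action.
        mean = sum(w * xi for w, xi in zip(omega, x))
        mean += sum(t * ui for t, ui in zip(theta, u))
        rewards.append(mean + rng.gauss(0.0, noise_std))
        # Linear dynamics: the next state also depends on the action played.
        x = [
            sum(A[i][j] * x[j] for j in range(n))
            + sum(B[i][k] * u[k] for k in range(len(u)))
            for i in range(n)
        ]
    return rewards


# One-dimensional example: the action has no instantaneous effect
# (theta = 0) and reaches the reward only through the hidden state.
rewards = simulate_dlb(
    A=[[0.5]], B=[[1.0]], omega=[1.0], theta=[0.0],
    actions=[[1.0], [1.0], [1.0]],
)
```

    With the noise switched off, the rewards in this example grow over the rounds even though the same action is repeated, which is the delayed effect the setting is meant to capture.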

    Thymectomy in ocular myasthenia gravis

    Best Arm Identification for Stochastic Rising Bandits

    Stochastic Rising Bandits is a setting in which the values of the expected rewards of the available options increase every time they are selected. This framework models a wide range of scenarios in which the available options are learning entities whose performance improves over time. In this paper, we focus on the Best Arm Identification (BAI) problem for the stochastic rested rising bandits. In this scenario, we are asked, given a fixed budget of rounds, to provide a recommendation about the best option at the end of the selection process. We propose two algorithms to tackle the above-mentioned setting, namely R-UCBE, which resorts to a UCB-like approach, and R-SR, which employs a successive reject procedure. We show that they provide guarantees on the probability of properly identifying the optimal option at the end of the learning process. Finally, we numerically validate the proposed algorithms in synthetic and realistic environments and compare them with the currently available BAI strategies.

    Estimating the Isotopic Altitude Gradient for Hydrogeological Studies in Mountainous Areas: Are the Low-Yield Springs Suitable? Insights from the Northern Apennines of Italy

    Several prior studies investigated the use of stable isotopes of water in hydrogeological applications, most on a local scale and often involving the isotopic gradient (evaluated by exploiting the so-called altitude effect), calculated on the basis of rainwater isotopes. In a few cases, this gradient has been obtained using the stable isotopic contents of low-yield springs in a limited time series. Although this method has been recognized by the hydrogeological community, marked differences have been observed with respect to the mean stable isotope content of groundwater and rainwater. The present investigation compares the stable isotopic signatures of 23 low-yield springs discharging along two transects from the Tyrrhenian Sea to the Po Plain of Italy, evaluates the different isotopic gradients and assesses their distribution in relation to some climatic and topographic conditions. Stable isotopes of water show that groundwater in the study area is recharged by precipitation and that the precipitation regime in the eastern portion of the study area is strongly controlled by a shadow effect caused by the Alpine chain on the air masses from central Europe. Stable isotopes (in particular the δ18O and deuterium excess (d-excess) contents together with the obtained isotopic gradients) allow us to identify in the study area an oppositely oriented orographic effect and a different provenance of the air masses. When the windward slope is located on the Tyrrhenian side, the precipitation shows a predominant oceanic origin; when the windward slope moves to the Adriatic side, the precipitation is characterized by a continental origin. The main results of this study confirm the usefulness of low-yield springs and the need for a highly detailed survey-scale hydrological investigation in the mountainous context.

    Autoregressive Bandits

    Autoregressive processes naturally arise in a large variety of real-world scenarios, including, e.g., stock markets, sales forecasting, weather prediction, advertising, and pricing. When addressing a sequential decision-making problem in such a context, the temporal dependence between consecutive observations should be properly accounted for in order to converge to the optimal decision policy. In this work, we propose a novel online learning setting, named Autoregressive Bandits (ARBs), in which the observed reward follows an autoregressive process of order $k$, whose parameters depend on the action the agent chooses, within a finite set of $n$ actions. Then, we devise an optimistic regret minimization algorithm, AutoRegressive Upper Confidence Bounds (AR-UCB), that suffers regret of order $\widetilde{\mathcal{O}}\left(\frac{(k+1)^{3/2}\sqrt{nT}}{(1-\Gamma)^2}\right)$, where $T$ is the optimization horizon and $\Gamma < 1$ is an index of the stability of the system. Finally, we present a numerical validation in several synthetic settings and one real-world setting, in comparison with general- and specific-purpose bandit baselines, showing the advantages of the proposed approach.
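
    The reward process described in the abstract above (an autoregressive process of order $k$ whose parameters are selected by the chosen action) can be sketched as follows. This is a hedged illustration under stated assumptions, not the paper's implementation: the function name `simulate_arb`, the layout of `params` (an offset followed by $k$ lag coefficients per action), the zero initial history, and the Gaussian noise are all introduced here for illustration, and AR-UCB is not implemented.

```python
import random


def simulate_arb(params, actions, k, noise_std=0.0, seed=0):
    """Generate rewards from an action-dependent AR(k) process.

    params[a] is assumed to be [offset, coef on x_{t-1}, ..., coef on x_{t-k}].
    """
    rng = random.Random(seed)
    history = [0.0] * k  # last k rewards, most recent last (assumed zero start)
    rewards = []
    for a in actions:
        gamma = params[a]
        lagged = history[::-1]  # x_{t-1}, x_{t-2}, ..., x_{t-k}
        x = gamma[0] + sum(g * h for g, h in zip(gamma[1:], lagged))
        x += rng.gauss(0.0, noise_std)
        rewards.append(x)
        # Keep only the k most recent rewards for the next round.
        history = (history + [x])[-k:] if k > 0 else []
    return rewards


# Two hypothetical actions sharing the same lag coefficient but with
# different offsets; order k = 1 and no noise.
rewards = simulate_arb({0: [1.0, 0.5], 1: [0.0, 0.5]}, actions=[0, 0, 1], k=1)
```

    In this noiseless example the reward obtained after switching to the second action still carries part of the earlier rewards, which illustrates why the temporal dependence matters when comparing actions.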

    Cross-calibration of eight-polar bioelectrical impedance analysis versus dual-energy X-ray absorptiometry for the assessment of total and appendicular body composition in healthy subjects aged 21-82 years.

    Aim: To calibrate eight-polar bioelectrical impedance analysis (BIA) against dual-energy X-ray absorptiometry (DXA) for the assessment of total and appendicular body composition in healthy adults. Research design: A cross-sectional study was carried out. Subjects: Sixty-eight females and 42 males aged 21-82 years participated in the study. Methods: Whole-body fat-free mass (FFM) and appendicular lean tissue mass (LTM) were measured by DXA; resistance (R) of arms, trunk and legs was measured by eight-polar BIA at frequencies of 5, 50, 250 and 500 kHz; whole-body resistance was calculated as the sum of the R of arms, trunk and legs. Results: The resistance index (RI), i.e. the height²/resistance ratio, was the best predictor of FFM and appendicular LTM. As compared with weight (Wt), RI at 500 kHz explained 35% more variance of FFM (adjusted R² = 0.92 vs 0.57), 45% more variance of LTMarm (adjusted R² = 0.93 vs 0.48) and 36% more variance of LTMleg (adjusted R² = 0.86 vs 0.50) (p < 0.001 for all). The contribution of age to the unexplained variance of FFM and appendicular LTM was nil or negligible and the RI × sex interactions were either not significant or not important on practical grounds. The percent root mean square error of the estimate was 6% for FFM and 8% for LTMarm and LTMleg. Conclusion: Eight-polar BIA offers accurate estimates of total and appendicular body composition. The attractive hypothesis that eight-polar BIA is influenced minimally by age and sex should be tested on larger samples including younger individuals.

    Nonalcoholic fatty liver disease and aging: epidemiology to management

    Nonalcoholic fatty liver disease (NAFLD) is common in the elderly, in whom it carries a more substantial burden of hepatic (nonalcoholic steatohepatitis, cirrhosis and hepatocellular carcinoma) and extra-hepatic manifestations and complications (cardiovascular disease, extrahepatic neoplasms) than in younger age groups. Therefore, proper identification and management of this condition is a major task for clinical geriatricians and geriatric hepatologists. In this paper, the epidemiology and pathophysiology of this condition are reviewed, and a full discussion of the link between NAFLD and the aspects that are peculiar to elderly individuals is provided; these aspects include frailty, multimorbidity, polypharmacy and dementia. The proper treatment strategy will have to consider the peculiarities of geriatric patients, so a multidisciplinary approach is mandatory. Non-pharmacological treatment (diet and physical exercise) has to be tailored individually considering the physical limitations of most elderly people and the need for an adequate caloric supply. Similarly, the choice of drug treatment must carefully balance the benefits and risks in terms of adverse events and pharmacological interactions in the common context of both multiple health conditions and polypharmacy. In conclusion, further epidemiological and pathophysiological insight is warranted. More accurate understanding of the molecular mechanisms of geriatric NAFLD will help in identifying the most appropriate diagnostic and therapeutic approach for individual elderly patients.