143 research outputs found

    Near-Optimal Time and Sample Complexities for Solving Discounted Markov Decision Process with a Generative Model

    In this paper we consider the problem of computing an $\epsilon$-optimal policy of a discounted Markov Decision Process (DMDP) provided we can only access its transition function through a generative sampling model that, given any state-action pair, samples from the transition function in $O(1)$ time. Given such a DMDP with states $S$, actions $A$, discount factor $\gamma \in (0,1)$, and rewards in the range $[0, 1]$, we provide an algorithm which computes an $\epsilon$-optimal policy with probability $1 - \delta$ where both the time spent and the number of samples taken are upper bounded by $$O\left[\frac{|S||A|}{(1-\gamma)^3 \epsilon^2} \log \left(\frac{|S||A|}{(1-\gamma)\delta \epsilon} \right) \log\left(\frac{1}{(1-\gamma)\epsilon}\right)\right].$$ For fixed values of $\epsilon$, this improves upon the previous best known bounds by a factor of $(1 - \gamma)^{-1}$ and matches the sample complexity lower bounds proved in Azar et al. (2013) up to logarithmic factors. We also extend our method to computing $\epsilon$-optimal policies for finite-horizon MDPs with a generative model and provide a nearly matching sample complexity lower bound. Comment: 31 pages. Accepted to NeurIPS, 201
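    To make the generative-model setting concrete, here is a minimal sketch of plain sample-based value iteration, where each expectation over next states is replaced by an average over calls to a sampler. This is a simple baseline for illustration, not the algorithm of the paper, and it does not achieve the stated bound. The two-state toy MDP and all names (`sampled_value_iteration`, `toy_sample`, `toy_reward`) are assumptions introduced here.

```python
import random


def sampled_value_iteration(n_states, n_actions, reward, sample_next_state,
                            gamma, n_iters, n_samples, rng):
    """Plain sample-based value iteration (illustrative baseline only).

    `sample_next_state(s, a, rng)` plays the role of the generative model:
    it returns one next state drawn from the transition function.
    """
    v = [0.0] * n_states
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(n_iters):
        for s in range(n_states):
            for a in range(n_actions):
                # Monte Carlo estimate of E[v(s')] from n_samples sampler calls.
                total = sum(v[sample_next_state(s, a, rng)]
                            for _ in range(n_samples))
                q[s][a] = reward(s, a) + gamma * total / n_samples
        # Synchronous update: the new values are used only in the next sweep.
        v = [max(q[s]) for s in range(n_states)]
    policy = [max(range(n_actions), key=lambda a: q[s][a])
              for s in range(n_states)]
    return v, policy


def toy_sample(s, a, rng):
    """Hypothetical toy dynamics: action 0 stays, action 1 switches w.p. 0.9."""
    if a == 0:
        return s
    return 1 - s if rng.random() < 0.9 else s


def toy_reward(s, a):
    """Hypothetical reward in [0, 1]: being in state 1 pays 1, state 0 pays 0."""
    return 1.0 if s == 1 else 0.0


if __name__ == "__main__":
    values, policy = sampled_value_iteration(
        n_states=2, n_actions=2, reward=toy_reward,
        sample_next_state=toy_sample, gamma=0.9,
        n_iters=200, n_samples=50, rng=random.Random(0))
    print(values, policy)
```

    In this toy problem the greedy policy moves from state 0 to state 1 and then stays there. The total number of sampler calls is `n_iters * n_states * n_actions * n_samples`, which is the quantity the abstract's bound controls for the authors' method.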

    Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective

    Off-policy Learning to Rank (LTR) aims to optimize a ranker from data collected by a deployed logging policy. However, existing off-policy learning to rank methods often make strong assumptions about how users generate the click data, i.e., the click model, and hence need to tailor their methods specifically to each click model. In this paper, we unify the ranking process under general stochastic click models as a Markov Decision Process (MDP), so that the optimal ranking can be learned directly with offline reinforcement learning (RL). Building upon this, we leverage offline RL techniques for off-policy LTR and propose the Click Model-Agnostic Unified Off-policy Learning to Rank (CUOLR) method, which can be easily applied to a wide range of click models. Through a dedicated formulation of the MDP, we show that offline RL algorithms can adapt to various click models without complex debiasing techniques or prior knowledge of the model. Results on various large-scale datasets demonstrate that CUOLR consistently outperforms the state-of-the-art off-policy learning to rank algorithms while maintaining consistency and robustness under different click models.

    Clonal integration promotes the growth of Phragmites australis populations in saline wetlands of the Yellow River Delta

    Estuarine wetlands are highly heterogeneous due to strong interactions between freshwater input and seawater intrusion. However, little is known about how clonal plant populations adapt to heterogeneous salinity in soil environments. In the present study, the effects of clonal integration on Phragmites australis populations under salinity heterogeneity were studied using field experiments with 10 treatments in the Yellow River Delta. Clonal integration significantly increased plant height, aboveground biomass, underground biomass, root–shoot ratio, intercellular CO2 concentration, net photosynthetic rate, stomatal conductance, transpiration rate, and stem Na+ content under homogeneous treatment. Under the heterogeneous salt treatment, clonal integration significantly affected total aboveground and underground biomass, photosynthetic traits, and stem Na+ content under different salt gradients. The increase in salt concentration inhibited the physiological activity and growth of P. australis to varying degrees. Compared with the heterogeneous saline environment, clonal integration was more beneficial to P. australis populations in the homogeneous saline habitat. The results of the present study suggest that P. australis prefers homogeneous saline habitats; however, plants can adapt to heterogeneous salinity conditions via clonal integration

    KMF: Knowledge-Aware Multi-Faceted Representation Learning for Zero-Shot Node Classification

    Recently, Zero-Shot Node Classification (ZNC) has been an emerging and crucial task in graph data analysis. This task aims to predict nodes from unseen classes which are unobserved in the training process. Existing work mainly utilizes Graph Neural Networks (GNNs) to associate features' prototypes and labels' semantics, thus enabling knowledge transfer from seen to unseen classes. However, the multi-faceted semantic orientation in the feature-semantic alignment has been neglected by previous work, i.e., the content of a node usually covers diverse topics that are relevant to the semantics of multiple labels. It is necessary to separate and judge the semantic factors that strongly affect cognitive ability in order to improve the generality of models. To this end, we propose a Knowledge-Aware Multi-Faceted framework (KMF) that enhances the richness of label semantics via the extracted KG (Knowledge Graph)-based topics. The content of each node is then reconstructed into a topic-level representation that offers multi-faceted and fine-grained semantic relevancy to different labels. Due to the particularity of the graph's instance (i.e., node) representation, a novel geometric constraint is developed to alleviate the problem of prototype drift caused by node information aggregation. Finally, we conduct extensive experiments on several public graph datasets and design an application of zero-shot cross-domain recommendation. The quantitative results demonstrate both the effectiveness and generalization of KMF in comparison with state-of-the-art baselines.

    GlanceSeg: Real-time microaneurysm lesion segmentation with gaze-map-guided foundation model for early detection of diabetic retinopathy

    Early-stage diabetic retinopathy (DR) presents challenges in clinical diagnosis due to inconspicuous and minute microangioma lesions, resulting in limited research in this area. Additionally, the potential of emerging foundation models, such as the segment anything model (SAM), in medical scenarios remains rarely explored. In this work, we propose a human-in-the-loop, label-free early DR diagnosis framework called GlanceSeg, based on SAM. GlanceSeg enables real-time segmentation of microangioma lesions as ophthalmologists review fundus images. Our human-in-the-loop framework integrates the ophthalmologist's gaze map, allowing for rough localization of minute lesions in fundus images. Subsequently, a saliency map is generated based on the located region of interest, which provides prompt points to assist the foundation model in efficiently segmenting microangioma lesions. Finally, a domain knowledge filter refines the segmentation of minute lesions. We conducted experiments on two newly-built public datasets, i.e., IDRiD and Retinal-Lesions, and validated the feasibility and superiority of GlanceSeg through visualized illustrations and quantitative measures. Additionally, we demonstrated that GlanceSeg improves annotation efficiency for clinicians and enhances segmentation performance through fine-tuning using annotations. This study highlights the potential of GlanceSeg-based annotations for self-model optimization, leading to enduring performance advancements through continual learning. Comment: 12 pages, 10 figures

    Investigation of hearing loss in elderly vertigo and dizziness patients in the past 10 years

    Background: Vertigo and hearing loss are both prevalent in the elderly. This study retrospectively analyzed hearing test results from elderly patients experiencing vertigo and dizziness at an ENT outpatient clinic over a 10-year period, in order to study the patterns of hearing loss in this patient population. Methods: Outpatient diagnoses, pure tone audiometry, acoustic immittance measurement (tympanogram), and auditory brainstem response (ABR) test results were retrospectively collected and screened for 9,384 patients over 50 years old. The patients' audiograms were divided into 7 subtypes according to a set of fixed criteria. In addition, K-Means clustering analysis was used to classify the audiograms. Results: The Jerger classification of tympanograms in elderly patients with vertigo and dizziness showed the majority falling under type A. The leading audiogram shapes were flat (27.81% in right ear and 26.89% in left ear), high-frequency gently sloping (25.97% in right ear and 27.34% in left ear), and high-frequency steeply sloping (21.60% in right ear and 22.53% in left ear). Meniere's disease (MD; 30.87%), benign recurrent vertigo (BRV; 19.07%), and benign paroxysmal positional vertigo (BPPV; 15.66%) were the most common etiologies in elderly vestibular diseases. We observed statistically significant differences in hearing thresholds among these vestibular diseases (P < 0.001). K-Means clustering analysis suggested that the optimal number of clusters was three, with sample sizes for the three clusters being 2,747, 2,413, and 4,139, respectively. The ANOVA statistical results of each characteristic value showed P < 0.001. Conclusion: Elderly patients often have mild to moderate hearing loss as a concomitant symptom with vertigo. Female patients have better hearing thresholds than males. The dominant audiometric shapes in this patient population were flat, high-frequency gently sloping, and high-frequency steeply sloping according to a set of fixed criteria. This study highlights the need for tailored strategies in managing hearing loss in elderly patients with vertigo and dizziness.
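    As an illustration of the K-Means step mentioned in the abstract, the sketch below runs Lloyd's algorithm on audiograms represented as lists of hearing thresholds (dB HL) at six frequencies. The threshold values, the choice of six frequencies, the initial centroids, and the function name `kmeans` are all assumptions made up for this example; the study's actual features and preprocessing are not described in the abstract.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, init_centroids, n_iters):
    """Lloyd's algorithm with caller-supplied initial centroids."""
    centroids = [list(c) for c in init_centroids]
    k = len(centroids)
    labels = [0] * len(points)
    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Hypothetical threshold profiles (dB HL), low to high frequency.
FLAT = [40, 40, 40, 40, 40, 40]
GENTLE_SLOPE = [20, 25, 30, 40, 50, 60]
STEEP_SLOPE = [15, 15, 20, 40, 70, 90]


def make_synthetic_audiograms():
    """Five shifted copies of each hypothetical profile (synthetic data)."""
    shapes = [FLAT, GENTLE_SLOPE, STEEP_SLOPE]
    return [[x + shift for x in shape] for shape in shapes for shift in range(5)]


if __name__ == "__main__":
    data = make_synthetic_audiograms()
    centers, labels = kmeans(data, [FLAT, GENTLE_SLOPE, STEEP_SLOPE], n_iters=10)
    print(labels)
```

    Supplying the initial centroids keeps the example deterministic. In practice, initialization is usually randomized and the number of clusters is chosen with a criterion such as the elbow method or silhouette score; the abstract does not say which was used.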