143 research outputs found

Near-Optimal Time and Sample Complexities for Solving Discounted Markov Decision Process with a Generative Model

Author: Sidford Aaron
Wang Mengdi
Wu Xian
Yang Lin F.
Ye Yinyu
Publication date: 04/06/2018

In this paper we consider the problem of computing an $\epsilon$-optimal policy of a discounted Markov Decision Process (DMDP) provided we can only access its transition function through a generative sampling model that, given any state-action pair, samples from the transition function in $O(1)$ time. Given such a DMDP with states $S$, actions $A$, discount factor $\gamma\in(0,1)$, and rewards in range $[0, 1]$, we provide an algorithm which computes an $\epsilon$-optimal policy with probability $1 - \delta$ where both the time spent and the number of samples taken are upper bounded by

$$O\left[\frac{|S||A|}{(1-\gamma)^3 \epsilon^2} \log \left(\frac{|S||A|}{(1-\gamma)\delta \epsilon} \right) \log\left(\frac{1}{(1-\gamma)\epsilon}\right)\right].$$

For fixed values of $\epsilon$, this improves upon the previous best known bounds by a factor of $(1 - \gamma)^{-1}$ and matches the sample complexity lower bounds proved in Azar et al. (2013) up to logarithmic factors. We also extend our method to computing $\epsilon$-optimal policies for finite-horizon MDP with a generative model and provide a nearly matching sample complexity lower bound.

Comment: 31 pages. Accepted to NeurIPS, 201
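To get a feel for how the bound in this abstract scales, the following minimal sketch evaluates the expression inside the $O[\cdot]$ with the hidden constant set to 1. That constant, the function name `sample_bound`, and the example parameter values are assumptions made for illustration only; the output is a scaling indicator, not a sample count the paper claims.

```python
import math


def sample_bound(num_states, num_actions, gamma, epsilon, delta):
    """Evaluate the expression inside the O[.] bound from the abstract.

    Assumption: the hidden constant is taken to be 1, so the result only
    shows how the bound scales; it is not an actual sample count.
    """
    sa = num_states * num_actions          # |S||A|
    horizon = 1.0 / (1.0 - gamma)          # effective horizon 1/(1-gamma)
    leading = sa * horizon ** 3 / epsilon ** 2
    log_conf = math.log(sa * horizon / (delta * epsilon))
    log_prec = math.log(horizon / epsilon)
    return leading * log_conf * log_prec


if __name__ == "__main__":
    # Hypothetical problem sizes, chosen only to show the scaling in gamma.
    for gamma in (0.9, 0.99):
        value = sample_bound(100, 10, gamma, epsilon=0.1, delta=0.05)
        print(f"gamma={gamma}: {value:.3e}")
```

Moving $\gamma$ from 0.9 to 0.99 multiplies the leading term by $10^3$, which is the $(1-\gamma)^{-3}$ dependence the abstract highlights; the two logarithmic factors add a smaller increase on top.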

arXiv.org e-Print Archive

eScholarship - University of California

Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective

Author: Balasubramanian Rishab
Su Yi
Wang Huazheng
Wang Mengdi
Wu Qingyun
Wu Yiran
Yuan Hui
Zhang Zeyu
Publication date: 12/06/2023

Off-policy Learning to Rank (LTR) aims to optimize a ranker from data collected by a deployed logging policy. However, existing off-policy learning to rank methods often make strong assumptions about how users generate the click data, i.e., the click model, and hence need to be tailored specifically to each click model. In this paper, we unify the ranking process under general stochastic click models as a Markov Decision Process (MDP), so that the optimal ranking can be learned directly with offline reinforcement learning (RL). Building upon this, we leverage offline RL techniques for off-policy LTR and propose the Click Model-Agnostic Unified Off-policy Learning to Rank (CUOLR) method, which can be easily applied to a wide range of click models. Through a dedicated formulation of the MDP, we show that offline RL algorithms can adapt to various click models without complex debiasing techniques or prior knowledge of the model. Results on various large-scale datasets demonstrate that CUOLR consistently outperforms the state-of-the-art off-policy learning to rank algorithms while maintaining consistency and robustness under different click models.

arXiv.org e-Print Archive

Clonal integration promotes the growth of Phragmites australis populations in saline wetlands of the Yellow River Delta

Author: Bo Guan
Di Zhou
Jisong Yang
Junbao Yu
Mengdi Wu
Xiaoling Liu
Xiaolong Zhang
Xuehong Wang
Publication venue: 'Frontiers Media SA'
Publication date: 01/06/2023

Estuarine wetlands are highly heterogeneous due to strong interactions between freshwater input and seawater intrusion. However, little is known about how clonal plant populations adapt to heterogeneous salinity in soil environments. In the present study, the effects of clonal integration on Phragmites australis populations under salinity heterogeneity were studied using field experiments with 10 treatments in the Yellow River Delta. Clonal integration significantly increased plant height, aboveground biomass, underground biomass, root–shoot ratio, intercellular CO2 concentration, net photosynthetic rate, stomatal conductance, transpiration rate, and stem Na+ content under homogeneous treatment. Under the heterogeneous salt treatment, clonal integration significantly affected total aboveground and underground biomass, photosynthetic traits, and stem Na+ content under different salt gradients. The increase in salt concentration inhibited the physiological activity and growth of P. australis to varying degrees. Clonal integration was more beneficial to P. australis populations in the homogeneous saline habitat than in the heterogeneous saline environment. The results of the present study suggest that P. australis prefers homogeneous saline habitats; however, plants can adapt to heterogeneous salinity conditions via clonal integration.

Directory of Open Access Journals

Institutional Repository of Yantai Institute of Coastal Zone Research, CAS

KMF: Knowledge-Aware Multi-Faceted Representation Learning for Zero-Shot Node Classification

Author: Chen Enhong
Jiang Junji
Lian Defu
Wang Hao
Wu Likang
Zhang Mengdi
Zhao Hongke
Publication date: 14/08/2023

Recently, Zero-Shot Node Classification (ZNC) has become an emerging and crucial task in graph data analysis. This task aims to predict nodes from unseen classes that are not observed during training. Existing work mainly utilizes Graph Neural Networks (GNNs) to associate features' prototypes and labels' semantics, thus enabling knowledge transfer from seen to unseen classes. However, previous work has neglected the multi-faceted semantic orientation in the feature-semantic alignment, i.e., the content of a node usually covers diverse topics that are relevant to the semantics of multiple labels. It is necessary to separate and assess the semantic factors that strongly affect cognitive ability in order to improve the generality of models. To this end, we propose a Knowledge-Aware Multi-Faceted framework (KMF) that enhances the richness of label semantics via topics extracted from a Knowledge Graph (KG). The content of each node is then reconstructed into a topic-level representation that offers multi-faceted and fine-grained semantic relevancy to different labels. Due to the particularity of the graph's instance (i.e., node) representation, a novel geometric constraint is developed to alleviate the problem of prototype drift caused by node information aggregation. Finally, we conduct extensive experiments on several public graph datasets and design an application of zero-shot cross-domain recommendation. The quantitative results demonstrate both the effectiveness and generalization of KMF in comparison with state-of-the-art baselines.

arXiv.org e-Print Archive

GlanceSeg: Real-time microaneurysm lesion segmentation with gaze-map-guided foundation model for early detection of diabetic retinopathy

Author: Gao Mengdi
Jiang Hongyang
Jiang Shuai
Liu Jiang
Liu Zirong
Tang Chen
Yuan Wu
Zhang Xiaoqing
Publication date: 14/11/2023

Early-stage diabetic retinopathy (DR) presents challenges in clinical diagnosis due to inconspicuous and minute microangioma lesions, resulting in limited research in this area. Additionally, the potential of emerging foundation models, such as the segment anything model (SAM), in medical scenarios remains rarely explored. In this work, we propose a human-in-the-loop, label-free early DR diagnosis framework called GlanceSeg, based on SAM. GlanceSeg enables real-time segmentation of microangioma lesions as ophthalmologists review fundus images. Our human-in-the-loop framework integrates the ophthalmologist's gaze map, allowing for rough localization of minute lesions in fundus images. Subsequently, a saliency map is generated based on the located region of interest, which provides prompt points to assist the foundation model in efficiently segmenting microangioma lesions. Finally, a domain knowledge filter refines the segmentation of minute lesions. We conducted experiments on two newly-built public datasets, i.e., IDRiD and Retinal-Lesions, and validated the feasibility and superiority of GlanceSeg through visualized illustrations and quantitative measures. Additionally, we demonstrated that GlanceSeg improves annotation efficiency for clinicians and enhances segmentation performance through fine-tuning using annotations. This study highlights the potential of GlanceSeg-based annotations for self-model optimization, leading to enduring performance advancements through continual learning.

Comment: 12 pages, 10 figure

arXiv.org e-Print Archive

Investigation of hearing loss in elderly vertigo and dizziness patients in the past 10 years

Author: Aiting Chen
Fei Ji
Mengdi Hong
Qian Wang
Wenbo Cheng
Xingjian Liu
Yi Du
Ziming Wu
Publication venue: Frontiers Media S.A.
Publication date: 01/09/2023

Background: Vertigo and hearing loss are both prevalent in the elderly. This study retrospectively analyzed hearing test results from elderly patients experiencing vertigo and dizziness in an ENT outpatient setting over a 10-year period, in order to study the patterns of hearing loss in this patient population. Methods: For 9,384 patients over 50 years old, outpatient diagnoses, pure tone audiometry, acoustic immittance measurement (tympanogram), and auditory brainstem response (ABR) test results were retrospectively collected and screened. The patients' audiograms were divided into 7 subtypes according to a set of fixed criteria. In addition, K-Means clustering analysis was used to classify the audiograms. Results: The Jerger classification of tympanograms in elderly patients with vertigo and dizziness showed the majority falling under type A. The leading audiogram shapes were flat (27.81% in right ear and 26.89% in left ear), high-frequency gently sloping (25.97% in right ear and 27.34% in left ear), and high-frequency steeply sloping (21.60% in right ear and 22.53% in left ear). Meniere's disease (MD; 30.87%), benign recurrent vertigo (BRV; 19.07%), and benign paroxysmal positional vertigo (BPPV; 15.66%) were the most common etiologies in elderly vestibular diseases. We observed statistically significant differences in hearing thresholds among these vestibular diseases (P < 0.001). K-Means clustering analysis suggested that the optimal number of clusters was three, with sample sizes for the three clusters being 2,747, 2,413, and 4,139, respectively. The ANOVA statistical results of each characteristic value showed P < 0.001. Conclusion: Elderly patients often have mild to moderate hearing loss as a concomitant symptom with vertigo. Female patients have better hearing thresholds than males. The dominant audiometric shapes in this patient population were flat, high-frequency gently sloping, and high-frequency steeply sloping according to a set of fixed criteria. This study highlights the need for tailored strategies in managing hearing loss in elderly patients with vertigo and dizziness.
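The abstract mentions K-Means clustering of audiograms without giving details. As a rough illustration of the general technique only, the sketch below runs plain Lloyd-style k-means on a handful of made-up threshold vectors. The function names, the random initialization, the six test frequencies, and every data value are assumptions for illustration; none of them come from the study.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd-style k-means; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: random initial centers
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(pt, centers[c]))
                  for pt in points]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return centers, labels


if __name__ == "__main__":
    # Hypothetical hearing thresholds (dB HL) at 250, 500, 1000, 2000,
    # 4000, and 8000 Hz; synthetic values, not data from the study.
    audiograms = [
        (30, 30, 30, 30, 30, 30), (35, 35, 35, 35, 35, 35),  # flat
        (20, 20, 25, 35, 45, 55), (25, 25, 30, 40, 50, 60),  # gently sloping
        (15, 15, 20, 40, 70, 90), (20, 20, 25, 45, 75, 95),  # steeply sloping
    ]
    centers, labels = kmeans(audiograms, k=3)
    print(labels)
```

Because the starting centers are random, k-means can settle in a local optimum; analyses of real data usually rerun it from several initializations and compare cluster counts with a criterion of their choice.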

Directory of Open Access Journals