107 research outputs found

    Tackling Heavy-Tailed Rewards in Reinforcement Learning with Function Approximation: Minimax Optimal and Instance-Dependent Regret Bounds

    While numerous works have focused on devising efficient algorithms for reinforcement learning (RL) with uniformly bounded rewards, it remains an open question whether sample- or time-efficient algorithms for RL with large state-action space exist when the rewards are \emph{heavy-tailed}, i.e., with only finite $(1+\epsilon)$-th moments for some $\epsilon\in(0,1]$. In this work, we address the challenge of such rewards in RL with linear function approximation. We first design an algorithm, \textsc{Heavy-OFUL}, for heavy-tailed linear bandits, achieving an \emph{instance-dependent} $T$-round regret of $\tilde{O}\big(d T^{\frac{1-\epsilon}{2(1+\epsilon)}} \sqrt{\sum_{t=1}^T \nu_t^2} + d T^{\frac{1-\epsilon}{2(1+\epsilon)}}\big)$, the \emph{first} of this kind. Here, $d$ is the feature dimension, and $\nu_t^{1+\epsilon}$ is the $(1+\epsilon)$-th central moment of the reward at the $t$-th round. We further show the above bound is minimax optimal when applied to the worst-case instances in stochastic and deterministic linear bandits. We then extend this algorithm to the RL settings with linear function approximation. Our algorithm, termed \textsc{Heavy-LSVI-UCB}, achieves the \emph{first} computationally efficient \emph{instance-dependent} $K$-episode regret of $\tilde{O}\big(d \sqrt{H \mathcal{U}^*} K^{\frac{1}{1+\epsilon}} + d \sqrt{H \mathcal{V}^* K}\big)$. Here, $H$ is the length of the episode, and $\mathcal{U}^*, \mathcal{V}^*$ are instance-dependent quantities scaling with the central moment of reward and value functions, respectively. We also provide a matching minimax lower bound $\Omega\big(d H K^{\frac{1}{1+\epsilon}} + d \sqrt{H^3 K}\big)$ to demonstrate the optimality of our algorithm in the worst case. Our result is achieved via a novel robust self-normalized concentration inequality that may be of independent interest in handling heavy-tailed noise in general online regression problems. Comment: NeurIPS 202

    An Empirical Study on the Holiday Effect of China's Time-Honored Companies

    The stock segment of China's time-honored brand enterprises holds an important position in China's securities market. The holiday effect is a market anomaly in which stock returns around festivals differ significantly from those on other trading days. Studying the holiday effect of China's time-honored brand enterprises can provide fresh ideas for revitalizing these brands and enterprises. This paper takes listed companies of China's time-honored brand enterprises as the research object and uses the event study method to examine the impact of the holiday effect on them, empirically analyzing the changes in their returns during the Spring Festival period from 2012 to 2021. The empirical results reveal that time-honored brand concept stocks have a significant post-holiday effect during the Chinese New Year period, that time-honored alcoholic beverage enterprises are more sensitive to the Chinese New Year, and that the holiday effect of time-honored pharmaceutical manufacturing enterprises is not significant. Comment: 24page
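    The event study mentioned in this abstract compares realized returns with a benchmark around an event date. The abstract does not say which benchmark model the authors use, so the following is only a minimal sketch under an assumed market-adjusted model (abnormal return = stock return minus market return). The function names are hypothetical and any inputs are illustrative, not data from the paper.

```python
from typing import Sequence


def abnormal_returns(stock: Sequence[float], market: Sequence[float]) -> list[float]:
    """Market-adjusted abnormal returns: stock return minus market return, per day.

    Assumption: a market-adjusted benchmark; the paper may use a different model.
    """
    if len(stock) != len(market):
        raise ValueError("stock and market series must have the same length")
    return [s - m for s, m in zip(stock, market)]


def cumulative_abnormal_return(ar: Sequence[float], start: int, end: int) -> float:
    """Sum abnormal returns over the event window [start, end) (0-based indices)."""
    return sum(ar[start:end])
```

    A post-holiday effect would then show up as a cumulative abnormal return over the post-holiday window that differs significantly from zero across firms.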

    Higher critical closing pressure is independently associated with enlarged basal ganglia perivascular spaces

    Objective: This study aimed to explore the association between cerebral hemodynamic parameters focused on the critical closing pressure (CCP) and enlarged perivascular spaces (EPVS). Methods: Cerebral blood velocity in the middle cerebral artery (MCAv) and non-invasive continuous blood pressure (NIBP) were measured using a transcranial Doppler (TCD) and Finometer, followed by the calculation of cerebral hemodynamic parameters including CCP, resistance area product (RAP), pulsatility index (PI), and pulse pressure (PP). EPVS were graded separately in the basal ganglia (BG) and centrum semiovale (CSO), using a visual semiquantitative ordinal scale. Patients with EPVS >10 were classified into the severe BG-EPVS group and severe CSO-EPVS group, and the remainder into the mild BG-EPVS group and the mild CSO-EPVS group. Spearman’s correlation and binary logistic regression analysis were performed to analyze the relationship between hemodynamic parameters and BG-EPVS and CSO-EPVS, respectively. Results: Overall, 107 patients were enrolled. The severe BG-EPVS group had higher CCP, mean arterial blood pressure (MABP), systolic blood pressure (SBP), and diastolic blood pressure (DBP) than the mild BG-EPVS group (p < 0.05). There was no statistical difference in hemodynamic parameters between the severe CSO-EPVS group and the mild CSO-EPVS group. Spearman’s correlation analysis showed that CCP was positively associated with BG-EPVS (rho = 0.331, p < 0.001) and CSO-EPVS (rho = 0.154, p = 0.044). The binary logistic regression analysis showed that CCP was independently associated with severe BG-EPVS (p < 0.05) and not with CSO-EPVS (p > 0.05) after adjusting for confounders. Conclusion: CCP representing cerebrovascular tension was independently associated with BG-EPVS.
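    Spearman's correlation, used in this abstract to relate CCP to EPVS grades, is the Pearson correlation of the ranks, with tied values sharing their average rank. The following is a minimal sketch of that standard definition. The function names are hypothetical, it is not the authors' analysis code, and it assumes each series has at least two distinct values.

```python
import math
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; assumes neither series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation computed on the average ranks."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return pearson(average_ranks(x), average_ranks(y))
```

    Average ranks matter here because an ordinal grading scale such as the EPVS score produces many ties.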

    Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learning

    Self-supervised audio-visual source localization aims to locate sound-source objects in video frames without extra annotations. Recent methods often approach this goal with the help of contrastive learning, which assumes only the audio and visual contents from the same video are positive samples for each other. However, this assumption would suffer from false negative samples in real-world training. For example, for an audio sample, treating the frames from the same audio class as negative samples may mislead the model and therefore harm the learned representations (e.g., the audio of a siren wailing may reasonably correspond to the ambulances in multiple images). Based on this observation, we propose a new learning strategy named False Negative Aware Contrastive (FNAC) to mitigate the problem of misleading the training with such false negative samples. Specifically, we utilize the intra-modal similarities to identify potentially similar samples and construct corresponding adjacency matrices to guide contrastive learning. Further, we propose to strengthen the role of true negative samples by explicitly leveraging the visual features of sound sources to facilitate the differentiation of authentic sounding source regions. FNAC achieves state-of-the-art performance on Flickr-SoundNet, VGG-Sound, and AVSBench, which demonstrates the effectiveness of our method in mitigating the false negative issue. The code is available at \url{https://github.com/OpenNLPLab/FNAC_AVL}. Comment: CVPR202
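    The idea of letting intra-modal similarities guide contrastive learning can be illustrated with soft targets. The sketch below is not the FNAC loss from the paper or its repository. It is a simplified stand-in that builds a row-normalized audio-to-audio adjacency matrix and uses it as the soft target of a cross-entropy over audio-to-visual similarities, so that visual samples paired with similar audio are not treated as pure negatives. All function names and the temperature value are assumptions, and feature vectors are assumed to be non-zero.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def softmax(row: Sequence[float], temperature: float) -> list[float]:
    """Numerically stable softmax of row / temperature."""
    scaled = [x / temperature for x in row]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def adjacency(feats: Sequence[Vector], temperature: float = 0.1) -> list[list[float]]:
    """Row-normalized intra-modal similarity matrix (each row sums to 1)."""
    return [softmax([cosine(f, g) for g in feats], temperature) for f in feats]


def soft_contrastive_loss(audio: Sequence[Vector], visual: Sequence[Vector],
                          temperature: float = 0.1) -> float:
    """Cross-entropy between the audio adjacency (soft targets) and the
    audio-to-visual similarity distribution, averaged over the batch."""
    targets = adjacency(audio, temperature)
    loss = 0.0
    for i, a in enumerate(audio):
        probs = softmax([cosine(a, v) for v in visual], temperature)
        loss -= sum(t * math.log(p) for t, p in zip(targets[i], probs))
    return loss / len(audio)
```

    With one-hot targets this reduces to the usual InfoNCE-style objective. The soft targets are what relax the penalty on likely false negatives.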

    Audio-Visual Segmentation

    We propose to explore a new problem called audio-visual segmentation (AVS), in which the goal is to output a pixel-level map of the object(s) that produce sound at the time of the image frame. To facilitate this research, we construct the first audio-visual segmentation benchmark (AVSBench), providing pixel-wise annotations for the sounding objects in audible videos. Two settings are studied with this benchmark: 1) semi-supervised audio-visual segmentation with a single sound source and 2) fully-supervised audio-visual segmentation with multiple sound sources. To deal with the AVS problem, we propose a novel method that uses a temporal pixel-wise audio-visual interaction module to inject audio semantics as guidance for the visual segmentation process. We also design a regularization loss to encourage the audio-visual mapping during training. Quantitative and qualitative experiments on the AVSBench compare our approach to several existing methods from related tasks, demonstrating that the proposed method is promising for building a bridge between the audio and pixel-wise visual semantics. Code is available at https://github.com/OpenNLPLab/AVSBench. Comment: ECCV 2022; Correct the equation (3) and update the notation of the evaluation metrics in the last arxiv version; Code is available at https://github.com/OpenNLPLab/AVSBenc

    Unlocking the enigma: unraveling multiple cognitive dysfunction linked to glymphatic impairment in early Alzheimer’s disease

    Background: Alzheimer’s disease (AD) is one of the world’s well-known neurodegenerative diseases, which is related to the balance mechanism of production and clearance of two proteins (amyloid-β and tau) regulated by the glymphatic system. Recent studies have found that AD patients exhibit impairments to their glymphatic system. However, the alterations in the AD disease continuum, especially in the early stages, remain unclear. Moreover, the relationship between the glymphatic system and cognitive dysfunction is still worth exploring. Methods: A novel diffusion tensor image analysis method was applied to evaluate the activity of the glymphatic system by an index for diffusivity along the perivascular space (ALPS-index). Based on this method, the activity of the glymphatic system was noninvasively evaluated in 300 subjects, including 111 normal controls (NC), 120 subjects with mild cognitive impairment (MCI), and 69 subjects with AD. Partial correlation analysis was applied to explore the association between the glymphatic system and cognitive impairment based on three domain-general scales and several domain-specific cognitive scales. Receiver operating characteristic curve analysis was used to evaluate the classification performance of ALPS-index along the AD continuum. Results: ALPS-index was significantly different among NC, MCI and AD groups, and ALPS-index decreased with cognitive decline. In addition, ALPS-index was significantly correlated with the scores of the clinical scales (p<0.05, FDR corrected), especially in the left hemisphere. Furthermore, the combination of ALPS and fractional anisotropy (FA) values achieved better classification results (NC vs. MCI: AUC = 0.6610, NC vs. AD: AUC = 0.8214). Conclusion: Here, we show that the glymphatic system is closely associated with multiple cognitive dysfunctions, and ALPS-index can be used as a biomarker for alterations along the AD continuum. This may provide new targets and strategies for the treatment of AD, and has the potential to assist clinical diagnosis.

    Robust Multimodal Failure Detection for Microservice Systems

    Proactive failure detection of instances is essential to microservice systems because an instance failure can propagate to the whole system and degrade the system's performance. Over the years, many single-modal (i.e., metrics, logs, or traces) data-based anomaly detection methods have been proposed. However, they tend to miss a large number of failures and generate numerous false alarms because they ignore the correlation of multimodal data. In this work, we propose AnoFusion, an unsupervised failure detection approach, to proactively detect instance failures through multimodal data for microservice systems. It applies a Graph Transformer Network (GTN) to learn the correlation of the heterogeneous multimodal data and integrates a Graph Attention Network (GAT) with Gated Recurrent Unit (GRU) to address the challenges introduced by dynamically changing multimodal data. We evaluate the performance of AnoFusion on two datasets, demonstrating that it achieves F1-scores of 0.857 and 0.922, respectively, outperforming the state-of-the-art failure detection approaches.
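    The F1-score reported in this abstract balances the two failure modes it mentions: missed failures lower recall and false alarms lower precision. The sketch below uses the standard definition. The function name is hypothetical and it is not the authors' evaluation code.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall, computed from detection counts.

    Returns 0.0 when there are no true positives, where precision and
    recall are zero or undefined.
    """
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)
```

    For example, a detector with 8 correctly flagged failures, 2 false alarms, and 2 missed failures has precision 0.8 and recall 0.8, and therefore an F1-score of 0.8.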