
    DISPEL: Domain Generalization via Domain-Specific Liberating

    Domain generalization aims to learn a model that performs well on unseen test domains by training only on limited source domains. However, existing domain generalization approaches often introduce prediction-irrelevant noise or require the collection of domain labels. To address these challenges, we consider the domain generalization problem from a different perspective, categorizing the underlying feature groups into domain-shared and domain-specific features. Domain-specific features, however, are difficult to identify and to distinguish from the input data. In this work, we propose DomaIn-SPEcific Liberating (DISPEL), a post-processing fine-grained masking approach that filters out undefined and indistinguishable domain-specific features in the embedding space. Specifically, DISPEL utilizes a mask generator that produces a unique mask for each input to filter domain-specific features. The DISPEL framework is highly flexible and can be applied to any fine-tuned model. We derive a generalization error bound that guarantees generalization performance when a designed objective loss is optimized. Experimental results on five benchmarks demonstrate that DISPEL outperforms existing methods and can further generalize various algorithms.
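As a toy illustration of the masking idea this abstract describes (an instance-conditioned soft mask applied in the embedding space), here is a minimal NumPy sketch. The single-layer mask generator, its weights, and the embedding size are all invented for illustration; DISPEL's actual architecture and training objective are in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical mask generator: one linear layer mapping an embedding to a
# per-dimension mask in (0, 1). Purely illustrative, not DISPEL's design.
W = rng.normal(size=(8, 8))

def liberate(embedding):
    """Soft-mask (filter) domain-specific dimensions of one embedding."""
    mask = sigmoid(W @ embedding)  # a unique mask per input
    return mask * embedding        # element-wise filtering

emb = rng.normal(size=8)
filtered = liberate(emb)
print(filtered.shape)  # (8,)
```

Because each mask entry lies strictly between 0 and 1, the filtering can only attenuate embedding dimensions, never amplify them.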

    Efficient XAI Techniques: A Taxonomic Survey

    Recently, there has been a growing demand for deploying Explainable Artificial Intelligence (XAI) algorithms in real-world applications. However, traditional XAI methods typically suffer from high computational complexity, which hinders their deployment in real-time systems that must meet the time-demanding requirements of real-world scenarios. Although many approaches have been proposed to improve the efficiency of XAI methods, a comprehensive understanding of the achievements and challenges is still needed. To this end, in this paper we provide a review of efficient XAI. Specifically, we categorize existing techniques for XAI acceleration into efficient non-amortized and efficient amortized methods. Efficient non-amortized methods focus on data-centric or model-centric acceleration for each individual instance. In contrast, amortized methods focus on learning a unified distribution of model explanations, following predictive, generative, or reinforcement frameworks, to rapidly derive multiple model explanations. We also analyze the limitations of an efficient XAI pipeline from the perspectives of the training phase, the deployment phase, and the usage scenarios. Finally, we summarize the challenges of deploying XAI acceleration methods in real-world scenarios, overcoming the trade-off between faithfulness and efficiency, and selecting among the different acceleration methods.
    Comment: 15 pages, 3 figures
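To make the non-amortized/amortized distinction concrete, here is a minimal sketch of a non-amortized, per-instance explanation: occlusion-based attribution, which re-queries the model once per feature for every instance. The linear "model" is a stand-in for any black box; this cost per instance is exactly what amortized methods avoid by training an explainer once.

```python
import numpy as np

# Toy "black box": a fixed linear scorer standing in for any model.
w = np.array([2.0, -1.0, 0.5])
def model(x):
    return float(w @ x)

def occlusion_importance(x, baseline=0.0):
    """Non-amortized attribution: one extra model query per feature."""
    base = model(x)
    scores = []
    for i in range(len(x)):
        x_occ = x.copy()
        x_occ[i] = baseline          # occlude one feature
        scores.append(base - model(x_occ))
    return np.array(scores)

x = np.array([1.0, 1.0, 1.0])
print(occlusion_importance(x))  # for a linear model this recovers w * x
```

For d features and n instances this needs n * (d + 1) model calls, whereas an amortized explainer pays a one-time training cost and then one forward pass per instance.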

    The Impact of Reasoning Step Length on Large Language Models

    Chain of Thought (CoT) prompting is significant in improving the reasoning abilities of large language models (LLMs). However, the correlation between the effectiveness of CoT and the length of the reasoning steps in prompts remains largely unknown. To shed light on this, we conducted several empirical experiments to explore the relationship. Specifically, we design experiments that expand and compress the rationale reasoning steps within CoT demonstrations while keeping all other factors constant. We have the following key findings. First, the results indicate that lengthening the reasoning steps in prompts, even without adding new information to the prompt, considerably enhances LLMs' reasoning abilities across multiple datasets. Conversely, shortening the reasoning steps, even while preserving the key information, significantly diminishes the models' reasoning abilities. This finding highlights the importance of the number of steps in CoT prompts and provides practical guidance for making better use of LLMs' potential in complex problem-solving scenarios. Second, we also investigated the relationship between the performance of CoT and the rationales used in demonstrations. Surprisingly, the results show that even incorrect rationales can yield favorable outcomes if they maintain the requisite length of inference. Third, we observed that the advantages of increasing reasoning steps are task-dependent: simpler tasks require fewer steps, whereas complex tasks gain significantly from longer inference sequences. The code is available at https://github.com/MingyuJ666/The-Impact-of-Reasoning-Step-Length-on-Large-Language-Models
    Comment: Findings of ACL 2024
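The expand/compress manipulation can be sketched as prompt construction that fixes every factor except the number of rationale lines. The question, rationale steps, and the naive "restate the last step" expansion below are invented for illustration; the paper's actual expansion strategies and datasets differ.

```python
# Hypothetical CoT demonstration; the rationale content is invented.
steps = [
    "There are 3 apples and 2 oranges.",
    "3 + 2 = 5.",
    "So there are 5 pieces of fruit.",
]

def build_cot_prompt(question, rationale_steps, n_steps):
    """Expand or compress the rationale to exactly n_steps lines."""
    kept = rationale_steps[:n_steps]
    # Naive expansion: restate the last step until the target is met.
    while len(kept) < n_steps:
        kept = kept + ["Let us re-check: " + kept[-1]]
    return question + "\n" + "\n".join(kept)

short = build_cot_prompt("How many fruits?", steps, 2)
long = build_cot_prompt("How many fruits?", steps, 5)
print(short.count("\n"), long.count("\n"))  # 2 5
```

Holding the question and key facts constant while varying only the line count is what lets step length be isolated as the experimental variable.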

    Benchmarking Chinese Text Recognition: Datasets, Baselines, and an Empirical Study

    The flourishing of deep learning has driven the rapid development of text recognition in recent years. However, existing text recognition methods are mainly proposed for English text. Chinese is another widely spoken language, and Chinese text recognition (CTR) has extensive application markets. Based on our observations, we attribute the scarce attention on CTR to the lack of reasonable dataset construction standards, unified evaluation protocols, and results for the existing baselines. To fill this gap, we manually collect CTR datasets from publicly available competitions, projects, and papers. According to the application scenario, we divide the collected datasets into four categories: scene, web, document, and handwriting datasets. In addition, we standardize the evaluation protocols for CTR. With unified evaluation protocols, we evaluate a series of representative text recognition methods on the collected datasets to provide baselines. The experimental results indicate that the performance of the baselines on CTR datasets is not as good as on English datasets because the characteristics of Chinese texts differ markedly from those of the Latin alphabet. Moreover, we observe that introducing radical-level supervision as an auxiliary task further boosts the performance of the baselines. The code and datasets are made publicly available at https://github.com/FudanVI/benchmarking-chinese-text-recognition
    Comment: Code is available at https://github.com/FudanVI/benchmarking-chinese-text-recognition
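Two metrics commonly used to evaluate text recognition are exact-match (sentence-level) accuracy and one minus the normalized edit distance; whether these match this benchmark's exact protocol is an assumption, so the sketch below is only a generic illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance via dynamic programming (one rolling row)."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (ca != cb))
    return dp[len(b)]

def evaluate(preds, labels):
    """Exact-match accuracy and 1 - normalized edit distance."""
    acc = sum(p == l for p, l in zip(preds, labels)) / len(labels)
    ned = sum(edit_distance(p, l) / max(len(p), len(l), 1)
              for p, l in zip(preds, labels)) / len(labels)
    return acc, 1.0 - ned   # higher is better for both

acc, one_minus_ned = evaluate(["汉字识别", "中文"], ["汉字识别", "中问"])
print(acc, one_minus_ned)  # 0.5 0.75
```

Edit-distance-based scoring matters for Chinese because a single wrong character among thousands of classes should not zero out credit for an otherwise correct line.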

    FaithLM: Towards Faithful Explanations for Large Language Models

    Large Language Models (LLMs) have become proficient at addressing complex tasks by leveraging their extensive internal knowledge and reasoning capabilities. However, the black-box nature of these models complicates the task of explaining their decision-making processes. While recent advancements demonstrate the potential of leveraging LLMs to self-explain their predictions through natural language (NL) explanations, these explanations may not accurately reflect the LLMs' decision-making process due to a lack of fidelity optimization on the derived explanations. Measuring the fidelity of NL explanations is a challenging issue, as it is difficult to manipulate the input context to mask their semantics. To this end, we introduce FaithLM to explain the decisions of LLMs with NL explanations. Specifically, FaithLM designs a method for evaluating the fidelity of NL explanations by incorporating contrary explanations into the query process. Moreover, FaithLM conducts an iterative process to improve the fidelity of the derived explanations. Experimental results on three datasets from multiple domains demonstrate that FaithLM significantly improves the fidelity of the derived explanations, which also aligns better with the ground-truth explanations.
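The contrary-explanation idea can be sketched with stub functions: a faithful explanation should preserve the model's original answer when added to the query, while its contrary version should flip it. Everything here (the stub `predict`, the toy `contrary` negation, and the +1/0/-1 scoring) is an assumption for illustration, not FaithLM's actual prompting or iteration procedure.

```python
# Stub "LLM": answers "positive" only when the explanation supports it.
def predict(query, explanation):
    return "positive" if "good" in explanation else "negative"

def contrary(explanation):
    return explanation.replace("good", "bad")   # toy negation

def fidelity(query, explanation, original_answer):
    with_expl = predict(query, explanation) == original_answer
    with_contrary = predict(query, contrary(explanation)) == original_answer
    # A faithful explanation keeps the answer; its contrary should flip it.
    return int(with_expl) - int(with_contrary)

score = fidelity("Review: great film", "the wording is good", "positive")
print(score)  # 1: explanation preserves the answer, contrary flips it
```

An iterative refinement loop would then keep revising the explanation until this score stops improving.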

    Dopamine D2-receptor neurons in nucleus accumbens regulate sevoflurane anesthesia in mice

    Introduction: The mechanism of general anesthesia remains elusive. In recent years, numerous investigations have indicated that its mode of action is closely associated with the sleep-wake pathway. This study therefore explored the involvement of dopamine D2 receptor (D2R)-expressing neurons located in the nucleus accumbens (NAc), a critical nucleus governing sleep-wake regulation, in sevoflurane anesthesia.
    Methods: This exploration was carried out using calcium fiber photometry and optogenetics, with cortical electroencephalogram (EEG), loss of righting reflex (LORR), and recovery of righting reflex (RORR) as experimental indicators.
    Results: Calcium fiber photometry revealed a decrease in the activity of NAcD2R neurons during the induction phase of sevoflurane anesthesia, with recovery during the emergence phase. Moreover, optogenetic activation of NAcD2R neurons shortened anesthesia induction and prolonged the arousal process in mice, whereas inhibiting these neurons had the opposite effect. Furthermore, optogenetic activation of NAcD2R neurons projecting to the ventral pallidum (VP) shortened the induction time of sevoflurane anesthesia in mice.
    Discussion: In conclusion, our results suggest that NAcD2R neurons play a promotive role in sevoflurane general anesthesia in mice, and that their activation can shorten anesthesia induction via the ventral pallidum (VP).

    Adherence to the cMIND and AIDD diets and their associations with anxiety in older adults in China

    Introduction: Anxiety is highly prevalent among older adults, and dietary interventions targeting nutrition may offer effective, practical strategies for preventing mental disorders. This study aimed to explore the association between the cMIND diet, anti-inflammatory dietary diversity (AIDD), and the risk of anxiety in older adults.
    Methods: A cross-sectional analysis was conducted using data from the 2018 Chinese Longitudinal Healthy Longevity Survey (CLHLS). Anxiety symptoms were assessed using the Generalized Anxiety Disorder (GAD-7) scale, while adherence to the cMIND diet and AIDD was evaluated through a food frequency questionnaire. Univariable and multivariable logistic regression analyses were performed to examine associations between dietary patterns and anxiety risk, with odds ratios (ORs) and 95% confidence intervals (CIs) reported. Random forest analysis was used to identify key factors influencing anxiety, and sensitivity analyses were conducted to test the robustness of the results.
    Results: A total of 13,815 participants aged 65 and older were included, with 1,550 (11.2%) identified as having anxiety. Multivariable logistic models indicated that adherence to the cMIND diet or higher AIDD was associated with a 16–26% reduced risk of anxiety, with adjusted ORs (95% CIs) for the cMIND diet ranging from 0.75 (0.64–0.87) to 0.75 (0.61–0.91), and for AIDD from 0.74 (0.62–0.88) to 0.84 (0.73–0.96). Sensitivity analyses confirmed the stability of these findings. Depression and sleep quality were identified as the most important factors contributing to anxiety, while diet was one of the few modifiable factors.
    Conclusion: This study provides evidence supporting the association between diet and anxiety in older adults, highlighting the potential of promoting healthy dietary patterns and targeted nutritional interventions as effective strategies for improving mental health in the aging population.
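For readers unfamiliar with the reported ORs and 95% CIs, here is a minimal sketch of an unadjusted odds ratio with a Woolf confidence interval from a 2x2 table. The counts are invented; the study's ORs came from multivariable logistic regression, which additionally adjusts for covariates, so this only illustrates what the numbers mean.

```python
import math

# Invented 2x2 table (not the study's data):
#            anxiety   no anxiety
# exposed       a=120      b=880
# unexposed     c=150      d=850
a, b, c, d = 120, 880, 150, 850

or_ = (a * d) / (b * c)                          # odds ratio
se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)     # SE of log(OR), Woolf method
lo = math.exp(math.log(or_) - 1.96 * se_log_or)  # lower 95% bound
hi = math.exp(math.log(or_) + 1.96 * se_log_or)  # upper 95% bound
print(round(or_, 2), round(lo, 2), round(hi, 2))
```

An OR below 1 with a CI that excludes 1, like the study's 0.75 (0.64–0.87), indicates a statistically significant reduction in the odds of anxiety.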

    Interpreting Deep Learning-Based Networking Systems

    While many deep learning (DL)-based networking systems have demonstrated superior performance, the underlying Deep Neural Networks (DNNs) remain black boxes that are uninterpretable to network operators. This lack of interpretability makes DL-based networking systems prohibitive to deploy in practice. In this paper, we propose Metis, a framework that provides interpretability for two general categories of networking problems spanning local and global control. Accordingly, Metis introduces two interpretation methods based on decision trees and hypergraphs: it converts DNN policies into interpretable rule-based controllers and highlights critical components based on hypergraph analysis. We evaluate Metis over several state-of-the-art DL-based networking systems and show that Metis provides human-readable interpretations with nearly no degradation in performance. We further present four concrete use cases of Metis, showcasing how Metis helps network operators design, debug, deploy, and ad-hoc adjust DL-based networking systems.
    Comment: To appear at ACM SIGCOMM 2020
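The conversion of a DNN policy into a rule-based controller can be illustrated by distillation: probe the black-box policy, then fit an interpretable rule to its decisions. The sketch below distills a stand-in policy into a single-threshold rule (a depth-1 tree); Metis's actual method builds full decision trees and hypergraph analyses, so this is only the core idea.

```python
# Toy sketch: distill a black-box policy into a one-level decision rule.
def policy(x):
    """Stand-in for a DNN policy: picks an action by a hidden threshold."""
    return 1 if x > 0.42 else 0

xs = [i / 100 for i in range(100)]   # probe inputs
labels = [policy(x) for x in xs]     # teacher decisions

def fit_stump(xs, labels):
    """Find the threshold that best reproduces the teacher policy."""
    best_t, best_acc = None, -1.0
    for t in xs:
        acc = sum((1 if x > t else 0) == y
                  for x, y in zip(xs, labels)) / len(xs)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

t, acc = fit_stump(xs, labels)
print(t, acc)  # recovers the hidden threshold 0.42 with accuracy 1.0
```

The resulting rule ("act if x > 0.42") is something an operator can read, audit, and deploy directly, which is the point of converting DNN policies to rule-based controllers.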

    AKR1C3 in carcinomas: from multifaceted roles to therapeutic strategies

    Aldo-Keto Reductase Family 1 Member C3 (AKR1C3), also known as type 5 17β-hydroxysteroid dehydrogenase (17β-HSD5) or prostaglandin F (PGF) synthase, functions as a pivotal enzyme in androgen biosynthesis. It catalyzes the conversion of weak androgens, estrone (a weak estrogen), and PGD2 into potent androgens (testosterone and 5α-dihydrotestosterone), 17β-estradiol (a potent estrogen), and 11β-PGF2α, respectively. Elevated levels of AKR1C3 activate the androgen receptor (AR) signaling pathway, contributing to tumor recurrence and imparting resistance to cancer therapies. The overexpression of AKR1C3 serves as an oncogenic factor, promoting carcinoma cell proliferation, invasion, and metastasis, and is correlated with unfavorable prognosis and poor overall survival in carcinoma patients. Inhibiting AKR1C3 has demonstrated potent efficacy in suppressing tumor progression and overcoming treatment resistance. As a result, the development and design of AKR1C3 inhibitors have garnered increasing interest among researchers, with significant progress in recent years. Novel AKR1C3 inhibitors, including natural products and analogues of existing drugs designed based on their structures and frameworks, continue to be discovered and developed in laboratories worldwide. AKR1C3 has emerged as a key player in carcinoma progression and therapeutic resistance, posing challenges in cancer treatment. This review provides a comprehensive analysis of AKR1C3's role in carcinoma development, its implications in therapeutic resistance, and recent advancements in the development of AKR1C3 inhibitors for tumor therapies.