305 research outputs found
DISPEL: Domain Generalization via Domain-Specific Liberating
Domain generalization aims to learn a model that performs well on unseen test
domains while training only on limited source domains.
However, existing domain generalization approaches often bring in
prediction-irrelevant noise or require the collection of domain labels. To
address these challenges, we consider the domain generalization problem from a
different perspective by categorizing underlying feature groups into
domain-shared and domain-specific features. However, domain-specific features
are difficult to identify and distinguish within the input data.
In this work, we propose DomaIn-SPEcific Liberating (DISPEL), a post-processing
fine-grained masking approach that can filter out undefined and
indistinguishable domain-specific features in the embedding space.
Specifically, DISPEL utilizes a mask generator that produces a unique mask for
each input sample to filter out domain-specific features. The DISPEL framework
is highly flexible and can be applied to any fine-tuned model. We derive a
generalization error bound to guarantee the generalization performance by
optimizing a designed objective loss. Experimental results on five benchmarks
demonstrate that DISPEL outperforms existing methods and can further improve
the generalization of various algorithms.
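The core idea, a generator that produces a per-sample soft mask applied element-wise to the embedding of a frozen fine-tuned model, can be sketched as follows. The layer shape, the sigmoid mask generator, and the 16-dimensional embedding are illustrative assumptions, not DISPEL's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical mask generator: a single linear map followed by a
# sigmoid, so every mask entry lies strictly between 0 and 1.
W = rng.normal(size=(16, 16))

def mask_generator(embedding):
    """Produce a unique soft mask for one input embedding."""
    return sigmoid(embedding @ W)

def dispel_filter(embedding):
    """Damp (putative) domain-specific features element-wise."""
    return embedding * mask_generator(embedding)

emb = rng.normal(size=16)      # embedding from a frozen, fine-tuned model
filtered = dispel_filter(emb)  # same shape, masked in the embedding space
```

Because the mask is generated per input and applied post hoc in the embedding space, the base model's weights are untouched, which is what makes the approach applicable to any already fine-tuned model.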
Efficient XAI Techniques: A Taxonomic Survey
Recently, there has been a growing demand for the deployment of Explainable
Artificial Intelligence (XAI) algorithms in real-world applications. However,
traditional XAI methods typically suffer from high computational complexity,
which hinders their deployment in real-time systems that must meet the latency
requirements of real-world scenarios. Although many approaches
have been proposed to improve the efficiency of XAI methods, a comprehensive
understanding of the achievements and challenges is still needed. To this end,
in this paper we provide a review of efficient XAI. Specifically, we categorize
existing techniques of XAI acceleration into efficient non-amortized and
efficient amortized methods. The efficient non-amortized methods focus on
data-centric or model-centric acceleration upon each individual instance. In
contrast, amortized methods focus on learning a unified distribution of model
explanations, following the predictive, generative, or reinforcement
frameworks, to rapidly derive multiple model explanations. We also analyze the
limitations of an efficient XAI pipeline from the perspectives of the training
phase, the deployment phase, and the use scenarios. Finally, we summarize the
challenges of deploying XAI acceleration methods to real-world scenarios,
overcoming the trade-off between faithfulness and efficiency, and the selection
of different acceleration methods.
Comment: 15 pages, 3 figures
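The survey's central split, non-amortized methods that recompute an explanation per instance versus amortized methods that train one explainer and reuse it, can be illustrated with a toy occlusion attribution. The linear stand-in model, the occlusion rule, and the least-squares "explainer" are my own illustrative assumptions, not methods from the survey.

```python
import numpy as np

rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0, 0.5])

def model(x):
    """Stand-in black-box model (linear for illustration)."""
    return x @ w_true

def nonamortized_explain(x):
    """Per-instance occlusion: the attribution of feature i is the
    output change when feature i is zeroed (recomputed every time)."""
    base = model(x)
    return np.array([base - model(np.where(np.arange(3) == i, 0.0, x))
                     for i in range(3)])

# Amortized: fit one explainer g(x) ~ attributions on a training set,
# then reuse it for any new instance at negligible per-query cost.
X = rng.normal(size=(200, 3))
A = np.stack([nonamortized_explain(x) for x in X])
G, *_ = np.linalg.lstsq(X, A, rcond=None)  # one-off "training" phase

def amortized_explain(x):
    return x @ G
```

In this toy case the attributions are exactly linear in the input, so the amortized explainer reproduces the per-instance ones; for real models an amortized explainer trades some faithfulness for speed, which is precisely the trade-off the survey discusses.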
The Impact of Reasoning Step Length on Large Language Models
Chain of Thought (CoT) is significant in improving the reasoning abilities of
large language models (LLMs). However, the correlation between the
effectiveness of CoT and the length of reasoning steps in prompts remains
largely unknown. To shed light on this, we conduct empirical experiments to
explore this relationship. Specifically, we design experiments that
expand and compress the rationale reasoning steps within CoT demonstrations
while keeping all other factors constant. We have the following key findings.
First, the results indicate that lengthening the reasoning steps in prompts,
even without adding new information into the prompt, considerably enhances
LLMs' reasoning abilities across multiple datasets. Conversely, shortening
the reasoning steps, even while preserving the key information, significantly
diminishes the reasoning abilities of models. This finding highlights the
importance of the number of steps in CoT prompts and provides practical
guidance to make better use of LLMs' potential in complex problem-solving
scenarios. Second, we investigate the relationship between CoT performance and
the rationales used in demonstrations. Surprisingly, the
result shows that even incorrect rationales can yield favorable outcomes if
they maintain the requisite length of inference. Third, we observe that the
advantages of increasing reasoning steps are task-dependent: simpler tasks
require fewer steps, whereas complex tasks gain significantly from longer
inference sequences. The code is available at
https://github.com/MingyuJ666/The-Impact-of-Reasoning-Step-Length-on-Large-Language-Models
Comment: Findings of ACL 2024
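The paper's manipulation, lengthening reasoning steps without adding new information, and compressing them, can be mimicked with simple prompt rewriting. The restatement-based expansion below is my own illustration of one such strategy, not necessarily the exact transformation used in the paper.

```python
def expand_steps(steps):
    """Lengthen a CoT rationale without adding new information:
    after each step, insert a restatement of that same step."""
    expanded = []
    for s in steps:
        expanded.append(s)
        expanded.append("In other words, " + s[0].lower() + s[1:])
    return expanded

def compress_steps(steps, keep_every=2):
    """Shorten a rationale by keeping only every k-th step."""
    return steps[::keep_every]

demo = [
    "There are 3 baskets with 4 apples each.",
    "So the total is 3 * 4 = 12 apples.",
]
longer = expand_steps(demo)     # twice as many steps, same information
shorter = compress_steps(demo)  # fewer steps in the demonstration
```

Holding all other prompt factors constant and varying only the number of steps like this is what isolates step length as the variable behind the reported gains.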
Benchmarking Chinese Text Recognition: Datasets, Baselines, and an Empirical Study
The rapid development of deep learning has fueled major advances in text
recognition in recent years. However, existing text recognition methods are
mainly designed for English text. As Chinese is another widely spoken
language, Chinese text recognition (CTR) has extensive application markets.
Based on our observations, we attribute the scarce attention paid to CTR to
the lack of reasonable dataset construction standards, unified evaluation
protocols, and results of the existing baselines. To fill this gap, we manually
collect CTR datasets from publicly available competitions, projects, and
papers. According to application scenarios, we divide the collected datasets
into four categories including scene, web, document, and handwriting datasets.
Besides, we standardize the evaluation protocols in CTR. With unified
evaluation protocols, we evaluate a series of representative text recognition
methods on the collected datasets to provide baselines. The experimental
results indicate that the performance of baselines on CTR datasets is not as
good as that on English datasets due to the characteristics of Chinese texts
that are quite different from the Latin alphabet. Moreover, we observe that by
introducing radical-level supervision as an auxiliary task, the performance of
baselines can be further boosted. The code and datasets are made publicly
available at https://github.com/FudanVI/benchmarking-chinese-text-recognition
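The radical-level auxiliary supervision the study reports could be realized as a multi-task objective, e.g. a character-level loss plus a weighted radical-level term. The two-character decomposition table, the loss weighting, and the probability tables below are illustrative assumptions; real CTR systems use a full radical/IDS decomposition.

```python
import math

# Illustrative radical decompositions for two Chinese characters.
RADICALS = {"好": ["女", "子"], "明": ["日", "月"]}

def nll(probs, target):
    """Negative log-likelihood of one target under a distribution."""
    return -math.log(probs[target])

def multitask_loss(char_probs, radical_probs, char, lam=0.5):
    """Character-level loss plus a radical-level auxiliary term:
    L = L_char + lam * sum over the character's radicals."""
    loss = nll(char_probs, char)
    for r in RADICALS[char]:
        loss += lam * nll(radical_probs, r)
    return loss

char_probs = {"好": 0.7, "明": 0.3}
radical_probs = {"女": 0.4, "子": 0.3, "日": 0.2, "月": 0.1}
loss = multitask_loss(char_probs, radical_probs, "好")
```

The auxiliary term forces the recognizer to model sub-character structure that the Latin alphabet lacks, which is one plausible reading of why radical supervision boosts the baselines.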
FaithLM: Towards Faithful Explanations for Large Language Models
Large Language Models (LLMs) have become proficient in addressing complex
tasks by leveraging their extensive internal knowledge and reasoning
capabilities. However, the black-box nature of these models complicates the
task of explaining their decision-making processes. While recent advancements
demonstrate the potential of leveraging LLMs to self-explain their predictions
through natural language (NL) explanations, their explanations may not
accurately reflect the LLMs' decision-making process due to a lack of fidelity
optimization on the derived explanations. Measuring the fidelity of NL
explanations is a challenging issue, as it is difficult to manipulate the input
context to mask the semantics of these explanations. To this end, we introduce
FaithLM to explain the decisions of LLMs with NL explanations. Specifically,
FaithLM evaluates the fidelity of NL explanations by incorporating contrary
explanations into the query process. Moreover, FaithLM
conducts an iterative process to improve the fidelity of derived explanations.
Experimental results on three datasets from multiple domains demonstrate that
FaithLM significantly improves the fidelity of derived explanations and yields
better alignment with ground-truth explanations.
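One way such a contrary-explanation fidelity check might be structured: query the model once with the explanation and once with its contrary, and score fidelity by the resulting prediction gap. The toy `predict` function stands in for an LLM, and the scoring rule is my assumption for illustration, not FaithLM's exact procedure.

```python
def predict(query, explanation):
    """Toy stand-in for an LLM: a sentiment score nudged by
    whichever explanation accompanies the query."""
    base = 0.8 if "great" in query else 0.2
    if "because it is good" in explanation:
        return min(1.0, base + 0.1)
    if "because it is bad" in explanation:  # contrary explanation
        return max(0.0, base - 0.4)
    return base

def fidelity(query, explanation, contrary):
    """Score fidelity as the drop in the prediction when the
    explanation is swapped for its contrary: a large gap suggests
    the explanation genuinely bears on the decision."""
    return predict(query, explanation) - predict(query, contrary)

score = fidelity("this movie is great",
                 "because it is good",
                 "because it is bad")
```

An iterative loop, as the abstract describes, would then revise the explanation to widen this gap, raising fidelity without access to model internals.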
Dopamine D2-receptor neurons in nucleus accumbens regulate sevoflurane anesthesia in mice
Introduction: The mechanism of general anesthesia remains elusive. In recent years, numerous investigations have indicated that its mode of action is closely associated with the sleep-wake pathway. As a result, this study aimed to explore the involvement of dopamine D2 receptor (D2R)-expressing neurons located in the nucleus accumbens (NAc), a critical nucleus governing sleep-wake regulation, in sevoflurane anesthesia.
Methods: This exploration was carried out using calcium fiber photometry and optogenetics, with cortical electroencephalogram (EEG), loss of righting reflex (LORR), and recovery of righting reflex (RORR) as experimental indicators.
Results: Calcium fiber photometry revealed a decrease in the activity of NAcD2R neurons during the induction phase of sevoflurane anesthesia, with subsequent recovery during the emergence phase. Moreover, optogenetic activation of NAcD2R neurons shortened the anesthesia induction process and prolonged the arousal process in mice, whereas inhibition of these neurons had the opposite effect. Furthermore, optogenetic activation of NAcD2R neurons projecting to the ventral pallidum (VP) shortened the induction time of mice under sevoflurane anesthesia.
Discussion: In conclusion, our results suggest that NAcD2R neurons play a promotive role in the sevoflurane general anesthesia process in mice, and that their activation can reduce the induction time of anesthesia via the ventral pallidum (VP).
Adherence to the cMIND and AIDD diets and their associations with anxiety in older adults in China
Introduction: Anxiety is highly prevalent among older adults, and dietary interventions targeting nutrition may offer effective, practical strategies for preventing mental disorders. This study aimed to explore the association between the cMIND diet, anti-inflammatory dietary diversity (AIDD), and the risk of anxiety in older adults.
Methods: A cross-sectional analysis was conducted using data from the 2018 Chinese Longitudinal Healthy Longevity Survey (CLHLS). Anxiety symptoms were assessed using the Generalized Anxiety Disorder (GAD-7) scale, while adherence to the cMIND diet and AIDD was evaluated through a food frequency questionnaire. Univariable and multivariable logistic regression analyses were performed to examine associations between dietary patterns and anxiety risk, with odds ratios (ORs) and 95% confidence intervals (CIs) reported. Random forest analysis was used to identify key factors influencing anxiety, and sensitivity analyses were conducted to test the robustness of the results.
Results: A total of 13,815 participants aged 65 and older were included, with 1,550 (11.2%) identified with anxiety. Multivariable logistic models indicated that adherence to the cMIND diet or higher AIDD was associated with a 16–26% reduced risk of anxiety, with adjusted ORs (95% CIs) for the cMIND diet ranging from 0.75 (0.64–0.87) to 0.75 (0.61–0.91), and for AIDD from 0.74 (0.62–0.88) to 0.84 (0.73–0.96). Sensitivity analyses confirmed the stability of these findings. Depression and sleep quality were identified as the most important factors contributing to anxiety, while diet was one of the few modifiable factors.
Conclusion: This study provides evidence supporting the association between diet and anxiety in older adults, highlighting the potential of promoting healthy dietary patterns and targeted nutritional interventions as effective strategies for improving mental health in the aging population.
Interpreting Deep Learning-Based Networking Systems
While many deep learning (DL)-based networking systems have demonstrated
superior performance, the underlying Deep Neural Networks (DNNs) remain black
boxes that network operators cannot interpret. This lack of interpretability
makes DL-based networking systems difficult to deploy in practice. In this
paper, we propose Metis, a framework that provides
interpretability for two general categories of networking problems spanning
local and global control. Accordingly, Metis introduces two interpretation
methods based on decision trees and hypergraphs: it converts DNN policies into
interpretable rule-based controllers and highlights critical components
through hypergraph analysis. We evaluate Metis on several state-of-the-art
DL-based networking systems and show that Metis provides human-readable
interpretations with nearly no degradation in
performance. We further present four concrete use cases of Metis, showcasing
how Metis helps network operators to design, debug, deploy, and ad-hoc adjust
DL-based networking systems.
Comment: To appear at ACM SIGCOMM 2020
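Converting a DNN policy into a rule-based controller is, in spirit, imitation: collect the policy's decisions and fit a decision tree to them. The sketch below fits a depth-one "stump" by exhaustive threshold search against a toy policy; the policy, its two state features, and the stump-only tree are invented for illustration and are far simpler than Metis' actual conversion.

```python
import numpy as np

rng = np.random.default_rng(0)

def dnn_policy(state):
    """Toy stand-in for a learned networking policy: choose the
    high-bitrate action (1) when the playback buffer is long enough."""
    buffer_s, throughput = state
    return 1 if buffer_s > 4.0 else 0

# Collect (state, action) traces from the black-box policy.
states = rng.uniform(0, 10, size=(500, 2))
actions = np.array([dnn_policy(s) for s in states])

# Distill into one interpretable rule: search every feature/threshold
# split for the one that best imitates the policy (a depth-1 tree).
best = None
for f in range(2):
    for t in np.unique(states[:, f]):
        acc = np.mean((states[:, f] > t).astype(int) == actions)
        if best is None or acc > best[0]:
            best = (acc, f, t)
acc, feat, thresh = best
rule = f"if state[{feat}] > {thresh:.2f}: action 1 else action 0"
```

The resulting rule is something an operator can read, audit, and adjust by hand, which is exactly the property that makes such controllers deployable where raw DNN policies are not.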
AKR1C3 in carcinomas: from multifaceted roles to therapeutic strategies
Aldo-Keto Reductase Family 1 Member C3 (AKR1C3), also known as type 5 17β-hydroxysteroid dehydrogenase (17β-HSD5) or prostaglandin F (PGF) synthase, functions as a pivotal enzyme in androgen biosynthesis. It catalyzes the conversion of weak androgens, estrone (a weak estrogen), and PGD2 into potent androgens (testosterone and 5α-dihydrotestosterone), 17β-estradiol (a potent estrogen), and 11β-PGF2α, respectively. Elevated levels of AKR1C3 activate the androgen receptor (AR) signaling pathway, contributing to tumor recurrence and imparting resistance to cancer therapies. The overexpression of AKR1C3 serves as an oncogenic factor, promoting carcinoma cell proliferation, invasion, and metastasis, and is correlated with unfavorable prognosis and overall survival in carcinoma patients. Inhibiting AKR1C3 has demonstrated potent efficacy in suppressing tumor progression and overcoming treatment resistance. As a result, the development and design of AKR1C3 inhibitors have garnered increasing interest among researchers, with significant progress witnessed in recent years. Novel AKR1C3 inhibitors, including natural products and analogues of existing drugs designed based on their structures and frameworks, continue to be discovered and developed in laboratories worldwide. The AKR1C3 enzyme has emerged as a key player in carcinoma progression and therapeutic resistance, posing challenges in cancer treatment. This review aims to provide a comprehensive analysis of AKR1C3's role in carcinoma development, its implications in therapeutic resistance, and recent advancements in the development of AKR1C3 inhibitors for tumor therapies.