MGR: Multi-generator based Rationalization
Rationalization employs a generator and a predictor to construct a
self-explaining NLP model in which the generator selects a subset of
human-intelligible pieces of the input text and passes them to the predictor.
However, rationalization suffers from two key challenges, i.e., spurious
correlation and degeneration, where the predictor overfits the spurious or
meaningless pieces solely selected by the not-yet well-trained generator and in
turn deteriorates the generator. Although many studies have been proposed to
address the two challenges, they are usually designed separately and do not
take both of them into account. In this paper, we propose a simple yet
effective method named MGR to simultaneously solve the two problems. The key
idea of MGR is to employ multiple generators such that the occurrence stability
of real pieces is improved and more meaningful pieces are delivered to the
predictor. Empirically, we show that MGR improves the F1 score by up to 20.9%
as compared to state-of-the-art methods. Code is available at
https://github.com/jugechengzi/Rationalization-MGR.
Comment: Accepted as a main conference paper of ACL 2023. arXiv admin note:
text overlap with arXiv:2209.0828
D-Separation for Causal Self-Explanation
Rationalization is a self-explaining framework for NLP models. Conventional
work typically uses the maximum mutual information (MMI) criterion to find the
rationale that is most indicative of the target label. However, this criterion
can be influenced by spurious features that correlate with the causal rationale
or the target label. Instead of attempting to rectify the issues of the MMI
criterion, we propose a novel criterion to uncover the causal rationale, termed
the Minimum Conditional Dependence (MCD) criterion, which is grounded on our
finding that the non-causal features and the target label are
d-separated by the causal rationale. By minimizing the dependence
between the unselected parts of the input and the target label conditioned on
the selected rationale candidate, all the causes of the label are compelled to
be selected. In this study, we employ a simple and practical measure of
dependence, specifically the KL-divergence, to validate our proposed MCD
criterion. Empirically, we demonstrate that MCD improves the F1 score by up to
compared to previous state-of-the-art MMI-based methods. Our code is
available at: https://github.com/jugechengzi/Rationalization-MCD.
Comment: NeurIPS 202
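The MCD idea above can be illustrated with a toy calculation. If the unselected text is conditionally independent of the label given the rationale, a prediction from the rationale alone should match a prediction from the full input, so the KL-divergence between the two distributions should be near zero. The sketch below is not the authors' implementation: the probability vectors are made up, and the helper name `kl_divergence` is an assumption introduced for illustration.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

# Hypothetical label distributions (negative, positive); in practice these
# would come from trained predictors, which this sketch does not include.
p_full = [0.10, 0.90]            # predictor sees the whole input
p_good_rationale = [0.12, 0.88]  # rationale that keeps the causes of the label
p_bad_rationale = [0.50, 0.50]   # rationale that misses them

gap_good = kl_divergence(p_full, p_good_rationale)
gap_bad = kl_divergence(p_full, p_bad_rationale)
print(f"good rationale: {gap_good:.4f}, bad rationale: {gap_bad:.4f}")
```

Under these assumed numbers, the rationale that preserves the label-relevant content yields the smaller divergence, which is the quantity an MCD-style objective would push down.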
Controllable Textual Inversion for Personalized Text-to-Image Generation
Recent large-scale generative modeling has attained unprecedented
performance, especially in producing high-fidelity images driven by text
prompts. Textual inversion (TI), alongside the text-to-image model backbones, is
proposed as an effective technique in personalizing the generation when the
prompts contain user-defined, unseen or long-tail concept tokens. Despite that,
we find and show that the deployment of TI remains full of "dark-magics" -- to
name a few, the harsh requirement of additional datasets, arduous human efforts
in the loop and lack of robustness. In this work, we propose a much-enhanced
version of TI, dubbed Controllable Textual Inversion (COTI), in resolving all
the aforementioned problems and in turn delivering a robust, data-efficient and
easy-to-use framework. The core to COTI is a theoretically-guided loss
objective instantiated with a comprehensive and novel weighted scoring
mechanism, encapsulated by an active-learning paradigm. The extensive results
show that COTI significantly outperforms the prior TI-related approaches with a
26.05 decrease in the FID score and a 23.00% boost in the R-precision.
Comment: 10 pages, 6 figures, 2 tables. Project Page:
https://github.com/jnzju/COT
DocAsRef: An Empirical Study on Repurposing Reference-Based Summary Quality Metrics Reference-Freely
Automated summary quality assessment falls into two categories:
reference-based and reference-free. Reference-based metrics, historically
deemed more accurate due to the additional information provided by
human-written references, are limited by their reliance on human input. In this
paper, we hypothesize that the comparison methodologies used by some
reference-based metrics to evaluate a system summary against its corresponding
reference can be effectively adapted to assess it against its source document,
thereby transforming these metrics into reference-free ones. Experimental
results support this hypothesis. After being repurposed reference-freely, the
zero-shot BERTScore using the pretrained DeBERTa-large-MNLI model of <0.5B
parameters consistently outperforms its original reference-based version across
various aspects on the SummEval and Newsroom datasets. It also excels in
comparison to most existing reference-free metrics and closely competes with
zero-shot summary evaluators based on GPT-3.5.
Comment: Accepted into Findings of EMNLP 202
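The repurposing step described above amounts to keeping the comparison function and swapping its second argument from the human reference to the source document. The sketch below shows only that swap. It uses a simple unigram-overlap F1 as a stand-in for BERTScore, so it says nothing about the reported results; the function name `overlap_f1` and the example texts are assumptions made for illustration.

```python
from collections import Counter

def overlap_f1(candidate, target):
    """Unigram-overlap F1 between two texts (a stand-in metric, not BERTScore)."""
    cand = Counter(candidate.lower().split())
    targ = Counter(target.lower().split())
    matched = sum((cand & targ).values())  # clipped token matches
    if matched == 0:
        return 0.0
    precision = matched / sum(cand.values())
    recall = matched / sum(targ.values())
    return 2 * precision * recall / (precision + recall)

# Hypothetical texts.
document = "the council approved the new budget on monday after a long debate"
reference = "council approves budget after debate"
summary = "the council approved the budget on monday"

reference_based = overlap_f1(summary, reference)  # needs a human-written reference
reference_free = overlap_f1(summary, document)    # same metric, document as target
print(f"reference-based: {reference_based:.3f}, reference-free: {reference_free:.3f}")
```

The metric itself is unchanged between the two calls; only the text it compares the summary against differs.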
FLAME: Differentially Private Federated Learning in the Shuffle Model
Federated Learning (FL) is a promising machine learning paradigm that enables
the analyzer to train a model without collecting users' raw data. To ensure
users' privacy, differentially private federated learning has been intensively
studied. The existing works are mainly based on the curator model or
local model of differential privacy. However, both of them have pros
and cons. The curator model allows greater accuracy but requires a trusted
analyzer. In the local model where users randomize local data before sending
them to the analyzer, a trusted analyzer is not required but the accuracy is
limited. In this work, by leveraging the privacy amplification effect
in the recently proposed shuffle model of differential privacy, we achieve the
best of two worlds, i.e., accuracy in the curator model and strong privacy
without relying on any trusted party. We first propose an FL framework in the
shuffle model and a simple protocol (SS-Simple) extended from existing work. We
find that SS-Simple only provides an insufficient privacy amplification effect
in FL since the dimension of the model parameter is quite large. To solve this
challenge, we propose an enhanced protocol (SS-Double) to increase the privacy
amplification effect by subsampling. Furthermore, for boosting the utility when
the model size is greater than the user population, we propose an advanced
protocol (SS-Topk) with gradient sparsification techniques. We also provide
theoretical analysis and numerical evaluations of the privacy amplification of
the proposed protocols. Experiments on a real-world dataset validate that SS-Topk
improves the testing accuracy by 60.7% compared with local-model-based FL.
Comment: accepted by AAAI-2
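The general data flow of a shuffle-model round with gradient sparsification can be sketched as follows: each user keeps only the largest-magnitude coordinates, randomizes them locally, a shuffler permutes all messages so they can no longer be tied to a user, and the analyzer averages what it receives. This is a simplified sketch, not SS-Simple, SS-Double, or SS-Topk as defined in the paper. The clipping bound, the Laplace noise scale, and all function names are assumptions, and no privacy budget is calibrated here.

```python
import random

def top_k_sparsify(grad, k):
    """Return (index, value) pairs for the k largest-magnitude coordinates."""
    idx = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k]
    return [(i, grad[i]) for i in idx]

def local_randomize(value, clip, scale, rng):
    """Clip a value to [-clip, clip] and add Laplace noise with the given scale."""
    clipped = max(-clip, min(clip, value))
    # The difference of two i.i.d. exponentials is Laplace-distributed.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return clipped + noise

def shuffle_round(user_grads, k, clip, scale, seed=0):
    """One round: local top-k + noise, shuffling, then per-coordinate averaging."""
    rng = random.Random(seed)
    messages = []
    for grad in user_grads:
        for i, v in top_k_sparsify(grad, k):
            messages.append((i, local_randomize(v, clip, scale, rng)))
    rng.shuffle(messages)  # shuffler: removes the link between message and user
    sums = [0.0] * len(user_grads[0])
    for i, v in messages:  # analyzer: sees only the shuffled messages
        sums[i] += v
    return [s / len(user_grads) for s in sums], messages

# Hypothetical gradients from three users over a 4-dimensional model.
user_grads = [
    [0.9, -0.1, 0.05, 0.4],
    [0.8, 0.02, -0.6, 0.1],
    [1.2, 0.0, 0.3, -0.7],
]
avg_update, messages = shuffle_round(user_grads, k=2, clip=1.0, scale=0.1)
print(avg_update)
```

With `k=2`, each user sends two messages instead of four, so the analyzer receives fewer noisy values per round; coordinates that no user selects stay at exactly zero in the aggregate.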
PrivateRec: Differentially Private Training and Serving for Federated News Recommendation
Privacy protection is an essential issue in personalized news recommendation,
and federated learning can potentially mitigate the privacy concern by training
personalized news recommendation models over decentralized user data. For a
theoretical privacy guarantee, differential privacy is necessary. However,
applying differential privacy to federated recommendation training and serving
conventionally suffers from the unsatisfactory trade-off between privacy and
utility due to the high-dimensional characteristics of model gradients and
hidden representations. In addition, there is no formal privacy guarantee for
both training and serving in federated recommendation. In this paper, we
propose a unified federated news recommendation method for effective and
privacy-preserving model training and online serving with differential privacy
guarantees. We first clarify the notion of differential privacy over users'
behavior data for both model training and online serving in the federated
recommendation scenario. Next, we propose a privacy-preserving online serving
mechanism under this definition with differentially private user interest
decomposition. More specifically, it decomposes the high-dimensional and
privacy-sensitive user embedding into a combination of public basic vectors and
adds noise to the combination coefficients. In this way, it can avoid the
dimension curse and improve the utility by reducing the required noise
intensity for differential privacy. Besides, we design a federated
recommendation model training method with differential privacy, which can avoid
the dimension-dependent noise for large models via label permutation and
differentially private attention modules. Experiments on real-world news
recommendation datasets validate the effectiveness of our method in achieving a
good trade-off between privacy protection and utility for federated news
recommendations.
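The user interest decomposition described above can be illustrated with a small sketch: a d-dimensional user embedding is expressed through K public basis vectors, and noise is added to the K coefficients rather than to all d dimensions. The abstract does not say how the coefficients are computed, so the sketch assumes an orthonormal basis and projection by dot product. The Laplace noise, its scale, and every function name are likewise assumptions, and no privacy budget is calibrated.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def laplace(scale, rng):
    """Sample Laplace noise; a scale of 0 disables noise (for checking only)."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_user_embedding(user_vec, basis, scale, rng):
    """Project onto public basis vectors, perturb the coefficients, rebuild."""
    coeffs = [dot(user_vec, b) for b in basis]          # K numbers, K << d
    noisy = [c + laplace(scale, rng) for c in coeffs]   # noise on K values only
    rebuilt = [
        sum(c * b[j] for c, b in zip(noisy, basis))
        for j in range(len(user_vec))
    ]
    return rebuilt, noisy

# Hypothetical 4-dimensional user embedding and 2 public basis vectors.
basis = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
user_vec = [0.5, -0.2, 0.1, 0.3]

rng = random.Random(0)
released, noisy_coeffs = private_user_embedding(user_vec, basis, scale=0.05, rng=rng)
print(noisy_coeffs, released)
```

Only the two coefficients are perturbed here, which mirrors the abstract's point that the required noise no longer scales with the full embedding dimension.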
Association of inflammatory indicators with intensive care unit mortality in critically ill patients with coronary heart disease.
Objective: Coronary heart disease (CHD) is one of the major cardiovascular diseases, a common chronic disease in the elderly and a major cause of disability and death in the world. Currently, intensive care unit (ICU) patients have a high probability of concomitant coronary artery disease, and the mortality of this category of patients in the ICU is receiving increasing attention. Therefore, the aim of this study was to verify whether the composite inflammatory indicators are significantly associated with ICU mortality in ICU patients with CHD and to develop a simple personalized prediction model.
Method: 7115 patients from the Multi-Parameter Intelligent Monitoring in Intensive Care Database IV were randomly assigned to the training cohort (n = 5692) and internal validation cohort (n = 1423), and 701 patients from the eICU Collaborative Research Database served as the external validation cohort. The association between various inflammatory indicators and ICU mortality was determined by multivariate Logistic regression analysis and Cox proportional hazards model. Subsequently, a novel predictive model for mortality in ICU patients with CHD was developed in the training cohort and performance was evaluated in the internal and external validation cohorts.
Results: Various inflammatory indicators were demonstrated to be significantly associated with ICU mortality, 30-day ICU mortality, and 90-day ICU mortality in ICU patients with CHD by Logistic regression analysis and Cox proportional hazards model. The area under the curve of the novel predictive model for ICU mortality in ICU patients with CHD was 0.885 for the internal validation cohort and 0.726 for the external validation cohort. The calibration curve showed that the predicted probabilities of the model matched the actual observed probabilities. Furthermore, the decision curve analysis showed that the novel prediction model had a high net clinical benefit.
Conclusion: In ICU patients with CHD, various inflammatory indicators were independent risk factors for ICU mortality. We constructed a novel predictive model of ICU mortality risk in ICU patients with CHD that had great potential to guide clinical decision-making.
Climate, not grazing, influences soil microbial diversity through changes in vegetation and abiotic factors on geographical patterns in the Eurasian steppe
Livestock grazing has a significant impact on the biodiversity of natural grassland ecosystems, which is mainly regulated by climate factors. Soil microbes are essential components of biogeochemical cycles. However, the coupling effects of grazing with MAT (mean annual temperature) and MAP (mean annual precipitation) on soil microbial communities remain inconsistent. Our study considered the various climates in four grasslands as natural temperature and precipitation gradients combined with grazing intensity (GI). We collected and analyzed vegetation and soil physicochemical properties from four grasslands. Our results showed that climate factors (CF) changed β diversity of soil bacteria and fungi while grazing intensity and their interaction merely affected fungi β diversity. Furthermore, climate factors and grazing intensity impacted changes in vegetation and soil physicochemical properties, with their interaction leading to changes in EC and MBC. Our analysis revealed that climate factors contributed 13.1% to bacteria community variation while grazing intensity contributed 3.01% to fungi community variation. Piecewise SEM analysis demonstrated that MAT and MAP were essential predictors of bacteria β diversity, which was significantly affected by vegetation and soil carbon and nitrogen. At the same time, MAP was an essential factor of fungi β diversity and was mainly affected by soil nitrogen. Our study indicated that bacteria and fungi β diversity was affected by different environmental processes and can adapt to specific grazing intensities over time.