63 research outputs found

    MGR: Multi-generator based Rationalization

    Rationalization employs a generator and a predictor to construct a self-explaining NLP model in which the generator selects a subset of human-intelligible pieces of the input text and passes them to the predictor. However, rationalization suffers from two key challenges, spurious correlation and degeneration, in which the predictor overfits the spurious or meaningless pieces selected by the not-yet well-trained generator and in turn degrades the generator. Although many studies have addressed these two challenges, they usually treat them separately rather than jointly. In this paper, we propose a simple yet effective method named MGR to solve the two problems simultaneously. The key idea of MGR is to employ multiple generators so that the occurrence stability of real pieces is improved and more meaningful pieces are delivered to the predictor. Empirically, we show that MGR improves the F1 score by up to 20.9% compared to state-of-the-art methods. Code is available at https://github.com/jugechengzi/Rationalization-MGR. Comment: Accepted as a main conference paper of ACL 2023. arXiv admin note: text overlap with arXiv:2209.0828

    D-Separation for Causal Self-Explanation

    Rationalization is a self-explaining framework for NLP models. Conventional work typically uses the maximum mutual information (MMI) criterion to find the rationale that is most indicative of the target label. However, this criterion can be influenced by spurious features that correlate with the causal rationale or the target label. Instead of attempting to rectify the issues of the MMI criterion, we propose a novel criterion to uncover the causal rationale, termed the Minimum Conditional Dependence (MCD) criterion, which is grounded on our finding that the non-causal features and the target label are d-separated by the causal rationale. By minimizing the dependence between the unselected parts of the input and the target label conditioned on the selected rationale candidate, all the causes of the label are compelled to be selected. In this study, we employ a simple and practical measure of dependence, specifically the KL-divergence, to validate our proposed MCD criterion. Empirically, we demonstrate that MCD improves the F1 score by up to 13.7% compared to previous state-of-the-art MMI-based methods. Our code is available at: https://github.com/jugechengzi/Rationalization-MCD. Comment: NeurIPS 202
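
    As a rough illustration of using KL-divergence as the dependence measure, the sketch below compares a label distribution predicted from the full input with one predicted from a rationale alone. Treating that comparison as the MCD quantity is an assumption made for illustration, and the probabilities and names (`p_full`, `p_good`, `p_bad`) are hypothetical, not taken from the paper.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical predictor outputs over two labels (illustrative numbers only).
p_full = [0.90, 0.10]   # P(Y | full input X)
p_good = [0.88, 0.12]   # P(Y | rationale Z) when Z keeps the causal text
p_bad = [0.55, 0.45]    # P(Y | rationale Z) when Z misses the causal text

# A smaller value means the unselected text adds little information about Y
# once the rationale is known.
mcd_good = kl_divergence(p_full, p_good)
mcd_bad = kl_divergence(p_full, p_bad)
```

    Under these toy numbers, the rationale that keeps the causal text yields the smaller divergence, which is the direction the criterion rewards.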

    Controllable Textual Inversion for Personalized Text-to-Image Generation

    Recent large-scale generative modeling has attained unprecedented performance, especially in producing high-fidelity images driven by text prompts. Text inversion (TI), alongside the text-to-image model backbones, has been proposed as an effective technique for personalizing generation when the prompts contain user-defined, unseen or long-tail concept tokens. Despite that, we find and show that deploying TI still involves a good deal of "dark magic", including the harsh requirement of additional datasets, arduous human effort in the loop and a lack of robustness. In this work, we propose a much-enhanced version of TI, dubbed Controllable Textual Inversion (COTI), which resolves all the aforementioned problems and in turn delivers a robust, data-efficient and easy-to-use framework. The core of COTI is a theoretically guided loss objective instantiated with a comprehensive and novel weighted scoring mechanism, encapsulated in an active-learning paradigm. Extensive results show that COTI significantly outperforms prior TI-related approaches, with a 26.05 decrease in the FID score and a 23.00% boost in R-precision. Comment: 10 pages, 6 figures, 2 tables. Project Page: https://github.com/jnzju/COT

    DocAsRef: An Empirical Study on Repurposing Reference-Based Summary Quality Metrics Reference-Freely

    Automated summary quality assessment falls into two categories: reference-based and reference-free. Reference-based metrics, historically deemed more accurate due to the additional information provided by human-written references, are limited by their reliance on human input. In this paper, we hypothesize that the comparison methodologies used by some reference-based metrics to evaluate a system summary against its corresponding reference can be effectively adapted to assess it against its source document, thereby transforming these metrics into reference-free ones. Experimental results support this hypothesis. After being repurposed reference-freely, the zero-shot BERTScore using the pretrained DeBERTa-large-MNLI model of <0.5B parameters consistently outperforms its original reference-based version across various aspects on the SummEval and Newsroom datasets. It also excels in comparison to most existing reference-free metrics and closely competes with zero-shot summary evaluators based on GPT-3.5. Comment: Accepted into Findings of EMNLP 202
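
    The repurposing idea can be sketched with a BERTScore-style greedy matching in which the source document takes the place of the human reference. This is only a toy sketch: the two-dimensional vectors stand in for contextual token embeddings, and the function names are hypothetical rather than the paper's or any library's API.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def greedy_match_f1(summary_vecs, target_vecs):
    """Greedy-matching F1: each token is scored by its best match on the other side."""
    precision = sum(max(cosine(s, t) for t in target_vecs) for s in summary_vecs) / len(summary_vecs)
    recall = sum(max(cosine(t, s) for s in summary_vecs) for t in target_vecs) / len(target_vecs)
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# Toy token embeddings (assumed; a real metric would use a pretrained encoder).
source = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
summary = [[1.0, 0.0], [0.0, 1.0]]

# Reference-free use: the summary is compared against the source document itself.
score = greedy_match_f1(summary, source)
```

    The only change from the reference-based setting is the second argument: a human-written reference would be passed there instead of the source document.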

    FLAME: Differentially Private Federated Learning in the Shuffle Model

    Federated Learning (FL) is a promising machine learning paradigm that enables the analyzer to train a model without collecting users' raw data. To ensure users' privacy, differentially private federated learning has been intensively studied. Existing works are mainly based on the curator model or the local model of differential privacy, and both have pros and cons. The curator model allows greater accuracy but requires a trusted analyzer. In the local model, where users randomize local data before sending them to the analyzer, a trusted analyzer is not required but the accuracy is limited. In this work, by leveraging the privacy amplification effect in the recently proposed shuffle model of differential privacy, we achieve the best of both worlds, i.e., the accuracy of the curator model and strong privacy without relying on any trusted party. We first propose an FL framework in the shuffle model and a simple protocol (SS-Simple) extended from existing work. We find that SS-Simple provides an insufficient privacy amplification effect in FL because the dimension of the model parameters is quite large. To address this challenge, we propose an enhanced protocol (SS-Double) that increases the privacy amplification effect by subsampling. Furthermore, to boost utility when the model size is greater than the user population, we propose an advanced protocol (SS-Topk) with gradient sparsification techniques. We also provide theoretical analysis and numerical evaluations of the privacy amplification of the proposed protocols. Experiments on a real-world dataset validate that SS-Topk improves the testing accuracy by 60.7% over local-model-based FL. Comment: accepted by AAAI-2
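
    Gradient sparsification of the kind mentioned for SS-Topk is commonly done by keeping only the largest-magnitude coordinates of an update. The sketch below shows that generic top-k step only; it is not the SS-Topk protocol, and the function name and example values are hypothetical.

```python
def top_k_sparsify(gradient, k):
    """Keep the k largest-magnitude entries of a gradient and zero out the rest."""
    if k <= 0:
        return [0.0] * len(gradient)
    # Indices of the k entries with the largest absolute value.
    ranked = sorted(range(len(gradient)), key=lambda i: abs(gradient[i]), reverse=True)
    keep = set(ranked[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(gradient)]

# Example: only the two dominant coordinates survive.
sparse = top_k_sparsify([0.1, -2.0, 0.03, 1.5, -0.2], k=2)
```

    Fewer reported coordinates means fewer values that need to be randomized, which is the intuition behind pairing sparsification with a high-dimensional model.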

    PrivateRec: Differentially Private Training and Serving for Federated News Recommendation

    Privacy protection is an essential issue in personalized news recommendation, and federated learning can potentially mitigate the privacy concern by training personalized news recommendation models over decentralized user data. For a theoretical privacy guarantee, differential privacy is necessary. However, applying differential privacy to federated recommendation training and serving conventionally suffers from an unsatisfactory trade-off between privacy and utility due to the high-dimensional characteristics of model gradients and hidden representations. In addition, there is no formal privacy guarantee for both training and serving in federated recommendation. In this paper, we propose a unified federated news recommendation method for effective and privacy-preserving model training and online serving with differential privacy guarantees. We first clarify the notion of differential privacy over users' behavior data for both model training and online serving in the federated recommendation scenario. Next, we propose a privacy-preserving online serving mechanism under this definition with differentially private user interest decomposition. More specifically, it decomposes the high-dimensional and privacy-sensitive user embedding into a combination of public basic vectors and adds noise to the combination coefficients. In this way, it can avoid the curse of dimensionality and improve utility by reducing the noise intensity required for differential privacy. Besides, we design a federated recommendation model training method with differential privacy, which can avoid dimension-dependent noise for large models via label permutation and differentially private attention modules. Experiments on real-world news recommendation datasets validate the effectiveness of our method in achieving a good trade-off between privacy protection and utility for federated news recommendations.
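
    The decomposition step can be pictured as perturbing a handful of combination coefficients instead of every dimension of the user embedding. The sketch below uses the standard Laplace mechanism (noise scale = sensitivity / epsilon) as a stand-in; the choice of mechanism, the sensitivity value and all names are assumptions for illustration, not the paper's implementation.

```python
import random

def laplace_noise(scale, rng):
    """Laplace(0, scale) sample, drawn as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_user_embedding(coefficients, basis_vectors, sensitivity, epsilon, seed=None):
    """Add noise to the combination coefficients, then rebuild the embedding.

    basis_vectors are treated as public; only the coefficients are perturbed,
    so the number of noisy values equals len(coefficients), not the embedding size.
    """
    rng = random.Random(seed)
    scale = sensitivity / epsilon  # Laplace mechanism scale (assumed mechanism)
    noisy = [c + laplace_noise(scale, rng) for c in coefficients]
    dim = len(basis_vectors[0])
    return [sum(c * vec[d] for c, vec in zip(noisy, basis_vectors)) for d in range(dim)]

# Hypothetical setup: 2 public basis vectors spanning a 4-dimensional embedding.
basis = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
embedding = private_user_embedding([0.7, 0.3], basis, sensitivity=1.0, epsilon=1.0, seed=0)
```

    Here two values are randomized to produce a four-dimensional embedding; with a realistic embedding size the gap between the number of coefficients and the number of dimensions would be much larger.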

    Association of inflammatory indicators with intensive care unit mortality in critically ill patients with coronary heart disease.

    Objective: Coronary heart disease (CHD) is one of the major cardiovascular diseases, a common chronic disease in the elderly and a major cause of disability and death in the world. Currently, intensive care unit (ICU) patients have a high probability of concomitant coronary artery disease, and the mortality of this category of patients in the ICU is receiving increasing attention. Therefore, the aim of this study was to verify whether the composite inflammatory indicators are significantly associated with ICU mortality in ICU patients with CHD and to develop a simple personalized prediction model. Method: 7115 patients from the Multi-Parameter Intelligent Monitoring in Intensive Care Database IV were randomly assigned to the training cohort (n = 5692) and internal validation cohort (n = 1423), and 701 patients from the eICU Collaborative Research Database served as the external validation cohort. The association between various inflammatory indicators and ICU mortality was determined by multivariate logistic regression analysis and a Cox proportional hazards model. Subsequently, a novel predictive model for mortality in ICU patients with CHD was developed in the training cohort and its performance was evaluated in the internal and external validation cohorts. Results: Various inflammatory indicators were demonstrated to be significantly associated with ICU mortality, 30-day ICU mortality, and 90-day ICU mortality in ICU patients with CHD by logistic regression analysis and the Cox proportional hazards model. The area under the curve of the novel predictive model for ICU mortality in ICU patients with CHD was 0.885 for the internal validation cohort and 0.726 for the external validation cohort. The calibration curve showed that the predicted probabilities of the model matched the actual observed probabilities. Furthermore, the decision curve analysis showed that the novel prediction model had a high net clinical benefit. Conclusion: In ICU patients with CHD, various inflammatory indicators were independent risk factors for ICU mortality. We constructed a novel predictive model of ICU mortality risk in ICU patients with CHD that had great potential to guide clinical decision-making.

    Climate, not grazing, influences soil microbial diversity through changes in vegetation and abiotic factors on geographical patterns in the Eurasian steppe

    Livestock grazing has a significant impact on the biodiversity of natural grassland ecosystems, which is mainly regulated by climate factors. Soil microbes are essential components of biogeochemical cycles. However, the coupling effects of grazing with MAT (mean annual temperature) and MAP (mean annual precipitation) on soil microbial communities remain inconsistent. Our study treated the various climates of four grasslands as natural temperature and precipitation gradients combined with grazing intensity (GI). We collected and analyzed vegetation and soil physicochemical properties from the four grasslands. Our results showed that climate factors (CF) changed the β diversity of soil bacteria and fungi, while grazing intensity and the interaction of the two affected only fungal β diversity. Furthermore, climate factors and grazing intensity changed vegetation and soil physicochemical properties, with their interaction leading to changes in EC and MBC. Our analysis revealed that climate factors contributed 13.1% to bacterial community variation while grazing intensity contributed 3.01% to fungal community variation. Piecewise SEM analysis demonstrated that MAT and MAP were essential predictors of bacterial β diversity, which was significantly affected by vegetation and soil carbon and nitrogen. At the same time, MAP was an essential factor in fungal β diversity, which was mainly affected by soil nitrogen. Our study indicated that bacterial and fungal β diversity were affected by different environmental processes and can adapt to specific grazing intensities over time.