COMPENDIUM: a text summarisation tool for generating summaries of multiple purposes, domains, and genres
In this paper, we present a Text Summarisation tool, compendium, capable of generating the most common types of summaries. Regarding the input, single- and multi-document summaries can be produced; as the output, the summaries can be extractive or abstractive-oriented; and finally, concerning their purpose, the summaries can be generic, query-focused, or sentiment-based. The proposed architecture for compendium is divided into various stages, making a distinction between core and additional stages. The former constitute the backbone of the tool and are common to the generation of any type of summary, whereas the latter are used for enhancing the capabilities of the tool. The main contributions of compendium with respect to the state-of-the-art summarisation systems are that (i) it specifically deals with the problem of redundancy, by means of textual entailment; (ii) it combines statistical and cognitive-based techniques for determining relevant content; and (iii) it proposes an abstractive-oriented approach for facing the challenge of abstractive summarisation. The evaluation performed in different domains and textual genres, comprising traditional texts, as well as texts extracted from the Web 2.0, shows that compendium is very competitive and appropriate to be used as a tool for generating summaries. This research has been supported by the project “Desarrollo de Técnicas Inteligentes e Interactivas de Minería de Textos” (PROMETEO/2009/119) and the project reference ACOMP/2011/001 from the Valencian Government, as well as by the Spanish Government (grant no. TIN2009-13391-C04-01).
MABEL: Attenuating Gender Bias using Textual Entailment Data
Pre-trained language models encode undesirable social biases, which are
further exacerbated in downstream use. To this end, we propose MABEL (a Method
for Attenuating Gender Bias using Entailment Labels), an intermediate
pre-training approach for mitigating gender bias in contextualized
representations. Key to our approach is the use of a contrastive learning
objective on counterfactually augmented, gender-balanced entailment pairs from
natural language inference (NLI) datasets. We also introduce an alignment
regularizer that pulls identical entailment pairs along opposite gender
directions closer. We extensively evaluate our approach on intrinsic and
extrinsic metrics, and show that MABEL outperforms previous task-agnostic
debiasing approaches in terms of fairness. It also preserves task performance
after fine-tuning on downstream tasks. Together, these findings demonstrate the
suitability of NLI data as an effective means of bias mitigation, in contrast to
prior work that uses only unlabeled sentences. Finally, we identify that
existing approaches often use evaluation settings that are insufficient or
inconsistent. We make an effort to reproduce and compare previous methods, and
call for unifying the evaluation settings across gender debiasing methods for
better future comparison. Comment: Accepted to EMNLP 2022. Code and models are publicly available at
https://github.com/princeton-nlp/mabe
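The counterfactual augmentation step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the word list `GENDER_PAIRS` and the helpers `swap_gender` and `augment` are hypothetical names chosen for illustration, and the sketch covers only the creation of gender-swapped entailment pairs, not the contrastive objective or the alignment regularizer.

```python
import re

# Hypothetical, deliberately tiny word list (an assumption for illustration);
# a real system would rely on a much larger curated lexicon.
GENDER_PAIRS = [("he", "she"), ("man", "woman"), ("father", "mother"), ("boy", "girl")]

# Build a lookup that maps each word to its counterpart in both directions.
SWAP = {}
for a, b in GENDER_PAIRS:
    SWAP[a] = b
    SWAP[b] = a


def swap_gender(sentence):
    """Replace each listed gendered word with its counterpart, keeping capitalisation."""
    def repl(match):
        word = match.group(0)
        swapped = SWAP.get(word.lower())
        if swapped is None:
            return word  # not a listed gendered word; leave unchanged
        return swapped.capitalize() if word[0].isupper() else swapped

    # Match whole alphabetic tokens so that "human" is not treated as "man".
    return re.sub(r"[A-Za-z]+", repl, sentence)


def augment(premise, hypothesis):
    """Return the original entailment pair together with its gender-swapped counterpart."""
    original = (premise, hypothesis)
    counterfactual = (swap_gender(premise), swap_gender(hypothesis))
    return original, counterfactual


if __name__ == "__main__":
    print(augment("A man is playing a guitar.", "He is making music."))
```

In this sketch each entailment pair yields one counterfactual pair, so the resulting data is balanced across the listed gendered terms; ambiguous words such as "her" are left out on purpose because swapping them needs part-of-speech information.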
DeepEval: An Integrated Framework for the Evaluation of Student Responses in Dialogue Based Intelligent Tutoring Systems
The automatic assessment of student answers is one of the critical components of an Intelligent Tutoring System (ITS) because accurate assessment of student input is needed in order to provide effective feedback that leads to learning. But this is a very challenging task because it requires natural language understanding capabilities. The process requires various components, such as concept identification, co-reference resolution, and ellipsis handling. As part of this thesis, we thoroughly analyzed a set of student responses obtained from an experiment with the intelligent tutoring system DeepTutor in which college students interacted with the tutor to solve conceptual physics problems, designed an automatic answer assessment framework (DeepEval), and evaluated the framework after implementing several important components. To evaluate our system, we annotated 618 responses from 41 students for correctness. Our system performs better than the typical similarity calculation method. We also discuss various issues in automatic answer evaluation.
"Beware of deception": Detecting Half-Truth and Debunking it through Controlled Claim Editing
The prevalence of half-truths, which are statements containing some truth but
that are ultimately deceptive, has risen with the increasing use of the
internet. To help combat this problem, we have created a comprehensive pipeline
consisting of a half-truth detection model and a claim editing model. Our
approach utilizes the T5 model for controlled claim editing; "controlled" here
means precise adjustments to select parts of a claim. Our methodology achieves
an average BLEU score of 0.88 (on a scale of 0-1) and a disinfo-debunk score of
85% on edited claims. Significantly, our T5-based approach outperforms other
Language Models such as GPT2, RoBERTa, PEGASUS, and Tailor, with average
improvements of 82%, 57%, 42%, and 23% in disinfo-debunk scores, respectively.
By extending the LIAR PLUS dataset, we achieve an F1 score of 82% for the
half-truth detection model, setting a new benchmark in the field. While
previous attempts have been made at half-truth detection, our approach is, to
the best of our knowledge, the first to attempt to debunk half-truths
- …