85 research outputs found

    Data-driven sentence simplification: Survey and benchmark

    Sentence Simplification (SS) aims to modify a sentence in order to make it easier to read and understand. To do so, several rewriting transformations can be performed, such as replacement, reordering, and splitting. Executing these transformations while keeping sentences grammatical, preserving their main idea, and generating simpler output is a challenging and still far from solved problem. In this article, we survey research on SS, focusing on approaches that attempt to learn how to simplify using corpora of aligned original-simplified sentence pairs in English, which is the dominant paradigm nowadays. We also include a benchmark of different approaches on common datasets so as to compare them and highlight their strengths and limitations. We expect that this survey will serve as a starting point for researchers interested in the task and help spark new ideas for future developments.
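
The rewriting transformations the abstract names can be illustrated with a deliberately minimal sketch. This is not the surveyed data-driven approach (which learns transformations from aligned corpora); the synonym table and splitting rule below are toy stand-ins, assumed purely for illustration.

```python
import re

# Toy synonym table standing in for a learned lexical-substitution model.
SYNONYMS = {"commence": "start", "purchase": "buy", "utilise": "use"}

def replace_words(sentence: str) -> str:
    """Replacement: swap complex words for simpler synonyms."""
    return " ".join(SYNONYMS.get(w.lower(), w) for w in sentence.split())

def split_sentence(sentence: str) -> list[str]:
    """Splitting: break a clause joined by ', and' into two sentences."""
    parts = re.split(r",\s+and\s+", sentence, maxsplit=1)
    if len(parts) == 2:
        return [parts[0].rstrip(".") + ".", parts[1][0].upper() + parts[1][1:]]
    return [sentence]

original = "We will commence the project, and we will purchase new equipment."
simplified = [replace_words(s) for s in split_sentence(original)]
```

A learned system would apply the same kinds of operations, but would induce them from original-simplified sentence pairs rather than from hand-written rules.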

    Multistage BiCross encoder for multilingual access to COVID-19 health information

    The Coronavirus (COVID-19) pandemic has led to a rapidly growing ‘infodemic’ of health information online. This has motivated the need for accurate semantic search and retrieval of reliable COVID-19 information across millions of documents, in multiple languages. To address this challenge, this paper proposes a novel high-precision, high-recall neural Multistage BiCross encoder approach. It is a sequential three-stage ranking pipeline that uses the Okapi BM25 retrieval algorithm together with transformer-based bi-encoder and cross-encoder models to effectively rank the documents with respect to the given query. We present experimental results from our participation in the Multilingual Information Access (MLIA) shared task on COVID-19 multilingual semantic search. The independently evaluated MLIA results validate our approach and demonstrate that it outperforms other state-of-the-art approaches according to nearly all evaluation metrics in both monolingual and bilingual runs.
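
The three-stage funnel described above can be sketched as follows. The BM25 stage is a minimal but faithful implementation of the Okapi formula; the bi-encoder and cross-encoder stages are stood in for by a simple token-overlap scorer, since the real pipeline uses transformer models that are out of scope for a short sketch. All document and query strings are hypothetical.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Stage 1: classic Okapi BM25 lexical scoring over the whole corpus."""
    tokenized = [d.lower().split() for d in docs]
    N = len(docs)
    avgdl = sum(len(d) for d in tokenized) / N
    df = Counter(t for d in tokenized for t in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        s = 0.0
        for q in query.lower().split():
            if q not in tf:
                continue
            idf = math.log(1 + (N - df[q] + 0.5) / (df[q] + 0.5))
            s += idf * tf[q] * (k1 + 1) / (tf[q] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def overlap_score(query, doc):
    """Stand-in for the neural bi-/cross-encoder stages: token overlap.
    In the real pipeline these are transformer models, not this heuristic."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def multistage_rank(query, docs, k_lexical=3, k_rerank=2):
    # Stage 1: BM25 narrows the corpus to the top-k lexical candidates.
    s1 = bm25_scores(query, docs)
    stage1 = sorted(range(len(docs)), key=lambda i: -s1[i])[:k_lexical]
    # Stage 2: "bi-encoder" re-scores each candidate independently.
    stage2 = sorted(stage1, key=lambda i: -overlap_score(query, docs[i]))[:k_rerank]
    # Stage 3: "cross-encoder" jointly scores query-document pairs for final order.
    return sorted(stage2, key=lambda i: -overlap_score(query, docs[i]))

docs = ["covid symptoms include fever and cough",
        "football league results announced",
        "new vaccine trial for covid begins"]
ranked = multistage_rank("covid symptoms", docs)
```

The design point is the funnel itself: each stage is more expensive but more accurate than the last, so it only ever sees the candidates the previous stage let through.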

    MTCue: learning zero-shot control of extra-textual attributes by leveraging unstructured context in neural machine translation

    Efficient utilisation of both intra- and extra-textual context remains one of the critical gaps between machine and human translation. Existing research has primarily focused on providing individual, well-defined types of context in translation, such as the surrounding text or discrete external variables like the speaker's gender. This work introduces MTCUE, a novel neural machine translation (NMT) framework that interprets all context (including discrete variables) as text. MTCUE learns an abstract representation of context, enabling transferability across different data settings and leveraging similar attributes in low-resource scenarios. With a focus on a dialogue domain with access to document and metadata context, we extensively evaluate MTCUE on four language pairs in both translation directions. Our framework demonstrates significant improvements in translation quality over a parameter-matched non-contextual baseline, as measured by BLEU (+0.88) and COMET (+1.58). Moreover, MTCUE significantly outperforms a “tagging” baseline at translating English text. Analysis reveals that the context encoder of MTCUE learns a representation space that organises context based on specific attributes, such as formality, enabling effective zero-shot control. Pretraining on context embeddings also improves MTCUE's few-shot performance compared to the “tagging” baseline. Finally, an ablation study conducted on model components and contextual variables further supports the robustness of MTCUE for context-based NMT.

    Measuring what counts : the case of rumour stance classification

    Stance classification can be a powerful tool for understanding whether and which users believe in online rumours. The task aims to automatically predict the stance of replies towards a given rumour, namely support, deny, question, or comment. Numerous methods have been proposed and their performance compared in the RumourEval shared tasks in 2017 and 2019. Results demonstrated that this is a challenging problem since naturally occurring rumour stance data is highly imbalanced. This paper specifically questions the evaluation metrics used in these shared tasks. We re-evaluate the systems submitted to the two RumourEval tasks and show that the two widely adopted metrics – accuracy and macro-F1 – are not robust for the four-class imbalanced task of rumour stance classification, as they wrongly favour systems whose accuracy is highly skewed towards the majority class. To overcome this problem, we propose new evaluation metrics for rumour stance detection. These are not only robust to imbalanced data but also assign higher scores to systems that can recognise the two most informative minority classes (support and deny).
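
The failure mode the abstract describes is easy to reproduce numerically. The sketch below uses a made-up class distribution (not the actual RumourEval figures) to show how a trivial system that always predicts the majority "comment" class earns a high accuracy despite learning nothing about the informative minority classes.

```python
def f1(y_true, y_pred, label):
    """Per-class F1 from scratch, to keep the demo dependency-free."""
    tp = sum(t == p == label for t, p in zip(y_true, y_pred))
    fp = sum(p == label and t != label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    prec, rec = tp / (tp + fp), tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)

def macro_f1(y_true, y_pred, labels):
    return sum(f1(y_true, y_pred, l) for l in labels) / len(labels)

# Hypothetical imbalanced stance distribution: 'comment' dominates.
labels = ["support", "deny", "query", "comment"]
y_true = ["support"] * 10 + ["deny"] * 5 + ["query"] * 10 + ["comment"] * 75
majority = ["comment"] * 100  # a trivial majority-class "system"

acc = sum(t == p for t, p in zip(y_true, majority)) / len(y_true)
mf1 = macro_f1(y_true, majority, labels)
```

Here the majority-class baseline reaches 75% accuracy while scoring zero F1 on support, deny, and query, which is exactly why the paper argues for metrics that reward recognising the minority classes.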

    UTDRM: unsupervised method for training debunked-narrative retrieval models

    A key task in the fact-checking workflow is to establish whether the claim under investigation has already been debunked or fact-checked before. This is essentially a retrieval task where a misinformation claim is used as a query to retrieve from a corpus of debunks. Prior debunk retrieval methods have typically been trained on annotated pairs of misinformation claims and debunks. The novelty of this paper is an Unsupervised Method for Training Debunked-Narrative Retrieval Models (UTDRM) in a zero-shot setting, eliminating the need for human-annotated pairs. This approach leverages fact-checking articles for the generation of synthetic claims and employs a neural retrieval model for training. Our experiments show that UTDRM tends to match or exceed the performance of state-of-the-art methods on seven datasets, which demonstrates its effectiveness and broad applicability. The paper also analyses the impact of various factors on UTDRM’s performance, such as the quantity of fact-checking articles used, the number of synthetically generated claims employed, the proposed entity inoculation method, and the use of large language models for retrieval.
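
One plausible reading of the training-data construction can be sketched as follows. This is only an assumed outline of the high-level idea (synthetic claims generated from fact-checking articles, paired with their source debunks): the template-based `synthetic_claims` function stands in for the large language model the paper actually uses, and all article strings are hypothetical.

```python
def synthetic_claims(fact_check_title: str) -> list[str]:
    """Generate claim-like variants from a fact-checking article title.
    Simple templates here; the paper uses LLM-generated claims instead."""
    core = fact_check_title.removeprefix("Fact check: ").rstrip(".")
    return [core + ".", "Is it true that " + core.lower() + "?"]

def training_pairs(articles: list[str]) -> list[tuple[str, str]]:
    """Pair each synthetic claim with its source debunk as a positive;
    other debunks in a training batch would serve as negatives."""
    return [(claim, article) for article in articles
            for claim in synthetic_claims(article)]

articles = ["Fact check: Garlic cures COVID-19"]
pairs = training_pairs(articles)
```

A neural retriever trained on such (claim, debunk) pairs never needs human-annotated claim-debunk alignments, which is the point of the unsupervised setup.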

    Testing the performance of an innovative markerless technique for quantitative and qualitative gait analysis

    Gait abnormalities such as high stride and step frequency/cadence (SF, strides/second; CAD, steps/second), stride variability (SV) and low harmony may increase the risk of injuries and be a sentinel of medical conditions. This research aims to present a new markerless video-based technology for quantitative and qualitative gait analysis. 86 healthy individuals (mean age 32 years) performed a 90 s test on a treadmill at self-selected walking speed. We measured SF and CAD with a photoelectric sensor system; then, we calculated the average ± standard deviation (SD) and within-subject coefficient of variation (CV) of SF as an index of SV. We also recorded a 60 fps video of the patient. With custom-designed web-based video analysis software, we performed a spectral analysis of the brightness over time for each pixel of the image, which recovered the frequency content of the video. The two main frequency contents (F1 and F2) from this analysis should reflect the forcing/dominant variables, i.e., SF and CAD. A harmony index (HI) was then calculated, which should reflect the proportion of pixels of the image that move consistently with F1 or its supraharmonics. The higher the HI value, the less variable the gait. The correspondence SF-F1 and CAD-F2 was evaluated with both a paired t-test and correlation analysis, and the relationship between SV and HI with correlation analysis. SF and CAD were not significantly different from, and highly correlated with, F1 (0.893 ± 0.080 Hz vs. 0.895 ± 0.084 Hz, p < 0.001, r2 = 0.99) and F2 (1.787 ± 0.163 Hz vs. 1.791 ± 0.165 Hz, p < 0.001, r2 = 0.97). The SV was 1.84% ± 0.66% and it was significantly and moderately correlated with HI (0.082 ± 0.028, p < 0.001, r2 = 0.13). The innovative video-based technique of global, markerless gait analysis proposed in our study accurately identifies the main frequency contents and the variability of gait in healthy individuals, thus providing a time-efficient, low-cost means of quantitatively and qualitatively studying human locomotion.
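
The core signal-processing step, extracting a dominant frequency from a pixel's brightness over time, can be sketched with a synthetic signal. The 0.9 Hz oscillation and noise level below are made up for the demonstration; only the recording parameters (90 s at 60 fps) come from the abstract.

```python
import numpy as np

# Synthetic stand-in for one pixel's brightness across a 90 s, 60 fps video:
# a gait-like oscillation at 0.9 Hz (the stride frequency) plus noise.
fps, duration, f_stride = 60, 90, 0.9
t = np.arange(fps * duration) / fps
rng = np.random.default_rng(0)
brightness = np.sin(2 * np.pi * f_stride * t) + 0.2 * rng.standard_normal(t.size)

# Spectral analysis: the dominant peak of the brightness spectrum should
# recover the forcing frequency, as the article computes per pixel.
spectrum = np.abs(np.fft.rfft(brightness - brightness.mean()))
freqs = np.fft.rfftfreq(brightness.size, d=1 / fps)
f_dominant = freqs[spectrum[1:].argmax() + 1]  # skip the DC bin
```

With a 90 s recording the frequency resolution is 1/90 Hz, comfortably fine enough to separate a ~0.9 Hz stride frequency (F1) from its ~1.8 Hz step-frequency counterpart (F2).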

    The (un)suitability of automatic evaluation metrics for text simplification

    In order to simplify sentences, several rewriting operations can be performed, such as replacing complex words with simpler synonyms, deleting unnecessary information, and splitting long sentences. Despite this multi-operation nature, evaluation of automatic simplification systems relies on metrics that only moderately correlate with human judgements of the simplicity achieved by executing specific operations (e.g. simplicity gain based on lexical replacements). In this article, we investigate how well existing metrics can assess sentence-level simplifications where multiple operations may have been applied and which, therefore, require more general simplicity judgements. For that, we first collect a new and more reliable dataset for evaluating the correlation of metrics and human judgements of overall simplicity. Second, we conduct the first meta-evaluation of automatic metrics in Text Simplification, using our new dataset (and other existing data) to analyse the variation of the correlation between metrics’ scores and human judgements across three dimensions: the perceived simplicity level, the system type and the set of references used for computation. We show that these three aspects affect the correlations and, in particular, highlight the limitations of commonly-used operation-specific metrics. Finally, based on our findings, we propose a set of recommendations for automatic evaluation of multi-operation simplifications, suggesting which metrics to compute and how to interpret their scores.
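
The meta-evaluation described above boils down to correlating metric scores with human ratings. A minimal, dependency-free Pearson correlation makes the measurement concrete; the scores and ratings below are invented for the illustration and are not from the paper's data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between metric scores and human ratings."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical data: one automatic metric's scores for five system outputs,
# against human overall-simplicity ratings for the same outputs.
metric_scores = [0.2, 0.5, 0.4, 0.9, 0.7]
human_ratings = [1.0, 3.0, 2.0, 4.0, 5.0]
r = pearson_r(metric_scores, human_ratings)
```

The meta-evaluation then slices such correlations by simplicity level, system type, and reference set to see where a metric's agreement with humans breaks down.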

    ASSET : a dataset for tuning and evaluation of sentence simplification models with multiple rewriting transformations

    In order to simplify a sentence, human editors perform multiple rewriting transformations: they split it into several shorter sentences, paraphrase words (i.e. replace complex words or phrases with simpler synonyms), reorder components, and/or delete information deemed unnecessary. Despite this varied range of possible text alterations, current models for automatic sentence simplification are evaluated using datasets that are focused on a single transformation, such as lexical paraphrasing or splitting. This makes it impossible to understand the ability of simplification models in more realistic settings. To alleviate this limitation, this paper introduces ASSET, a new dataset for assessing sentence simplification in English. ASSET is a crowdsourced multi-reference corpus where each simplification was produced by executing several rewriting transformations. Through quantitative and qualitative experiments, we show that simplifications in ASSET are better at capturing characteristics of simplicity when compared to other standard evaluation datasets for the task. Furthermore, we motivate the need for developing better methods for automatic evaluation using ASSET, since we show that current popular metrics may not be suitable when multiple simplification transformations are performed.
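
Multi-reference evaluation, the setting ASSET enables, typically scores a system output against the closest of several valid references. The sketch below uses a simple token-level F1 as a stand-in for real simplification metrics (which are more elaborate); the sentences are hypothetical examples.

```python
def token_f1(hyp: str, ref: str) -> float:
    """Token-overlap F1 between a hypothesis and a single reference."""
    h, r = hyp.lower().split(), ref.lower().split()
    common = sum(min(h.count(t), r.count(t)) for t in set(h))
    if common == 0:
        return 0.0
    prec, rec = common / len(h), common / len(r)
    return 2 * prec * rec / (prec + rec)

def multi_ref_score(hyp: str, refs: list[str]) -> float:
    """Score against the closest of several references, so that any of the
    valid rewriting transformations can be rewarded."""
    return max(token_f1(hyp, r) for r in refs)

# Two references that simplified the same source in different valid ways.
refs = ["The cat sat on the mat.", "A cat was sitting on the mat."]
score = multi_ref_score("The cat sat on the mat.", refs)
```

With a single reference, an output that applied a different (but equally valid) transformation would be unfairly penalised; taking the maximum over multiple references is the standard mitigation.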

    Toxic language detection in social media for Brazilian Portuguese : new dataset and multilingual analysis

    Hate speech and toxic comments are a common concern of social media platform users. Although these comments are, fortunately, the minority on these platforms, they are still capable of causing harm. Therefore, identifying these comments is an important task for studying and preventing the proliferation of toxicity in social media. Previous work in automatically detecting toxic comments focuses mainly on English, with very little work on languages such as Brazilian Portuguese. In this paper, we propose a new large-scale dataset for Brazilian Portuguese with tweets annotated as either toxic or non-toxic, with toxic tweets further labelled by type of toxicity. We present our dataset collection and annotation process, where we aimed to select candidates covering multiple demographic groups. State-of-the-art BERT models were able to achieve a 76% macro-F1 score using monolingual data in the binary case. We also show that large-scale monolingual data is still needed to create more accurate models, despite recent advances in multilingual approaches. An error analysis and experiments with multi-label classification show the difficulty of classifying certain types of toxic comments that appear less frequently in our data and highlight the need to develop models that are aware of different categories of toxicity.
