
771 research outputs found

GLIMPSED: Improving natural language processing with gaze data

Author: Klerke Sigrid
Publication venue: Det Humanistiske Fakultet, Københavns Universitet
Publication date: 01/01/2016

Copenhagen University Research Information System

Machine Learning for Readability Assessment and Text Simplification in Crisis Communication: A Systematic Review

Authors: Hansen Hieronymus; Hellingrath Bernd; Ponge Johannes; Widera Adam
Publication venue: AIS Electronic Library (AISeL)
Publication date: 04/01/2021

In times of social media, crisis managers can interact with citizens in a variety of ways. Since machine learning has already been used to classify messages from the population, the question is whether such technologies can also play a role in the creation of messages from crisis managers to the population. This paper focuses on exploratory research into selected machine learning solutions for crisis communication. We present systematic literature reviews of readability assessment and text simplification. Our research suggests that readability assessment has the potential for effective use in crisis communication, but there is a lack of sufficient training data. This also applies to text simplification, where an exact assessment is only partly possible due to unreliable or non-existent training data and validation measures.

ScholarSpace at University of Hawai'i at Manoa

AIS Electronic Library (AISeL)

Is this sentence difficult? Do you agree?

Authors: Brunato Dominique; De Mattei Lorenzo; Dell'Orletta Felice; Iavarone Benedetta; Venturi Giulia
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2018

In this paper, we present a crowdsourcing-based approach to model the human perception of sentence complexity. We collect a large corpus of sentences rated with judgments of complexity for two typologically different languages, Italian and English. We test our approach in two experimental scenarios aimed at investigating the contribution of a wide set of lexical, morpho-syntactic and syntactic phenomena in predicting i) the degree of agreement among annotators independently of the assigned judgment and ii) the perception of sentence complexity.

Archivio istituzionale della Ricerca - Scuola Normale Superiore

BERT Embeddings for Automatic Readability Assessment

Author: Imperial Joseph Marvin
Publication date: 15/06/2021

Automatic readability assessment (ARA) is the task of evaluating the level of ease or difficulty of text documents for a target audience. For researchers, one of the many open problems in the field is making models trained for the task effective even for low-resource languages. In this study, we propose an alternative way of utilizing the information-rich embeddings of BERT models with handcrafted linguistic features through a combined method for readability assessment. Results show that the proposed method outperforms classical approaches in readability assessment using English and Filipino datasets, with an increase of up to 12.4% in F1 performance. We also show that the general information encoded in BERT embeddings can be used as a substitute feature set for low-resource languages like Filipino, which have limited semantic and syntactic NLP tools for explicitly extracting feature values for the task.

OPUS