BERT Embeddings for Automatic Readability Assessment
Automatic readability assessment (ARA) is the task of evaluating the level of ease or difficulty of text documents for a target audience. One of the field's open problems is making models trained for the task effective even for low-resource languages. In this study, we propose an alternative way of utilizing the information-rich embeddings of BERT models together with handcrafted linguistic features through a combined method for readability assessment. Results show that the proposed method outperforms classical approaches on English and Filipino datasets, obtaining up to a 12.4% increase in F1 performance. We also show that the general information encoded in BERT embeddings can serve as a substitute feature set for low-resource languages like Filipino, which lack the semantic and syntactic NLP tools needed to explicitly extract feature values for the task.
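The combined method boils down to concatenating a contextual embedding with handcrafted features before classification. A minimal sketch follows; the three-dimensional placeholder embedding and the particular handcrafted features are illustrative assumptions, not the paper's actual encoder or feature set:

```python
def handcrafted_features(text: str) -> list[float]:
    """A few classic readability surrogates: word count, sentence count,
    and average word length (illustrative, not the paper's feature set)."""
    words = text.split()
    sentences = [s for s in text.replace("!", ".").replace("?", ".").split(".")
                 if s.strip()]
    avg_word_len = sum(len(w) for w in words) / max(len(words), 1)
    return [float(len(words)), float(len(sentences)), avg_word_len]

def combine(bert_embedding: list[float], text: str) -> list[float]:
    """Concatenate the contextual embedding with handcrafted features;
    the result feeds a downstream classifier."""
    return list(bert_embedding) + handcrafted_features(text)

# placeholder vector standing in for a real BERT [CLS] embedding
vec = combine([0.1, -0.2, 0.3], "Maria reads. She likes short books.")
```

In practice the embedding would come from a pre-trained BERT model (hundreds of dimensions) and the combined vector would be passed to a conventional classifier.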
Uniform Complexity for Text Generation
Large pre-trained language models have shown promising results in a wide array of tasks such as narrative generation, question answering, and machine translation. Likewise, recent work has focused heavily on controlling salient properties of generated texts, including sentiment, topic, and coherence, to produce more human-like outputs. In this work, we introduce Uniform Complexity for Text Generation (UCTG), a challenge that requires existing models to generate text whose complexity is uniform with respect to the inputs or prompts used. For example, if the reading level of an input prompt is appropriate for low-level learners (e.g., A2 in the CEFR), then the text generated by an NLG system should also assume this particular level for increased readability. In a controlled narrative generation task, we surveyed over 160 linguistic and cognitively motivated features for evaluating text readability and found that GPT-2 models, and even humans, struggle to preserve the linguistic complexity of the input prompts. Finally, we lay out potential methods and approaches that can be incorporated into the general framework of steering language models towards addressing this important challenge.
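The core UCTG measurement can be illustrated with a single readability proxy: score the prompt and the generation, and report the gap. The sketch below uses the classic Flesch reading-ease formula with a crude vowel-group syllable heuristic as a stand-in for the paper's 160-plus features:

```python
import re

def syllables(word: str) -> int:
    # crude heuristic: count vowel groups, minimum 1
    return max(len(re.findall(r"[aeiouy]+", word.lower())), 1)

def flesch(text: str) -> float:
    """Flesch reading ease: higher scores mean easier text."""
    words = re.findall(r"[A-Za-z']+", text)
    sents = max(len(re.findall(r"[.!?]+", text)), 1)
    syl = sum(syllables(w) for w in words)
    n = max(len(words), 1)
    return 206.835 - 1.015 * (n / sents) - 84.6 * (syl / n)

def complexity_gap(prompt: str, generation: str) -> float:
    """Absolute readability gap; UCTG asks generators to keep this small."""
    return abs(flesch(prompt) - flesch(generation))
```

A generator that satisfies UCTG would keep `complexity_gap` near zero between its input and its continuation.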
BasahaCorpus: An Expanded Linguistic Resource for Readability Assessment in Central Philippine Languages
Current research on automatic readability assessment (ARA) has focused on
improving the performance of models in high-resource languages such as English.
In this work, we introduce and release BasahaCorpus as part of an initiative
aimed at expanding available corpora and baseline models for readability
assessment in lower resource languages in the Philippines. We compiled a corpus
of short fictional narratives written in Hiligaynon, Minasbate, Karay-a, and
Rinconada -- languages belonging to the Central Philippine family tree subgroup
-- to train ARA models using surface-level, syllable-pattern, and n-gram
overlap features. We also propose a new hierarchical cross-lingual modeling
approach that takes advantage of a language's placement in the family tree to
increase the amount of available training data. Our study yields encouraging
results that support previous work showcasing the efficacy of cross-lingual
models in low-resource settings, as well as similarities in highly informative
linguistic features for mutually intelligible languages. Comment: Final camera-ready paper for EMNLP 2023 (Main)
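The hierarchical cross-lingual idea can be sketched as pooling training documents from languages that share an ancestor in the family tree, widening the pool as we move up the tree. The tree below is a simplified, hypothetical placement of the four languages, and the pooling function is an illustration, not the paper's exact training procedure:

```python
# Simplified family-tree placement: each language maps to its ancestor
# subgroups, ordered from nearest to farthest (illustrative only).
FAMILY_TREE = {
    "minasbate":  ["central_bisayan", "bisayan", "central_philippine"],
    "hiligaynon": ["central_bisayan", "bisayan", "central_philippine"],
    "karaya":     ["western_bisayan", "bisayan", "central_philippine"],
    "rinconada":  ["bikol", "central_philippine"],
}

def pooled_training_data(target, corpora, level):
    """Return the target corpus plus corpora of languages sharing an
    ancestor within the first `level` steps up the tree."""
    ancestors = FAMILY_TREE[target][:level]
    pool = list(corpora.get(target, []))
    for lang, docs in corpora.items():
        if lang != target and any(a in FAMILY_TREE[lang][:level] for a in ancestors):
            pool.extend(docs)
    return pool

# toy one-document corpora per language
corpora = {"minasbate": ["m1"], "hiligaynon": ["h1"],
           "karaya": ["k1"], "rinconada": ["r1"]}
```

At level 1, Minasbate pools only with Hiligaynon (shared Central Bisayan subgroup); at level 3, all Central Philippine languages contribute, which is how relatedness increases the available training data.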
Under the Microscope: Interpreting Readability Assessment Models for Filipino
Readability assessment is the process of identifying the level of ease or difficulty of a certain piece of text for its intended audience. Approaches have evolved from arithmetic formulas to more complex pattern-recognizing models trained with machine learning algorithms. While these approaches provide competitive results, limited work has been done on quantitatively analyzing how linguistic variables affect model inference. In this work, we dissect machine learning-based readability assessment models for Filipino by performing global and local model interpretation to understand the contributions of different linguistic features, and we discuss the implications in the context of the Filipino language. Results show that a model trained with the top features from global interpretation obtains higher performance than one using features selected by Spearman correlation. Likewise, we empirically observe local feature-weight boundaries that discriminate reading difficulty at an extremely fine-grained level, and the corresponding effects when feature values are perturbed.
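Global interpretation asks how much a single feature contributes to model quality. One simple stand-in for the dedicated interpretation tools used in such work is ablation importance: neutralize one feature (here, replace it with its column mean) and measure the accuracy drop. The toy model and data below are hypothetical:

```python
def accuracy(predict, X, y):
    return sum(predict(x) == t for x, t in zip(X, y)) / len(y)

def ablation_importance(predict, X, y, j):
    """Accuracy drop when feature j is replaced by its column mean --
    a simple global-interpretation stand-in, not the paper's method."""
    mean_j = sum(x[j] for x in X) / len(X)
    X_ablated = [x[:j] + [mean_j] + x[j + 1:] for x in X]
    return accuracy(predict, X, y) - accuracy(predict, X_ablated, y)

# toy readability model: label a text "hard" (1) when the first feature
# (say, average word length) exceeds 5; the second feature is ignored
predict = lambda x: 1 if x[0] > 5 else 0
X = [[6.2, 10.0], [4.1, 12.0], [7.0, 8.0], [3.5, 9.0]]
y = [1, 0, 1, 0]
```

Ranking features by this importance score is one way to pick the "top features from global interpretation" that the abstract compares against correlation-based selection.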
Automatic Readability Assessment for Closely Related Languages
In recent years, the main focus of research on automatic readability
assessment (ARA) has shifted towards using expensive deep learning-based
methods with the primary goal of increasing models' accuracy. This, however, is
rarely applicable for low-resource languages where traditional handcrafted
features are still widely used due to the lack of existing NLP tools to extract
deeper linguistic representations. In this work, we take a step back from the
technical component and focus on how linguistic aspects such as mutual
intelligibility or degree of language relatedness can improve ARA in a
low-resource setting. We collect short stories written in three languages in
the Philippines (Tagalog, Bikol, and Cebuano) to train readability assessment
models and explore the interaction of data and features in various
cross-lingual setups. Our results show that the inclusion of CrossNGO, a novel
specialized feature exploiting n-gram overlap applied to languages with high
mutual intelligibility, significantly improves the performance of ARA models
compared to the use of off-the-shelf large multilingual language models alone.
Consequently, when both linguistic representations are combined, we achieve
state-of-the-art results for Tagalog and Cebuano, and baseline scores for ARA
in Bikol. Comment: Camera-ready version for ACL 2023
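A feature exploiting n-gram overlap between mutually intelligible languages can be sketched as a Jaccard similarity over character bigrams. This is a rough stand-in for CrossNGO, not the paper's exact formulation; "bahay" (Tagalog) and "balay" (Cebuano/Bikol) are real cognates used for illustration:

```python
def char_bigrams(text: str) -> set[str]:
    """Character bigrams of the text with whitespace removed."""
    t = "".join(text.lower().split())
    return {t[i:i + 2] for i in range(len(t) - 1)}

def ngram_overlap(a: str, b: str) -> float:
    """Jaccard overlap of character bigrams -- a simple proxy for
    n-gram overlap between closely related languages."""
    ga, gb = char_bigrams(a), char_bigrams(b)
    return len(ga & gb) / max(len(ga | gb), 1)
```

Cognate-rich pairs score well above unrelated strings, which is the signal such a feature contributes when the languages share high mutual intelligibility.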
A Simple Post-Processing Technique for Improving Readability Assessment of Texts using Word Mover's Distance
Assessing the proper difficulty levels of reading materials or texts in
general is the first step towards effective comprehension and learning. In this
study, we improve the conventional methodology of automatic readability
assessment by incorporating the Word Mover's Distance (WMD) of ranked texts as
an additional post-processing technique to further ground the difficulty level
given by a model. Results of our experiments on three multilingual datasets in
Filipino, German, and English show that the post-processing technique
outperforms previous vanilla and ranking-based models using SVM.
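The post-processing step can be sketched as snapping a model's predicted level toward the neighbouring level whose reference texts lie closest to the input. Real Word Mover's Distance needs word embeddings (e.g. gensim's `KeyedVectors.wmdistance`); the sketch below substitutes a Jaccard token distance so it stays self-contained, and the reference texts are hypothetical:

```python
def bow_distance(a: str, b: str) -> float:
    """Jaccard token distance -- a self-contained stand-in for Word
    Mover's Distance, which would require word embeddings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return 1.0 - len(ta & tb) / max(len(ta | tb), 1)

def post_process(text, predicted_level, references, distance=bow_distance):
    """Re-ground the predicted level by moving it, at most one step, to
    the candidate level whose reference texts are closest to `text`."""
    candidates = [c for c in (predicted_level - 1, predicted_level,
                              predicted_level + 1) if c in references]
    def avg_dist(level):
        refs = references[level]
        return sum(distance(text, r) for r in refs) / len(refs)
    return min(candidates, key=avg_dist)

# toy ranked reference texts per difficulty level
references = {
    1: ["the cat sat on the mat"],
    2: ["photosynthesis converts light energy into chemical energy"],
}
```

If the classifier over- or under-shoots by one level, the distance to the ranked reference texts pulls the final label back toward the better-grounded level.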