Paronyms for Accelerated Correction of Semantic Errors
* Work done under partial support of the Mexican Government (CONACyT, SNI), IPN (CGPI, COFAA), and the Korean Government (KIPA
Professorship for Visiting Faculty Positions). The second author is currently on sabbatical leave at Chung-Ang University.
The errors usually made by authors during text preparation are classified. The notion of semantic
errors is elaborated, and malapropisms are singled out among them as words that are “similar” to the intended
word but essentially distort the meaning of the text. Whatever the method of malapropism correction, we propose
compiling dictionaries of paronyms in advance, i.e., of words similar to each other in letters, sounds, or morphs.
The proposed classification of errors and paronyms is illustrated by English and Russian examples that are valid
for many languages. Specific dictionaries of literal and morphemic paronyms are compiled for Russian. It is
shown that literal paronyms cut down the search for correction candidates drastically (by a factor of up to 360), while
morphemic paronyms make it possible to correct errors that have not been studied so far and are characteristic of foreigners.
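The idea of a precompiled dictionary of literal paronyms can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the abstract: two words count as literal paronyms when they differ by exactly one letter substitution, insertion, or deletion, and the function names and toy vocabulary are hypothetical. The authors' actual similarity criteria and dictionaries may differ.

```python
from itertools import combinations


def is_one_edit_apart(a, b):
    """True if b differs from a by one letter substitution, insertion, or deletion."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) word
    # Skip the common prefix.
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # exactly one substituted letter
    return a[i:] == b[i + 1:]  # exactly one extra letter in b


def build_literal_paronyms(vocabulary):
    """Precompute, for every word, the vocabulary words one letter edit away."""
    paronyms = {word: set() for word in vocabulary}
    for a, b in combinations(sorted(set(vocabulary)), 2):
        if is_one_edit_apart(a, b):
            paronyms[a].add(b)
            paronyms[b].add(a)
    return paronyms


# Toy vocabulary, assumed for illustration only.
vocabulary = ["massage", "message", "passage", "hysterics", "histories"]
paronyms = build_literal_paronyms(vocabulary)

# Correction candidates for a suspected malapropism are a dictionary lookup,
# so the whole vocabulary need not be scanned at correction time.
print(sorted(paronyms["massage"]))
```

With such a dictionary, a corrector only has to test the few stored paronyms of a suspicious word instead of every word in the vocabulary, which is the kind of search reduction the abstract reports.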
PolyHope: Two-Level Hope Speech Detection from Tweets
Hope is characterized as openness of spirit toward the future, a desire,
expectation, and wish for something to happen or to be true that remarkably
affects a person's state of mind, emotions, behaviors, and decisions. Hope is
usually associated with concepts of desired expectations and
possibility/probability concerning the future. Despite its importance, hope has
rarely been studied as a social media analysis task. This paper presents a hope
speech dataset that classifies each tweet first into "Hope" and "Not Hope",
then into three fine-grained hope categories: "Generalized Hope", "Realistic
Hope", and "Unrealistic Hope" (along with "Not Hope"). English tweets in the
first half of 2022 were collected to build this dataset. Furthermore, we
describe our annotation process and guidelines in detail and discuss the
challenges of classifying hope and the limitations of the existing hope speech
detection corpora. In addition, we report several baselines based on
different learning approaches, such as traditional machine learning, deep
learning, and transformers, to benchmark our dataset. We evaluated our
baselines using weighted-averaged and macro-averaged F1-scores. Observations
show that a strict process for annotator selection and detailed annotation
guidelines enhanced the dataset's quality. This strict annotation process
resulted in promising performance for simple machine learning classifiers with
only bi-grams; however, binary and multiclass hope speech detection results
reveal that contextual embedding models have higher performance in this
dataset.
Comment: 20 pages, 9 figures
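The difference between the two averaging schemes named in the abstract can be seen in a short Python sketch. The labels and predictions below are invented toy data, not results from the paper, and the helper names are hypothetical.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return a dict mapping each label to its one-vs-rest F1-score."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by the number of true instances per class."""
    scores = f1_per_class(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


# Invented, imbalanced toy example: 2 "Hope" tweets and 4 "Not Hope" tweets.
y_true = ["Hope", "Hope", "Not Hope", "Not Hope", "Not Hope", "Not Hope"]
y_pred = ["Hope", "Not Hope", "Not Hope", "Not Hope", "Not Hope", "Not Hope"]

print(round(macro_f1(y_true, y_pred), 4))     # 0.7778
print(round(weighted_f1(y_true, y_pred), 4))  # 0.8148
```

On imbalanced data the weighted average is pulled toward the majority class, while the macro average exposes weak performance on the minority class, which is one reason to report both.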
Recent Trends in Deep Learning Based Personality Detection
Recently, the automatic prediction of personality traits has received a lot
of attention. Specifically, personality trait prediction from multimodal data
has emerged as a hot topic within the field of affective computing. In this
paper, we review significant machine learning models which have been employed
for personality detection, with an emphasis on deep learning-based methods.
This review paper provides an overview of the most popular approaches to
automated personality detection, various computational datasets, industrial
applications, and state-of-the-art machine learning models for personality
detection, with a specific focus on multimodal approaches. Personality detection
is a very broad and diverse topic: this survey focuses only on computational
approaches and leaves out psychological studies on personality detection.
Representación computacional del lenguaje natural escrito (Computational Representation of Written Natural Language)
When humans read or hear words, they immediately relate them to a concept. This is possible due to the information already stored in the brain and also to the human ability to select, process, and associate such information with words. However, for a computer, natural language text is only a sequence of bits that does not convey any meaning on its own, unless properly processed. A computer interprets this bit sequence by modeling the processing that takes place in human minds, namely structuring and linking the text with previously stored information. During this process, as well as when describing its results, the text is represented using various formal structures that permit automatic processing, interpretation, and comparison of information. In this paper, we present a detailed description of these structures.
- …