Paronyms for Accelerated Correction of Semantic Errors
* Work done under partial support of the Mexican Government (CONACyT, SNI), IPN (CGPI, COFAA), and the Korean Government (KIPA Professorship for Visiting Faculty Positions). The second author is currently on sabbatical leave at Chung-Ang University.

The errors usually made by authors during text preparation are classified. The notion of semantic
errors is elaborated, and malapropisms are singled out among them as words “similar” to the intended one but essentially distorting the meaning of the text. For any method of malapropism correction, we propose to compile beforehand dictionaries of paronyms, i.e., of words similar to each other in letters, sounds, or morphs. The proposed classification of errors and paronyms is illustrated with English and Russian examples and is valid for many languages. Specific dictionaries of literal and morphemic paronyms are compiled for Russian. It is shown that literal paronyms drastically cut down (by up to 360 times) the search for correction candidates, while morphemic paronyms permit the correction of errors not studied so far and characteristic of foreigners' writing.
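To make the idea of a precompiled paronym dictionary concrete, here is a minimal Python sketch. It assumes one simple notion of literal paronymy (words differing by a single substitution, insertion, deletion, or adjacent transposition); the vocabulary and function names are illustrative, not taken from the paper or its dictionaries.

```python
from itertools import combinations

def is_literal_paronym(a: str, b: str) -> bool:
    """One simple notion of literal paronymy: the words differ by a single
    substitution, insertion, deletion, or adjacent transposition."""
    if a == b:
        return False
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        diffs = [i for i in range(len(a)) if a[i] != b[i]]
        if len(diffs) == 1:
            return True                          # one substitution
        return (len(diffs) == 2 and diffs[1] == diffs[0] + 1
                and a[diffs[0]] == b[diffs[1]]
                and a[diffs[1]] == b[diffs[0]])  # adjacent transposition
    if len(a) > len(b):
        a, b = b, a                              # make a the shorter word
    i = j = 0
    skipped = False
    while i < len(a):
        if a[i] == b[j]:
            i += 1
            j += 1
        elif not skipped:
            skipped = True                       # one inserted letter in b
            j += 1
        else:
            return False
    return True

def build_paronym_dict(vocab):
    """Precompile word -> set of literal paronyms, so that at correction time
    candidates are looked up directly instead of scanning the whole lexicon."""
    d = {w: set() for w in vocab}
    for a, b in combinations(vocab, 2):
        if is_literal_paronym(a, b):
            d[a].add(b)
            d[b].add(a)
    return d

vocab = ["angel", "angle", "anger", "ankle", "apple"]
paronyms = build_paronym_dict(vocab)
# paronyms["angel"] contains "angle" (transposition) and "anger" (substitution)
```

Once such a dictionary is compiled, a malapropism corrector only has to score the few entries listed for the suspect word, which is where the drastic reduction in the candidate search comes from.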
DialogueRNN: An Attentive RNN for Emotion Detection in Conversations
Emotion detection in conversations is a necessary step for a number of
applications, including opinion mining over chat history, social media threads,
debates, argumentation mining, understanding consumer feedback in live
conversations, etc. Currently, systems do not treat the parties in the
conversation individually by adapting to the speaker of each utterance. In this
paper, we describe a new method based on recurrent neural networks that keeps
track of the individual party states throughout the conversation and uses this
information for emotion classification. Our model outperforms the state of the
art by a significant margin on two different datasets.

Comment: AAAI 201
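The per-party bookkeeping described above can be sketched in a few lines. The following is a toy illustration only: the speaker names and feature vectors are made up, and the stand-in recurrent update is not the GRU-based architecture of the paper.

```python
import math

def recurrent_update(state, features, w=0.5):
    """Stand-in recurrent cell (NOT the paper's GRU): blend the speaker's
    previous state with the features of the new utterance."""
    return [math.tanh(w * s + (1 - w) * f) for s, f in zip(state, features)]

def track_party_states(utterances, dim=3):
    """Keep one state vector per speaker, updated only on that speaker's
    turns, and record the state available when each utterance is classified."""
    states = {}
    per_utterance = []
    for speaker, features in utterances:
        prev = states.get(speaker, [0.0] * dim)
        states[speaker] = recurrent_update(prev, features)
        per_utterance.append((speaker, states[speaker]))
    return states, per_utterance

dialogue = [
    ("alice", [1.0, 0.0, 0.0]),
    ("bob",   [0.0, 1.0, 0.0]),
    ("alice", [0.0, 0.0, 1.0]),
]
states, history = track_party_states(dialogue)
# states holds one evolving vector per party; history has one entry per utterance
```

The point of the structure is that each utterance is classified against the state of *its own speaker*, rather than a single undifferentiated conversation state.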
PolyHope: Two-Level Hope Speech Detection from Tweets
Hope is characterized as openness of spirit toward the future, a desire,
expectation, and wish for something to happen or to be true that remarkably
affects the human state of mind, emotions, behaviors, and decisions. Hope is
usually associated with concepts of desired expectations and
possibility/probability concerning the future. Despite its importance, hope has
rarely been studied as a social media analysis task. This paper presents a hope
speech dataset that classifies each tweet first into "Hope" and "Not Hope",
then into three fine-grained hope categories: "Generalized Hope", "Realistic
Hope", and "Unrealistic Hope" (along with "Not Hope"). English tweets in the
first half of 2022 were collected to build this dataset. Furthermore, we
describe our annotation process and guidelines in detail and discuss the
challenges of classifying hope and the limitations of the existing hope speech
detection corpora. In addition, we report several baselines based on
different learning approaches, such as traditional machine learning, deep
learning, and transformers, to benchmark our dataset. We evaluate our
baselines using weighted-averaged and macro-averaged F1-scores. Observations
show that a strict process for annotator selection and detailed annotation
guidelines enhanced the dataset's quality. This strict annotation process
resulted in promising performance for simple machine learning classifiers with
only bi-grams; however, binary and multiclass hope speech detection results
reveal that contextual embedding models have higher performance in this
dataset.

Comment: 20 pages, 9 figures
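As a rough sketch of the kind of bi-gram baseline the abstract mentions, the snippet below trains a tiny add-one-smoothed naive Bayes classifier over word bi-grams. The texts, labels, and scoring are illustrative toys, not the paper's actual classifiers or data.

```python
import math
from collections import Counter

def word_bigrams(text):
    """Extract word bi-gram features from a tweet-like string."""
    toks = text.lower().split()
    return list(zip(toks, toks[1:]))

def train_counts(samples):
    """samples: iterable of (text, label); returns per-class bigram counts."""
    counts = {}
    for text, label in samples:
        counts.setdefault(label, Counter()).update(word_bigrams(text))
    return counts

def predict(counts, text):
    """Pick the class with the best add-one-smoothed log-likelihood."""
    best_label, best_score = None, -math.inf
    for label, c in counts.items():
        total = sum(c.values()) + len(c) + 1
        score = sum(math.log((c[bg] + 1) / total) for bg in word_bigrams(text))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

train = [
    ("i hope things get better soon", "Hope"),
    ("really hope we win the final", "Hope"),
    ("the meeting was moved to friday", "Not Hope"),
    ("traffic on the highway was terrible", "Not Hope"),
]
model = train_counts(train)
```

The same two-stage scheme extends to the fine-grained labels by replacing the binary label set with the four categories and scoring each the same way.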
Fractal Power Law in Literary English
We present in this paper a numerical investigation of literary texts by
various well-known English writers, covering the first half of the twentieth
century, based upon the results obtained through corpus analysis of the texts.
A fractal power law is obtained for the lexical wealth defined as the ratio
between the number of different words and the total number of words of a given
text. By considering as a signature of each author the exponent and the
amplitude of the power law, and the standard deviation of the lexical wealth,
it is possible to discriminate works of different genres and writers and show
that each writer has a very distinct signature, either considered among other
literary writers or compared with writers of non-literary texts. It is also
shown that, for a given author, the signature is able to discriminate between
short stories and novels.

Comment: 27 pages, 10 tables, 15 figures. Revised version accepted in Physica