93 research outputs found

Towards Robust Word Embeddings for Noisy Texts

Authors: Doval Yerai; Gómez-Rodríguez Carlos; Vilares Jesús
Publication venue: 'MDPI AG'
Publication date: 01/01/2020
[Abstract] Research on word embeddings has mainly focused on improving their performance on standard corpora, disregarding the difficulties posed by noisy texts in the form of tweets and other types of non-standard writing from social media. In this work, we propose a simple extension to the skipgram model in which we introduce the concept of bridge-words, which are artificial words added to the model to strengthen the similarity between standard words and their noisy variants. Our new embeddings outperform baseline models on noisy texts on a wide range of evaluation tasks, both intrinsic and extrinsic, while retaining a good performance on standard texts. To the best of our knowledge, this is the first explicit approach at dealing with these types of noisy texts at the word embedding level that goes beyond the support for out-of-vocabulary words.

Funding: Ministerio de Economía, Industria y Competitividad (MINECO), TIN2017-85160-C2-2-R; Ministerio de Economía, Industria y Competitividad (MINECO), TIN2017-85160-C2-1-R; European Social Fund (ESF), BES-2015-073768; Xunta de Galicia, ED431D 2017/12; Xunta de Galicia, ED431B 2017/01; Xunta de Galicia, ED431C 2020/11; Xunta de Galicia, ED431G/0

arXiv.org e-Print Archive

Sentiment Analysis for Social Media

Authors: Iglesias Carlos A.; Moreno Antonio
Publication venue: 'MDPI AG'
Publication date: 09/06/2020
Sentiment analysis is a branch of natural language processing concerned with the study of the intensity of the emotions expressed in a piece of text. The automated analysis of the multitude of messages delivered through social media is one of the most active research fields, both in academia and in industry, because of its very high potential applicability in many different domains. This Special Issue describes both technological contributions to the field, mostly based on deep learning techniques, and specific applications in areas such as health insurance, gender classification, recommender systems, and cyber aggression detection.

The pipeline processing of NLP

Authors: Elov B. B.; Khamroeva Sh. M.; Xusainova Z. Y.
Publication venue: EDP Sciences
Publication date: 01/01/2023
NLP problems should be divided into several smaller parts and solved step by step. This article considers the forms of text processing needed at each stage of solving an NLP problem. This step-by-step text processing is called a pipeline in NLP: a sequence of steps that must be carried out when creating any NLP model. Planning and developing the text processing is the starting point of any NLP project. The article discusses the steps involved in implementing a pipeline and their role in solving NLP tasks, and analyzes the most common preprocessing steps in the NLP pipeline. All of these processing stages are available as ready-to-use pre-trained models in various NLP libraries. If necessary, additional or modified preprocessing steps can be developed, depending on the problem at hand. How well a particular preprocessing step serves a given NLP problem can be determined through experimentation.
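
The idea of a pipeline as an ordered sequence of processing steps can be illustrated with a minimal Python sketch. It is not taken from the article: the step functions, the `run_pipeline` helper, and the tiny `STOP_WORDS` set are hypothetical names chosen for illustration, and the sketch uses only the standard library rather than any of the pre-trained NLP libraries the abstract mentions.

```python
import re
from typing import Any, Callable, List

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}


def lowercase(text: str) -> str:
    """Normalize case so that 'The' and 'the' are treated alike."""
    return text.lower()


def strip_punctuation(text: str) -> str:
    """Replace every character that is not a word character or whitespace with a space."""
    return re.sub(r"[^\w\s]", " ", text)


def tokenize(text: str) -> List[str]:
    """Split the text into tokens on whitespace."""
    return text.split()


def remove_stop_words(tokens: List[str]) -> List[str]:
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def run_pipeline(data: Any, steps: List[Callable[[Any], Any]]) -> Any:
    """Apply each step in order, feeding the output of one step into the next."""
    for step in steps:
        data = step(data)
    return data


# The order matters: tokenization must come before stop-word removal.
PIPELINE = [lowercase, strip_punctuation, tokenize, remove_stop_words]

tokens = run_pipeline("The pipeline is a sequence of steps!", PIPELINE)
print(tokens)  # ['pipeline', 'sequence', 'steps']
```

Because the pipeline is just a list of functions, a step can be added, removed, or replaced without touching the others, which makes it straightforward to test by experiment whether a given preprocessing step helps a particular task.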

EDP Sciences OAI-PMH repository (1.2.0)

Directory of Open Access Journals