Machine Learning in Automated Text Categorization
The automated categorization (or classification) of texts into predefined
categories has witnessed a booming interest in the last ten years, due to the
increased availability of documents in digital form and the ensuing need to
organize them. In the research community the dominant approach to this problem
is based on machine learning techniques: a general inductive process
automatically builds a classifier by learning, from a set of preclassified
documents, the characteristics of the categories. The advantages of this
approach over the knowledge engineering approach (consisting in the manual
definition of a classifier by domain experts) are a very good effectiveness,
considerable savings in terms of expert manpower, and straightforward
portability to different domains. This survey discusses the main approaches to
text categorization that fall within the machine learning paradigm. We will
discuss in detail issues pertaining to three different problems, namely
document representation, classifier construction, and classifier evaluation.
Comment: Accepted for publication in ACM Computing Surveys
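As a rough illustration of the inductive process the abstract describes, the sketch below learns a classifier from a handful of preclassified documents. It assumes a multinomial Naive Bayes learner with add-one smoothing and whitespace tokenization; the survey does not prescribe this particular method, and the names `tokenize`, `train`, and `classify` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for a real
    # document representation step.
    return text.lower().split()


def train(labeled_docs):
    """Learn per-category word counts from (text, category) pairs."""
    class_docs = Counter()              # number of documents per category
    word_counts = defaultdict(Counter)  # word frequencies per category
    vocab = set()
    for text, category in labeled_docs:
        class_docs[category] += 1
        for token in tokenize(text):
            word_counts[category][token] += 1
            vocab.add(token)
    return class_docs, word_counts, vocab


def classify(model, text):
    """Return the category with the highest log posterior score."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_category, best_score = None, -math.inf
    for category in sorted(class_docs):
        # Log prior estimated from category frequencies in the training set.
        score = math.log(class_docs[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # words never seen in training are ignored
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[category][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


training_set = [
    ("the match ended in a draw", "sports"),
    ("the team won the match", "sports"),
    ("parliament passed the budget", "politics"),
    ("the minister defended the budget", "politics"),
]
model = train(training_set)
```

Calling `classify(model, "the team lost the match")` on this toy data selects the category whose training vocabulary best matches the new text.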
Topic Classification for Short Texts
In the context of TV and social media surveillance, constructing models to automate topic identification of short texts is a key task. This paper formalizes topic classification as a top-K multinomial classification problem and constructs models worth considering for practical use. We describe the full data processing pipeline, covering dataset selection, text preprocessing, feature extraction, and model selection and learning, including hyperparameter optimization. When computing time and resources are limited, we show that a classical model such as SVM performs as well as an advanced deep neural network, but with a shorter training time.
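One common reading of top-K classification is that a prediction counts as correct when the true topic appears among the K highest-scoring topics. The sketch below illustrates that reading only; the abstract does not give the paper's exact formulation, and `top_k_labels`, `top_k_accuracy`, and the example scores are hypothetical.

```python
import heapq


def top_k_labels(scores, k):
    """Return the k labels with the highest scores, best first."""
    return heapq.nlargest(k, scores, key=scores.get)


def top_k_accuracy(score_rows, true_labels, k):
    """Fraction of texts whose true label is among the top-k predictions."""
    if len(score_rows) != len(true_labels) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        true in top_k_labels(scores, k)
        for scores, true in zip(score_rows, true_labels)
    )
    return hits / len(true_labels)


# Hypothetical per-topic scores, as a classifier might emit for three texts.
score_rows = [
    {"sports": 0.6, "politics": 0.3, "music": 0.1},
    {"sports": 0.2, "politics": 0.5, "music": 0.3},
    {"sports": 0.5, "politics": 0.4, "music": 0.1},
]
true_labels = ["sports", "music", "music"]
```

Raising K can only keep or increase the measured accuracy, which is why the choice of K should be reported alongside any top-K result.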
Two Text Classifiers in Online Discussion: Support Vector Machine vs Back-Propagation Neural Network
The purpose of this research is to compare the performance of two text classifiers, support vector machine (SVM) and back-propagation neural network (BPNN), in categorizing messages from an online discussion. SVM has been recognized as one of the best algorithms for text categorization. BPNN is also a popular categorization method that can handle linear and nonlinear problems and can achieve good results. However, the use of SVM and BPNN in online discussion is rare. In this research, several SVM classifiers are trained for multi-class categorization to classify the same data set as BPNN. The effectiveness of the two text classifiers is measured and then statistically compared based on error rate, precision, recall, and F-measure. The experimental results show that, for text message categorization in online discussion, SVM outperforms BPNN in terms of error rate and precision, but falls behind BPNN in terms of recall and F-measure.
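The four measures named above can be computed from paired true and predicted labels. The sketch below assumes macro-averaging over classes and the balanced F-measure (F1); the abstract does not state which averaging or F-measure variant the study used, and the function name `evaluate` is hypothetical.

```python
from collections import Counter


def evaluate(y_true, y_pred):
    """Return error rate and macro-averaged precision, recall, and F1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for true, pred in zip(y_true, y_pred):
        if true == pred:
            tp[true] += 1
        else:
            fp[pred] += 1  # predicted class gains a false positive
            fn[true] += 1  # true class gains a false negative

    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        p = tp[label] / predicted if predicted else 0.0
        r = tp[label] / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n_labels = len(labels)
    return {
        "error_rate": sum(fn.values()) / len(y_true),
        "precision": sum(precisions) / n_labels,
        "recall": sum(recalls) / n_labels,
        "f1": sum(f1s) / n_labels,
    }


metrics = evaluate(["a", "a", "b", "b"], ["a", "b", "b", "b"])
```

A classifier can lead on error rate and precision while trailing on recall and F-measure, as reported here, because the measures weight false positives and false negatives differently.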
Classifiers and text mining: application to a specific context
[Abstract]: The constant growth of social networks has not only brought us new ways of interacting
with each other, but has also given way to a severe increase in negative behaviors: hate
speech, racism, gender harassment, cyberbullying, etc. Manually trying to detect this kind of
behaviours in millions of daily social media posts is out of the question. The solution lies in
developing intelligent systems to automate such detection tasks.
As the nature of these texts is completely subjective, this problem falls under the field
of sentiment analysis, which aims to systematically identify and study affective states and
subjective information in textual data using natural language processing techniques.
In particular, this project is focused on the research of different machine learning techniques
related to natural language processing, in order to automate and perform a reliable
detection and classification of sexist-related behaviours in social media texts. We will tackle
the task of adequately processing the extracted data from social media, as well as researching
various text classification techniques and models that we will use to develop and evaluate a
variety of classifiers.
Final degree project (UDC.FIC). Computer Engineering. Academic year 2021/202
Aspect-Based Sentiment Analysis using Machine Learning and Deep Learning Approaches
Sentiment analysis (SA), also known as opinion mining, is the process of gathering and analyzing people's opinions about a particular service, good, or company on websites such as Twitter, Facebook, Instagram, LinkedIn, and blogs, among other places. This article covers a thorough analysis of SA and its levels. The manuscript's main focus is on aspect-based sentiment analysis (ABSA), which helps manufacturing organizations make better decisions by examining consumers' viewpoints and opinions of their products. The many approaches and methods used in ABSA are covered in this review. In traditional methods, the features associated with the aspects were extracted manually, which made it a time-consuming and error-prone operation. These limitations may be overcome as artificial intelligence develops. Therefore, to increase the effectiveness of ABSA, researchers are increasingly using AI-based machine learning (ML) and deep learning (DL) techniques. Additionally, certain recently released ABSA approaches based on ML and DL are examined and contrasted, and gaps in both methodologies are identified on that basis. The study concludes by highlighting the difficulties that current ABSA models encounter, along with suggestions for improving the efficacy and precision of ABSA systems.