
    Text Classification: A Review, Empirical, and Experimental Evaluation

    The explosive and widespread growth of data necessitates the use of text classification to extract crucial information from vast amounts of data. Consequently, there has been a surge of research in both classical and deep learning text classification methods. Despite the numerous methods proposed in the literature, there is still a pressing need for a comprehensive and up-to-date survey. Existing survey papers categorize algorithms for text classification into broad classes, which can lead to the misclassification of unrelated algorithms and incorrect assessments of their qualities and behaviors using the same metrics. To address these limitations, our paper introduces a novel methodological taxonomy that classifies algorithms hierarchically into fine-grained classes and specific techniques. The taxonomy includes methodology categories, methodology techniques, and methodology sub-techniques. Our study is the first survey to utilize this methodological taxonomy for classifying algorithms for text classification. Furthermore, our study also conducts empirical evaluation and experimental comparisons and rankings of different algorithms that employ the same specific sub-technique, different sub-techniques within the same technique, different techniques within the same category, and categorie

    Opinion Mining Summarization and Automation Process: A Survey

    In this modern age, the internet is a powerful source of information. Roughly one-third of the world's population spends a significant amount of time and money surfing the internet. People draw on it in every field of life, including learning, amusement, communication, and shopping. Users also visit websites to share remarks or views on products, services, events, and so on, based on their own experience, which may be useful to other users. In this manner, a huge amount of feedback accumulates on those websites as textual data, and this data can be explored, evaluated, and controlled to support decision-making. Opinion Mining (OM) is a type of Natural Language Processing (NLP) that extracts the theme or idea from users' opinions in the form of positive, negative, and neutral comments. Researchers therefore try to present this information as a summary that is useful to different users. The research community has been generating automatic summaries from the 1950s until now, and these automation processes are divided into two categories: abstractive and extractive methods. This paper presents an overview of the useful methods in OM and explains OM with regard to summarization and its automation process.

    A Survey on Semantic Processing Techniques

    Semantic processing is a fundamental research domain in computational linguistics. In the era of powerful pre-trained language models and large language models, the advancement of research in this domain appears to be decelerating. However, the study of semantics is multi-dimensional in linguistics. The research depth and breadth of computational semantic processing can be largely improved with new technologies. In this survey, we analyzed five semantic processing tasks, namely word sense disambiguation, anaphora resolution, named entity recognition, concept extraction, and subjectivity detection. We study relevant theoretical research in these fields, advanced methods, and downstream applications. We connect the surveyed tasks with downstream applications because this may inspire future scholars to fuse these low-level semantic processing tasks with high-level natural language processing tasks. The review of theoretical research may also inspire new tasks and technologies in the semantic processing domain. Finally, we compare the different semantic processing techniques and summarize their technical trends, application trends, and future directions.
    Comment: Published at Information Fusion, Volume 101, 2024, 101988, ISSN 1566-2535. The equal contribution mark is missing from the published version due to the publication policies. Please contact Prof. Erik Cambria for details.

    Universal, Unsupervised (Rule-Based), Uncovered Sentiment Analysis

    We present a novel unsupervised approach for multilingual sentiment analysis driven by compositional syntax-based rules. On the one hand, we exploit some of the main advantages of unsupervised algorithms: (1) the interpretability of their output, in contrast with most supervised models, which behave as a black box, and (2) their robustness across different corpora and domains. On the other hand, by introducing the concept of compositional operations and exploiting syntactic information in the form of universal dependencies, we tackle one of their main drawbacks: their rigidity on data that are structured differently depending on the language concerned. Experiments show an improvement both over existing unsupervised methods and over state-of-the-art supervised models when evaluating outside their corpus of origin. Experiments also show how the same compositional operations can be shared across languages. The system is available at http://www.grupolys.org/software/UUUSA/
    Comment: 19 pages, 5 Tables, 6 Figures. This is the authors' version of a work that was accepted for publication in Knowledge-Based Systems.