
67 research outputs found

A New Feature Selection Method based on Intuitionistic Fuzzy Entropy to Categorize Text Documents

Author: Harish B S
Revanasiddappa M B
Publication venue: 'Universidad Internacional de La Rioja'
Publication date: 04/02/2022

Selecting highly discriminative features from text documents is a major challenge in categorization. Feature selection is an important task that reduces the dimensionality of the feature matrix, which in turn improves categorization performance. This article presents a new feature selection method based on Intuitionistic Fuzzy Entropy (IFE) for text categorization. First, the Intuitionistic Fuzzy C-Means (IFCM) clustering method is employed to compute the intuitionistic membership values. The computed intuitionistic membership values are used to estimate intuitionistic fuzzy entropy via Match degree. Features with lower entropy values are then selected to categorize the text documents. To assess the efficacy of the proposed method, experiments are conducted on three standard benchmark datasets using three classifiers. F-measure is used to assess the performance of the classifiers. The proposed method shows impressive results compared to other well-known feature selection methods. Moreover, the Intuitionistic Fuzzy Set (IFS) property addresses the uncertainty limitations of the traditional fuzzy set.

Re-UNIR

Topic Models and Fusion Methods: a Union to Improve Text Clustering and Cluster Labeling

Author: Omidvarborna Hosna
Orlando Salvatore
Pourvali Mohsen
Publication venue: 'Universidad Internacional de La Rioja'
Publication date: 21/02/2022

Topic modeling algorithms are statistical methods that aim to discover the topics running through text documents. Topic models are popular in machine learning and text mining because they can infer the latent topic structure of a corpus. In this paper, we present a document enrichment approach that uses state-of-the-art topic models and data fusion methods to enrich the documents of a collection, with the aim of improving the quality of text clustering and cluster labeling. We propose a bi-vector space model in which every document of the corpus is represented by two vectors: one generated by the fusion-based topic modeling approach, and one that is simply the traditional vector model. Our experiments on various datasets show that using a combination of topic modeling and fusion methods to create document vectors can significantly improve the quality of document clustering results.

Re-UNIR

Aspect Based Opinion Mining & Sentiment Analysis

Author: Londhe Alka
Rao P. V. R. D. Prasada
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/10/2022

Opinion mining is a relatively new field that refers to the practice of collecting feedback in the form of online reviews and ratings left by users on various topics. Thanks to this development, researchers are now able to monitor the states of consciousness of individuals in real time. Recently, a number of research papers on sentiment analysis have been published, each based on a unique categorization and ranking procedure. However, the amount of time necessary for performing classification has not decreased in any way. The Sentiment Sensitivity word list (SST) was provided as a solution to the problem of feature mismatch in cross-domain sentiment classification between the source domain and the target domain; however, achieving improved accuracy and identifying distributional similarities of words became less effective as time went on. Hidden Markov’s persistent development may be seen at the beginning. Cosine In order to achieve more effective and clean pre-processing, a method that is conceptually quite similar to HM-CPCS has been devised. The HM-CPCS methodology, which has recently been suggested, makes use of the POS tagger, a variant of which is based on the Hidden Markov algorithm. Evaluations are carried out using data from a wide variety of domains. The tags that come before and after a term are used to compute the transition probabilities and the presence of the term among the tags in order to improve capability.

International Journal on Recent and Innovation Trends in Computing and Communication

Sentiment Analysis on IMDb Movie Reviews Using Hybrid Feature Extraction Method

Author: Darshan H K
Harish B S
Kumar Keerthi
Publication venue: 'Universidad Internacional de La Rioja'
Publication date: 28/02/2022

Social networking sites have become popular and common places for sharing a wide range of emotions through short texts. These emotions include happiness, sadness, anxiety, fear, etc. Analyzing short texts helps in identifying the sentiment expressed by the crowd. Sentiment analysis on IMDb movie reviews identifies the overall sentiment or opinion expressed by a reviewer towards a movie. Many researchers are working on pruning the sentiment analysis model so that it clearly identifies and distinguishes between a positive review and a negative review. In the proposed work, we show that hybrid features obtained by concatenating machine learning features (TF, TF-IDF) with lexicon features (positive-negative word count, connotation) give better results in terms of both accuracy and complexity when tested against classifiers like SVM, Naïve Bayes, KNN and Maximum Entropy. The proposed model clearly differentiates between a positive review and a negative review. Since understanding the context of the reviews plays an important role in classification, using hybrid features helps in capturing the context of the movie reviews and hence increases the accuracy of classification.

Re-UNIR
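
As a rough illustration of the hybrid-feature idea in the abstract above (concatenating term-frequency features with lexicon counts), here is a minimal sketch. The tiny `POSITIVE` and `NEGATIVE` word sets, the whitespace tokenizer, and all function names are hypothetical and chosen only for this example; they are not taken from the paper, which also uses TF-IDF and connotation features that are omitted here.

```python
from collections import Counter

# Hypothetical toy lexicon (assumption); a real system would load a published sentiment lexicon.
POSITIVE = {"good", "great", "excellent", "enjoyable"}
NEGATIVE = {"bad", "boring", "terrible", "dull"}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def tf_vector(tokens, vocabulary):
    """Relative term frequency of each vocabulary word in one document."""
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero for empty documents
    return [counts[word] / total for word in vocabulary]


def lexicon_vector(tokens):
    """Counts of positive and negative lexicon words in one document."""
    positive = sum(1 for token in tokens if token in POSITIVE)
    negative = sum(1 for token in tokens if token in NEGATIVE)
    return [positive, negative]


def hybrid_features(text, vocabulary):
    """Concatenate the TF vector with the lexicon counts."""
    tokens = tokenize(text)
    return tf_vector(tokens, vocabulary) + lexicon_vector(tokens)


reviews = ["A great and enjoyable movie", "A boring and dull plot"]
vocabulary = sorted({word for review in reviews for word in tokenize(review)})
features = [hybrid_features(review, vocabulary) for review in reviews]
```

Each resulting vector has one entry per vocabulary word followed by two lexicon counts, and could be passed to any of the classifiers named in the abstract.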

How to Rank Answers in Text Mining

Author: Zhang Guandong
Publication venue: Scholarship@Western
Publication date: 17/06/2019

In this thesis, we mainly focus on case studies about answers. We present the methodology CEW-DTW and assess its performance in terms of ranking quality. We then improve this methodology by combining Kullback-Leibler divergence with CEW-DTW, since Kullback-Leibler divergence can measure the difference between the probability distributions of two sequences. However, CEW-DTW and KL-CEW-DTW do not account for the effect of noise and keywords from the viewpoint of probability distribution. Therefore, we develop a new methodology, the General Entropy, to see how the probabilities of noise and keywords affect answer quality. We first analyze some properties of the General Entropy, such as its value range. In particular, we try to find an objective goal that can be regarded as a standard for assessing answers. To this end, we introduce the maximum general entropy. We use the general entropy methodology to find an imaginary answer with the maximum entropy from the mathematical viewpoint (though this answer may not exist). This answer can also be regarded as an “ideal” answer. Comparing maximum entropy probabilities with global probabilities of noise and keywords respectively, we find that the maximum entropy probability of noise is smaller than the global probability of noise, and that the maximum entropy probabilities of chosen keywords are larger than the global probabilities of keywords under some conditions. This allows us to determinably select the maximum number of keywords. We also use the Amazon dataset and a small survey to assess the general entropy. Though these methodologies can analyze answer quality, they do not incorporate the inner connections among keywords and noise. Based on the Markov transition matrix, we develop the Jump Probability Entropy. We again use the Amazon dataset to compare maximum jump entropy probabilities and global jump probabilities of noise and keywords respectively.
Finally, we give the steps for obtaining answers from the Amazon dataset, including extracting the original answers and removing stop words and collinearity. We compare our developed methodologies to see whether they are consistent. We also introduce the Wald–Wolfowitz runs test and compare it with the developed methodologies to verify their relationships. Based on the results of the comparison, we draw conclusions about the consistency of these methodologies and outline future plans.

Scholarship@Western
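
The abstract above relies on Kullback-Leibler divergence to compare the probability distributions of two sequences. The sketch below shows only the standard definition, D(P‖Q) = Σ p_i log(p_i / q_i), applied to word distributions; it does not reproduce CEW-DTW or KL-CEW-DTW, whose details are not given in the abstract. The add-one smoothing, the sample token lists, and the function names are assumptions made for illustration.

```python
import math
from collections import Counter


def distribution(tokens, vocabulary, smoothing=1.0):
    """Smoothed word distribution over a shared vocabulary (smoothing is an assumption)."""
    counts = Counter(tokens)
    total = len(tokens) + smoothing * len(vocabulary)
    return [(counts[word] + smoothing) / total for word in vocabulary]


def kl_divergence(p, q):
    """Standard KL divergence D(P || Q) in nats; assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical token sequences standing in for two answers.
answer_a = "the battery lasts long and charges fast".split()
answer_b = "the screen is bright and the battery is weak".split()
vocabulary = sorted(set(answer_a) | set(answer_b))

p = distribution(answer_a, vocabulary)
q = distribution(answer_b, vocabulary)
divergence = kl_divergence(p, q)
```

The divergence is zero when the two distributions are identical and grows as they differ; note that it is not symmetric, so `kl_divergence(p, q)` and `kl_divergence(q, p)` generally disagree.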

The Impact of Artificial Intelligence on Strategic and Operational Decision Making

Author: COLETTO GIACOMO
Publication venue
Publication date: 20/10/2023

Effective decision making lies at the core of organizational success. In the era of digital transformation, businesses are increasingly adopting data-driven approaches to gain a competitive advantage. According to existing literature, Artificial Intelligence (AI) represents a significant advancement in this area, with the ability to analyze large volumes of data, identify patterns, make accurate predictions, and provide decision support to organizations. This study aims to explore the impact of AI technologies on different levels of organizational decision making. By separating these decisions into strategic and operational according to their properties, the study provides a more comprehensive understanding of the feasibility, current adoption rates, and barriers hindering AI implementation in organizational decision making.

Padua Thesis and Dissertation Archive

Fuzzy Techniques for Decision Making 2018

Author
Publication venue: 'MDPI AG'
Publication date: 01/05/2021

Zadeh's fuzzy set theory incorporates the impreciseness of data and evaluations by imputing the degrees to which each object belongs to a set. Its success fostered theories that codify the subjectivity, uncertainty, imprecision, or roughness of the evaluations. Their rationale is to produce new flexible methodologies in order to model a variety of concrete decision problems more realistically. This Special Issue gathers contributions addressing novel tools, techniques and methodologies for decision making (inclusive of both individual and group, single- or multi-criteria decision making) in the context of these theories. It contains 38 research articles that contribute to a variety of setups that combine fuzziness, hesitancy, roughness, covering sets, and linguistic approaches. They range from fundamental or technical to applied approaches.

Directory of Open Access Books (DOAB)

Quality in crowdsourced experience-based evaluations: handling subjective responses

Author: Loor Romero Marcelo Eduardo
Publication venue: Ghent University. Faculty of Engineering and Architecture
Publication date: 01/01/2018

Experience-based evaluations (XBEs) are appraisals based on what someone has understood or learned about a topic by experience. Although XBEs can be highly subjective, imprecise, and diverse, information extracted from them can result in significant benefits for companies and organizations. However, handling XBEs can entail several challenges, especially when potential data quality issues need to be handled, such as a lack of reliability of XBEs provided by a large and heterogeneous group of (anonymous) sources. This dissertation addresses challenges connected with the characterization, processing and quality of XBEs. It studies whether and how existing and novel concepts and methods in the area of computational intelligence can be used to characterize and process XBEs in such a way that one can adequately handle data quality issues in subjective data provided by a large and heterogeneous group of respondents. It has been shown that existing and novel concepts and methods connected to fuzzy set theory, which aims to find approximate, achievable and robust solutions, can be used to address these challenges. Among the novel proposed concepts, augmented appraisal degrees and augmented (Atanassov) intuitionistic fuzzy sets are deemed to be the most important contributions of this dissertation.

Ghent University Academic Bibliography

Binary Multi-Verse Optimization (BMVO) Approaches for Feature Selection

Author: Hans Rahul
Kaur Harjot
Publication venue: 'Universidad Internacional de La Rioja'
Publication date: 21/03/2022

Multi-Verse Optimization (MVO) is one of the newest meta-heuristic optimization algorithms; it imitates the multi-verse theory in physics and models the interaction among the various universes. In problem domains like feature selection, the solutions are often constrained to the binary values 0 and 1. Accordingly, this paper proposes binary versions of the MVO algorithm with two prime aims: first, to remove redundant and irrelevant features from the dataset and, second, to achieve better classification accuracy. The proposed binary versions use transformation functions to map the continuous version of the MVO algorithm to its binary versions. In the experiments, 21 diverse datasets are used to compare the Binary MVO (BMVO) with binary versions of some existing metaheuristic algorithms. The proposed BMVO approaches are observed to outperform the compared algorithms in terms of the number of features selected and the accuracy of the classification process.

Re-UNIR
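
The abstract above mentions transformation functions that map a continuous MVO solution to a binary one. The sketch below shows one common choice in binary metaheuristics, an S-shaped (sigmoid) transfer function followed by a random threshold. This is an assumption for illustration only: the abstract does not say which transformation functions the paper uses, and the MVO update step itself is not shown. All names and sample values are hypothetical.

```python
import math
import random


def sigmoid_transfer(x):
    """S-shaped transfer function mapping a real value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def binarize(position, rng):
    """Turn a continuous position vector into a 0/1 feature mask.

    Each bit is set to 1 with probability sigmoid_transfer(x).
    """
    return [1 if rng.random() < sigmoid_transfer(x) else 0 for x in position]


def select_features(row, mask):
    """Keep only the feature values whose mask bit is 1."""
    return [value for value, keep in zip(row, mask) if keep]


rng = random.Random(0)                 # fixed seed so the sketch is repeatable
position = [-2.0, 0.5, 3.0, -0.1]      # hypothetical continuous solution
mask = binarize(position, rng)
selected = select_features([10, 20, 30, 40], mask)
```

Large positive coordinates are likely to switch a feature on and large negative ones to switch it off, so the continuous search still steers which features are selected.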

Segmentation of images by color features: a survey

Author: Cervantes Canales Jair
García Lamont Farid
LOPEZ CHAU ASDRUBAL
Rodríguez Mazahua Lisbeth
Publication venue: Neurocomputing
Publication date: 31/05/2018

This article reviews the state of the art in color image segmentation. Image segmentation is an important stage for object recognition. Many methods have been proposed in the last few years for grayscale and color images. In this paper, we present a deep review of the state of the art on color image segmentation methods; we explain the techniques based on edge detection, thresholding, histogram-thresholding, region, feature clustering and neural networks. Because color spaces play a key role in the methods reviewed, we also explain in detail the color spaces most commonly used to represent and process colors. In addition, we present some important applications that use the image segmentation methods reviewed. Finally, we show a set of metrics frequently used to quantitatively evaluate the segmented images.

Red Mexicana de Repositorios Institucionales

Repositorio Institucional de la Universidad Autónoma del Estado de México