
    Local feature weighting in nearest prototype classification

    The distance metric is the cornerstone of nearest neighbor (NN)-based methods and, therefore, of nearest prototype (NP) algorithms, because they classify according to the similarity of the data. When the data are characterized by a set of features that may contribute to the classification task to different degrees, feature weighting or selection is required, sometimes in a local sense. However, local weighting is typically restricted to NN approaches. In this paper, we introduce local feature weighting (LFW) in NP classification. LFW provides each prototype with its own weight vector, in contrast to the typical global weighting methods found in the NP literature, where all prototypes share the same one. Giving each prototype its own weight vector has a novel effect on the borders of the generated Voronoi regions: they become nonlinear. We have integrated LFW with a previously developed evolutionary nearest prototype classifier (ENPC). Experiments on both artificial and real data sets demonstrate that the resulting algorithm, which we call LFW in nearest prototype classification (LFW-NPC), avoids overfitting on training data in domains where the features may contribute differently to the classification task in different areas of the feature space. This generalization capability is also reflected in automatically obtaining an accurate and reduced set of prototypes.
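
    As a rough illustration of the central idea (and not the authors' ENPC integration), the sketch below classifies a point by the nearest prototype under a per-prototype weighted distance; the prototype positions, weight vectors and labels are made-up toy values.

```python
import numpy as np

def lfw_np_classify(x, prototypes, weights, labels):
    """Classify x by its nearest prototype under a per-prototype
    weighted squared Euclidean distance (local feature weighting)."""
    diffs = prototypes - x                        # (n_prototypes, n_features)
    dists = np.sum(weights * diffs ** 2, axis=1)  # one weight vector per prototype
    return labels[np.argmin(dists)]

# Toy usage: each prototype emphasizes a different feature,
# which is what makes the induced decision borders nonlinear.
prototypes = np.array([[0.0, 0.0], [1.0, 1.0]])
weights    = np.array([[1.0, 0.1], [0.1, 1.0]])   # hypothetical local weights
labels     = np.array([0, 1])
print(lfw_np_classify(np.array([0.2, 0.9]), prototypes, weights, labels))
```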

    Evolving clustering, classification and regression with TEDA

    In this article, the novel clustering and regression methods TEDACluster and TEDAPredict are described, in addition to the recently proposed evolving classifier TEDAClass. The algorithms for classification, clustering and regression are based on the recently proposed AnYa-type fuzzy rule-based system. The methods use the recently proposed TEDA framework, which is capable of recursively processing large amounts of data. The framework supports a computationally cheap, exact per-sample update and can be used for training `from scratch'. All three algorithms are evolving, that is, they are capable of changing their own structure during the update stage, which allows them to follow changes in the data pattern.
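
    The recursive, per-sample statistics that TEDA-style methods rely on can be sketched as below; the class name, the mean/scatter update and the eccentricity formula are an illustrative reconstruction of the general TEDA idea, not the exact TEDACluster/TEDAClass/TEDAPredict rules.

```python
import numpy as np

class TEDAStats:
    """Recursive per-sample mean, scatter and eccentricity
    (a sketch of the kind of statistics the TEDA framework maintains)."""

    def __init__(self, dim):
        self.k = 0                     # number of samples seen so far
        self.mean = np.zeros(dim)
        self.scatter = 0.0             # mean squared distance to the mean

    def update(self, x):
        self.k += 1
        k = self.k
        self.mean = (k - 1) / k * self.mean + x / k
        if k > 1:
            self.scatter = (k - 1) / k * self.scatter \
                           + np.sum((x - self.mean) ** 2) / (k - 1)

    def eccentricity(self, x):
        """Eccentricity of x with respect to the data seen so far;
        typicality is 1 minus this value."""
        if self.k < 2 or self.scatter == 0.0:
            return 1.0
        d2 = np.sum((self.mean - x) ** 2)
        return 1.0 / self.k + d2 / (self.k * self.scatter)

# Stream a few samples; the last, unusual point gets a high eccentricity.
stats = TEDAStats(dim=2)
for point in [np.array([0.0, 0.0]), np.array([0.1, 0.1]),
              np.array([0.0, 0.2]), np.array([5.0, 5.0])]:
    stats.update(point)
    print(point, round(stats.eccentricity(point), 3))
```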

    A systematic review of data quality issues in knowledge discovery tasks

    The volume of data is growing rapidly because organizations continuously capture data to support better decision-making. The most fundamental challenge is to explore these large volumes of data and extract useful knowledge for future actions through knowledge discovery tasks; nevertheless, much of the data is of poor quality. We present a systematic review of data quality issues in knowledge discovery tasks and a case study applied to the agricultural disease known as coffee rust.

    Fuzzy Support Vector Machine Using Linear and Exponential Membership Functions with Mahalanobis Distance

    Support vector machine (SVM) is an effective binary classification technique based on the structural risk minimization (SRM) principle and is known as one of the most successful classification methods. However, real-life data often contain noise and outliers, which confuse the SVM when the data are processed. In this research, SVM is extended with a fuzzy membership function to lessen the effect of noise and outliers when determining the hyperplane solution. Distance calculation is also considered when determining the fuzzy value, because it is fundamental to measuring the proximity between data elements; in general, the membership is built from the distance between a point and its true class center. The fuzzy support vector machine (FSVM) uses Mahalanobis distances with the goal of finding the best hyperplane separating the defined classes. The data are split into training and testing sets over several partition percentages. Although FSVM is theoretically able to overcome noise and outliers, the results show that the accuracy of FSVM, namely 0.017170689 and 0.018668421, is lower than the accuracy of the classical SVM method, which is 0.018838348. The fuzzy membership function is extremely influential in deciding the best hyperplane, so determining the correct fuzzy membership is critical in FSVM problems.
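
    A minimal sketch of the weighting idea, assuming membership is assigned from the Mahalanobis distance of each sample to its class centre with linear and exponential membership functions; the function names, the ridge regularization and the parameter values are illustrative, not the paper's exact formulation.

```python
import numpy as np

def mahalanobis_to_center(X, center, cov, ridge=1e-6):
    """Mahalanobis distance of each row of X to a class centre.
    A small ridge keeps the covariance invertible."""
    inv_cov = np.linalg.inv(cov + ridge * np.eye(cov.shape[0]))
    diff = X - center
    return np.sqrt(np.einsum('ij,jk,ik->i', diff, inv_cov, diff))

def linear_membership(d, delta=1e-6):
    """Linear membership: samples far from their class centre get low weight."""
    return 1.0 - d / (d.max() + delta)

def exponential_membership(d, beta=1.0):
    """Exponential membership: weight decays smoothly with distance."""
    return 2.0 / (1.0 + np.exp(beta * d))

# Hypothetical usage on the samples X_c of one class:
X_c = np.array([[1.0, 2.0], [1.2, 1.9], [0.9, 2.1], [4.0, 6.0]])
d = mahalanobis_to_center(X_c, X_c.mean(axis=0), np.cov(X_c, rowvar=False))
print(linear_membership(d))
print(exponential_membership(d))
```

    The resulting per-sample weights could then be fed to a standard SVM solver, for example via the `sample_weight` argument of `sklearn.svm.SVC.fit`, to approximate the FSVM training.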

    Fuzzy approach for Arabic character recognition

    Pattern recognition/classification is increasingly drawing the attention of scientific research because of its important role in automation and human-machine communication. Even though many models have been introduced to deal with classification, these models did not tackle the problem efficiently because of the inherent imprecision and ambiguity. Traditional models deal only with statistical uncertainty (randomness) but not with non-statistical uncertainty (vagueness). Fuzzy set theory allows us to better understand imprecision in both of its categories: vagueness and randomness. Incorporating fuzzy set theory into existing algorithms has in many cases improved their performance and efficiency. This thesis explores fuzzy logic as it pertains to pattern recognition. To demonstrate fuzzy logic, the problem of recognizing the Arabic alphabet is discussed. In this problem, moments and central moments were used as discriminating features. A fuzzy classifier was designed in a way that incorporates some statistical knowledge of the problem at hand. Its performance was compared to a Bayesian classifier and a neural network classifier. The performance, evaluation, and advantages and disadvantages of each classifier are reported and discussed.
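
    A brief sketch of the moment features mentioned above, assuming binary or grayscale character images; the exact moment orders and any normalization used in the thesis are not reproduced here.

```python
import numpy as np

def central_moments(img, max_order=3):
    """Central moments of a 2-D character image, usable as
    discriminating features for classification."""
    h, w = img.shape
    y, x = np.mgrid[0:h, 0:w]
    m00 = img.sum()
    if m00 == 0:
        return {}
    xbar = (x * img).sum() / m00          # centroid coordinates
    ybar = (y * img).sum() / m00
    feats = {}
    for p in range(max_order + 1):
        for q in range(max_order + 1 - p):
            feats[(p, q)] = ((x - xbar) ** p * (y - ybar) ** q * img).sum()
    return feats

# Toy usage on a small binary "glyph".
glyph = np.zeros((8, 8))
glyph[2:6, 3] = 1.0
print(central_moments(glyph, max_order=2))
```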

    Incremental learning algorithm based on support vector machine with Mahalanobis distance (ISVMM) for intrusion prevention

    In this paper we propose a new classifier, an incremental learning algorithm based on a support vector machine with Mahalanobis distance (ISVMM). The new algorithm has several features: it predicts the type of incoming data by supervised learning with a support vector machine (SVM); it reduces the calculation steps and the complexity of the algorithm by maintaining a support set, an error set and a remaining set; it provides both hard and soft decisions; it saves the time otherwise spent repeatedly retraining on the datasets by applying incremental learning; it builds an ellipsoidal kernel for multidimensional data instead of a spherical one by using the Mahalanobis distance; and it handles the covariance matrix so as to avoid division by zero. To evaluate the classification performance of the algorithm, it was applied to intrusion prevention using the data from the third international knowledge discovery and data mining tools competition (KDDcup'99). According to the experimental results, ISVMM predicts well on all 41 features of the incoming datasets without reducing the enlarged dimensions, and it can compete with a similar algorithm that uses a Euclidean measure for the kernel distance.
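
    A hedged sketch of the ellipsoidal (Mahalanobis) kernel idea with a regularized covariance to avoid division by zero; the incremental bookkeeping of the support, error and remaining sets is not reproduced, and the kernel form and parameter names are assumptions.

```python
import numpy as np

def mahalanobis_rbf_kernel(X, Y, cov, gamma=1.0, ridge=1e-6):
    """RBF-style kernel built on an ellipsoidal (Mahalanobis) metric.
    The ridge term keeps the covariance invertible so the distance
    never divides by a zero variance."""
    inv_cov = np.linalg.inv(cov + ridge * np.eye(cov.shape[0]))
    K = np.empty((len(X), len(Y)))
    for i, xi in enumerate(X):
        diff = Y - xi                             # (n_Y, n_features)
        K[i] = np.exp(-gamma * np.einsum('ij,jk,ik->i', diff, inv_cov, diff))
    return K

# Hypothetical usage with a precomputed-kernel SVM, e.g.
#   from sklearn.svm import SVC
#   cov = np.cov(X_train, rowvar=False)
#   clf = SVC(kernel='precomputed').fit(
#       mahalanobis_rbf_kernel(X_train, X_train, cov), y_train)
```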

    Data-based fault detection in chemical processes: Managing records with operator intervention and uncertain labels

    Developing data-driven fault detection systems for chemical plants requires managing uncertain data labels and dynamic attributes due to operator-process interactions. Mislabeled data is a known problem in computer science that has received scarce attention from the process systems community. This work introduces and examines the effects of operator actions on records and labels, and the consequences for the development of detection models. Using a state-space model, this work proposes an iterative relabeling scheme for retraining classifiers that continuously refines dynamic attributes and labels. Three case studies are presented: a reactor as a motivating example, flooding in a simulated de-Butanizer column as a complex case, and foaming in an absorber as an industrial challenge. For the first case, detection accuracy is shown to increase by 14% while operating costs are reduced by 20%. Moreover, for the de-Butanizer column, the performance of the proposed strategy is shown to be 10% higher than that of the filtering strategy. Promising results are also reported regarding efficient strategies to deal with the presented problem.
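
    A minimal sketch of an iterative relabeling loop of the kind described above, assuming a generic probabilistic classifier (scikit-learn's LogisticRegression here) rather than the paper's state-space model and dynamic attributes; the confidence threshold and iteration count are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def iterative_relabel(X, y_noisy, n_iter=5, conf_threshold=0.9):
    """Retrain a classifier and flip labels it contradicts with high
    confidence; repeat until the labels stop changing."""
    y = np.asarray(y_noisy).copy()
    clf = LogisticRegression(max_iter=1000)
    for _ in range(n_iter):
        clf.fit(X, y)
        proba = clf.predict_proba(X)
        pred = clf.classes_[np.argmax(proba, axis=1)]
        conf = proba.max(axis=1)
        flip = (pred != y) & (conf > conf_threshold)
        if not flip.any():
            break
        y[flip] = pred[flip]          # relabel confidently contradicted samples
    return clf, y
```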
    • 

    corecore