Search CORE

1,024 research outputs found

Evolutionary design of nearest prototype classifiers

Author: Fernández Fernando
Isasi Pedro
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2004
Field of study

In pattern classification problems, many works have been carried out with the aim of designing good classifiers from different perspectives. These works achieve very good results in many domains. However, in general they are very dependent on some crucial parameters involved in the design. These parameters have to be found by a trial and error process or by some automatic methods, like heuristic search and genetic algorithms, that strongly decrease the performance of the method. For instance, in nearest prototype approaches, main parameters are the number of prototypes to use, the initial set, and a smoothing parameter. In this work, an evolutionary approach based on Nearest Prototype Classifier (ENPC) is introduced where no parameters are involved, thus overcoming all the problems that classical methods have in tuning and searching for the appropiate values. The algorithm is based on the evolution of a set of prototypes that can execute several operators in order to increase their quality in a local sense, and with a high classification accuracy emerging for the whole classifier. This new approach has been tested using four different classical domains, including such artificial distributions as spiral and uniform distibuted data sets, the Iris Data Set and an application domain about diabetes. In all the cases, the experiments show successfull results, not only in the classification accuracy, but also in the number and distribution of the prototypes achieved.Publicad

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

One-Class Classification: Taxonomy of Study and Review of Techniques

Author: Khan Shehroz S.
Madden Michael G.
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 29/11/2013
Field of study

One-class classification (OCC) algorithms aim to build classification models when the negative class is either absent, poorly sampled or not well defined. This unique situation constrains the learning of efficient classifiers by defining class boundary just with the knowledge of positive class. The OCC problem has been considered and applied under many research themes, such as outlier/novelty detection and concept learning. In this paper we present a unified view of the general problem of OCC by presenting a taxonomy of study for OCC problems, which is based on the availability of training data, algorithms used and the application domains applied. We further delve into each of the categories of the proposed taxonomy and present a comprehensive literature review of the OCC algorithms, techniques and methodologies with a focus on their significance, limitations and applications. We conclude our paper by discussing some open research problems in the field of OCC and present our vision for future research.Comment: 24 pages + 11 pages of references, 8 figure

arXiv.org e-Print Archive

Crossref

Access to Research at National University of Ireland, Galway

Intrusion detection in wi-fi networks by modular and optimized ensemble of classifiers

Author: Alessio Martino
Antonello Rizzi
Giuseppe Granato
Luca Baldini
Publication venue: 'Scitepress'
Publication date: 01/01/2020
Field of study

4noopenWith the breakthrough of pervasive advanced networking infrastructures and paradigms such as 5G and IoT, cybersecurity became an active and crucial field in the last years. Furthermore, machine learning techniques are gaining more and more attention as prospective tools for mining of (possibly malicious) packet traces and automatic synthesis of network intrusion detection systems. In this work, we propose a modular ensemble of classifiers for spotting malicious attacks on Wi-Fi networks. Each classifier in the ensemble is tailored to characterize a given attack class and is individually optimized by means of a genetic algorithm wrapper with the dual goal of hyper-parameters tuning and retaining only relevant features for a specific attack class. Our approach also considers a novel false alarm management procedure thanks to a proper reliability measure formulation. The proposed system has been tested on the well-known AWID dataset, showing performances comparable with other state of the art works both in terms of accuracy and knowledge discovery capabilities. Our system is also characterized by a modular design of the classification model, allowing to include new possible attack classes in an efficient way.openAccademicoGiuseppe Granato; Alessio Martino; Luca Baldini; Antonello RizziGranato, Giuseppe; Martino, Alessio; Baldini, Luca; Rizzi, Antonell

Crossref

Archivio della ricerca- LUISS Libera Università Internazionale degli Studi Sociali Guido Carli di Roma

Archivio della ricerca- Università di Roma La Sapienza

Memetic micro-genetic algorithms for cancer data classification

Author: Carballido Jessica Andrea
Olivera Ana Carolina
Rojas Matias Gabriel
Vidal Pablo Javier
Publication venue: Elsevier
Publication date: 01/01/2023
Field of study

Fast and precise medical diagnosis of human cancer is crucial for treatment decisions. Gene selection consists of identifying a set of informative genes from microarray data to allow high predictive accuracy in human cancer classification. This task is a combinatorial search problem, and optimisation methods can be applied for its resolution. In this paper, two memetic micro-genetic algorithms (MμV1 and MμV2) with different hybridisation approaches are proposed for feature selection of cancer microarray data. Seven gene expression datasets are used for experimentation. The comparison with stochastic state-of-the-art optimisation techniques concludes that problem-dependent local search methods combined with micro-genetic algorithms improve feature selection of cancer microarray data.Fil: Rojas, Matias Gabriel. Universidad Nacional de Lujan. Centro de Investigacion Docencia y Extension En Tecnologias de la Informacion y Las Comunicaciones.; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Mendoza; ArgentinaFil: Olivera, Ana Carolina. Universidad Nacional de Cuyo. Facultad de Ingeniería; Argentina. Universidad Nacional de Lujan. Centro de Investigacion Docencia y Extension En Tecnologias de la Informacion y Las Comunicaciones.; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Mendoza; ArgentinaFil: Carballido, Jessica Andrea. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación; ArgentinaFil: Vidal, Pablo Javier. Universidad Nacional de Cuyo. Facultad de Ingeniería; Argentina. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Mendoza; Argentin

CONICET Digital

Directory of Open Access Journals

An ensemble based approach for effective intrusion detection using majority voting

Author: Abrar Iram
Bamhdi Alwi M.
Masoodi Faheem
Publication venue: 'Universitas Ahmad Dahlan'
Publication date: 01/04/2021
Field of study

Of late, Network Security Research is taking center stage given the vulnerability of computing ecosystem with networking systems increasingly falling to hackers. On the network security canvas, Intrusion detection system (IDS) is an essential tool used for timely detection of cyber-attacks. A designated set of reliable safety has been put in place to check any severe damage to the network and the user base. Machine learning (ML) is being frequently used to detect intrusion owing to their understanding of intrusion detection systems in minimizing security threats. However, several single classifiers have their limitation and pose challenges to the development of effective IDS. In this backdrop, an ensemble approach has been proposed in current work to tackle the issues of single classifiers and accordingly, a highly scalable and constructive majority voting-based ensemble model was proposed which can be employed in real-time for successfully scrutinizing the network traffic to proactively warn about the possibility of attacks. By taking into consideration the properties of existing machine learning algorithms, an effective model was developed and accordingly, an accuracy of 99%, 97.2%, 97.2%, and 93.2% were obtained for DoS, Probe, R2L, and U2R attacks and thus, the proposed model is effective for identifying intrusion

Journal of Education and Learning (EduLearn)

TELKOMNIKA (Telecommunication Computing Electronics and Control)

UAD Journal Management System

Recommended from our members

Parallelizing support vector machines for scalable image annotation

Author: Alham Nasullah Khalid
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2011
Field of study

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.Machine learning techniques have facilitated image retrieval by automatically classifying and annotating images with keywords. Among them Support Vector Machines (SVMs) are used extensively due to their generalization properties. However, SVM training is notably a computationally intensive process especially when the training dataset is large. In this thesis distributed computing paradigms have been investigated to speed up SVM training, by partitioning a large training dataset into small data chunks and process each chunk in parallel utilizing the resources of a cluster of computers. A resource aware parallel SVM algorithm is introduced for large scale image annotation in parallel using a cluster of computers. A genetic algorithm based load balancing scheme is designed to optimize the performance of the algorithm in heterogeneous computing environments. SVM was initially designed for binary classifications. However, most classification problems arising in domains such as image annotation usually involve more than two classes. A resource aware parallel multiclass SVM algorithm for large scale image annotation in parallel using a cluster of computers is introduced. The combination of classifiers leads to substantial reduction of classification error in a wide range of applications. Among them SVM ensembles with bagging is shown to outperform a single SVM in terms of classification accuracy. However, SVM ensembles training are notably a computationally intensive process especially when the number replicated samples based on bootstrapping is large. A distributed SVM ensemble algorithm for image annotation is introduced which re-samples the training data based on bootstrapping and training SVM on each sample in parallel using a cluster of computers. The above algorithms are evaluated in both experimental and simulation environments showing that the distributed SVM algorithm, distributed multiclass SVM algorithm, and distributed SVM ensemble algorithm, reduces the training time significantly while maintaining a high level of accuracy in classifications

Brunel University Research Archive

Systematic Review on Missing Data Imputation Techniques with Machine Learning Algorithms for Healthcare

Author: Abidin Nadzurah Zainal
Ismail Amelia Ritahani
Maen Mhd Khaled
Publication venue: 'Universitas Muhammadiyah Yogyakarta'
Publication date: 05/02/2022
Field of study

Missing data is one of the most common issues encountered in data cleaning process especially when dealing with medical dataset. A real collected dataset is prone to be incomplete, inconsistent, noisy and redundant due to potential reasons such as human errors, instrumental failures, and adverse death. Therefore, to accurately deal with incomplete data, a sophisticated algorithm is proposed to impute those missing values. Many machine learning algorithms have been applied to impute missing data with plausible values. However, among all machine learning imputation algorithms, KNN algorithm has been widely adopted as an imputation for missing data due to its robustness and simplicity and it is also a promising method to outperform other machine learning methods. This paper provides a comprehensive review of different imputation techniques used to replace the missing data. The goal of the review paper is to bring specific attention to potential improvements to existing methods and provide readers with a better grasps of imputation technique trends

The International Islamic University Malaysia Repository

Leading & Enlightening Journal UMY