1,518 research outputs found

EPRENNID: An evolutionary prototype reduction based ensemble for nearest neighbor classification of imbalanced data

Author: Alcalá-Fdez
Alpaydin
Barua
Batista
Blaszczynski
Breiman
Cano
Castro
Chawla
Chris Cornelis
Cover
Das
Datta
Demšar
Díez-Pastor
Fawcett
Friedman
Galar
García
García
García
García
García-Pedrajas
Hand
He
Hido
Isaac Triguero
Khoshgoftaar
Kononenko
Krawczyk
Krawczyk
Kuncheva
Lee
Lin
López
López
Neri
Pawlak
Ramentol
Sarah Vluymans
Schapire
Seiffert
Storn
Ting
Triguero
Triguero
Triguero
Triguero
Wang
Wilson
Wilson
Yijing
Yu
Yule
Yvan Saeys
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016

Classification problems with an imbalanced class distribution have received an increased amount of attention within the machine learning community over the last decade. They are encountered in a growing number of real-world situations and pose a challenge to standard machine learning techniques. We propose a new hybrid method specifically tailored to handle class imbalance, called EPRENNID. It performs an evolutionary prototype reduction focused on providing diverse solutions to prevent the method from overfitting the training set. It also allows us to explicitly reduce the underrepresented class, which the most common preprocessing solutions handling class imbalance usually protect. As part of the experimental study, we show that the proposed prototype reduction method outperforms state-of-the-art preprocessing techniques. The preprocessing step yields multiple prototype sets that are later used in an ensemble, performing a weighted voting scheme with the nearest neighbor classifier. EPRENNID is experimentally shown to significantly outperform previous proposals.
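The ensemble step described in this abstract (several prototype sets, each queried with a nearest neighbor classifier, combined by weighted voting) can be illustrated with a minimal sketch. This is not the EPRENNID implementation: the function names, the toy prototype sets, and the weights are all hypothetical, and the abstract does not say how the weights are derived.

```python
import math

def nearest_label(prototypes, x):
    """1-NN: return the label of the (point, label) prototype closest to x."""
    _, label = min(prototypes, key=lambda p: math.dist(p[0], x))
    return label

def weighted_vote(prototype_sets, weights, x):
    """Sum the weight of each prototype set behind the label its 1-NN predicts."""
    scores = {}
    for prototypes, weight in zip(prototype_sets, weights):
        label = nearest_label(prototypes, x)
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)

# Hypothetical reduced prototype sets: lists of (point, class label).
set_a = [((0.0, 0.0), 0), ((1.0, 1.0), 1)]
set_b = [((0.2, 0.1), 0), ((0.9, 0.8), 1)]
set_c = [((0.6, 0.6), 0), ((2.0, 2.0), 1)]

# Assumed weights, e.g. reflecting how well each set did on training data.
weights = [0.5, 0.3, 0.2]
query = (0.8, 0.8)
prediction = weighted_vote([set_a, set_b, set_c], weights, query)
```

Here two of the three sets place the query next to a class 1 prototype, so class 1 collects the larger share of the total weight; shifting most of the weight to the third set would flip the decision.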

Nottingham ePrints

Nottingham eTheses

Crossref

Repository@Nottingham

Ghent University Academic Bibliography

Using computational intelligence for knowledge discovery from the human microbiome

Author: Wingfield Benjamin
Publication date: 01/03/2019

Ulster University's Research Portal

Fraud Detection in Telecommunications Industry: Bridging the Gap with Random Rough Subspace Based Neural Network Ensemble Method

Author: Adebisi Rachael
Adebomi Adenike
Amoo Adekemi
Awoyelu Iyabo
Mabude Charles
Publication venue: The International Institute for Science, Technology and Education (IISTE)
Publication date: 31/10/2015

Fraud is common in society and affects private enterprises as well as public entities. Telecommunication companies worldwide suffer from customers who use the provided services without paying. There are also different types of telecommunication fraud, such as subscription fraud, clip-on fraud, call forwarding, cloning fraud, roaming fraud, and calling card fraud. Thus, detection and prevention of these frauds are the main targets of the telecommunication industry. This paper addresses the various techniques for detecting fraud, gives the limitations of each technique, and proposes a random rough subspace-based neural network ensemble method for effective fraud detection. Keywords: Fraud, Fraud detection, Random rough subspace, Neural network, Telecommunication

International Institute for Science, Technology and Education (IISTE): E-Journals

Efficient image retrieval by fuzzy rules from boosting and metaheuristic

Author: Angryk Rafal A.
Kordos Miroslaw
Korytkowski Marcin
Scherer Magdalena M.
Siwocha Agnieszka
Šenkeřík Roman
Publication venue: Sciendo
Publication date: 01/01/2020

Fast content-based image retrieval is still a challenge for computer systems. We present a novel method aimed at classifying images by fuzzy rules and local image features. The fuzzy rule base is generated in the first stage by a boosting procedure. Boosting meta-learning is used to find the most representative local features. We briefly explore the utilization of metaheuristic algorithms for the various tasks of fuzzy systems optimization. We also provide a comprehensive description of the current best-performing DISH algorithm, which represents a powerful version of the differential evolution algorithm with effective embedded mechanisms for stronger exploration and preservation of the population diversity, designed for higher dimensional and complex optimization tasks. The algorithm is used to fine-tune the fuzzy rule base. The fuzzy rules can also be used to create a database index to quickly retrieve images similar to the query image. The proposed approach is tested on a state-of-the-art image dataset and compared with the bag-of-features image representation model combined with the Support Vector Machine classification. The novel method gives a better classification accuracy, and the time of the training and testing process is significantly shorter. © 2020 Marcin Korytkowski et al., published by Sciendo. Funding: program of the Polish Minister of Science and Higher Education under the name "Regional Initiative of Excellence" in the years 2019-2022 [020/RID/2018/19

Biblioteka Nauki - repozytorium artykułów

Institutional repository of Tomas Bata University Library

Water filtration by using apple and banana peels as activated carbon

Author: Ahmad Siti Ajariah
Jumaat Nur Amirah
Mohd Sahimi Amir Muhaimin
Ramli Nurul Natasah Haziqah
Publication venue: 'Penerbit UTHM'
Publication date: 01/01/2020

A water filter is an important device for reducing contaminants in raw water. Activated carbon from charcoal is used to absorb the contaminants. Fruit peels are a suitable alternative carbon source to substitute for charcoal. The main goal of this study is to determine the role of apple and banana peel powder as activated carbon in a water filter. The peels are dried and blended into powder so that they can absorb the contaminants. The results for raw water before and after filtering are compared. After filtering the raw water, the pH reading was 6.8, which is within the normal range, and the recorded turbidity was 658 NTU. As for colour, the water became clearer than the raw water. This study found that fruit peels such as banana and apple are an effective substitute for charcoal as a natural absorbent.

UTHM Institutional Repository

Knowledge management overview of feature selection problem in high-dimensional financial data: Cooperative co-evolution and Map Reduce perspectives

Author: Bazlur Rashid A. N. M.
Choudhury Tonmoy
Publication venue: Edith Cowan University, Research Online, Perth, Western Australia
Publication date: 01/01/2019

The term big data characterizes the massive amounts of data generated by advanced technologies in different domains, using the 4Vs (volume, velocity, variety, and veracity) to indicate the amount of data that can only be processed via computationally intensive analysis, the speed of their creation, the different types of data, and their accuracy. High-dimensional financial data, such as time-series and space-time data, contain a large number of features (variables) while having a small number of samples, which are used to measure various real-time business situations for financial organizations. Such datasets are normally noisy, and complex correlations may exist between their features; many domains, including finance, lack the analytic tools to mine the data for knowledge discovery because of the high dimensionality. Feature selection is an optimization problem: find a minimal subset of relevant features that maximizes the classification accuracy and reduces the computations. Traditional statistical-based feature selection approaches are not adequate to deal with the curse of dimensionality associated with big data. Cooperative co-evolution, a meta-heuristic algorithm and a divide-and-conquer approach, decomposes high-dimensional problems into smaller sub-problems. Further, MapReduce, a programming model, offers a ready-to-use distributed, scalable, and fault-tolerant infrastructure for parallelizing the developed algorithm. This article presents a knowledge management overview of evolutionary feature selection approaches, state-of-the-art cooperative co-evolution and MapReduce-based feature selection techniques, and future research directions.
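The divide-and-conquer idea behind cooperative co-evolution can be shown with a toy sketch: the feature indices are split into disjoint groups, and each group's bits are improved in turn while being scored inside the shared full solution. Everything below is an assumption for illustration. The objective is a made-up stand-in for classification accuracy, a simple bit-flip search replaces the evolutionary sub-populations, and the MapReduce parallelization is omitted.

```python
import random

def fitness(mask, relevant):
    """Toy objective (assumed): reward selected relevant features,
    penalise subset size. A real system would score a classifier."""
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in relevant)
    return hits - 0.1 * sum(mask)

def cooperative_coevolution(n_features, n_groups, relevant, rounds, rng):
    """Optimise a feature mask one sub-problem (group of indices) at a time."""
    # Decompose the feature indices into disjoint groups.
    groups = [list(range(g, n_features, n_groups)) for g in range(n_groups)]
    # Shared full solution that every sub-problem is evaluated against.
    context = [rng.randint(0, 1) for _ in range(n_features)]
    for _ in range(rounds):
        for group in groups:
            for i in group:
                candidate = context[:]
                candidate[i] = 1 - candidate[i]  # flip one bit of this group
                # Keep the flip only if the full solution improves.
                if fitness(candidate, relevant) > fitness(context, relevant):
                    context = candidate
    return context

relevant = {1, 4, 7}  # hypothetical set of truly useful features
mask = cooperative_coevolution(10, 3, relevant, rounds=2, rng=random.Random(0))
selected = [i for i, bit in enumerate(mask) if bit]
```

Because this toy objective is separable, each group can be solved independently and the search settles on exactly the relevant features. Real feature interactions break that separability, which is why the choice of decomposition matters in practice.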

Research Online @ ECU

Multiple Relevant Feature Ensemble Selection Based on Multilayer Co-Evolutionary Consensus MapReduce

Author: Ding W
Lin CT
Pedrycz W
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/08/2018

Although feature selection for large data has been intensively investigated in data mining, machine learning, and pattern recognition, the challenges are not just to invent new algorithms to handle noisy and uncertain large data in applications, but rather to link the multiple relevant feature sources, structured or unstructured, to develop an effective feature reduction method. In this paper, we propose a multiple relevant feature ensemble selection (MRFES) algorithm based on multilayer co-evolutionary consensus MapReduce (MCCM). We construct an effective MCCM model to handle feature ensemble selection of large-scale datasets with multiple relevant feature sources, and explore the unified consistency aggregation between the local solutions and global dominance solutions achieved by the co-evolutionary memeplexes, which participate in the cooperative feature ensemble selection process. This model attempts to reach a mutual decision agreement among co-evolutionary memeplexes, which calls for the need for mechanisms to detect some noncooperative co-evolutionary behaviors and achieve better Nash equilibrium resolutions. Extensive experimental comparative studies substantiate the effectiveness of MRFES to solve large-scale dataset problems with the complex noise and multiple relevant feature sources on some well-known benchmark datasets. The algorithm can greatly facilitate the selection of relevant feature subsets coming from the original feature space with better accuracy, efficiency, and interpretability. Moreover, we apply MRFES to human cerebral cortex-based classification prediction. Such successful applications are expected to significantly scale up classification prediction for large-scale and complex brain data in terms of efficiency and feasibility.

OPUS - University of Technology Sydney

IRS-BAG-Integrated Radius-SMOTE Algorithm with Bagging Ensemble Learning Model for Imbalanced Data Set Classification

Author: Ayu Putu Desiana Wulaning
Hermawan Dadang
Hostiadi Dandy Pramana
Huizen Roy Rudolf
Pradipta Gede Angga
Yuningsih Lilis
Publication venue: Ital Publication
Publication date: 01/10/2023

Imbalanced learning problems are a challenge faced by classifiers when data samples have an unbalanced distribution among classes. The Synthetic Minority Over-Sampling Technique (SMOTE) is one of the most well-known data pre-processing methods. Problems that arise when oversampling with SMOTE are noise, small disjunct samples, and overfitting due to a high imbalance ratio in a dataset. A high imbalance ratio and low variance cause the generated synthetic data to collect in narrow areas and in conflicting regions among classes, making them susceptible to overfitting during the learning process by machine learning methods. Therefore, this research proposes a combination of Radius-SMOTE and the Bagging algorithm, called the IRS-BAG model. For each sub-sample generated by bootstrapping, oversampling was done using Radius-SMOTE. Oversampling on the sub-sample was likely to overcome overfitting problems that might occur. Experiments were carried out by comparing the performance of the IRS-BAG model with various previous oversampling methods using the imbalanced public dataset. The experiment results using three different classifiers showed that all classifiers gained a notable improvement when combined with the proposed IRS-BAG model compared with the previous state-of-the-art oversampling methods. DOI: 10.28991/ESJ-2023-07-05-04
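The pipeline this abstract outlines (bootstrap sub-samples, oversampling inside each sub-sample, then an ensemble vote) can be sketched as follows. This is only an illustration under stated assumptions: it uses plain SMOTE-style interpolation rather than Radius-SMOTE, a 1-nearest-neighbor base learner chosen for brevity, a binary problem, and hypothetical names and toy data throughout.

```python
import math
import random
from collections import Counter

def smote_like(minority, n_new, k, rng):
    """Synthesise n_new points by interpolating between a minority point and
    one of its k nearest minority neighbours (plain SMOTE idea, not Radius-SMOTE)."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(p, base))[:k]
        if not neighbours:            # a single distinct minority point: reuse it
            synthetic.append(base)
            continue
        other = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

def balanced_bootstrap(data, minority_label, k, rng):
    """Draw a bootstrap sample, then oversample the minority class inside it."""
    bag = [rng.choice(data) for _ in data]
    minority = [x for x, y in bag if y == minority_label]
    deficit = len(bag) - 2 * len(minority)   # majority count minus minority count
    if minority and deficit > 0:
        bag += [(x, minority_label) for x in smote_like(minority, deficit, k, rng)]
    return bag

def predict(bags, x):
    """Majority vote over one 1-NN classifier per balanced bag."""
    votes = Counter(min(bag, key=lambda s: math.dist(s[0], x))[1] for bag in bags)
    return votes.most_common(1)[0][0]

# Hypothetical imbalanced toy data: 12 majority points (0), 3 minority points (1).
majority = [((0.1 * i, 0.2 * (i % 3)), 0) for i in range(12)]
minority = [((5.0, 5.0), 1), ((5.5, 4.8), 1), ((4.7, 5.3), 1)]
rng = random.Random(0)
bags = [balanced_bootstrap(majority + minority, 1, k=2, rng=rng) for _ in range(11)]
```

Oversampling after bootstrapping means every base learner sees a different set of synthetic points, which is the source of diversity the abstract credits with reducing overfitting.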

Emerging Science Journal (ESJ)