1,062 research outputs found

Learning from a Class Imbalanced Public Health Dataset: a Cost-based Comparison of Classifier Performance

Author: Makkithaya Krishnamoorthi
Rao Rohini R
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/08/2017

Public health care systems routinely collect health-related data from the population. These data can be analyzed using data mining techniques to find novel, interesting patterns, which could help formulate effective public health policies and interventions. Chronic illness is rare in the population, and the effect of this class imbalance on the performance of various classifiers was studied. The objective of this work is to identify the best classifiers for class-imbalanced health datasets through a cost-based comparison of classifier performance. The popular open-source data mining tool WEKA was used to build a variety of core classifiers as well as classifier ensembles and to evaluate the classifiers’ performance. The unequal misclassification costs were represented in a cost matrix, and a cost-benefit analysis was also performed. In another experiment, sampling methods such as under-sampling, over-sampling, and SMOTE were applied to balance the class distribution in the dataset, and the costs were compared. The Bayesian classifiers performed well, with high recall and a low number of false negatives, and were not affected by the class imbalance. Results confirm that the total cost of Bayesian classifiers can be further reduced using cost-sensitive learning methods. Classifiers built using the randomly under-sampled dataset showed a dramatic drop in costs and high classification accuracy.
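The cost-based comparison described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' WEKA setup: the confusion matrices, the 10:1 cost ratio, and the names `total_cost`, `accuracy`, `COSTS`, `CONF_A`, and `CONF_B` are all hypothetical. Rows are actual classes and columns are predicted classes, ordered (healthy, ill).

```python
# Hypothetical cost matrix: a false negative (ill predicted as healthy)
# is assumed to cost 10 units, a false positive 1 unit, correct calls 0.
COSTS = [[0, 1],
         [10, 0]]

# Hypothetical confusion matrices for two classifiers on 1,000 records.
CONF_A = [[900, 50],   # accurate overall, but misses 30 of 50 ill cases
          [30, 20]]
CONF_B = [[800, 150],  # more false alarms, but misses only 5 ill cases
          [5, 45]]


def total_cost(confusion, costs):
    """Sum of cell counts weighted by the matching misclassification cost."""
    return sum(
        confusion[i][j] * costs[i][j]
        for i in range(len(confusion))
        for j in range(len(confusion[i]))
    )


def accuracy(confusion):
    """Fraction of records on the diagonal of the confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


for name, conf in (("A", CONF_A), ("B", CONF_B)):
    print(name, "accuracy:", accuracy(conf), "total cost:", total_cost(conf, COSTS))
```

With these made-up numbers, classifier B is less accurate than A yet cheaper overall, which is the kind of reversal a cost-based ranking is meant to expose.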

IAES journal

Crossref

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Institute of Advanced Engineering and Science

Generation of Controlled Synthetic Samples and Impact of Hyper-Tuning Parameters to Effectively Classify the Complex Structure of Overlapping Region

Author: Aslam Muhammad
Badshah Afzal
Butt Naveed Anwer
Jilani Syeda Fizzah
Mahmood Zafar
Rehman Ghani Ur
Zubair Muhammad
Publication date: 22/08/2022

Aberystwyth Research Portal

Research Repository and Portal - University of the West of Scotland

Customer purchase behavior prediction in E-commerce: a conceptual framework and research agenda

Author: Bezbradica Marija
Cirqueira Douglas
Helfert Markus
Hofer Markus
Nedbal Dietmar
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 14/05/2020

Digital retailers are experiencing an increasing number of online transactions from their consumers, a consequence of the convenience of buying goods via E-commerce platforms. Such interactions compose complex behavioral patterns which can be analyzed through predictive analytics to enable businesses to understand consumer needs. Despite this abundance of big data and of possible tools to analyze it, a systematic review of the literature is missing. Therefore, this paper presents a systematic literature review of recent research dealing with customer purchase prediction in the E-commerce context. The main contributions are a novel analytical framework and a research agenda in the field. The framework reveals three main tasks in this review, namely, the prediction of customer intents, buying sessions, and purchase decisions. Those are followed by their employed predictive methodologies and are analyzed from three perspectives. Finally, the research agenda provides major existing issues for further research in the field of online purchase behavior prediction.

DCU Online Research Access Service

On the class overlap problem in imbalanced data classification.

Author: Elyan Eyad
Petrovski Andrei
Vuttipittayamongkol Pattaramon
Publication venue: 'Elsevier BV'
Publication date: 27/11/2020

Class imbalance is an active research area in the machine learning community. However, existing and recent literature has shown that class overlap has a higher negative impact on the performance of learning algorithms. This paper provides detailed critical discussion and objective evaluation of class overlap in the context of imbalanced data and its impact on classification accuracy. First, we present a thorough experimental comparison of class overlap and class imbalance. Unlike previous work, our experiment was carried out on the full scale of class overlap and an extreme range of class imbalance degrees. Second, we provide an in-depth critical technical review of existing approaches to handle imbalanced datasets. Existing solutions from selected literature are critically reviewed and categorised as class distribution-based and class overlap-based methods. Emerging techniques and the latest developments in this area are also discussed in detail. Experimental results in this paper are consistent with existing literature and show clearly that the performance of the learning algorithm deteriorates across varying degrees of class overlap, whereas class imbalance does not always have an effect. The review emphasises the need for further research towards handling class overlap in imbalanced datasets to effectively improve learning algorithms’ performance.

Open Access Institutional Repository at Robert Gordon University

Enhancing multi-class classification in FARC-HD fuzzy classifier: on the synergy between n-dimensional overlap functions and decomposition strategies

Author: Barrenechea Tartas Edurne
Bustince Sola Humberto
Elkano Ilintxeta Mikel
Fernández Alberto
Galar Idoate Mikel
Herrera Francisco
Sanz Delgado José Antonio
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

There are many real-world classification problems involving multiple classes, e.g., in bioinformatics, computer vision or medicine. These problems are generally more difficult than their binary counterparts. In this scenario, decomposition strategies usually improve the performance of classifiers. Hence, in this paper we aim to improve the behaviour of the FARC-HD fuzzy classifier in multi-class classification problems using decomposition strategies, and more specifically One-vs-One (OVO) and One-vs-All (OVA) strategies. However, when these strategies are applied to FARC-HD a problem emerges due to the low confidence values provided by the fuzzy reasoning method. This undesirable condition comes from the application of the product t-norm when computing the matching and association degrees, obtaining low values, which are also dependent on the number of antecedents of the fuzzy rules. As a result, robust aggregation strategies in OVO such as the weighted voting obtain poor results with this fuzzy classifier. In order to solve these problems, we propose to adapt the inference system of FARC-HD replacing the product t-norm with overlap functions. To do so, we define n-dimensional overlap functions. The usage of these new functions allows one to obtain more adequate outputs from the base classifiers for the subsequent aggregation in OVO and OVA schemes. Furthermore, we propose a new aggregation strategy for OVO to deal with the problem of the weighted voting derived from the inappropriate confidences provided by FARC-HD for this aggregation method. The quality of our new approach is analyzed using twenty datasets and the conclusions are supported by a proper statistical analysis. In order to check the usefulness of our proposal, we carry out a comparison against some of the state-of-the-art fuzzy classifiers. Experimental results show the competitiveness of our method.

This work was supported in part by the Spanish Ministry of Science and Technology under projects TIN2011-28488, TIN-2012-33856 and TIN-2013-40765-P and the Andalusian Research Plan P10-TIC-6858 and P11-TIC-7765.
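The dependence of the product t-norm on the number of antecedents, mentioned in this abstract, can be illustrated with a small sketch. The geometric mean is used here as one example of an n-dimensional overlap function; that choice, the membership value 0.8, and the function names are assumptions for illustration, and the abstract does not say which overlap functions the authors evaluate.

```python
import math


def product_tnorm(degrees):
    """Matching degree as the product of the antecedent membership degrees."""
    return math.prod(degrees)


def geometric_mean_overlap(degrees):
    """Geometric mean: 0 if any degree is 0, 1 only if all degrees are 1,
    symmetric, increasing and continuous on [0, 1]^n."""
    return math.prod(degrees) ** (1.0 / len(degrees))


# Same per-antecedent membership (0.8), rules of increasing length.
for n in (1, 3, 5):
    degrees = [0.8] * n
    print(n, round(product_tnorm(degrees), 4), round(geometric_mean_overlap(degrees), 4))
```

The product shrinks as antecedents are added even though each one matches equally well, while the geometric mean stays at the per-antecedent level, so confidences from rules of different lengths remain comparable when aggregated.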

Academica-e

Introducing artificial data generation in active learning for land use/land cover classification

Author: Bacao Fernando
Douzas Georgios
Fonseca Joao
Publication venue: 'MDPI AG'
Publication date: 01/07/2021

Fonseca, J., Douzas, G., & Bacao, F. (2021). Increasing the effectiveness of active learning: Introducing artificial data generation in active learning for land use/land cover classification. Remote Sensing, 13(13), 1-20. [2619]. https://doi.org/10.3390/rs13132619

In remote sensing, Active Learning (AL) has become an important technique to collect informative ground truth data “on-demand” for supervised classification tasks. Despite its effectiveness, it is still significantly reliant on user interaction, which makes it both expensive and time consuming to implement. Most of the current literature focuses on the optimization of AL by modifying the selection criteria and the classifiers used. Although improvements in these areas will result in more effective data collection, the use of artificial data sources to reduce human–computer interaction remains unexplored. In this paper, we introduce a new component to the typical AL framework, the data generator, a source of artificial data to reduce the amount of user-labeled data required in AL. The implementation of the proposed AL framework is done using Geometric SMOTE as the data generator. We compare the new AL framework to the original one using similar acquisition functions and classifiers over three AL-specific performance metrics in seven benchmark datasets. We show that this modification of the AL framework significantly reduces cost and time requirements for a successful AL implementation in all of the datasets used in the experiment.
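A rough sketch of the data generator step follows. It is not Geometric SMOTE: as a simplified stand-in it uses plain linear interpolation between two labeled points of the same class, and the function names, class labels, and feature values are hypothetical.

```python
import random


def interpolate_samples(points, n_new, rng):
    """Create n_new synthetic points, each on the segment between two
    distinct, randomly chosen labeled points of the same class."""
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(points, 2)
        t = rng.random()
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return synthetic


def augment_labeled_set(labeled, n_new_per_class, seed=0):
    """Return a copy of the labeled set with synthetic points added to
    every class that has at least two labeled points."""
    rng = random.Random(seed)
    augmented = {}
    for label, points in labeled.items():
        extra = interpolate_samples(points, n_new_per_class, rng) if len(points) >= 2 else []
        augmented[label] = list(points) + extra
    return augmented


# Hypothetical user-labeled pixels (two spectral features per pixel).
labeled = {"water": [(0.0, 0.0), (1.0, 1.0)], "forest": [(5.0, 5.0)]}
augmented = augment_labeled_set(labeled, n_new_per_class=3)
print({label: len(points) for label, points in augmented.items()})
```

In an AL iteration, a step like this would run after the user labels the selected samples and before the classifier is retrained, so the classifier sees more training points than the user actually labeled.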

Directory of Open Access Journals

Repositório da Universidade Nova de Lisboa

Ensemble classifier of long short-term memory with fuzzy temporal windows on binary sensors for activity recognition

Author: Alcalá
Banos
Batista
Belley
Bishop
Carnevali
Catal
Chawla
Chen
Chris Nugent
Collins
Cook
De Pietro
Dietterich
Donahue
Espinilla
Galar
Gu
Guo
Hochreiter
Huynh
Japkowicz
Javier Medina-Quero
Koutitas
Krishnan
Krüger
Lester
Logan
M. Espinilla
Martin
Mazurowski
Medina
Möller-Acuña
Ordónez
Ordóñez
Ortiz
Patterson
Rege
Reyes-Ortiz
Robertson
Roggen
Rossetti
Salimans
San Martín
Sanchez
Shahi
Shewell
Shuai Zhang
Singh
Singla
Sixsmith
Srivastava
Stikic
Storf
Tapia
van Kasteren
Van Kasteren
Wilson
Wu
Yala
Yan
Yin
Zadeh
Publication venue: 'Elsevier BV'
Publication date: 01/08/2018

Crossref

Ulster University's Research Portal