13,031 research outputs found

    A Survey on Soft Subspace Clustering

    Full text link
    Subspace clustering (SC) is a promising clustering technology to identify clusters based on their associations with subspaces in high dimensional spaces. SC can be classified into hard subspace clustering (HSC) and soft subspace clustering (SSC). While HSC algorithms have been extensively studied and well accepted by the scientific community, SSC algorithms are relatively new but gaining more attention in recent years due to better adaptability. In the paper, a comprehensive survey on existing SSC algorithms and the recent development are presented. The SSC algorithms are classified systematically into three main categories, namely, conventional SSC (CSSC), independent SSC (ISSC) and extended SSC (XSSC). The characteristics of these algorithms are highlighted and the potential future development of SSC is also discussed.Comment: This paper has been published in Information Sciences Journal in 201

    Automated screening of children with obstructive sleep apnea using nocturnal oximetry: An alternative to respiratory polygraphy in unattended settings

    Get PDF
    Producción CientíficaStudy Objectives: Nocturnal oximetry has emerged as a simple, readily available, and potentially useful diagnostic tool of childhood obstructive sleep apnea-hypopnea syndrome (OSAHS). However, at-home respiratory polygraphy (HRP) remains the preferred alternative to polysomnography (PSG) in unattended settings. The aim of this study was two-fold: (1) to design and assess a novel methodology for pediatric OSAHS screening based on automated analysis of at-home oxyhemoglobin saturation (SpO2), and (2) to compare its diagnostic performance with HRP. Methods: SpO2 recordings were parameterized by means of time, frequency, and conventional oximetric measures. Logistic regression (LR) models were optimized using genetic algorithms (GAs) for 3 cutoffs for OSAHS: 1, 3, and 5 events per hour (e/h). The diagnostic performance of LR models, manual obstructive apnea-hypopnea index (OAHI) from HRP, and the conventional oxygen desaturation index ≥3% (ODI3) were assessed. Results: For a cutoff of 1 e/h, the optimal LR model significantly outperformed both conventional HRP-derived ODI3 and OAHI: 85.5% Accuracy (HRP 74.6%; ODI3 65.9%) and 0.97 AUC (HRP 0.78; ODI3 0.75) were reached. For a cutoff of 3 e/h, the LR model achieved 83.4% Accuracy (HRP 85.0%; ODI3 74.5%) and 0.96 AUC (HRP 0.93; ODI3 0.85) whereas using a cutoff of 5 e/h, oximetry reached 82.8% Accuracy (HRP 85.1%; ODI3 76.7) and 0.97 AUC (HRP 0.95; ODI3 0.84). Conclusions: Automated analysis of at-home SpO2 recordings provide accurate detection of children with high pre-test probability of OSAHS. Thus, unsupervised nocturnal oximetry may enable a simple and effective alternative to HRP and PSG in unattended settings.This research has been partially supported by the project 153/2015 of the Sociedad Española de Neumología y Cirugía Torácica (SEPAR), the project RTC-2015-3446-1 from the Ministerio de Economía y Competitividad and the European Regional Development Fund (FEDER), and the project VA037U16 from the Consejería de Educación de la Junta de Castilla y León and FEDER. L. Kheirandish-Gozal is supported by NIH grant 1R01HL130984-01. D. Álvarez was in receipt of a Juan de la Cierva grant from the Ministerio de Economía y Competitividad

    Knowledge management overview of feature selection problem in high-dimensional financial data: Cooperative co-evolution and Map Reduce perspectives

    Get PDF
    The term big data characterizes the massive amounts of data generation by the advanced technologies in different domains using 4Vs volume, velocity, variety, and veracity-to indicate the amount of data that can only be processed via computationally intensive analysis, the speed of their creation, the different types of data, and their accuracy. High-dimensional financial data, such as time-series and space-Time data, contain a large number of features (variables) while having a small number of samples, which are used to measure various real-Time business situations for financial organizations. Such datasets are normally noisy, and complex correlations may exist between their features, and many domains, including financial, lack the al analytic tools to mine the data for knowledge discovery because of the high-dimensionality. Feature selection is an optimization problem to find a minimal subset of relevant features that maximizes the classification accuracy and reduces the computations. Traditional statistical-based feature selection approaches are not adequate to deal with the curse of dimensionality associated with big data. Cooperative co-evolution, a meta-heuristic algorithm and a divide-And-conquer approach, decomposes high-dimensional problems into smaller sub-problems. Further, MapReduce, a programming model, offers a ready-To-use distributed, scalable, and fault-Tolerant infrastructure for parallelizing the developed algorithm. This article presents a knowledge management overview of evolutionary feature selection approaches, state-of-The-Art cooperative co-evolution and MapReduce-based feature selection techniques, and future research directions

    The influence of mutation on population dynamics in multiobjective genetic programming

    Get PDF
    Using multiobjective genetic programming with a complexity objective to overcome tree bloat is usually very successful but can sometimes lead to undesirable collapse of the population to all single-node trees. In this paper we report a detailed examination of why and when collapse occurs. We have used different types of crossover and mutation operators (depth-fair and sub-tree), different evolutionary approaches (generational and steady-state), and different datasets (6-parity Boolean and a range of benchmark machine learning problems) to strengthen our conclusion. We conclude that mutation has a vital role in preventing population collapse by counterbalancing parsimony pressure and preserving population diversity. Also, mutation controls the size of the generated individuals which tends to dominate the time needed for fitness evaluation and therefore the whole evolutionary process. Further, the average size of the individuals in a GP population depends on the evolutionary approach employed. We also demonstrate that mutation has a wider role than merely culling single-node individuals from the population; even within a diversity-preserving algorithm such as SPEA2 mutation has a role in preserving diversity
    corecore