1,245 research outputs found

    Multilabel Prototype Generation for data reduction in K-Nearest Neighbour classification

    Prototype Generation (PG) methods are typically used to improve the efficiency of the k-Nearest Neighbour (kNN) classifier when tackling large corpora. Such approaches aim to generate a reduced version of the corpus without decreasing classification performance relative to the initial set. Despite their wide application in multiclass scenarios, very few works have proposed PG methods for the multilabel space. In this regard, this work presents a novel adaptation of four multiclass PG strategies to the multilabel case. These proposals are evaluated with three multilabel kNN-based classifiers, 12 corpora comprising a varied range of domains and corpus sizes, and different noise scenarios artificially induced in the data. The results show that the proposed adaptations significantly improve, in terms of both efficiency and classification performance, on the only reference multilabel PG work in the literature as well as on the case in which no PG method is applied, and that they are statistically more robust in noisy scenarios. Moreover, these novel PG strategies allow prioritising either efficiency or efficacy through their configuration depending on the target scenario, hence covering a wide area of the solution space not previously filled by other works. This research was partially funded by the Spanish Ministerio de Ciencia e Innovación through the MultiScore (PID2020-118447RA-I00) and DOREMI (TED2021-132103A-I00) projects. The first author is supported by grant APOSTD/2020/256 from “Programa I+D+i de la Generalitat Valenciana”.
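
    To make the idea of generating a reduced corpus for kNN concrete, here is a minimal sketch. It is not one of the four strategies the abstract refers to, which are not described here. It assumes a deliberately simple, hypothetical scheme: all samples sharing an identical label set are merged into one centroid prototype. The names `reduce_by_label_set` and `knn_predict` are illustrative only.

```python
import numpy as np

def reduce_by_label_set(X, Y):
    """Merge all samples that share an identical label set into one centroid.

    Hypothetical illustration of prototype generation; not the paper's method.
    X: (n_samples, n_features) features, Y: (n_samples, n_labels) binary labels.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=int)
    # Distinct label rows, plus the index of the matching row for every sample.
    label_sets, inverse = np.unique(Y, axis=0, return_inverse=True)
    inverse = inverse.ravel()  # the returned shape differs across NumPy versions
    prototypes = np.vstack(
        [X[inverse == g].mean(axis=0) for g in range(len(label_sets))]
    )
    return prototypes, label_sets

def knn_predict(prototypes, proto_labels, query, k=1):
    """Per-label majority vote among the k nearest prototypes (Euclidean)."""
    distances = np.linalg.norm(prototypes - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(distances)[:k]
    return (proto_labels[nearest].mean(axis=0) >= 0.5).astype(int)

# Five samples with two labels are reduced to three prototypes.
X = [[0, 0], [0, 2], [10, 10], [10, 12], [5, 5]]
Y = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1]]
prototypes, proto_labels = reduce_by_label_set(X, Y)
print(prototypes.shape)                                # (3, 2)
print(knn_predict(prototypes, proto_labels, [1, 1]))   # [1 0]
```

    The classifier then searches three prototypes instead of five samples. A practical PG method would also have to control how much accuracy is lost in the reduction, which this sketch does not attempt.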

    Multilabel Prototype Generation for Data Reduction in k-Nearest Neighbour classification

    Prototype Generation (PG) methods are typically used to improve the efficiency of the k-Nearest Neighbour (kNN) classifier when tackling large corpora. Such approaches aim to generate a reduced version of the corpus without decreasing classification performance relative to the initial set. Despite their wide application in multiclass scenarios, very few works have proposed PG methods for the multilabel space. In this regard, this work presents a novel adaptation of four multiclass PG strategies to the multilabel case. These proposals are evaluated with three multilabel kNN-based classifiers, 12 corpora comprising a varied range of domains and corpus sizes, and different noise scenarios artificially induced in the data. The results show that the proposed adaptations significantly improve, in terms of both efficiency and classification performance, on the only reference multilabel PG work in the literature as well as on the case in which no PG method is applied, and that they are statistically more robust in noisy scenarios. Moreover, these novel PG strategies allow prioritising either efficiency or efficacy through their configuration depending on the target scenario, hence covering a wide area of the solution space not previously filled by other works.

    A new fuzzy set merging technique using inclusion-based fuzzy clustering

    This paper proposes a new method of merging parameterized fuzzy sets based on clustering in the parameter space, taking into account the degree of inclusion of each fuzzy set in the cluster prototypes. The merging method is applied to fuzzy rule base simplification by automatically replacing the fuzzy sets corresponding to a given cluster with the one pertaining to the cluster prototype. The feasibility and performance of the proposed method are studied using an application in mobile robot navigation. The results indicate that the proposed merging and rule base simplification approach leads to good navigation performance in the application considered and to fuzzy models that are interpretable by experts. In this paper, we concentrate mainly on fuzzy systems with Gaussian membership functions, but the general approach can also be applied to other parameterized fuzzy sets.

    Improving the family orientation process in Cuban Special Schools through Nearest Prototype classification

    Cuban Schools for children with Affective-Behavioral Maladies (SABM) aim to bring about a major change in children's behavior so that they can be integrated effectively into society. One of the key elements of this objective is to give adequate orientation to the children's families, because the family is one of the most important educational contexts in which children develop their personality. The family orientation process in SABM involves clustering and classification of mixed-type data with non-symmetric similarity functions. To improve this process, this paper introduces some novel characteristics in clustering and prototype selection. The proposed approach uses hierarchical clustering based on compact sets, making it suitable for dealing with non-symmetric similarity functions, as well as with mixed and incomplete data. The proposal obtains very good results on the SABM data and on repository databases.

    Evolving Spiking Neural Networks for online learning over drifting data streams

    Nowadays huge volumes of data are produced in the form of fast streams, which are further affected by non-stationary phenomena. The resulting lack of stationarity in the distribution of the produced data calls for efficient and scalable algorithms for online analysis capable of adapting to such changes (concept drift). The online learning field has lately turned its focus to this challenging scenario by designing incremental learning algorithms that avoid becoming obsolete after a concept drift occurs. Despite the noted activity in the literature, the need for new efficient and scalable algorithms that adapt to drift remains a research topic deserving further effort. Surprisingly, Spiking Neural Networks, one of the major exponents of the third generation of artificial neural networks, have not been thoroughly studied as an online learning approach, even though they are naturally suited to adapting easily and quickly to changing environments. This work covers this research gap by adapting Spiking Neural Networks to meet the processing requirements that online learning scenarios impose. In particular, the work focuses on limiting the size of the neuron repository and making the most of this limited size by resorting to data reduction techniques. Experiments with synthetic and real data sets are discussed, leading to the empirically validated assertion that, by virtue of a tailored exploitation of the neuron repository, Spiking Neural Networks adapt better to drifts, obtaining higher accuracy scores than naive versions of Spiking Neural Networks for online learning environments. This work was supported by the EU project Pacific Atlantic Network for Technical Higher Education and Research (PANTHER, grant number 2013-5659/004-001 EMA2).