162 research outputs found

    On the evolutionary optimization of k-NN by label-dependent feature weighting

    Different approaches to feature weighting and k-value selection for improving the nearest neighbour technique can be found in the literature. In this work, we present an evolutionary approach called k-Label Dependent Evolutionary Distance Weighting (kLDEDW), which computes a set of local weights for each class together with an optimal k value. Thus, we attempt to carry out two improvements simultaneously: we locally transform the feature space to improve the accuracy of the k-nearest-neighbour rule while searching for the best value of k from the training data. Rigorous statistical tests demonstrate that our approach improves on the general k-nearest-neighbour rule and on several approaches based on local weighting.

    An evolutionary voting for k-nearest neighbours

    This work presents an evolutionary approach, which we call EvoNN, to modify the voting system of the k-nearest neighbours (kNN) rule. Our approach results in a real-valued vector which provides the optimal relative contribution of the k nearest neighbours. We compare two possible versions of our algorithm. One of them (EvoNN1) introduces a constraint on the resulting real-valued vector, where the greater value is assigned to the nearest neighbour. The second version (EvoNN2) does not include any particular constraint on the order of the weights. We compare both versions with classical kNN and four other weighted variants of kNN on 48 datasets from the UCI repository. Results show that EvoNN1 outperforms EvoNN2 and statistically obtains better results than the rest of the compared methods.

    An evolutionary-weighted majority voting and support vector machines applied to contextual classification of LiDAR and imagery data fusion

    Data classification is a critical step in converting remotely sensed data into thematic information. Environmental researchers have recently maximized the synergy between passive sensors and LiDAR (Light Detection and Ranging) for land cover classification by means of machine learning. Although the object-based paradigm is frequently used to classify high-resolution imagery, it often requires a high level of expertise and time. Contextual classification may lead to similar results with a decrease in time and costs for non-expert users. This work presents a novel contextual classifier based on a Support Vector Machine (SVM) and an Evolutionary Majority Voting (SVM–EMV) to develop thematic maps from LiDAR and imagery data. The performance of SVM–EMV is then compared to that achieved by a pixel-based SVM as well as to a contextual classifier based on SVM and MRF. The classifiers were tested over three different areas of Spain with well-differentiated environmental characteristics. Results show that SVM–EMV statistically outperforms the rest (SVM, SVM–MRF) on the three datasets, obtaining global accuracies of 77%, 91% and 92% for Trabada, Huelva and Alto Tajo, respectively. Xunta de Galicia CSO2010-15807; Ministerio de Ciencia y Tecnología TIN2011-28956-C02; Junta de Andalucia P11-TIC-752

    On the evolutionary weighting of neighbours and features in the k-nearest neighbour rule

    This paper presents an evolutionary method for modifying the behaviour of the k-Nearest-Neighbour classifier (kNN) called Simultaneous Weighting of Attributes and Neighbours (SWAN). Unlike other weighting methods, SWAN can adjust both the contribution of the neighbours and the significance of the features of the data. The optimization process focuses on the search for two real-valued vectors. One of them represents the votes of the neighbours, and the other represents the weight of each feature. The synergy between the two sets of weights found in the optimization process helps to significantly improve the classification accuracy. The results on 35 datasets from the UCI repository suggest that SWAN statistically outperforms the other weighted kNN methods.
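
As a rough illustration of how two such vectors could enter a kNN prediction, the sketch below applies one weight per feature inside the distance and one weight per neighbour rank in the vote. It is a minimal sketch under stated assumptions, not the published method: the function name `weighted_knn_predict`, the weighted Euclidean distance, and the rank-based voting are all hypothetical choices, and the evolutionary search that would produce the two vectors is not shown.

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x, feature_w, vote_w):
    """Classify x with a kNN rule using feature weights and neighbour vote weights.

    Hypothetical helper: feature_w has one entry per feature and
    vote_w has one entry per neighbour rank (so k = len(vote_w)).
    """
    X_train = np.asarray(X_train, dtype=float)
    x = np.asarray(x, dtype=float)
    feature_w = np.asarray(feature_w, dtype=float)
    k = len(vote_w)

    # Weighted Euclidean distance: each squared difference is scaled
    # by the weight of its feature (an assumed distance form).
    diffs = X_train - x
    dists = np.sqrt(((diffs ** 2) * feature_w).sum(axis=1))

    # Indices of the k nearest training samples, nearest first.
    # A stable sort keeps ties in training-set order.
    nearest = np.argsort(dists, kind="stable")[:k]

    # The neighbour at rank r adds vote_w[r] to the score of its label.
    scores = {}
    for rank, idx in enumerate(nearest):
        label = y_train[idx]
        scores[label] = scores.get(label, 0.0) + vote_w[rank]

    # Predict the label with the largest accumulated vote.
    return max(scores, key=scores.get)
```

With all weights set to 1 this reduces to the classical majority-vote kNN rule with Euclidean distance; setting a feature weight to 0 removes that feature from the distance entirely.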

    A Preliminary Study of the Suitability of Deep Learning to Improve LiDAR-Derived Biomass Estimation

    Light Detection and Ranging (LiDAR) is a remote sensor able to extract three-dimensional information about forest structure. Biophysical models have taken advantage of LiDAR-derived information to improve their accuracy. Multiple Linear Regression (MLR) is the most common method in the biomass estimation literature for defining the relation between the set of field measurements and the statistics extracted from a LiDAR flight. Unfortunately, there are open issues regarding the generalization of models from one area to another due to the lack of knowledge about noise distribution, the relationship between statistical features, and the risk of overfitting. Autoencoders (a type of deep neural network) have recently been applied to improve the results of machine learning techniques by undoing possible data corruption processes and improving feature selection. This paper presents a preliminary comparison between the use of MLR with and without preprocessing by autoencoders on real LiDAR data from two areas in the province of Lugo (Galicia, Spain). The results show that autoencoders statistically increased the quality of MLR estimations by around 15–30%.

    Support vector regression in NIST SRE 2008 multichannel core task

    Actas de las V Jornadas en Tecnología del Habla (JTH 2008). This paper explores two alternatives for speaker verification using the Generalized Linear Discriminant Sequence (GLDS) kernel: classical Support Vector Classification (SVC), and Support Vector Regression (SVR), recently proposed by the authors as a more robust approach for telephone speech. In this work we address a more challenging environment, the NIST SRE 2008 multichannel core task, where strong mismatch is introduced by the use of different microphones and recordings from interviews. Channel compensation based on Nuisance Attribute Projection (NAP) has also been investigated in order to analyze its impact on both approaches. Experiments show that, although both techniques show a significant improvement over SVC-GLDS when NAP is used, SVR is also robust to channel mismatch even when channel compensation is not used. This avoids the need for a considerable set of training data adapted to the operational scenario, which is not often available. Results show a similar performance for SVR-GLDS without NAP and SVC-GLDS with NAP. Moreover, SVR-GLDS results are promising, since other configurations and methods for channel compensation can further improve performance. This work has been supported by the Spanish Ministry of Education under project TEC2006-13170-C02-01

    Evolutionary segmentation of yeast genome

    Segmentation algorithms differ from clustering algorithms in how they deal with the physical location of genes along the sequence. Segments have to keep the original positions of consecutive genes, which is not a constraint for clustering algorithms. It has been shown that functional relations exist among neighbouring genes, so locating the boundaries between these functionally similar groups of genes has become an important challenge. In this paper, we present an evolutionary algorithm to segment the yeast genome.

    Support vector machine regression for robust speaker verification in mismatching and forensic conditions

    The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-01793-3_50 Proceedings of Third International Conference, ICB 2009, Alghero, Italy. In this paper we propose the use of Support Vector Machine Regression (SVR) for robust speaker verification in two scenarios: i) strong mismatch in speech conditions and ii) forensic environment. The proposed approach seeks robustness to situations where a proper background database is reduced or not present, a situation typical in forensic cases which has been called database mismatch. For the mismatching condition scenario, we use the NIST SRE 2008 core task as a highly variable environment, but with a mostly representative background set coming from past NIST evaluations. For the forensic scenario, we use the Ahumada III database, a public corpus in Spanish coming from real authored forensic cases collected by Spanish Guardia Civil. We show experiments illustrating the robustness of an SVR scheme using a GLDS kernel under strong session variability, even when no session variability is applied, and especially in the forensic scenario, under database mismatch. This work has been supported by the Spanish Ministry of Education under project TEC2006-13170-C02-0

    Improving data access performance in the Grid = Incremento de prestaciones en el acceso en Grid de datos

    Papers from the Decimosextas Jornadas de Paralelismo (Sixteenth Conference on Parallelism), held 13 to 16 September 2005 in Granada. The Grid computing model has evolved in recent years to provide a high-performance computing environment over wide-area networks. However, one of its biggest problems lies in applications that make intensive, large-scale use of data. Replication has been used as a solution to the problems of these applications. However, classical replication suffers from certain problems, such as adaptability and the high latency of the new environment. We therefore propose a new data replication and organization algorithm that provides high-performance access in a Data Grid. Publicad

    How is educational legislation mirrored in textbooks? A mixed-method study in relation to the minimum contents of music education in Early Childhood Education (3-6 years) = ¿Cómo se refleja la legislación educativa en los libros de texto? Un estudio de metodología mixta en relación a los contenidos mínimos de educación musical en el segundo ciclo (3-6 años) de educación infantil

    Although there are many works dedicated to the study of either educational legislation or textbooks, only a few are devoted to their joint research. In the present study, we aimed to investigate how the national regulations regarding the mandatory music-related contents in Early Childhood Education for ages 3 to 6 are reflected in textbooks in Spain. By means of a multistage mixed methodology, including analyses based on grounded theory, content analysis, as well as cluster and variance analyses, our results provide evidence for: 1) a modern music education philosophy underlying the national Spanish regulations, 2) discrepancies between this philosophy and its development in textbooks, and 3) the existence of different publisher profiles regarding the treatment of contents in music. Finally, we discuss the implications of our results regarding the use of textbooks at this educational stage and in relation to music education.