
    On the evolutionary weighting of neighbours and features in the k-nearest neighbour rule

    This paper presents an evolutionary method, called Simultaneous Weighting of Attributes and Neighbours (SWAN), for modifying the behaviour of the k-Nearest-Neighbour classifier (kNN). Unlike other weighting methods, SWAN can adjust both the contribution of the neighbours and the significance of the features of the data. The optimization process searches for two real-valued vectors: one represents the votes of the neighbours, and the other the weight of each feature. The synergy between the two sets of weights found during optimization significantly improves the classification accuracy. Results on 35 datasets from the UCI repository suggest that SWAN statistically outperforms the other weighted kNN methods
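    As a sketch of the idea (not the paper's actual evolutionary algorithm), a kNN rule that combines per-feature weights in the distance with per-neighbour vote weights could look as follows; the data and weight values are purely illustrative:

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x, feature_w, vote_w):
    """Classify x with a kNN rule that weights both features and neighbour votes.

    feature_w scales each feature inside the distance computation; vote_w[i] is
    the vote assigned to the i-th nearest neighbour (k = len(vote_w)).
    """
    # Weighted Euclidean distance to every training example
    diffs = (X_train - x) * feature_w
    dists = np.sqrt((diffs ** 2).sum(axis=1))
    order = np.argsort(dists)[:len(vote_w)]
    # Accumulate each neighbour's weighted vote per class
    votes = {}
    for w, idx in zip(vote_w, order):
        votes[y_train[idx]] = votes.get(y_train[idx], 0.0) + w
    return max(votes, key=votes.get)

X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
pred = weighted_knn_predict(X, y, np.array([0.2, 0.1]),
                            feature_w=np.array([1.0, 0.5]),
                            vote_w=np.array([0.5, 0.3, 0.2]))
# the two class-0 points are nearest, so pred is 0
```

    In SWAN, both `feature_w` and `vote_w` would be produced jointly by the evolutionary search rather than set by hand.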

    An evolutionary voting for k-nearest neighbours

    This work presents an evolutionary approach, called EvoNN, to modify the voting system of the k-nearest neighbours (kNN) rule. Our approach produces a real-valued vector that provides the optimal relative contribution of the k nearest neighbours. We compare two versions of our algorithm. The first (EvoNN1) introduces a constraint on the resulting real-valued vector, assigning the greatest value to the nearest neighbour. The second (EvoNN2) does not impose any particular constraint on the order of the weights. We compare both versions with the classical kNN and 4 other weighted variants of kNN on 48 datasets from the UCI repository. Results show that EvoNN1 outperforms EvoNN2 and statistically obtains better results than the rest of the compared methods
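    A minimal sketch of how such vote weights could be evolved, assuming a simple mutate-and-keep-the-best search in place of the paper's actual evolutionary algorithm; `loo_accuracy`, the mutation scale, and the population size are illustrative, and the descending sort mirrors the EvoNN1 ordering constraint:

```python
import numpy as np

rng = np.random.default_rng(0)

def loo_accuracy(X, y, vote_w):
    """Leave-one-out accuracy of a kNN rule with the given neighbour vote weights."""
    k = len(vote_w)
    correct = 0
    for i in range(len(X)):
        dists = np.linalg.norm(X - X[i], axis=1)
        dists[i] = np.inf                      # exclude the test point itself
        order = np.argsort(dists)[:k]
        votes = {}
        for w, j in zip(vote_w, order):
            votes[y[j]] = votes.get(y[j], 0.0) + w
        correct += (max(votes, key=votes.get) == y[i])
    return correct / len(X)

def evolve_vote_weights(X, y, k=3, generations=30, pop=10):
    # Start from uniform weights; mutate and keep any candidate at least as fit
    best = np.ones(k) / k
    best_fit = loo_accuracy(X, y, best)
    for _ in range(generations):
        for _ in range(pop):
            cand = np.abs(best + rng.normal(0, 0.1, size=k))
            cand = np.sort(cand)[::-1]         # EvoNN1-style: nearest gets most weight
            cand /= cand.sum()
            fit = loo_accuracy(X, y, cand)
            if fit >= best_fit:
                best, best_fit = cand, fit
    return best
```

    Dropping the descending sort yields the unconstrained EvoNN2 variant.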

    Improving the k-Nearest Neighbour Rule by an Evolutionary Voting Approach

    This work presents an evolutionary approach to modify the voting system of the k-Nearest Neighbours (kNN) rule. The main novelty of this article lies in optimizing the votes independently of the distance of each neighbour. The real-valued vector calculated through the evolutionary process can be seen as the relative contribution of each neighbour to selecting the label of an unclassified example. We have tested our approach on 30 datasets from the UCI repository and compared the results with those obtained from 6 other variants of the kNN predictor, yielding a statistically supported improvement

    On the evolutionary optimization of k-NN by label-dependent feature weighting

    Different approaches to feature weighting and k-value selection for improving the nearest neighbour technique can be found in the literature. In this work, we present an evolutionary approach called k-Label Dependent Evolutionary Distance Weighting (kLDEDW), which calculates a set of local weights for each class as well as an optimal k value. We thus carry out two improvements simultaneously: we locally transform the feature space to improve the accuracy of the k-nearest-neighbour rule, while searching the training data for the best value of k. Rigorous statistical tests demonstrate that our approach improves on the general k-nearest-neighbour rule and on several approaches based on local weighting
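    The core idea, distances computed with feature weights that depend on each training point's class, can be sketched as follows; `class_weights` is a hypothetical structure, and in kLDEDW its values (and k itself) would come from the evolutionary search rather than being fixed by hand:

```python
import numpy as np

def label_dependent_knn(X_train, y_train, x, class_weights, k):
    """kNN where the distance to a training point uses weights tied to its class.

    class_weights maps each label to a per-feature weight vector, so the
    feature space is transformed locally and differently for each class.
    """
    dists = np.empty(len(X_train))
    for i, (xi, yi) in enumerate(zip(X_train, y_train)):
        w = class_weights[yi]
        dists[i] = np.sqrt((((xi - x) * w) ** 2).sum())
    order = np.argsort(dists)[:k]
    # Plain majority vote among the k nearest neighbours
    labels, counts = np.unique(y_train[order], return_counts=True)
    return labels[np.argmax(counts)]
```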

    Inferring the rules of social interaction in migrating caribou

    Social interactions are a significant factor influencing the decision-making of species ranging from humans to bacteria. In the context of animal migration, social interactions may lead to improved decision-making, greater ability to respond to environmental cues, and the cultural transmission of optimal routes. Despite their significance, the precise nature of social interactions in migrating species remains largely unknown. Here we deploy unmanned aerial systems to collect aerial footage of caribou as they undertake their migration from Victoria Island to mainland Canada. Through a Bayesian analysis of trajectories we reveal the fine-scale interaction rules of migrating caribou and show that they are attracted to one another and copy the directional choices of neighbours, but do not interact through clearly defined metric or topological interaction ranges. By explicitly considering the role of social information in movement decisions we construct a map of near-neighbour influence that quantifies the nature of information flow in these herds. These results will inform more realistic, mechanism-based models of migration in caribou and other social ungulates, leading to better predictions of spatial use patterns and responses to changing environmental conditions. Moreover, we anticipate that the protocol we developed here will be broadly applicable to the study of social behaviour in a wide range of migratory and non-migratory taxa. This article is part of the theme issue ‘Collective movement ecology’

    A survey of outlier detection methodologies

    Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error, or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can also identify errors, removing their contaminating effect on the data set and thereby purifying the data for processing. The original outlier detection methods were arbitrary, but principled and systematic techniques are now used, drawn from the full gamut of computer science and statistics. In this paper, we present a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review
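    One classical distance-based criterion from the families such surveys cover can be sketched as follows (an illustration of the general idea, not a method taken from the survey itself): score each point by how far it lies from its nearest neighbours, so isolated points stand out.

```python
import numpy as np

def knn_outlier_scores(X, k=3):
    """Score each point by its mean distance to its k nearest neighbours.

    Large scores mark points far from any cluster -- a classic
    distance-based outlier criterion.
    """
    n = len(X)
    scores = np.empty(n)
    for i in range(n):
        dists = np.linalg.norm(X - X[i], axis=1)
        dists[i] = np.inf                 # a point is not its own neighbour
        scores[i] = np.sort(dists)[:k].mean()
    return scores

X = np.array([[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1], [5.0, 5.0]])
scores = knn_outlier_scores(X, k=2)
# the isolated point (5, 5) receives by far the largest score
```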

    fuzzy-rough-learn 0.1: a Python library for machine learning with fuzzy rough sets

    We present fuzzy-rough-learn, the first Python library of fuzzy rough set machine learning algorithms. It contains three algorithms previously implemented in R and Java, as well as two new algorithms from the recent literature. We briefly discuss the use cases of fuzzy-rough-learn and the design philosophy guiding its development, before providing an overview of the included algorithms and their parameters

    A unified weighting framework for evaluating nearest neighbour classification

    We present the first comprehensive and large-scale evaluation of classical (NN), fuzzy (FNN) and fuzzy rough (FRNN) nearest neighbour classification. We show that existing proposals for nearest neighbour weighting can be standardised in the form of kernel functions, applied to the distance values and/or ranks of the nearest neighbours of a test instance. Furthermore, we identify three commonly used distance functions and four scaling measures. We systematically evaluate these choices on a collection of 85 real-life classification datasets. We find that NN, FNN and FRNN all perform best with Boscovich distance. NN and FRNN perform best with a combination of Samworth rank- and distance-weights and scaling by the mean absolute deviation around the median (r_1), the standard deviation (r_2) or the interquartile range (r_∞^*), while FNN performs best with only Samworth distance-weights and r_1- or r_2-scaling. We also introduce a new kernel based on fuzzy Yager negation, and show that NN achieves comparable performance with Yager distance-weights, which are simpler to implement than a combination of Samworth distance- and rank-weights. Finally, we demonstrate that FRNN generally outperforms NN, which in turn performs systematically better than FNN
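    As a rough illustration of two of the ingredients evaluated here, the following sketch pairs r_1-style scaling (mean absolute deviation around the median) with a Yager-negation-shaped distance kernel, w = (1 - d^p)^(1/p) on distances rescaled into [0, 1]; the exact kernel definition and normalisation used in the paper may differ:

```python
import numpy as np

def mad_median_scale(X):
    """Scale features by the mean absolute deviation around the median
    (the r_1-style scaling measure)."""
    med = np.median(X, axis=0)
    mad = np.mean(np.abs(X - med), axis=0)
    return (X - med) / np.where(mad == 0, 1, mad)

def yager_weights(dists, p=2.0):
    """Turn neighbour distances into weights with a Yager-negation-shaped
    kernel: rescale distances into [0, 1], then apply (1 - d**p)**(1/p),
    so the nearest neighbour gets the largest weight."""
    d = dists / dists.max() if dists.max() > 0 else dists
    return (1 - d ** p) ** (1 / p)
```

    In the evaluated setting these weights would multiply neighbour votes, analogously to the distance-weighted kNN variants above.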