
    k-Nearest Neighbour Classifiers: 2nd Edition (with Python examples)

    Perhaps the most straightforward classifier in the arsenal of machine learning techniques is the Nearest Neighbour Classifier: classification is achieved by identifying the nearest neighbours to a query example and using those neighbours to determine the class of the query. This approach to classification is of particular importance because poor run-time performance is no longer such a problem, given the computational power now available. This paper presents an overview of techniques for Nearest Neighbour classification, focusing on mechanisms for assessing similarity (distance), computational issues in identifying nearest neighbours, and mechanisms for reducing the dimension of the data. This paper is the second edition of a paper previously published as a technical report. Sections on similarity measures for time-series, retrieval speed-up and intrinsic dimensionality have been added. An Appendix is included providing access to Python code for the key methods. Comment: 22 pages, 15 figures: An updated edition of an older tutorial on kN
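    The core idea in this abstract (find the nearest neighbours of a query, then let them determine its class) can be illustrated with a minimal sketch. This is not the code from the paper's appendix; the function name `knn_predict`, the Euclidean metric, the majority vote and the toy data are all assumptions made for illustration.

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points nearest to `query`.

    Hypothetical helper: Euclidean distance and an unweighted vote are
    assumptions, not choices confirmed by the abstract.
    """
    # Pair every training label with its distance to the query, nearest first.
    by_distance = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Toy data (made up): two well-separated clusters labelled "a" and "b".
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
train_y = ["a", "a", "a", "b", "b", "b"]
prediction = knn_predict(train_X, train_y, (1.1, 1.0), k=3)
```

    An odd `k` avoids tied votes in two-class problems; distance-weighted voting is a common alternative.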

    A Nearest Neighbours-Based Algorithm for Big Time Series Data Forecasting

    A forecasting algorithm for big data time series is presented in this work. A nearest neighbours-based strategy is adopted as the core of the algorithm. A detailed explanation of how to adapt and implement the algorithm to handle big data is provided. Although some parts remain iterative, and consequently require an enhanced implementation, execution times are considered satisfactory. The performance of the proposed approach has been tested on real-world data related to electricity consumption from a public Spanish university, using a Spark cluster. Funding: Ministerio de Economía y Competitividad TIN2014-55894-C2-R; Junta de Andalucía P12-TIC-1728; Centro de Estudios Andaluces PRY153/14; Universidad Pablo de Olavide APPB81309

    k-Nearest Neighbour Classifiers - A Tutorial

    Perhaps the most straightforward classifier in the arsenal of Machine Learning techniques is the Nearest Neighbour Classifier – classification is achieved by identifying the nearest neighbours to a query example and using those neighbours to determine the class of the query. This approach to classification is of particular importance because poor run-time performance is no longer such a problem, given the computational power now available. This paper presents an overview of techniques for Nearest Neighbour classification, focusing on mechanisms for assessing similarity (distance), computational issues in identifying nearest neighbours, and mechanisms for reducing the dimension of the data. This paper is the second edition of a paper previously published as a technical report. Sections on similarity measures for time-series, retrieval speed-up and intrinsic dimensionality have been added. An Appendix is included providing access to Python code for the key methods

    Further Empirical Examination of an Improved Sales Comparison Approach

    Despite the significant advances in applying regression analysis to property valuation, the main features of the sales comparison approach lack thorough research. A series of works have endeavoured to retain the essence of the sales comparison approach while taking advantage of regressions to derive not only the implicit values of property attributes, but also the degree of similarity between properties. Despite these improvements, the determination of the best regression forms and the piecemeal type of price adjustment remain vexing problems. The nearest neighbours method assumes that the effects of all attribute differences between the subject and comparable properties are captured by the Mahalanobis distance. The indicated market value of the subject property is simply a weighted average of the actual selling prices of the comparable properties. This method sidesteps the above difficulties and seems worth employing. The present study extends the application of the nearest neighbours method to high-density residential properties, which have not previously been examined. In terms of both the average and the coefficient of variation of prediction errors, neither the conventional regression nor the nearest neighbours method outperforms the other. Nevertheless, the distribution of the accumulated prediction errors suggests that the nearest neighbours method is superior to the regression analysis approach. Our empirical findings are, therefore, in favour of further pursuit of small-sample (comparables) methods

    An improved k-nearest neighbours method for traffic time series imputation


    Sit-to-Stand Movement Recognition Using Kinect

    This paper examines the application of machine-learning techniques to human movement data in order to recognise and compare movements made by different people. Data from an experimental set-up using a sit-to-stand movement are first collected using the Microsoft Kinect input sensor, then normalized and subsequently compared using the assigned labels for correct and incorrect movements. We show that attributes can be extracted from the time series produced by the Kinect sensor using a dynamic time-warping technique. The extracted attributes are then fed to a random forest algorithm to recognise anomalous behaviour in time series of joint measurements over the whole movement. For comparison, the k-Nearest Neighbours algorithm is also applied to the same attributes, with good results. The results of both methods are compared using Multi-Dimensional Scaling for clustering visualisation
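    The abstract does not say how attributes are derived from dynamic time warping, so the following is only a sketch of the textbook DTW distance between two one-dimensional series. The name `dtw_distance` and the absolute-difference local cost are assumptions.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two 1-D sequences.

    Assumption: local cost is the absolute difference, with no window
    constraint. This is not necessarily the variant used in the paper.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: advancing in a only, advancing in b only,
            # or advancing in both sequences together.
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# The same shape performed at different speeds aligns with zero cost.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

    A distance of this kind could serve as the similarity measure for a k-Nearest Neighbours comparison of movement recordings of different lengths.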

    VISUAL ANALYSIS OF RECURRENCE OF TIME SERIES OF THE COORDINATES ENU IN THE GPS STATIONS

    Time series contain information about the dynamic behavior of the system under study. This behavior can be complex, irregular and nonlinear. For this reason, it is necessary to study new models that can describe these dynamics more satisfactorily. In this work, a visual recurrence analysis of the time series of ENU (East, North, Up) coordinate variations is carried out. The series were obtained from nine continuous GPS (Global Positioning System) monitoring stations, with the aim of studying their behavior. The stations belong to the Ecuadorian GPS Network that materializes the reference system SIRGAS – ECUADOR. Noise in the observations was reduced using digital low-pass filters with Finite Impulse Response (FIR). For these series, the time delay was determined using the average mutual information, and the minimum embedding dimension was determined using the False Nearest Neighbours (FNN) method, in order to obtain the recurrence maps of each coordinate. The results of the visual analysis show a strong trend, especially in the East and North coordinates, while the Up coordinates indicate discontinuous, symmetric and periodic behavior

    Forecasting of process disturbances using k-nearest neighbours, with an application in process control

    This paper examines the prediction of disturbances based on their past measurements using k-nearest neighbours. The aim is to provide a prediction of a measured disturbance to a controller, in order to improve the feed-forward action. This prediction method works in an unsupervised way, is robust against changes in the characteristics of the disturbance, and is simple and transparent in its functioning. The method is tested on data from industrial process plants and compared with predictions from an autoregressive model. A qualitative as well as a quantitative method for analysing the predictability of the time series is provided. As an example, the method is implemented in an MPC framework to control a simple benchmark model
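    One common way to predict a series from its own past with k-nearest neighbours is to compare the most recent window of measurements with earlier windows and average what followed the closest matches. The sketch below illustrates that general idea only; the abstract does not give the paper's exact scheme, and the name `knn_forecast`, the window length, the Euclidean metric and the unweighted mean are assumptions.

```python
def knn_forecast(series, window=3, k=2):
    """One-step-ahead forecast from the k past windows most similar to the latest.

    Hypothetical helper; window length, Euclidean distance and the plain
    average of successors are illustrative assumptions.
    """
    if len(series) < window + 1:
        raise ValueError("series must be longer than the window")
    latest = series[-window:]
    candidates = []
    # Only windows followed by a known value qualify, so the latest window
    # itself is excluded by the range bound.
    for start in range(len(series) - window):
        past = series[start:start + window]
        successor = series[start + window]
        dist = sum((p - q) ** 2 for p, q in zip(past, latest)) ** 0.5
        candidates.append((dist, successor))
    # Keep the k closest past windows and average the values that followed them.
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    return sum(s for _, s in nearest) / len(nearest)


# Made-up periodic signal: the pattern 1, 2, 3 repeats, so 3 should come next.
signal = [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2]
next_value = knn_forecast(signal, window=3, k=2)
```

    No model is fitted in this scheme, which is consistent with the simplicity and transparency the abstract attributes to the method.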

    Neural networks and non-parametric methods for improving real-time flood forecasting through conceptual hydrological models

    Time-series analysis techniques for improving the real-time flood forecasts issued by a deterministic lumped rainfall-runoff model are presented. Such techniques are applied for forecasting the short-term future rainfall to be used as real-time input in a rainfall-runoff model and for updating the discharge predictions provided by the model. Along with traditional linear stochastic models, both stationary (ARMA) and non-stationary (ARIMA), the application of non-linear time-series models is proposed, such as Artificial Neural Networks (ANNs) and the 'nearest-neighbours' method, which is a non-parametric regression methodology. For both rainfall forecasting and discharge updating, the implementation of each time-series technique is investigated and the forecasting schemes that perform best are identified. The performances of the models are then compared, and the achievable improvement in the efficiency of the discharge forecasts is demonstrated when i) short-term rainfall forecasting is performed, ii) the discharge is updated and iii) both rainfall forecasting and discharge updating are performed in cascade. The proposed techniques, especially those based on ANNs, allow a remarkable improvement in the discharge forecast, compared with the use of heuristic rainfall prediction approaches or the non-updated discharge forecasts given by the deterministic rainfall-runoff model alone