3 research outputs found

    A Variable Metric Probabilistic k-Nearest-Neighbours Classifier

    Copyright © 2004 Springer Verlag. The final publication is available at link.springer.com. 5th International Conference, Exeter, UK, August 25-27, 2004, Proceedings. Book title: Intelligent Data Engineering and Automated Learning – IDEAL 2004.

    The k-nearest-neighbour (k-nn) model is a simple, popular classifier. Probabilistic k-nn is a more powerful variant in which the model is cast in a Bayesian framework, using (reversible jump) Markov chain Monte Carlo methods to average out the uncertainty over the model parameters. The k-nn classifier depends crucially on the metric used to determine distances between data points. However, scalings between features, and indeed whether some subset of features is redundant, are seldom known a priori. Here we introduce a variable metric extension to the probabilistic k-nn classifier, which permits averaging over all rotations and scalings of the data. In addition, the method permits automatic rejection of irrelevant features. Examples are provided on synthetic data, illustrating how the method can deform feature space and select salient features, and also on real-world data.
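    As a concrete illustration of the scheme the abstract describes, the following is a minimal Python sketch of a variable-metric probabilistic k-nn: a leave-one-out vote likelihood is averaged over the neighbourhood size k and per-feature scalings by Metropolis sampling. It deliberately simplifies the paper's method (a diagonal metric rather than full rotations, plain Metropolis rather than reversible-jump MCMC), and every name, prior, and proposal here is an assumption for illustration.

        # Sketch only: diagonal-metric probabilistic k-nn with Metropolis
        # sampling over (k, per-feature log-scalings). An assumed
        # simplification of the paper's full-rotation, reversible-jump scheme.
        import numpy as np

        rng = np.random.default_rng(0)

        def loo_log_likelihood(X, y, k, log_scales):
            """Leave-one-out pseudo-likelihood of the labels under a k-nn
            vote, with features rescaled by exp(log_scales)."""
            Z = X * np.exp(log_scales)          # deform feature space
            d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
            np.fill_diagonal(d, np.inf)         # a point cannot vote for itself
            idx = np.argsort(d, axis=1)[:, :k]  # k nearest neighbours per point
            n_classes = y.max() + 1
            votes = np.stack([(y[idx] == c).sum(axis=1)
                              for c in range(n_classes)], axis=1)
            probs = (votes + 1.0) / (k + n_classes)   # smoothed vote fractions
            return np.log(probs[np.arange(len(y)), y]).sum()

        def sample_posterior(X, y, n_iter=2000, k_max=25):
            """Metropolis sampling over k and the log-scalings."""
            k, log_scales = 5, np.zeros(X.shape[1])
            ll = loo_log_likelihood(X, y, k, log_scales)
            samples = []
            for _ in range(n_iter):
                # symmetric random-walk proposals for k and the scalings
                k_new = int(np.clip(k + rng.integers(-2, 3), 1, k_max))
                ls_new = log_scales + 0.1 * rng.standard_normal(len(log_scales))
                ll_new = loo_log_likelihood(X, y, k_new, ls_new)
                # N(0, 1) prior on log-scalings; scalings sampled near zero
                # amount to rejecting the corresponding feature
                lp = 0.5 * (log_scales**2).sum() - 0.5 * (ls_new**2).sum()
                if np.log(rng.random()) < ll_new - ll + lp:
                    k, log_scales, ll = k_new, ls_new, ll_new
                samples.append((k, log_scales.copy()))
            return samples

    Averaging predictions over the retained samples, rather than committing to a single best (k, metric), is what makes the classifier probabilistic; features whose sampled scalings collapse towards zero are effectively rejected as irrelevant.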

    Definition of valid proteomic biomarkers: a Bayesian solution

    Clinical proteomics is suffering from high hopes generated by reports on apparent biomarkers, most of which could not later be substantiated via validation. This has brought into focus the need for improved methods of finding a panel of clearly defined biomarkers. To examine this problem, urinary proteome data were collected from healthy adult males and females and analysed to find biomarkers that differentiated between genders. We believe that models that incorporate sparsity in terms of variables are desirable for biomarker selection, as proteomics data typically contain a huge number of variables (peptides) and few samples, making the selection process potentially unstable. This suggests the application of a two-level hierarchical Bayesian probit regression model for variable selection, which assumes a prior that favours sparseness. The classification performance of this method is shown to improve on that of the Probabilistic K-Nearest Neighbour model.
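    As a rough illustration of such a model, the sketch below implements a two-level hierarchy in the spirit the abstract describes: Gaussian weights (level one) with Gamma-distributed precisions (level two), which marginally yields a heavy-tailed, sparsity-favouring prior, sampled with Albert-Chib latent-variable Gibbs steps for the probit likelihood. This is an assumed reconstruction of the general approach, not the authors' exact model; the hyperparameters and names are illustrative.

        # Sketch only: hierarchical sparse Bayesian probit regression via
        # Gibbs sampling (Albert-Chib augmentation + ARD-style precisions).
        import numpy as np
        from scipy.stats import truncnorm

        def sparse_probit_gibbs(X, y, n_iter=1000, a0=1e-2, b0=1e-2, seed=0):
            """X: (n, p) data matrix; y: (n,) labels in {0, 1}."""
            rng = np.random.default_rng(seed)
            n, p = X.shape
            beta = np.zeros(p)
            alpha = np.ones(p)            # per-weight precisions (level two)
            betas = []
            for _ in range(n_iter):
                # 1) latent utilities z_i ~ N(x_i . beta, 1), truncated so
                #    the sign of z_i agrees with the observed label
                mu = X @ beta
                lo = np.where(y == 1, -mu, -np.inf)
                hi = np.where(y == 1, np.inf, -mu)
                z = mu + truncnorm.rvs(lo, hi, random_state=rng)
                # 2) weights | z: Gaussian, shrunk by the current precisions
                V = np.linalg.inv(X.T @ X + np.diag(alpha))
                beta = rng.multivariate_normal(V @ (X.T @ z), V)
                # 3) precisions | weights: conjugate Gamma update; a large
                #    alpha_j drives beta_j to zero, pruning that peptide
                alpha = rng.gamma(a0 + 0.5, 1.0 / (b0 + 0.5 * beta**2))
                betas.append(beta.copy())
            return np.array(betas)

    Peptides whose sampled coefficients concentrate tightly around zero are effectively excluded, which is the sparse selection behaviour the abstract argues is needed when there are far more variables than samples.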

    On the use of diagonal and class-dependent weighted distances for the Probabilistic k-nearest neighbor

    A probabilistic k-nn (PKnn) method was introduced in [13] from a Bayesian point of view. That work showed that posterior inference over the parameter k can be performed in a relatively straightforward manner using Markov chain Monte Carlo (MCMC) methods. The method was extended by Everson and Fieldsend [14] to deal with metric learning. In this work we propose two different dissimilarity functions to be used inside this PKnn framework. These dissimilarity functions can be seen as simplified versions of the previously proposed full-covariance distance functions. Furthermore, we propose to use a class-dependent dissimilarity function, as proposed in [8], aimed at improving the k-nn classifier. In the present work we pursue the simultaneous learning of the dissimilarity function parameters together with the parameter k of the k-nn classifier. The experiments show that this simultaneous learning leads to an improvement of the classifier with respect to both the standard k-nn and state-of-the-art techniques. (A minimal sketch of the class-dependent dissimilarity follows the reference list below.)

    Work supported by the Spanish MEC/MICINN under the MIPRCV Consolider Ingenio 2010 program (CSD2007-00018).

    Paredes Palacios, R.; Girolami, M. (2011). On the use of diagonal and class-dependent weighted distances for the Probabilistic k-nearest neighbor. In: Pattern Recognition and Image Analysis. Springer Verlag (Germany). 6669:265-272. https://doi.org/10.1007/978-3-642-21257-4_33

    References:
    1. Short, R., Fukunaga, K.: A new nearest neighbor distance measure. In: Proceedings 5th IEEE Int. Conf. Pattern Recognition, Miami Beach, FL, pp. 81–86 (1980)
    2. Ricci, F., Avesani, P.: Data compression and local metrics for nearest neighbor classification. IEEE Transactions on Pattern Analysis and Machine Intelligence 21(4), 380–384 (1999)
    3. Paredes, R., Vidal, E.: A class-dependent weighted dissimilarity measure for nearest neighbor classification problems. Pattern Recognition Letters 21, 1027–1036 (2000)
    4. Domeniconi, C., Peng, J., Gunopulos, D.: Locally adaptive metric nearest neighbor classification. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(9), 1281–1285 (2002)
    5. de Ridder, D., Kouropteva, O., Okun, O., Pietikäinen, M., Duin, R.P.W.: Supervised locally linear embedding. In: Kaynak, O., Alpaydın, E., Oja, E., Xu, L. (eds.) ICANN 2003 and ICONIP 2003. LNCS, vol. 2714, pp. 333–341. Springer, Heidelberg (2003)
    6. Peng, J., Heisterkamp, D.R., Dai, H.: Adaptive quasiconformal kernel nearest neighbor classification. IEEE Transactions on Pattern Analysis and Machine Intelligence 26(5)
    7. de Ridder, D., Loog, M., Reinders, M.J.T.: Local Fisher embedding. In: Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004), vol. 2, pp. 295–298 (2004)
    8. Paredes, R., Vidal, E.: Learning weighted metrics to minimize nearest-neighbor classification error. IEEE Transactions on Pattern Analysis and Machine Intelligence 28(7), 1100–1111 (2006)
    9. Wilson, D.: Asymptotic properties of nearest neighbor rules using edited data. IEEE Trans. Syst., Man, Cybern. SMC-2, 408–421 (1972)
    10. Ferri, F., Albert, J., Vidal, E.: Considerations about sample-size sensitivity of a family of edited nearest-neighbor rules. IEEE Trans. Syst., Man, Cybern. Part B: Cybernetics 29(4), 667–672 (1999)
    11. Paredes, R., Vidal, E.: Weighting prototypes. A new editing approach. In: Proceedings 15th International Conference on Pattern Recognition, Barcelona, vol. 2, pp. 25–28 (2000)
    12. Paredes, R., Vidal, E.: Learning prototypes and distances: a prototype reduction technique based on nearest neighbor error minimization. Pattern Recognition 39(2), 180–188 (2006)
    13. Holmes, C.C., Adams, N.M.: A probabilistic nearest neighbour method for statistical pattern recognition. Journal of the Royal Statistical Society Series B 64(2), 295–306 (2002)
    14. Everson, R., Fieldsend, J.: A variable metric probabilistic k-nearest-neighbours classifier. In: Yang, Z.R., Yin, H., Everson, R.M. (eds.) IDEAL 2004. LNCS, vol. 3177, pp. 654–659. Springer, Heidelberg (2004)
    15. Manocha, S., Girolami, M.A.: An empirical analysis of the probabilistic k-nearest neighbour classifier. Pattern Recognition Letters 28(13), 1818–1824 (2007)
    16. Blake, C., Keogh, E., Merz, C.: UCI Repository of machine learning databases, http://www.ics.uci.edu/~mlearn/MLRepository.html
    17. Statlog Corpora, Department of Statistics and Modelling Science, University of Strathclyde, ftp.strath.ac.uk
    18. Raudys, S., Jain, A.: Small sample effects in statistical pattern recognition: Recommendations for practitioners. IEEE Transactions on Pattern Analysis and Machine Intelligence 13(3), 252–264 (1991)
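    The class-dependent weighted dissimilarity at the heart of this paper can be stated compactly: each training prototype is compared to the query using a diagonal weight vector tied to the prototype's own class. In the paper the weights are learned jointly with k inside the PKnn MCMC framework (much like the Metropolis sketch after the first abstract); the minimal sketch below keeps them as fixed inputs, and all names are illustrative assumptions.

        # Sketch only: k-nn vote under a class-dependent diagonal weighting,
        # d_c(x, p) = sqrt(sum_j W[c, j]^2 * (x_j - p_j)^2).
        import numpy as np

        def class_dependent_knn_predict(X_train, y_train, W, k, x):
            """Classify x by a k-nn vote; prototype i is weighted by W[y_i]."""
            diffs = (X_train - x) * W[y_train]   # row i uses its class's weights
            dists = np.sqrt((diffs**2).sum(axis=1))
            nearest = np.argsort(dists)[:k]
            return np.bincount(y_train[nearest]).argmax()

        # Toy usage: only the first feature is informative, so the (assumed)
        # per-class weights down-weight the two noise dimensions.
        rng = np.random.default_rng(1)
        X = rng.standard_normal((200, 3))
        y = (X[:, 0] > 0).astype(int)
        W = np.array([[1.0, 0.3, 0.3],
                      [1.0, 0.2, 0.2]])
        print(class_dependent_knn_predict(X, y, W, k=5, x=np.array([0.5, 0.0, 0.0])))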