KB-CB-N classification: towards unsupervised approach for supervised learning

Authors: Abdallah Z., Gaber M.
Publication date: 01/01/2011

Portsmouth University Research Portal (Pure)

Explore Bristol Research

Ptolemaic Indexing

Author: Hetland Magnus Lie
Publication date: 01/01/2015

This paper discusses a new family of bounds for use in similarity search, related to those used in metric indexing but based on Ptolemy's inequality rather than the metric axioms. Ptolemy's inequality holds for the well-known Euclidean distance, but is also shown here to hold for quadratic form metrics in general, with Mahalanobis distance as an important special case. The inequality is examined empirically on both synthetic and real-world data sets and is also found to hold approximately, with a very low degree of error, for important distances such as the angular pseudometric and several Lp norms. Indexing experiments demonstrate substantially greater filtering power than existing triangular methods. It is also shown that combining Ptolemaic and triangular filtering can lead to better results than using either approach on its own.
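To make the two kinds of bound concrete, here is a minimal sketch, not the paper's implementation. It assumes Euclidean distance, a query q, a candidate object o, and two pivots p and s; the function names and the random test setup are illustrative only. Ptolemy's inequality gives d(q,o) · d(p,s) >= |d(q,p) · d(o,s) - d(q,s) · d(o,p)|, which yields a lower bound on d(q,o) that can be compared with the triangular bound |d(q,p) - d(o,p)|.

```python
import math
import random


def triangle_lower_bound(d_qp, d_op):
    """Triangle-inequality lower bound on d(q, o) from a single pivot p."""
    return abs(d_qp - d_op)


def ptolemaic_lower_bound(d_qp, d_qs, d_op, d_os, d_ps):
    """Lower bound on d(q, o) from Ptolemy's inequality and a pivot pair (p, s)."""
    if d_ps == 0.0:
        return 0.0  # degenerate pivot pair: no information
    return abs(d_qp * d_os - d_qs * d_op) / d_ps


def compare_bounds(trials=1000, dim=5, seed=0):
    """Count bound violations and how often the Ptolemaic bound is tighter.

    Points are drawn uniformly at random (an assumption of this sketch).
    """
    rng = random.Random(seed)
    violations = 0
    ptolemaic_tighter = 0
    for _ in range(trials):
        q, o, p, s = ([rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(4))
        d_qo = math.dist(q, o)
        d_qp, d_qs = math.dist(q, p), math.dist(q, s)
        d_op, d_os = math.dist(o, p), math.dist(o, s)
        d_ps = math.dist(p, s)
        # Best triangular bound available from the same two pivots.
        tri = max(triangle_lower_bound(d_qp, d_op), triangle_lower_bound(d_qs, d_os))
        pto = ptolemaic_lower_bound(d_qp, d_qs, d_op, d_os, d_ps)
        if tri > d_qo + 1e-9 or pto > d_qo + 1e-9:
            violations += 1
        if pto > tri:
            ptolemaic_tighter += 1
    return violations, ptolemaic_tighter


if __name__ == "__main__":
    violations, tighter = compare_bounds()
    print("violations:", violations, "Ptolemaic bound tighter in", tighter, "trials")
```

In an index, a candidate o can be discarded without computing d(q,o) whenever either lower bound exceeds the search radius, which is why taking the larger of the two bounds can filter more than either alone.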

arXiv.org e-Print Archive

CiteSeerX

Directory of Open Access Journals

Journal of Computational Geometry (JoCG - Carleton University, Computational Geometry Lab)

NORA - Norwegian Open Research Archives

One-class classifiers based on entropic spanning graphs

Authors: Alippi Cesare, Livi Lorenzo
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 12/08/2016

One-class classifiers offer valuable tools to assess the presence of outliers in data. In this paper, we propose a design methodology for one-class classifiers based on entropic spanning graphs. Our approach also allows non-numeric data to be processed by means of an embedding procedure. The spanning graph is learned on the embedded input data and the resulting partition of vertices defines the classifier. The final partition is derived by exploiting a criterion based on mutual information minimization. Here, we compute the mutual information by using a convenient formulation provided in terms of the α-Jensen difference. Once training is completed, in order to associate a confidence level with the classifier decision, a graph-based fuzzy model is constructed. The fuzzification process is based only on topological information of the vertices of the entropic spanning graph. As such, the proposed one-class classifier is suitable also for data characterized by complex geometric structures. We provide experiments on well-known benchmarks containing both feature vectors and labeled graphs. In addition, we apply the method to the protein solubility recognition problem by considering several representations for the input samples. Experimental results demonstrate the effectiveness and versatility of the proposed method with respect to other state-of-the-art approaches.

Comment: Extended and revised version of the paper "One-Class Classification Through Mutual Information Minimization" presented at the 2016 IEEE IJCNN, Vancouver, Canada.

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Open Research Exeter

Using CBR to improve the usability of numerical models

Author: Woon Fei Ling
Publication venue: University of Greenwich
Publication date: 01/10/2005

In this thesis we show that CBR systems can be constructed from numerical models, so as to improve their usability. It is shown that CBR models may be queried in a flexible manner, and that the user may formulate queries consisting of constraints over both “input” and “output” variables of the numerical model. It is also shown that the constraints may be formulated using either nominal or continuous variables. A generalization of the CBR retrieval process to include constraints over unified “input-output” space is formulated as a framework for the method. The method is illustrated with practical engineering models: the pneumatic conveyor problem and the projectile problem. Comparisons are made on usability of CBR and numerical models for specific problems. It is shown that CBR models can answer questions difficult or impossible to formulate using numerical models, and that CBR models can be faster. The thesis also addresses a latent problem with the general method, which is of importance generally. This is to do with interpolation over nominal values in unified space. A novel method is proposed for interpolation over nominal values, termed Generalised Shepard Nearest Neighbour method (GSNN). GSNN can utilise distance metrics defined on the solution space of a CBR system. The properties and advantages of GSNN are examined in the thesis. A comparison is made with other CBR retrieval methods, using several examples, including the travel domain case base. It is shown that GSNN can out-perform conventional nearest neighbour methods. It is shown that GSNN has advantages in that it can find solutions not in the case base and it can find solutions not in the retrieval set. It is also shown that the performance of GSNN can be improved further by using it in conjunction with a diversity algorithm. The merit of using GSNN as a case selection component is examined, and it is shown that it can give good results in sparse case bases. 
Finally, the thesis concludes with a survey of numerical models where CBR construction can be useful and where benefits can be expected.
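The idea of interpolating over nominal values can be illustrated with a small sketch. It is an assumption-laden reading of the abstract, not the thesis's definition of GSNN: it assumes Shepard-style inverse-distance weights over the problem space and picks the candidate solution that minimizes the weighted sum of squared solution-space distances to the retrieved cases. The function name `gsnn_predict`, the exponent `p`, and the toy "low/medium/high" domain are all hypothetical.

```python
def gsnn_predict(query, cases, candidates, dist_x, dist_y, p=2.0):
    """Pick the candidate solution closest, in a weighted sense, to the retrieved cases.

    cases      -- list of (problem, solution) pairs (the retrieval set)
    candidates -- all admissible solutions, which may include values not in `cases`
    dist_x     -- distance on the problem space
    dist_y     -- distance on the (possibly nominal) solution space
    p          -- inverse-distance exponent (assumed; Shepard weighting)
    """
    weights = []
    for x_i, y_i in cases:
        d = dist_x(query, x_i)
        if d == 0:
            return y_i  # exact match: return the stored solution
        weights.append(d ** -p)
    total = sum(weights)

    def cost(y):
        # Weighted sum of squared solution-space distances to each retrieved case.
        return sum((w / total) * dist_y(y, y_i) ** 2
                   for w, (_, y_i) in zip(weights, cases))

    return min(candidates, key=cost)


# Toy domain: nominal solutions with an ordinal distance between them.
LEVELS = ["low", "medium", "high"]


def level_distance(a, b):
    return abs(LEVELS.index(a) - LEVELS.index(b))


def numeric_distance(a, b):
    return abs(a - b)


if __name__ == "__main__":
    case_base = [(0.0, "low"), (10.0, "high")]
    print(gsnn_predict(5.0, case_base, LEVELS, numeric_distance, level_distance))
```

With the query halfway between the two stored cases, the minimizer is "medium", a value that appears in neither case, which illustrates the abstract's claim that solutions outside the case base and the retrieval set can be found.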

Greenwich Academic Literature Archive