3,987 research outputs found

The similarity metric

Author: Chen Xin
Li Ming
Li Xin
Ma Bin
Vitanyi Paul
Publication date: 01/01/2003

A new class of distances appropriate for measuring similarity relations between sequences, say one type of similarity per distance, is studied. We propose a new "normalized information distance", based on the noncomputable notion of Kolmogorov complexity, and show that it is in this class and that it minorizes every computable distance in the class (that is, it is universal in that it discovers all computable similarities). We demonstrate that it is a metric and call it the similarity metric. This theory forms the foundation for a new practical tool. To demonstrate generality and robustness we give two distinctive applications in widely divergent areas using standard compression programs like gzip and GenCompress. First, we compare whole mitochondrial genomes and infer their evolutionary history. This results in a first completely automatically computed whole mitochondrial phylogeny tree. Secondly, we fully automatically compute the language tree of 52 different languages. Comment: 13 pages, LaTeX, 5 figures. Part of this work appeared in Proc. 14th ACM-SIAM Symp. Discrete Algorithms, 2003. This is the final, corrected, version to appear in IEEE Trans Inform. T

arXiv.org e-Print Archive

CiteSeerX

International Migration, Integration and Social Cohesion online publications
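
The abstract of "The similarity metric" above says that the noncomputable normalized information distance is put to practical use with standard compressors such as gzip. A minimal Python sketch of that idea follows. It assumes the widely used normalized compression distance formula, (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is compressed length; this formula is not quoted in the abstract, zlib stands in for gzip, and the function names are illustrative only.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input.

    Assumption: compressed length is used as a computable stand-in
    for the (noncomputable) Kolmogorov complexity of `data`.
    """
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Values near 0 suggest the inputs share most of their structure;
    values near 1 suggest they share little. Real compressors are
    imperfect, so results can fall slightly outside [0, 1].
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # cost of describing both together
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    repetitive = b"abcabcabc" * 50
    varied = bytes(range(256)) * 4
    print("same input:     ", ncd(repetitive, repetitive))
    print("unrelated input:", ncd(repetitive, varied))
```

Comparing an input with itself should give a small value, while comparing structurally unrelated inputs should give a value close to 1. The choice of compressor matters in practice, which is the point raised in the forensics entry further down this listing.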

On The Similarity Metric

Author: Alhajjar Elie
Lefèvre Clément
Publication venue: USMA Digital Commons
Publication date: 24/06/2019

In mathematics, and more specifically in topology, the notion of a distance metric has been well known since the nineteenth century. It is used to measure the difference between two objects. When it comes to characterizing the similarity between two objects, a similarity metric is needed. Although widely used in computer science, such a metric is not clearly defined mathematically. We fill the existing gap in the current literature concerning similarity metrics, connecting them to the well-known notion of partial metrics in general topology.

USMA Digital Commons (United States Military Academy, West Point)

Information similarity metrics in information security and forensics

Author: Quach Tu-Thach
Publication venue: UNM Digital Repository
Publication date: 09/02/2010

We study two information similarity measures, relative entropy and the similarity metric, and methods for estimating them. Relative entropy can be readily estimated with existing algorithms based on compression. The similarity metric, based on algorithmic complexity, proves to be more difficult to estimate because algorithmic complexity itself is not computable. We again turn to compression for estimating the similarity metric. Previous studies rely on the compression ratio as an indicator for choosing compressors to estimate the similarity metric. This assumption, however, is fundamentally flawed. We propose a new method to benchmark compressors for estimating the similarity metric. To demonstrate its use, we propose to quantify the security of a stegosystem using the similarity metric. Unlike other measures of steganographic security, the similarity metric is not only a true distance metric, but it is also universal in the sense that it is asymptotically minimal among all computable metrics between two objects. Therefore, it accounts for all similarities between two objects. In contrast, relative entropy, a widely accepted steganographic security definition, only takes into consideration the statistical similarity between two random variables. As an application, we present a general method for benchmarking stegosystems. The method is general in the sense that it is not restricted to any covertext medium and can therefore be applied to a wide range of stegosystems. For demonstration, we analyze several image stegosystems using the newly proposed similarity metric as the security metric. The results show the true security limits of stegosystems regardless of the chosen security metric or the existence of steganalysis detectors. In other words, this makes it possible to show that a stegosystem with a large similarity metric is inherently insecure, even if it has not yet been broken.

On Empirical Entropy

Author: Vitányi Paul M. B.
Publication date: 30/03/2011

We propose a compression-based version of the empirical entropy of a finite string over a finite alphabet. Whereas previous work considers the naked entropy of (possibly higher order) Markov processes, we consider the sum of the description of the random variable involved plus the entropy it induces. We assume only that the distribution involved is computable. To test the new notion we compare the Normalized Information Distance (the similarity metric) with a related measure based on Mutual Information in Shannon's framework. This way the similarities and differences of the last two concepts are exposed. Comment: 14 pages, LaTe

arXiv.org e-Print Archive

CWI's Institutional Repository
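
For orientation, the "naked" entropy that the abstract of "On Empirical Entropy" above contrasts itself with can be sketched in its simplest, zeroth-order form: the Shannon entropy of the symbol frequencies observed in a finite string. The Python sketch below illustrates only that classical baseline, not the compression-based version the paper proposes; the function name is an assumption made for illustration.

```python
from collections import Counter
from math import log2


def empirical_entropy(s: str) -> float:
    """Zeroth-order empirical entropy of `s`, in bits per symbol.

    Uses the observed relative frequency of each symbol as its
    probability. Returns 0.0 for the empty string by convention
    (an assumption of this sketch).
    """
    if not s:
        return 0.0
    n = len(s)
    counts = Counter(s)
    return -sum((c / n) * log2(c / n) for c in counts.values())


if __name__ == "__main__":
    for text in ("aaaa", "abab", "abcd"):
        print(text, empirical_entropy(text))
```

This baseline ignores the cost of describing the distribution itself, which is the term the abstract says the proposed notion adds.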

Deep Metric Learning via Facility Location

Author: Jegelka Stefanie
Murphy Kevin
Rathod Vivek
Song Hyun Oh
Publication date: 11/04/2017

Learning the representation and the similarity metric in an end-to-end fashion with deep networks has demonstrated outstanding results for clustering and retrieval. However, these recent approaches still suffer from performance degradation stemming from the local metric training procedure, which is unaware of the global structure of the embedding space. We propose a global metric learning scheme for optimizing the deep metric embedding with the learnable clustering function and the clustering metric (NMI) in a novel structured prediction framework. Our experiments on the CUB200-2011, Cars196, and Stanford online products datasets show state-of-the-art performance on both the clustering and retrieval tasks, measured by the NMI and Recall@K evaluation metrics. Comment: Submission accepted at CVPR 201

arXiv.org e-Print Archive

Crossref
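
The abstract of "Deep Metric Learning via Facility Location" above evaluates clustering with normalized mutual information (NMI). As a rough illustration of what that score measures, here is a minimal Python sketch. It assumes the arithmetic-mean normalization, NMI = I(T; P) / ((H(T) + H(P)) / 2); the abstract does not state which normalization the authors use, and the function names are illustrative.

```python
from collections import Counter
from math import log


def label_entropy(labels) -> float:
    """Shannon entropy (in nats) of a sequence of cluster labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def nmi(truth, pred) -> float:
    """Normalized mutual information between two labelings.

    Assumption: normalizes by the arithmetic mean of the two
    entropies. Returns 1.0 when both labelings are a single
    cluster (zero entropy), a convention chosen for this sketch.
    """
    if len(truth) != len(pred) or not truth:
        raise ValueError("labelings must be non-empty and equal in length")
    n = len(truth)
    joint = Counter(zip(truth, pred))
    count_t, count_p = Counter(truth), Counter(pred)
    # I(T; P) = sum over label pairs of p(t, p) * log(p(t, p) / (p(t) p(p)))
    mi = sum(
        (c / n) * log(c * n / (count_t[t] * count_p[p]))
        for (t, p), c in joint.items()
    )
    denom = (label_entropy(truth) + label_entropy(pred)) / 2
    return mi / denom if denom > 0 else 1.0


if __name__ == "__main__":
    print(nmi([0, 0, 1, 1], [1, 1, 0, 0]))  # same partition, labels renamed
    print(nmi([0, 0, 1, 1], [0, 1, 0, 1]))  # statistically independent partitions
```

Because the score depends only on how items are grouped, renaming cluster labels does not change it, which is why it suits comparing a learned clustering against ground-truth classes.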

Learning Deep Similarity Metric for 3D MR-TRUS Registration

Author: Haskins Grant
Kruecker Jochen
Kruger Uwe
Pinto Peter A.
Wood Brad J.
Xu Sheng
Yan Pingkun
Publication venue: Springer Science and Business Media LLC
Publication date: 15/10/2018

Purpose: The fusion of transrectal ultrasound (TRUS) and magnetic resonance (MR) images for guiding targeted prostate biopsy has significantly improved the biopsy yield of aggressive cancers. A key component of MR-TRUS fusion is image registration. However, it is very challenging to obtain a robust automatic MR-TRUS registration due to the large appearance difference between the two imaging modalities. The work presented in this paper aims to tackle this problem by addressing two challenges: (i) the definition of a suitable similarity metric and (ii) the determination of a suitable optimization strategy. Methods: This work proposes the use of a deep convolutional neural network to learn a similarity metric for MR-TRUS registration. We also use a composite optimization strategy that explores the solution space in order to search for a suitable initialization for the second-order optimization of the learned metric. Further, a multi-pass approach is used in order to smooth the metric for optimization. Results: The learned similarity metric outperforms the classical mutual information and also the state-of-the-art MIND feature based methods. The results indicate that the overall registration framework has a large capture range. The proposed deep similarity metric based approach obtained a mean TRE of 3.86 mm (with an initial TRE of 16 mm) for this challenging problem. Conclusion: A similarity metric that is learned using a deep neural network can be used to assess the quality of any given image registration and can be used in conjunction with the aforementioned optimization framework to perform automatic registration that is robust to poor initialization. Comment: To appear on IJCAR

arXiv.org e-Print Archive

George Washington University: Health Sciences Research Commons (HSRC)

Experimental study of a similarity metric for retrieving pieces from structured plan cases: its role in the originality of plan case solutions

Author: Cardoso Fernando Amílcar Bandeira
Grilo Carlos Fernando Almeida
Macedo Luís
Pereira Francisco C.
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/07/1996

This paper describes a quantitative similarity metric and its contribution to achieving original plan solutions. This similarity metric is used by an iterative process of piece retrieval from structured plan cases. Within our approach, plan cases are tree-like networks of pieces (goals and actions). These case pieces are ill-related to each other by links (explanations). These links may be classified as hierarchical or temporal, antecedent or consequent, and explicit or implicit. Besides links, each case piece also has information about its properties (the attribute-value pairs), its hierarchical and temporal position in the case (the address), and its constraints in the relationship with others (the constraints). The similarity metric computes a similarity value between two case pieces, taking into account similarities between these case pieces' information types. Each time a problem is proposed, different weights are given to some of those similarities, with the aim of solving it with an original solution. This similarity metric is used by the system INSPIRER (ImagiNation taking as Source Past and Imperfectly RElated Reasonings). We illustrate the role of the similarity metric in the creativity of solutions, focusing especially on their originality, by presenting the experimental results obtained in the musical composition domain, which we consider a planning domain.

IC-online