
    Semantic Similarity Tailored on the Application Context

    The paper proposes an approach to assessing the semantic similarity among instances of an ontology. It aims to define a sensitive measure of semantic similarity that takes into account different hints hidden in the ontology definition and explicitly considers the application context. The similarity measure is computed by analyzing, combining, and extending some of the existing similarity measures and tailoring them according to the criteria induced by the specific application context.

    Locally Orderless Registration

    Image registration is an important tool for medical image analysis; it brings images into the same reference frame by warping the coordinate field of one image such that some similarity measure is optimized. We study similarity in image registration in the context of Locally Orderless Images (LOI), which is the natural way to study density estimates and reveals the three fundamental scales: the measurement scale, the intensity scale, and the integration scale. This paper has three main contributions. Firstly, we rephrase a large set of popular similarity measures into a common framework, which we refer to as Locally Orderless Registration and which makes full use of the features of local histograms. Secondly, we extend the theoretical understanding of local histograms. Thirdly, we use our framework to compare two state-of-the-art intensity density estimators for image registration, the Parzen Window (PW) and the Generalized Partial Volume (GPV), and demonstrate their differences on a popular similarity measure, Normalized Mutual Information (NMI). We conclude that complicated similarity measures such as NMI may be evaluated almost as fast as simple measures such as Sum of Squared Distances (SSD), regardless of the choice of PW or GPV. Also, GPV is an asymmetric measure, and PW is our preferred choice.
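    As a rough illustration of the two similarity measures named in the abstract, the sketch below computes SSD directly and NMI from a simple joint intensity histogram. The plain histogram binning is an assumption for illustration; it is not the Parzen-window or GPV density estimators the paper actually compares.

```python
import numpy as np

def ssd(a, b):
    """Sum of Squared Distances between two images of equal shape."""
    return float(np.sum((a - b) ** 2))

def nmi(a, b, bins=8):
    """Normalized Mutual Information, NMI = (H(A) + H(B)) / H(A, B),
    estimated here from a naive joint histogram (illustrative only)."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()          # joint probability estimate
    px = pxy.sum(axis=1)               # marginal of A
    py = pxy.sum(axis=0)               # marginal of B

    def entropy(p):
        p = p[p > 0]                   # ignore empty bins
        return -np.sum(p * np.log(p))

    return (entropy(px) + entropy(py)) / entropy(pxy.ravel())
```

    For identical images the joint histogram is diagonal, so H(A, B) = H(A) and NMI reaches its maximum of 2, while SSD is 0.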

    Measuring mashup similarity in open data innovation contests

    Contests have become an important instrument for fostering the development of novel open data mash-ups (open data innovations, for short). The literature calls for new methods for measuring the similarity of open data mash-ups in order to identify code cloning and creative re-use of application components. Theoretically grounded computational methods for identifying the similarity of open data mash-ups are lacking. This study explores the similarity measurement of data-based mashups in the context of an open data innovation contest. Three dimensions of mashup similarity are defined: code similarity, functional feature similarity, and visualized feature similarity. The results from the contest, including the source code, the running project, and the descriptive documents, are collected as the research data for this study. Data analysis is based on the design and development of computational approaches to measure technical and functional similarity. The findings of this study will be helpful in better understanding the similarity of solutions in an open data innovation contest. This study contributes to theoretical and practical approaches for similarity measurement, especially in the field of mashup development.
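    The code-similarity dimension can be sketched with a simple token-set Jaccard coefficient, one common way to flag code cloning. The tokenizer and threshold-free score below are illustrative assumptions, not the study's actual method.

```python
import re

def token_set(source: str) -> set:
    """Crude lexical tokenization of source code:
    identifiers, numbers, and single punctuation characters."""
    return set(re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", source))

def code_similarity(src_a: str, src_b: str) -> float:
    """Jaccard similarity over token sets: |A ∩ B| / |A ∪ B|."""
    a, b = token_set(src_a), token_set(src_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

    Identical sources score 1.0; sources sharing only an operator, say, score near 0, so the measure gives a cheap first-pass clone signal.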

    LESIM: A Novel Lexical Similarity Measure Technique for Multimedia Information Retrieval

    Metadata-based similarity measurement is far from obsolete these days, despite research's focus on content and context. It allows for aggregating information from textual references, measuring similarity when content is not available, traditional keyword search in search engines, merging results in meta-search engines, and many more activities of interest to research and industry. Existing similarity measures take into consideration neither the unique nature of multimedia metadata nor the requirements of metadata-based multimedia information retrieval. This work proposes a hybrid similarity measure customised for the commonly available author-title multimedia metadata, which is shown through experimentation to be significantly more effective than baseline measures.
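    A hybrid author-title measure of the general kind the abstract describes can be sketched as a weighted combination of per-field token overlaps. The Jaccard components and the 0.7 title weight are hypothetical choices for illustration; they are not the LESIM formula.

```python
def _tokens(s: str) -> set:
    """Lowercased whitespace tokens of a metadata field."""
    return set(s.lower().split())

def _jaccard(a: set, b: set) -> float:
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def hybrid_metadata_similarity(author_a: str, title_a: str,
                               author_b: str, title_b: str,
                               w_title: float = 0.7) -> float:
    """Weighted sum of title and author token overlap (illustrative)."""
    return (w_title * _jaccard(_tokens(title_a), _tokens(title_b))
            + (1 - w_title) * _jaccard(_tokens(author_a), _tokens(author_b)))
```

    Weighting the title higher reflects the intuition that titles discriminate multimedia items better than author names, but the right weights would have to be tuned on real metadata.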

    An Evaluation of Context Awareness in Similarity Measurement: Total-Set Versus Classic Pairwise Comparison

    How someone perceives similarity gives insight into how they learn and process information. Pairwise comparison is a useful tool for determining a person's perception of similarity. In classic pairwise comparison, a participant is shown two items of a set at a time, repeating the process until the entire set has been evaluated. Total-set pairwise comparison shows the participant the entire set while highlighting the items to evaluate. It has been assumed that, in the classic method, the participant's judgments across trials are made with increasing awareness of the total-set context, and that the total-set method fixes this awareness from the start. Our study will systematically evaluate changes in awareness across trials in each procedure and explore whether there is a bias toward basic-level representation of the (unknown) total set in the classic method. The program will evaluate and probe participants on three levels (superordinate, basic, and subordinate) in both classic and total-set pairwise comparison. Thus far, the program for this study has been implemented, and data collection is under way. The results of our study will help researchers choose more intelligently between classic and total-set pairwise comparison when measuring subjective similarity.

    The role of social tags in web resource discovery:  an evaluation of user-generated keywords

    Social tags are user-generated metadata and play a vital role in Information Retrieval (IR) of web resources. This study attempts to determine the similarity between social tags extracted from LibraryThing and Library of Congress Subject Headings (LCSH) for the titles chosen for study, using the Cosine similarity method. The results show that social tags and controlled vocabularies are not very similar, owing to the free nature of social tags, which are mostly assigned by users, whereas controlled vocabularies are assigned by subject experts. In the context of information retrieval and text mining, Cosine similarity is the most commonly adopted method for evaluating the similarity of vectors, as it measures the degree to which two documents are likely to be similar in their subject matter. The LibraryThing tags and the LCSH terms are represented as vectors to measure the Cosine similarity between them.
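    The cosine measure used in the study can be sketched by turning each tag list into a term-frequency vector and taking the cosine of the angle between the vectors; the tag lists below are made-up examples, not the study's data.

```python
import math
from collections import Counter

def cosine_similarity(terms_a, terms_b):
    """Cosine similarity between two bags of terms:
    dot(A, B) / (|A| * |B|) over term-frequency vectors."""
    va, vb = Counter(terms_a), Counter(terms_b)
    dot = sum(va[t] * vb[t] for t in set(va) & set(vb))
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

    A score of 1.0 means the two vocabularies point in the same direction in term space; disjoint vocabularies, like free-form tags versus controlled headings with no overlap, score 0.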

    Document similarity

    In recent years, the development of tools and methods for measuring document similarity has become a thriving field in informatics, computer science, and digital humanities. Historically, questions of document similarity have been (and still are) important or even crucial in a large variety of situations. Typically, similarity is judged by criteria which depend on context. The move from traditional to digital text technology has not only provided new possibilities for discovery and measurement of document similarity, it has also posed new challenges. Some of these challenges are technical, others conceptual. This paper argues that a particular, well-established, traditional way of starting with an arbitrary document and constructing a document similar to it, namely transcription, may fruitfully be brought to bear on questions concerning similarity criteria for digital documents. Some simple similarity measures are presented and their application to marked-up documents is discussed. We conclude that when documents are encoded in the same vocabulary, n-grams constructed to include markup can be used to recognize structural similarities between documents.
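    The idea of n-grams that include markup can be sketched by keeping tags as tokens and comparing the resulting n-gram sets. The regex tokenizer and Jaccard score below are simplifying assumptions for illustration; a real implementation would parse the XML properly.

```python
import re

def _tokenize(doc: str) -> list:
    """Token stream that keeps markup tags (e.g. <p>, </p>) as tokens."""
    return re.findall(r"</?\w+>|\w+", doc)

def _ngrams(tokens: list, n: int) -> set:
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def structural_similarity(doc_a: str, doc_b: str, n: int = 3) -> float:
    """Jaccard similarity over markup-inclusive n-grams, so documents
    with the same words but different element structure diverge."""
    ga, gb = _ngrams(_tokenize(doc_a), n), _ngrams(_tokenize(doc_b), n)
    if not ga and not gb:
        return 1.0
    return len(ga & gb) / len(ga | gb)
```

    Because the tags enter the n-grams, two documents with identical text but different markup vocabularies score below 1.0, which is exactly the structural signal the paper's conclusion describes.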