Semantic Similarity Tailored on the Application Context
The paper proposes an approach to assess the semantic similarity among instances of an ontology. It aims to define a sensitive measurement of semantic similarity that takes into account different hints hidden in the ontology definition and explicitly considers the application context. The similarity measurement is computed by analyzing, combining, and extending some of the existing similarity measures and tailoring them according to the criteria induced by the specific application context.
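The abstract does not name the existing measures it combines; as one hedged illustration of the kind of taxonomy-based measure commonly found in this literature, the classic Wu-Palmer similarity over a toy child-to-parent map (an assumption for illustration, not the paper's method) might look like:

```python
def depth(node, parent):
    """Depth of a node in a taxonomy given a child -> parent map (root depth = 1)."""
    d = 1
    while node in parent:
        node = parent[node]
        d += 1
    return d

def ancestors(node, parent):
    """Node followed by its chain of ancestors up to the root."""
    chain = [node]
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain

def wu_palmer(a, b, parent):
    """Wu-Palmer similarity: 2 * depth(LCS) / (depth(a) + depth(b))."""
    anc_a = ancestors(a, parent)
    anc_b = set(ancestors(b, parent))
    lcs = next(n for n in anc_a if n in anc_b)  # least common subsumer
    return 2 * depth(lcs, parent) / (depth(a, parent) + depth(b, parent))

# Toy taxonomy: cat and dog share the subsumer "mammal".
taxonomy = {"cat": "mammal", "dog": "mammal", "mammal": "animal"}
print(wu_palmer("cat", "dog", taxonomy))
```

Context-tailoring, as the paper describes, would then weight or combine such measures according to the application at hand.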
Locally Orderless Registration
Image registration is an important tool for medical image analysis and is used to bring images into the same reference frame by warping the coordinate field of one image such that some similarity measure is minimized. We study similarity in image registration in the context of Locally Orderless Images (LOI), which is the natural way to study density estimates and reveals the three fundamental scales: the measurement scale, the intensity scale, and the integration scale.
This paper has three main contributions. Firstly, we rephrase a large set of popular similarity measures into a common framework, which we refer to as Locally Orderless Registration and which makes full use of the features of local histograms. Secondly, we extend the theoretical understanding of local histograms. Thirdly, we use our framework to compare two state-of-the-art intensity density estimators for image registration, the Parzen Window (PW) and the Generalized Partial Volume (GPV), and we demonstrate their differences on a popular similarity measure, Normalized Mutual Information (NMI).
We conclude that complicated similarity measures such as NMI may be evaluated almost as fast as simple measures such as the Sum of Squared Distances (SSD), regardless of the choice between PW and GPV. Also, GPV is an asymmetric measure, and PW is our preferred choice.
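As a minimal, self-contained illustration of the NMI measure discussed above (a plain joint-histogram estimate, not the paper's LOI framework or the PW/GPV estimators), one might compute:

```python
import numpy as np

def nmi(a, b, bins=32):
    """Normalized mutual information NMI = (H(A) + H(B)) / H(A, B).

    Ranges from 1 (independent images) to 2 (identical up to binning).
    """
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()          # joint intensity distribution
    px = pxy.sum(axis=1)               # marginal of image a
    py = pxy.sum(axis=0)               # marginal of image b

    def entropy(p):
        p = p[p > 0]                   # ignore empty bins
        return -np.sum(p * np.log(p))

    return (entropy(px) + entropy(py)) / entropy(pxy.ravel())
```

In a registration loop, `nmi` would be evaluated repeatedly while the warp of one image is optimized; the abstract's point is that with a suitable framework this costs little more than evaluating SSD.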
Measuring mashup similarity in open data innovation contests
Contests have become an important instrument for fostering the development of novel open data mashups, in short open data innovations. The literature calls for new methods for measuring the similarity of open data mashups in order to identify code cloning and creative re-use of application components. Theoretically grounded computational methods for identifying the similarity of entries in open data contests are lacking. This study explores similarity measurement of data-based mashups in the context of an open data innovation contest. Three dimensions of mashup similarity are defined: code similarity, functional feature similarity, and visualized feature similarity. The results from the contest, including the source code, the running projects, and the descriptive documents, are collected as the research data for this study. Data analysis is based on the design and development of computational approaches to measure technological and functional similarity. The findings of this study will be helpful in better understanding the similarity of solutions in an open data innovation contest. This study contributes to theoretical and practical approaches for similarity measurement, especially in the field of mashup development.
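As one hypothetical, minimal reading of the functional-feature dimension above (the abstract does not specify a formula), two mashups' declared feature sets could be compared with a Jaccard index:

```python
def feature_similarity(features_a, features_b):
    """Jaccard similarity between two mashups' declared feature sets.

    Returns 1.0 for identical feature sets and 0.0 for no overlap.
    """
    a, b = set(features_a), set(features_b)
    if not (a | b):
        return 1.0  # two empty feature sets are trivially identical
    return len(a & b) / len(a | b)

# Hypothetical feature lists for two contest entries:
print(feature_similarity({"map", "search", "export"}, {"map", "chart"}))  # 0.25
```

The same set-overlap idea extends to code similarity by replacing feature names with, for example, token shingles drawn from the source code.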
Toward Common Data Elements for International Research in Long-term Care Homes: Advancing Person-Centered Care
To support person-centered, residential long-term care internationally, a consortium of researchers in medicine, nursing, behavioral, and social sciences from 21 geographically and economically diverse countries has launched the WE-THRIVE consortium to develop a common data infrastructure. WE-THRIVE aims to identify measurement domains that are internationally relevant, including in low-, middle-, and high-income countries; prioritize concepts to operationalize domains; and specify a set of data elements to measure concepts that can be used across studies for data sharing and comparisons. This article reports findings from consortium meetings at the 2016 meeting of the Gerontological Society of America and the 2017 meeting of the International Association of Gerontology and Geriatrics to identify domains and prioritize concepts, following best practices for common data elements (CDEs) developed through the US National Institutes of Health/National Institute of Nursing Research's CDEs initiative. Four domains were identified: organizational context, workforce and staffing, person-centered care, and care outcomes. Using a nominal group process, WE-THRIVE prioritized 21 concepts across the 4 domains. Several concepts showed similarity to existing measurement structures, whereas others differed. Conceptual similarity (convergence; eg, concepts in the care outcomes domain of functional level and harm-free care) provides further support for the critical foundational work in LTC measurement endorsed and implemented by regulatory bodies. Divergent concepts (eg, concepts in the person-centered care domain of knowing the person and what matters most to the person) highlight current gaps in measurement efforts and are consistent with WE-THRIVE's focus on supporting resilience and thriving for residents, family, and staff.
In alignment with the World Health Organization's call for comparative measurement work for health systems change, WE-THRIVE's work to date highlights the benefits of engaging with diverse LTC researchers, including those in low-, middle-, and high-income countries, to develop a measurement infrastructure that integrates the aspirations of person-centered LTC
LESIM: A Novel Lexical Similarity Measure Technique for Multimedia Information Retrieval
Metadata-based similarity measurement is far from obsolete today, despite research's focus on content and context. It allows for aggregating information from textual references, measuring similarity when content is not available, traditional keyword search in search engines, merging results in meta-search engines, and many more activities of interest to research and industry. Existing similarity measures take into consideration neither the unique nature of multimedia metadata nor the requirements of metadata-based multimedia information retrieval. This work proposes a hybrid similarity measure customised for the commonly available author-title multimedia metadata, which is shown through experimentation to be significantly more effective than baseline measures.
An Evaluation of Context Awareness in Similarity Measurement: Total-Set Versus Classic Pairwise Comparison
How someone perceives similarity gives insight into how they learn and process information. Pairwise comparison is a useful tool for determining a person's perception of similarity. In classic pairwise comparison, a participant is shown two items of a set at a time, repeating the process until the entire set has been evaluated. Total-set pairwise comparison shows the participant the entire set while highlighting the items to evaluate. It has been assumed that, in the classic method, the participant's judgments across trials are made with increasing awareness of the total-set context, and that the total-set method fixes this awareness from the start. Our study will systematically evaluate changes in awareness across trials in each procedure and explore whether there is a bias toward basic-level representation of the (unknown) total set in the classic method. The program created for this study evaluates and probes participants at three levels (superordinate, basic, and subordinate) in both classic and total-set pairwise comparison. Thus far, the program has been completed and data collection is underway. The results of our study will help researchers to choose more intelligently between classic and total-set pairwise comparison when measuring subjective similarity.
The role of social tags in web resource discovery: an evaluation of user-generated keywords
Social tags are user-generated metadata and play a vital role in Information Retrieval (IR) of web resources. This study attempts to determine the similarities between social tags extracted from LibraryThing and Library of Congress Subject Headings (LCSH) for the titles chosen for study, adopting the Cosine similarity method. The result shows that social tags and controlled vocabularies are not very similar, owing to the free nature of social tags, which are mostly assigned by users, whereas controlled vocabularies are attributed by subject experts. In the context of information retrieval and text mining, Cosine similarity is the most commonly adopted method for evaluating the similarity of vectors, as it measures the degree to which two documents are likely to be similar in terms of their subject matter. The LibraryThing tags and LCSH are represented as vectors to measure the Cosine similarity between them.
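The Cosine calculation described above can be sketched as follows; the tag and heading lists are hypothetical and serve only to illustrate the term-frequency vector representation:

```python
from collections import Counter
import math

def cosine_similarity(terms_a, terms_b):
    """Cosine similarity between two keyword lists as term-frequency vectors."""
    va, vb = Counter(terms_a), Counter(terms_b)
    dot = sum(va[t] * vb[t] for t in set(va) | set(vb))
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical LibraryThing tags vs. tokenized LCSH headings for one title:
tags = ["python", "programming", "tutorial", "programming"]
headings = ["python", "computer", "programming"]
print(cosine_similarity(tags, headings))
```

A score near 1 indicates closely matching vocabularies; the study's finding is that user tags and controlled headings typically land well below that.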
Document similarity
In recent years, the development of tools and methods for measuring document similarity has become a thriving field in informatics, computer science, and digital humanities. Historically, questions of document similarity have been (and still are) important or even crucial in a large variety of situations. Typically, similarity is judged by criteria which depend on context. The move from traditional to digital text technology has not only provided new possibilities for the discovery and measurement of document similarity; it has also posed new challenges. Some of these challenges are technical, others conceptual. This paper argues that a particular, well-established, traditional way of starting with an arbitrary document and constructing a document similar to it, namely transcription, may fruitfully be brought to bear on questions concerning similarity criteria for digital documents. Some simple similarity measures are presented and their application to marked-up documents is discussed. We conclude that when documents are encoded in the same vocabulary, n-grams constructed to include markup can be used to recognize structural similarities between documents.
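The concluding idea, n-grams that include markup, can be sketched as follows; the tokenizer and example documents are illustrative assumptions, not the paper's implementation:

```python
import re

def markup_ngrams(doc, n=3):
    """Set of token n-grams over a marked-up document, keeping tags as tokens."""
    # Tags such as <p> and </p> are matched before plain words, so
    # structural markup contributes to the n-grams alongside the text.
    tokens = re.findall(r"</?\w+>|\w+", doc)
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def jaccard(a, b):
    """Set-overlap similarity between two n-gram sets."""
    if not (a | b):
        return 1.0
    return len(a & b) / len(a | b)

# Two documents encoded in the same vocabulary:
g1 = markup_ngrams("<p>the cat sat</p>", n=2)
g2 = markup_ngrams("<p>the dog sat</p>", n=2)
print(jaccard(g1, g2))
```

Because the tags themselves appear in the n-grams, documents with matching structure score higher than their plain-text overlap alone would suggest.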