4,564 research outputs found
An information retrieval approach to ontology mapping
In this paper, we present a heuristic mapping method and a prototype mapping system that support the process of semi-automatic ontology mapping for the purpose of improving semantic interoperability in heterogeneous systems. The approach is based on the idea of semantic enrichment, i.e., using instance information of the ontology to enrich the original ontology and calculate similarities between concepts in two ontologies. The functional settings for the mapping system are discussed and the evaluation of the prototype implementation of the approach is reported. \ud
\u
On the Effect of Semantically Enriched Context Models on Software Modularization
Many of the existing approaches for program comprehension rely on the
linguistic information found in source code, such as identifier names and
comments. Semantic clustering is one such technique for modularization of the
system that relies on the informal semantics of the program, encoded in the
vocabulary used in the source code. Treating the source code as a collection of
tokens loses the semantic information embedded within the identifiers. We try
to overcome this problem by introducing context models for source code
identifiers to obtain a semantic kernel, which can be used for both deriving
the topics that run through the system as well as their clustering. In the
first model, we abstract an identifier to its type representation and build on
this notion of context to construct contextual vector representation of the
source code. The second notion of context is defined based on the flow of data
between identifiers to represent a module as a dependency graph where the nodes
correspond to identifiers and the edges represent the data dependencies between
pairs of identifiers. We have applied our approach to 10 medium-sized open
source Java projects, and show that by introducing contexts for identifiers,
the quality of the modularization of the software systems is improved. Both of
the context models give results that are superior to the plain vector
representation of documents. In some cases, the authoritativeness of
decompositions is improved by 67%. Furthermore, a more detailed evaluation of
our approach on JEdit, an open source editor, demonstrates that inferred topics
through performing topic analysis on the contextual representations are more
meaningful compared to the plain representation of the documents. The proposed
approach in introducing a context model for source code identifiers paves the
way for building tools that support developers in program comprehension tasks
such as application and domain concept location, software modularization and
topic analysis
Semantic Graph for Zero-Shot Learning
Zero-shot learning aims to classify visual objects without any training data
via knowledge transfer between seen and unseen classes. This is typically
achieved by exploring a semantic embedding space where the seen and unseen
classes can be related. Previous works differ in what embedding space is used
and how different classes and a test image can be related. In this paper, we
utilize the annotation-free semantic word space for the former and focus on
solving the latter issue of modeling relatedness. Specifically, in contrast to
previous work which ignores the semantic relationships between seen classes and
focus merely on those between seen and unseen classes, in this paper a novel
approach based on a semantic graph is proposed to represent the relationships
between all the seen and unseen class in a semantic word space. Based on this
semantic graph, we design a special absorbing Markov chain process, in which
each unseen class is viewed as an absorbing state. After incorporating one test
image into the semantic graph, the absorbing probabilities from the test data
to each unseen class can be effectively computed; and zero-shot classification
can be achieved by finding the class label with the highest absorbing
probability. The proposed model has a closed-form solution which is linear with
respect to the number of test images. We demonstrate the effectiveness and
computational efficiency of the proposed method over the state-of-the-arts on
the AwA (animals with attributes) dataset.Comment: 9 pages, 5 figure
Evaluation Methodologies for Visual Information Retrieval and Annotation
Die automatisierte Evaluation von Informations-Retrieval-Systemen erlaubt
Performanz und Qualität der Informationsgewinnung zu bewerten. Bereits in
den 60er Jahren wurden erste Methodologien für die system-basierte
Evaluation aufgestellt und in den Cranfield Experimenten überprüft.
Heutzutage gehören Evaluation, Test und Qualitätsbewertung zu einem aktiven
Forschungsfeld mit erfolgreichen Evaluationskampagnen und etablierten
Methoden. Evaluationsmethoden fanden zunächst in der Bewertung von
Textanalyse-Systemen Anwendung. Mit dem rasanten Voranschreiten der
Digitalisierung wurden diese Methoden sukzessive auf die Evaluation von
Multimediaanalyse-Systeme übertragen. Dies geschah häufig, ohne die
Evaluationsmethoden in Frage zu stellen oder sie an die veränderten
Gegebenheiten der Multimediaanalyse anzupassen. Diese Arbeit beschäftigt
sich mit der system-basierten Evaluation von Indizierungssystemen für
Bildkollektionen. Sie adressiert drei Problemstellungen der Evaluation von
Annotationen: Nutzeranforderungen für das Suchen und Verschlagworten von
Bildern, Evaluationsmaße für die Qualitätsbewertung von
Indizierungssystemen und Anforderungen an die Erstellung visueller
Testkollektionen. Am Beispiel der Evaluation automatisierter
Photo-Annotationsverfahren werden relevante Konzepte mit Bezug zu
Nutzeranforderungen diskutiert, Möglichkeiten zur Erstellung einer
zuverlässigen Ground Truth bei geringem Kosten- und Zeitaufwand vorgestellt
und Evaluationsmaße zur Qualitätsbewertung eingeführt, analysiert und
experimentell verglichen. Traditionelle Maße zur Ermittlung der Performanz
werden in vier Dimensionen klassifiziert. Evaluationsmaße vergeben
üblicherweise binäre Kosten für korrekte und falsche Annotationen. Diese
Annahme steht im Widerspruch zu der Natur von Bildkonzepten. Das gemeinsame
Auftreten von Bildkonzepten bestimmt ihren semantischen Zusammenhang und
von daher sollten diese auch im Zusammenhang auf ihre Richtigkeit hin
überprüft werden. In dieser Arbeit wird aufgezeigt, wie semantische
Ähnlichkeiten visueller Konzepte automatisiert abgeschätzt und in den
Evaluationsprozess eingebracht werden können. Die Ergebnisse der Arbeit
inkludieren ein Nutzermodell für die konzeptbasierte Suche von Bildern,
eine vollständig bewertete Testkollektion und neue Evaluationsmaße für die
anforderungsgerechte Qualitätsbeurteilung von Bildanalysesystemen.Performance assessment plays a major role in the research on Information
Retrieval (IR) systems. Starting with the Cranfield experiments in the
early 60ies, methodologies for the system-based performance assessment
emerged and established themselves, resulting in an active research field
with a number of successful benchmarking activities. With the rise of the
digital age, procedures of text retrieval evaluation were often transferred
to multimedia retrieval evaluation without questioning their direct
applicability. This thesis investigates the problem of system-based
performance assessment of annotation approaches in generic image
collections. It addresses three important parts of annotation evaluation,
namely user requirements for the retrieval of annotated visual media,
performance measures for multi-label evaluation, and visual test
collections. Using the example of multi-label image annotation evaluation,
I discuss which concepts to employ for indexing, how to obtain a reliable
ground truth to moderate costs, and which evaluation measures are
appropriate. This is accompanied by a thorough analysis of related work on
system-based performance assessment in Visual Information Retrieval (VIR).
Traditional performance measures are classified into four dimensions and
investigated according to their appropriateness for visual annotation
evaluation. One of the main ideas in this thesis adheres to the common
assumption on the binary nature of the score prediction dimension in
annotation evaluation. However, the predicted concepts and the set of true
indexed concepts interrelate with each other. This work will show how to
utilise these semantic relationships for a fine-grained evaluation
scenario. Outcomes of this thesis result in a user model for concept-based
image retrieval, a fully assessed image annotation test collection, and a
number of novel performance measures for image annotation evaluation
- …