Automatic multi-label subject indexing in a multilingual environment
This paper presents an approach to automatic subject indexing of full-text documents with multiple labels, based on binary support vector machines (SVMs). The aim was to test the applicability of SVMs on a real-world dataset. We have also explored the feasibility of incorporating multilingual background knowledge, as represented in thesauri or ontologies, into our text document representation for indexing purposes. The test set for our evaluations has been compiled from an extensive document base maintained by the Food and Agriculture Organization (FAO) of the United Nations (UN). Empirical results show that SVMs are a good method for automatic multi-label classification of documents in multiple languages.
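A common way to build multi-label indexing on top of binary classifiers is to train one yes/no classifier per subject label. The sketch below shows only that decomposition step and is an assumption about the general setup, not the paper's actual pipeline. The function name `to_binary_problems` and the toy documents are hypothetical, and no SVM is trained here.

```python
def to_binary_problems(labelled_docs):
    """Split a multi-label dataset into one +1/-1 target vector per label.

    labelled_docs: list of (text, set_of_labels) pairs.
    Returns a dict mapping each label to a list of +1/-1 targets,
    aligned with the order of labelled_docs.
    """
    all_labels = sorted({label for _, labels in labelled_docs for label in labels})
    return {
        label: [1 if label in labels else -1 for _, labels in labelled_docs]
        for label in all_labels
    }


# Hypothetical toy documents with hypothetical subject labels.
docs = [
    ("rice yield in irrigated fields", {"rice", "irrigation"}),
    ("irrigation canals and water use", {"irrigation"}),
    ("rice pests", {"rice", "pests"}),
]
targets = to_binary_problems(docs)
```

Each target vector could then be passed, together with a shared document representation, to a separate binary classifier. A document is assigned every label whose classifier answers positively.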
Mapping Bibliographic Records with Bibliographic Hash Keys
This poster presents a set of hash keys for bibliographic records called bibkeys. Unlike other methods of duplicate detection, bibkeys can be calculated directly from a set of basic metadata fields (title, authors/editors, year). It is shown how bibkeys are used to map similar bibliographic records in BibSonomy and among distributed library catalogs and other distributed databases.
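The general idea of hashing a few normalized metadata fields can be sketched as follows. This is not the bibkey specification: the normalization rules, the use of SHA-1, and the names `normalize` and `record_key` are assumptions chosen for illustration, and the author names and year in the example are made up.

```python
import hashlib
import re
import unicodedata


def normalize(text):
    """Strip accents, lowercase, and keep only ASCII letters and digits."""
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]", "", ascii_only.lower())


def record_key(title, authors, year):
    """Hash title, sorted author surnames, and year into one hex key.

    authors: list of strings in "Surname, Given" form (an assumption).
    """
    surnames = sorted(normalize(author.split(",")[0]) for author in authors)
    payload = "|".join([normalize(title), ",".join(surnames), str(year)])
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


# Two differently formatted versions of the same hypothetical record.
key_a = record_key(
    "Mapping Bibliographic Records with Bibliographic Hash Keys",
    ["Doe, Jane", "Müller, Max"],
    2000,
)
key_b = record_key(
    "mapping bibliographic records, with bibliographic hash-keys.",
    ["Muller, M.", "Doe, J."],
    2000,
)
key_c = record_key(
    "Mapping Bibliographic Records with Bibliographic Hash Keys",
    ["Doe, Jane", "Müller, Max"],
    2001,
)
```

Here `key_a` and `key_b` match despite differences in case, punctuation, accents, and author order, while `key_c` differs because the year changed. Records with equal keys can be compared by exact lookup, with no pairwise similarity computation.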
On Background Bias in Deep Metric Learning
Deep Metric Learning trains a neural network to map input images to a
lower-dimensional embedding space such that similar images are closer together
than dissimilar images. When used for item retrieval, a query image is embedded
using the trained model and the closest items from a database storing their
respective embeddings are returned as the most similar items for the query.
Especially in product retrieval, where a user searches for a certain product by
taking a photo of it, the image background is usually not important and thus
should not influence the embedding process. Ideally, the retrieval process
always returns fitting items for the photographed object, regardless of the
environment the photo was taken in. In this paper, we analyze the influence of
the image background on Deep Metric Learning models by utilizing five common
loss functions and three common datasets. We find that Deep Metric Learning
networks are prone to so-called background bias, which can lead to a severe
decrease in retrieval performance when changing the image background during
inference. We also show that replacing the background of images during training
with random background images alleviates this issue. Since we use an automatic
background removal method to do this background replacement, no additional
manual labeling work and model changes are required while inference time stays
the same. Qualitative and quantitative analyses, for which we introduce a new
evaluation metric, confirm that models trained with replaced backgrounds attend
more to the main object in the image, benefitting item retrieval systems. Comment: To be published at ICMV 202
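Two steps described in this abstract can be illustrated with a minimal sketch: nearest-neighbour retrieval in an embedding space, and replacing the background of an image given a foreground mask. Everything here is an assumption for illustration. The embeddings are hand-written two-dimensional vectors, not outputs of a trained network, the mask is given, not produced by a background removal method, and the names `retrieve` and `replace_background` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query_embedding, database, k=2):
    """Return the ids of the k database items closest to the query."""
    ranked = sorted(
        database.items(),
        key=lambda item: euclidean(query_embedding, item[1]),
    )
    return [item_id for item_id, _ in ranked[:k]]


def replace_background(image, mask, background):
    """Keep pixels where mask is 1 (foreground), else take the background pixel.

    image, mask, and background are 2D lists of the same shape.
    """
    return [
        [img_px if m == 1 else bg_px for img_px, m, bg_px in zip(img_row, m_row, bg_row)]
        for img_row, m_row, bg_row in zip(image, mask, background)
    ]


# Toy embedding database: item id -> embedding vector.
database = {"mug_red": [0.9, 0.1], "mug_blue": [0.8, 0.3], "shoe": [0.1, 0.9]}
top = retrieve([0.85, 0.15], database, k=2)

# Toy 2x2 grayscale image whose left column is the object.
image = [[5, 1], [6, 2]]
mask = [[1, 0], [1, 0]]
random_background = [[9, 9], [8, 8]]
augmented = replace_background(image, mask, random_background)
```

In this toy setup `top` contains the two mugs, and `augmented` keeps the object column while taking the right column from the new background. In a training pipeline, the composited image would stand in for the original as network input.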
Semantic Network Analysis of Ontologies
A key argument for modeling knowledge in ontologies is the easy re-use and re-engineering of the knowledge. However, current ontology engineering tools provide only basic functionality for analyzing ontologies. Since ontologies can be considered as graphs, graph analysis techniques are a suitable answer to this need. Graph analysis has been performed by sociologists for over 60 years and has resulted in the vibrant research area of Social Network Analysis (SNA). While social network structures currently receive much attention in the Semantic Web community, there are only very few SNA applications, and virtually none for analyzing the structure of ontologies. We illustrate the benefits of applying SNA to ontologies and the Semantic Web, and discuss which research topics arise at the boundary between the two areas. In particular, we discuss how different notions of centrality describe the core content and structure of an ontology. From the rather simple notion of degree centrality through betweenness centrality to the more complex eigenvector centrality, we illustrate the insights these measures provide on two ontologies that differ in purpose, scope, and size.
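Two of the centrality measures named above can be sketched on a tiny concept graph. The graph, its concept names, and the function names are hypothetical and are not taken from the ontologies studied in the paper. The eigenvector computation uses power iteration on the adjacency matrix plus the identity, a common convergence aid on bipartite graphs such as trees, and is an assumption about implementation, not the authors' method.

```python
import math


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def degree_centrality(adjacency):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(adjacency)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


def eigenvector_centrality(adjacency, iterations=100):
    """Power iteration on (A + I), normalized to unit Euclidean length."""
    scores = {node: 1.0 for node in adjacency}
    for _ in range(iterations):
        updated = {
            node: scores[node] + sum(scores[nbr] for nbr in adjacency[node])
            for node in adjacency
        }
        norm = math.sqrt(sum(value ** 2 for value in updated.values()))
        scores = {node: value / norm for node, value in updated.items()}
    return scores


# Hypothetical concept hierarchy treated as an undirected graph.
edges = [("Thing", "Animal"), ("Thing", "Plant"), ("Animal", "Dog"), ("Animal", "Cat")]
graph = build_adjacency(edges)
deg = degree_centrality(graph)
eig = eigenvector_centrality(graph)
```

On this toy graph both measures rank `Animal` highest, but they disagree further down: `Dog` and `Plant` have the same degree, yet `Dog` scores higher on eigenvector centrality because its only neighbour is itself more central.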