726 research outputs found
Vermeidung von ReprÀsentationsheterogenitÀten in realweltlichen Wissensgraphen
Knowledge graphs are repositories providing factual knowledge about entities. They are a great source of knowledge to support modern AI applications for Web search, question answering, digital assistants, and online shopping. The advantages of machine learning techniques and the Web's growth have led to colossal knowledge graphs with billions of facts about hundreds of millions of entities collected from a large variety of sources. While integrating independent knowledge sources promises rich information, it inherently leads to heterogeneities in representation due to a large variety of different conceptualizations. Thus, real-world knowledge graphs are threatened in their overall utility. Due to their sheer size, they are hardly manually curatable anymore. Automatic and semi-automatic methods are needed to cope with these vast knowledge repositories. We first address the general topic of representation heterogeneity by surveying the problem throughout various data-intensive fields: databases, ontologies, and knowledge graphs. Different techniques for automatically resolving heterogeneity issues are presented and discussed, while several open problems are identified. Next, we focus on entity heterogeneity. We show that automatic matching techniques may run into quality problems when working in a multi-knowledge graph scenario due to incorrect transitive identity links. We present four techniques that can be used to improve the quality of arbitrary entity matching tools significantly. Concerning relation heterogeneity, we show that synonymous relations in knowledge graphs pose several difficulties in querying. Therefore, we resolve these heterogeneities with knowledge graph embeddings and by Horn rule mining. All methods detect synonymous relations in knowledge graphs with high quality. Furthermore, we present a novel technique for avoiding heterogeneity issues at query time using implicit knowledge storage. We show that large neural language models are a valuable source of knowledge that is queried similarly to knowledge graphs already solving several heterogeneity issues internally.Wissensgraphen sind eine wichtige Datenquelle von EntitĂ€tswissen. Sie unterstĂŒtzen viele moderne KI-Anwendungen. Dazu gehören unter anderem Websuche, die automatische Beantwortung von Fragen, digitale Assistenten und Online-Shopping. Neue Errungenschaften im maschinellen Lernen und das auĂerordentliche Wachstum des Internets haben zu riesigen Wissensgraphen gefĂŒhrt. Diese umfassen hĂ€ufig Milliarden von Fakten ĂŒber Hunderte von Millionen von EntitĂ€ten; hĂ€ufig aus vielen verschiedenen Quellen. WĂ€hrend die Integration unabhĂ€ngiger Wissensquellen zu einer groĂen Informationsvielfalt fĂŒhren kann, fĂŒhrt sie inhĂ€rent zu HeterogenitĂ€ten in der WissensreprĂ€sentation. Diese HeterogenitĂ€t in den Daten gefĂ€hrdet den praktischen Nutzen der Wissensgraphen. Durch ihre GröĂe lassen sich die Wissensgraphen allerdings nicht mehr manuell bereinigen. DafĂŒr werden heutzutage hĂ€ufig automatische und halbautomatische Methoden benötigt. In dieser Arbeit befassen wir uns mit dem Thema ReprĂ€sentationsheterogenitĂ€t. Wir klassifizieren HeterogenitĂ€t entlang verschiedener Dimensionen und erlĂ€utern HeterogenitĂ€tsprobleme in Datenbanken, Ontologien und Wissensgraphen. Weiterhin geben wir einen knappen Ăberblick ĂŒber verschiedene Techniken zur automatischen Lösung von HeterogenitĂ€tsproblemen. Im nĂ€chsten Kapitel beschĂ€ftigen wir uns mit EntitĂ€tsheterogenitĂ€t. Wir zeigen Probleme auf, die in einem Multi-Wissensgraphen-Szenario aufgrund von fehlerhaften transitiven Links entstehen. Um diese Probleme zu lösen stellen wir vier Techniken vor, mit denen sich die QualitĂ€t beliebiger Entity-Alignment-Tools deutlich verbessern lĂ€sst. Wir zeigen, dass RelationsheterogenitĂ€t in Wissensgraphen zu Problemen bei der Anfragenbeantwortung fĂŒhren kann. Daher entwickeln wir verschiedene Methoden um synonyme Relationen zu finden. Eine der Methoden arbeitet mit hochdimensionalen Wissensgrapheinbettungen, die andere mit einem Rule Mining Ansatz. Beide Methoden können synonyme Relationen in Wissensgraphen mit hoher QualitĂ€t erkennen. DarĂŒber hinaus stellen wir eine neuartige Technik zur Vermeidung von HeterogenitĂ€tsproblemen vor, bei der wir eine implizite WissensreprĂ€sentation verwenden. Wir zeigen, dass groĂe neuronale Sprachmodelle eine wertvolle Wissensquelle sind, die Ă€hnlich wie Wissensgraphen angefragt werden können. Im Sprachmodell selbst werden bereits viele der HeterogenitĂ€tsprobleme aufgelöst, so dass eine Anfrage heterogener Wissensgraphen möglich wird
Efficient Data Analytics on Augmented Similarity Triplets
Many machine learning methods (classification, clustering, etc.) start with a
known kernel that provides similarity or distance measure between two objects.
Recent work has extended this to situations where the information about objects
is limited to comparisons of distances between three objects (triplets). Humans
find the comparison task much easier than the estimation of absolute
similarities, so this kind of data can be easily obtained using crowd-sourcing.
In this work, we give an efficient method of augmenting the triplets data, by
utilizing additional implicit information inferred from the existing data.
Triplets augmentation improves the quality of kernel-based and kernel-free data
analytics tasks. Secondly, we also propose a novel set of algorithms for common
supervised and unsupervised machine learning tasks based on triplets. These
methods work directly with triplets, avoiding kernel evaluations. Experimental
evaluation on real and synthetic datasets shows that our methods are more
accurate than the current best-known techniques
Depth Functions for Partial Orders with a Descriptive Analysis of Machine Learning Algorithms
We propose a framework for descriptively analyzing sets of partial orders
based on the concept of depth functions. Despite intensive studies of depth
functions in linear and metric spaces, there is very little discussion on depth
functions for non-standard data types such as partial orders. We introduce an
adaptation of the well-known simplicial depth to the set of all partial orders,
the union-free generic (ufg) depth. Moreover, we utilize our ufg depth for a
comparison of machine learning algorithms based on multidimensional performance
measures. Concretely, we analyze the distribution of different classifier
performances over a sample of standard benchmark data sets. Our results
promisingly demonstrate that our approach differs substantially from existing
benchmarking approaches and, therefore, adds a new perspective to the vivid
debate on the comparison of classifiers.Comment: Accepted to ISIPTA 2023; Forthcoming in: Proceedings of Machine
Learning Researc
SurfelMeshing: Online Surfel-Based Mesh Reconstruction
We address the problem of mesh reconstruction from live RGB-D video, assuming
a calibrated camera and poses provided externally (e.g., by a SLAM system). In
contrast to most existing approaches, we do not fuse depth measurements in a
volume but in a dense surfel cloud. We asynchronously (re)triangulate the
smoothed surfels to reconstruct a surface mesh. This novel approach enables to
maintain a dense surface representation of the scene during SLAM which can
quickly adapt to loop closures. This is possible by deforming the surfel cloud
and asynchronously remeshing the surface where necessary. The surfel-based
representation also naturally supports strongly varying scan resolution. In
particular, it reconstructs colors at the input camera's resolution. Moreover,
in contrast to many volumetric approaches, ours can reconstruct thin objects
since objects do not need to enclose a volume. We demonstrate our approach in a
number of experiments, showing that it produces reconstructions that are
competitive with the state-of-the-art, and we discuss its advantages and
limitations. The algorithm (excluding loop closure functionality) is available
as open source at https://github.com/puzzlepaint/surfelmeshing .Comment: Version accepted to IEEE Transactions on Pattern Analysis and Machine
Intelligenc
Comparing Machine Learning Algorithms by Union-Free Generic Depth
We propose a framework for descriptively analyzing sets of partial orders
based on the concept of depth functions. Despite intensive studies in linear
and metric spaces, there is very little discussion on depth functions for
non-standard data types such as partial orders. We introduce an adaptation of
the well-known simplicial depth to the set of all partial orders, the
union-free generic (ufg) depth. Moreover, we utilize our ufg depth for a
comparison of machine learning algorithms based on multidimensional performance
measures. Concretely, we provide two examples of classifier comparisons on
samples of standard benchmark data sets. Our results demonstrate promisingly
the wide variety of different analysis approaches based on ufg methods.
Furthermore, the examples outline that our approach differs substantially from
existing benchmarking approaches, and thus adds a new perspective to the vivid
debate on classifier comparison.Comment: arXiv admin note: substantial text overlap with arXiv:2304.0987
- âŠ