94 research outputs found

    Searching and ranking in entity-relationship graphs

    The Web bears the potential to become the world's most comprehensive knowledge base. Organizing information from the Web into entity-relationship graph structures could be a first step towards unleashing this potential. In a second step, the inherent semantics of such structures would have to be exploited by expressive search techniques that go beyond today's keyword search paradigm. In this realm, as a first contribution of this thesis, we present NAGA (Not Another Google Answer), a new semantic search engine. NAGA provides an expressive, graph-based query language that enables queries with entities and relationships. The results are retrieved based on subgraph matching techniques and ranked by means of a statistical ranking model. As a second contribution, we present STAR (Steiner Tree Approximation in Relationship Graphs), an efficient technique for finding "close" relations (i.e., compact connections) between k (> 2) entities of interest in large entity-relationship graphs. Our third contribution is MING (Mining Informative Graphs). MING is an efficient method for retrieving "informative" subgraphs for k (> 2) entities of interest from an entity-relationship graph. Intuitively, these would be subgraphs that can explain the relations between the k entities of interest. The knowledge discovery tasks supported by MING have a stronger semantic flavor than the ones supported by STAR. STAR and MING are integrated into the query answering component of the NAGA engine. NAGA itself is a fully implemented prototype system and is part of the YAGO-NAGA project.
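To make the idea of querying with entities and relationships concrete, here is a minimal sketch of subgraph matching over a small set of facts. It is not NAGA's query language or implementation: the `$`-prefixed variable convention, the function names, and the toy facts are all assumptions made for illustration, and the statistical ranking step is left out.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def is_var(term: str) -> bool:
    # Assumed convention for this sketch: variables start with "$".
    return term.startswith("$")


def match_query(
    graph: List[Triple],
    query: List[Triple],
    binding: Optional[Dict[str, str]] = None,
) -> Iterator[Dict[str, str]]:
    """Yield every variable binding under which all query edges occur in the graph."""
    binding = {} if binding is None else binding
    if not query:
        yield dict(binding)
        return
    first, rest = query[0], query[1:]
    for fact in graph:
        candidate = dict(binding)
        consistent = True
        for q_term, f_term in zip(first, fact):
            if is_var(q_term):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(q_term, f_term) != f_term:
                    consistent = False
                    break
            elif q_term != f_term:
                consistent = False
                break
        if consistent:
            yield from match_query(graph, rest, candidate)


# Toy entity-relationship graph (illustrative facts only).
graph = [
    ("Albert_Einstein", "bornIn", "Ulm"),
    ("Ulm", "locatedIn", "Germany"),
    ("Max_Planck", "bornIn", "Kiel"),
    ("Kiel", "locatedIn", "Germany"),
    ("Albert_Einstein", "hasWonPrize", "Nobel_Prize"),
]

# "Who was born in a city located in Germany?"
query = [("$x", "bornIn", "$city"), ("$city", "locatedIn", "Germany")]
results = sorted(match_query(graph, query), key=lambda b: b["$x"])
```

Each result is one matching subgraph, expressed as a variable binding. A real engine would then score and order these matches rather than returning them unranked.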

    Entry Dependent Expert Selection in Distributed Gaussian Processes Using Multilabel Classification

    By distributing the training process, local approximation reduces the cost of the standard Gaussian Process. An ensemble technique combines local predictions from Gaussian experts trained on different partitions of the data. Ensemble methods aggregate the models' predictions by assuming perfect diversity among the local predictors. Although this assumption keeps the aggregation tractable, it is often violated in practice. Ensemble methods that instead assume dependencies between experts provide consistent results, but they have a high computational cost, which is cubic in the number of experts involved. By implementing an expert selection strategy, the final aggregation step uses fewer experts and is more efficient. However, a selection approach that assigns a fixed set of experts to each new data point cannot encode the specific properties of each unique data point. This paper proposes a flexible expert selection approach based on the characteristics of entry data points. To this end, we investigate the selection task as a multi-label classification problem where the experts define labels, and each entry point is assigned to some experts. The proposed solution's prediction quality, efficiency, and asymptotic properties are discussed in detail. We demonstrate the efficacy of our method through extensive numerical experiments using synthetic and real-world data sets. Comment: A condensed version of this work has been accepted at the Gaussian Processes, Spatiotemporal Modeling, and Decision-making Systems workshop during NeurIPS 202
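As a rough illustration of the aggregation step this abstract refers to, the sketch below combines the Gaussian predictions of several local experts with the standard product-of-experts rule, which treats the experts as independent. It is not the paper's method: the function name, the `selected` argument, and the numbers are assumptions, and the multi-label classifier that would choose the experts for each entry point is replaced here by a hand-picked index list.

```python
import numpy as np


def poe_aggregate(means, variances, selected=None):
    """Product-of-experts aggregation of Gaussian predictions at one test point.

    means, variances: per-expert predictive mean and variance.
    selected: optional indices of the experts to use (hypothetical stand-in
    for the output of an expert selection step).
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    if selected is not None:
        means = means[selected]
        variances = variances[selected]
    precisions = 1.0 / variances
    # Independence assumption: precisions add, means are precision-weighted.
    agg_var = 1.0 / precisions.sum()
    agg_mean = agg_var * (precisions * means).sum()
    return float(agg_mean), float(agg_var)


# Three local experts; the third disagrees strongly with the other two.
expert_means = [1.0, 3.0, 10.0]
expert_vars = [1.0, 1.0, 1.0]

all_mean, all_var = poe_aggregate(expert_means, expert_vars)
sub_mean, sub_var = poe_aggregate(expert_means, expert_vars, selected=[0, 1])
```

Restricting the aggregation to a subset changes both the combined mean and the combined variance, which is why the choice of experts for each entry point matters.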

    User Intent Recognition and Satisfaction with Large Language Models: A User Study with ChatGPT

    The rapid evolution of large language models such as GPT-4 Turbo represents an impactful paradigm shift in digital interaction and content engagement. While these models encode vast amounts of human-generated knowledge and excel in processing diverse data types, recent research shows that they often face the challenge of accurately responding to specific user intents, leading to increased user dissatisfaction. Based on a fine-grained intent taxonomy and intent-based prompt reformulations, we analyze (1) the quality of intent recognition and (2) user satisfaction with answers from intent-based prompt reformulations for two recent ChatGPT models, GPT-3.5 Turbo and GPT-4 Turbo. The results reveal that GPT-4 outperforms GPT-3.5 on the recognition of common intents, but is conversely often outperformed by GPT-3.5 on the recognition of less frequent intents. Moreover, when the user intent is correctly recognized, users are more satisfied with GPT-4's answers to intent-based reformulations than with GPT-3.5's, yet they tend to be more satisfied with the models' answers to their original prompts than with the answers to the reformulated ones. Finally, the study indicates that users can quickly learn to formulate their prompts more effectively once they are shown possible reformulation templates.