
    Semantic Recommender System

    Although content-based recommender systems have been shown to produce higher-quality recommendations than collaborative filtering recommenders, the latter are used more often because the former suffer from complex mathematical calculations and inadequate data modeling techniques. Using ontologies to model the data allows machines to better understand both items and users' preferences, and thus not only to suggest better recommendations but also to provide accurate justifications. In this work we present a semantic recommender system that generates recommendations in a novel way based on a Recommender Ontology, which provides controlled vocabularies for the recommendation context and is built on the idea that not all classes and properties are equally important from an item-similarity point of view. If a domain ontology is annotated with the Recommender Ontology, the semantic recommender can generate recommendations for it; thanks to Semantic Web standards, the proposed system therefore works with data from any domain. The proposed mathematical model takes into account, in addition to item features and user profiles, the user's context and the temporal context, so that some items, such as an event ticket, are never recommended once the event is over and are given more prominence before the event. The Recommender Ontology also gives business owners a way to boost recommended items according to their needs, which increases diversity and satisfies business requirements. In our experiments we tested the proposed solution on several domains, including movies, books, and music, as well as with a real company. We obtained 55% accuracy on a movie domain even though only one feature of the movies was known. The main limitation we faced is the absence of a content-based domain dataset that contains ABox, TBox, and ratings together.
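    The temporal-context idea can be illustrated with a minimal sketch (Python, hypothetical function names and weighting scheme, not the paper's actual model): an item's content-based similarity score is multiplied by an owner-defined boost and by a temporal weight that drops to zero once an associated event has passed and grows as the event approaches.

```python
from datetime import datetime, timedelta

def temporal_weight(event_time: datetime, now: datetime,
                    boost_window: timedelta = timedelta(days=14)) -> float:
    """Return a multiplier based on temporal context.

    Items tied to a past event get weight 0 (never recommended);
    items whose event is near get a boost that grows as the event approaches.
    """
    if event_time <= now:            # event is over -> never recommend
        return 0.0
    remaining = event_time - now
    if remaining >= boost_window:    # far in the future -> neutral weight
        return 1.0
    # linearly boost up to 2x as the event approaches
    return 1.0 + (1.0 - remaining / boost_window)

def score(content_similarity: float, business_boost: float,
          event_time: datetime | None, now: datetime) -> float:
    """Combine content similarity, an owner-defined boost, and temporal context."""
    w = temporal_weight(event_time, now) if event_time else 1.0
    return content_similarity * business_boost * w

# Example: a concert ticket scored three days before the event
print(score(0.8, 1.2, datetime(2024, 6, 10), datetime(2024, 6, 7)))
```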

    Theme and rheme in English and German


    Automated Fixing of Programs with Contracts

    This paper describes AutoFix, an automatic debugging technique that can fix faults in general-purpose software. To provide high-quality fix suggestions and to enable automation of the whole debugging process, AutoFix relies on the presence of simple specification elements in the form of contracts (such as pre- and postconditions). Using contracts enhances the precision of dynamic analysis techniques for fault detection and localization, and for validating fixes. The only required user input to the AutoFix supporting tool is then a faulty program annotated with contracts; the tool produces a collection of validated fixes for the fault ranked according to an estimate of their suitability. In an extensive experimental evaluation, we applied AutoFix to over 200 faults in four code bases of different maturity and quality (of implementation and of contracts). AutoFix successfully fixed 42% of the faults, producing, in the majority of cases, corrections of quality comparable to those competent programmers would write; the computational resources used were modest, with an average time per fix below 20 minutes on commodity hardware. These figures compare favorably to the state of the art in automated program fixing, and demonstrate that the AutoFix approach is successfully applicable to reduce the debugging burden in real-world scenarios.
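    As a rough illustration of how simple contracts make faults observable to dynamic analysis, consider the following Python sketch (AutoFix itself targets Eiffel; the function, the bug, and the use of assertions as contracts here are hypothetical): the postcondition fails exactly on the buggy path, pointing at where a fix is needed and providing an oracle for validating candidate fixes.

```python
def remove_first(items: list, value) -> list:
    """Remove the first occurrence of `value` from `items`."""
    # precondition: the value must be present
    assert value in items, "precondition violated: value not in items"
    old_count = items.count(value)

    result = [x for x in items if x != value]   # BUG: removes *all* occurrences

    # postcondition: exactly one occurrence removed
    assert result.count(value) == old_count - 1, \
        "postcondition violated: removed more than one occurrence"
    return result

# A run with a single occurrence hides the bug; one with duplicates exposes it.
remove_first([1, 2, 3], 2)                # passes
try:
    remove_first([1, 2, 2, 3], 2)         # postcondition violation reveals the fault
except AssertionError as e:
    print(e)
```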

    Avoiding Representation Heterogeneities in Real-World Knowledge Graphs

    Knowledge graphs are repositories providing factual knowledge about entities. They are a great source of knowledge to support modern AI applications for Web search, question answering, digital assistants, and online shopping. Advances in machine learning techniques and the growth of the Web have led to colossal knowledge graphs with billions of facts about hundreds of millions of entities collected from a large variety of sources. While integrating independent knowledge sources promises rich information, it inherently leads to heterogeneities in representation due to a large variety of different conceptualizations. The overall utility of real-world knowledge graphs is thus threatened. Due to their sheer size, they can hardly be curated manually anymore; automatic and semi-automatic methods are needed to cope with these vast knowledge repositories. We first address the general topic of representation heterogeneity by surveying the problem throughout various data-intensive fields: databases, ontologies, and knowledge graphs. Different techniques for automatically resolving heterogeneity issues are presented and discussed, and several open problems are identified. Next, we focus on entity heterogeneity. We show that automatic matching techniques may run into quality problems when working in a multi-knowledge-graph scenario due to incorrect transitive identity links, and we present four techniques that can significantly improve the quality of arbitrary entity matching tools. Concerning relation heterogeneity, we show that synonymous relations in knowledge graphs pose several difficulties in querying. We therefore resolve these heterogeneities with knowledge graph embeddings and with Horn rule mining; both methods detect synonymous relations in knowledge graphs with high quality. Furthermore, we present a novel technique for avoiding heterogeneity issues at query time using implicit knowledge storage. We show that large neural language models are a valuable source of knowledge that can be queried in a similar way to knowledge graphs and that already resolves several heterogeneity issues internally.
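    One of the ideas mentioned above, detecting synonymous relations with knowledge graph embeddings, can be sketched roughly as follows (a toy Python example with made-up relation vectors, not the thesis's actual method): relation pairs whose embedding vectors are nearly parallel are flagged as synonym candidates.

```python
import numpy as np

def synonym_candidates(relation_vectors: dict[str, np.ndarray],
                       threshold: float = 0.9) -> list[tuple[str, str, float]]:
    """Flag relation pairs whose embedding vectors have high cosine similarity
    as candidates for synonymous relations."""
    names = list(relation_vectors)
    pairs = []
    for i, r1 in enumerate(names):
        for r2 in names[i + 1:]:
            v1, v2 = relation_vectors[r1], relation_vectors[r2]
            cos = float(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2)))
            if cos >= threshold:
                pairs.append((r1, r2, cos))
    return sorted(pairs, key=lambda p: -p[2])

# Toy vectors standing in for embeddings learned from a knowledge graph
vecs = {
    "birthPlace":   np.array([0.90, 0.10, 0.00]),
    "placeOfBirth": np.array([0.88, 0.12, 0.01]),
    "deathPlace":   np.array([0.10, 0.90, 0.00]),
}
print(synonym_candidates(vecs))
```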

    Information structure and the referential status of linguistic expressions : workshop as part of the 23rd annual meeting of the Deutsche Gesellschaft für Sprachwissenschaft in Leipzig, Leipzig, February 28 - March 2, 2001

    This volume comprises papers that were given at the workshop Information Structure and the Referential Status of Linguistic Expressions, which we organized during the Deutsche Gesellschaft für Sprachwissenschaft (DGfS) conference in Leipzig in February 2001. At this workshop we discussed the connection between information structure and the referential interpretation of linguistic expressions, a topic mostly neglected in current linguistic research. One common aim of the papers is to find out to what extent the focus-background and the topic-comment structurings determine the referential interpretation of simple arguments such as definite and indefinite NPs on the one hand and of sentences on the other.

    Word-to-Word Models of Translational Equivalence

    Parallel texts (bitexts) have properties that distinguish them from other kinds of parallel data. First, most words translate to only one other word. Second, bitext correspondence is noisy. This article presents methods for biasing statistical translation models to reflect these properties. Analysis of the expected behavior of these biases in the presence of sparse data predicts that they will result in more accurate models. The prediction is confirmed by evaluation with respect to a gold standard -- translation models that are biased in this fashion are significantly more accurate than a baseline knowledge-poor model. This article also shows how a statistical translation model can take advantage of various kinds of pre-existing knowledge that might be available about particular language pairs. Even the simplest kinds of language-specific knowledge, such as the distinction between content words and function words, are shown to reliably boost translation model performance on some tasks. Statistical models that are informed by pre-existing knowledge about the model domain combine the best of both the rationalist and empiricist traditions.
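    For readers unfamiliar with the kind of knowledge-poor baseline such biases refine, the following is a minimal sketch of a word-to-word translation model in the style of IBM Model 1, estimated with EM over a toy bitext (a simplified stand-in, not the article's biased models):

```python
from collections import defaultdict

def ibm_model1(bitext, iterations=10):
    """Minimal IBM Model 1 EM estimation of word-translation probabilities t(f|e).
    `bitext` is a list of (source_words, target_words) sentence pairs."""
    t = defaultdict(lambda: 1.0)                 # uniform-ish initialization
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for src, tgt in bitext:
            for f in tgt:
                z = sum(t[(f, e)] for e in src)  # normalization over source words
                for e in src:
                    c = t[(f, e)] / z            # expected alignment count
                    count[(f, e)] += c
                    total[e] += c
        t = defaultdict(float,
                        {(f, e): count[(f, e)] / total[e] for (f, e) in count})
    return t

bitext = [(["the", "house"], ["das", "haus"]),
          (["the", "book"], ["das", "buch"]),
          (["a", "book"], ["ein", "buch"])]
t = ibm_model1(bitext)
print(round(t[("das", "the")], 2))   # "das" ends up aligned most strongly with "the"
```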

    Entity-Oriented Search

    This open access book covers all facets of entity-oriented search—where “search” can be interpreted in the broadest sense of information access—from a unified point of view, and provides a coherent and comprehensive overview of the state of the art. It represents the first synthesis of research in this broad and rapidly developing area. Selected topics are discussed in-depth, the goal being to establish fundamental techniques and methods as a basis for future research and development. Additional topics are treated at a survey level only, containing numerous pointers to the relevant literature. A roadmap for future research, based on open issues and challenges identified along the way, rounds out the book. The book is divided into three main parts, sandwiched between introductory and concluding chapters. The first two chapters introduce readers to the basic concepts, provide an overview of entity-oriented search tasks, and present the various types and sources of data that will be used throughout the book. Part I deals with the core task of entity ranking: given a textual query, possibly enriched with additional elements or structural hints, return a ranked list of entities. This core task is examined in a number of different variants, using both structured and unstructured data collections, and numerous query formulations. In turn, Part II is devoted to the role of entities in bridging unstructured and structured data. Part III explores how entities can enable search engines to understand the concepts, meaning, and intent behind the query that the user enters into the search box, and how they can provide rich and focused responses (as opposed to merely a list of documents)—a process known as semantic search. The final chapter concludes the book by discussing the limitations of current approaches, and suggesting directions for future research. Researchers and graduate students are the primary target audience of this book. A general background in information retrieval is sufficient to follow the material, including an understanding of basic probability and statistics concepts as well as a basic knowledge of machine learning concepts and supervised learning algorithms
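    As a toy illustration of the core entity-ranking task (given a textual query, return a ranked list of entities), the following Python sketch scores hypothetical entities by a simple TF-IDF-style match over their textual descriptions; it greatly simplifies the retrieval models the book actually covers.

```python
import math
from collections import Counter

def rank_entities(query: str, entities: dict[str, str]) -> list[tuple[str, float]]:
    """Rank entities for a textual query using a simple TF-IDF-style score
    over their textual descriptions."""
    docs = {name: Counter(text.lower().split()) for name, text in entities.items()}
    n = len(docs)
    df = Counter()                       # document frequency of each term
    for terms in docs.values():
        df.update(set(terms))
    scores = {}
    for name, terms in docs.items():
        score = 0.0
        for q in query.lower().split():
            if q in terms:
                score += terms[q] * math.log(1 + n / df[q])
        scores[name] = score
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Hypothetical entity catalog; the query should rank "Michigan" first.
entities = {
    "Ann Arbor":  "city in the state of michigan united states",
    "Michigan":   "state in the great lakes region of the united states",
    "Lake Huron": "one of the five great lakes of north america",
}
print(rank_entities("great lakes state", entities))
```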
    • 
