61 research outputs found

    CLSE: Corpus of Linguistically Significant Entities

    One of the biggest challenges of natural language generation (NLG) is the proper handling of named entities. Named entities are a common source of grammar mistakes such as wrong prepositions, wrong article handling, or incorrect entity inflection. Without factoring linguistic representation, such errors are often underrepresented when evaluating on a small set of arbitrarily picked argument values, or when translating a dataset from a linguistically simpler language, like English, to a linguistically complex language, like Russian. However, for some applications, broadly precise grammatical correctness is critical -- native speakers may find entity-related grammar errors silly, jarring, or even offensive. To enable the creation of more linguistically diverse NLG datasets, we release a Corpus of Linguistically Significant Entities (CLSE) annotated by linguist experts. The corpus includes 34 languages and covers 74 different semantic types to support various applications from airline ticketing to video games. To demonstrate one possible use of CLSE, we produce an augmented version of the Schema-Guided Dialog Dataset, SGD-CLSE. Using the CLSE's entities and a small number of human translations, we create a linguistically representative NLG evaluation benchmark in three languages: French (high-resource), Marathi (low-resource), and Russian (highly inflected language). We establish quality baselines for neural, template-based, and hybrid NLG systems and discuss the strengths and weaknesses of each approach. Comment: Proceedings of the 2nd Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2022) at EMNLP 202

    Creation of dialog applications based on schema.org ontology

    This thesis aims to design and implement a dialog application based on the Schema.org ontology. The application uses a semi-automatic creation approach to conduct a simple dialog with a user. The Schema.org ontology helps the chatbot read structured and semi-structured data from different data sources, namely RDF databases and web pages. The chatbot also uses the ontology for active orientation in the given domain, which in our case is movies. To implement the chatbot in Python, we used widespread NLP algorithms for extracting data from user utterances. Moreover, we integrated the chatbot into the smart speaker Amazon Alexa and the IM platform Slack. The chatbot is evaluated on real dialogs with users.

    Semantic Systems. The Power of AI and Knowledge Graphs

    This open access book constitutes the refereed proceedings of the 15th International Conference on Semantic Systems, SEMANTiCS 2019, held in Karlsruhe, Germany, in September 2019. The 20 full papers and 8 short papers presented in this volume were carefully reviewed and selected from 88 submissions. They cover topics such as: web semantics and linked (open) data; machine learning and deep learning techniques; semantic information management and knowledge integration; terminology, thesaurus and ontology management; data mining and knowledge discovery; semantics in blockchain and distributed ledger technologies

    Development of a context knowledge system for mobile conversational agents

    A mobile conversational agent, or chatbot, is software that can perform tasks or services for a particular user or group. The main goal of this Final Degree Project is to develop a context knowledge system for mobile agents and to provide it with tools that allow it to adapt dynamically. This system will allow the user to receive personalised suggestions of actions based on their context and preferences. This project is developed in the A modality, which means it is associated with a university department. In this case, the project is linked to the Software and Service Engineering Group (GESSI) of the Barcelona School of Informatics, Universitat Politècnica de Catalunya. The system will expose feature integrations between different applications on a mobile device, allowing the user to perform actions in one application and receive suggestions of possible actions to be executed in another, letting them complete that action without having to explicitly open the application in question.

    Entity-Oriented Search

    This open access book covers all facets of entity-oriented search—where “search” can be interpreted in the broadest sense of information access—from a unified point of view, and provides a coherent and comprehensive overview of the state of the art. It represents the first synthesis of research in this broad and rapidly developing area. Selected topics are discussed in-depth, the goal being to establish fundamental techniques and methods as a basis for future research and development. Additional topics are treated at a survey level only, containing numerous pointers to the relevant literature. A roadmap for future research, based on open issues and challenges identified along the way, rounds out the book. The book is divided into three main parts, sandwiched between introductory and concluding chapters. The first two chapters introduce readers to the basic concepts, provide an overview of entity-oriented search tasks, and present the various types and sources of data that will be used throughout the book. Part I deals with the core task of entity ranking: given a textual query, possibly enriched with additional elements or structural hints, return a ranked list of entities. This core task is examined in a number of different variants, using both structured and unstructured data collections, and numerous query formulations. In turn, Part II is devoted to the role of entities in bridging unstructured and structured data. Part III explores how entities can enable search engines to understand the concepts, meaning, and intent behind the query that the user enters into the search box, and how they can provide rich and focused responses (as opposed to merely a list of documents)—a process known as semantic search. The final chapter concludes the book by discussing the limitations of current approaches, and suggesting directions for future research. Researchers and graduate students are the primary target audience of this book. 
A general background in information retrieval is sufficient to follow the material, including an understanding of basic probability and statistics concepts as well as a basic knowledge of machine learning concepts and supervised learning algorithms