CLSE: Corpus of Linguistically Significant Entities
One of the biggest challenges of natural language generation (NLG) is the
proper handling of named entities. Named entities are a common source of
grammar mistakes such as wrong prepositions, wrong article handling, or
incorrect entity inflection. Without factoring in linguistic representation, such
errors are often underrepresented when evaluating on a small set of arbitrarily
picked argument values, or when translating a dataset from a linguistically
simpler language, like English, to a linguistically complex language, like
Russian. However, for some applications, broadly precise grammatical
correctness is critical -- native speakers may find entity-related grammar
errors silly, jarring, or even offensive.
To enable the creation of more linguistically diverse NLG datasets, we
release a Corpus of Linguistically Significant Entities (CLSE) annotated by
linguist experts. The corpus includes 34 languages and covers 74 different
semantic types to support various applications from airline ticketing to video
games. To demonstrate one possible use of CLSE, we produce an augmented version
of the Schema-Guided Dialog Dataset, SGD-CLSE. Using the CLSE's entities and a
small number of human translations, we create a linguistically representative
NLG evaluation benchmark in three languages: French (high-resource), Marathi
(low-resource), and Russian (highly inflected language). We establish quality
baselines for neural, template-based, and hybrid NLG systems and discuss the
strengths and weaknesses of each approach.
Comment: Proceedings of the 2nd Workshop on Natural Language Generation,
Evaluation, and Metrics (GEM 2022) at EMNLP 202
Creation of dialog applications based on schema.org ontology
This thesis aims to design and implement a dialog application based on the Schema.org ontology. The application, built with a semi-automatic creation approach, can conduct a simple dialog with a user. The Schema.org ontology lets the chatbot read structured and semi-structured data from different data sources, namely RDF databases and web pages, respectively. The chatbot also uses the ontology for active orientation in the chosen domain, which in our case is movies. To implement the chatbot in Python, we use well-known NLP algorithms to extract data from user utterances. We have also integrated the chatbot into the Amazon Alexa virtual agent and the Slack IM platform. The chatbot is evaluated on real conversations with users.
Semantic Systems. The Power of AI and Knowledge Graphs
This open access book constitutes the refereed proceedings of the 15th International Conference on Semantic Systems, SEMANTiCS 2019, held in Karlsruhe, Germany, in September 2019. The 20 full papers and 8 short papers presented in this volume were carefully reviewed and selected from 88 submissions. They cover topics such as: web semantics and linked (open) data; machine learning and deep learning techniques; semantic information management and knowledge integration; terminology, thesaurus and ontology management; data mining and knowledge discovery; semantics in blockchain and distributed ledger technologies
Supporting the Discoverability of Open Educational Resources: on the Scent of a Hidden Treasury
Open Educational Resources (OERs), now available in large numbers, have considerable potential to improve many aspects of society, yet one of the factors limiting this positive impact is the difficulty of discovering them. This thesis investigates and proposes strategies to better support educators in discovering OERs.
The literature suggests that the effectiveness of existing search systems, including those for OER discovery, could be improved by supporting users, such as teachers, in carrying out more exploratory search activities closer to their existing methods of working. Hence, a preliminary taxonomy of OER-related search tasks was produced, based on an analysis of the literature, interpreted through Information Foraging Theory. This taxonomy was empirically evaluated to identify a preliminary set of search tasks that involve finding other OERs similar to one that has already been identified, a process generally referred to as Query By Example (QBE). Following the Design Science Research methodology, three prototypes to support and refine those tasks were iteratively designed, implemented, and evaluated, involving an increasing number of educators in usability-oriented studies. The resulting high-level and domain-oriented blended search/recommendation strategy transparently replicates Google searches in specialized networks, and identifies similar resources with a QBE strategy. It makes use of a domain-oriented similarity metric based on shared alignments to educational standards, and clusters results in expandable classes of comparable degrees of similarity. The summative evaluation shows that educators appreciate this strategy because it is exploratory and, by balancing similarity and diversity, supports their high-level tasks, such as lesson planning and personalization of education. Finally, potential barriers and opportunities for the uptake of OER discovery tools were investigated in a structured interview study with experts from the OER field. Identified issues included how to work across multiple OER portals, variability in the use of metadata, and how to align with the working practices of teachers.
The findings of the thesis can be used to inform the research and development of methods and tools for OER discovery as well as their deployment to serve the needs of educators
Development of a context knowledge system for mobile conversational agents
A mobile conversational agent or chatbot is software that can perform tasks or services for a particular user or group. The main goal of this Final Degree Project is to develop a context knowledge system for mobile agents and to provide it with tools that allow it to adapt dynamically. This system will allow the user to receive personalised suggestions of actions based on their context and preferences. This project is developed in the A modality, which means it is associated with a university department. In this case, the project is linked to the Software and Service Engineering Group (GESSI) of the Barcelona School of Informatics, Universitat Politècnica de Catalunya.
This system will expose feature integrations between different applications on a mobile device, allowing the user to perform actions in one application and receive suggestions of possible actions to be executed in another, letting them complete that action without having to explicitly open the application in question
Entity-Oriented Search
This open access book covers all facets of entity-oriented search (where "search" can be interpreted in the broadest sense of information access) from a unified point of view, and provides a coherent and comprehensive overview of the state of the art. It represents the first synthesis of research in this broad and rapidly developing area. Selected topics are discussed in-depth, the goal being to establish fundamental techniques and methods as a basis for future research and development. Additional topics are treated at a survey level only, containing numerous pointers to the relevant literature. A roadmap for future research, based on open issues and challenges identified along the way, rounds out the book. The book is divided into three main parts, sandwiched between introductory and concluding chapters. The first two chapters introduce readers to the basic concepts, provide an overview of entity-oriented search tasks, and present the various types and sources of data that will be used throughout the book. Part I deals with the core task of entity ranking: given a textual query, possibly enriched with additional elements or structural hints, return a ranked list of entities. This core task is examined in a number of different variants, using both structured and unstructured data collections, and numerous query formulations. In turn, Part II is devoted to the role of entities in bridging unstructured and structured data. Part III explores how entities can enable search engines to understand the concepts, meaning, and intent behind the query that the user enters into the search box, and how they can provide rich and focused responses (as opposed to merely a list of documents), a process known as semantic search. The final chapter concludes the book by discussing the limitations of current approaches, and suggesting directions for future research. Researchers and graduate students are the primary target audience of this book.
A general background in information retrieval is sufficient to follow the material, including an understanding of basic probability and statistics concepts as well as a basic knowledge of machine learning concepts and supervised learning algorithms