15,785 research outputs found

    Parsing Coordination for Spoken Language Understanding

    Full text link
    Typical spoken language understanding systems provide narrow semantic parses using a domain-specific ontology. The parses contain intents and slots that are directly consumed by downstream domain applications. In this work we discuss expanding such systems to handle compound entities and intents by introducing a domain-agnostic shallow parser that handles linguistic coordination. We show that our model for parsing coordination learns domain-independent and slot-independent features and is able to segment conjunct boundaries of many different phrasal categories. We also show that using adversarial training can be effective for improving generalization across different slot types for coordination parsing.Comment: The paper was published in SLT 2018 conferenc

    Human-Robot Collaboration: From Psychology to Social Robotics

    Full text link
    With the advances in robotic technology, research in human-robot collaboration (HRC) has gained in importance. For robots to interact with humans autonomously they need active decision making that takes human partners into account. However, state-of-the-art research in HRC does often assume a leader-follower division, in which one agent leads the interaction. We believe that this is caused by the lack of a reliable representation of the human and the environment to allow autonomous decision making. This problem can be overcome by an embodied approach to HRC which is inspired by psychological studies of human-human interaction (HHI). In this survey, we review neuroscientific and psychological findings of the sensorimotor patterns that govern HHI and view them in a robotics context. Additionally, we study the advances made by the robotic community into the direction of embodied HRC. We focus on the mechanisms that are required for active, physical human-robot collaboration. Finally, we discuss the similarities and differences in the two fields of study which pinpoint directions of future research

    Microplanning with Communicative Intentions: The SPUD System

    Full text link
    The process of microplanning encompasses a range of problems in Natural Language Generation (NLG), such as referring expression generation, lexical choice, and aggregation, problems in which a generator must bridge underlying domain-specific representations and general linguistic representations. In this paper, we describe a uniform approach to microplanning based on declarative representations of a generator's communicative intent. These representations describe the results of NLG: communicative intent associates the concrete linguistic structure planned by the generator with inferences that show how the meaning of that structure communicates needed information about some application domain in the current discourse context. Our approach, implemented in the SPUD (sentence planning using description) microplanner, uses the lexicalized tree-adjoining grammar formalism (LTAG) to connect structure to meaning and uses modal logic programming to connect meaning to context. At the same time, communicative intent representations provide a resource for the process of NLG. Using representations of communicative intent, a generator can augment the syntax, semantics and pragmatics of an incomplete sentence simultaneously, and can assess its progress on the various problems of microplanning incrementally. The declarative formulation of communicative intent translates into a well-defined methodology for designing grammatical and conceptual resources which the generator can use to achieve desired microplanning behavior in a specified domain

    Conversation as Action Under Uncertainty

    Full text link
    Conversations abound with uncetainties of various kinds. Treating conversation as inference and decision making under uncertainty, we propose a task independent, multimodal architecture for supporting robust continuous spoken dialog called Quartet. We introduce four interdependent levels of analysis, and describe representations, inference procedures, and decision strategies for managing uncertainties within and between the levels. We highlight the approach by reviewing interactions between a user and two spoken dialog systems developed using the Quartet architecture: Prsenter, a prototype system for navigating Microsoft PowerPoint presentations, and the Bayesian Receptionist, a prototype system for dealing with tasks typically handled by front desk receptionists at the Microsoft corporate campus.Comment: Appears in Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI2000

    Grounding Spatio-Semantic Referring Expressions for Human-Robot Interaction

    Full text link
    The human language is one of the most natural interfaces for humans to interact with robots. This paper presents a robot system that retrieves everyday objects with unconstrained natural language descriptions. A core issue for the system is semantic and spatial grounding, which is to infer objects and their spatial relationships from images and natural language expressions. We introduce a two-stage neural-network grounding pipeline that maps natural language referring expressions directly to objects in the images. The first stage uses visual descriptions in the referring expressions to generate a candidate set of relevant objects. The second stage examines all pairwise relationships between the candidates and predicts the most likely referred object according to the spatial descriptions in the referring expressions. A key feature of our system is that by leveraging a large dataset of images labeled with text descriptions, it allows unrestricted object types and natural language referring expressions. Preliminary results indicate that our system outperforms a near state-of-the-art object comprehension system on standard benchmark datasets. We also present a robot system that follows voice commands to pick and place previously unseen objects.Comment: 8 pages, 4 figures, Accepted at RSS 2017 Workshop on Spatial-Semantic Representations in Robotic

    Situational Grounding within Multimodal Simulations

    Full text link
    In this paper, we argue that simulation platforms enable a novel type of embodied spatial reasoning, one facilitated by a formal model of object and event semantics that renders the continuous quantitative search space of an open-world, real-time environment tractable. We provide examples for how a semantically-informed AI system can exploit the precise, numerical information provided by a game engine to perform qualitative reasoning about objects and events, facilitate learning novel concepts from data, and communicate with a human to improve its models and demonstrate its understanding. We argue that simulation environments, and game engines in particular, bring together many different notions of "simulation" and many different technologies to provide a highly-effective platform for developing both AI systems and tools to experiment in both machine and human intelligence.Comment: AAAI-19 Workshop on Games and Simulations for Artificial Intelligenc

    The Role of Artificial Intelligence Technologies in Crisis Response

    Full text link
    Crisis response poses many of the most difficult information technology in crisis management. It requires information and communication-intensive efforts, utilized for reducing uncertainty, calculating and comparing costs and benefits, and managing resources in a fashion beyond those regularly available to handle routine problems. In this paper, we explore the benefits of artificial intelligence technologies in crisis response. This paper discusses the role of artificial intelligence technologies; namely, robotics, ontology and semantic web, and multi-agent systems in crisis response.Comment: 6 pages, 5 figures, 1 table, accepted for MENDEL 2008 14th International Conference on Soft Computing, June 18-20, Brno, Czech Republi

    Evaluating Personal Assistants on Mobile devices

    Full text link
    The iPhone was introduced only a decade ago in 2007 but has fundamentally changed the way we interact with online information. Mobile devices differ radically from classic command-based and point-and-click user interfaces, now allowing for gesture-based interaction using fine-grained touch and swipe signals. Due to the rapid growth in the use of voice-controlled intelligent personal assistants on mobile devices, such as Microsoft's Cortana, Google Now, and Apple's Siri, mobile devices have become personal, allowing us to be online all the time, and assist us in any task, both in work and in our daily lives, making context a crucial factor to consider. Mobile usage is now exceeding desktop usage, and is still growing at a rapid rate, yet our main ways of training and evaluating personal assistants are still based on (and framed in) classical desktop interactions, focusing on explicit queries, clicks, and dwell time spent. However, modern user interaction with mobile devices is radically different due to touch screens with a gesture- and voice-based control and the varying context of use, e.g., in a car, by bike, often invalidating the assumptions underlying today's user satisfaction evaluation. There is an urgent need to understand voice- and gesture-based interaction, taking all interaction signals and context into account in appropriate ways. We propose a research agenda for developing methods to evaluate and improve context-aware user satisfaction with mobile interactions using gesture-based signals at scale

    Online Semi-Supervised Learning with Deep Hybrid Boltzmann Machines and Denoising Autoencoders

    Full text link
    Two novel deep hybrid architectures, the Deep Hybrid Boltzmann Machine and the Deep Hybrid Denoising Auto-encoder, are proposed for handling semi-supervised learning problems. The models combine experts that model relevant distributions at different levels of abstraction to improve overall predictive performance on discriminative tasks. Theoretical motivations and algorithms for joint learning for each are presented. We apply the new models to the domain of data-streams in work towards life-long learning. The proposed architectures show improved performance compared to a pseudo-labeled, drop-out rectifier network

    Theory of mind and decision science: Towards a typology of tasks and computational models

    Get PDF
    The ability to form a Theory of Mind (ToM), i.e., to theorize about others’ mental states to explain and predict behavior in relation to attributed intentional states, constitutes a hallmark of human cognition. These abilities are multi-faceted and include a variety of different cognitive sub-functions. Here, we focus on decision processes in social contexts and review a number of experimental and computational modeling approaches in this field. We provide an overview of experimental accounts and formal computational models with respect to two dimensions: interactivity and uncertainty. Thereby, we aim at capturing the nuances of ToM functions in the context of social decision processes. We suggest there to be an increase in ToM engagement and multiplexing as social cognitive decision-making tasks become more interactive and uncertain. We propose that representing others as intentional and goal directed agents who perform consequential actions is elicited only at the edges of these two dimensions. Further, we argue that computational models of valuation and beliefs follow these dimensions to best allow researchers to effectively model sophisticated ToM-processes. Finally, we relate this typology to neuroimaging findings in neurotypical (NT) humans, studies of persons with autism spectrum (AS), and studies of nonhuman primates
    • …
    corecore