7,833 research outputs found

    An End-to-End Trainable Neural Network Model with Belief Tracking for Task-Oriented Dialog

    We present a novel end-to-end trainable neural network model for task-oriented dialog systems. The model is able to track dialog state, issue API calls to a knowledge base (KB), and incorporate structured KB query results into system responses to successfully complete task-oriented dialogs. The proposed model produces well-structured system responses by jointly learning belief tracking and KB result processing, conditioned on the dialog history. We evaluate the model in a restaurant search domain using a dataset converted from the second Dialog State Tracking Challenge (DSTC2) corpus. Experimental results show that the proposed model can robustly track dialog state given the dialog history. Moreover, our model demonstrates promising results in producing appropriate system responses, outperforming prior end-to-end trainable neural network models on per-response accuracy evaluation metrics. Comment: Published at Interspeech 201

    Design and evaluation of acceleration strategies for speeding up the development of dialog applications

    In this paper, we describe a complete development platform that features different innovative acceleration strategies, not included in any other current platform, that simplify and speed up the definition of the different elements required to design a spoken dialog service. The proposed accelerations are mainly based on using the information from the backend database schema and contents, as well as cumulative information produced throughout the different steps in the design. Thanks to these accelerations, the interaction between the designer and the platform is improved, and in most cases the design is reduced to simple confirmations of the “proposals” that the platform dynamically provides at each step. In addition, the platform provides several other accelerations, such as configurable templates that can be used to define the different tasks in the service or the dialogs to obtain or show information to the user; automatic proposals for the best way to request slot contents from the user (i.e., using mixed-initiative forms or directed forms); an assistant that offers the set of most probable actions required to complete the definition of the different tasks in the application; and another assistant for solving specific modality details, such as confirmations of user answers or how to present to users the lists of results retrieved after querying the backend database. Additionally, the platform allows the creation of speech grammars and prompts, database access functions, and the possibility of using mixed-initiative and over-answering dialogs. In the paper we also describe in detail each assistant in the platform, emphasizing the different kinds of methodologies followed to facilitate the design process in each one. Finally, we describe the results obtained in both a subjective and an objective evaluation with different designers that confirm the viability, usefulness, and functionality of the proposed accelerations. Thanks to the accelerations, the design time is reduced by more than 56% and the number of keystrokes by 84%.

    Research on speech understanding and related areas at SRI

    Research capabilities in speech understanding, speech recognition, and voice control are described. Research activities, as well as activities that involve text input rather than speech, are discussed.