An End-to-End Trainable Neural Network Model with Belief Tracking for Task-Oriented Dialog
We present a novel end-to-end trainable neural network model for
task-oriented dialog systems. The model is able to track dialog state, issue
API calls to knowledge base (KB), and incorporate structured KB query results
into system responses to successfully complete task-oriented dialogs. The
proposed model produces well-structured system responses by jointly learning
belief tracking and KB result processing conditioning on the dialog history. We
evaluate the model in a restaurant search domain using a dataset that is
converted from the second Dialog State Tracking Challenge (DSTC2) corpus.
Experiment results show that the proposed model can robustly track dialog state
given the dialog history. Moreover, our model demonstrates promising results in
producing appropriate system responses, outperforming prior end-to-end
trainable neural network models using per-response accuracy evaluation metrics. Comment: Published at Interspeech 201
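The abstract reports per-response accuracy but does not define it. As a rough illustration only, the sketch below assumes the metric is the fraction of system responses that exactly match the reference response after trimming whitespace and lowercasing. The function name `per_response_accuracy` and the sample restaurant-domain responses are hypothetical and are not taken from the paper or from DSTC2.

```python
from typing import Sequence


def per_response_accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of predicted responses that exactly match their reference.

    Assumption: a response counts as correct only if it equals the reference
    after stripping surrounding whitespace and lowercasing. The paper may
    use a different matching rule.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    matches = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predicted, reference)
    )
    return matches / len(reference)


# Hypothetical responses, for illustration only.
predicted = [
    "What kind of food would you like?",
    "api_call italian south cheap",
    "Sorry, there is no matching restaurant.",
]
reference = [
    "What kind of food would you like?",
    "api_call italian south moderate",
    "Sorry, there is no matching restaurant.",
]
accuracy = per_response_accuracy(predicted, reference)
```

Under this definition a response with a single wrong slot value, such as the second one above, counts as fully incorrect, which is why the metric is strict for responses that embed API calls.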
Design and evaluation of acceleration strategies for speeding up the development of dialog applications
In this paper, we describe a complete development platform that features different innovative acceleration strategies, not included in any other current platform, that simplify and speed up the definition of the different elements required to design a spoken dialog service. The proposed accelerations are mainly based on using the information from the backend database schema and contents, as well as cumulative information produced throughout the different steps in the design. Thanks to these accelerations, the interaction between the designer and the platform is improved, and in most cases the design is reduced to simple confirmations of the “proposals” that the platform dynamically provides at each step.
In addition, the platform provides several other accelerations: configurable templates for defining the different tasks in the service or the dialogs that obtain or show information to the user; automatic proposals for the best way to request slot contents from the user (i.e., using mixed-initiative forms or directed forms); an assistant that offers the set of most probable actions required to complete the definition of the different tasks in the application; and another assistant for resolving specific modality details, such as confirmations of user answers or how to present the lists of retrieved results to users after querying the backend database. The platform also allows the creation of speech grammars and prompts and of database access functions, and it supports mixed-initiative and over-answering dialogs. In the paper we also describe each assistant in the platform in detail, emphasizing the different kinds of methodologies followed to facilitate the design process in each one.
Finally, we describe the results of both a subjective and an objective evaluation with different designers, which confirm the viability, usefulness, and functionality of the proposed accelerations. Thanks to the accelerations, design time is reduced by more than 56% and the number of keystrokes by 84%
Research on speech understanding and related areas at SRI
Research capabilities in speech understanding, speech recognition, and voice control are described. Research activities, and activities that involve text input rather than speech, are discussed
- …