1,261 research outputs found

    Goal-oriented Dialogue Policy Learning from Failures

    Reinforcement learning methods have been used for learning dialogue policies. However, learning an effective dialogue policy frequently requires prohibitively many conversations. This is partly because of the sparse rewards in dialogues and the very few successful dialogues in the early learning phase. Hindsight experience replay (HER) enables learning from failures, but vanilla HER is inapplicable to dialogue learning because the goals are implicit. In this work, we develop two complex HER methods that provide different trade-offs between complexity and performance and, for the first time, enable HER-based dialogue policy learning. Experiments using a realistic user simulator show that our HER methods perform better than existing experience replay methods (as applied to deep Q-networks) in learning rate.

    Design and Realization of On-line Enterprise Office Automation System

    Abstract: This paper discusses the development process of an online business office automation system, covering requirement analysis, system function design, and the design and implementation of the database. Data flow analysis of the system functions yields the logical structure of the database, and on this basis the physical structure of the database is created to support information query and update operations.

    Learning and Reasoning for Robot Dialog and Navigation Tasks

    You are viewing an article from the Proceedings of the 21st Annual Meeting of the Special Interest Group on Discourse and Dialogue that was in the Good Systems Network Digest in 2020. Office of the VP for Researc

    A new integral equation formulation for American put options

    In this paper, a completely new integral equation for the price of an American put option, as well as its optimal exercise price, is derived. Compared with existing integral equations for pricing American options, the new formulation has two distinguishing advantages: (i) it takes the form of a one-dimensional integral, and (ii) it is free from any discontinuities or singularities associated with the optimal exercise boundary at the expiry time. These features lead to a significant enhancement of computational accuracy and efficiency, as shown in the examples.