9 research outputs found

    Nonstrict hierarchical reinforcement learning for interactive systems and robots

    Conversational systems and robots that use reinforcement learning for policy optimization in large domains often face the problem of limited scalability. This problem has been addressed either by using function approximation techniques that estimate the approximate true value function of a policy or by using a hierarchical decomposition of a learning task into subtasks. We present a novel approach for dialogue policy optimization that combines the benefits of both hierarchical control and function approximation and that allows flexible transitions between dialogue subtasks to give human users more control over the dialogue. To this end, each reinforcement learning agent in the hierarchy is extended with a subtask transition function and a dynamic state space to allow flexible switching between subdialogues. In addition, the subtask policies are represented with linear function approximation in order to generalize the decision making to situations unseen in training. Our proposed approach is evaluated in an interactive conversational robot that learns to play quiz games. Experimental results, using simulation and real users, provide evidence that our proposed approach can lead to more flexible (natural) interactions than strict hierarchical control and that it is preferred by human users

    Deep reinforcement learning for multi-domain dialogue systems

    Standard deep reinforcement learning methods such as Deep Q-Networks (DQN) for multiple tasks (domains) face scalability problems. We propose a method for multi-domain dialogue policy learning, termed NDQN, and apply it to an information-seeking spoken dialogue system in the domains of restaurants and hotels. Experimental results comparing DQN (baseline) versus NDQN (proposed) using simulations report that our proposed method exhibits better scalability and is promising for optimising the behaviour of multi-domain dialogue systems

    Scaling up deep reinforcement learning for multi-domain dialogue systems

    Standard deep reinforcement learning methods such as Deep Q-Networks (DQN) for multiple tasks (domains) face scalability problems due to large search spaces. This paper proposes a three-stage method for multi-domain dialogue policy learning, termed NDQN, and applies it to an information-seeking spoken dialogue system in the domains of restaurants and hotels. In this method, the first stage performs multi-policy learning via a network of DQN agents; the second makes use of compact state representations by compressing raw inputs; and the third applies a pre-training phase for bootstrapping the behaviour of agents in the network. Experimental results comparing DQN (baseline) versus NDQN (proposed) using simulations report that the proposed method exhibits better scalability and is promising for optimising the behaviour of multi-domain dialogue systems. An additional evaluation reports that the NDQN agents outperformed a K-Nearest Neighbour baseline in task success and dialogue length, yielding more efficient and successful dialogues

    Survey on reinforcement learning for language processing

    In recent years some researchers have explored the use of reinforcement learning (RL) algorithms as key components in the solution of various natural language processing tasks. For instance, some of these algorithms leveraging deep neural learning have found their way into conversational systems. This paper reviews the state of the art of RL methods for their possible use for different problems of natural language processing, focusing primarily on conversational systems, mainly due to their growing relevance. We provide detailed descriptions of the problems as well as discussions of why RL is well-suited to solve them. Also, we analyze the advantages and limitations of these methods. Finally, we elaborate on promising research directions in natural language processing that might benefit from reinforcement learning

    Motion planning of upper-limb exoskeleton robots : a review

    Background: Motion planning is an important part of exoskeleton control that improves the wearer’s safety and comfort. However, its usage introduces the problem of trajectory planning. The objective of trajectory planning is to generate the reference input for the motion-control system. This review explores the methods of trajectory planning for exoskeleton control. In order to reduce the number of surveyed papers, this review focuses on the upper limbs, which require refined three-dimensional motion planning. Methods: A systematic search covering the last 20 years was conducted in Ei Compendex, Inspec-IET, Web of Science, PubMed, ProQuest, and Science-Direct. The search strategy was to use and combine the terms “trajectory planning”, “upper limb”, and “exoskeleton” as high-level keywords. “Trajectory planning” and “motion planning” were also combined with the following keywords: “rehabilitation”, “humanlike motion”, “upper extremity”, “inverse kinematic”, and “learning machine”. Results: A total of 67 relevant papers were discovered. Results were then classified into two main categories of methods to plan trajectory: (i) approaches based on Cartesian motion planning and inverse kinematics using polynomial-interpolation or optimization-based methods such as minimum-jerk, minimum-torque-change, and inertia-like models; and (ii) approaches based on “learning by demonstration” using machine-learning techniques such as supervised learning based on neural networks, and learning methods based on hidden Markov models, Gaussian mixture models, and dynamic motion primitives. Conclusions: Various methods have been proposed to plan the trajectories for upper-limb exoskeleton robots, but most of them plan the trajectory offline. The review approach is general and could be extended to lower limbs.
Trajectory planning has the advantage of extending the applicability of therapy robots to home usage (assistive exoskeletons); it also makes it possible to mitigate the shortages of medical caregivers and therapists, and therapy costs. In this paper, we also discuss challenges associated with trajectory planning: kinematic redundancy and incompatibility, and the trajectory-optimization problem. Commonly, methods based on the computation of swivel angles and other methods rely on the relationship (e.g., coordinated or synergistic) between the degrees of freedom used to resolve kinematic redundancy for exoskeletons. Moreover, two general solutions, namely, the self-tracing configuration of the joint axis and the alignment-free configuration of the joint axis, which add the appropriate number of extra degrees of freedom to the mechanism, were employed to improve the kinematic incompatibility between human and exoskeleton. Future work will focus on online trajectory planning and optimal control. This will be done because very few online methods were found in the scope of this study

    Robust and Cooperative Formation Control of Nonlinear Multi-Agent Systems

    Compared with the conventional approach of controlling autonomous systems individually, building up a cooperative multi-agent structure is more robust and efficient for both research and industrial purposes. Among the many subbranches of multi-agent systems, formation control has been a popular research direction due to its close connection with complex missions such as spacecraft clustering and intelligent transportation. Hence, this thesis focuses on providing new robust formation control algorithms for first-order, second-order and mixed-order nonlinear multi-agent systems to construct and maintain stable system structure in practical scenarios. System uncertainties and external disturbances are commonly seen factors that could negatively affect the formation tracking precision. Among the many popular tools of uncertainty estimation, the implementation of approaches including neural network adaptive estimation and observer-based approximation is discussed in this thesis. Regarding the neural-based approximation process, different neural network structures including Chebyshev neural network, radial basis function neural network, two-layer artificial neural network and three-layer artificial neural network are tested and implemented. The merits and drawbacks of each network design in the field of control are then analysed. Apart from that, this thesis also offers a detailed comparison between the cooperative tuning approach and the observer-based tuning approach regarding the neural network structure to find their corresponding applicable scenarios. To ensure the safety of the formation control algorithms, the issues of obstacle avoidance and inter-agent collision avoidance are both considered. Although the method of constructing artificial potential fields is a popular approach in both the field of path planning and motion control, few have discussed the effect of the inter-agent communication on the collision avoidance scheme.
For the obstacle-avoiding scenarios, the passive correcting behaviour of individual agents is defined and investigated. A new algorithm is then introduced to modify the reference of individual agents to act as the mitigation. The issue of insufficient information accessibility is then discussed for multi-agent systems with a static and incomplete communication topology. A distance-based communication topology is proposed to create the necessary information exchange channel for unconnected agent pairs that are close enough. The actuator saturation issue is also considered for both first-order multi-agent systems and second-order multi-agent systems to increase the practicality of the formation control schemes. Apart from restricting the amplitudes of the control input, the effect of the input coupling phenomenon is investigated. The oscillation of states brought by the coupled and saturated control input is then summarised as the reverse effect. To attenuate the state oscillation, the methods of developing control input regulation algorithms and employing an auxiliary compensator are discussed and validated. The last technical problem to discuss is the hierarchical control scheme. The issue of how to decouple the inter-agent communication and the motion dynamics is discussed for both unified-order and mixed-order multi-agent systems. By using a hierarchical formation control structure, the inter-agent communication process is considered based on a group of virtual agents with ideal characteristics, which can significantly reduce the complexity of the system design. Adaptive hierarchical control schemes are then proposed and validated for both unified-order and mixed-order multi-agent systems through the examples of a multi-drone system and a multiple omni-directional robot system, respectively. Thesis (Ph.D.) -- University of Adelaide, School of Electrical and Electronic Engineering, 202

    Computer Aided Verification

    This open access two-volume set LNCS 10980 and 10981 constitutes the refereed proceedings of the 30th International Conference on Computer Aided Verification, CAV 2018, held in Oxford, UK, in July 2018. The 52 full and 13 tool papers presented together with 3 invited papers and 2 tutorials were carefully reviewed and selected from 215 submissions. The papers cover a wide range of topics and techniques, from algorithmic and logical foundations of verification to practical applications in distributed, networked, cyber-physical, and autonomous systems. They are organized in topical sections on model checking, program analysis using polyhedra, synthesis, learning, runtime verification, hybrid and timed systems, tools, probabilistic systems, static analysis, theory and security, SAT, SMT and decision procedures, concurrency, and CPS, hardware, industrial applications

    Computer Aided Verification

    This open access two-volume set LNCS 10980 and 10981 constitutes the refereed proceedings of the 30th International Conference on Computer Aided Verification, CAV 2018, held in Oxford, UK, in July 2018. The 52 full and 13 tool papers presented together with 3 invited papers and 2 tutorials were carefully reviewed and selected from 215 submissions. The papers cover a wide range of topics and techniques, from algorithmic and logical foundations of verification to practical applications in distributed, networked, cyber-physical, and autonomous systems. They are organized in topical sections on model checking, program analysis using polyhedra, synthesis, learning, runtime verification, hybrid and timed systems, tools, probabilistic systems, static analysis, theory and security, SAT, SMT and decision procedures, concurrency, and CPS, hardware, industrial applications

    International Conference on Mathematical Analysis and Applications in Science and Engineering – Book of Extended Abstracts

    The present volume on Mathematical Analysis and Applications in Science and Engineering - Book of Extended Abstracts of the ICMASC’2022 collects the extended abstracts of the talks presented at the International Conference on Mathematical Analysis and Applications in Science and Engineering – ICMA2SC'22, which took place in the beautiful city of Porto, Portugal, from June 27th to June 29th, 2022 (3 days). Its aim was to bring together researchers in every discipline of applied mathematics, science, engineering, industry, and technology, to discuss the development of new mathematical models, theories, and applications that contribute to the advancement of scientific knowledge and practice. Authors proposed research in topics including partial and ordinary differential equations, integer and fractional order equations, linear algebra, numerical analysis, operations research, discrete mathematics, optimization, control, probability, and computational mathematics, amongst others. The conference was designed to maximize the involvement of all participants and to present state-of-the-art research and the latest achievements.