
    Manifold Representations for Continuous-State Reinforcement Learning

    Reinforcement learning (RL) has shown itself to be an effective paradigm for solving optimal control problems with a finite number of states. Generalizing RL techniques to problems with a continuous state space has proven a difficult task. We present an approach to modeling the RL value function using a manifold representation. By explicitly modeling the topology of the value function domain, the traditional problems of discontinuities and resolution can be addressed without resorting to complex function approximators. We describe how manifold techniques can be applied to value-function approximation, and present methods for constructing manifold representations in both batch and online settings. We present empirical results demonstrating the effectiveness of our approach.
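    The chart-based idea behind a manifold representation can be sketched briefly: cover the continuous state space with overlapping local charts, give each chart its own simple local model, and blend predictions with partition-of-unity weights so that discontinuities stay confined to chart boundaries. The class below is an illustrative assumption of such a representation (names, the bump-function weights, and the TD-style update are ours), not the construction used in the paper.

```python
# Hypothetical sketch of a chart-based (manifold-style) value function:
# overlapping local charts, each with its own linear model, blended by
# partition-of-unity weights. Illustrative only.
import numpy as np

class ChartValueFunction:
    def __init__(self, centers, radius, dim):
        self.centers = np.asarray(centers)      # chart centers in state space
        self.radius = radius                    # chart support radius
        self.w = np.zeros((len(centers), dim))  # one linear model per chart
        self.b = np.zeros(len(centers))

    def _memberships(self, s):
        # smooth bump weights that vanish outside each chart's support
        d = np.linalg.norm(self.centers - s, axis=1) / self.radius
        m = np.clip(1.0 - d**2, 0.0, None) ** 2
        total = m.sum()
        return m / total if total > 0 else m

    def value(self, s):
        # partition-of-unity blend of the local linear predictions
        m = self._memberships(s)
        return float(m @ (self.w @ s + self.b))

    def td_update(self, s, target, lr=0.1):
        # gradient-style TD update applied to every chart covering s
        m = self._memberships(s)
        err = target - self.value(s)
        self.w += lr * err * m[:, None] * s[None, :]
        self.b += lr * err * m

# toy usage on a 2-D continuous state space
vf = ChartValueFunction(centers=np.random.rand(20, 2), radius=0.4, dim=2)
vf.td_update(np.array([0.3, 0.7]), target=1.0)
print(vf.value(np.array([0.3, 0.7])))
```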

    Lipschitzness Is All You Need To Tame Off-policy Generative Adversarial Imitation Learning

    Despite the recent success of reinforcement learning in various domains, these approaches remain, for the most part, deterringly sensitive to hyper-parameters and are often riddled with essential engineering feats allowing their success. We consider the case of off-policy generative adversarial imitation learning, and perform an in-depth review, qualitative and quantitative, of the method. We show that forcing the learned reward function to be local Lipschitz-continuous is a sine qua non condition for the method to perform well. We then study the effects of this necessary condition and provide several theoretical results involving the local Lipschitzness of the state-value function. We complement these guarantees with empirical evidence attesting to the strong positive effect that consistent satisfaction of the Lipschitzness constraint on the reward has on imitation performance. Finally, we tackle a generic pessimistic reward preconditioning add-on spawning a large class of reward shaping methods, which makes the base method it is plugged into provably more robust, as shown in several additional theoretical guarantees. We then discuss these through a fine-grained lens and share our insights. Crucially, the guarantees derived and reported in this work are valid for any reward satisfying the Lipschitzness condition; nothing is specific to imitation. As such, these may be of independent interest.
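    Local Lipschitz-continuity of a learned reward (discriminator) is commonly encouraged with a gradient penalty computed on interpolated inputs, in the style of WGAN-GP. The PyTorch sketch below shows that generic mechanism; it is not necessarily the authors' exact regularizer, and `reward_net`, `expert_sa`, `policy_sa`, and `lambda_gp` are placeholder names.

```python
# Gradient-penalty regularizer encouraging local Lipschitzness of a learned
# reward/discriminator on state-action inputs (WGAN-GP style). Minimal sketch,
# not the paper's exact formulation.
import torch

def gradient_penalty(reward_net, expert_sa, policy_sa, target=1.0):
    """Penalize deviations of the reward's input-gradient norm from `target`."""
    eps = torch.rand(expert_sa.size(0), 1, device=expert_sa.device)
    # interpolate between expert and policy samples, then track gradients w.r.t. it
    interp = (eps * expert_sa + (1.0 - eps) * policy_sa).detach().requires_grad_(True)
    r = reward_net(interp)
    grads, = torch.autograd.grad(outputs=r.sum(), inputs=interp, create_graph=True)
    grad_norm = grads.norm(2, dim=1)
    return ((grad_norm - target) ** 2).mean()

# Sketch of how it would enter the reward update:
# loss = bce(reward_net(expert_sa), ones) \
#      + bce(reward_net(policy_sa), zeros) \
#      + lambda_gp * gradient_penalty(reward_net, expert_sa, policy_sa)
```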

    Efficient Ridesharing Order Dispatching with Mean Field Multi-Agent Reinforcement Learning

    A fundamental question in any peer-to-peer ridesharing system is how to dispatch users' ride requests to the right driver, both effectively and efficiently, in real time. Traditional rule-based solutions usually work on a simplified problem setting, which requires a sophisticated hand-crafted weight design for either centralized authority control or decentralized multi-agent scheduling systems. Although recent approaches have used reinforcement learning to provide centralized combinatorial optimization algorithms with informative weight values, their single-agent setting can hardly model the complex interactions between drivers and orders. In this paper, we address the order dispatching problem using multi-agent reinforcement learning (MARL), which follows the distributed nature of the peer-to-peer ridesharing problem and possesses the ability to capture the stochastic demand-supply dynamics in large-scale ridesharing scenarios. Being more reliable than centralized approaches, our proposed MARL solutions can also support fully distributed execution through recent advances in the Internet of Vehicles (IoV) and Vehicle-to-Network (V2N) communication. Furthermore, we adopt the mean field approximation to simplify the local interactions by taking an average action among neighborhoods. The mean field approximation is capable of globally capturing dynamic demand-supply variations by propagating many local interactions between agents and the environment. Our extensive experiments show significant improvements of MARL order dispatching algorithms over several strong baselines on gross merchandise volume (GMV) and order response rate measures. In addition, simulated experiments with real data show that our solution can alleviate the supply-demand gap during rush hours, and thus has the capability of reducing traffic congestion.
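    The mean field approximation mentioned above replaces many pairwise agent interactions with a single interaction between each agent and the average action of its neighborhood, so the Q-function takes the form Q(s, a_i, a_bar_i). The numpy sketch below shows that bookkeeping with a tabular-style update; the neighborhood structure, discretization, and update rule are illustrative assumptions, not the paper's algorithm.

```python
# Mean-field-style Q bookkeeping: each driver conditions its Q-value on the
# average (one-hot) action of its neighbors. Toy sketch only.
import numpy as np

def mean_neighbor_action(actions_onehot, neighbors, agent_id):
    """Average one-hot actions over an agent's neighborhood."""
    idx = neighbors[agent_id]
    if len(idx) == 0:
        return np.zeros(actions_onehot.shape[1])
    return actions_onehot[idx].mean(axis=0)

def mean_field_q_update(Q, state, action, a_bar, reward, next_value,
                        gamma=0.99, lr=0.05):
    """Tabular-style update for Q(state, action, a_bar), with a_bar discretized."""
    key = (state, action, tuple(np.round(a_bar, 1)))
    old = Q.get(key, 0.0)
    Q[key] = old + lr * (reward + gamma * next_value - old)
    return Q

# toy usage: 4 drivers, 3 possible dispatch actions
actions = np.eye(3)[[0, 2, 1, 0]]                    # one-hot joint action
neighbors = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}   # hypothetical grid neighbors
a_bar = mean_neighbor_action(actions, neighbors, agent_id=0)
Q = mean_field_q_update({}, state="grid_17", action=2, a_bar=a_bar,
                        reward=1.0, next_value=0.5)
```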

    Reinforcement learning for sequential decision-making: a data driven approach for finance

    This work presents a variety of reinforcement learning applications in the domain of finance. It is composed of two parts. The first part provides a technical overview of the basic concepts in machine learning that are required to understand and work with the reinforcement learning paradigm and that are shared among the domains of application. Chapter 1 outlines the fundamental principles of machine learning reasoning before introducing the neural network model as a central component of every algorithm presented in this work. Chapter 2 introduces the idea of reinforcement learning from its roots, focusing on the mathematical formalism generally employed in every application. We focus on integrating the reinforcement learning framework with neural networks, and we explain their critical role in the field's development. After the technical part, we present our original contribution, articulated in three different essays. The narrative line follows the idea of introducing the use of various reinforcement learning algorithms through a trading application (Brini and Tantari, 2021) in Chapter 3. In Chapter 4 we focus on one of the presented reinforcement learning algorithms and aim at improving its performance and scalability in solving the trading problem by leveraging prior knowledge of the setting. In Chapter 5 of the second part, we use the same reinforcement learning algorithm to solve the problem of exchanging liquidity in a system of banks that can borrow and lend money, highlighting the flexibility and effectiveness of the reinforcement learning paradigm in the broad financial domain. We conclude with some remarks and ideas for further research in reinforcement learning applied to finance.

    The Role of Machine Learning in Knowledge-Based Response-Adapted Radiotherapy

    With the continuous increase in radiotherapy patient-specific data from multimodality imaging and biotechnology molecular sources, knowledge-based response-adapted radiotherapy (KBR-ART) is emerging as a vital area for personalized treatment in radiation oncology. In KBR-ART, planned dose distributions can be modified based on observed cues in patients’ clinical, geometric, and physiological parameters. In this paper, we present current developments in the field of adaptive radiotherapy (ART) and the progression toward KBR-ART, and we examine several applications of static and dynamic machine learning approaches for realizing the potential of the KBR-ART framework to maximize tumor control and minimize side effects for individual radiotherapy patients. Specifically, three questions required for the realization of KBR-ART are addressed: (1) what knowledge is needed; (2) how to estimate radiotherapy outcomes accurately; and (3) how to adapt optimally. Different machine learning algorithms for KBR-ART applications are discussed and contrasted. Representative examples of different KBR-ART stages are also presented.
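    As a toy illustration of the adaptation loop described above (an outcome model estimates response from observed cues, and the remaining plan is adjusted when the prediction falls short of a target), the sketch below uses an entirely synthetic logistic-regression outcome model. Feature semantics, the threshold, and the dose-scaling rule are hypothetical assumptions for illustration, not clinical guidance or the paper's methods.

```python
# Synthetic toy of a knowledge-based response-adapted loop: predict response
# probability from mid-treatment features, then adjust the remaining planned
# dose if the prediction is below target. Illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))       # stand-ins for imaging/dosimetric/biomarker cues
y = (X @ np.array([1.0, -0.5, 0.8]) + rng.normal(size=200) > 0).astype(int)
outcome_model = LogisticRegression().fit(X, y)   # learned response model (synthetic)

def adapt_dose(planned_dose, patient_features, target_prob=0.7, step=0.05):
    """Scale the remaining planned dose if predicted response is too low (toy rule)."""
    p = outcome_model.predict_proba(patient_features.reshape(1, -1))[0, 1]
    if p < target_prob:
        return planned_dose * (1.0 + step), p
    return planned_dose, p

new_dose, prob = adapt_dose(planned_dose=60.0, patient_features=X[0])
```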