1,643 research outputs found

    Lifting the Veil: Unlocking the Power of Depth in Q-learning

    Full text link
    With the help of massive data and rich computational resources, deep Q-learning has been widely used in operations research and management science and has contributed to great success in numerous applications, including recommender systems, supply chains, games, and robotic manipulation. However, the success of deep Q-learning lacks solid theoretical verification and interpretability. The aim of this paper is to theoretically verify the power of depth in deep Q-learning. Within the framework of statistical learning theory, we rigorously prove that deep Q-learning outperforms its traditional version by demonstrating its good generalization error bound. Our results reveal that the main reason for the success of deep Q-learning is the excellent performance of deep neural networks (deep nets) in capturing the special properties of rewards namely, spatial sparseness and piecewise constancy, rather than their large capacities. In this paper, we make fundamental contributions to the field of reinforcement learning by answering to the following three questions: Why does deep Q-learning perform so well? When does deep Q-learning perform better than traditional Q-learning? How many samples are required to achieve a specific prediction accuracy for deep Q-learning? Our theoretical assertions are verified by applying deep Q-learning in the well-known beer game in supply chain management and a simulated recommender system

    Random Neural Networks and Optimisation

    Get PDF
    In this thesis we introduce new models and learning algorithms for the Random Neural Network (RNN), and we develop RNN-based and other approaches for the solution of emergency management optimisation problems. With respect to RNN developments, two novel supervised learning algorithms are proposed. The first, is a gradient descent algorithm for an RNN extension model that we have introduced, the RNN with synchronised interactions (RNNSI), which was inspired from the synchronised firing activity observed in brain neural circuits. The second algorithm is based on modelling the signal-flow equations in RNN as a nonnegative least squares (NNLS) problem. NNLS is solved using a limited-memory quasi-Newton algorithm specifically designed for the RNN case. Regarding the investigation of emergency management optimisation problems, we examine combinatorial assignment problems that require fast, distributed and close to optimal solution, under information uncertainty. We consider three different problems with the above characteristics associated with the assignment of emergency units to incidents with injured civilians (AEUI), the assignment of assets to tasks under execution uncertainty (ATAU), and the deployment of a robotic network to establish communication with trapped civilians (DRNCTC). AEUI is solved by training an RNN tool with instances of the optimisation problem and then using the trained RNN for decision making; training is achieved using the developed learning algorithms. For the solution of ATAU problem, we introduce two different approaches. The first is based on mapping parameters of the optimisation problem to RNN parameters, and the second on solving a sequence of minimum cost flow problems on appropriately constructed networks with estimated arc costs. For the exact solution of DRNCTC problem, we develop a mixed-integer linear programming formulation, which is based on network flows. Finally, we design and implement distributed heuristic algorithms for the deployment of robots when the civilian locations are known or uncertain

    "Sticky Hands": learning and generalization for cooperative physical interactions with a humanoid robot

    Get PDF
    "Sticky Hands" is a physical game for two people involving gentle contact with the hands. The aim is to develop relaxed and elegant motion together, achieve physical sensitivity-improving reactions, and experience an interaction at an intimate yet comfortable level for spiritual development and physical relaxation. We developed a control system for a humanoid robot allowing it to play Sticky Hands with a human partner. We present a real implementation including a physical system, robot control, and a motion learning algorithm based on a generalizable intelligent system capable itself of generalizing observed trajectories' translation, orientation, scale and velocity to new data, operating with scalable speed and storage efficiency bounds, and coping with contact trajectories that evolve over time. Our robot control is capable of physical cooperation in a force domain, using minimal sensor input. We analyze robot-human interaction and relate characteristics of our motion learning algorithm with recorded motion profiles. We discuss our results in the context of realistic motion generation and present a theoretical discussion of stylistic and affective motion generation based on, and motivating cross-disciplinary research in computer graphics, human motion production and motion perception

    A review on model-based and model-free approaches to control soft actuators and their potentials in colonoscopy

    Get PDF
    Colorectal cancer (CRC) is the third most common cancer worldwide and responsible for approximately 1 million deaths annually. Early screening is essential to increase the chances of survival, and it can also reduce the cost of treatments for healthcare centres. Colonoscopy is the gold standard for CRC screening and treatment, but it has several drawbacks, including difficulty in manoeuvring the device, patient discomfort, and high cost. Soft endorobots, small and compliant devices thatcan reduce the force exerted on the colonic wall, offer a potential solution to these issues. However, controlling these soft robots is challenging due to their deformable materials and the limitations of mathematical models. In this Review, we discuss model-free and model-based approaches for controlling soft robots that can potentially be applied to endorobots for colonoscopy. We highlight the importance of selecting appropriate control methods based on various parameters, such as sensor and actuator solutions. This review aims to contribute to the development of smart control strategies for soft endorobots that can enhance the effectiveness and safety of robotics in colonoscopy. These strategies can be defined based on the available information about the robot and surrounding environment, control demands, mechanical design impact and characterization data based on calibration.<br/
    corecore