Lifting the Veil: Unlocking the Power of Depth in Q-learning
With the help of massive data and rich computational resources, deep
Q-learning has been widely used in operations research and management science
and has contributed to great success in numerous applications, including
recommender systems, supply chains, games, and robotic manipulation. However,
the success of deep Q-learning lacks solid theoretical verification and
interpretability. The aim of this paper is to theoretically verify the power of
depth in deep Q-learning. Within the framework of statistical learning theory,
we rigorously prove that deep Q-learning outperforms its traditional version by
demonstrating its good generalization error bound. Our results reveal that the
main reason for the success of deep Q-learning is the excellent performance of
deep neural networks (deep nets) in capturing the special properties of rewards,
namely spatial sparseness and piecewise constancy, rather than their large
capacities. In this paper, we make fundamental contributions to the field of
reinforcement learning by answering the following three questions: Why does
deep Q-learning perform so well? When does deep Q-learning perform better than
traditional Q-learning? How many samples are required to achieve a specific
prediction accuracy for deep Q-learning? Our theoretical assertions are
verified by applying deep Q-learning in the well-known beer game in supply
chain management and a simulated recommender system.
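For readers unfamiliar with the "traditional" baseline the abstract compares against, the following is a minimal sketch of tabular Q-learning, not the paper's method or experiments. The environment (`chain_step`, a four-state chain whose only reward sits at the final state) and all hyperparameters are hypothetical, chosen only so that the reward is spatially sparse in the sense the abstract mentions.

```python
import random

GOAL_STATE = 3  # hypothetical chain: states 0..3, reward only at the goal


def chain_step(state, action):
    """Move left (action 0) or right (action 1); reward 1.0 only on reaching the goal."""
    next_state = min(state + 1, GOAL_STATE) if action == 1 else max(state - 1, 0)
    done = next_state == GOAL_STATE
    return next_state, (1.0 if done else 0.0), done


def q_learning(n_states, n_actions, step, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                action = max(range(n_actions), key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: no future value after a terminal transition.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning(n_states=4, n_actions=2, step=chain_step)
greedy_policy = [max(range(2), key=lambda a: q_table[s][a])
                 for s in range(GOAL_STATE)]
```

Deep Q-learning replaces the table `q` with a neural network fitted to the same bootstrapped targets; the abstract's claim concerns how well that network approximates sparse, piecewise-constant reward structure.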
Random Neural Networks and Optimisation
In this thesis we introduce new models and learning algorithms for the Random
Neural Network (RNN), and we develop RNN-based and other approaches for the
solution of emergency management optimisation problems.
With respect to RNN developments, two novel supervised learning algorithms are
proposed. The first is a gradient descent algorithm for an RNN extension model
that we have introduced, the RNN with synchronised interactions (RNNSI), which
was inspired by the synchronised firing activity observed in brain neural circuits.
The second algorithm is based on modelling the signal-flow equations in the RNN as a
nonnegative least squares (NNLS) problem. NNLS is solved using a limited-memory
quasi-Newton algorithm specifically designed for the RNN case.
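As a rough illustration of what a nonnegative least squares problem looks like, the sketch below minimises 0.5 * ||Ax - b||^2 subject to x >= 0 using projected gradient descent. This is a deliberately simpler technique than the limited-memory quasi-Newton algorithm the thesis describes, and the matrix `a_matrix` and vector `b_vector` are made-up numbers, not the RNN signal-flow equations.

```python
def nnls_projected_gradient(a, b, iterations=500):
    """Minimise 0.5 * ||a x - b||^2 subject to x >= 0 (projected gradient descent)."""
    m, n = len(a), len(a[0])
    # The squared Frobenius norm upper-bounds the largest eigenvalue of a^T a,
    # so 1 / ||a||_F^2 is a safe (if conservative) step size.
    step = 1.0 / sum(v * v for row in a for v in row)
    x = [0.0] * n
    for _ in range(iterations):
        residual = [sum(a[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        gradient = [sum(a[i][j] * residual[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by projection onto the nonnegative orthant.
        x = [max(0.0, x[j] - step * gradient[j]) for j in range(n)]
    return x


# Hypothetical data: the unconstrained least squares solution is (1, -1),
# so the nonnegativity constraint is active on the second coordinate.
a_matrix = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b_vector = [1.0, -1.0, 0.0]
solution = nnls_projected_gradient(a_matrix, b_vector)
```

Here the constrained minimiser pins the second coordinate at zero and re-fits the first, which is the behaviour that distinguishes NNLS from simply clipping an ordinary least squares solution.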
Regarding the investigation of emergency management optimisation problems,
we examine combinatorial assignment problems that require fast, distributed and
close-to-optimal solutions under information uncertainty. We consider three different
problems with these characteristics: the assignment of
emergency units to incidents with injured civilians (AEUI), the assignment of assets
to tasks under execution uncertainty (ATAU), and the deployment of a robotic
network to establish communication with trapped civilians (DRNCTC).
AEUI is solved by training an RNN tool with instances of the optimisation problem
and then using the trained RNN for decision making; training is achieved using
the developed learning algorithms. For the solution of the ATAU problem, we introduce
two different approaches. The first is based on mapping parameters of the
optimisation problem to RNN parameters, and the second on solving a sequence of
minimum cost flow problems on appropriately constructed networks with estimated
arc costs. For the exact solution of the DRNCTC problem, we develop a mixed-integer
linear programming formulation, which is based on network flows. Finally, we design
and implement distributed heuristic algorithms for the deployment of robots
when the civilian locations are known or uncertain.
"Sticky Hands": learning and generalization for cooperative physical interactions with a humanoid robot
"Sticky Hands" is a physical game for two people involving gentle contact with the hands. The aim is to develop relaxed and elegant motion together, achieve physical sensitivity-improving reactions, and experience an interaction at an intimate yet comfortable level for spiritual development and physical relaxation. We developed a control system for a humanoid robot allowing it to play Sticky Hands with a human partner. We present a real implementation including a physical system, robot control, and a motion learning algorithm based on a generalizable intelligent system that can generalize the translation, orientation, scale and velocity of observed trajectories to new data, operate within scalable speed and storage efficiency bounds, and cope with contact trajectories that evolve over time. Our robot control is capable of physical cooperation in a force domain, using minimal sensor input. We analyze robot-human interaction and relate characteristics of our motion learning algorithm to recorded motion profiles. We discuss our results in the context of realistic motion generation and present a theoretical discussion of stylistic and affective motion generation based on, and motivating, cross-disciplinary research in computer graphics, human motion production and motion perception.
A review on model-based and model-free approaches to control soft actuators and their potentials in colonoscopy
Colorectal cancer (CRC) is the third most common cancer worldwide and is responsible for approximately 1 million deaths annually. Early screening is essential to increase the chances of survival, and it can also reduce the cost of treatments for healthcare centres. Colonoscopy is the gold standard for CRC screening and treatment, but it has several drawbacks, including difficulty in manoeuvring the device, patient discomfort, and high cost. Soft endorobots, small and compliant devices that can reduce the force exerted on the colonic wall, offer a potential solution to these issues. However, controlling these soft robots is challenging due to their deformable materials and the limitations of mathematical models. In this Review, we discuss model-free and model-based approaches for controlling soft robots that can potentially be applied to endorobots for colonoscopy. We highlight the importance of selecting appropriate control methods based on various parameters, such as sensor and actuator solutions. This review aims to contribute to the development of smart control strategies for soft endorobots that can enhance the effectiveness and safety of robotics in colonoscopy. These strategies can be defined based on the available information about the robot and surrounding environment, control demands, mechanical design impact and characterization data based on calibration.