Trajectory Planning and Subject-Specific Control of a Stroke Rehabilitation Robot using Deep Reinforcement Learning

Abstract

There are approximately 13 million new stroke cases worldwide each year. Research has shown that robotics can provide practical and efficient solutions for expediting post-stroke recovery. Assistive robots automate limb training, which saves therapists considerable time and effort, and they facilitate the use of data-acquisition devices whose measurements enable quantitative evaluation of patient progress. This research focused on trajectory planning and subject-specific control of an upper-extremity post-stroke rehabilitation robot. To find the optimal rehabilitation practice, the manipulation trajectory was designed by an optimization-based planner, and a linear quadratic regulator (LQR) controller was then applied to stabilize the trajectory. The integrated planner-controller framework was tested in simulation, and a subsequent hardware implementation showed good agreement with the simulation results. One of the challenges of rehabilitation robotics is the choice of the low-level controller. To find the best candidate for our specific setup, five controllers were evaluated in simulation for circular trajectory tracking; in particular, LQR, sliding mode control (SMC), and nonlinear model predictive control (NMPC) were compared to conventional proportional-integral-derivative (PID) and computed-torque PID controllers. These controllers were then assessed in real time on the physical hardware in point-stabilization and circular trajectory-tracking scenarios. The comparative study confirmed the need for advanced low-level controllers to achieve better performance. Because of the computational cost of NMPC's online optimization and the delay introduced by its implementation, NMPC performed worse than the other advanced controllers; the evaluation showed that SMC and LQR were the two best candidates for the robot. To remove the need for extensive manual controller tuning, a deep reinforcement learning (DRL) tuner framework was designed in MATLAB to provide optimal controller weights; it permitted online tuning of the weights, which enabled subject-specific weight adjustment. The tuner was tested in simulation by adding random noise to the input at each iteration to simulate the subject. Compared to fixed, manually tuned weights, the DRL-tuned controller achieved lower position error. In addition, an easy-to-implement high-level force-control algorithm was designed that incorporates the subject's force data. The resulting hybrid position/force controller was tested with a healthy subject in the loop and was able to provide assistance as needed when the subject's position error increased. Future research might consider model-reduction methods to expedite the NMPC optimization, application of DRL to other controllers and to optimization-parameter adjustment, other high-level controllers such as admittance control, and testing of the final controllers with post-stroke patients.
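
To make the LQR step concrete, the minimal MATLAB sketch below computes a state-feedback gain for a hypothetical linearized two-joint arm model; the matrices and weights are illustrative placeholders, not the robot's identified dynamics.

```matlab
% Minimal LQR sketch (Control System Toolbox). The double-integrator model
% below is a hypothetical linearization of a two-joint arm, NOT the
% thesis's identified dynamics; Q and R are illustrative weights.
A = [zeros(2), eye(2); zeros(2), zeros(2)];  % states: [q1 q2 dq1 dq2]
B = [zeros(2); eye(2)];                      % inputs enter as joint accelerations
Q = diag([100, 100, 1, 1]);                  % weight position error over velocity
R = 0.1 * eye(2);                            % control-effort weight
K = lqr(A, B, Q, R);                         % optimal state-feedback gain
% Feedback law u = -K*(x - x_ref(t)) stabilizes the robot about the
% optimization-based reference trajectory x_ref.
```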
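Similarly, the abstract does not detail the SMC design, so the following single-joint control law is only a standard boundary-layer sliding mode sketch under assumed model estimates; every gain and model term here is a placeholder.

```matlab
function u = smcJoint(q, dq, q_ref, dq_ref, ddq_ref, M_hat, C_hat, g_hat, lambda, Ks, phi)
% Hypothetical single-joint sliding mode law (boundary-layer form to limit
% chattering); M_hat, C_hat, g_hat are assumed model estimates, and lambda,
% Ks, phi are illustrative gains -- not the thesis's design.
e    = q  - q_ref;                                      % position error
de   = dq - dq_ref;                                     % velocity error
s    = de + lambda * e;                                 % sliding surface
u_eq = M_hat*(ddq_ref - lambda*de) + C_hat*dq + g_hat;  % equivalent control
u    = u_eq - Ks * min(max(s/phi, -1), 1);              % saturated switching term
end
```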
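The abstract names MATLAB as the DRL tuner environment but not the agent or reward, so the sketch below shows one plausible setup using the Reinforcement Learning Toolbox: a DDPG agent whose actions are the diagonal controller weights and whose reward penalizes tracking error. The helper functions tunerStep and tunerReset are hypothetical.

```matlab
% Hedged DRL-tuner sketch (Reinforcement Learning Toolbox). The agent type,
% observation/action definitions, and the helpers tunerStep/tunerReset are
% assumptions for illustration, not the thesis implementation.
obsInfo = rlNumericSpec([3 1]);           % e.g., [RMS error; mean |u|; overshoot]
actInfo = rlNumericSpec([2 1], ...
    'LowerLimit', [1; 0.01], ...          % bounds on the [q; r] diagonal weights
    'UpperLimit', [1e4; 10]);
% tunerStep would run one simulated trial with random input noise standing
% in for the subject, apply the weights chosen by the agent, and return the
% negative accumulated position error as the reward.
env   = rlFunctionEnv(obsInfo, actInfo, @tunerStep, @tunerReset);
agent = rlDDPGAgent(obsInfo, actInfo);    % default actor/critic networks
opts  = rlTrainingOptions('MaxEpisodes', 500, 'ScoreAveragingWindowLength', 20);
stats = train(agent, env, opts);          % learn subject-specific weights
```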
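Finally, the high-level assist-as-needed behavior is described only qualitatively, so the function below is a hypothetical sketch of one easy-to-implement blend consistent with that description: assistance grows with position error and shrinks when the measured subject force indicates active participation. All names and thresholds are illustrative.

```matlab
function u = assistAsNeeded(u_pos, e_norm, f_subj, e_db, f_ref)
% Hypothetical assist-as-needed blend (illustrative, not the thesis rule).
%   u_pos  - low-level position-controller command (e.g., from LQR or SMC)
%   e_norm - current position-error norm
%   f_subj - magnitude of the measured subject interaction force
%   e_db   - error deadband below which no assistance is given
%   f_ref  - force level treated as full subject participation
alpha = min(max((e_norm - e_db)/e_db, 0), 1);  % more assistance as error grows
beta  = min(f_subj/f_ref, 1);                  % less assistance as subject contributes
u     = alpha * (1 - beta) * u_pos;            % scaled assistive command
end
```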
