A Nonlinear PID-Enhanced Adaptive Latent Factor Analysis Model
High-dimensional and incomplete (HDI) data hold rich interactive
information in various industrial applications. A latent factor (LF) model is
remarkably effective at extracting valuable information from HDI data with a
stochastic gradient descent (SGD) algorithm. However, an SGD-based latent
factor analysis (LFA) model suffers from slow convergence because it considers
only the current learning error. To address this critical issue, this paper
proposes a Nonlinear PID-enhanced Adaptive Latent Factor (NPALF) model with
two-fold ideas: a) rebuilding the learning error to incorporate past learning
errors, following the principle of a nonlinear PID controller; and b) adapting
all parameters effectively, following the principle of a particle swarm
optimization (PSO) algorithm. Experimental results on four representative HDI
datasets indicate that, compared with five state-of-the-art LFA models, the
NPALF model achieves a better convergence rate and higher prediction accuracy
for the missing data of an HDI dataset.
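The core idea of rebuilding the learning error with a PID controller can be sketched as below. This is a minimal illustration, not the authors' actual NPALF update rule: the gains `kp`, `ki`, `kd`, the learning rate, and the toy rank-1 factorization are all illustrative assumptions (NPALF uses a nonlinear PID rule and adapts its parameters via PSO).

```python
def pid_error(err, err_sum, err_prev, kp=1.0, ki=0.01, kd=0.05):
    """Discrete PID rule on the learning error (gains are illustrative)."""
    err_sum = err_sum + err                                # I: accumulate past errors
    out = kp * err + ki * err_sum + kd * (err - err_prev)  # P + I + D terms
    return out, err_sum

# Toy rank-1 factorization of a single observed rating r = 4.0:
# plain SGD would use the raw error; here each step uses the
# PID-adjusted error, which folds in the past errors as well.
r, p, q = 4.0, 0.1, 0.1
lr, err_sum, err_prev = 0.01, 0.0, 0.0
for _ in range(50):
    err = r - p * q                               # current learning error
    adj, err_sum = pid_error(err, err_sum, err_prev)
    p, q = p + lr * adj * q, q + lr * adj * p     # SGD step with PID-adjusted error
    err_prev = err
```

Because the integral and derivative terms inflate the effective error early in training, each step moves the factors further than plain SGD would, which is the mechanism behind the faster convergence claimed above.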
Comparative Evaluation for Effectiveness Analysis of Policy Based Deep Reinforcement Learning Approaches
Deep Reinforcement Learning (DRL) has proven to be a very powerful technique, with results in a wide range of applications in recent years. In particular, the achievements of robotics studies suggest that much more progress will be made in this field. Undoubtedly, policy choices and parameter settings play an active role in the success of DRL. In this study, the policies used in recent DRL studies are examined and analyzed. The policies in the literature are grouped under three headings: value-based, policy-based, and actor-critic. In addition, the problem of collaborative agents moving a common target according to Newton's laws of motion is presented. Training is carried out in a frictionless environment with two agents and one object, using four different policies. The agents apply force to the object by colliding with it and try to push it out of the area it occupies. A two-dimensional surface is used during the training phase. After training, the success of each policy is reported and observed separately. Test results are discussed in Section 5. Thus, the policies used in deep reinforcement learning approaches are both described and tested together in an application.
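To make the policy-based heading concrete, the sketch below shows a minimal REINFORCE-style policy-gradient update. It is not taken from the study's agent environment: the two-armed bandit task, the learning rate, and the iteration count are illustrative assumptions, chosen so the example stays self-contained.

```python
import numpy as np

# Policy-based (REINFORCE) sketch on a two-armed bandit:
# arm 1 always pays 1.0, arm 0 pays 0.0, so a softmax policy
# over the logits `theta` should learn to prefer arm 1.
rng = np.random.default_rng(0)
theta = np.zeros(2)                      # policy parameters (logits)
lr = 0.5                                 # illustrative learning rate

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

for _ in range(200):
    probs = softmax(theta)
    a = rng.choice(2, p=probs)           # sample an action from the policy
    reward = float(a == 1)               # arm 1 is the optimal action
    grad = -probs                        # gradient of log pi(a | theta)
    grad[a] += 1.0
    theta += lr * reward * grad          # ascend the reward-weighted log-prob

print(softmax(theta))                    # probability mass shifts toward arm 1
```

A value-based method would instead learn per-action value estimates and act greedily on them, while an actor-critic combines both: the actor holds a policy like the one above, and a learned critic replaces the raw `reward` signal in the update.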