
    Hybrid Algorithm Selection and Hyperparameter Tuning on Distributed Machine Learning Resources: A Hierarchical Agent-based Approach

    Algorithm selection and hyperparameter tuning are critical steps in both academic and applied machine learning. At the same time, these steps are becoming increasingly delicate due to the rapid growth in the number, diversity, and distributedness of machine learning resources. Multi-agent systems, when applied to the design of machine learning platforms, bring several distinctive characteristics, such as scalability, flexibility, and robustness. This paper proposes a fully automatic and collaborative agent-based mechanism for selecting distributedly organized machine learning algorithms and simultaneously tuning their hyperparameters. Our method builds upon an existing agent-based hierarchical machine-learning platform and augments its query structure to support the aforementioned functionalities without being limited to specific learning, selection, and tuning mechanisms. We have conducted theoretical assessments, formal verification, and analytical study to demonstrate the correctness, resource utilization, and computational efficiency of our technique. According to the results, our solution is totally correct and exhibits linear time and space complexity in relation to the size of available resources. To provide concrete examples of how the proposed methodologies can effectively adapt and perform across a range of algorithmic options and datasets, we have also conducted a series of experiments using a system comprising 24 algorithms and 9 datasets.

    Reinforcement learning in continuous state- and action-space

    Reinforcement learning in a continuous state-space poses the problem that the values of all state-action pairs cannot be stored in a lookup table, due to both storage limitations and the inability to visit all states sufficiently often to learn the correct values. This can be overcome with the use of function approximation techniques with generalisation capability, such as artificial neural networks, to store the value function. When this is applied we can select the optimal action by comparing the values of each possible action; however, when the action-space is continuous this is not possible. In this thesis we investigate methods to select the optimal action when artificial neural networks are used to approximate the value function, through the application of numerical optimization techniques. Although it has been stated in the literature that gradient-ascent methods can be applied to the action selection [47], it is also stated that solving this problem would be infeasible, and it is therefore claimed that it is necessary to utilise a second artificial neural network to approximate the policy function [21, 55]. The major contributions of this thesis include the investigation of the applicability of action selection by numerical optimization methods, including gradient-ascent along with other derivative-based and derivative-free numerical optimization methods, and the proposal of two novel algorithms which are based on the application of two alternative action selection methods: NM-SARSA [40] and NelderMead-SARSA. We empirically compare the proposed methods to state-of-the-art methods from the literature on three continuous state- and action-space control benchmark problems: minimum-time full swing-up of the Acrobot; the Cart-Pole balancing problem; and a double pole variant. We also present novel results from the application of the existing direct policy search method genetic programming to the Acrobot benchmark problem [12, 14].
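
The idea of selecting a continuous action by numerically optimizing the value function can be illustrated with a minimal sketch. It is not the thesis's algorithm: `toy_q` is a hypothetical quadratic standing in for a trained neural network, the gradient is estimated by finite differences, and the function name, step size, and action bounds are all assumptions made for illustration.

```python
from typing import Callable


def select_action(q: Callable[[float, float], float], state: float,
                  a_min: float = -1.0, a_max: float = 1.0,
                  a_init: float = 0.0, step: float = 0.1,
                  iters: int = 200, eps: float = 1e-4) -> float:
    """Gradient ascent on a -> q(state, a), clipped to [a_min, a_max]."""
    a = a_init
    for _ in range(iters):
        # Central finite-difference estimate of dQ/da (no autodiff assumed).
        grad = (q(state, a + eps) - q(state, a - eps)) / (2 * eps)
        # Move uphill, then project back into the admissible action range.
        a = min(a_max, max(a_min, a + step * grad))
    return a


def toy_q(state: float, action: float) -> float:
    # Hypothetical stand-in for a learned value function:
    # a single peak at action = 0.5 * state.
    return -(action - 0.5 * state) ** 2


best = select_action(toy_q, state=1.0)
print(best)
```

With a real value network the surface may have several local maxima, so a single gradient-ascent run from one starting point only finds a nearby peak; derivative-free methods or multiple restarts are possible alternatives.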

    Intelligent methods for complex systems control engineering

    This thesis proposes an intelligent multiple-controller framework for complex systems that incorporates a fuzzy logic based switching and tuning supervisor along with a neural network based generalized learning model (GLM). The framework is designed for adaptive control of both Single-Input Single-Output (SISO) and Multi-Input Multi-Output (MIMO) complex systems. The proposed methodology provides the designer with an automated choice between a conventional Proportional-Integral-Derivative (PID) controller and a PID structure based (simultaneous) Pole and Zero Placement controller. The switching decisions between the two nonlinear fixed structure controllers are made on the basis of the required performance measure using the fuzzy logic based supervisor operating at the highest level of the system. The fuzzy supervisor is also employed to tune the parameters of the multiple-controller online in order to achieve the desired system performance. The GLM for modelling complex systems assumes that the plant is represented by an equivalent model consisting of a linear time-varying sub-model plus a learning nonlinear sub-model based on a Radial Basis Function (RBF) neural network. The proposed control design brings together the dominant advantages of PID controllers (such as simplicity in structure and implementation) and the desirable attributes of Pole and Zero Placement controllers (such as stable set-point tracking and ease of parameter tuning). Simulation experiments using real-world nonlinear SISO and MIMO plant models, including realistic nonlinear vehicle models, demonstrate the effectiveness of the intelligent multiple-controller in tracking set-point changes, achieving the desired speed of response, preventing system output overshoot, and maintaining minimum variance input and output signals, whilst penalising excessive control actions.

    GA-PSO-Optimized Neural-Based Control Scheme for Adaptive Congestion Control to Improve Performance in Multimedia Applications

    Active queue control aims to improve the overall communication network throughput while providing low delay and a small packet loss rate. The basic idea is to actively trigger packet dropping (or marking provided by explicit congestion notification (ECN)) before buffer overflow. In this paper, two artificial neural network (ANN)-based control schemes are proposed for adaptive queue control in TCP communication networks. The structure of these controllers is optimized using a genetic algorithm (GA) and the output weights of the ANNs are optimized using a particle swarm optimization (PSO) algorithm. The controllers are radial basis function (RBF)-based, but to improve the robustness of the RBF controller, an error-integral term is added to the RBF equation in the second scheme. Experimental results show that the GA-PSO-optimized improved RBF (I-RBF) model controls network congestion effectively in terms of link utilization with a low packet loss rate and outperforms Drop Tail, proportional-integral (PI), random exponential marking (REM), and adaptive random early detection (ARED) controllers.
    Comment: arXiv admin note: text overlap with arXiv:1711.0635

    DECISION SUPPORT IN CAR LEASING: A FORECASTING MODEL FOR RESIDUAL VALUE ESTIMATION

    The paper proposes a methodology to support pricing decisions in the car leasing industry. In particular, the price is given by the monthly fee to be paid by the lessee as compensation for using a car over some contract horizon. After contract expiration, lessors are obliged to take back the vehicle, which will then be sold in the used car market. Therefore, lessors require an accurate estimate of cars’ residual values to manage the risk inherent to their business and determine profitable prices. We explore the organizational and technical requirements associated with this forecasting task and develop a prediction model that complies with identified application constraints. The model is rigorously tested within an empirical study and compared to established benchmarks. The results obtained in several experiments provide strong evidence for the proposed model being effective in generating accurate predictions of cars’ residual values and efficient in requiring little user intervention

    Adaptive and learning-based formation control of swarm robots

    Autonomous aerial and wheeled mobile robots play a major role in tasks such as search and rescue, transportation, monitoring, and inspection. However, these operations face several open challenges, including robust autonomy and adaptive coordination based on the environment and operating conditions, particularly in swarm robots with limited communication and perception capabilities. Furthermore, the computational complexity increases exponentially with the number of robots in the swarm. This thesis examines two different aspects of the formation control problem. On the one hand, we investigate how formation could be performed by swarm robots with limited communication and perception (e.g., Crazyflie nano quadrotor). On the other hand, we explore human-swarm interaction (HSI) and different shared-control mechanisms between human and swarm robots (e.g., BristleBot) for artistic creation. In particular, we combine bio-inspired (i.e., flocking, foraging) techniques with learning-based control strategies (using artificial neural networks) for adaptive control of multi-robot systems. We first review how learning-based control and networked dynamical systems can be used to assign distributed and decentralized policies to individual robots such that the desired formation emerges from their collective behavior. We proceed by presenting a novel flocking control for UAV swarms using deep reinforcement learning. We formulate the flocking formation problem as a partially observable Markov decision process (POMDP), and consider a leader-follower configuration, where consensus among all UAVs is used to train a shared control policy, and each UAV performs actions based on the local information it collects. In addition, to avoid collision among UAVs and guarantee flocking and navigation, the reward function combines a global flocking maintenance term, a mutual reward, and a collision penalty. We adapt deep deterministic policy gradient (DDPG) with centralized training and decentralized execution to obtain the flocking control policy using actor-critic networks and a global state space matrix. In the context of swarm robotics in arts, we investigate how the formation paradigm can serve as an interaction modality for artists to aesthetically utilize swarms. In particular, we explore particle swarm optimization (PSO) and random walk to control the communication between a team of robots with swarming behavior for musical creation.

    Constrained Discrete Phase Control of a Heaving Wave Energy Converter in Irregular Seas Using Reinforcement Learning

    Designed for offshore deployment in irregular seas, the point absorber wave energy conversion (WEC) system is a promising candidate among the currently available WEC technologies. The effectiveness of phase control when applied to a heaving point absorber through a hydraulic power take-off (PTO) system is systematically investigated in both regular and irregular waves. For this purpose, two phase control accumulators are utilized in the hydraulic PTO system. Simulations are performed in MATLAB® using the Cummins equation to model the dynamics of the heaving point absorber in the time domain. For a given sea state, the opening instant of the control valves of the phase control accumulators relative to the wave excitation peak and the volumetric displacement of the hydraulic motor are utilized as parameters in a number of simulation runs. In regular waves, the parametric investigation demonstrates that in most cases there is a trade-off between maximizing the mean generated power and minimizing the maximum motion amplitude. In fully developed irregular seas, a parametric investigation of different sea states in the North Atlantic demonstrates that by utilizing phase control a significant increase in the power absorption efficiency can be obtained compared to the WEC system operation without phase control. The problem of providing an effective phase-control strategy that maximizes the mean generated power of the WEC system subject to motion amplitude constraints is formulated and solved using a Reinforcement Learning (RL) approach based on the Q-learning algorithm. The RL-based controller chooses actions that determine the opening instant of the phase control accumulator valves and the volumetric displacement of the hydraulic motor. As demonstrated in both regular and irregular waves, the RL-based controller is successful in finding the optimal phase-control strategy. Finally, the prediction of the wave excitation force is performed using a Radial Basis Function (RBF) network ensemble in order to evaluate the impact of the prediction accuracy on the RL-controller's performance. The results show that the computed mean generated power and maximum motion amplitude values using the RBF network ensemble predictions compare very well with the corresponding values computed assuming perfect knowledge of the future wave excitation.
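
For readers unfamiliar with Q-learning, the standard tabular update that such a controller builds on can be sketched as follows. This is a generic illustration, not the paper's controller: the state label, the discretised action pairs (valve-opening delay, motor displacement level), the reward value, and the learning parameters are all hypothetical, and the handling of motion amplitude constraints is not shown.

```python
import random
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def epsilon_greedy(q, state, actions, epsilon=0.1, rng=random):
    """Explore with probability epsilon, otherwise pick the highest-valued action."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical discretisation: (valve-opening delay index, displacement level).
actions = [(delay, disp) for delay in (0, 1, 2) for disp in ("low", "high")]
q = defaultdict(float)  # unseen state-action pairs start at 0.0

# One illustrative transition with a made-up reward.
new_value = q_update(q, "calm", (1, "high"), reward=2.0,
                     next_state="calm", actions=actions)
greedy = epsilon_greedy(q, "calm", actions, epsilon=0.0)
print(new_value, greedy)
```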