4 research outputs found

    Incorporating Memory and Learning Mechanisms Into Meta-RaPS

    Due to the rapidly increasing dimensionality and complexity of real-life problems, it has become more difficult to find optimal solutions using exact mathematical methods alone. Finding near-optimal solutions in an acceptable amount of time is the central challenge when developing more sophisticated approaches. One answer to this challenge is the implementation of metaheuristic approaches; a more powerful answer, however, may be reached by incorporating intelligence into metaheuristics. Meta-RaPS (Metaheuristic for Randomized Priority Search) is a metaheuristic that creates high-quality solutions for discrete optimization problems. It is proposed that incorporating memory and learning mechanisms into Meta-RaPS, which is currently classified as a memoryless metaheuristic, can help the algorithm produce higher-quality results. The proposed Meta-RaPS versions were created by taking different perspectives on learning. The first approach uses Estimation of Distribution Algorithms (EDA), a stochastic learning technique that builds a probability distribution for each decision variable and samples it to generate new solutions. The second version employs a machine learning algorithm, Q-Learning, which has been successfully applied to optimization problems whose output is a sequence of actions. In the third version, Path Relinking (PR) is implemented as a post-optimization method in which the algorithm learns good attributes by memorizing the best solutions and follows them to reach better ones. The fourth version presents another form of learning through its ability to adaptively tune its parameters. The efficiency of these approaches motivated a redesign of Meta-RaPS that removes the improvement phase and adds a more sophisticated Path Relinking method; the new Meta-RaPS solves even the largest problems in much less time while maintaining solution quality. To evaluate their performance, all introduced versions were tested on the 0-1 Multidimensional Knapsack Problem (MKP). Among the proposed algorithms, Meta-RaPS PR and Meta-RaPS Q-Learning showed the best and worst performance, respectively; nevertheless, all versions outperformed other approaches to the 0-1 MKP reported in the literature.
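    As a concrete illustration of the EDA idea described above (a probability distribution per decision variable, sampled to generate new solutions), the following minimal PBIL-style sketch applies it to a toy 0-1 MKP instance. The instance data, parameter values, and greedy repair step are illustrative assumptions, not the dissertation's actual Meta-RaPS EDA.

    import random

    # Hypothetical 0-1 MKP instance: maximize total value subject to
    # one weight constraint per knapsack row.
    values = [10, 7, 12, 8]                 # profit of each item
    weights = [[2, 3, 4, 1], [3, 1, 2, 4]]  # one row per constraint
    capacities = [6, 7]

    def feasible(x):
        return all(sum(w * q for w, q in zip(row, x)) <= c
                   for row, c in zip(weights, capacities))

    def repair(x):
        # Drop the lowest-value packed item until all constraints hold.
        while not feasible(x):
            packed = [j for j, q in enumerate(x) if q]
            x[min(packed, key=lambda j: values[j])] = 0
        return x

    p = [0.5] * len(values)        # P(item j is packed)
    alpha, pop_size = 0.2, 30      # learning rate, samples per iteration

    for _ in range(100):
        pop = [repair([int(random.random() < pj) for pj in p])
               for _ in range(pop_size)]
        best = max(pop, key=lambda x: sum(v * q for v, q in zip(values, x)))
        # Shift each marginal toward the best sample's choice for that item.
        p = [(1 - alpha) * pj + alpha * q for pj, q in zip(p, best)]

    print(best, sum(v * q for v, q in zip(values, best)))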

    Reinforcement Learning For The Control Of Large-Scale Power Systems


    Towards Fast and High-quality Biomedical Image Reconstruction

    Department of Computer Science and Engineering
    Reconstruction is an important module in the image analysis pipeline, whose purpose is to isolate the meaningful information hidden inside the acquired data. The term "reconstruction" can be understood and subdivided into several specific tasks in different modalities. In biomedical imaging, such as Computed Tomography (CT) and Magnetic Resonance Imaging (MRI), the term stands for the transformation from the spectral domains, possibly fully or under-sampled (the sinogram for CT and k-space for MRI), to the visible image domain. In connectomics, it usually refers to segmentation (reconstructing the semantic contacts between neuronal connections) or denoising (reconstructing the clean image). This dissertation describes a set of contributed algorithms, ranging from conventional methods to state-of-the-art deep learning, with a transition at data-driven dictionary learning approaches, that tackle reconstruction problems in various image analysis tasks.
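    As a concrete illustration of the k-space-to-image transformation mentioned above, the following minimal NumPy sketch reconstructs a toy phantom from fully sampled k-space and simulates random undersampling with a binary mask. The phantom, sampling rate, and mask are illustrative assumptions, not the dissertation's data or method.

    import numpy as np

    image = np.zeros((64, 64))
    image[24:40, 24:40] = 1.0               # toy square "phantom"

    # Forward acquisition model: fully sampled k-space of the phantom.
    kspace = np.fft.fftshift(np.fft.fft2(image))

    # Simulated random 30% undersampling of k-space measurements.
    mask = np.random.rand(*kspace.shape) < 0.3
    undersampled = kspace * mask

    # Zero-filled reconstruction: the missing samples produce aliasing
    # artifacts, which learned methods (e.g. dictionary- or CNN-based)
    # are trained to suppress.
    recon = np.abs(np.fft.ifft2(np.fft.ifftshift(undersampled)))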

    Discretization and Approximation Methods for Reinforcement Learning of Highly Reconfigurable Systems

    There are a number of techniques used to solve reinforcement learning problems, but very few have been developed for, and tested on, highly reconfigurable systems cast as reinforcement learning problems. A reconfigurable system is a vehicle (air, ground, or water) or a collection of vehicles that can change its geometrical features, i.e., shape or formation, to perform tasks the vehicle could not otherwise accomplish. These systems tend to be optimized for several operating conditions, and controllers are then designed to reconfigure the system from one operating condition to another. Q-learning, an unsupervised episodic learning technique that solves the reinforcement learning problem, is an attractive control methodology for reconfigurable systems. It has been successfully applied to a myriad of control problems, and a number of variations have been developed to avoid or alleviate limitations in earlier versions of the approach. This dissertation describes the development of three modular enhancements to the Q-learning algorithm that address some of the unique problems arising in this class of systems, such as the complex interaction of reconfigurable parameters and computationally intensive system models. A multi-resolution state-space discretization method adaptively rediscretizes the state space with progressively finer grids around one or more distinct Regions Of Interest within the state or learning space. A genetic algorithm that autonomously selects the basis functions used to approximate the action-value function is applied periodically throughout the learning process. Policy comparison monitors the policy encoded in the action-value function to prevent unnecessary episodes at each level of discretization. The approach is validated on several problems, including an inverted pendulum, a reconfigurable airfoil, and a reconfigurable wing. Results show that the multi-resolution discretization method reduces the number of state-action pairs required to achieve a specific goal, often by an order of magnitude, and that policy comparison prevents unnecessary episodes once the policy has converged to a usable one. Results also show that the genetic algorithm is a promising candidate for selecting basis functions for function approximation of the action-value function.
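    To make the underlying update concrete, the sketch below shows tabular Q-learning on a toy one-dimensional task with a single uniform grid discretization, the building block that the multi-resolution discretization, basis-function selection, and policy-comparison enhancements above extend. The environment, grid resolution, and parameter values are illustrative assumptions, not the dissertation's reconfigurable-system models.

    import random
    from collections import defaultdict

    GRID = 10                      # cells per dimension (one resolution level)
    ACTIONS = [-1.0, +1.0]
    alpha, gamma, eps = 0.1, 0.95, 0.1

    def discretize(s, lo=-1.0, hi=1.0):
        # Map a continuous state in [lo, hi] to a grid-cell index.
        cell = int((s - lo) / (hi - lo) * GRID)
        return min(max(cell, 0), GRID - 1)

    def step(s, a):
        # Toy dynamics: the agent must drive the state toward s = 0.
        s2 = max(-1.0, min(1.0, s + 0.1 * a))
        return s2, (1.0 if abs(s2) < 0.05 else -0.01), abs(s2) < 0.05

    Q = defaultdict(float)         # Q[(cell, action_index)]

    for _ in range(500):           # episodes
        s = random.uniform(-1.0, 1.0)
        for _ in range(200):       # steps per episode
            c = discretize(s)
            # Epsilon-greedy action selection over the discretized state.
            ai = (random.randrange(len(ACTIONS)) if random.random() < eps
                  else max(range(len(ACTIONS)), key=lambda i: Q[(c, i)]))
            s2, r, done = step(s, ACTIONS[ai])
            c2 = discretize(s2)
            # Standard Q-learning target: reward plus discounted best value.
            target = r + (0.0 if done else
                          gamma * max(Q[(c2, i)] for i in range(len(ACTIONS))))
            Q[(c, ai)] += alpha * (target - Q[(c, ai)])
            s = s2
            if done:
                break

    A finer grid around a Region Of Interest would simply replace the single GRID constant with a level-dependent resolution, which is the essence of the multi-resolution rediscretization described above.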