
    Exploratory Hybrid Search in Hierarchical Reinforcement Learning

    Doctoral dissertation (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, August 2020. Advisor: ๋ฌธ๋ณ‘๋กœ (Byung-Ro Moon).

    Balancing exploitation and exploration is a great challenge in many optimization problems. Evolutionary algorithms, such as evolution strategies and genetic algorithms, are inspired by biological evolution. They have been applied to various optimization problems, such as combinatorial and continuous optimization. However, evolutionary algorithms lack fine-tuning near local optima; in other words, they lack exploitation power. This drawback can be overcome by hybridization. Hybrid genetic algorithms, or memetic algorithms, are successful examples of hybridization: although the solution space is exponentially vast in some optimization problems, these algorithms find satisfactory solutions. In the deep learning era, the problem of balancing exploitation and exploration has been relatively neglected. In deep reinforcement learning, however, this balance is even more crucial than in supervised problems. Many real-world environments have an exponentially wide state space that agents must explore. Without sufficient exploration power, agents reveal only a small portion of the state space and end up seeking only immediate rewards. In this thesis, a hybridization method is proposed that combines gradient-based policy optimization, which has strong exploitation power, with evolutionary policy optimization, which has strong exploration power. First, gradient-based policy optimization and evolutionary policy optimization are analyzed in various environments. The results demonstrate that evolutionary policy optimization is robust to sparse rewards but weak at instant rewards, whereas gradient-based policy optimization is effective for instant rewards but weak for sparse rewards.
    This difference between the two optimizations reveals the potential of hybridization in policy optimization. Then, a hybrid search is proposed within a hierarchical reinforcement learning framework. The results demonstrate that the hybrid search finds an effective agent for complex environments with sparse rewards, thanks to its balanced exploitation and exploration.

    Table of contents:
    I. Introduction
    II. Background
        2.1 Evolutionary Computations
            2.1.1 Hybrid Genetic Algorithm
            2.1.2 Evolutionary Strategy
        2.2 Hybrid Genetic Algorithm Example: Brick Layout Problem
            2.2.1 Problem Statement
            2.2.2 Hybrid Genetic Algorithm
            2.2.3 Experimental Results
            2.2.4 Discussion
        2.3 Reinforcement Learning
            2.3.1 Policy Optimization
            2.3.2 Proximal Policy Optimization
        2.4 Neuroevolution for Reinforcement Learning
        2.5 Hierarchical Reinforcement Learning
            2.5.1 Option-based HRL
            2.5.2 Goal-based HRL
            2.5.3 Exploitation versus Exploration
    III. Understanding Features of Evolutionary Policy Optimizations
        3.1 Experimental Setup
        3.2 Feature Analysis
            3.2.1 Convolution Filter Inspection
            3.2.2 Saliency Map
        3.3 Discussion
            3.3.1 Behavioral Characteristics
            3.3.2 ES Agent without Inputs
    IV. Hybrid Search for Hierarchical Reinforcement Learning
        4.1 Method
        4.2 Experimental Setup
            4.2.1 Environment
            4.2.2 Network Architectures
            4.2.3 Training
        4.3 Results
            4.3.1 Comparison
            4.3.2 Experimental Results
            4.3.3 Behavior of Low-Level Policy
        4.4 Conclusion
    V. Conclusion
        5.1 Summary
        5.2 Future Work
    Bibliography
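The exploit/explore split described in the abstract can be illustrated with a minimal sketch: an evolutionary outer loop supplies exploration through Gaussian mutation, while finite-difference gradient ascent refines the current elite. The toy objective, hyperparameters, and alternation scheme below are illustrative assumptions, not the thesis's actual algorithm (which operates on neural-network policies in RL environments).

```python
import math
import random

random.seed(0)

def fitness(x):
    # Toy multimodal objective: a smooth bowl with sinusoidal ripples,
    # standing in for an RL return landscape (an assumption for illustration).
    return -(x - 2.0) ** 2 + 2.0 * math.sin(5.0 * x)

def grad_step(x, lr=0.01, h=1e-5):
    # Exploitation: one finite-difference gradient-ascent step.
    g = (fitness(x + h) - fitness(x - h)) / (2 * h)
    return x + lr * g

def hybrid_search(pop_size=10, generations=30, sigma=0.5, refine_steps=5):
    pop = [random.uniform(-5, 5) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[0]
        for _ in range(refine_steps):  # local refinement of the elite
            elite = grad_step(elite)
        # Exploration: rebuild the population by Gaussian mutation of the elite.
        pop = [elite] + [elite + random.gauss(0, sigma) for _ in range(pop_size - 1)]
    return max(pop, key=fitness)

best = hybrid_search()
```

Gradient steps alone would stall on the nearest ripple; mutation alone would fine-tune slowly. The alternation lets the mutant cloud hop between ripples while gradient ascent climbs whichever ripple currently holds the elite.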

    Minimum Population Search, an Application to Molecular Docking

    Computer modeling of protein-ligand interactions is one of the most important phases in a drug design process. Part of the process involves optimizing highly multi-modal objective (scoring) functions. This research presents the Minimum Population Search heuristic as an alternative for solving these unconstrained global optimization problems. To determine its effectiveness, a comparison with seven state-of-the-art search heuristics is performed. Being specifically designed for the optimization of large-scale multi-modal problems, Minimum Population Search achieves excellent results on all of the tested complexes, especially when the number of available function evaluations is strongly reduced. A first step is also made toward the design of hybrid algorithms based on the exploratory power of Minimum Population Search. Computational results show that hybridization leads to a further improvement in performance.
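The hybrid pattern described above — an exploratory global phase feeding a local refinement phase — can be sketched generically. Note this is not the actual Minimum Population Search operator (which uses difference vectors within the population's subspace) nor a real docking scoring function; the Rastrigin objective, uniform sampler, and budget are illustrative assumptions.

```python
import math
import random

random.seed(1)

def score(x):
    # Rastrigin function as a stand-in for a multimodal docking
    # scoring function (illustrative only; lower is better).
    return sum(xi ** 2 - 10 * math.cos(2 * math.pi * xi) + 10 for xi in x)

def local_refine(x, step=0.1, iters=200):
    # Exploitation: greedy descent via small Gaussian perturbations.
    best, best_s = list(x), score(x)
    for _ in range(iters):
        cand = [xi + random.gauss(0, step) for xi in best]
        s = score(cand)
        if s < best_s:
            best, best_s = cand, s
    return best, best_s

def hybrid(dim=3, samples=200):
    # Exploration: broad sampling under a tight evaluation budget, echoing
    # the paper's emphasis on few function evaluations (plain uniform
    # sampling here, not the actual MPS search operator).
    pool = [[random.uniform(-5.12, 5.12) for _ in range(dim)]
            for _ in range(samples)]
    seed_pt = min(pool, key=score)
    return local_refine(seed_pt)

x, s = hybrid()
```

The split mirrors the paper's finding: the global phase must locate a promising basin cheaply, after which even a crude local search improves the score further.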

    Current Studies and Applications of Krill Herd and Gravitational Search Algorithms in Healthcare

    Nature-Inspired Computing (NIC) is a relatively young field that seeks new methods of computing by studying how natural phenomena work, in order to solve complicated problems in many contexts. As a consequence, ground-breaking research has been conducted in a variety of domains, including artificial immune systems, neural networks, swarm intelligence, and evolutionary computing. NIC techniques are used in biology, physics, engineering, economics, and management. Meta-heuristic algorithms are successful, efficient, and resilient in real-world classification, optimization, forecasting, and clustering, as well as in engineering and science problems. Two active NIC paradigms are the Gravitational Search Algorithm (GSA) and the Krill Herd (KH) algorithm. This publication gives a global and historical review of research on using KH and GSA in medicine and healthcare. Comprehensive surveys have been conducted on other nature-inspired algorithms, but no survey of KH and GSA in the healthcare field had previously been undertaken. The present article therefore thoroughly reviews the various versions of the KH and GSA algorithms and their applications in healthcare, to assist researchers in using them in diverse domains or hybridizing them with other popular algorithms, and provides an in-depth examination of KH and GSA in terms of application, modification, and hybridization. The goal of the study is to offer a perspective on GSA and KH, particularly for academics interested in investigating the capabilities and performance of these algorithms in the healthcare and medical domains.
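To make the GSA side of the survey concrete, a minimal sketch of its core update follows: each candidate solution is an "agent" whose mass is derived from its fitness, and agents attract one another gravitationally while the gravitational constant decays over time. The sphere objective, bounds, and hyperparameters are illustrative assumptions, and positions are updated in place (a simplification; standard GSA updates synchronously and restricts forces to a shrinking Kbest set).

```python
import math
import random

random.seed(2)

def fit(x):
    # Sphere function as a stand-in objective (minimization).
    return sum(xi * xi for xi in x)

def gsa(n=20, dim=2, iters=100, g0=100.0, alpha=20.0):
    X = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]
    V = [[0.0] * dim for _ in range(n)]
    best_x, best_f = None, float("inf")
    for t in range(iters):
        f = [fit(x) for x in X]
        if min(f) < best_f:  # keep the best solution seen so far
            best_f = min(f)
            best_x = list(X[f.index(min(f))])
        best, worst = min(f), max(f)
        # Agent masses from fitness: better (lower-fit) agents are heavier.
        m = [(fi - worst) / (best - worst + 1e-12) for fi in f]
        total = sum(m) + 1e-12
        M = [mi / total for mi in m]
        G = g0 * math.exp(-alpha * t / iters)  # gravitational constant decays
        for i in range(n):
            for d in range(dim):
                a = 0.0
                for j in range(n):
                    if j == i:
                        continue
                    R = math.dist(X[i], X[j])
                    # F = G*M_i*M_j/(R+eps)*(x_j - x_i); a = F/M_i, so M_i cancels.
                    a += random.random() * G * M[j] * (X[j][d] - X[i][d]) / (R + 1e-12)
                V[i][d] = random.random() * V[i][d] + a
                X[i][d] += V[i][d]
    return best_x

best = gsa()
```

Early on, large G makes heavy (good) agents pull the swarm across the space (exploration); as G decays, steps shrink and the swarm settles around the best region (exploitation).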

    Hybrid Advanced Optimization Methods with Evolutionary Computation Techniques in Energy Forecasting

    More accurate and precise energy demand forecasts are required when energy decisions are made in a competitive environment. Particularly in the Big Data era, forecasting models are based on complex combinations of functions, and energy data are complicated, exhibiting seasonality, cyclicity, fluctuation, dynamic nonlinearity, and so on. Models that cannot capture these data characteristics and patterns lead to an over-reliance on informal judgment and to higher expenses. Hybridizing optimization methods with superior evolutionary algorithms can provide important improvements via good parameter determination in the optimization process, which is of great assistance to energy decision-makers. This book aimed to attract researchers with an interest in the research areas described above. Specifically, it sought contributions on hybrids of optimization methods (e.g., quadratic programming techniques, chaotic mapping, fuzzy inference theory, quantum computing, etc.) with advanced algorithms (e.g., genetic algorithms, ant colony optimization, particle swarm optimization, etc.) whose capabilities exceed those of traditional optimization approaches, overcoming some of their embedded drawbacks, and on the application of these advanced hybrid approaches to significantly improve forecasting accuracy.
    • โ€ฆ
    corecore