4,696 research outputs found

    Reinforcement Learning for Automatic Test Case Prioritization and Selection in Continuous Integration

    Testing in Continuous Integration (CI) involves test case prioritization, selection, and execution at each cycle. Selecting the most promising test cases to detect bugs is hard if the impact of committed code changes is uncertain or if traceability links between code and tests are not available. This paper introduces Retecs, a new method for automatically learning test case selection and prioritization in CI, with the goal of minimizing the round-trip time between code commits and developer feedback on failed test cases. The Retecs method uses reinforcement learning to select and prioritize test cases according to their duration, time of last execution, and failure history. In a constantly changing environment, where new test cases are created and obsolete test cases are deleted, Retecs learns to prioritize error-prone test cases higher, guided by a reward function and by observing previous CI cycles. By applying Retecs to data extracted from three industrial case studies, we show for the first time that reinforcement learning enables fruitful automatic adaptive test case selection and prioritization in CI and regression testing.
    Comment: Spieker, H., Gotlieb, A., Marijan, D., & Mossige, M. (2017). Reinforcement Learning for Automatic Test Case Prioritization and Selection in Continuous Integration. In Proceedings of the 26th International Symposium on Software Testing and Analysis (ISSTA'17) (pp. 12-22). ACM.
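As a rough illustration of the idea of prioritizing by failure history and rewarding early failure detection, here is a hand-coded sketch. The function names, the score, and the rank-weighted reward are illustrative assumptions, not Retecs's actual learned policy or reward function:

```python
def failure_reward(schedule, failed):
    """Illustrative reward: each detected failure earns more the earlier
    its test case appears in the prioritized schedule."""
    n = len(schedule)
    return sum((n - rank) / n for rank, t in enumerate(schedule) if t in failed)

def prioritize(test_cases, history):
    """Order tests by a simple failure-rate proxy for a learned score:
    history maps a test id to a list of past results (1 = failed, 0 = passed)."""
    def score(t):
        results = history.get(t, [])
        # Unknown tests get a middling score so they still get explored.
        return sum(results) / len(results) if results else 0.5
    return sorted(test_cases, key=score, reverse=True)
```

For example, a test that failed in both of its last runs is scheduled ahead of a never-run test, which in turn precedes a consistently passing one.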

    Addressing Function Approximation Error in Actor-Critic Methods

    In value-based reinforcement learning methods such as deep Q-learning, function approximation errors are known to lead to overestimated value estimates and suboptimal policies. We show that this problem persists in an actor-critic setting and propose novel mechanisms to minimize its effects on both the actor and the critic. Our algorithm builds on Double Q-learning, taking the minimum value between a pair of critics to limit overestimation. We draw the connection between target networks and overestimation bias, and suggest delaying policy updates to reduce per-update error and further improve performance. We evaluate our method on the suite of OpenAI Gym tasks, outperforming the state of the art in every environment tested.
    Comment: Accepted at ICML 2018.
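The min-over-two-critics idea described above can be sketched in a few lines. This is a minimal illustration of the clipped double-Q target, not the paper's full training loop; the function name and scalar inputs are assumptions:

```python
import numpy as np

def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q bootstrap target: use the minimum of the two
    critics' next-state estimates to curb overestimation bias."""
    return reward + gamma * (1.0 - done) * np.minimum(q1_next, q2_next)
```

Both critics regress toward this single target; the actor (and target networks) would then be updated only every few critic steps, per the delayed-policy-update suggestion.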

    Local ant system for allocating robot swarms to time-constrained tasks

    We propose a novel application of the Ant Colony Optimization algorithm to efficiently allocate a swarm of homogeneous robots to a set of tasks that must be accomplished by specific deadlines. We exploit local communication between robots to periodically evaluate the quality of the allocation solutions, and agents select independently among the high-quality alternatives. The evaluation uses pheromone trails to favor allocations that minimize the execution time of the tasks. Our approach is validated in both static and dynamic environments (i.e., task availability changes over time) using different sets of physics-based simulations.
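A pheromone-guided choice like the one described can be sketched with the standard ACO selection rule, where an option is picked with probability proportional to pheromone^alpha times heuristic^beta. This is an illustrative sketch of the generic rule, not the paper's exact allocation procedure:

```python
import random

def choose_task(tasks, pheromone, heuristic, alpha=1.0, beta=2.0, rng=random):
    """Roulette-wheel selection over tasks, weighted by
    pheromone[t]**alpha * heuristic[t]**beta (standard ACO rule)."""
    weights = [pheromone[t] ** alpha * heuristic[t] ** beta for t in tasks]
    r = rng.random() * sum(weights)
    acc = 0.0
    for t, w in zip(tasks, weights):
        if w == 0.0:
            continue  # zero-pheromone options are never selected
        acc += w
        if r <= acc:
            return t
    return tasks[-1]  # numerical fallback
```

In a swarm setting, each robot would run this rule locally, so agents commit to different high-quality tasks without a central allocator.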

    κ°œλ―Έμ•Œκ³ λ¦¬μ¦˜μ„ μ΄μš©ν•œ λ“œλ‘ μ˜ μ œμ„€ 경둜 μ΅œμ ν™”

    Master's thesis -- Seoul National University Graduate School: College of Engineering, Department of Civil and Environmental Engineering, February 2022. Advisor: Dong-Kyu Kim.
    Drones can overcome the limitations of ground vehicles by avoiding congestion and enabling rapid service. For sudden snowfall under climate change, a quickly deployed drone can be a flexible alternative once deadhead routes and labor costs are considered. The goal of this study is to optimize a drone arc routing problem (D-ARP), servicing the roads that require snow removal. A D-ARP imposes a heavy computational burden, especially on large networks: its search space is large due to the exponentially growing set of candidate routes, the arc-direction decisions, and the continuous arc space. To reduce the search space, we developed an auxiliary transformation method within an ant colony optimization (ACO) algorithm and adopted a random walk method. The contribution of this work is to introduce the D-ARP as a new problem and optimization approach for snow removal operations and to reduce its search space. The optimization results confirmed that the drone travels a 5% to 22% shorter distance than the truck. Furthermore, even under the length-constraint model, the drone shows a 4% reduction compared to the truck. The test sets demonstrated that the adopted heuristic algorithm performs well on large networks in reasonable time. Based on these results, introducing drones into snow removal is expected to reduce operating costs in practice.
    Contents: 1. Introduction (Study Background; Purpose of Research). 2. Literature Review (Drone Arc Routing Problem; Snow Removal Routing Problem; The Classic ARPs and Algorithms; Large Search Space and Arc Direction). 3. Method (Problem Statement; Formulation). 4. Algorithm (Overview; Auxiliary Transformation Method; Ant Colony Optimization (ACO); Post-Process for Arc Direction Decision; Length Constraint and Random Walk). 5. Results (Application in Toy Network; Application in Real-World Networks; Application of the Refill Constraint in Seoul). 6. Conclusion. References. Acknowledgment.

    Hopscotch: Robust Multi-agent Search

    The task of searching a space is critical to a wide range of diverse applications such as land mine clearing and planetary exploration. Because applications frequently require searching remote or hazardous locations, and because the task is easily divisible, it is natural to consider the use of multi-robot teams to accomplish the search task. An important topic of research in this area is the division of the task among robot agents. Interrelated with subtask assignment is failure handling, in the sense that, when an agent fails, its part of the task must then be performed by other agents. This thesis describes Hopscotch, a multi-agent search strategy that divides the search area into a grid of lots. Each agent is assigned responsibility to search one lot at a time, and upon completing the search of that lot the agent is assigned a new lot. Assignment occurs in real time using a simple contract net. Because lots that have been previously searched are skipped, the order of search from the point of view of a particular agent is reminiscent of the progression of steps in the playground game of Hopscotch. Decomposition of the search area is a common approach to multi-agent search, and auction-based contract net strategies have appeared in recent literature as a method of task allocation in multi-agent systems. The Hopscotch strategy combines the two, with a strong focus on robust tolerance of agent failures. Contract nets typically divide all known tasks among available resources. In contrast, Hopscotch limits each agent to one assigned lot at a time, so that failure of an agent compels re-allocation of only one lot search task. Furthermore, the contract net is implemented in an unconventional manner that empowers each agent with responsibility for contract management. This novel combination of real-time assignment and decentralized management allows Hopscotch to resiliently cope with agent failures. 
The Hopscotch strategy was modeled and compared to other multi-agent strategies that tackle the search task in a variety of ways. Simulation results show that Hopscotch is failure-tolerant and very effective in comparison to the other approaches in terms of both search time and search efficiency. Although the search task modeled here is a basic one, results from simulations show the promise of using this strategy for more complicated scenarios, and with actual robot agents.
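The one-lot-at-a-time contract net described above can be sketched as a simple auction plus a failure handler. The function names and cost-based bids are illustrative assumptions, not the thesis's implementation:

```python
def award_lot(lot, bids):
    """Contract-net award: the lot goes to the lowest-cost bidder.
    bids maps agent id -> bid cost for this lot."""
    return min(bids, key=bids.get)

def reassign_on_failure(assignments, failed_agent, bids):
    """Because each agent holds only one lot at a time, an agent failure
    forces re-auction of just that single lot among the surviving agents."""
    lot = assignments.pop(failed_agent, None)
    if lot is not None:
        live = {a: c for a, c in bids.items() if a != failed_agent}
        assignments[award_lot(lot, live)] = lot
    return assignments
```

This localizes the cost of a failure: only the failed agent's current lot is re-auctioned, while every other agent's contract is untouched.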