8,951 research outputs found

    Multi-criteria Evolution of Neural Network Topologies: Balancing Experience and Performance in Autonomous Systems

    Full text link
    Majority of Artificial Neural Network (ANN) implementations in autonomous systems use a fixed/user-prescribed network topology, leading to sub-optimal performance and low portability. The existing neuro-evolution of augmenting topology or NEAT paradigm offers a powerful alternative by allowing the network topology and the connection weights to be simultaneously optimized through an evolutionary process. However, most NEAT implementations allow the consideration of only a single objective. There also persists the question of how to tractably introduce topological diversification that mitigates overfitting to training scenarios. To address these gaps, this paper develops a multi-objective neuro-evolution algorithm. While adopting the basic elements of NEAT, important modifications are made to the selection, speciation, and mutation processes. With the backdrop of small-robot path-planning applications, an experience-gain criterion is derived to encapsulate the amount of diverse local environment encountered by the system. This criterion facilitates the evolution of genes that support exploration, thereby seeking to generalize from a smaller set of mission scenarios than possible with performance maximization alone. The effectiveness of the single-objective (optimizing performance) and the multi-objective (optimizing performance and experience-gain) neuro-evolution approaches are evaluated on two different small-robot cases, with ANNs obtained by the multi-objective optimization observed to provide superior performance in unseen scenarios

    DREAM Architecture: a Developmental Approach to Open-Ended Learning in Robotics

    Full text link
    Robots are still limited to controlled conditions, that the robot designer knows with enough details to endow the robot with the appropriate models or behaviors. Learning algorithms add some flexibility with the ability to discover the appropriate behavior given either some demonstrations or a reward to guide its exploration with a reinforcement learning algorithm. Reinforcement learning algorithms rely on the definition of state and action spaces that define reachable behaviors. Their adaptation capability critically depends on the representations of these spaces: small and discrete spaces result in fast learning while large and continuous spaces are challenging and either require a long training period or prevent the robot from converging to an appropriate behavior. Beside the operational cycle of policy execution and the learning cycle, which works at a slower time scale to acquire new policies, we introduce the redescription cycle, a third cycle working at an even slower time scale to generate or adapt the required representations to the robot, its environment and the task. We introduce the challenges raised by this cycle and we present DREAM (Deferred Restructuring of Experience in Autonomous Machines), a developmental cognitive architecture to bootstrap this redescription process stage by stage, build new state representations with appropriate motivations, and transfer the acquired knowledge across domains or tasks or even across robots. We describe results obtained so far with this approach and end up with a discussion of the questions it raises in Neuroscience

    Few-shot Quality-Diversity Optimization

    Full text link
    In the past few years, a considerable amount of research has been dedicated to the exploitation of previous learning experiences and the design of Few-shot and Meta Learning approaches, in problem domains ranging from Computer Vision to Reinforcement Learning based control. A notable exception, where to the best of our knowledge, little to no effort has been made in this direction is Quality-Diversity (QD) optimization. QD methods have been shown to be effective tools in dealing with deceptive minima and sparse rewards in Reinforcement Learning. However, they remain costly due to their reliance on inherently sample inefficient evolutionary processes. We show that, given examples from a task distribution, information about the paths taken by optimization in parameter space can be leveraged to build a prior population, which when used to initialize QD methods in unseen environments, allows for few-shot adaptation. Our proposed method does not require backpropagation. It is simple to implement and scale, and furthermore, it is agnostic to the underlying models that are being trained. Experiments carried in both sparse and dense reward settings using robotic manipulation and navigation benchmarks show that it considerably reduces the number of generations that are required for QD optimization in these environments.Comment: Accepted for publication in the IEEE Robotics and Automation Letters (RA-L) journa

    Towards Continual Reinforcement Learning: A Review and Perspectives

    Full text link
    In this article, we aim to provide a literature review of different formulations and approaches to continual reinforcement learning (RL), also known as lifelong or non-stationary RL. We begin by discussing our perspective on why RL is a natural fit for studying continual learning. We then provide a taxonomy of different continual RL formulations and mathematically characterize the non-stationary dynamics of each setting. We go on to discuss evaluation of continual RL agents, providing an overview of benchmarks used in the literature and important metrics for understanding agent performance. Finally, we highlight open problems and challenges in bridging the gap between the current state of continual RL and findings in neuroscience. While still in its early days, the study of continual RL has the promise to develop better incremental reinforcement learners that can function in increasingly realistic applications where non-stationarity plays a vital role. These include applications such as those in the fields of healthcare, education, logistics, and robotics.Comment: Preprint, 52 pages, 8 figure

    Robot Navigation in Unseen Spaces using an Abstract Map

    Full text link
    Human navigation in built environments depends on symbolic spatial information which has unrealised potential to enhance robot navigation capabilities. Information sources such as labels, signs, maps, planners, spoken directions, and navigational gestures communicate a wealth of spatial information to the navigators of built environments; a wealth of information that robots typically ignore. We present a robot navigation system that uses the same symbolic spatial information employed by humans to purposefully navigate in unseen built environments with a level of performance comparable to humans. The navigation system uses a novel data structure called the abstract map to imagine malleable spatial models for unseen spaces from spatial symbols. Sensorimotor perceptions from a robot are then employed to provide purposeful navigation to symbolic goal locations in the unseen environment. We show how a dynamic system can be used to create malleable spatial models for the abstract map, and provide an open source implementation to encourage future work in the area of symbolic navigation. Symbolic navigation performance of humans and a robot is evaluated in a real-world built environment. The paper concludes with a qualitative analysis of human navigation strategies, providing further insights into how the symbolic navigation capabilities of robots in unseen built environments can be improved in the future.Comment: 15 pages, published in IEEE Transactions on Cognitive and Developmental Systems (http://doi.org/10.1109/TCDS.2020.2993855), see https://btalb.github.io/abstract_map/ for access to softwar

    Engineering Resilient Space Systems

    Get PDF
    Several distinct trends will influence space exploration missions in the next decade. Destinations are becoming more remote and mysterious, science questions more sophisticated, and, as mission experience accumulates, the most accessible targets are visited, advancing the knowledge frontier to more difficult, harsh, and inaccessible environments. This leads to new challenges including: hazardous conditions that limit mission lifetime, such as high radiation levels surrounding interesting destinations like Europa or toxic atmospheres of planetary bodies like Venus; unconstrained environments with navigation hazards, such as free-floating active small bodies; multielement missions required to answer more sophisticated questions, such as Mars Sample Return (MSR); and long-range missions, such as Kuiper belt exploration, that must survive equipment failures over the span of decades. These missions will need to be successful without a priori knowledge of the most efficient data collection techniques for optimum science return. Science objectives will have to be revised ‘on the fly’, with new data collection and navigation decisions on short timescales. Yet, even as science objectives are becoming more ambitious, several critical resources remain unchanged. Since physics imposes insurmountable light-time delays, anticipated improvements to the Deep Space Network (DSN) will only marginally improve the bandwidth and communications cadence to remote spacecraft. Fiscal resources are increasingly limited, resulting in fewer flagship missions, smaller spacecraft, and less subsystem redundancy. As missions visit more distant and formidable locations, the job of the operations team becomes more challenging, seemingly inconsistent with the trend of shrinking mission budgets for operations support. How can we continue to explore challenging new locations without increasing risk or system complexity? These challenges are present, to some degree, for the entire Decadal Survey mission portfolio, as documented in Vision and Voyages for Planetary Science in the Decade 2013–2022 (National Research Council, 2011), but are especially acute for the following mission examples, identified in our recently completed KISS Engineering Resilient Space Systems (ERSS) study: 1. A Venus lander, designed to sample the atmosphere and surface of Venus, would have to perform science operations as components and subsystems degrade and fail; 2. A Trojan asteroid tour spacecraft would spend significant time cruising to its ultimate destination (essentially hibernating to save on operations costs), then upon arrival, would have to act as its own surveyor, finding new objects and targets of opportunity as it approaches each asteroid, requiring response on short notice; and 3. A MSR campaign would not only be required to perform fast reconnaissance over long distances on the surface of Mars, interact with an unknown physical surface, and handle degradations and faults, but would also contain multiple components (launch vehicle, cruise stage, entry and landing vehicle, surface rover, ascent vehicle, orbiting cache, and Earth return vehicle) that dramatically increase the need for resilience to failure across the complex system. The concept of resilience and its relevance and application in various domains was a focus during the study, with several definitions of resilience proposed and discussed. While there was substantial variation in the specifics, there was a common conceptual core that emerged—adaptation in the presence of changing circumstances. These changes were couched in various ways—anomalies, disruptions, discoveries—but they all ultimately had to do with changes in underlying assumptions. Invalid assumptions, whether due to unexpected changes in the environment, or an inadequate understanding of interactions within the system, may cause unexpected or unintended system behavior. A system is resilient if it continues to perform the intended functions in the presence of invalid assumptions. Our study focused on areas of resilience that we felt needed additional exploration and integration, namely system and software architectures and capabilities, and autonomy technologies. (While also an important consideration, resilience in hardware is being addressed in multiple other venues, including 2 other KISS studies.) The study consisted of two workshops, separated by a seven-month focused study period. The first workshop (Workshop #1) explored the ‘problem space’ as an organizing theme, and the second workshop (Workshop #2) explored the ‘solution space’. In each workshop, focused discussions and exercises were interspersed with presentations from participants and invited speakers. The study period between the two workshops was organized as part of the synthesis activity during the first workshop. The study participants, after spending the initial days of the first workshop discussing the nature of resilience and its impact on future science missions, decided to split into three focus groups, each with a particular thrust, to explore specific ideas further and develop material needed for the second workshop. The three focus groups and areas of exploration were: 1. Reference missions: address/refine the resilience needs by exploring a set of reference missions 2. Capability survey: collect, document, and assess current efforts to develop capabilities and technology that could be used to address the documented needs, both inside and outside NASA 3. Architecture: analyze the impact of architecture on system resilience, and provide principles and guidance for architecting greater resilience in our future systems The key product of the second workshop was a set of capability roadmaps pertaining to the three reference missions selected for their representative coverage of the types of space missions envisioned for the future. From these three roadmaps, we have extracted several common capability patterns that would be appropriate targets for near-term technical development: one focused on graceful degradation of system functionality, a second focused on data understanding for science and engineering applications, and a third focused on hazard avoidance and environmental uncertainty. Continuing work is extending these roadmaps to identify candidate enablers of the capabilities from the following three categories: architecture solutions, technology solutions, and process solutions. The KISS study allowed a collection of diverse and engaged engineers, researchers, and scientists to think deeply about the theory, approaches, and technical issues involved in developing and applying resilience capabilities. The conclusions summarize the varied and disparate discussions that occurred during the study, and include new insights about the nature of the challenge and potential solutions: 1. There is a clear and definitive need for more resilient space systems. During our study period, the key scientists/engineers we engaged to understand potential future missions confirmed the scientific and risk reduction value of greater resilience in the systems used to perform these missions. 2. Resilience can be quantified in measurable terms—project cost, mission risk, and quality of science return. In order to consider resilience properly in the set of engineering trades performed during the design, integration, and operation of space systems, the benefits and costs of resilience need to be quantified. We believe, based on the work done during the study, that appropriate metrics to measure resilience must relate to risk, cost, and science quality/opportunity. Additional work is required to explicitly tie design decisions to these first-order concerns. 3. There are many existing basic technologies that can be applied to engineering resilient space systems. Through the discussions during the study, we found many varied approaches and research that address the various facets of resilience, some within NASA, and many more beyond. Examples from civil architecture, Department of Defense (DoD) / Defense Advanced Research Projects Agency (DARPA) initiatives, ‘smart’ power grid control, cyber-physical systems, software architecture, and application of formal verification methods for software were identified and discussed. The variety and scope of related efforts is encouraging and presents many opportunities for collaboration and development, and we expect many collaborative proposals and joint research as a result of the study. 4. Use of principled architectural approaches is key to managing complexity and integrating disparate technologies. The main challenge inherent in considering highly resilient space systems is that the increase in capability can result in an increase in complexity with all of the 3 risks and costs associated with more complex systems. What is needed is a better way of conceiving space systems that enables incorporation of capabilities without increasing complexity. We believe principled architecting approaches provide the needed means to convey a unified understanding of the system to primary stakeholders, thereby controlling complexity in the conception and development of resilient systems, and enabling the integration of disparate approaches and technologies. A representative architectural example is included in Appendix F. 5. Developing trusted resilience capabilities will require a diverse yet strategically directed research program. Despite the interest in, and benefits of, deploying resilience space systems, to date, there has been a notable lack of meaningful demonstrated progress in systems capable of working in hazardous uncertain situations. The roadmaps completed during the study, and documented in this report, provide the basis for a real funded plan that considers the required fundamental work and evolution of needed capabilities. Exploring space is a challenging and difficult endeavor. Future space missions will require more resilience in order to perform the desired science in new environments under constraints of development and operations cost, acceptable risk, and communications delays. Development of space systems with resilient capabilities has the potential to expand the limits of possibility, revolutionizing space science by enabling as yet unforeseen missions and breakthrough science observations. Our KISS study provided an essential venue for the consideration of these challenges and goals. Additional work and future steps are needed to realize the potential of resilient systems—this study provided the necessary catalyst to begin this process
    • 

    corecore