
    Ant Colony Optimization in Green Manufacturing


    Towards efficient and robust reinforcement learning via synthetic environments and offline data

Over the past decade, Deep Reinforcement Learning (RL) has driven many advances in sequential decision-making, including remarkable applications in superhuman Go-playing, robotic control, and automated algorithm discovery. However, despite these successes, deep RL is also notoriously sample-inefficient, usually generalizes poorly to settings beyond the original environment, and can be unstable during training. Moreover, the conventional RL setting still relies on exploring and learning tabula rasa in new environments and does not make use of pre-existing data. This thesis investigates two promising directions to address these challenges. First, we explore the use of synthetic data and environments in order to broaden an agent's experience. Second, we propose principled techniques to leverage pre-existing datasets, thereby reducing or replacing the need for costly online data collection.

The first part of the thesis focuses on the generation of synthetic data and environments to train RL agents. While model-based RL has a rich history of leveraging a learned dynamics model to improve sample efficiency, these methods are usually restricted to single-task settings. To overcome this limitation, we propose Augmented World Models, a novel approach designed for offline-to-online transfer where the test dynamics may differ from the training data. Our method augments a learned dynamics model with simple transformations that seek to capture potential changes in the physical properties of a robot, leading to more robust policies. Additionally, we train the agent with the sampled augmentation as context for test-time inference, significantly improving zero-shot generalization to novel dynamics. Going beyond commonly used forward dynamics models, we propose an alternative paradigm, Synthetic Experience Replay, which uses generative modeling to directly model and upsample the agent's training data distribution.
Leveraging recent advances in diffusion generative models, our approach outperforms, and is composable with, standard data augmentation, and is particularly effective in low-data regimes. Furthermore, our method opens the door for certain RL agents to train stably with much larger networks than before.

In the second part of the thesis, we explore a complementary direction to data efficiency where we can leverage pre-existing data. While adjacent fields of machine learning, such as computer vision and natural language processing, have made significant progress in scaling data and model size, traditional RL algorithms can find it difficult to incorporate additional data due to the need for on-policy data. We begin by investigating a principled method for incorporating expert demonstrations to accelerate online RL, KL-regularization to a behavioral prior, and identify a pathology stemming from the behavioral prior having uncalibrated uncertainties. We show that standard parameterizations of the behavioral reference policy can lead to unstable training dynamics, and propose a solution, Non-Parametric Prior Actor–Critic, that represents the new state-of-the-art in locomotion and dexterous manipulation tasks. Furthermore, we make advances in offline reinforcement learning, with which agents can be trained without any online data collection at all. In this domain, we elucidate the design space of offline model-based RL algorithms and highlight where prior methods use suboptimal heuristics and choices for hyperparameters. By rigorously searching through this space, we show that we can vastly improve standard algorithms and provide insights into which design choices are most important.
Finally, we make progress towards extending offline RL to pixel-based environments by presenting Vision Datasets for Deep Data-Driven RL, the first comprehensive and publicly available evaluation suite for this field, alongside simple model-based and model-free baselines for assessing future progress in this domain.

In conclusion, this thesis represents explorations toward making RL algorithms more efficient and readily deployable in the real world. Further progress along these directions may bring us closer to the ultimate goal of more generally capable agents that are able both to generate appropriate learning environments for themselves and to bootstrap learning from vast quantities of pre-existing data.
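The upsampling idea behind Synthetic Experience Replay can be illustrated with a minimal sketch: fit a generative model to the replay buffer, then mix sampled synthetic transitions into training batches. The Gaussian fit and random stand-in data below replace the diffusion model and real agent transitions the thesis actually uses; all names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical replay buffer: each row is a flattened
# (state, action, reward, next_state) transition; random data
# stands in for real agent experience here.
real_transitions = rng.normal(size=(256, 6))

# Fit a simple generative model of the transition distribution.
# (The thesis uses a diffusion model; a Gaussian stands in.)
mu = real_transitions.mean(axis=0)
cov = np.cov(real_transitions, rowvar=False)

# Upsample: draw synthetic transitions and mix them with real
# ones, enlarging the effective training set in low-data regimes.
synthetic = rng.multivariate_normal(mu, cov, size=1024)
training_batch = np.concatenate([real_transitions, synthetic], axis=0)

print(training_batch.shape)  # (1280, 6)
```

Any downstream off-policy RL update would then sample minibatches from `training_batch` instead of the raw buffer alone.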

    Modulated Unit-Norm Tight Frames for Compressed Sensing

    In this paper, we propose a compressed sensing (CS) framework that consists of three parts: a unit-norm tight frame (UTF), a random diagonal matrix, and a column-wise orthonormal matrix. We prove that this structure satisfies the restricted isometry property (RIP) with high probability if the number of measurements is m = O(s log² s log² n) for s-sparse signals of length n and if the column-wise orthonormal matrix is bounded. Some existing structured sensing models can be studied under this framework, which then gives tighter bounds on the required number of measurements to satisfy the RIP. More importantly, we propose several structured sensing models by appealing to this unified framework, such as a general sensing model with arbitrary/deterministic subsamplers, a fast and efficient block compressed sensing scheme, and structured sensing matrices with deterministic phase modulations, all of which can lead to improvements in practical applications. In particular, one of the constructions is applied to simplify the transceiver design of CS-based channel estimation for orthogonal frequency division multiplexing (OFDM) systems.
    Comment: submitted to IEEE Transactions on Signal Processing
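A toy instance of such a three-part sensing matrix can be assembled as follows. The concrete choices here (an orthonormal matrix as the UTF, random ±1 signs on the diagonal, and rows of a random orthogonal matrix as a stand-in subsampler) are illustrative assumptions, not the paper's constructions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, s = 64, 24, 4  # signal length, measurements, sparsity (toy sizes)

# Part 1 — unit-norm tight frame: an orthonormal matrix is the
# simplest example of a UTF.
F, _ = np.linalg.qr(rng.normal(size=(n, n)))

# Part 2 — random diagonal matrix of independent signs.
D = np.diag(rng.choice([-1.0, 1.0], size=n))

# Part 3 — m rows of another random orthogonal matrix, a toy
# stand-in for the bounded structured part the paper requires.
Q, _ = np.linalg.qr(rng.normal(size=(n, n)))
B = Q[:m, :]

# Composite structured sensing matrix, m x n (scaled for energy).
A = np.sqrt(n / m) * B @ D @ F

# Measure an s-sparse signal of length n.
x = np.zeros(n)
x[rng.choice(n, size=s, replace=False)] = rng.normal(size=s)
y = A @ x
print(A.shape, y.shape)
```

Recovering x from y would then use any standard CS solver (e.g. basis pursuit); the point of the structure is that A admits fast transforms and RIP guarantees with few measurements.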

    Determination of Cap Model Parameters using Numerical Optimization Method for Powder Compaction

    Many advantages are inherent to the powder metallurgy (P/M) process, especially in high-volume manufacturing, and the strength/density distribution of the compacted product is crucial to overall success. The finite element analysis (FEA) method has become an effective way to numerically simulate the strength/density distribution in a P/M compact, and the modified Drucker-Prager cap (DPC) model has been shown to be a suitable constitutive relationship for metal powder compaction simulation. Calibrating the modified DPC model involves a procedure known as a triaxial compression test, but equipment for performing such a test on metal powders is neither readily available nor standardized in the P/M industry. A robust calibration procedure that requires only simple experimental tests would therefore increase the usability of the simulation procedure. This research created a universal, cost- and time-effective calibration method to accurately determine all parameters of a modified DPC model by combining numerical simulation, numerical optimization, and common material testing techniques. The triaxial compression test is eliminated, and the new method relies only upon conventional compaction equipment, standard geometry, and readily available metallographic techniques. The DPC parameters were determined by applying the proposed method to ferrous powders, and the predicted parameters were verified on a compacted product with complex geometry.
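The core calibration idea, adjusting model parameters until a simulated response matches simple experimental measurements, can be sketched with a toy fit. The exponential hardening law, the parameter names, and the data below are all hypothetical stand-ins; the actual procedure fits the full set of modified DPC parameters against FEA predictions of compaction.

```python
import numpy as np

# Hypothetical measured compaction data: pressure vs. relative density.
density = np.array([0.50, 0.60, 0.70, 0.80, 0.85])
pressure = np.array([30.0, 80.0, 200.0, 480.0, 720.0])  # MPa, illustrative

# Toy hardening law p = a * exp(b * rho), standing in for the DPC cap
# hardening relation. Taking logs turns the fit into a linear
# least-squares problem, solved here in closed form as a stand-in for
# the iterative optimization run against FEA predictions.
X = np.column_stack([np.ones_like(density), density])
coef, *_ = np.linalg.lstsq(X, np.log(pressure), rcond=None)
a_fit, b_fit = np.exp(coef[0]), coef[1]

# Check the fitted curve against the "measurements".
predicted = a_fit * np.exp(b_fit * density)
print(a_fit, b_fit)
```

In the real procedure, the residual would compare FEA-simulated density maps to metallographic measurements, and a derivative-free optimizer would adjust all DPC parameters simultaneously.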

    Multiform Adaptive Robot Skill Learning from Humans

    Object manipulation is a basic element of everyday human life. Robotic manipulation has progressed from maneuvering single-rigid-body objects with firm grasping to maneuvering soft objects and handling contact-rich actions. Meanwhile, technologies such as robot learning from demonstration have enabled humans to intuitively train robots. This paper discusses a new level of robotic learning-based manipulation. In contrast to the single form of learning from demonstration, we propose a multiform learning approach that integrates additional forms of skill acquisition, including adaptive learning from definition and evaluation. Moreover, going beyond state-of-the-art technologies for handling purely rigid or soft objects in a pseudo-static manner, our work allows robots to learn to handle partly rigid, partly soft objects with time-critical skills and sophisticated contact control. Such robotic manipulation capability offers a variety of new possibilities in human-robot interaction.
    Comment: Accepted to the 2017 Dynamic Systems and Control Conference (DSCC), Tysons Corner, VA, October 11-13

    Robot Composite Learning and the Nunchaku Flipping Challenge

    Advanced motor skills are essential for robots to physically coexist with humans. Much research on robot dynamics and control has achieved success in advanced robot motor capabilities, but mostly through heavily case-specific engineering. Meanwhile, in terms of robots acquiring skills in a ubiquitous manner, robot learning from human demonstration (LfD) has made great progress, but still has limitations in handling dynamic skills and compound actions. In this paper, we present a composite learning scheme that goes beyond LfD and integrates robot learning from human definition, demonstration, and evaluation. The method tackles advanced motor skills that require dynamic time-critical maneuvers, complex contact control, and handling of partly soft, partly rigid objects. We also introduce the "nunchaku flipping challenge", an extreme test that places hard requirements on all three of these aspects. Continuing from our previous presentations, this paper introduces the latest update of the composite learning scheme and the physical success of the nunchaku flipping challenge.