Cherry-Picking with Reinforcement Learning: Robust Dynamic Grasping in Unstable Conditions
Grasping small objects surrounded by unstable or non-rigid material plays a
crucial role in applications such as surgery, harvesting, construction,
disaster recovery, and assisted feeding. This task is especially difficult when
fine manipulation is required in the presence of sensor noise and perception
errors; errors inevitably trigger dynamic motion, which is challenging to model
precisely. Data-driven methods like reinforcement learning (RL) can optimize
task performance via trial and error, reducing the need for accurate models of
contacts and dynamics. Applying RL methods to real robots, however, has been
hindered by factors such as prohibitively high sample complexity or the high
training infrastructure cost for providing resets on hardware. This work
presents CherryBot, an RL system that uses chopsticks for fine manipulation
that surpasses human reactiveness for some dynamic grasping tasks. By
integrating imprecise simulators, suboptimal demonstrations and external state
estimation, we study how to make a real-world robot learning system sample
efficient and general while reducing the human effort required for supervision.
Our system shows continual improvement through 30 minutes of real-world
interaction: through reactive retry, it achieves an almost 100% success rate on
the demanding task of using chopsticks to grasp small objects swinging in the
air. We demonstrate the reactiveness, robustness and generalizability of
CherryBot to varying object shapes and dynamics (e.g., external disturbances
like wind and human perturbations). Videos are available at
https://goodcherrybot.github.io/
CCIL: Continuity-based Data Augmentation for Corrective Imitation Learning
We present a new technique to enhance the robustness of imitation learning
methods by generating corrective data to account for compounding errors and
disturbances. While existing methods rely on interactive expert labeling,
additional offline datasets, or domain-specific invariances, our approach
requires minimal additional assumptions beyond access to expert data. The key
insight is to leverage local continuity in the environment dynamics to generate
corrective labels. Our method first constructs a dynamics model from the expert
demonstration, encouraging local Lipschitz continuity in the learned model. In
locally continuous regions, this model allows us to generate corrective labels
within the neighborhood of the demonstrations but beyond the actual set of
states and actions in the dataset. Training on this augmented data enhances the
agent's ability to recover from perturbations and deal with compounding errors.
We demonstrate the effectiveness of our generated labels through experiments in
a variety of robotics domains in simulation that have distinct forms of
continuity and discontinuity, including classic control problems, drone flying,
navigation with high-dimensional sensor observations, legged locomotion, and
tabletop manipulation.
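The corrective-label idea in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: `model_delta` is a hypothetical stand-in for a learned dynamics model on a toy 1-D system, and the fixed-point solver and the action-noise scheme are assumptions made for illustration. Given an expert transition, the sketch perturbs the action and then searches for a nearby state from which that perturbed action still lands on the expert's next state. The search converges here because the stand-in model is Lipschitz in the state with a constant below 1, which is why continuity of the model matters.

```python
import random


def model_delta(state, action):
    """Hypothetical stand-in for a learned dynamics model.

    Predicts the state change (next_state - state) of a toy 1-D system.
    It is Lipschitz in `state` with constant 0.1.
    """
    return 0.5 * action - 0.1 * state


def corrective_label(state, action, next_state, model, action_noise, rng, n_iter=50):
    """Return (perturbed_state, perturbed_action) that leads back to next_state.

    Solves s + model(s, a') = next_state for s by fixed-point iteration,
    which contracts when the model's Lipschitz constant in s is below 1.
    """
    noisy_action = action + rng.uniform(-action_noise, action_noise)
    s_tilde = state  # start the search at the expert state
    for _ in range(n_iter):
        s_tilde = next_state - model(s_tilde, noisy_action)
    return s_tilde, noisy_action


def augment(transitions, model, action_noise=0.2, seed=0):
    """Generate one corrective (state, action) label per expert transition."""
    rng = random.Random(seed)
    return [
        corrective_label(s, a, s_next, model, action_noise, rng)
        for s, a, s_next in transitions
    ]


# Toy expert demonstration: drive the state toward the origin.
expert = []
s = 1.0
for _ in range(5):
    a = -s
    s_next = s + model_delta(s, a)
    expert.append((s, a, s_next))
    s = s_next

labels = augment(expert, model_delta)
for (s_tilde, a_tilde), (s, a, s_next) in zip(labels, expert):
    print(f"expert s={s:.3f} a={a:.3f} | corrective s={s_tilde:.3f} a={a_tilde:.3f}")
```

Each generated pair lies off the expert trajectory but, under the model, returns to it in one step, so a policy trained on both the expert pairs and these labels sees examples of recovery.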
Cyclic axial compressive performance of hybrid double-skin tubular square columns
This paper presents an experimental study on the cyclic axial compressive behavior of FRP-concrete-steel hybrid double-skin tubular columns. The square column specimens consisted of an external fiber-reinforced polymer (FRP) jacket, an inner steel tube, and concrete cast in between. The columns were 500 mm high with a side dimension of 150 mm. The effects of loading scheme, void ratio, and diameter-thickness ratio on axial compression behavior were investigated. A total of eight columns were tested under monotonic and cyclic axial compression. The experimental results show that the effect of the loading scheme on the axial stress-strain envelope curve and the peak load was not significant, and that the ultimate state of the square columns under cyclic axial compression was very similar to that of specimens under monotonic axial compression. In addition, compared with the void ratio, the diameter-thickness ratio of the inner steel tube has a significant influence on the peak load of the columns under cyclic axial compression.
Optimizing the synthesis of poly(2,2'-bithiophene)-based semiconducting nanoparticles
Semiconducting polymer-based nanoparticles (Pdots) have recently shown promise in fluorescence labelling and biological sensing thanks to their good photostability, high photoluminescence (PL), and low cytotoxicity compared to conventional dyes and semiconducting quantum dots (QDs) composed of heavy metals. In this thesis, 2,2'-bithiophene (BTh) was used as the monomer to synthesize Pdots. The product was characterized by Fourier transform infrared spectroscopy (FTIR) to study its internal structure and by X-ray diffraction (XRD) to assess crystallinity. Excitation, emission, and cell imaging were also investigated. To optimize the synthesis, Pdots were prepared under a series of varied time and voltage parameters, and their excitation range, maximum emission wavelength, and intensity were analyzed and compared. The results show that the electrochemical synthesis process is time and voltage dependent. The merits of Pdots mentioned above were also observed during characterization. Moreover, other applications of Pdots, i.e. bioimaging and micro-molecule tagging, were demonstrated. Bachelor of Engineering (Chemical and Biomolecular Engineering)
Two-stage blockchain-based transaction mechanism of demand response quota
The current price-guided demand response mechanism struggles to accurately achieve the load reduction expected by the grid company, while the direct-control demand response mechanism cannot meet users' requirements for autonomous power consumption. Therefore, an incentive-compatible demand response mechanism is needed that takes the needs of both demand-side users and the grid company into account. Based on the current bidding demand response program, this work proposes a two-stage blockchain-based transaction mechanism for demand response quota. First, a two-stage transaction mechanism for demand response quota is designed, comprising a day-ahead bidding transaction and an intra-day double auction transaction. Then, smart contracts are introduced to implement the proposed mechanism, and all the smart contract functions involved in each business link are customized. Finally, the effectiveness of the proposed mechanism is illustrated by simulation results on the Remix IDE platform.
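The abstract above does not state how the intra-day double auction is cleared, so the following is only a generic sketch of one common rule, written in Python rather than as a smart contract. The function name `clear_double_auction`, the order format, and the midpoint pricing rule are all assumptions for illustration. Buy orders are sorted by descending price, sell orders by ascending price, and quota is matched while the best remaining bid is at least the best remaining ask.

```python
def clear_double_auction(bids, asks):
    """Clear a uniform-price double auction (illustrative rule, not from the paper).

    bids, asks: lists of (price, quantity) with integer quantities.
    Returns (clearing_price, trades), where trades is a list of
    (bid_price, ask_price, quantity). Returns (None, []) if nothing matches.
    """
    # Mutable copies, best orders first.
    buy = [list(o) for o in sorted(bids, key=lambda o: -o[0])]
    sell = [list(o) for o in sorted(asks, key=lambda o: o[0])]

    i = j = 0
    trades = []
    while i < len(buy) and j < len(sell) and buy[i][0] >= sell[j][0]:
        qty = min(buy[i][1], sell[j][1])
        trades.append((buy[i][0], sell[j][0], qty))
        buy[i][1] -= qty
        sell[j][1] -= qty
        if buy[i][1] == 0:
            i += 1
        if sell[j][1] == 0:
            j += 1

    if not trades:
        return None, []

    # Assumed pricing rule: midpoint of the last (marginal) matched pair.
    last_bid, last_ask, _ = trades[-1]
    return (last_bid + last_ask) / 2, trades


# Hypothetical quota orders as (price, quantity).
bids = [(10, 5), (8, 3), (6, 4)]
asks = [(5, 4), (7, 4), (9, 5)]
price, trades = clear_double_auction(bids, asks)
print(price, sum(q for _, _, q in trades))
```

In an on-chain version this logic would sit inside a contract function and would need fixed-point prices, since the midpoint can be fractional.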
Planning with Spatial-Temporal Abstraction from Point Clouds for Deformable Object Manipulation
Effective planning of long-horizon deformable object manipulation requires
suitable abstractions at both the spatial and temporal levels. Previous methods
typically either focus on short-horizon tasks or make strong assumptions that
full-state information is available, which prevents their use on deformable
objects. In this paper, we propose PlAnning with Spatial-Temporal Abstraction
(PASTA), which incorporates both spatial abstraction (reasoning about objects
and their relations to each other) and temporal abstraction (reasoning over
skills instead of low-level actions). Our framework maps high-dimensional 3D
observations such as point clouds into a set of latent vectors and plans over
skill sequences on top of the latent set representation. We show that our
method can effectively perform challenging sequential deformable object
manipulation tasks in the real world, which require combining multiple tool-use
skills such as cutting with a knife, pushing with a pusher, and spreading the
dough with a roller. Comment: Published at the Conference on Robot Learning (CoRL 2022)