Meta-Reinforcement Learning via Language Instructions
Although deep reinforcement learning has recently been very successful at
learning complex behaviors, it requires a tremendous amount of data to learn a
task. One fundamental reason for this limitation lies in the nature
of the trial-and-error learning paradigm of reinforcement learning, where the
agent interacts with the environment and progresses in learning by relying
only on the reward signal. This signal is implicit and rather insufficient for
learning a task well. In contrast, humans are usually taught new skills via natural
language instructions. Utilizing language instructions for robotic motion
control to improve adaptability is a recently emerged and challenging
topic. In this paper, we present a meta-RL algorithm that addresses the
challenge of learning skills with language instructions in multiple
manipulation tasks. On the one hand, our algorithm utilizes the language
instructions to shape its interpretation of the task; on the other hand, it
still learns to solve the task in a trial-and-error process. We evaluate our
algorithm on the robotic manipulation benchmark (Meta-World) and it
significantly outperforms state-of-the-art methods in terms of training and
testing task success rates. Code is available at
https://tumi6robot.wixsite.com/million
Language-Conditioned Imitation Learning with Base Skill Priors under Unstructured Data
The growing interest in language-conditioned robot manipulation aims to
develop robots capable of understanding and executing complex tasks, with the
objective of enabling robots to interpret language commands and manipulate
objects accordingly. While language-conditioned approaches demonstrate
impressive capabilities for addressing tasks in familiar environments, they
encounter limitations in adapting to unfamiliar environment settings. In this
study, we propose a general-purpose, language-conditioned approach that
combines base skill priors and imitation learning under unstructured data to
enhance the algorithm's generalization in adapting to unfamiliar environments.
We assess our model's performance in both simulated and real-world environments
using a zero-shot setting. In the simulated environment, the proposed approach
surpasses previously reported scores for the CALVIN benchmark, especially in the
challenging Zero-Shot Multi-Environment setting. The average completed task
length, indicating the average number of tasks the agent can continuously
complete, improves more than 2.5 times compared to the state-of-the-art method
HULC. In addition, we conduct a zero-shot evaluation of our policy in a
real-world setting, following training exclusively in simulated environments
without additional specific adaptations. In this evaluation, we set up ten
tasks, and our approach achieved an average 30% improvement over the
current state-of-the-art approach, demonstrating a high generalization
capability in both simulated environments and the real world. For further
details, including access to our code and videos, please refer to our
supplementary materials.
Learning from Symmetry: Meta-Reinforcement Learning with Symmetric Data and Language Instructions
Meta-reinforcement learning (meta-RL) is a promising approach that enables
the agent to learn new tasks quickly. However, most meta-RL algorithms show
poor generalization in multiple-task scenarios due to the insufficient task
information provided only by rewards. Language-conditioned meta-RL improves the
generalization by matching language instructions and the agent's behaviors.
Learning from symmetry is an important form of human learning; therefore,
combining symmetry and language instructions into meta-RL can help improve the
algorithm's generalization and learning efficiency. We thus propose a dual-MDP
meta-reinforcement learning method that enables learning new tasks efficiently
with symmetric data and language instructions. We evaluate our method in
multiple challenging manipulation tasks, and experimental results show our
method can greatly improve the generalization and efficiency of
meta-reinforcement learning.
Language-conditioned Learning for Robotic Manipulation: A Survey
Language-conditioned robotic manipulation represents a cutting-edge area of
research, enabling seamless communication and cooperation between humans and
robotic agents. This field focuses on teaching robotic systems to comprehend
and execute instructions conveyed in natural language. To achieve this, the
development of robust language understanding models capable of extracting
actionable insights from textual input is essential. In this comprehensive
survey, we systematically explore recent advancements in language-conditioned
approaches within the context of robotic manipulation. We analyze these
approaches based on their learning paradigms, which encompass reinforcement
learning, imitation learning, and the integration of foundational models, such
as large language models and vision-language models. Furthermore, we conduct an
in-depth comparative analysis, considering aspects like semantic information
extraction, environment & evaluation, auxiliary tasks, and task representation.
Finally, we outline potential future research directions in the realm of
language-conditioned learning for robotic manipulation, focusing on
generalization capabilities and safety issues. The GitHub repository of this
paper can be found at
https://github.com/hk-zh/language-conditioned-robot-manipulation-model
Safety Guaranteed Manipulation Based on Reinforcement Learning Planner and Model Predictive Control Actor
Deep reinforcement learning (RL) has been endowed with high expectations in
tackling challenging manipulation tasks in an autonomous and self-directed
fashion. Despite the significant strides made in the development of
reinforcement learning, the practical deployment of this paradigm is hindered
by at least two barriers, namely, engineering a reward function and
guaranteeing the safety of learning-based controllers. In this paper, we
address these challenging limitations by proposing a framework that merges a
reinforcement learning planner that is trained using
sparse rewards with a model predictive controller (MPC)
actor, thereby offering a safe policy. On the one
hand, the RL planner learns from sparse rewards by
selecting intermediate goals that are easy to achieve in the short term and
promising to lead to target goals in the long term. On the other hand, the MPC
actor takes the suggested intermediate goals from
the RL planner as the input and predicts how the
robot's action will enable it to reach that goal while avoiding any obstacles
over a short period of time. We evaluated our method on four challenging
manipulation tasks with dynamic obstacles and the results demonstrate that, by
leveraging the complementary strengths of these two components, the agent can
solve manipulation tasks in complex, dynamic environments safely with a
success rate. Videos are available at
https://videoviewsite.wixsite.com/mpc-hgg
Onset of sedimentation near the Carnian/Norian boundary in the northwestern Sichuan Basin: New evidence from ammonoid biostratigraphy and zircon U-Pb geochronology
Upper Triassic deposits formed at the onset of subsidence in the Sichuan foreland basin of South China, and may record a crisis of carbonate deposition related to the Carnian Pluvial Episode. However, there is no consensus yet on the precise age of these deposits in northwestern Sichuan. In this work, ammonoid biostratigraphy has been improved, and a U-Pb age from detrital zircons has been obtained from the Upper Triassic of northwestern Sichuan. New ammonoid taxa Sinotropites sichuanensis n. gen., n. sp. and Hadrothisbites hanwangensis n. sp. are described from the upper part of the Ma’antang Formation in the Hanwang and Jushui areas and are assigned to the uppermost Tuvalian (Anatropites spinosus Zone, Gonionotites italicus Subzone). Ammonoid and conodont biostratigraphy, combined with a concordant U-Pb age of 227.2 ± 1.1 Ma obtained from the youngest detrital zircons, provide a robust constraint on the initial sedimentation phases of the foreland basin in northwestern Sichuan, and suggest that the terrigenous turnover was not related to the Carnian Pluvial Episode.
Dust deposition in the Aral Sea: implications for changes in atmospheric circulation in central Asia during the past 2000 years
We investigated mineral aerosol (dust) deposition in the Aral Sea to understand the variability of dust in central Asia and its implications for atmospheric circulation change in the late Holocene. Using an 11.12-m sediment core from the lake, we calculated bulk sediment fluxes at high time resolution and analyzed grain-size distributions of detrital sediments. A refined age-depth model was established by combining radiocarbon dating with archeological evidence. In addition, a principal component analysis (PCA) of grain-size fractions and elements (Fe, Ti, K, Ca, Sr) was used to assess the potential processes controlling detrital inputs. The results suggest that two processes are mainly relevant for the clastic input, as the medium silt fractions and Ti, Fe and K are positively correlated with Component 1 (C1), and the fine size fractions (<6 μm) are positively correlated with Component 2 (C2). Taking the results of the PCA, the geological background and the clastic input processes into account, we propose that the medium silt fractions and, in particular, the grain-size fraction ratio (6-32 μm/2-6 μm), can serve as indicators of the variability of airborne dust in the Aral Sea region. In contrast, the fine size fractions appear to be contributed mainly by sheetwash processes. The bulk sediment deposition fluxes were extremely high during the Little Ice Age (LIA; AD 1400-1780), which may be related to increased dust deposition. As indicated by the variations of the grain-size ratio and Ti, the history of dust deposition in central Asia can be divided into five distinct periods, with remarkably low deposition during AD 1-350, a moderately high value from AD 350-720, a return to a relatively low level between AD 720 and AD 1400 (including the Medieval Warm Period (MWP, AD 755-1070)), exceptionally high deposition from AD 1400 to the 1940s and an abnormally low value since the 1940s.
The temporal variations in the dust deposition are consistent with the changes in the Siberian High (SH) and mean atmospheric temperature of the northern hemisphere during the past 2000 years, with low/high annual temperature anomalies corresponding to high/low dust supplies in the Aral Sea sediments, respectively. The variations in the fine size fraction also show a broad similarity to a lacustrine δ18O record in Turkey (Jones et al., 2006), implying that there was less moisture entering western central Asia from the Mediterranean during the LIA than during the MW
Proterozoic to Phanerozoic case studies of laser ablation microanalysis for microbial carbonate U–Pb geochronology
Some of the earliest bio-sedimentary records of life on Earth are represented by microbial carbonates, which are also critical geochemical archives of ancient seawater chemistry and the environmental circumstances in which they precipitated. Reconstructing paleo-microbial environments on Earth and potentially other planets requires precise determination of the depositional ages of these materials. The (abiogenic) carbonate geochemistry communities can now use developments in in-situ laser ablation U-Pb dating using inductively coupled plasma mass spectrometry (LA-ICP-MS). Due to the effects of impurity mixing and diagenesis, microbial carbonates have received little geochronological study despite their broad relevance for understanding ancient seawater's environmental conditions and geochemical compositions. This study demonstrates the use of time-of-flight mass spectrometry (TOF-MS) to perform quick, quantitative elemental mapping before U-Pb spot dating to improve experiment success rates and data reliability, and offers four practical application examples.
Regional vegetation patterns at lake Son Kul reveal Holocene climatic variability in central Tien Shan (Kyrgyzstan, Central Asia)
A Review of Research Progress on the Analytical Method of Large-n Detrital Zircon U-Pb Geochronology
BACKGROUND: Detrital zircon U-Pb geochronology is an important tool for identifying sedimentary provenance and determining the maximum depositional age. The number of grains analyzed in detrital zircon provenance investigations using laser-ablation inductively coupled plasma mass spectrometry (LA-ICP-MS) typically ranges from 60 to 120. In this range, age components are commonly not identified from the sample aliquot. To improve the reliability of provenance investigations, analysis of more grains (n ≥ 300) or even of a large-n aliquot with more than 1000 grains (n > 1000) is required. The emergence of large-n detrital zircon U-Pb geochronology is challenging the methods of data measurement, reduction and evaluation. OBJECTIVES: To summarize the progress of measurement, data reduction and data evaluation of large-n detrital zircon U-Pb geochronology. METHODS: By summarizing the method innovations in domestic and foreign literature. RESULTS: Firstly, each measurement requires rapid acquisition of U and Pb isotope signals, which can be achieved by improving the transmission efficiency of the aerosol. The "flat" signal acquisition time can be shortened or transformed to a "peak" signal mode for rapid measurement. Secondly, large-n data require an efficient data reduction protocol or powerful software (e.g. iolite) to improve visualization and reduce the variability in inter-laboratory comparisons. For the U-Pb data processing flow, several optimized methods are introduced for fractionation correction and uncertainty propagation. In addition, total integrated counts and linear regression correction are introduced specifically to process "peak" signals. Thirdly, new methods for calculating U-Pb and Pb-Pb age discordance, such as the Aitchison concordia distance, make data filtering more reasonable.
Based on recent research progress, the future of automation and standardization of large-n detrital zircon U-Pb geochronology is discussed and advice on the selection of instruments and reduction software is provided. CONCLUSIONS: In the future, the development of large-n detrital zircon U-Pb geochronology has great prospects, and will play a greater role in the study of provenance tracing and stratigraphic dating.