Pre-training on Synthetic Driving Data for Trajectory Prediction
Accumulating substantial volumes of real-world driving data proves pivotal in
the realm of trajectory forecasting for autonomous driving. Given the heavy
reliance of current trajectory forecasting models on data-driven methodologies,
we aim to tackle the challenge of learning general trajectory forecasting
representations under limited data availability. We propose to augment both HD
maps and trajectories and apply pre-training strategies on top of them.
Specifically, we take advantage of graph representations of HD maps and apply
vector transformations to reshape the maps, to easily enrich the limited number
of scenes. Additionally, we employ a rule-based model to generate trajectories
based on augmented scenes, thus enlarging the set of trajectories beyond the
collected real ones. To foster the learning of general representations within this
augmented dataset, we comprehensively explore the different pre-training
strategies, including extending the concept of a Masked AutoEncoder (MAE) for
trajectory forecasting. Extensive experiments demonstrate the effectiveness of
our data expansion and pre-training strategies, which outperform the baseline
prediction model by large margins, e.g. 5.04%, 3.84% and 8.30% in terms of
, and
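As an illustration of the map augmentation idea described above, the following is a minimal sketch of applying a vector transformation (rotation, scaling, translation) to lane polylines. The function names, the choice of transformation, and the sample lane coordinates are assumptions for illustration; the abstract does not specify which transformations the authors use.

```python
import math


def transform_polyline(points, angle_rad=0.0, scale=1.0, shift=(0.0, 0.0)):
    """Rotate, scale, then translate a polyline given as (x, y) tuples."""
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    out = []
    for x, y in points:
        xr = scale * (cos_a * x - sin_a * y) + shift[0]
        yr = scale * (sin_a * x + cos_a * y) + shift[1]
        out.append((xr, yr))
    return out


def augment_map(lanes, angle_rad=0.0, scale=1.0, shift=(0.0, 0.0)):
    """Apply the same transformation to every lane polyline in a map."""
    return [transform_polyline(lane, angle_rad, scale, shift) for lane in lanes]


# Hypothetical two-lane straight road (coordinates in metres).
lanes = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 3.5), (10.0, 3.5)]]

# Rotate the scene by 90 degrees and shift it 1 m along x.
augmented = augment_map(lanes, angle_rad=math.pi / 2, shift=(1.0, 0.0))
```

Because every polyline in a scene receives the same transformation, the relative layout of the lanes is preserved while the scene as a whole changes.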
Q-SLAM: Quadric Representations for Monocular SLAM
Monocular SLAM has long grappled with the challenge of accurately modeling 3D
geometries. Recent advances in Neural Radiance Fields (NeRF)-based monocular
SLAM have shown promise, yet these methods typically focus on novel view
synthesis rather than precise 3D geometry modeling. This focus results in a
significant disconnect between NeRF applications, i.e., novel-view synthesis
and the requirements of SLAM. We identify that the gap results from the
volumetric representations used in NeRF, which are often dense and noisy. In
this study, we propose a novel approach that reimagines volumetric
representations through the lens of quadric forms. We posit that most scene
components can be effectively represented as quadric planes. Leveraging this
assumption, we reshape volumetric representations comprising millions of cubes
into several quadric planes, which leads to more accurate and efficient modeling of
3D scenes in SLAM contexts. Our method involves two key steps: First, we use
the quadric assumption to enhance coarse depth estimations obtained from
tracking modules, e.g., Droid-SLAM. This step alone significantly improves
depth estimation accuracy. Second, in the subsequent mapping phase, we diverge
from previous NeRF-based SLAM methods that distribute sampling points across
the entire volume space. Instead, we concentrate sampling points around quadric
planes and aggregate them using a novel quadric-decomposed Transformer.
Additionally, we introduce an end-to-end joint optimization strategy that
synchronizes pose estimation with 3D reconstruction
LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving
Existing learning-based autonomous driving (AD) systems face challenges in
comprehending high-level information, generalizing to rare events, and
providing interpretability. To address these problems, this work employs Large
Language Models (LLMs) as a decision-making component for complex AD scenarios
that require human commonsense understanding. We devise cognitive pathways to
enable comprehensive reasoning with LLMs, and develop algorithms for
translating LLM decisions into actionable driving commands. Through this
approach, LLM decisions are seamlessly integrated with low-level controllers by
guided parameter matrix adaptation. Extensive experiments demonstrate that our
proposed method not only consistently surpasses baseline approaches in
single-vehicle tasks, but also helps handle complex driving behaviors, even
multi-vehicle coordination, thanks to the commonsense reasoning capabilities of
LLMs. This paper presents an initial step toward leveraging LLMs as effective
decision-makers for intricate AD scenarios in terms of safety, efficiency,
generalizability, and interoperability. We aspire for it to serve as
inspiration for future research in this field. Project page:
https://sites.google.com/view/llm-mp
Holistic Strategies Lead to Enhanced Efficiency and Stability of Hybrid Chemical Vapor Deposition Based Perovskite Solar Cells and Modules
Hybrid chemical vapor deposition (HCVD) is a promising method for the up-scalable fabrication of perovskite solar cells/modules (PSCs/PSMs). However, the efficiency of HCVD-based perovskite solar cells still lags behind that of solution-processed PSCs/PSMs. In this work, the oxygen loss of the SnO2 electron transport layer in the HCVD process and its negative impact on solar cell device performance are revealed. As a countermeasure, potassium sulfamate (H2KNO3S) is introduced as a passivation layer to both mitigate the oxygen loss of SnO2 and passivate the uncoordinated Pb2+ in the perovskite film. In parallel, N-methylpyrrolidone (NMP) is used as the solvent to dissolve PbI2 by forming the intermediate phase PbI2•NMP, which can greatly lower the energy barrier for perovskite nucleation in the HCVD process. A perovskite seed is employed to further modulate the kinetics of perovskite crystal growth and improve the grain size. The resultant solar cells yield a champion power conversion efficiency (PCE) of 21.98% (0.09 cm2) with a stable output performance of 21.15%, and the PCEs of the mini-modules are 16.16% (22.4 cm2, stable output performance of 14.72%) and 12.12% (91.8 cm2). Furthermore, the unencapsulated small-area device shows outstanding operational stability, with a T80 lifetime exceeding 4000 h.
Constructing Heterostructure through Bidentate Coordination toward Operationally Stable Inverted Perovskite Solar Cells
It has been reported that one of the influencing factors leading to stability issues in iodine-containing perovskite solar cells is the iodine loss from the perovskite layer. Herein, bidentate coordination is used with undercoordinated I− of the perovskite surface to construct the stable perovskite-based heterostructure. This strong halogen bonding effectively inhibits interfacial migration of I− into functional layers such as C60 and Ag. Moreover, passivation of the undercoordinated I− suppresses the release of I2 and further delays the formation of voids at the perovskite surface. The resulting inverted perovskite solar cell exhibits a power conversion efficiency of 22.59% and the unencapsulated device maintains 96.15% of its initial value after continuous operation for 500 h under illumination.
Graphene‐Like Conjugated Molecule as Hole‐Selective Contact for Operationally Stable Inverted Perovskite Solar Cells and Modules
Further enhancing the operational lifetime of inverted-structure perovskite solar cells (PSCs) is crucial for their commercialization, and the design of hole-selective contacts at the illumination side plays a key role in operational stability. In this work, the self-anchoring benzo[rst]pentaphene (SA-BPP) is developed as a new type of hole-selective contact toward long-term operationally stable inverted PSCs. The SA-BPP molecule, with a graphene-like conjugated structure, shows higher photostability and mobility than the frequently used triphenylamine and carbazole-based hole-selective molecules. In addition, the anchoring groups of SA-BPP promote the formation of a large-scale uniform hole contact on the ITO substrate and efficiently passivate the perovskite absorbers. Benefiting from these merits, champion efficiencies of 22.03% for the small-sized cells and 17.08% for 5 × 5 cm2 solar modules on an aperture area of 22.4 cm2 are achieved based on this SA-BPP contact. The SA-BPP-based device also exhibits promising operational stability, with an efficiency retention of 87.4% after 2000 h of continuous operation at the maximum power point under simulated 1-sun illumination, which indicates an estimated T80 lifetime of 3175 h. This novel design concept of hole-selective contacts provides a promising strategy for further improving PSC stability.
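The estimated T80 lifetime of 3175 h quoted above is consistent with a simple linear extrapolation of the reported 87.4% retention after 2000 h. The abstract does not say how the estimate was obtained, so the linear-decay model in this minimal sketch is an assumption, and the function name `linear_t80` is hypothetical.

```python
def linear_t80(retention_pct, hours, threshold_pct=80.0):
    """Extrapolate the time at which efficiency falls to threshold_pct.

    Assumes efficiency decays linearly from 100% at t = 0 to
    retention_pct at t = hours (an assumption, not the authors' stated method).
    """
    if retention_pct >= 100.0:
        raise ValueError("no measurable loss, so T80 cannot be extrapolated")
    loss_per_hour = (100.0 - retention_pct) / hours
    return (100.0 - threshold_pct) / loss_per_hour


# Values taken from the abstract: 87.4% retention after 2000 h.
t80 = linear_t80(87.4, 2000.0)
print(round(t80))  # 3175
```

In other words, a loss of 12.6 percentage points over 2000 h implies a loss of 20 percentage points after roughly 2000 × 20 / 12.6 ≈ 3175 h.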
Elimination of light-induced degradation at the nickel oxide-perovskite heterojunction by aprotic sulfonium layers towards long-term operationally stable inverted perovskite solar cells
Nickel oxide (NiOx) is a promising hole-selective contact to produce efficient inverted p-i-n structured perovskite solar cells (PSCs) due to its high carrier mobility and high transparency. However, the light-induced degradation of the NiOx–perovskite heterojunction is the main factor limiting its long-term operational lifetime. In this study, we used the time-resolved mass spectrometry technique to clarify the degradation mechanism of the NiOx-formamidinium–methylammonium iodide perovskite (a common composition for high-performance PSCs) heterojunction under operational conditions, and observed (1) oxidation of iodide and generation of free protons under 1-sun illumination, (2) formation of volatile hydrogen cyanide, methyl iodide, and ammonia at elevated temperatures, and (3) a condensation reaction between the organic components under a high vapor pressure. To eliminate these multi-step photochemical reactions, we constructed an aprotic trimethylsulfonium bromide (TMSBr) buffer layer at the NiOx/perovskite interface, which enables excellent photo-thermal stability, a matched lattice parameter with the perovskite crystal, and robust trap-passivation ability. Inverted PSCs stabilized with the TMSBr buffer layer reached a maximum efficiency of 22.1% and retained 82.8% of the initial value after continuous operation for 2000 hours under AM1.5G light illumination, which translates into a T80 lifetime of 2310 hours that is among the highest operational lifetimes for NiOx-based PSCs
Open X-Embodiment: Robotic learning datasets and RT-X models
Large, high-capacity models trained on diverse datasets have shown remarkable success in efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train a "generalist" X-robot policy that can be adapted efficiently to new robots, tasks, and environments? In this paper, we provide datasets in standardized data formats and models to make it possible to explore this possibility in the context of robotic manipulation, alongside experimental results that provide an example of effective X-robot policies. We assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms. The project website is robotics-transformer-x.github.io
Design and implementation of Peking Opera action scoring system based on human skeleton information
At present, most preservation records of Peking Opera remain in the form of video and text, and the degree of digitalization lags far behind the current level of science and technology. As a result, this intangible cultural heritage cannot be fully displayed and Peking Opera’s value is weakened. Adopting advanced motion capture technology is therefore of great significance to the protection and inheritance of Peking Opera. We use optical motion capture equipment to record the movement information of Peking Opera actors, then store the human skeleton information in a specific file format. The hierarchical human skeleton model is then analysed, and the final score is obtained by comparing the sequences of skeleton information for the reference action and the training action using an improved DTW algorithm. We have implemented a graphical interface for the system, in which the trainer can easily select action segments to practise or select a specific body part for targeted training. This paper introduces the overall design framework of our Peking Opera action scoring system, including the collection of action information, the implementation of the scoring algorithm and the design of the software interface
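To make the scoring step above more concrete, the following is a minimal sketch of classic dynamic time warping (DTW) between two sequences of skeleton frames. It is the textbook algorithm, not the improved variant the paper refers to, whose details the abstract does not give. The frame layout (a flat tuple of joint coordinates), the `action_score` mapping to a 0-100 scale, and the `tolerance` parameter are all assumptions for illustration.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two frames of flattened joint coordinates."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(ref, test):
    """Classic DTW: minimum cumulative frame distance over all alignments."""
    n, m = len(ref), len(test)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(ref[i - 1], test[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def action_score(ref, test, tolerance=1.0):
    """Hypothetical 0-100 score: 100 for a perfect match, lower as cost grows.

    Both sequences are assumed to be non-empty.
    """
    avg_cost = dtw_distance(ref, test) / max(len(ref), len(test))
    return 100.0 / (1.0 + avg_cost / tolerance)


# One joint tracked in 2D, for brevity.
reference = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
slower = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
offset = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]

score_slower = action_score(reference, slower)
score_offset = action_score(reference, offset)
```

The same movement performed at a slower pace still aligns with zero cost, which is why DTW is a natural fit for comparing a trainee's action against a reference recorded at a different speed.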