32 research outputs found

    Pre-training on Synthetic Driving Data for Trajectory Prediction

    Accumulating substantial volumes of real-world driving data proves pivotal in the realm of trajectory forecasting for autonomous driving. Given the heavy reliance of current trajectory forecasting models on data-driven methodologies, we aim to tackle the challenge of learning general trajectory forecasting representations under limited data availability. We propose to augment both HD maps and trajectories and apply pre-training strategies on top of them. Specifically, we take advantage of graph representations of HD-map and apply vector transformations to reshape the maps, to easily enrich the limited number of scenes. Additionally, we employ a rule-based model to generate trajectories based on augmented scenes; thus enlarging the trajectories beyond the collected real ones. To foster the learning of general representations within this augmented dataset, we comprehensively explore the different pre-training strategies, including extending the concept of a Masked AutoEncoder (MAE) for trajectory forecasting. Extensive experiments demonstrate the effectiveness of our data expansion and pre-training strategies, which outperform the baseline prediction model by large margins, e.g. 5.04%, 3.84% and 8.30% in terms of $\mathrm{MR}_6$, $\mathrm{minADE}_6$ and $\mathrm{minFDE}_6$.
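
    The three metrics named in this abstract can be illustrated with a minimal sketch. It assumes the common benchmark convention (K candidate trajectories per agent, errors measured in metres) rather than the authors' exact evaluation code. The function names and the 2.0 m miss threshold are assumptions, and some benchmarks instead report the ADE of the candidate with the best endpoint.

```python
import math

def displacement_errors(pred, gt):
    """Per-timestep Euclidean distance between one candidate and the ground truth."""
    return [math.dist(p, g) for p, g in zip(pred, gt)]

def multimodal_metrics(preds, gt, miss_threshold=2.0):
    """Compute minADE, minFDE and a miss flag over K candidate trajectories.

    preds: list of K trajectories, each a list of (x, y) points
    gt:    ground-truth trajectory, a list of (x, y) points
    miss_threshold: assumed endpoint tolerance in metres (hypothetical default)
    """
    ades, fdes = [], []
    for pred in preds:
        errs = displacement_errors(pred, gt)
        ades.append(sum(errs) / len(errs))  # average displacement error
        fdes.append(errs[-1])               # final displacement error
    min_fde = min(fdes)
    return {
        "minADE": min(ades),
        "minFDE": min_fde,
        # A sample counts as a miss when even the best endpoint is too far away;
        # the miss rate (MR) is the mean of this flag over a dataset.
        "miss": min_fde > miss_threshold,
    }
```

    With K = 6 candidates per agent, averaging these values over a dataset gives the quantities the abstract calls minADE_6, minFDE_6 and MR_6.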

    Q-SLAM: Quadric Representations for Monocular SLAM

    Monocular SLAM has long grappled with the challenge of accurately modeling 3D geometries. Recent advances in Neural Radiance Fields (NeRF)-based monocular SLAM have shown promise, yet these methods typically focus on novel view synthesis rather than precise 3D geometry modeling. This focus results in a significant disconnect between NeRF applications, i.e., novel-view synthesis, and the requirements of SLAM. We identify that the gap results from the volumetric representations used in NeRF, which are often dense and noisy. In this study, we propose a novel approach that reimagines volumetric representations through the lens of quadric forms. We posit that most scene components can be effectively represented as quadric planes. Leveraging this assumption, we replace volumetric representations comprising millions of cubes with several quadric planes, which leads to more accurate and efficient modeling of 3D scenes in SLAM contexts. Our method involves two key steps: First, we use the quadric assumption to enhance coarse depth estimations obtained from tracking modules, e.g., Droid-SLAM. This step alone significantly improves depth estimation accuracy. Second, in the subsequent mapping phase, we diverge from previous NeRF-based SLAM methods that distribute sampling points across the entire volume space. Instead, we concentrate sampling points around quadric planes and aggregate them using a novel quadric-decomposed Transformer. Additionally, we introduce an end-to-end joint optimization strategy that synchronizes pose estimation with 3D reconstruction.

    LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving

    Existing learning-based autonomous driving (AD) systems face challenges in comprehending high-level information, generalizing to rare events, and providing interpretability. To address these problems, this work employs Large Language Models (LLMs) as a decision-making component for complex AD scenarios that require human commonsense understanding. We devise cognitive pathways to enable comprehensive reasoning with LLMs, and develop algorithms for translating LLM decisions into actionable driving commands. Through this approach, LLM decisions are seamlessly integrated with low-level controllers by guided parameter matrix adaptation. Extensive experiments demonstrate that our proposed method not only consistently surpasses baseline approaches in single-vehicle tasks, but also helps handle complex driving behaviors, even multi-vehicle coordination, thanks to the commonsense reasoning capabilities of LLMs. This paper presents an initial step toward leveraging LLMs as effective decision-makers for intricate AD scenarios in terms of safety, efficiency, generalizability, and interoperability. We aspire for it to serve as inspiration for future research in this field. Project page: https://sites.google.com/view/llm-mp

    Holistic Strategies Lead to Enhanced Efficiency and Stability of Hybrid Chemical Vapor Deposition Based Perovskite Solar Cells and Modules

    Hybrid chemical vapor deposition (HCVD) is a promising method for the up-scalable fabrication of perovskite solar cells/modules (PSCs/PSMs). However, the efficiency of the HCVD-based perovskite solar cells still lags behind the solution-processed PSCs/PSMs. In this work, the oxygen loss of the electron transport layer of SnO2 in the HCVD process and its negative impact on solar cell device performance are revealed. As the counter-measure, potassium sulfamate (H2KNO3S) is introduced as the passivation layer to both mitigate the oxygen loss issue of SnO2 and passivate the uncoordinated Pb2+ in the perovskite film. In parallel, N-methylpyrrolidone (NMP) is used as the solvent to dissolve PbI2 by forming the intermediate phase of PbI2•NMP, which can greatly lower the energy barrier for perovskite nucleation in the HCVD process. The perovskite seed is employed to further modulate the kinetics of perovskite crystal growth and improve the grain size. The resultant solar cells yield a champion power conversion efficiency (PCE) of 21.98% (0.09 cm2) with a stable output performance of 21.15%, and the PCEs of the mini-modules are 16.16% (22.4 cm2, stable output performance of 14.72%) and 12.12% (91.8 cm2). Furthermore, the unencapsulated small area device shows an outstanding operational stability with a T80 lifetime exceeding 4000 h.

    Constructing Heterostructure through Bidentate Coordination toward Operationally Stable Inverted Perovskite Solar Cells

    It has been reported that one of the influencing factors leading to stability issues in iodine-containing perovskite solar cells is the iodine loss from the perovskite layer. Herein, bidentate coordination is used with undercoordinated I− of the perovskite surface to construct the stable perovskite-based heterostructure. This strong halogen bonding effectively inhibits interfacial migration of I− into functional layers such as C60 and Ag. Moreover, passivation of the undercoordinated I− suppresses the release of I2 and further delays the formation of voids at the perovskite surface. The resulting inverted perovskite solar cell exhibits a power conversion efficiency of 22.59% and the unencapsulated device maintains 96.15% of its initial value after continuous operation for 500 h under illumination.

    Graphene‐Like Conjugated Molecule as Hole‐Selective Contact for Operationally Stable Inverted Perovskite Solar Cells and Modules

    Further enhancing the operational lifetime of inverted-structure perovskite solar cells (PSCs) is crucial for their commercialization, and the design of hole-selective contacts at the illumination side plays a key role in operational stability. In this work, the self-anchoring benzo[rst]pentaphene (SA-BPP) is developed as a new type of hole-selective contact toward long-term operationally stable inverted PSCs. The SA-BPP molecule with a graphene-like conjugated structure shows higher photostability and mobility than the frequently used triphenylamine and carbazole-based hole-selective molecules. Besides, the anchoring groups of SA-BPP promote the formation of a large-scale uniform hole contact on ITO substrate and efficiently passivate the perovskite absorbers. Benefiting from these merits, the champion efficiencies of 22.03% for the small-sized cells and 17.08% for 5 × 5 cm2 solar modules on an aperture area of 22.4 cm2 are achieved based on this SA-BPP contact. Also, the SA-BPP-based device exhibits promising operational stability, with an efficiency retention of 87.4% after 2000 h continuous operation at the maximum power point under simulated 1-sun illumination, which indicates an estimated T80 lifetime of 3175 h. This novel design concept of hole-selective contacts provides a promising strategy for further improving the PSC stability.
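
    The abstract does not say how the T80 estimate was obtained. As a worked check, a simple linear extrapolation of the reported retention (87.4% after 2000 h) reproduces the quoted figure. The function name and the linear-decay assumption are illustrative, not the authors' stated method.

```python
def linear_t80(retention, hours, threshold=0.80):
    """Extrapolate the time at which efficiency falls to `threshold` of its
    initial value, assuming (hypothetically) a constant loss rate.

    retention: fraction of initial efficiency remaining after `hours`
    """
    if not 0.0 <= retention < 1.0:
        raise ValueError("retention must be in [0, 1) for a decaying device")
    loss_per_hour = (1.0 - retention) / hours
    return (1.0 - threshold) / loss_per_hour

# 12.6% lost in 2000 h, so a 20% loss takes 2000 * 20 / 12.6 hours.
print(round(linear_t80(0.874, 2000)))  # → 3175
```

    Real degradation curves are often not linear (for example, an initial burn-in followed by slower decay), so this is only a consistency check on the reported numbers.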

    Elimination of light-induced degradation at the nickel oxide-perovskite heterojunction by aprotic sulfonium layers towards long-term operationally stable inverted perovskite solar cells

    Nickel oxide (NiOx) is a promising hole-selective contact to produce efficient inverted p-i-n structured perovskite solar cells (PSCs) due to its high carrier mobility and high transparency. However, the light-induced degradation of the NiOx–perovskite heterojunction is the main factor limiting its long-term operational lifetime. In this study, we used the time-resolved mass spectrometry technique to clarify the degradation mechanism of the NiOx-formamidinium–methylammonium iodide perovskite (a common composition for high-performance PSCs) heterojunction under operational conditions, and observed (1) oxidation of iodide and generation of free protons under 1-sun illumination, (2) formation of volatile hydrogen cyanide, methyl iodide, and ammonia at elevated temperatures, and (3) a condensation reaction between the organic components under a high vapor pressure. To eliminate these multi-step photochemical reactions, we constructed an aprotic trimethylsulfonium bromide (TMSBr) buffer layer at the NiOx/perovskite interface, which enables excellent photo-thermal stability, a matched lattice parameter with the perovskite crystal, and robust trap-passivation ability. Inverted PSCs stabilized with the TMSBr buffer layer reached a maximum efficiency of 22.1% and retained 82.8% of the initial value after continuous operation for 2000 hours under AM1.5G light illumination, which translates into a T80 lifetime of 2310 hours that is among the highest operational lifetimes for NiOx-based PSCs.

    Open X-Embodiment: Robotic learning datasets and RT-X models

    Large, high-capacity models trained on diverse datasets have shown remarkable success in efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train a "generalist" X-robot policy that can be adapted efficiently to new robots, tasks, and environments? In this paper, we provide datasets in standardized data formats and models to make it possible to explore this possibility in the context of robotic manipulation, alongside experimental results that provide an example of effective X-robot policies. We assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms. The project website is robotics-transformer-x.github.io

    Design and implementation of Peking Opera action scoring system based on human skeleton information

    At present, most preservation records of Peking Opera exist only as video and text, and their degree of digitization lags far behind the current level of science and technology. As a result, this intangible cultural heritage cannot be fully displayed and Peking Opera's value is weakened. Adopting advanced motion capture technology is therefore of great significance to the protection and inheritance of Peking Opera. We use optical motion capture equipment to record the movement information of Peking Opera actors, then store the human skeleton information in a specific file format. We then analyse the hierarchical human action skeleton model and obtain the final score by comparing the sequences of skeleton information from the reference action and the training action with an improved DTW algorithm. We have implemented a graphical interface for the system, in which the trainer can easily select action segments to train or select a specific body part for targeted action training. This paper introduces the overall design framework of our Peking Opera action scoring system, including the collection of action information, the implementation of the scoring algorithm and the design of the software interface.
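
    The abstract does not describe its "improved DTW algorithm". As a rough illustration of the underlying idea, here is a minimal sketch of classic dynamic time warping between two sequences of pose vectors. The 0-100 score mapping and the `scale` parameter are hypothetical and are not taken from the paper.

```python
import math

def dtw_distance(ref, trial):
    """Classic DTW cost between two sequences of equal-dimension pose vectors."""
    n, m = len(ref), len(trial)
    # cost[i][j] = minimal accumulated cost aligning ref[:i] with trial[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(ref[i - 1], trial[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # trial frame repeated
                                 cost[i][j - 1],      # ref frame repeated
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def action_score(ref, trial, scale=1.0):
    """Hypothetical mapping from DTW cost to a 0-100 score (100 = identical)."""
    avg_cost = dtw_distance(ref, trial) / max(len(ref), len(trial))
    return 100.0 * math.exp(-avg_cost / scale)
```

    Because DTW lets frames repeat, a trainee who performs the reference action correctly but more slowly still aligns with zero cost, which is why it suits comparing actions performed at different tempos.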