
    Open X-Embodiment: Robotic Learning Datasets and RT-X Models

    Large, high-capacity models trained on diverse datasets have shown remarkable success in efficiently tackling downstream applications. In domains from NLP to computer vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train a "generalist" X-robot policy that can be adapted efficiently to new robots, tasks, and environments? In this paper, we provide datasets in standardized data formats and models to make it possible to explore this possibility in the context of robotic manipulation, alongside experimental results that provide an example of effective X-robot policies. We assemble a dataset from 22 different robots, collected through a collaboration between 21 institutions, demonstrating 527 skills (160,266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms. The project website is robotics-transformer-x.github.io.
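The cross-embodiment idea above hinges on a standardized data format shared across robots. The following minimal sketch (with hypothetical field names and episodes; the actual Open X-Embodiment data is distributed in the RLDS episode format) shows how episodes from robots with different native schemas could be mapped into one shared (observation, action, instruction) representation:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Step:
    """One timestep in a standardized cross-embodiment episode."""
    observation: List[float]   # e.g. flattened camera/state features
    action: List[float]        # action in the robot's native action space
    instruction: str           # natural-language task description

def standardize_episode(raw_steps, obs_key, act_key, instruction):
    """Map one robot's native episode format into the shared schema.

    `obs_key`/`act_key` name the per-robot fields; a real pipeline would
    also resample control frequencies and normalize action spaces.
    """
    return [Step(observation=list(s[obs_key]),
                 action=list(s[act_key]),
                 instruction=instruction)
            for s in raw_steps]

# Two robots with different native field names feed one shared dataset.
robot_a = [{"rgb": [0.1, 0.2], "ee_delta": [0.0, 0.1]}]
robot_b = [{"image": [0.3], "joint_vel": [0.5, -0.5]}]
shared = (standardize_episode(robot_a, "rgb", "ee_delta", "pick up the cup")
          + standardize_episode(robot_b, "image", "joint_vel", "open the drawer"))
```

A generalist policy can then be trained on `shared` without per-robot data-loading code, which is the practical point of the standardized formats the paper releases.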

    Shuffling of Promoters for Multiple Genes To Optimize Xylose Fermentation in an Engineered Saccharomyces cerevisiae Strain

    We describe here a useful metabolic engineering tool, multiple-gene-promoter shuffling (MGPS), for optimizing the expression levels of multiple genes. The method approaches an optimal overexpression level by fusing promoters of various strengths to genes of interest in a particular pathway. These promoters are selected based on the expression levels of the native genes under the same physiological conditions intended for the application. MGPS was implemented for yeast xylose fermentation by shuffling the promoters of GND2 and HXK2 with the genes for transaldolase (TAL1), transketolase (TKL1), and pyruvate kinase (PYK1) in the Saccharomyces cerevisiae strain FPL-YSX3. This host strain carries integrated xylose-metabolizing genes, including xylose reductase, xylitol dehydrogenase, and xylulose kinase. The optimal expression levels for TAL1, TKL1, and PYK1 were identified by analysis of volumetric ethanol production by transformed cells. We found the optimal combination for ethanol production to be GND2-TAL1, HXK2-TKL1, and HXK2-PYK1. The MGPS method could easily be adapted to optimize the expression of genes for industrial fermentation in other eukaryotic and prokaryotic organisms.
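The promoter-gene fusion search can be pictured as a small combinatorial optimization: every gene receives one promoter, and each assignment is scored by a fermentation readout. The sketch below uses made-up promoter strengths and per-gene optima chosen purely so the toy score recovers the combination reported in the abstract; the real selection was done experimentally by measuring ethanol production.

```python
from itertools import product

promoters = ["GND2", "HXK2"]        # promoters of differing strength
genes = ["TAL1", "TKL1", "PYK1"]    # pathway genes to tune

# Hypothetical relative promoter strengths (illustrative only).
strength = {"GND2": 0.4, "HXK2": 1.0}

def ethanol_score(assignment,
                  optimum={"TAL1": 0.4, "TKL1": 1.0, "PYK1": 1.0}):
    """Toy fitness: penalize deviation of each gene's expression level
    (its promoter's strength) from a hypothetical per-gene optimum."""
    return -sum(abs(strength[p] - optimum[g]) for g, p in assignment.items())

# Enumerate all promoter-to-gene assignments (2 promoters ^ 3 genes = 8).
combos = [dict(zip(genes, choice))
          for choice in product(promoters, repeat=len(genes))]
best = max(combos, key=ethanol_score)
# best → {'TAL1': 'GND2', 'TKL1': 'HXK2', 'PYK1': 'HXK2'}
```

With three genes and two promoters the space is tiny; for larger pathways the same enumeration grows exponentially, which is why MGPS screens fusion libraries rather than testing constructs one by one.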

    Erosion Gully Networks Extraction Based on InSAR Refined Digital Elevation Model and Relative Elevation Algorithm—A Case Study in Huangfuchuan Basin, Northern Loess Plateau, China

    Time-effective mapping of erosion gullies is crucial for monitoring and for early detection of developing erosion. However, current methods struggle to extract large-scale erosion gully networks rapidly due to limitations in data availability and computational complexity. This study developed a rapid method for extracting erosion gully networks by integrating interferometric synthetic aperture radar (InSAR) with the relative elevation algorithm (REA) in the Huangfuchuan Basin, a case basin in the northern Loess Plateau, China. Validation in the study area demonstrated that the proposed method achieved an F1 score of 81.94%, a 9.77% improvement over the reference ASTER GDEM. The method successfully detected the small reliefs of erosion gullies using the InSAR-refined DEM. Extraction accuracy varied with the characteristics of gullies in different locations: the F1 score correlated positively with gully depth (R² = 0.62), while fragmented gully heads were more likely to be missed due to the resolution effect. The extraction results provided insights into the erosion gully networks of the study area. Approximately 28,000 gullies were identified, exhibiting pinnate and trellis patterns, and most had notable intersecting angles exceeding 60°. The basin's average gully depth was 64 m, with the deepest gully reaching 140 m. Surface fragmentation indicated moderate erosive activity, with the southeastern loess region showing more severe erosion than the Pisha sandstone-dominated central and northwestern regions. The method offers a rapid approach to mapping gullies, streamlining the erosion gully extraction workflow and enabling efficient, targeted interventions for erosion control. Its practical applicability and reliance on open-source data make it accessible for broader application in similar regions facing erosion challenges.
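The core of a relative-elevation approach can be sketched numerically: subtract a local reference surface from the DEM and flag cells incised below a depth threshold. The snippet below is a simplified illustration only; it uses a moving-window maximum as the reference surface (an assumption for compactness), whereas the paper's REA and thresholding are more involved.

```python
import numpy as np

def relative_elevation(dem, window=3):
    """DEM minus a local reference surface.

    The reference here is a moving-window maximum, so incised features
    (gully channels) show up as strongly negative relative elevations.
    """
    pad = window // 2
    padded = np.pad(dem, pad, mode="edge")
    ref = np.empty_like(dem, dtype=float)
    h, w = dem.shape
    for i in range(h):
        for j in range(w):
            ref[i, j] = padded[i:i + window, j:j + window].max()
    return dem - ref

def gully_mask(dem, depth_threshold=5.0):
    """Flag cells incised deeper than `depth_threshold` below the reference."""
    return relative_elevation(dem) < -depth_threshold

# Synthetic 5x5 DEM: a flat 100 m surface with a 10 m deep channel.
dem = np.full((5, 5), 100.0)
dem[2, 1:4] = 90.0
mask = gully_mask(dem, depth_threshold=5.0)   # True along the channel
```

Extracting the connected components of such a mask (and skeletonizing them) is what turns per-cell incision flags into the gully *networks* the study maps.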

    Open-Vocabulary 3D Detection via Image-level Class and Debiased Cross-modal Contrastive Learning

    Current point-cloud detection methods struggle to detect open-vocabulary objects in the real world because of their limited generalization capability. Moreover, collecting and fully annotating a point-cloud detection dataset with many object classes is extremely laborious and expensive, which limits the class coverage of existing point-cloud datasets and hinders models from learning the general representations needed for open-vocabulary point-cloud detection. To our knowledge, we are the first to study the problem of open-vocabulary 3D point-cloud detection. Instead of seeking a fully labeled point-cloud dataset, we resort to ImageNet1K to broaden the vocabulary of the point-cloud detector. We propose OV-3DETIC, an Open-Vocabulary 3D DETector using Image-level Class supervision. Specifically, we exploit two modalities, the image modality for recognition and the point-cloud modality for localization, to generate pseudo labels for unseen classes. We then propose a novel debiased cross-modal contrastive learning method to transfer knowledge from the image modality to the point-cloud modality during training. Without hurting inference latency, OV-3DETIC makes the point-cloud detector capable of open-vocabulary detection. Extensive experiments demonstrate that OV-3DETIC achieves mAP improvements (absolute value) of at least 10.77% and 9.56% over a wide range of baselines on the SUN RGB-D and ScanNet datasets, respectively. In addition, we conduct thorough experiments to shed light on why OV-3DETIC works.
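Cross-modal contrastive learning of the kind described here is typically built on an InfoNCE-style objective between paired embeddings from the two modalities. The following NumPy sketch is a generic symmetric InfoNCE, not the paper's exact debiased formulation; the embeddings are random placeholders.

```python
import numpy as np

def infonce(img_emb, pc_emb, temperature=0.1):
    """Symmetric InfoNCE between paired image / point-cloud embeddings.

    Row i of each matrix is treated as a positive pair, every other row
    as a negative; a debiased variant would reweight those negatives.
    """
    def norm(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    logits = norm(img_emb) @ norm(pc_emb).T / temperature  # (N, N) cosine sims
    n = len(logits)

    def cross_entropy(l):
        l = l - l.max(axis=1, keepdims=True)               # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[np.arange(n), np.arange(n)].mean()    # diagonal = positives

    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))

rng = np.random.default_rng(0)
img = rng.normal(size=(4, 8))               # 4 image embeddings
pc = img + 0.05 * rng.normal(size=(4, 8))   # paired point-cloud embeddings
loss = infonce(img, pc)                     # small: pairs are well aligned
```

Minimizing such a loss pulls each point-cloud embedding toward its paired image embedding, which is the mechanism by which image-level knowledge is transferred to the point-cloud branch.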

    Open-Vocabulary Point-Cloud Object Detection without 3D Annotation

    The goal of open-vocabulary detection is to identify novel objects from arbitrary textual descriptions. In this paper, we address open-vocabulary 3D point-cloud detection with a divide-and-conquer strategy that involves: 1) developing a point-cloud detector that learns a general representation for localizing various objects, and 2) connecting textual and point-cloud representations so that the detector can classify novel object categories from text prompts. Specifically, we resort to rich image pre-trained models, under whose supervision the point-cloud detector learns to localize objects using 2D bounding boxes predicted by 2D pre-trained detectors. Moreover, we propose a novel debiased triplet cross-modal contrastive learning method to connect the image, point-cloud, and text modalities, enabling the point-cloud detector to benefit from vision-language pre-trained models, i.e., CLIP. This novel use of image and vision-language pre-trained models allows open-vocabulary 3D object detection without the need for 3D annotations. Experiments demonstrate that the proposed method improves by at least 3.03 points and 7.47 points over a wide range of baselines on the ScanNet and SUN RGB-D datasets, respectively. Furthermore, we provide a comprehensive analysis to explain why our approach works.
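Once point-cloud and text representations share an embedding space, the "classify novel categories from text prompts" step reduces to nearest-text-prompt matching, as in CLIP-style zero-shot classification. The toy sketch below uses hypothetical two-dimensional prototype embeddings; in the real system the shared space is the product of the triplet cross-modal contrastive training.

```python
import numpy as np

def classify_open_vocab(pc_features, text_features, class_names):
    """Assign each localized point-cloud object its closest text prompt.

    Both feature sets are assumed to live in one shared embedding space;
    classification is argmax cosine similarity, so the category list can
    be extended at inference time simply by embedding new prompts.
    """
    def norm(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    sims = norm(pc_features) @ norm(text_features).T   # (objects, classes)
    return [class_names[i] for i in sims.argmax(axis=1)]

# Toy shared space: each detected object sits near one class prototype.
text = np.array([[1.0, 0.0], [0.0, 1.0]])      # "chair", "table" prototypes
objects = np.array([[0.9, 0.1], [0.2, 0.8]])   # two localized detections
labels = classify_open_vocab(objects, text, ["chair", "table"])
# labels → ["chair", "table"]
```

Because the class list is just a list of embedded prompts, adding a novel category requires no retraining and no 3D annotation, which is the payoff of the divide-and-conquer design.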