36 research outputs found

    trans-4,5-Dihydroxy-1,3-bis(4-methoxyphenyl)imidazolidine-2-thione

    In the title compound, C17H18N2O4S, in which one of the N-(4-methoxyphenyl) fragments is disordered over two sets of sites, the five-membered ring exhibits a nearly half-chair conformation and the two hydroxyl groups lie on opposite sides of the five-membered ring. In the crystal, the molecules are linked into sheets parallel to (100) via O—H⋯O and O—H⋯S hydrogen bonds.

    Dimethyl 2,6-dimethyl-1,4-dihydropyridine-3,5-dicarboxylate

    In the crystal of the title compound, C11H15NO4, the molecules are linked into sheets by N—H⋯O and C—H⋯O hydrogen bonds. Within the molecule, the 1,4-dihydropyridine ring is essentially planar [r.m.s. deviation from the mean plane = 0.009 (3) Å], and the other non-H atoms are almost coplanar with it [r.m.s. deviation = 0.021 (3) Å]. This conformation is governed mainly by two intramolecular non-classical C—H⋯O interactions.

    trans-4,5-Dihydroxy-1,3-diphenylimidazolidine-2-thione

    In the title compound, C15H14N2O2S, the five-membered ring adopts an envelope conformation and the two hydroxy groups lie on opposite sides of the ring. The six-membered rings are oriented at a dihedral angle of 22.63 (3)°. In the crystal structure, intermolecular O—H⋯S and O—H⋯O hydrogen bonds link the molecules into a two-dimensional network.

    Pave the Way to Grasp Anything: Transferring Foundation Models for Universal Pick-Place Robots

    Improving the generalization capabilities of general-purpose robotic agents has long been a significant challenge actively pursued by the research community. Existing approaches often rely on collecting large-scale real-world robotic data, such as the RT-1 dataset. However, such approaches typically suffer from low efficiency, which limits their capability in open-domain scenarios with new objects and diverse backgrounds. In this paper, we propose a novel paradigm that effectively leverages language-grounded segmentation masks generated by state-of-the-art foundation models to address a wide range of pick-and-place robot manipulation tasks in everyday scenarios. By integrating the precise semantics and geometry conveyed by the masks into our multi-view policy model, our approach can perceive accurate object poses and enables sample-efficient learning. This design also facilitates effective generalization to grasping new objects whose shapes are similar to those observed during training. Our approach consists of two distinct steps. First, we introduce a series of foundation models to accurately ground natural-language demands across multiple tasks. Second, we develop a Multi-modal Multi-view Policy Model that incorporates inputs such as RGB images, semantic masks, and robot proprioception states to jointly predict precise and executable robot actions. Extensive real-world experiments conducted on a Franka Emika robot arm validate the effectiveness of the proposed paradigm. Real-world demos are available on YouTube (https://www.youtube.com/watch?v=1m9wNzfp_4E) and Bilibili (https://www.bilibili.com/video/BV178411Z7H2/).
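
    As a rough illustration of the policy described above, the following minimal PyTorch sketch fuses multi-view RGB images, language-grounded masks, and proprioception into a single action prediction. All module sizes, the fusion scheme, and the action parameterization are assumptions for illustration, not the authors' implementation.

        # Illustrative sketch only: a multi-modal, multi-view policy of the kind the
        # abstract describes. Sizes and the 8-D action (e.g. delta pose + gripper)
        # are assumed, not taken from the paper.
        import torch
        import torch.nn as nn

        class MultiViewMaskPolicy(nn.Module):
            def __init__(self, num_views=3, proprio_dim=8, embed_dim=256, action_dim=8):
                super().__init__()
                # Shared per-view encoder over RGB (3 ch) + language-grounded mask (1 ch).
                self.view_encoder = nn.Sequential(
                    nn.Conv2d(4, 32, 5, stride=2), nn.ReLU(),
                    nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                    nn.Linear(64, embed_dim),
                )
                self.proprio_encoder = nn.Linear(proprio_dim, embed_dim)
                # Fuse view tokens and a proprioception token with a small transformer.
                layer = nn.TransformerEncoderLayer(embed_dim, nhead=4, batch_first=True)
                self.fusion = nn.TransformerEncoder(layer, num_layers=2)
                self.action_head = nn.Linear(embed_dim, action_dim)

            def forward(self, rgb, mask, proprio):
                # rgb: (B, V, 3, H, W), mask: (B, V, 1, H, W), proprio: (B, proprio_dim)
                b, v = rgb.shape[:2]
                x = torch.cat([rgb, mask], dim=2).flatten(0, 1)       # (B*V, 4, H, W)
                view_tokens = self.view_encoder(x).view(b, v, -1)     # (B, V, D)
                tokens = torch.cat(
                    [view_tokens, self.proprio_encoder(proprio).unsqueeze(1)], dim=1)
                fused = self.fusion(tokens).mean(dim=1)
                return self.action_head(fused)                        # (B, action_dim)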

    AlphaBlock: Embodied Finetuning for Vision-Language Reasoning in Robot Manipulation

    We propose a novel framework for learning high-level cognitive capabilities in robot manipulation tasks, such as making a smiley face out of building blocks. These tasks often involve complex multi-step reasoning and present significant challenges because paired data connecting human instructions (e.g., making a smiley face) and robot actions (e.g., end-effector movements) are scarce. Existing approaches mitigate this challenge with an open-loop paradigm that decomposes high-level instructions into simple sub-task plans and executes them step by step using low-level control models. However, these approaches lack up-to-date observations during multi-step reasoning, leading to sub-optimal results. To address this issue, we propose automatically collecting a cognitive robot dataset with Large Language Models (LLMs). The resulting dataset, AlphaBlock, consists of 35 comprehensive high-level tasks with multi-step text plans and paired observation sequences. To enable efficient data acquisition, we employ carefully designed multi-round prompts that substantially reduce the need for human involvement. We further propose a closed-loop multi-modal embodied planning model that autoregressively generates plans while taking image observations as input. To facilitate effective learning, we leverage MiniGPT-4 with a frozen visual encoder and LLM, and fine-tune an additional vision adapter and Q-Former to enable the fine-grained spatial perception required for manipulation tasks. Experiments verify the superiority of our approach over existing open- and closed-loop methods, with success-rate improvements of 21.4% and 14.5% over ChatGPT- and GPT-4-based robot baselines, respectively. Real-world demos are available at https://www.youtube.com/watch?v=ayAzID1_qQk.
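
    To make the closed-loop idea concrete, here is a minimal, hypothetical Python sketch of a plan-observe-execute loop in which the planner re-reads the latest image observation before emitting each sub-task. The interfaces, stop token, and step budget are illustrative assumptions, not AlphaBlock's actual API.

        # Sketch of closed-loop planning: a fresh observation is fetched before every
        # planning step, in contrast to open-loop pipelines that plan once up front.
        from typing import Any, Callable, List

        def closed_loop_plan(
            instruction: str,
            get_observation: Callable[[], Any],                    # returns the latest camera image
            plan_next_step: Callable[[str, List[str], Any], str],  # e.g. a fine-tuned multi-modal planner
            execute_step: Callable[[str], None],                   # low-level controller for one sub-task
            max_steps: int = 20,
            stop_token: str = "<done>",
        ) -> List[str]:
            """Autoregressively generate and execute sub-task plans, re-observing after every step."""
            history: List[str] = []
            for _ in range(max_steps):
                observation = get_observation()        # closed loop: fresh observation each iteration
                step = plan_next_step(instruction, history, observation)
                if step.strip() == stop_token:         # planner signals that the task is complete
                    break
                execute_step(step)
                history.append(step)
            return history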

    SBS Content Detection for Modified Asphalt Using Deep Neural Network

    This study proposes a prediction model for accurately detecting the styrene-butadiene-styrene (SBS) content of modified asphalt using a deep neural network (DNN). Traditional methods for evaluating SBS content are inaccurate and cumbersome because their manual computations are error-prone. Feature data for the SBS content are derived from spectra obtained by Fourier-transform infrared spectroscopy. After the DNN is designed, the preprocessed feature data are used as training and testing data and fed into the network as a feature matrix. Comparative studies are conducted to verify the accuracy of the proposed model. The results show that the mean squared error decreases by 68% for the DNN with noise and dimensionality reduction. The correlation coefficients between the target values and the mean predicted values are 0.9978 and 0.9992 for the training and testing samples, respectively, indicating remarkable accuracy and applicability after training. Compared with the standard-curve method and the random forest method, the DNN achieves a precision greater than 98% under the same test conditions, the best predictive performance of the three.
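
    For illustration, a minimal PyTorch sketch of the kind of feed-forward regressor described above is shown below: FTIR-derived feature vectors in, predicted SBS content out. The input dimensionality, layer sizes, and training details are assumptions, not the study's actual configuration.

        # Minimal sketch: spectral feature vector -> predicted SBS content (%).
        # All sizes and the toy data are placeholders, not the paper's setup.
        import torch
        import torch.nn as nn

        class SBSContentDNN(nn.Module):
            def __init__(self, n_features=128):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Linear(n_features, 64), nn.ReLU(),
                    nn.Linear(64, 32), nn.ReLU(),
                    nn.Linear(32, 1),            # predicted SBS content
                )

            def forward(self, x):
                return self.net(x).squeeze(-1)

        # Toy training loop on a feature matrix (one row per spectrum).
        model = SBSContentDNN()
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
        loss_fn = nn.MSELoss()
        features = torch.randn(256, 128)          # placeholder for preprocessed FTIR features
        targets = torch.rand(256) * 6.0           # placeholder SBS content labels
        for _ in range(100):
            optimizer.zero_grad()
            loss = loss_fn(model(features), targets)
            loss.backward()
            optimizer.step()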

    CoMAE: Single Model Hybrid Pre-training on Small-Scale RGB-D Datasets

    Current RGB-D scene recognition approaches often train two standalone backbones for the RGB and depth modalities with the same Places or ImageNet pre-training. However, the pre-trained depth network is then still biased toward RGB-based models, which may result in a suboptimal solution. In this paper, we present a single-model self-supervised hybrid pre-training framework for the RGB and depth modalities, termed CoMAE. CoMAE uses a curriculum learning strategy to unify two popular self-supervised representation learning algorithms: contrastive learning and masked image modeling. Specifically, we first build a patch-level alignment task to pre-train a single encoder shared by the two modalities via cross-modal contrastive learning. The pre-trained contrastive encoder is then passed to a multi-modal masked autoencoder to capture finer contextual features from a generative perspective. In addition, our single-model design, which requires no fusion module, is flexible and generalizes robustly to unimodal scenarios in both training and testing. Extensive experiments on the SUN RGB-D and NYUDv2 datasets demonstrate the effectiveness of CoMAE for RGB and depth representation learning. Our results also reveal that CoMAE is a data-efficient representation learner: although we pre-train only on the small-scale, unlabeled training set, the CoMAE pre-trained models remain competitive with state-of-the-art methods that additionally use large-scale, supervised RGB pre-training. Code will be released at https://github.com/MCG-NJU/CoMAE.
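
    The first curriculum stage (patch-level cross-modal contrastive alignment with a shared encoder) might look roughly like the following PyTorch sketch; the modality stems, patch size, embedding dimension, and temperature are assumptions for illustration, not CoMAE's actual implementation.

        # Sketch: one encoder shared by RGB and depth; an InfoNCE loss pulls together
        # patch embeddings from the two modalities at the same spatial location.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class SharedPatchEncoder(nn.Module):
            def __init__(self, patch=16, dim=256):
                super().__init__()
                self.rgb_proj = nn.Conv2d(3, dim, patch, stride=patch)    # modality-specific stems
                self.depth_proj = nn.Conv2d(1, dim, patch, stride=patch)
                layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
                self.shared = nn.TransformerEncoder(layer, num_layers=2)  # shared across modalities

            def forward(self, x, modality):
                stem = self.rgb_proj if modality == "rgb" else self.depth_proj
                tokens = stem(x).flatten(2).transpose(1, 2)               # (B, N_patches, dim)
                return self.shared(tokens)

        def patch_contrastive_loss(rgb_tokens, depth_tokens, temperature=0.07):
            """InfoNCE over patches: an RGB patch should match the depth patch at the
            same location and repel patches at all other locations."""
            b, n, d = rgb_tokens.shape
            q = F.normalize(rgb_tokens.reshape(b * n, d), dim=-1)
            k = F.normalize(depth_tokens.reshape(b * n, d), dim=-1)
            logits = q @ k.t() / temperature                              # (B*N, B*N)
            labels = torch.arange(b * n, device=logits.device)
            return F.cross_entropy(logits, labels)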

    An Intelligent Vision System for Detecting Defects in Micro-Armatures for Smartphones

    Automatic vision inspection technology shows high potential for quality inspection and has drawn great interest in micro-armature manufacturing. Because inspection performed with the human eye lacks standardization and efficiency, an automatic defect detection process is needed. In this work, a vision system for inspecting defects in the micro-armatures used in smartphones was developed. It consists of two parts, a front-end module and a deep convolutional neural network (DCNN) module, each responsible for a different stage of the inspection: the front-end module runs first, and the DCNN module does not run if the front-end output is negative. To verify the system, an apparatus consisting of an objective table, a control panel, and a camera connected to a personal computer (PC) was used to simulate an industrial production setting. The results indicate that the developed vision system is capable of detecting defects in micro-armatures.
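
    A minimal sketch of this two-stage gating, with a placeholder front-end check and a small CNN classifier, is given below; the front-end criterion, network architecture, and class labels are illustrative assumptions rather than the paper's actual design.

        # Sketch of the two-stage pipeline: a cheap front-end check runs first, and the
        # DCNN classifier is invoked only when the front-end result is not negative.
        import torch
        import torch.nn as nn

        class DefectCNN(nn.Module):
            def __init__(self, num_classes=2):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
                )
                self.classifier = nn.Linear(32, num_classes)

            def forward(self, x):
                return self.classifier(self.features(x).flatten(1))

        def front_end_check(image: torch.Tensor, threshold: float = 0.05) -> bool:
            """Placeholder front-end test (e.g. 'is an armature present and roughly in
            place'); here simply: enough bright pixels in the image (3, H, W) in [0, 1]."""
            gray = image.mean(dim=0)
            return (gray > 0.5).float().mean().item() > threshold

        def inspect(image: torch.Tensor, cnn: DefectCNN) -> str:
            if not front_end_check(image):       # negative front-end result: skip the DCNN
                return "rejected by front-end"
            logits = cnn(image.unsqueeze(0))
            return "defect" if logits.argmax(dim=1).item() == 1 else "ok"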

    The Arabidopsis CALLOSE DEFECTIVE MICROSPORE1
