
    Multi3DRefer: Grounding Text Description to Multiple 3D Objects

    We introduce the task of localizing a flexible number of objects in real-world 3D scenes using natural language descriptions. Existing 3D visual grounding tasks focus on localizing a unique object given a text description. However, such a strict setting is unnatural, as localizing potentially multiple objects is a common need in real-world scenarios and robotic tasks (e.g., visual navigation and object rearrangement). To address this setting, we propose Multi3DRefer, generalizing the ScanRefer dataset and task. Our dataset contains 61,926 descriptions of 11,609 objects, where zero, one, or multiple target objects are referenced by each description. We also introduce a new evaluation metric and benchmark methods from prior work to enable further investigation of multi-modal 3D scene understanding. Furthermore, we develop a better baseline leveraging 2D features from CLIP by rendering object proposals online with contrastive learning, which outperforms the state of the art on the ScanRefer benchmark. Comment: ICCV 2023
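    The CLIP-based baseline described above can be illustrated with a minimal sketch: score each rendered object proposal against the text embedding by cosine similarity with a softmax temperature, InfoNCE-style. The function name, shapes, and temperature value here are illustrative assumptions, not the paper's actual implementation.

    ```python
    import numpy as np

    def contrastive_scores(text_emb, proposal_embs, temperature=0.07):
        """Softmax-normalized cosine-similarity scores between one text
        embedding (D,) and N rendered-proposal embeddings (N, D).
        Hypothetical sketch of CLIP-style contrastive matching; the
        temperature value 0.07 is an assumed default."""
        t = text_emb / np.linalg.norm(text_emb)
        p = proposal_embs / np.linalg.norm(proposal_embs, axis=1, keepdims=True)
        logits = (p @ t) / temperature
        exp = np.exp(logits - logits.max())  # subtract max for stability
        return exp / exp.sum()
    ```

    Because a description may match zero, one, or many objects, a deployment would threshold these scores rather than take a single argmax.
    
    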

    Bidirectional Graph Reasoning Network for Panoptic Segmentation

    Recent research on panoptic segmentation resorts to a single end-to-end network to combine the tasks of instance segmentation and semantic segmentation. However, prior models only unified the two related tasks at the architectural level via a multi-branch scheme, or revealed the underlying correlation between them by unidirectional feature fusion, which disregards the explicit semantic and co-occurrence relations among objects and background. Inspired by the fact that context information is critical for recognizing and localizing objects, and that inclusive object details are significant for parsing the background scene, we investigate explicitly modeling the correlations between object and background to achieve a holistic understanding of an image in the panoptic segmentation task. We introduce a Bidirectional Graph Reasoning Network (BGRNet), which incorporates graph structure into the conventional panoptic segmentation network to mine the intra-modular and inter-modular relations within and between foreground things and background stuff classes. In particular, BGRNet first constructs image-specific graphs in both instance and semantic segmentation branches that enable flexible reasoning at the proposal level and class level, respectively. To establish the correlations between separate branches and fully leverage the complementary relations between things and stuff, we propose a Bidirectional Graph Connection Module to diffuse information across branches in a learnable fashion. Experimental results demonstrate the superiority of our BGRNet, which achieves new state-of-the-art performance on the challenging COCO and ADE20K panoptic segmentation benchmarks. Comment: CVPR 2020
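    The cross-branch diffusion idea can be sketched as one step of bipartite message passing: each "thing" node aggregates a softmax-weighted combination of "stuff" features and vice versa, then mixes it back into its own representation. This is a hedged illustration of the general mechanism, not BGRNet's exact module; the affinity matrix, mixing weight `alpha`, and function names are assumptions.

    ```python
    import numpy as np

    def _softmax(x, axis):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def bidirectional_diffusion(thing_feats, stuff_feats, affinity, alpha=0.5):
        """One step of bidirectional cross-branch diffusion (illustrative).
        thing_feats: (T, D) proposal-level features; stuff_feats: (S, D)
        class-level features; affinity: (T, S) learned edge weights.
        Each side aggregates a convex combination of the other branch's
        features and blends it in with weight `alpha` (assumed value)."""
        a_ts = _softmax(affinity, axis=1)    # things attend over stuff
        a_st = _softmax(affinity.T, axis=1)  # stuff attend over things
        things_out = (1 - alpha) * thing_feats + alpha * (a_ts @ stuff_feats)
        stuff_out = (1 - alpha) * stuff_feats + alpha * (a_st @ thing_feats)
        return things_out, stuff_out
    ```

    In the actual network the affinity would be produced by a learnable module and the step applied inside end-to-end training; here it simply shows how information flows in both directions.
    
    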

    New Benzofuran Oligomers from the Roots of Eupatorium heterophyllum Collected in China

    The chemical constituents of two root samples of Eupatorium heterophyllum DC. collected in Yunnan Province, China, were investigated. Five new oligomeric benzofurans (1–5), nine new benzofuran/dihydrobenzofuran derivatives, and a new thymol analog were isolated, and their structures were determined using extensive spectroscopic techniques, including 1D and 2D NMR spectroscopy and DFT calculations of the CD spectra. Most of the new compounds, including the oligomeric benzofurans (1–5), were obtained from only one of the root samples. Furthermore, this is the first report of oligomeric benzofurans from this plant. These results imply that diversification of secondary metabolites in E. heterophyllum is ongoing. Plausible biosynthetic pathways for 1–5 are also proposed.