
    All in Tokens: Unifying Output Space of Visual Tasks via Soft Token

    Unlike language tasks, where the output space is usually limited to a set of tokens, the output space of visual tasks is more complicated, making it difficult to build a unified visual model for various visual tasks. In this paper, we seek to unify the output space of visual tasks, so that we can also build a unified model for visual tasks. To this end, we demonstrate a single unified model that simultaneously handles two typical visual tasks of instance segmentation and depth estimation, which have discrete/fixed-length and continuous/varied-length outputs, respectively. We propose several new techniques that take into account the particularity of visual tasks: 1) Soft token. We employ soft tokens to represent the task output. Unlike hard tokens in the common VQ-VAE, which are assigned one-hot to discrete codebooks/vocabularies, the soft token is assigned softly to the codebook embeddings. Soft tokens can improve the accuracy of both next-token inference and decoding of the task output; 2) Mask augmentation. Many visual tasks have corrupted, undefined, or invalid values in their label annotations, e.g., occluded areas of depth maps. We show that a mask augmentation technique can greatly benefit these tasks. With these new techniques and other designs, we show that the proposed general-purpose task-solver can perform both instance segmentation and depth estimation well. In particular, we achieve 0.279 RMSE on the specific task of NYUv2 depth estimation, setting a new record on this benchmark. The general-purpose task-solver, dubbed AiT, is available at https://github.com/SwinTransformer/AiT
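The contrast between hard and soft tokens described in the abstract above can be illustrated with a small sketch. This is not the AiT implementation; it assumes one plausible reading in which a hard token picks the single highest-scoring codebook entry, while a soft token is the softmax-weighted average of all codebook embeddings. The function names, the toy codebook, and the logits are all hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def hard_token(logits, codebook):
    """One-hot assignment: return the embedding of the best-scoring entry."""
    k = max(range(len(logits)), key=lambda i: logits[i])
    return list(codebook[k])

def soft_token(logits, codebook):
    """Soft assignment (assumed form): probability-weighted mix of embeddings."""
    weights = softmax(logits)
    dim = len(codebook[0])
    return [
        sum(w * entry[d] for w, entry in zip(weights, codebook))
        for d in range(dim)
    ]

# Hypothetical 3-entry codebook with 2-dimensional embeddings.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
# Hypothetical predicted scores; entries 1 and 2 are nearly tied.
logits = [0.1, 2.0, 1.9]

hard = hard_token(logits, codebook)  # keeps only entry 1
soft = soft_token(logits, codebook)  # blends entries 1 and 2
```

In this toy case the hard token discards the near-tie between the second and third entries, whereas the soft token keeps both contributions, which is the kind of information the abstract suggests helps decoding.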

    Holographic Storage of Biphoton Entanglement

    Coherent and reversible storage of multi-photon entanglement with a multimode quantum memory is essential for scalable all-optical quantum information processing. Although single photons have been successfully stored in different quantum systems, storage of multi-photon entanglement remains challenging because of the critical requirement for coherent control of the photonic entanglement source, the multimode quantum memory, and the quantum interface between them. Here we demonstrate coherent and reversible storage of biphoton Bell-type entanglement with a holographic multimode atomic-ensemble-based quantum memory. The retrieved biphoton entanglement violates Bell's inequality for a storage time of 1 microsecond, and a memory-process fidelity of 98% is demonstrated by quantum state tomography. Comment: 5 pages, 4 figures, accepted by Phys. Rev. Lett.

    Autoencoding a Soft Touch to Learn Grasping from On-land to Underwater

    Robots play a critical role as the physical agents of human operators in exploring the ocean. However, it remains challenging to grasp objects reliably while fully submerged in a highly pressurized aquatic environment with little visible light, mainly due to fluidic interference with the tactile mechanics between the finger and object surfaces. This study investigates the transferability of grasping knowledge from on-land to underwater via a vision-based soft robotic finger that learns 6D forces and torques (FT) using a Supervised Variational Autoencoder (SVAE). A high-framerate camera captures the whole-body deformations while a soft robotic finger interacts with physical objects on land and underwater. Results show that the trained SVAE model learned a series of latent representations of the soft mechanics that are transferable from land to water, adapting to the changing environments better than commercial FT sensors. Soft, delicate, and reactive grasping enabled by tactile intelligence enhances the gripper's underwater interaction with improved reliability and robustness at a much-reduced cost, paving the way for learning-based intelligent grasping to support fundamental scientific discoveries in environmental and ocean research. Comment: 17 pages, 5 figures, 1 table, submitted to Advanced Intelligent Systems for review

    Quantum interface between frequency-uncorrelated down-converted entanglement and atomic-ensemble quantum memory

    Photonic entanglement sources and quantum memories are two basic building blocks of linear-optical quantum computation and long-distance quantum communication. In the past decades, intensive research has been carried out, and remarkable progress has been achieved, particularly with spontaneous parametric down-conversion (SPDC) entanglement sources and atomic ensembles. Currently, an important task towards scalable quantum information processing (QIP) is to efficiently write and read entanglement generated from an SPDC source into and out of an atomic quantum memory. Here we report the first experimental realization of a quantum interface by building a 5 MHz frequency-uncorrelated SPDC source and reversibly mapping the generated entangled photons into and out of a remote optically thick cold atomic memory using electromagnetically induced transparency. The frequency correlation between the entangled photons is almost fully eliminated with a suitable pump pulse. The storage of a triggered single photon with arbitrary polarization is shown to reach an average fidelity of 92% for a storage time of 200 ns. Moreover, polarization-entangled photon pairs are prepared, and one of the photons is stored in the atomic memory while the other keeps flying. The CHSH Bell's inequality is measured, and a violation is clearly observed for storage times up to 1 microsecond. This demonstrates that the entanglement is stored and survives during storage. Our work establishes a crucial element for implementing scalable all-optical QIP, and thus represents substantial progress in quantum information science. Comment: 28 pages, 4 figures, 1 table
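As background for the CHSH test mentioned in the abstract above, the following sketch computes the CHSH parameter S for an ideal polarization-entangled pair. It is only a textbook illustration, not a model of the experiment: it assumes the ideal correlation E(a, b) = cos 2(a − b) of a maximally entangled |Φ+⟩ state and the standard analyzer angles 0°, 45°, 22.5°, and 67.5°. The function names are hypothetical, and none of the values below are taken from the paper.

```python
import math

def correlation(a_deg, b_deg):
    """Ideal polarization correlation for a |Phi+> Bell state (assumption)."""
    return math.cos(2.0 * math.radians(a_deg - b_deg))

def chsh_s(a, a_prime, b, b_prime, corr=correlation):
    """CHSH combination S = E(a,b) - E(a,b') + E(a',b) + E(a',b')."""
    return (
        corr(a, b)
        - corr(a, b_prime)
        + corr(a_prime, b)
        + corr(a_prime, b_prime)
    )

# Standard analyzer settings in degrees that maximize S for this state.
s_ideal = chsh_s(0.0, 45.0, 22.5, 67.5)

# Any local hidden-variable model obeys |S| <= 2.
CLASSICAL_BOUND = 2.0
violates = abs(s_ideal) > CLASSICAL_BOUND
```

With these settings the ideal state gives S = 2√2, above the classical bound of 2. A measured S above 2 after retrieval is what a statement such as "violation is clearly observed" refers to, although an experimental value would lie below the ideal one.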