12,859 research outputs found
Generating EPR beams in a cavity optomechanical system
We propose a scheme to produce continuous variable entanglement between
phase-quadrature amplitudes of two light modes in an optomechanical system. For
proper driving power and detuning, the entanglement is insensitive to the bath
temperature and the Q of the mechanical oscillator. Under realistic experimental
conditions, we find that the entanglement could be very large even at room
temperature.
Comment: 4.1 pages, 4 figures, comments are welcome; to appear in PRA,
published version with typo corrections
Easing Embedding Learning by Comprehensive Transcription of Heterogeneous Information Networks
Heterogeneous information networks (HINs) are ubiquitous in real-world
applications. In the meantime, network embedding has emerged as a convenient
tool to mine and learn from networked data. As a result, it is of interest to
develop HIN embedding methods. However, the heterogeneity in HINs introduces
not only rich information but also potentially incompatible semantics, which
poses special challenges to embedding learning in HINs. With the intention to
preserve the rich yet potentially incompatible information in HIN embedding, we
propose to study the problem of comprehensive transcription of heterogeneous
information networks. The comprehensive transcription of HINs also provides an
easy-to-use approach to unleash the power of HINs, since it requires no
additional supervision, expertise, or feature engineering. To cope with the
challenges in the comprehensive transcription of HINs, we propose the HEER
algorithm, which embeds HINs via edge representations that are further coupled
with properly-learned heterogeneous metrics. To corroborate the efficacy of
HEER, we conducted experiments on two large-scale real-world datasets with an
edge reconstruction task and multiple case studies. Experiment results
demonstrate the effectiveness of the proposed HEER model and the utility of
edge representations and heterogeneous metrics. The code and data are available
at https://github.com/GentleZhu/HEER.
Comment: 10 pages. In Proceedings of the 24th ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, London, United Kingdom,
ACM, 201
RevColV2: Exploring Disentangled Representations in Masked Image Modeling
Masked image modeling (MIM) has become a prevalent pre-training setup for
vision foundation models and attains promising performance. Despite its
success, existing MIM methods discard the decoder network during downstream
applications, which results in inconsistent representations between pre-training
and fine-tuning and can hamper downstream task performance. In this paper, we
propose a new architecture, RevColV2, which tackles this issue by keeping the
entire autoencoder architecture during both pre-training and fine-tuning. The
main body of RevColV2 contains bottom-up columns and top-down columns, between
which information is reversibly propagated and gradually disentangled. This
design gives our architecture a useful property: it maintains
disentangled low-level and semantic information at the end of the network in
MIM pre-training. Our experimental results suggest that a foundation model with
decoupled features can achieve competitive performance across multiple
downstream vision tasks such as image classification, semantic segmentation and
object detection. For example, after intermediate fine-tuning on the ImageNet-22K
dataset, RevColV2-L attains 88.4% top-1 accuracy on ImageNet-1K classification
and 58.6 mIoU on ADE20K semantic segmentation. With an extra teacher and a
large-scale dataset, RevColV2-L achieves 62.1 box AP on COCO detection and 60.4 mIoU
on ADE20K semantic segmentation. Code and models are released at
https://github.com/megvii-research/RevCo
Deformable Convolutional Networks
Convolutional neural networks (CNNs) are inherently limited in modeling
geometric transformations due to the fixed geometric structures of their building
modules. In this work, we introduce two new modules to enhance the
transformation modeling capacity of CNNs, namely, deformable convolution and
deformable RoI pooling. Both are based on the idea of augmenting the spatial
sampling locations in the modules with additional offsets and learning the
offsets from target tasks, without additional supervision. The new modules can
readily replace their plain counterparts in existing CNNs and can be easily
trained end-to-end by standard back-propagation, giving rise to deformable
convolutional networks. Extensive experiments validate the effectiveness of our
approach on sophisticated vision tasks of object detection and semantic
segmentation. The code will be released.
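The idea of augmenting sampling locations with offsets can be illustrated with a minimal single-channel sketch. This is not the authors' implementation: the function names `bilinear_sample` and `deformable_conv2d` are hypothetical, the offsets are passed in as an argument rather than learned from a target task, and out-of-bounds samples are assumed to read as zero. Fractional sampling positions are resolved by bilinear interpolation.

```python
import math


def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2D feature map at a fractional (y, x) position.

    Positions outside the map contribute zero (an assumption of this sketch).
    """
    h, w = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                weight = (1.0 - abs(y - yi)) * (1.0 - abs(x - xi))
                value += weight * feature[yi][xi]
    return value


def deformable_conv2d(feature, kernel, offsets):
    """Apply a k x k kernel whose sampling points are shifted by per-position offsets.

    feature: H x W nested list of floats.
    kernel:  k x k nested list of weights (k odd).
    offsets: offsets[i][j][n] = (dy, dx) for output position (i, j) and
             kernel tap n (taps enumerated row by row).
    """
    h, w = len(feature), len(feature[0])
    k = len(kernel)
    r = k // 2
    # Regular sampling grid of a plain convolution, e.g. (-1,-1) ... (1,1) for k = 3.
    grid = [(gy, gx) for gy in range(-r, r + 1) for gx in range(-r, r + 1)]
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for n, (gy, gx) in enumerate(grid):
                dy, dx = offsets[i][j][n]
                # Regular grid position plus the additional offset.
                sample = bilinear_sample(feature, i + gy + dy, j + gx + dx)
                out[i][j] += kernel[gy + r][gx + r] * sample
    return out
```

With all offsets set to zero, this reduces to a plain zero-padded convolution (cross-correlation form), which is why such a module can stand in for its plain counterpart. Non-zero offsets move each kernel tap independently.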