DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability
Recently, large-scale diffusion models, e.g., Stable Diffusion and DALL-E 2,
have shown remarkable results on image synthesis. On the other hand,
large-scale cross-modal pre-trained models (e.g., CLIP, ALIGN, and FILIP) are
competent for various downstream tasks by learning to align vision and language
embeddings. In this paper, we explore the possibility of jointly modeling
generation and discrimination. Specifically, we propose DiffDis to unify the
cross-modal generative and discriminative pretraining into one single framework
under the diffusion process. DiffDis first formulates the image-text
discriminative problem as a generative diffusion process of the text embedding
from the text encoder conditioned on the image. Then, we propose a novel
dual-stream network architecture, which fuses the noisy text embedding with the
knowledge of latent images from different scales for image-text discriminative
learning. Moreover, the generative and discriminative tasks can efficiently
share the image-branch network structure in the multi-modality model.
Benefiting from diffusion-based unified training, DiffDis achieves both better
generation ability and cross-modal semantic alignment in one architecture.
Experimental results show that DiffDis outperforms single-task models on both
the image generation and the image-text discriminative tasks, e.g., 1.65%
improvement on average accuracy of zero-shot classification over 12 datasets
and 2.42 improvement on FID of zero-shot image synthesis.
Comment: ICCV202
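The zero-shot classification mentioned in the abstract relies on image and text embeddings living in an aligned space. A minimal sketch of that general idea follows, assuming the embeddings are already computed; it is not the DiffDis method itself. The function names, class labels, and toy vectors are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_classify(image_embedding, class_text_embeddings):
    """Pick the class whose text embedding is closest to the image embedding."""
    scores = {
        label: cosine(image_embedding, text_embedding)
        for label, text_embedding in class_text_embeddings.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Toy embeddings (assumed values, not taken from any model).
text_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.1, 0.9, 0.0],
    "car": [0.0, 0.1, 0.9],
}
image_embedding = [0.8, 0.3, 0.1]

label, scores = zero_shot_classify(image_embedding, text_embeddings)
print(label)
```

No class-specific training is involved here: adding a class only requires a new text embedding, which is what makes the setting "zero-shot".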
A multi-species functional embedding integrating sequence and network structure
A key challenge to transferring knowledge between species is that different species have fundamentally different genetic architectures. Initial computational approaches to transferring knowledge across species have relied on measures of heredity such as genetic homology, but these approaches suffer from limitations. First, only a small subset of genes have homologs, limiting the amount of knowledge that can be transferred, and second, genes change or repurpose functions, complicating the transfer of knowledge. Many approaches address this problem by expanding the notion of homology using high-throughput genomic and proteomic measurements, for example through network alignment. In this work, we take a new approach to transferring knowledge across species by expanding the notion of homology through explicit measures of functional similarity between proteins in different species. Specifically, our kernel-based method, HANDL (Homology Assessment across Networks using Diffusion and Landmarks), integrates sequence and network structure to create a functional embedding in which proteins from different species are embedded in the same vector space. We show that inner products in this space and the vectors themselves capture functional similarity across species, and are useful for a variety of functional tasks. We present the first whole-genome method for predicting phenologs, recovering many that were previously identified and predicting new phenologs supported by the biological literature. We also demonstrate that the HANDL embedding captures pairwise gene function, in that gene pairs with synthetic lethal interactions are significantly separated in HANDL space, and the direction of separation is conserved across species. Software for the HANDL algorithm is available at http://bit.ly/lrgr-handl.
Published version
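The abstract states that inner products in the shared vector space capture functional similarity across species. The sketch below shows how such an embedding could be queried, assuming the vectors have already been computed; it does not reproduce the HANDL algorithm. The protein identifiers, vectors, and function names are hypothetical.

```python
def inner_product(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_cross_species(query_vector, target_embeddings):
    """Rank proteins of another species by inner product with the query.

    Returns (protein_id, score) pairs, highest score first.
    """
    scored = [
        (protein_id, inner_product(query_vector, vector))
        for protein_id, vector in target_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors in a shared space (assumed values for illustration only).
species_a_protein = [0.7, 0.2, 0.1]
species_b_embeddings = {
    "b_protein_1": [0.6, 0.3, 0.1],
    "b_protein_2": [0.1, 0.8, 0.1],
    "b_protein_3": [0.0, 0.1, 0.9],
}

ranking = rank_cross_species(species_a_protein, species_b_embeddings)
for protein_id, score in ranking:
    print(protein_id, round(score, 2))
```

Because both species share one vector space, the same scoring works for proteins that have no sequence homolog, which is the limitation of homology-only transfer that the abstract describes.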
Neighborhood Matching Network for Entity Alignment
Structural heterogeneity between knowledge graphs is an outstanding challenge
for entity alignment. This paper presents Neighborhood Matching Network (NMN),
a novel entity alignment framework for tackling the structural heterogeneity
challenge. NMN estimates the similarities between entities to capture both the
topological structure and the neighborhood difference. It provides two
innovative components for better learning representations for entity alignment.
It first uses a novel graph sampling method to distill a discriminative
neighborhood for each entity. It then adopts a cross-graph neighborhood
matching module to jointly encode the neighborhood difference for a given
entity pair. Such strategies allow NMN to effectively construct
matching-oriented entity representations while ignoring noisy neighbors that
have a negative impact on the alignment task. Extensive experiments performed
on three entity alignment datasets show that NMN estimates neighborhood
similarity well in harder cases and significantly outperforms 12
previous state-of-the-art methods.
Comment: 11 pages, accepted by ACL 202