LATTE: Application Oriented Social Network Embedding
In recent years, many research works have proposed to embed network-structured
data into a low-dimensional feature space, where each node is represented as a
feature vector. However, because the embedding process is detached from
external tasks, the embeddings learned by most existing models can be
ineffective for application tasks with specific objectives, e.g., community
detection or information diffusion. In this paper, we propose to study the
application-oriented heterogeneous social network embedding problem.
Significantly different from existing works, besides preserving the network
structure, the problem should also incorporate the objectives of external
applications into the objective function. To resolve this problem, in this paper,
we propose a novel network embedding framework, namely the "appLicAtion
orienTed neTwork Embedding" (Latte) model. In Latte, the heterogeneous network
structure is used to compute node "diffusive proximity" scores, which capture
both local and global network structure. Based on these computed scores, Latte
learns the network representation feature vectors by extending the autoencoder
model to the heterogeneous network scenario, which also effectively unites the
objectives of network embedding and external application tasks. Extensive
experiments on real-world heterogeneous social network datasets demonstrate
the outstanding performance of Latte in learning representation vectors for
specific application tasks.
Comment: 11 Pages, 12 Figures, 1 Table
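The abstract does not define "diffusive proximity" precisely; a minimal sketch, assuming a common formulation in which k-hop random-walk transition probabilities are summed with a geometric decay, so that scores blend local (k=1) and global (k>1) structure. The function name and decay schedule are illustrative assumptions, not the paper's method.

```python
import numpy as np

def diffusive_proximity(A, decay=0.5, steps=3):
    """Hypothetical proximity: decayed sum of k-step transition probabilities."""
    A = np.asarray(A, dtype=float)
    # Row-normalize the adjacency matrix into a random-walk transition matrix.
    P = A / A.sum(axis=1, keepdims=True)
    S = np.zeros_like(P)
    Pk = np.eye(len(A))
    for k in range(1, steps + 1):
        Pk = Pk @ P                 # k-step transition probabilities
        S += (decay ** k) * Pk      # geometrically decayed contribution
    return S

# Tiny 3-node path graph: 0 - 1 - 2
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
S = diffusive_proximity(A)
```

On this toy graph, the direct neighbor pair (0, 1) ends up with a higher proximity score than the two-hop pair (0, 2), matching the intuition that local structure dominates while global structure still contributes.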
Independent Asymmetric Embedding for Cascade Prediction on Social Networks
Predicting information diffusion on social networks has great practical
significance in marketing and public opinion control. Cascade prediction aims
to identify the individuals who will potentially repost a message on the
social network. One class of methods either exploits demographic, structural,
and temporal features for prediction, or relies explicitly on particular
information diffusion models. The other class is fully data-driven and does
not require the global network structure; accordingly, many diffusion
prediction models based on network embedding have been proposed. These models
embed users into a latent space using their cascade information, but they
neglect the interactions among users when embedding. In this paper, we propose
an independent asymmetric embedding method to learn social embeddings for
cascade prediction. Unlike existing methods, our method embeds each individual
into one latent influence space and multiple latent susceptibility spaces.
Furthermore, our method captures the co-occurrence regularity of user
combinations in cascades to improve computational effectiveness. The results
of extensive experiments conducted on real-world datasets verify both the
predictive accuracy and cost-effectiveness of our approach.
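The key idea above, separate influence and susceptibility embeddings making repost scores directional, can be sketched as follows. This is a hypothetical illustration: the abstract specifies multiple susceptibility spaces per user, but for brevity this sketch uses a single susceptibility vector per user and a sigmoid affinity score.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, dim = 5, 4
I = rng.normal(size=(n_users, dim))   # influence embedding per user (sender role)
S = rng.normal(size=(n_users, dim))   # susceptibility embedding per user (receiver role)

def repost_prob(u, v):
    """Sigmoid of influence-susceptibility affinity; asymmetric in (u, v)."""
    return 1.0 / (1.0 + np.exp(-I[u] @ S[v]))

p_uv = repost_prob(0, 1)  # probability that user 1 reposts from user 0
p_vu = repost_prob(1, 0)  # probability that user 0 reposts from user 1
```

Because the sender uses I and the receiver uses S, swapping the roles of two users generally changes the score, which is exactly the asymmetry the method's name refers to.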
DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability
Recently, large-scale diffusion models, e.g., Stable Diffusion and DALL-E 2,
have shown remarkable results on image synthesis. On the other hand,
large-scale cross-modal pre-trained models (e.g., CLIP, ALIGN, and FILIP) are
competent for various downstream tasks by learning to align vision and language
embeddings. In this paper, we explore the possibility of jointly modeling
generation and discrimination. Specifically, we propose DiffDis to unify the
cross-modal generative and discriminative pretraining into one single framework
under the diffusion process. DiffDis first formulates the image-text
discriminative problem as a generative diffusion process of the text embedding
from the text encoder conditioned on the image. Then, we propose a novel
dual-stream network architecture, which fuses the noisy text embedding with the
knowledge of latent images from different scales for image-text discriminative
learning. Moreover, the generative and discriminative tasks can efficiently
share the image-branch network structure in the multi-modality model.
Benefiting from diffusion-based unified training, DiffDis achieves both better
generation ability and cross-modal semantic alignment in one architecture.
Experimental results show that DiffDis outperforms single-task models on both
image generation and image-text discrimination, e.g., a 1.65% improvement in
average zero-shot classification accuracy over 12 datasets and a 2.42
improvement in FID for zero-shot image synthesis.
Comment: ICCV202
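The formulation above treats discrimination as diffusion over the text embedding. A minimal sketch of the forward (noising) step such training would apply, assuming the standard DDPM form z_t = sqrt(abar_t) * z0 + sqrt(1 - abar_t) * eps; the schedule value and embedding shape are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def noise_text_embedding(z0, abar_t, rng):
    """Forward diffusion step on a text embedding: mix signal with Gaussian noise."""
    eps = rng.normal(size=z0.shape)
    z_t = np.sqrt(abar_t) * z0 + np.sqrt(1.0 - abar_t) * eps
    return z_t, eps

rng = np.random.default_rng(0)
z0 = rng.normal(size=(8,))            # a toy text embedding
z_t, eps = noise_text_embedding(z0, abar_t=0.9, rng=rng)

# A denoiser conditioned on the image would be trained to predict eps from
# (z_t, t, image); discrimination then asks which caption's embedding the
# model denoises toward.
```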