434 research outputs found
The HyperBagGraph DataEdron: An Enriched Browsing Experience of Multimedia Datasets
Traditional verbatim browsers return information in a linear way according
to a ranking performed by a search engine that may not be optimal for the
surfer. The surfer may need to assess the pertinence of the information
retrieved, particularly when she wants to explore other facets of a
multi-faceted information space. For instance, in a multimedia dataset,
different facets such as keywords, authors, publication category, organisations
and figures can be of interest. Visualising these facets simultaneously can
help to gain insights into the information retrieved and prompt further searches.
Facets are co-occurrence networks, modeled by HyperBag-Graphs -- families of
multisets -- and are in fact linked not only to the publication itself, but to
any chosen reference. These references allow navigation inside the dataset and
support visual queries. We explore here the case of scientific publications
based on arXiv searches.
Comment: Extension of the hypergraph framework briefly presented in
arXiv:1809.00164 (possible small overlaps); uses the theoretical framework of
hb-graphs presented in arXiv:1809.0019
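The hb-graph modelling described above can be sketched in a few lines: each facet is a family of multisets over a shared vertex set, and co-occurrence weights fall out of the multiplicities. This is a minimal illustrative sketch; the class and method names are assumptions, not the paper's API.

```python
from collections import Counter

class HBGraph:
    """A family of multisets (hb-edges) over a vertex universe."""

    def __init__(self):
        self.hb_edges = {}  # hb-edge name -> Counter (multiset of vertices)

    def add_hb_edge(self, name, vertices):
        self.hb_edges[name] = Counter(vertices)

    def co_occurrence(self, u, v):
        # Co-occurrence weight of u and v: product of their
        # multiplicities, summed over all hb-edges.
        return sum(e[u] * e[v] for e in self.hb_edges.values())

# Example facet: keywords per publication, with multiplicities.
g = HBGraph()
g.add_hb_edge("paper1", ["hypergraph", "hypergraph", "multiset"])
g.add_hb_edge("paper2", ["hypergraph", "multiset", "browsing"])

print(g.co_occurrence("hypergraph", "multiset"))  # 2*1 + 1*1 = 3
```

Multiplicities matter here: an ordinary hypergraph would count "hypergraph" once per edge, whereas the multiset keeps its repetition inside "paper1".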
Network Capacity Bound for Personalized PageRank in Multimodal Networks
In a previous paper, the concept of Bipartite PageRank was introduced and a
theorem on the limit of authority flowing between nodes for personalized
PageRank was generalized. In this paper we extend those results to
multimodal networks. In particular, we introduce a hypergraph type that may be
used for describing multimodal networks, where a hyperlink connects nodes from
each of the modalities. We introduce a generalisation of PageRank for such
graphs and define the respective random-walk model that can be used for
computations. We finally state and prove theorems on the limit of the outflow
of authority for cases where individual modalities have identical and distinct
damping factors.
Comment: 28 pages. arXiv admin note: text overlap with arXiv:1702.0373
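For orientation, personalized PageRank on an ordinary directed graph can be computed by power iteration as below; the paper's contribution is the generalisation of this walk to multimodal hypergraphs with per-modality damping factors. Function and variable names are illustrative assumptions.

```python
def personalized_pagerank(adj, restart, damping=0.85, iters=100):
    """Power iteration for personalized PageRank on a small dense graph.

    adj:     0/1 adjacency matrix as a list of lists
    restart: personalization (restart) distribution, sums to 1
    """
    n = len(adj)
    out_deg = [sum(row) for row in adj]
    r = restart[:]
    for _ in range(iters):
        # (1 - damping) of the mass teleports back to the restart vector.
        nxt = [(1 - damping) * restart[i] for i in range(n)]
        for i in range(n):
            if out_deg[i]:
                share = damping * r[i] / out_deg[i]
                for j in range(n):
                    if adj[i][j]:
                        nxt[j] += share
            else:
                # Dangling node: return its authority via the restart vector.
                for j in range(n):
                    nxt[j] += damping * r[i] * restart[j]
        r = nxt
    return r

adj = [[0, 1, 1],
       [0, 0, 1],
       [1, 0, 0]]
restart = [1.0, 0.0, 0.0]  # personalize on node 0
scores = personalized_pagerank(adj, restart)
```

The limit theorems in the abstract bound how much of this circulating authority can flow out of a node set; mass conservation (the scores always sum to 1) is the invariant those bounds are stated against.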
MCDAN: a Multi-scale Context-enhanced Dynamic Attention Network for Diffusion Prediction
Information diffusion prediction aims at predicting the target users in the
information diffusion path on social networks. Prior works mainly focus on the
observed structure or sequence of cascades, trying to predict which users will
be passively infected by the cascade. In this study, we argue that user intent
understanding is also a key part of information diffusion prediction. We
thereby propose a novel Multi-scale Context-enhanced Dynamic Attention Network
(MCDAN) to predict which user will most likely join the observed current
cascades. Specifically, to consider the global interactive relationship among
users, we take full advantage of user friendships and global cascading
relationships, which are extracted from the social network and historical
cascades, respectively. To refine the model's ability to understand the user's
preference for the current cascade, we propose a multi-scale sequential
hypergraph attention module to capture the dynamic preference of users at
different time scales. Moreover, we design a contextual attention enhancement
module to strengthen the interaction of user representations within the current
cascade. Finally, to engage the user's own susceptibility, we construct a
susceptibility label for each user based on user susceptibility analysis and
use the rank of this label for auxiliary prediction. We conduct experiments
over four widely used datasets and show that MCDAN significantly outperforms
the state-of-the-art models. The average improvements reach 10.61% in terms
of Hits@100 and 9.71% in terms of MAP@100.
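The contextual attention enhancement described above rests on standard scaled dot-product attention over user representations within a cascade; MCDAN's multi-scale module (per the abstract) applies such attention over hypergraphs built at different time scales. The sketch below shows only the generic attention step, with illustrative shapes and names, not the paper's code.

```python
import numpy as np

def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d)) V."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)          # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ values

rng = np.random.default_rng(0)
users = rng.normal(size=(5, 8))       # 5 users in the current cascade
ctx = attention(users, users, users)  # context-enhanced representations
print(ctx.shape)  # (5, 8)
```

Self-attention like this lets each user's representation absorb context from every other user already in the cascade, which is the "interaction of user representations" the abstract refers to.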
HyperLearn: A Distributed Approach for Representation Learning in Datasets With Many Modalities
Multimodal datasets contain an enormous amount of relational information,
which grows exponentially with the introduction of new modalities. Learning
representations in such a scenario is inherently complex due to the presence of
multiple heterogeneous information channels. These channels can encode both (a)
inter-relations between the items of different modalities and (b)
intra-relations between the items of the same modality. Encoding multimedia
items into a continuous low-dimensional semantic space such that both types of
relations are captured and preserved is extremely challenging, especially if
the goal is a unified end-to-end learning framework. The two key challenges
that need to be addressed are: 1) the framework must be able to merge complex
intra and inter relations without losing any valuable information and 2) the
learning model should be invariant to the addition of new and potentially very
different modalities. In this paper, we propose a flexible framework which can
scale to data streams from many modalities. To that end we introduce a
hypergraph-based model for data representation and deploy Graph Convolutional
Networks to fuse relational information within and across modalities. Our
approach provides an efficient solution for distributing otherwise extremely
computationally expensive or even infeasible training processes across
multiple GPUs, without sacrificing accuracy. Moreover, adding new
modalities to our model requires only an additional GPU unit, keeping the
computational time unchanged, which brings representation learning to truly
multimodal datasets. We demonstrate the feasibility of our approach in
experiments on multimedia datasets featuring second-, third- and fourth-order
relations.
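The hypergraph-plus-graph-convolution combination described above can be illustrated with a single propagation step in the style of HGNN-type layers: vertices from several modalities share hyperedges, and one convolution mixes features through the incidence matrix H. The shapes and the simple degree normalisation below are illustrative assumptions, not HyperLearn's exact layer.

```python
import numpy as np

def hypergraph_conv(H, X, Theta):
    """One hypergraph convolution step: X' = Dv^-1 H De^-1 H^T X Theta.

    H:     (num_vertices, num_hyperedges) incidence matrix
    X:     (num_vertices, in_dim) vertex features
    Theta: (in_dim, out_dim) learnable weights
    """
    Dv = H.sum(axis=1)  # vertex degrees
    De = H.sum(axis=0)  # hyperedge degrees
    M = (H / De) @ (H.T @ X) / Dv[:, None]
    return M @ Theta

H = np.array([[1, 0],   # vertex 0 in hyperedge 0
              [1, 1],   # vertex 1 in both (e.g. bridging two modalities)
              [0, 1]], float)
X = np.eye(3)           # one-hot vertex features, for illustration
Theta = np.eye(3)       # identity weights, for illustration
out = hypergraph_conv(H, X, Theta)
print(out.shape)  # (3, 3)
```

With this normalisation each output row is a convex mixture of its hyperedge neighbours' features, which is what lets a shared hyperedge fuse information across modalities; higher-order (third, fourth) relations correspond to hyperedges touching more vertices.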