260 research outputs found
Social influence analysis in microblogging platforms - a topic-sensitive based approach
The use of social media, particularly microblogging platforms such as Twitter, has proven to be an effective channel for promoting ideas to online audiences. In a world where information can bias public opinion, it is essential to analyse the propagation and influence of information in large-scale networks. Recent research that studies social media data to rank users by topical relevance has largely focused on the “retweet”, “following” and “mention” relations. In this paper we propose the use of semantic profiles for deriving influential users based on the retweet subgraph of the Twitter graph. We introduce a variation of the PageRank algorithm for analysing users’ topical and entity influence based on the topical/entity relevance of a retweet relation. Experimental results show that our approach outperforms related algorithms including HITS, InDegree and Topic-Sensitive PageRank. We also introduce VisInfluence, a visualisation platform for presenting top influential users based on a topical query need.
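The abstract does not give the exact formulation, so the following is only a minimal sketch of the general idea: a PageRank-style iteration over a retweet graph in which each retweet edge carries a topical relevance weight. The function name, the edge format, and the uniform handling of users with no outgoing retweets are assumptions made for illustration, not the authors' method.

```python
def topic_weighted_pagerank(edges, damping=0.85, iterations=50):
    """Rank users on a weighted retweet graph (illustrative sketch).

    edges: list of (retweeter, author, relevance) tuples, relevance >= 0.
    Influence flows from the retweeter to the retweeted author and is
    split across a retweeter's edges in proportion to their relevance.
    """
    nodes = sorted({u for u, _, _ in edges} | {v for _, v, _ in edges})
    n = len(nodes)

    # Total outgoing relevance per user, used to normalise edge weights.
    out_weight = {node: 0.0 for node in nodes}
    for u, _, w in edges:
        out_weight[u] += w

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Assumption: users with no outgoing weight spread their rank uniformly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0.0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for u, v, w in edges:
            if out_weight[u] > 0.0:
                new_rank[v] += damping * rank[u] * w / out_weight[u]
        rank = new_rank
    return rank


# Hypothetical toy data: "b" retweets both "c" and "a"; "a" retweets "c".
toy_edges = [("a", "c", 1.0), ("b", "c", 0.5), ("b", "a", 0.5)]
scores = topic_weighted_pagerank(toy_edges)
print(sorted(scores, key=scores.get, reverse=True))
```

In this toy graph the most retweeted user on the topic ends up with the highest score; a real system would derive the relevance weights from the semantic profiles the abstract mentions.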
Learning the effective order of a hypergraph dynamical system
Dynamical systems on hypergraphs can display a rich set of behaviours not
observable for systems with pairwise interactions. Given a distributed
dynamical system with a putative hypergraph structure, an interesting question
is thus how much of this hypergraph structure is actually necessary to
faithfully replicate the observed dynamical behaviour. To answer this question,
we propose a method to determine the minimum order of a hypergraph necessary to
approximate the corresponding dynamics accurately. Specifically, we develop an
analytical framework that allows us to determine this order when the type of
dynamics is known. We utilize these ideas in conjunction with a hypergraph
neural network to directly learn the dynamics itself and the resulting order of
the hypergraph from both synthetic and real data sets consisting of observed
system trajectories.
Self-Supervised Learning for Recommender Systems: A Survey
In recent years, neural architecture-based recommender systems have achieved
tremendous success, but they still fall short of expectation when dealing with
highly sparse data. Self-supervised learning (SSL), as an emerging technique
for learning from unlabeled data, has attracted considerable attention as a
potential solution to this issue. This survey paper presents a systematic and
timely review of research efforts on self-supervised recommendation (SSR).
Specifically, we propose an exclusive definition of SSR, on top of which we
develop a comprehensive taxonomy to divide existing SSR methods into four
categories: contrastive, generative, predictive, and hybrid. For each category,
we elucidate its concept and formulation, the involved methods, as well as its
pros and cons. Furthermore, to facilitate empirical comparison, we release an
open-source library SELFRec (https://github.com/Coder-Yu/SELFRec), which
incorporates a wide range of SSR models and benchmark datasets. Through
rigorous experiments using this library, we derive and report some significant
findings regarding the selection of self-supervised signals for enhancing
recommendation. Finally, we shed light on the limitations of current
research and outline future research directions.
Comment: 20 pages. Accepted by TKD
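To make the contrastive category concrete, here is a minimal sketch of an InfoNCE-style loss between two augmented views of the same users or items. InfoNCE is a commonly used contrastive objective, but the abstract does not name a specific loss, so the choice of loss, the function names, and the temperature value are assumptions. The sketch uses plain Python lists rather than the SELFRec library.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _normalize(v):
    # Assumes a non-zero vector.
    norm = math.sqrt(_dot(v, v))
    return [x / norm for x in v]


def info_nce(view_a, view_b, temperature=0.2):
    """Average InfoNCE loss over a batch (illustrative sketch).

    view_a[i] and view_b[i] are two embeddings of the same entity
    (the positive pair); every other row of view_b acts as a negative.
    """
    a = [_normalize(v) for v in view_a]
    b = [_normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        logits = [_dot(anchor, other) / temperature for other in b]
        # Numerically stable log-sum-exp over positive and negatives.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(a)


# Hypothetical embeddings: matching views give a much lower loss.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < mismatched)
```

The loss is small when each embedding is closest to its own counterpart in the other view and large when it is closer to another entity's embedding, which is the signal a contrastive recommender trains on.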
Multi-Modal Self-Supervised Learning for Recommendation
The online emergence of multi-modal sharing platforms (e.g., TikTok, YouTube)
is powering personalized recommender systems to incorporate various modalities
(e.g., visual, textual and acoustic) into the latent user representations. While
existing works on multi-modal recommendation exploit multimedia content
features in enhancing item embeddings, their model representation capability is
limited by heavy label reliance and weak robustness on sparse user behavior
data. Inspired by the recent progress of self-supervised learning in
alleviating the label scarcity issue, we explore deriving self-supervision signals
for effectively learning modality-aware user preferences and cross-modal
dependencies. To this end, we propose a new Multi-Modal Self-Supervised
Learning (MMSSL) method which tackles two key challenges. Specifically, to
characterize the inter-dependency between the user-item collaborative view and
item multi-modal semantic view, we design a modality-aware interactive
structure learning paradigm via adversarial perturbations for data
augmentation. In addition, to capture how users' modality-aware
interaction patterns interweave with each other, a cross-modal contrastive
learning approach is introduced to jointly preserve the inter-modal semantic
commonality and user preference diversity. Experiments on real-world datasets
show that our method outperforms various state-of-the-art baselines for
multimedia recommendation. The implementation is
released at https://github.com/HKUDS/MMSSL
Comment: This paper has been published as a full paper at WWW 202
Biclustering fMRI time series
Master's thesis, Data Science, Universidade de Lisboa, Faculdade de Ciências, 2020.
The effectiveness of biclustering, the simultaneous clustering of both rows and columns in a data matrix, has been shown primarily in gene expression data analysis. Several researchers recognize its potential in other research areas. Nevertheless, the last two decades have seen many biclustering algorithms targeting gene expression data and a lack of consistent studies exploring the capacities of biclustering outside this traditional application domain. Following hints that suggest potential for biclustering on spatiotemporal data, particularly in the neurosciences, this thesis explores biclustering’s capacity to extract knowledge from fMRI time series. It proposes a methodology to evaluate the feasibility of biclustering algorithms for studying the fMRI signal, considering both synthetic and real-world fMRI datasets. In the absence of ground truth against which to compare bicluster solutions, we used internal evaluation metrics. Results on the use of different search strategies showed the superiority of exhaustive approaches, which obtain the most homogeneous biclusters. However, their high computational cost is still a challenge, and further work is needed for the efficient use of biclustering in fMRI data analysis. We also propose a new methodology for analyzing biclusters, based on pattern mining algorithms, to determine the most frequent patterns present in the generated biclustering solutions. A bicluster is nothing more than a hyperlink in a hypergraph. Extracting frequent patterns in a biclustering solution implies extracting the most significant hyperlinks. In a first approach, this makes it possible to understand relationships between regions of the brain and to draw temporal profiles that traditional correlation studies cannot detect. Additionally, the process of generating biclusters filters out uninteresting links, potentially allowing hypergraphs to be generated efficiently. The final question is what we can do with this knowledge. Knowing the relationship between brain regions is the central objective of the neurosciences. Understanding the connections between regions of the brain for various subjects allows one to draw profiles. For this case, we propose a methodology to extrapolate biclusters to three-dimensional data and perform triclustering. Additionally, understanding the link between brain zones allows identifying diseases such as schizophrenia, dementia, or Alzheimer’s. This work points to avenues for the use of biclustering in spatiotemporal data analysis, in particular in neuroscience applications. The proposed evaluation methodology showed evidence of biclustering’s effectiveness in finding local patterns in fMRI data, although further work on scalability is needed to promote application in real scenarios.
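The idea of treating each bicluster as a hyperlink (hyperedge) and mining frequent patterns across a biclustering solution can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis's pipeline: each bicluster is represented as a pair of sets (brain regions, time points), the pattern mining step is reduced to counting region subsets of a fixed size, and the region names and the `min_support` threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_region_sets(biclusters, size=2, min_support=2):
    """Count region subsets that recur across biclusters (illustrative sketch).

    biclusters: iterable of (regions, time_points) pairs, each a set.
    The region set of a bicluster is treated as one hyperedge; a subset of
    regions is "frequent" if it appears in at least min_support hyperedges.
    """
    counts = Counter()
    for regions, _time_points in biclusters:
        for subset in combinations(sorted(set(regions)), size):
            counts[subset] += 1
    return {subset: c for subset, c in counts.items() if c >= min_support}


# Hypothetical biclustering solution over four brain regions.
solution = [
    ({"V1", "V2", "M1"}, {0, 1, 2}),
    ({"V1", "V2"}, {5, 6}),
    ({"M1", "PFC"}, {1, 2}),
]
print(frequent_region_sets(solution))
```

Here only the pair of regions that co-occurs in two biclusters survives the support threshold, which mirrors how frequent patterns can single out the most significant hyperlinks while discarding one-off groupings.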