260 research outputs found

    Social influence analysis in microblogging platforms - a topic-sensitive based approach

    The use of social media, particularly microblogging platforms such as Twitter, has proven to be an effective channel for promoting ideas to online audiences. In a world where information can bias public opinion, it is essential to analyse the propagation and influence of information in large-scale networks. Recent research using social media data to rank users by topical relevance has largely focused on the “retweet”, “following” and “mention” relations. In this paper we propose the use of semantic profiles for deriving influential users based on the retweet subgraph of the Twitter graph. We introduce a variation of the PageRank algorithm for analysing users’ topical and entity influence based on the topical/entity relevance of a retweet relation. Experimental results show that our approach outperforms related algorithms including HITS, InDegree and Topic-Sensitive PageRank. We also introduce VisInfluence, a visualisation platform for presenting top influential users based on a topical query need.
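The idea of ranking users over a retweet graph whose edges carry topical relevance can be illustrated with a small sketch. This is not the authors' algorithm: it is a plain weighted PageRank, and the function name, the edge format and the example weights are all assumptions made for illustration.

```python
def topic_weighted_pagerank(edges, damping=0.85, iterations=50):
    """Weighted PageRank over a retweet graph (illustrative sketch only).

    edges maps (retweeter, original_author) to a non-negative topical
    relevance weight; this input format is a hypothetical choice.
    """
    nodes = sorted({user for edge in edges for user in edge})
    n = len(nodes)

    # Total outgoing weight per user, used to normalise each user's votes.
    out_weight = {user: 0.0 for user in nodes}
    for (src, _dst), weight in edges.items():
        out_weight[src] += weight

    rank = {user: 1.0 / n for user in nodes}
    for _ in range(iterations):
        # Rank held by users with no outgoing weight is spread uniformly.
        dangling = sum(rank[u] for u in nodes if out_weight[u] == 0.0)
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for (src, dst), weight in edges.items():
            if out_weight[src] > 0.0:
                new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        rank = new_rank
    return rank


# Hypothetical retweets: an edge points from the retweeter to the author.
example_edges = {
    ("bob", "alice"): 0.9,
    ("carol", "alice"): 0.8,
    ("carol", "dave"): 0.2,
    ("dave", "alice"): 0.5,
}
example_rank = topic_weighted_pagerank(example_edges)
```

In this toy graph the user whose tweets attract the most on-topic retweets receives the largest score. A topic-sensitive variant would recompute the weights for each query topic.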

    Learning the effective order of a hypergraph dynamical system

    Dynamical systems on hypergraphs can display a rich set of behaviours not observable for systems with pairwise interactions. Given a distributed dynamical system with a putative hypergraph structure, an interesting question is thus how much of this hypergraph structure is actually necessary to faithfully replicate the observed dynamical behaviour. To answer this question, we propose a method to determine the minimum order of a hypergraph necessary to approximate the corresponding dynamics accurately. Specifically, we develop an analytical framework that allows us to determine this order when the type of dynamics is known. We utilize these ideas in conjunction with a hypergraph neural network to directly learn the dynamics itself and the resulting order of the hypergraph from both synthetic and real data sets consisting of observed system trajectories.

    Self-Supervised Learning for Recommender Systems: A Survey

    In recent years, neural architecture-based recommender systems have achieved tremendous success, but they still fall short of expectations when dealing with highly sparse data. Self-supervised learning (SSL), as an emerging technique for learning from unlabeled data, has attracted considerable attention as a potential solution to this issue. This survey paper presents a systematic and timely review of research efforts on self-supervised recommendation (SSR). Specifically, we propose an exclusive definition of SSR, on top of which we develop a comprehensive taxonomy to divide existing SSR methods into four categories: contrastive, generative, predictive, and hybrid. For each category, we elucidate its concept and formulation, the methods involved, and its pros and cons. Furthermore, to facilitate empirical comparison, we release an open-source library, SELFRec (https://github.com/Coder-Yu/SELFRec), which incorporates a wide range of SSR models and benchmark datasets. Through rigorous experiments using this library, we derive and report some significant findings regarding the selection of self-supervised signals for enhancing recommendation. Finally, we shed light on the limitations of current research and outline future research directions. Comment: 20 pages. Accepted by TKD

    Multi-Modal Self-Supervised Learning for Recommendation

    The emergence of online multi-modal sharing platforms (e.g., TikTok, YouTube) is driving personalized recommender systems to incorporate various modalities (e.g., visual, textual and acoustic) into the latent user representations. While existing works on multi-modal recommendation exploit multimedia content features to enhance item embeddings, their model representation capability is limited by heavy label reliance and weak robustness on sparse user behavior data. Inspired by the recent progress of self-supervised learning in alleviating the label scarcity issue, we explore deriving self-supervision signals for effectively learning modality-aware user preferences and cross-modal dependencies. To this end, we propose a new Multi-Modal Self-Supervised Learning (MMSSL) method which tackles two key challenges. Specifically, to characterize the inter-dependency between the user-item collaborative view and the item multi-modal semantic view, we design a modality-aware interactive structure learning paradigm via adversarial perturbations for data augmentation. In addition, to capture how users' modality-aware interaction patterns interweave with each other, a cross-modal contrastive learning approach is introduced to jointly preserve the inter-modal semantic commonality and user preference diversity. Experiments on real-world datasets verify the superiority of our method over various state-of-the-art baselines for multimedia recommendation. The implementation is released at: https://github.com/HKUDS/MMSSL. Comment: This paper has been published as a full paper at WWW 202
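As a rough illustration of what a cross-modal contrastive objective computes, the sketch below implements a generic InfoNCE loss between two views of the same users. It is not the MMSSL objective from the released repository; the function names, the temperature value and the toy embeddings are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(view_a, view_b, temperature=0.2):
    """Mean InfoNCE loss; row i of view_a and row i of view_b are the positive pair.

    All other rows of view_b act as negatives for row i of view_a.
    """
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(view_a)


# Hypothetical 2-d embeddings of two users under two modalities.
visual_view = [[1.0, 0.0], [0.0, 1.0]]
textual_view_aligned = [[1.0, 0.0], [0.0, 1.0]]
textual_view_swapped = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = info_nce(visual_view, textual_view_aligned)
loss_swapped = info_nce(visual_view, textual_view_swapped)
```

The loss is small when each user's embeddings agree across the two views and large when they are mismatched. Minimising it therefore pulls a user's representations together across modalities and pushes them away from those of other users.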

    Biclustering fMRI time series

    Master's thesis, Data Science, Universidade de Lisboa, Faculdade de Ciências, 2020. Biclustering is an analysis method that seeks to generate clusters by simultaneously considering the rows and columns of a data matrix. The method has been widely explored in the analysis of genetic data. Although several studies recognize its capabilities in other research areas, the last two decades have been marked by a large number of studies applied to genetic data and by the absence of a line of research exploring the capabilities of biclustering outside this traditional area. This thesis follows leads that suggest potential in the use of biclustering on spatiotemporal data. Considering the particular context of neuroscience, it explores the ability of biclustering algorithms to extract knowledge from the time series generated by functional magnetic resonance imaging (fMRI) techniques. The thesis proposes a methodology for evaluating the ability of biclustering algorithms to study fMRI data, considering both synthetic and real data. To evaluate these algorithms, we use internal evaluation metrics. Our results discuss the use of several search strategies, revealing the superiority of exhaustive strategies in obtaining the most homogeneous biclusters. However, the high computational cost of exhaustive strategies remains a challenge, and further research is needed on the efficient search for biclusters in the context of fMRI data analysis. We additionally propose a new methodology for analysing biclusters, based on pattern discovery algorithms, to determine the most frequent patterns present in the generated biclustering solutions. A bicluster is nothing more than a hyperlink in a hypergraph. Extracting frequent patterns from a biclustering solution implies extracting the most significant hyperlinks. In a first approach, this makes it possible to understand relationships between brain regions and to draw temporal profiles that traditional correlation-study methods cannot detect. Additionally, the process of generating the biclusters makes it possible to filter out uninteresting links, potentially allowing hypergraphs to be generated efficiently. The final question is what we can do with this knowledge. Knowing the relationship between brain regions is the central goal of neuroscience. Understanding the connections between brain regions across several subjects makes it possible to draw profiles. In that case, we propose a methodology to extrapolate biclusters to three-dimensional data and perform triclustering. Additionally, understanding the connection between brain areas makes it possible to identify diseases such as schizophrenia, dementia or Alzheimer's. This work points to avenues for using biclustering in the analysis of spatiotemporal data, particularly in neuroscience. The proposed evaluation methodology shows evidence of the effectiveness of biclustering in finding local patterns in fMRI data, although more work is needed regarding scalability to promote application in real scenarios.
The effectiveness of biclustering, the simultaneous clustering of both rows and columns in a data matrix, has been primarily shown in gene expression data analysis. Furthermore, several researchers recognize its potential in other research areas. Nevertheless, the last two decades witnessed many biclustering algorithms targeting gene expression data analysis and a lack of consistent studies exploring the capacities of biclustering outside this traditional application domain. Following hints that suggest potential for biclustering on spatiotemporal data, particularly in neuroscience, this thesis explores biclustering’s capacity to extract knowledge from fMRI time series. 
This thesis proposes a methodology to evaluate the suitability of biclustering algorithms for studying the fMRI signal, considering both synthetic and real-world fMRI datasets. In the absence of ground truth against which to compare bicluster solutions, we used internal evaluation metrics. Results discussing the use of different search strategies showed the superiority of exhaustive approaches, which obtained the most homogeneous biclusters. However, their high computational cost is still a challenge, and further work is needed for the efficient use of biclustering in fMRI data analysis. We propose a new methodology for analyzing biclusters, based on pattern mining algorithms, to determine the most frequent patterns present in the generated biclustering solutions. A bicluster is nothing more than a hyperlink in a hypergraph. Extracting frequent patterns in a biclustering solution implies extracting the most significant hyperlinks. In a first approach, this makes it possible to understand relationships between regions of the brain and to draw temporal profiles that traditional correlation-study methods cannot detect. Additionally, the process of generating biclusters allows uninteresting links to be filtered out, potentially allowing hypergraphs to be generated efficiently. The final question is, what can we do with this knowledge? Knowing the relationship between brain regions is the central objective of neuroscience. Understanding the connections between regions of the brain for various subjects allows one to draw profiles. In this case, we propose a methodology to extrapolate biclusters to three-dimensional data and perform triclustering. Additionally, understanding the link between brain zones allows diseases such as schizophrenia, dementia, or Alzheimer’s to be identified. This work pinpoints avenues for the use of biclustering in spatiotemporal data analysis, in particular for neuroscience applications. 
The proposed evaluation methodology showed evidence of biclustering’s effectiveness in finding local fMRI data patterns, although further work is needed regarding scalability to promote its application in real scenarios.
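The view of a bicluster as a link joining several brain regions in a hypergraph, and of frequent patterns as the region groups that recur across those links, can be sketched as follows. This is only an illustration under stated assumptions: the thesis's actual pattern mining algorithms are not specified here, and the function name, the region labels and the support threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_region_sets(biclusters, size=2, min_support=2):
    """Count region subsets of a given size that recur across biclusters.

    Each bicluster is reduced to the set of regions (rows) it covers, i.e.
    one link of the hypergraph; the time-point dimension is ignored in this
    sketch. Returns {region_tuple: number_of_biclusters_containing_it}.
    """
    counts = Counter()
    for regions in biclusters:
        # Sorting gives every subset one canonical tuple representation.
        for subset in combinations(sorted(set(regions)), size):
            counts[subset] += 1
    return {subset: n for subset, n in counts.items() if n >= min_support}


# Hypothetical biclustering solution over four region labels.
example_biclusters = [
    {"V1", "V2", "M1"},
    {"V1", "V2"},
    {"M1", "PFC"},
]
frequent_pairs = frequent_region_sets(example_biclusters)
# frequent_pairs == {("V1", "V2"): 2}
```

Enumerating every subset is exponential in the subset size, so for larger patterns an Apriori-style search that only extends subsets already known to be frequent would be the usual choice.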