134 research outputs found

Exploiting Latent Features of Text and Graphs

Author: Sybrandt Justin George
Publication venue: Clemson University Libraries
Publication date: 01/05/2020

As the size and scope of online data continue to grow, new machine learning techniques become necessary to best capitalize on the wealth of available information. However, the models that help convert data into knowledge require nontrivial processes to make sense of large collections of text and massive online graphs. In both scenarios, modern machine learning pipelines produce embeddings (semantically rich vectors of latent features) to convert human constructs for machine understanding. In this dissertation we focus on information available within biomedical science, including human-written abstracts of scientific papers as well as machine-generated graphs of biomedical entity relationships. We present the Moliere system and our method for identifying new discoveries through the use of natural language processing and graph mining algorithms. We propose heuristically-based ranking criteria to augment Moliere, and leverage this ranking to identify a new gene-treatment target for HIV-associated Neurodegenerative Disorders. We additionally focus on the latent features of graphs and propose a new bipartite graph embedding technique. Using our graph embedding, we advance the state of the art in hypergraph partitioning quality. Having newfound intuition of graph embeddings, we present Agatha, a deep-learning approach to hypothesis generation. This system learns a data-driven ranking criterion derived from the embeddings of our large proposed biomedical semantic graph. To produce human-readable results, we additionally propose CBAG, a technique for conditional biomedical abstract generation.

Clemson University: TigerPrints

Hypergraph models of biological networks to identify genes critical to pathogenic viral response

Author: Diamond Michael S
Feng Song
Tan Qing
Thackray Larissa B
et al
Publication venue: Digital Commons@Becker
Publication date: 29/05/2021

BACKGROUND: Representing biological networks as graphs is a powerful approach to reveal underlying patterns, signatures, and critical components from high-throughput biomolecular data. However, graphs do not natively capture the multi-way relationships present among genes and proteins in biological systems. Hypergraphs are generalizations of graphs that naturally model multi-way relationships and have shown promise in modeling systems such as protein complexes and metabolic reactions. In this paper we seek to understand how hypergraphs can more faithfully identify, and potentially predict, important genes based on complex relationships inferred from genomic expression data sets. RESULTS: We compiled a novel data set of transcriptional host response to pathogenic viral infections and formulated relationships between genes as a hypergraph where hyperedges represent significantly perturbed genes, and vertices represent individual biological samples with specific experimental conditions. We find that hypergraph betweenness centrality is a superior method for identification of genes important to viral response when compared with graph centrality. CONCLUSIONS: Our results demonstrate the utility of using hypergraphs to represent complex biological systems and highlight central important responses common to a variety of highly pathogenic viruses.

Digital Commons@Becker

arXiv.org e-Print Archive

PubMed Central

Carolina Digital Repository

Recommender Systems

Author: Chi Ho Yeung
Linyuan Lü
Matúš Medo
Tao Zhou
Yi-Cheng Zhang
Zi-Ke Zhang
Publication venue: 'Elsevier BV'
Publication date: 06/02/2012

The ongoing rapid expansion of the Internet greatly increases the need for effective recommender systems to filter the abundant information. Extensive research on recommender systems is conducted by a broad range of communities including social and computer scientists, physicists, and interdisciplinary researchers. Despite substantial theoretical and practical achievements, unification and comparison of different approaches are lacking, which impedes further advances. In this article, we review recent developments in recommender systems and discuss the major challenges. We compare and evaluate available algorithms and examine their roles in future developments. In addition to algorithms, physical aspects are described to illustrate the macroscopic behavior of recommender systems. Potential impacts and future directions are discussed. We emphasize that recommendation has great scientific depth and combines diverse research fields, which makes it of interest to physicists as well as interdisciplinary researchers. Comment: 97 pages, 20 figures (to appear in Physics Reports)

arXiv.org e-Print Archive

Crossref

Aston Publications Explorer

RERO DOC Digital Library

Multi-layered HITS on Multi-sourced Networks

Publication date: 01/01/2018

Network mining has been attracting a lot of research attention because of the prevalence of networks. As the world is becoming increasingly connected and correlated, networks arising from inter-dependent application domains are often collected from different sources, forming the so-called multi-sourced networks. Examples of such multi-sourced networks include critical infrastructure networks, multi-platform social networks, cross-domain collaboration networks, and many more. Compared with single-sourced networks, multi-sourced networks bear more complex structures and therefore could potentially contain more valuable information. This thesis proposes a multi-layered HITS (Hyperlink-Induced Topic Search) algorithm to perform the ranking task on multi-sourced networks. Specifically, each node in the network receives an authority score and a hub score, which evaluate the value of the node itself and the value of its outgoing links, respectively. Based on a recent multi-layered network model, which allows a more flexible dependency structure across different sources (i.e., layers), the proposed algorithm leverages both within-layer smoothness and cross-layer consistency. This essentially allows nodes from different layers to be ranked accordingly. The multi-layered HITS is formulated as a regularized optimization problem with a non-negativity constraint and solved by an iterative update process. Extensive experimental evaluations demonstrate the effectiveness and explainability of the proposed algorithm. Dissertation/Thesis, Masters Thesis, Computer Science 201

ASU Digital Repository
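
The authority and hub scores mentioned in the abstract above can be illustrated with the classic single-layer HITS iteration. The sketch below is not the multi-layered, regularized formulation proposed in the thesis; it is plain HITS on one directed graph, and the function names, the toy edge list, and the iteration count are assumptions chosen for illustration.

```python
import math


def _normalize(scores):
    """Scale scores to unit Euclidean length (no-op if all scores are zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {node: v / norm for node, v in scores.items()}


def hits(edges, iterations=50):
    """Classic single-layer HITS on a list of directed (source, target) edges.

    Returns (authority, hub) dictionaries keyed by node.
    """
    nodes = sorted({node for edge in edges for node in edge})
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority of a node: sum of hub scores of nodes that link to it.
        new_auth = {node: 0.0 for node in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        # Hub score of a node: sum of authority scores of nodes it links to.
        new_hub = {node: 0.0 for node in nodes}
        for src, dst in edges:
            new_hub[src] += new_auth[dst]
        auth = _normalize(new_auth)
        hub = _normalize(new_hub)
    return auth, hub


# Hypothetical toy graph: "a" and "b" both point to "c", and "c" points to "d".
edges = [("a", "c"), ("b", "c"), ("c", "d")]
auth, hub = hits(edges)
```

On this toy graph, "c" ends up with the highest authority score because two nodes link to it, while "a" and "b" share the highest hub score because they point to that authority.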

Neighborhood based computational approaches for the prediction of lncRNA-disease associations

Author: Bonomo M.
Rombo S. E.
Publication venue: BioMed Central Ltd
Publication date: 13/05/2024

Motivation: Long non-coding RNAs (lncRNAs) are a class of molecules involved in important biological processes. Extensive efforts have been made to gain a deeper understanding of disease mechanisms at the lncRNA level, guiding the detection of biomarkers for disease diagnosis, treatment, prognosis and prevention. Unfortunately, due to cost and time constraints, the number of possible disease-related lncRNAs verified by traditional biological experiments is very limited. Computational approaches for the prediction of disease-lncRNA associations make it possible to identify the most promising candidates to be verified in the laboratory, reducing cost and time. Results: We propose novel approaches for the prediction of lncRNA-disease associations, all sharing the idea of exploring associations among lncRNAs, other intermediate molecules (e.g., miRNAs) and diseases, suitably represented by tripartite graphs. Indeed, while only a few lncRNA-disease associations are known so far, plenty of interactions between lncRNAs and other molecules, as well as associations of the latter with diseases, are available. A first approach presented here, NGH, relies on neighborhood analysis performed on a tripartite graph built upon lncRNAs, miRNAs and diseases. A second approach (CF) relies on collaborative filtering; a third approach (NGH-CF) is obtained by boosting NGH with collaborative filtering. The proposed approaches have been validated on both synthetic and real data, and compared against other methods from the literature. The results show that neighborhood analysis outperforms competitors, and that when it is combined with collaborative filtering the prediction accuracy improves further, reaching an AUC of 0.966.

Archivio istituzionale della ricerca - Università di Palermo
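
The idea of scoring lncRNA-disease pairs through shared intermediate molecules, as described in the abstract above, can be sketched with a simple neighborhood overlap. This is not the authors' NGH method, whose exact scoring is not given here; it is a simplified illustration in which a pair is scored by the fraction of the lncRNA's miRNA neighbours that are also linked to the disease. All identifiers and the toy data are hypothetical.

```python
def neighborhood_scores(lnc_to_mirnas, disease_to_mirnas):
    """Score (lncRNA, disease) pairs by shared miRNA neighbours.

    Both arguments map a name to the set of miRNAs it is linked to.
    The score is |shared miRNAs| / |miRNAs of the lncRNA|; pairs with
    no shared miRNA are omitted from the result.
    """
    scores = {}
    for lnc, lnc_mirnas in lnc_to_mirnas.items():
        if not lnc_mirnas:
            continue  # an lncRNA with no neighbours cannot be scored
        for disease, disease_mirnas in disease_to_mirnas.items():
            shared = lnc_mirnas & disease_mirnas
            if shared:
                scores[(lnc, disease)] = len(shared) / len(lnc_mirnas)
    return scores


# Hypothetical toy tripartite graph (names are placeholders, not real data).
lnc_to_mirnas = {
    "lncA": {"miR-1", "miR-2"},
    "lncB": {"miR-3"},
}
disease_to_mirnas = {
    "disease X": {"miR-1", "miR-2", "miR-3"},
    "disease Y": {"miR-2"},
}
scores = neighborhood_scores(lnc_to_mirnas, disease_to_mirnas)

# Candidate pairs ranked from most to least promising under this heuristic.
ranked = sorted(scores, key=scores.get, reverse=True)
```

A ranking such as `ranked` is the kind of output that would then be compared against known associations, for example with an AUC measure.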

Descoberta de conhecimento biomédico através de representações continuas de grafos multi-relacionais (Biomedical knowledge discovery through continuous representations of multi-relational graphs)

Author: Pereira Rodrigo Amaral Ribeiro
Publication date: 17/02/2021

Knowledge graphs are multi-relational graph structures that organize data in a way that is not only queryable but also allows the inference of implicit knowledge by both humans and, particularly, machines. In recent years new methods have been developed to maximize the knowledge that can be extracted from these structures, especially in the machine learning field. Knowledge graph embedding (KGE) strategies map the data of these graphs to a lower-dimensional space to facilitate downstream tasks such as link prediction or node classification. In this work the capabilities and limitations of using these techniques to derive new knowledge from pre-existing biomedical networks were explored, since this is a field that has not only seen efforts towards converting its large knowledge bases into knowledge graphs, but can also use the predictive capabilities of these models to accelerate research. To do so, several KGE models were studied and a pipeline was created to obtain and train such models on different biomedical datasets. The results show that these models can make accurate predictions on some datasets, but that their performance can be hampered by some inherent characteristics of the networks. Additionally, with the knowledge acquired during this research, a notebook was created that aims to be an entry point for other researchers interested in exploring this field.

Knowledge graphs are multi-relational graphs that organize information so that it can not only be queried but also supports the logical inference of new information by humans and, especially, computational systems. Recently, several methods have been created to maximize the information that can be extracted from these structures, with machine learning being one of the main drivers. Knowledge graph embeddings (KGE) allow the components of these graphs to be mapped into a latent space, facilitating tasks such as predicting new links in the graph or classifying nodes. This work explored the capabilities and limitations of applying KGE-based models to existing biomedical networks, since biomedicine is a field in which efforts have been made to organize its vast knowledge base into knowledge graphs, and in which this predictive capability can be used to drive advances across its domains. To that end, several models were studied and a pipeline was created to train them on some biomedical networks. The results show that these models can indeed be accurate at link prediction on some datasets, although this accuracy appears to be affected by characteristics inherent to the graph structure. Additionally, with the knowledge acquired during this work, a notebook was created that aims to serve as an introduction to knowledge graph embeddings for researchers interested in exploring the area.
Master's in Computer and Telematics Engineering

Repositório Institucional da Universidade de Aveiro
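
To make the link-prediction task in the abstract above concrete, the sketch below scores candidate triples with TransE, one well-known KGE model. The abstract does not say which models the thesis studied, so the choice of TransE is an assumption, and the two-dimensional embeddings are hand-written toy values rather than trained ones.

```python
import math


def transe_score(head, relation, tail):
    """TransE plausibility score: negative L2 distance between head + relation and tail.

    Scores closer to zero indicate a more plausible (head, relation, tail) triple.
    """
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def rank_tails(head, relation, candidates):
    """Return candidate tail names ordered from most to least plausible."""
    return sorted(
        candidates,
        key=lambda name: transe_score(head, relation, candidates[name]),
        reverse=True,
    )


# Hypothetical toy embeddings (not learned from any dataset).
drug = [0.1, 0.2]
treats = [0.4, 0.1]
candidates = {
    "disease_a": [0.5, 0.3],  # close to drug + treats
    "disease_b": [0.9, 0.9],  # far from drug + treats
}
ranking = rank_tails(drug, treats, candidates)
```

In a real pipeline the embeddings would be learned by minimizing a loss over observed triples and sampled negative triples; only the scoring and ranking step is shown here.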

Multiview physician-specific attributes fusion for health seeking

Author: Chang Xiaojun
Liu Maofu
Nie Liqiang
Shao Ling
Yan Yan
Zhang Luming
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/06/2016

Community-based health services have risen as important online resources for resolving users' health concerns. Despite their value, the gap between what health seekers with specific health needs require and what busy physicians with specific attitudes and expertise can offer is widening. To bridge this gap, we present a question routing scheme that is able to connect health seekers to the right physicians. In this scheme, we first bridge the expertise matching gap via a probabilistic fusion of the physician-expertise distribution and the expertise-question distribution. The distributions are calculated by hypergraph-based learning and kernel density estimation. We then measure physicians' attitudes toward answering general questions from the perspectives of activity, responsibility, reputation, and willingness. Finally, we adaptively fuse the expertise modeling and attitude modeling by considering the personal needs of the health seekers. Extensive experiments have been conducted on a real-world dataset to validate our proposed scheme.

Crossref

OPUS - University of Technology Sydney

University of East Anglia digital repository