Search CORE

18 research outputs found

An Introductory Guide to Aligning Networks Using SANA, the Simulated Annealing Network Aligner.

Author: A Chatr-Aryamontri
A Hasan
A Lesk
BH Junker
C Clark
C Mering Von
C Yang
CG El Van
D Davis
DM Prescott
EH Davidson
F Alkan
FE Faisal
FE Faisal
HT Phan
J Crawford
K Chen
K Mehlhorn
KI Smith
M Ashburner
M El-Kebir
M Kotlyar
M Malek
M Milano
M Vidal
MP Williamson
MR Garey
N Malod-Dognin
N Pržulj
N Yaveroğlu
O Fiehn
O Kuchaiev
O Kuchaiev
O Sporns
R Jaenicke
S Hashemifar
SA Cook
SF Altschul
SJ Larsen
T Hočevar
T Milenković
T Milenković
T Tokar
TA Farazi
V Saraph
V Vijayan
Publication venue: eScholarship, University of California
Publication date: 22/11/2019
Field of study

Sequence alignment has had an enormous impact on our understanding of biology, evolution, and disease. The alignment of biological networks holds similar promise. Biological networks generally model interactions between biomolecules such as proteins, genes, metabolites, or mRNAs. There is strong evidence that the network topology-the "structure" of the network-is correlated with the functions performed, so that network topology can be used to help predict or understand function. However, unlike sequence comparison and alignment-which is an essentially solved problem-network comparison and alignment is an NP-complete problem for which heuristic algorithms must be used.Here we introduce SANA, the Simulated Annealing Network Aligner. SANA is one of many algorithms proposed for the arena of biological network alignment. In the context of global network alignment, SANA stands out for its speed, memory efficiency, ease-of-use, and flexibility in the arena of producing alignments between two or more networks. SANA produces better alignments in minutes on a laptop than most other algorithms can produce in hours or days of CPU time on large server-class machines. We walk the user through how to use SANA for several types of biomolecular networks

arXiv.org e-Print Archive

Crossref

Ezid

eScholarship - University of California

An introductory guide to aligning networks using SANA, the Simulated Annealing Network Aligner

Author: A Chatr-Aryamontri
A Hasan
A Lesk
BH Junker
C Clark
C Mering Von
C Yang
CG El Van
D Davis
DM Prescott
EH Davidson
F Alkan
FE Faisal
FE Faisal
HT Phan
J Crawford
K Chen
K Mehlhorn
KI Smith
M Ashburner
M El-Kebir
M Kotlyar
M Malek
M Milano
M Vidal
MP Williamson
MR Garey
N Malod-Dognin
N Pržulj
N Yaveroğlu
O Fiehn
O Kuchaiev
O Kuchaiev
O Sporns
R Jaenicke
S Hashemifar
SA Cook
SF Altschul
SJ Larsen
T Hočevar
T Milenković
T Milenković
T Tokar
TA Farazi
V Saraph
V Vijayan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 22/11/2019
Field of study

Sequence alignment has had an enormous impact on our understanding of biology, evolution, and disease. The alignment of biological {\em networks} holds similar promise. Biological networks generally model interactions between biomolecules such as proteins, genes, metabolites, or mRNAs. There is strong evidence that the network topology -- the "structure" of the network -- is correlated with the functions performed, so that network topology can be used to help predict or understand function. However, unlike sequence comparison and alignment -- which is an essentially solved problem -- network comparison and alignment is an NP-complete problem for which heuristic algorithms must be used. Here we introduce SANA, the {\it Simulated Annealing Network Aligner}. SANA is one of many algorithms proposed for the arena of biological network alignment. In the context of global network alignment, SANA stands out for its speed, memory efficiency, ease-of-use, and flexibility in the arena of producing alignments between 2 or more networks. SANA produces better alignments in minutes on a laptop than most other algorithms can produce in hours or days of CPU time on large server-class machines. We walk the user through how to use SANA for several types of biomolecular networks. Availability: https://github.com/waynebhayes/SAN

arXiv.org e-Print Archive

Crossref

Big networks : a survey

Author: Bedru Hayat
Xia Feng
Xiao Xinru
Yu Shuo
Zhang Da
Publication venue: 'Elsevier BV'
Publication date: 01/01/2020
Field of study

A network is a typical expressive form of representing complex systems in terms of vertices and links, in which the pattern of interactions amongst components of the network is intricate. The network can be static that does not change over time or dynamic that evolves through time. The complication of network analysis is different under the new circumstance of network size explosive increasing. In this paper, we introduce a new network science concept called a big network. A big networks is generally in large-scale with a complicated and higher-order inner structure. This paper proposes a guideline framework that gives an insight into the major topics in the area of network science from the viewpoint of a big network. We first introduce the structural characteristics of big networks from three levels, which are micro-level, meso-level, and macro-level. We then discuss some state-of-the-art advanced topics of big network analysis. Big network models and related approaches, including ranking methods, partition approaches, as well as network embedding algorithms are systematically introduced. Some typical applications in big networks are then reviewed, such as community detection, link prediction, recommendation, etc. Moreover, we also pinpoint some critical open issues that need to be investigated further. © 2020 Elsevier Inc

Federation ResearchOnline

Large-scale automated protein function prediction

Author: Kahanda Indika
Publication venue: Colorado State University. Libraries
Publication date: 01/01/2016
Field of study

Includes bibliographical references.2016 Summer.Proteins are the workhorses of life, and identifying their functions is a very important biological problem. The function of a protein can be loosely defined as everything it performs or happens to it. The Gene Ontology (GO) is a structured vocabulary which captures protein function in a hierarchical manner and contains thousands of terms. Through various wet-lab experiments over the years scientists have been able to annotate a large number of proteins with GO categories which reflect their functionality. However, experimentally determining protein functions is a highly resource-intensive task, and a large fraction of proteins remain un-annotated. Recently a plethora automated methods have emerged and their reasonable success in computationally determining the functions of proteins using a variety of data sources – by sequence/structure similarity or using various biological network data, has led to establishing automated function prediction (AFP) as an important problem in bioinformatics. In a typical machine learning problem, cross-validation is the protocol of choice for evaluating the accuracy of a classifier. But, due to the process of accumulation of annotations over time, we identify the AFP as a combination of two sub-tasks: making predictions on annotated proteins and making predictions on previously unannotated proteins. In our first project, we analyze the performance of several protein function prediction methods in these two scenarios. Our results show that GOstruct, an AFP method that our lab has previously developed, and two other popular methods: binary SVMs and guilt by association, find it hard to achieve the same level of accuracy on these two tasks compared to the performance evaluated through cross-validation, and that predicting novel annotations for previously annotated proteins is a harder problem than predicting annotations for uncharacterized proteins. We develop GOstruct 2.0 by proposing improvements which allows the model to make use of information of a protein's current annotations to better handle the task of predicting novel annotations for previously annotated proteins. Experimental results on yeast and human data show that GOstruct 2.0 outperforms the original GOstruct, demonstrating the effectiveness of the proposed improvements. Although the biomedical literature is a very informative resource for identifying protein function, most AFP methods do not take advantage of the large amount of information contained in it. In our second project, we conduct the first ever comprehensive evaluation on the effectiveness of literature data for AFP. Specifically, we extract co-mentions of protein-GO term pairs and bag-of-words features from the literature and explore their effectiveness in predicting protein function. Our results show that literature features are very informative of protein function but with further room for improvement. In order to improve the quality of automatically extracted co-mentions, we formulate the classification of co-mentions as a supervised learning problem and propose a novel method based on graph kernels. Experimental results indicate the feasibility of using this co-mention classifier as a complementary method that aids the bio-curators who are responsible for maintaining databases such as Gene Ontology. This is the first study of the problem of protein-function relation extraction from biomedical text. The recently developed human phenotype ontology (HPO), which is very similar to GO, is a standardized vocabulary for describing the phenotype abnormalities associated with human diseases. At present, only a small fraction of human protein coding genes have HPO annotations. But, researchers believe that a large portion of currently unannotated genes are related to disease phenotypes. Therefore, it is important to predict gene-HPO term associations using accurate computational methods. In our third project, we introduce PHENOstruct, a computational method that directly predicts the set of HPO terms for a given gene. We compare PHENOstruct with several baseline methods and show that it outperforms them in every respect. Furthermore, we highlight a collection of informative data sources suitable for the problem of predicting gene-HPO associations, including large scale literature mining data

Mountain Scholar (Digital Collections of Colorado and Wyoming)

Streaming Social Event Detection and Evolution Discovery in Heterogeneous Information Networks

Author: He L
Li J
Peng H
Ranjan R
Song Y
Yang R
Yu PS
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/04/2021
Field of study

Events are happening in real world and real time, which can be planned and organized for occasions, such as social gatherings, festival celebrations, influential meetings, or sports activities. Social media platforms generate a lot of real-time text information regarding public events with different topics. However, mining social events is challenging because events typically exhibit heterogeneous texture and metadata are often ambiguous. In this article, we first design a novel event-based meta-schema to characterize the semantic relatedness of social events and then build an event-based heterogeneous information network (HIN) integrating information from external knowledge base. Second, we propose a novel Pairwise Popularity Graph Convolutional Network, named as PP-GCN, based on weighted meta-path instance similarity and textual semantic representation as inputs, to perform fine-grained social event categorization and learn the optimal weights of meta-paths in different tasks. Third, we propose a streaming social event detection and evolution discovery framework for HINs based on meta-path similarity search, historical information about meta-paths, and heterogeneous DBSCAN clustering method. Comprehensive experiments on real-world streaming social text data are conducted to compare various social event detection and evolution discovery algorithms. Experimental results demonstrate that our proposed framework outperforms other alternative social event detection and evolution discovery techniques

arXiv.org e-Print Archive

White Rose Research Online

Sparsity-aware neural user behavior modeling in online interaction platforms

Author: Sankar Aravind
Publication venue
Publication date: 01/12/2021
Field of study

Modern online platforms offer users an opportunity to participate in a variety of content-creation, social networking, and shopping activities. With the rapid proliferation of such online services, learning data-driven user behavior models is indispensable to enable personalized user experiences. Recently, representation learning has emerged as an effective strategy for user modeling, powered by neural networks trained over large volumes of interaction data. Despite their enormous potential, we encounter the unique challenge of data sparsity for a vast majority of entities, e.g., sparsity in ground-truth labels for entities and in entity-level interactions (cold-start users, items in the long-tail, and ephemeral groups). In this dissertation, we develop generalizable neural representation learning frameworks for user behavior modeling designed to address different sparsity challenges across applications. Our problem settings span transductive and inductive learning scenarios, where transductive learning models entities seen during training and inductive learning targets entities that are only observed during inference. We leverage different facets of information reflecting user behavior (e.g., interconnectivity in social networks, temporal and attributed interaction information) to enable personalized inference at scale. Our proposed models are complementary to concurrent advances in neural architectural choices and are adaptive to the rapid addition of new applications in online platforms. First, we examine two transductive learning settings: inference and recommendation in graph-structured and bipartite user-item interactions. In chapter 3, we formulate user profiling in social platforms as semi-supervised learning over graphs given sparse ground-truth labels for node attributes. We present a graph neural network framework that exploits higher-order connectivity structures (network motifs) to learn attributed structural roles of nodes that identify structurally similar nodes with co-varying local attributes. In chapter 4, we design neural collaborative filtering models for few-shot recommendations over user-item interactions. To address item interaction sparsity due to heavy-tailed distributions, our proposed meta-learning framework learns-to-recommend few-shot items by knowledge transfer from arbitrary base recommenders. We show that our framework consistently outperforms state-of-art approaches on overall recommendation (by 5% Recall) while achieving significant gains (of 60-80% Recall) for tail items with fewer than 20 interactions. Next, we explored three inductive learning settings: modeling spread of user-generated content in social networks; item recommendations for ephemeral groups; and friend ranking in large-scale social platforms. In chapter 5, we focus on diffusion prediction in social networks where a vast population of users rarely post content. We introduce a deep generative modeling framework that models users as probability distributions in the latent space with variational priors parameterized by graph neural networks. Our approach enables massive performance gains (over 150% recall) for users with sparse activities while being faster than state-of-the-art neural models by an order of magnitude. In chapter 6, we examine item recommendations for ephemeral groups with limited or no historical interactions together. To overcome group interaction sparsity, we present self-supervised learning strategies that exploit the preference co-variance in observed group memberships for group recommender training. Our framework achieves significant performance gains (over 30% NDCG) over prior state-of-the-art group recommendation models. In chapter 7, we introduce multi-modal inference with graph neural networks that captures knowledge from multiple feature modalities and user interactions for multi-faceted friend ranking. Our approach achieves notable higher performance gains for critical populations of less-active and low degree users

Illinois Digital Environment for Access to Learning and Scholarship Repository

Automatic & Semi-Automatic Methods for Supporting Ontology Change

Author: Lösch Uta
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2012
Field of study

KITopen