Interpretable Network Representations

Author: Jin Shengmin
Publication venue: SURFACE at Syracuse University
Publication date: 16/12/2022

Networks (or, interchangeably, graphs) are ubiquitous across the globe and within science and engineering: social networks, collaboration networks, protein-protein interaction networks, and infrastructure networks, among many others. Machine learning on graphs, especially network representation learning, has shown remarkable performance in network-based applications such as node/graph classification, graph clustering, and link prediction. Just as important as performance is the ability to understand the behavior of machine learning models and to explain how they arrive at a given decision. Such needs have motivated many studies on interpretability in machine learning. For example, in social network analysis, we may need to know why certain users (or groups) are classified or clustered together by a model, or why a friend recommendation system considers some users similar enough to recommend that they connect with each other. Therefore, an interpretable network representation is necessary, and it should carry the graph information at a level understandable by humans. Here, we first introduce our method for interpretable network representations: the network shape. It provides a framework to represent a network with a 3-dimensional shape, and one can customize network shapes for one's needs by choosing among various graph sampling methods, 3D network embedding methods, and shape-fitting methods. In this thesis, we introduce two types of network shape: a Kronecker hull, which represents a network as a 3D convex polyhedron using stochastic Kronecker graphs as the network embedding method, and a Spectral Path, which represents a network as a 3D path connecting the spectral moments of the network and its subgraphs. We demonstrate that network shapes can capture various properties of not only the network but also its subgraphs.
For instance, they can provide the distribution of subgraphs within a network, e.g., what proportion of subgraphs are structurally similar to the whole network? Network shapes are interpretable on different levels, so one can quickly understand the structural properties of a network and its subgraphs from its network shape. Using experiments on real-world networks, we demonstrate that network shapes can be used in various applications, including (1) network visualization, the most intuitive way for users to understand a graph; (2) network categorization (e.g., is this a social or a biological network?); and (3) computing similarity between two graphs. Moreover, we utilize network shapes to extend biometrics studies to network data by solving two problems: network identification (given an anonymized graph, can we identify the network from which it was collected, i.e., answer questions such as "where is this anonymized graph sampled from, Twitter or Facebook?") and network authentication (if one claims the graph is sampled from a certain network, can we verify this claim?). The overall objective of the thesis is to provide a compact, interpretable, visualizable, comparable, and efficient representation of networks.
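
As an illustrative aside on the Spectral Path idea mentioned in the abstract, the sketch below computes low-order spectral moments of a single graph. The abstract does not define the moments, so this assumes the k-th moment is trace(P^k)/n for the random-walk transition matrix P = D^-1 A (the average probability that a k-step random walk returns to its starting node). The function name `spectral_moments`, the choice of orders (2, 3, 4), and the use of NumPy are all assumptions made for illustration, not the thesis's implementation.

```python
import numpy as np


def spectral_moments(adj, orders=(2, 3, 4)):
    """Return trace(P^k) / n for each k in `orders` (hypothetical helper).

    Assumption: P = D^-1 A is the random-walk transition matrix of an
    undirected graph given by the dense adjacency matrix `adj`.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    if np.any(deg == 0):
        # Rows of P are undefined for isolated nodes; this sketch rejects them.
        raise ValueError("isolated nodes are not supported in this sketch")
    P = adj / deg[:, None]  # divide each row by that node's degree
    return tuple(
        float(np.trace(np.linalg.matrix_power(P, k))) / n for k in orders
    )


# Example: a triangle (complete graph on 3 nodes) becomes one 3D point.
triangle = np.ones((3, 3)) - np.eye(3)
point = spectral_moments(triangle)
print(point)
```

Under these assumptions, the triangle maps to the point (0.5, 0.25, 0.375). A path in the sense of the abstract would connect such points computed for the network and for its sampled subgraphs.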

Syracuse University Research Facility and Collaborative Environment

Physics-based visual characterization of molecular interaction forces

Author: Estrada Jorge
Guallar Víctor
Hermosilla Pedro
Ropinski Timo
Vinacua Pla Álvaro
Vázquez Alcocer Pere Pau
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017

Molecular simulations are used in many areas of biotechnology, such as drug design and enzyme engineering. Despite the development of automatic computational protocols, analysis of molecular interactions is still a major aspect where human comprehension and intuition are key to accelerate, analyze, and propose modifications to the molecule of interest. Most visualization algorithms help the users by providing an accurate depiction of the spatial arrangement: the atoms involved in inter-molecular contacts. There are few tools that provide visual information on the forces governing molecular docking. However, these tools, commonly restricted to close interaction between atoms, do not consider whole simulation paths, long-range distances and, importantly, do not provide visual cues for a quick and intuitive comprehension of the energy functions (modeling intermolecular interactions) involved. In this paper, we propose visualizations designed to enable the characterization of interaction forces by taking into account several relevant variables such as molecule-ligand distance and the energy function, which is essential to understand binding affinities. We put emphasis on mapping molecular docking paths obtained from Molecular Dynamics or Monte Carlo simulations, and provide time-dependent visualizations for different energy components and particle resolutions: atoms, groups or residues. The presented visualizations have the potential to support domain experts in a more efficient drug or enzyme design process.
Peer reviewed. Postprint (author's final draft).

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Automated protein structure calculation from NMR data

Author: Craven C.J.
Williamson M.P.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/01/2009

Current software is almost at the stage to permit completely automatic structure determination of small proteins of < 15 kDa, from NMR spectra to structure validation, with minimal user interaction. This goal is welcome, as it makes structure calculation more objective and therefore more easily validated, without any loss in the quality of the structures generated. Moreover, it releases expert spectroscopists to carry out research that cannot be automated. It should not take much further effort to extend automation to ca. 20 kDa. However, there are technological barriers to further automation, of which the biggest are identified as: routines for peak picking; adoption and sharing of a common framework for structure calculation, including the assembly of an automated and trusted package for structure validation; and sample preparation, particularly for larger proteins. These barriers should be the main target for development of methodology for protein structure determination, particularly by structural genomics consortia.

White Rose Research Online