Search CORE

4,144 research outputs found

New error measures and methods for realizing protein graphs from distance data

Author: D'Ambrosio Claudia
Lavor Carlile
Liberti Leo
Maculan Nelson
Vu Ky
Publication venue
Publication date: 04/07/2016
Field of study

The interval Distance Geometry Problem (iDGP) consists in finding a realization in

\mathbb{R}^K

of a simple undirected graph

G=(V,E)

with nonnegative intervals assigned to the edges in such a way that, for each edge, the Euclidean distance between the realization of the adjacent vertices is within the edge interval bounds. In this paper, we focus on the application to the conformation of proteins in space, which is a basic step in determining protein function: given interval estimations of some of the inter-atomic distances, find their shape. Among different families of methods for accomplishing this task, we look at mathematical programming based methods, which are well suited for dealing with intervals. The basic question we want to answer is: what is the best such method for the problem? The most meaningful error measure for evaluating solution quality is the coordinate root mean square deviation. We first introduce a new error measure which addresses a particular feature of protein backbones, i.e. many partial reflections also yield acceptable backbones. We then present a set of new and existing quadratic and semidefinite programming formulations of this problem, and a set of new and existing methods for solving these formulations. Finally, we perform a computational evaluation of all the feasible solver

+

formulation combinations according to new and existing error measures, finding that the best methodology is a new heuristic method based on multiplicative weights updates

arXiv.org e-Print Archive

Repositorio da Producao Cientifica e Intelectual da Unicamp

HAL-Polytechnique

Euclidean distance geometry and applications

Author: Lavor Carlile
Liberti Leo
Maculan Nelson
Mucherino Antonio
Publication venue
Publication date: 02/05/2012
Field of study

Euclidean distance geometry is the study of Euclidean geometry based on the concept of distance. This is useful in several applications where the input data consists of an incomplete set of distances, and the output is a set of points in Euclidean space that realizes the given distances. We survey some of the theory of Euclidean distance geometry and some of the most important applications: molecular conformation, localization of sensor networks and statics.Comment: 64 pages, 21 figure

arXiv.org e-Print Archive

HAL-CentraleSupelec

CiteSeerX

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

INRIA a CCSD electronic archive server

Repositorio da Producao Cientifica e Intelectual da Unicamp

HAL-Polytechnique

HAL-Rennes 1

A Metric for genus-zero surfaces

Author: Hass Joel
Koehl Patrice
Publication venue
Publication date: 02/07/2015
Field of study

We present a new method to compare the shapes of genus-zero surfaces. We introduce a measure of mutual stretching, the symmetric distortion energy, and establish the existence of a conformal diffeomorphism between any two genus-zero surfaces that minimizes this energy. We then prove that the energies of the minimizing diffeomorphisms give a metric on the space of genus-zero Riemannian surfaces. This metric and the corresponding optimal diffeomorphisms are shown to have properties that are highly desirable for applications.Comment: 33 pages, 8 figure

arXiv.org e-Print Archive

eScholarship - University of California

Leveraging Citation Networks to Visualize Scholarly Influence Over Time

Author: Hullman Jessica
Portenoy Jason
West Jevin D.
Publication venue: 'Frontiers Media SA'
Publication date: 05/12/2016
Field of study

Assessing the influence of a scholar's work is an important task for funding organizations, academic departments, and researchers. Common methods, such as measures of citation counts, can ignore much of the nuance and multidimensionality of scholarly influence. We present an approach for generating dynamic visualizations of scholars' careers. This approach uses an animated node-link diagram showing the citation network accumulated around the researcher over the course of the career in concert with key indicators, highlighting influence both within and across fields. We developed our design in collaboration with one funding organization---the Pew Biomedical Scholars program---but the methods are generalizable to visualizations of scholarly influence. We applied the design method to the Microsoft Academic Graph, which includes more than 120 million publications. We validate our abstractions throughout the process through collaboration with the Pew Biomedical Scholars program officers and summative evaluations with their scholars

arXiv.org e-Print Archive

Directory of Open Access Journals

Frontiers - Publisher Connector

kLog: A Language for Logical and Relational Learning with Kernels

Author: Altun
Ando
Antanas
Antanas
Antanas
Argyriou
Blockeel
Blockeel
Bottou
Boulicaut
Bröcheler
Ceroni
Chang
Chang
Cook
Costa
Costa
De
De Grave
De Grave
De Raedt
De Raedt
De Raedt
Dietterich
Dietterich
Evgeniou
Fabrizio Costa
Frasconi
Frasconi
Friedman
Gross
Gärtner
Gärtner
Haussler
Heckerman
Helma
Helma
Horváth
Joachims
Kazius
Kersting
Kersting
Kersting
Kimmig
Koller
Kordjamshidi
Kou
Kramer
Kurt De Grave
Lanckriet
Landwehr
Lao
Lari
London
Lowd
Luc De Raedt
Luks
Macskassy
Mahe
McCallum
McKay
Menchetti
Mitchell
Muggleton
Muggleton
Neville
Ng
Paolo Frasconi
Quinlan
Ralaivola
Richardson
Rizzolo
Rossi
Serebrenik
Shervashidze
Shi
Sorlin
Srinivasan
Srinivasan
Sun
Sutton
Taskar
Taskar
Tsochantaridis
van de Waterbeemd
Vazquez
Verbeke
Verbeke
Vishwanathan
Wachman
Wang
Wolpert
Yan
Publication venue: 'Elsevier BV'
Publication date: 28/07/2014
Field of study

We introduce kLog, a novel approach to statistical relational learning. Unlike standard approaches, kLog does not represent a probability distribution directly. It is rather a language to perform kernel-based learning on expressive logical and relational representations. kLog allows users to specify learning problems declaratively. It builds on simple but powerful concepts: learning from interpretations, entity/relationship data modeling, logic programming, and deductive databases. Access by the kernel to the rich representation is mediated by a technique we call graphicalization: the relational representation is first transformed into a graph --- in particular, a grounded entity/relationship diagram. Subsequently, a choice of graph kernel defines the feature space. kLog supports mixed numerical and symbolic data, as well as background knowledge in the form of Prolog or Datalog programs as in inductive logic programming systems. The kLog framework can be applied to tackle the same range of tasks that has made statistical relational learning so popular, including classification, regression, multitask learning, and collective classification. We also report about empirical comparisons, showing that kLog can be either more accurate, or much faster at the same level of accuracy, than Tilde and Alchemy. kLog is GPLv3 licensed and is available at http://klog.dinfo.unifi.it along with tutorials

arXiv.org e-Print Archive

Lirias

Crossref

Consistency of spectral clustering in stochastic block models

Author: Lei Jing
Rinaldo Alessandro
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 30/12/2014
Field of study

We analyze the performance of spectral clustering for community extraction in stochastic block models. We show that, under mild conditions, spectral clustering applied to the adjacency matrix of the network can consistently recover hidden communities even when the order of the maximum expected degree is as small as

\log n

, with

n

the number of nodes. This result applies to some popular polynomial time spectral clustering algorithms and is further extended to degree corrected stochastic block models using a spherical

k

-median spectral clustering method. A key component of our analysis is a combinatorial bound on the spectrum of binary random matrices, which is sharper than the conventional matrix Bernstein inequality and may be of independent interest.Comment: Published in at http://dx.doi.org/10.1214/14-AOS1274 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

CiteSeerX

Geometric, Feature-based and Graph-based Approaches for the Structural Analysis of Protein Binding Sites : Novel Methods and Computational Analysis

Author: Fober Thomas
Publication venue: Philipps-Universität Marburg
Publication date: 01/01/2013
Field of study

In this thesis, protein binding sites are considered. To enable the extraction of information from the space of protein binding sites, these binding sites must be mapped onto a mathematical space. This can be done by mapping binding sites onto vectors, graphs or point clouds. To finally enable a structure on the mathematical space, a distance measure is required, which is introduced in this thesis. This distance measure eventually can be used to extract information by means of data mining techniques

Publikations- und Dokumentenserver der Universitätsbibliothek Marburg

Graph-Based Approaches to Protein StructureComparison - From Local to Global Similarity

Author: Mernberger Marco
Publication venue: Philipps-Universität Marburg
Publication date: 01/01/2011
Field of study

The comparative analysis of protein structure data is a central aspect of structural bioinformatics. Drawing upon structural information allows the inference of function for unknown proteins even in cases where no apparent homology can be found on the sequence level. Regarding the function of an enzyme, the overall fold topology might less important than the specific structural conformation of the catalytic site or the surface region of a protein, where the interaction with other molecules, such as binding partners, substrates and ligands occurs. Thus, a comparison of these regions is especially interesting for functional inference, since structural constraints imposed by the demands of the catalyzed biochemical function make them more likely to exhibit structural similarity. Moreover, the comparative analysis of protein binding sites is of special interest in pharmaceutical chemistry, in order to predict cross-reactivities and gain a deeper understanding of the catalysis mechanism. From an algorithmic point of view, the comparison of structured data, or, more generally, complex objects, can be attempted based on different methodological principles. Global methods aim at comparing structures as a whole, while local methods transfer the problem to multiple comparisons of local substructures. In the context of protein structure analysis, it is not a priori clear, which strategy is more suitable. In this thesis, several conceptually different algorithmic approaches have been developed, based on local, global and semi-global strategies, for the task of comparing protein structure data, more specifically protein binding pockets. The use of graphs for the modeling of protein structure data has a long standing tradition in structural bioinformatics. Recently, graphs have been used to model the geometric constraints of protein binding sites. The algorithms developed in this thesis are based on this modeling concept, hence, from a computer scientist's point of view, they can also be regarded as global, local and semi-global approaches to graph comparison. The developed algorithms were mainly designed on the premise to allow for a more approximate comparison of protein binding sites, in order to account for the molecular flexibility of the protein structures. A main motivation was to allow for the detection of more remote similarities, which are not apparent by using more rigid methods. Subsequently, the developed approaches were applied to different problems typically encountered in the field of structural bioinformatics in order to assess and compare their performance and suitability for different problems. Each of the approaches developed during this work was capable of improving upon the performance of existing methods in the field. Another major aspect in the experiments was the question, which methodological concept, local, global or a combination of both, offers the most benefits for the specific task of protein binding site comparison, a question that is addressed throughout this thesis

Publikations- und Dokumentenserver der Universitätsbibliothek Marburg