Search CORE

37,522 research outputs found

Toward a multilevel representation of protein molecules: comparative approaches to the aggregation/folding propensity problem

Author: Giuliani Alessandro
Livi Lorenzo
Rizzi Antonello
Publication venue: 'Elsevier BV'
Publication date: 29/04/2015

This paper builds on the fundamental work of Niwa et al. [34], which offers the unique possibility of analyzing the relative aggregation/folding propensity of the entire Escherichia coli (E. coli) proteome in a standardized cell-free microenvironment. The difficulty of the problem stems from the superposition of the driving forces of intra- and inter-molecular interactions, and it is mirrored by evidence that single-point mutations can shift a protein from a folding to an aggregation phenotype [10]. Here we apply several state-of-the-art classification methods from the field of structural pattern recognition to compare different representations of the same proteins taken from the Niwa et al. database; these representations include sequences and labeled (contact) graphs enriched with chemico-physical attributes. This comparison also lets us identify some interesting general properties of proteins. Notably, (i) we suggest a threshold of around 250 residues that separates "easily foldable" from "hardly foldable" molecules, consistent with other independent experiments, and (ii) we highlight the relevance of contact graph spectra for discriminating folding behavior and characterizing the E. coli solubility data. The soundness of the experimental results presented in this paper is proved by the statistically relevant relationships found between the chemico-physical description of proteins and the substitution cost matrix developed for the various discrimination systems.

Comment: 17 pages, 3 figures, 46 references

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

On the Intrinsic Locality Properties of Web Reference Streams

Author: Abrahão Bruno
Almeida Virgílio
Crovella Mark
Fonseca Rodrigo
Publication venue: Boston University Computer Science Department
Publication date: 13/08/2002

There has been considerable work done in the study of Web reference streams: sequences of requests for Web objects. In particular, many studies have looked at the locality properties of such streams, because of the impact of locality on the design and performance of caching and prefetching systems. However, a general framework for understanding why reference streams exhibit given locality properties has not yet emerged. In this work we take a first step in this direction, based on viewing the Web as a set of reference streams that are transformed by Web components (clients, servers, and intermediaries). We propose a graph-based framework for describing this collection of streams and components. We identify three basic stream transformations that occur at nodes of the graph: aggregation, disaggregation and filtering, and we show how these transformations can be used to abstract the effects of different Web components on their associated reference streams. This view allows a structured approach to the analysis of why reference streams show given properties at different points in the Web. Applying this approach to the study of locality requires good metrics for locality. These metrics must meet three criteria: 1) they must accurately capture temporal locality; 2) they must be independent of trace artifacts such as trace length; and 3) they must not involve manual procedures or model-based assumptions. We describe two metrics meeting these criteria that each capture a different kind of temporal locality in reference streams. The popularity component of temporal locality is captured by entropy, while the correlation component is captured by interreference coefficient of variation. We argue that these metrics are more natural and more useful than previously proposed metrics for temporal locality. We use this framework to analyze a diverse set of Web reference traces. 
We find that this framework can shed light on how and why locality properties vary across different locations in the Web topology. For example, we find that filtering and aggregation have opposing effects on the popularity component of temporal locality, which helps to explain why multilevel caching can be effective in the Web. Furthermore, we find that all transformations tend to diminish the correlation component of temporal locality, which has implications for the utility of different cache replacement policies at different points in the Web.

National Science Foundation (ANI-9986397, ANI-0095988); CNPq-Brazil
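The two metrics named in the abstract above can be illustrated with a minimal sketch. It assumes the popularity component is the Shannon entropy of the empirical request distribution and that the correlation component is the coefficient of variation of the gaps between successive references to the same object, pooled over all objects. The pooling choice and the function names `popularity_entropy` and `interreference_cv` are assumptions made for illustration, not the authors' exact definitions.

```python
import math
from collections import Counter


def popularity_entropy(stream):
    """Shannon entropy (in bits) of the empirical object popularity distribution."""
    counts = Counter(stream)
    total = len(stream)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def interreference_cv(stream):
    """Coefficient of variation of gaps between successive references to the same object.

    Gaps are pooled over all objects (an assumption of this sketch).
    """
    last_seen = {}
    gaps = []
    for position, obj in enumerate(stream):
        if obj in last_seen:
            gaps.append(position - last_seen[obj])
        last_seen[obj] = position
    if len(gaps) < 2:
        return float("nan")  # too few repeated references to estimate variability
    mean = sum(gaps) / len(gaps)
    variance = sum((g - mean) ** 2 for g in gaps) / len(gaps)
    return math.sqrt(variance) / mean


# A hypothetical reference stream of object identifiers.
stream = ["a", "b", "a", "c", "a", "b", "d", "a"]
print(popularity_entropy(stream), interreference_cv(stream))
```

A lower entropy indicates that requests concentrate on a few popular objects; a higher coefficient of variation indicates burstier re-references.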

Boston University Institutional Repository (OpenBU)

Analysis of heat kernel highlights the strongly modular and heat-preserving structure of proteins

Author: Giuliani Alessandro
Livi Lorenzo
Maiorino Enrico
Pinna Andrea
Rizzi Antonello
Sadeghian Alireza
Publication venue: 'Elsevier BV'
Publication date: 16/03/2015

In this paper, we study the structure and dynamical properties of protein contact networks with respect to other biological networks, together with simulated archetypal models acting as probes. We consider both classical topological descriptors, such as the modularity and statistics of the shortest paths, and different interpretations in terms of diffusion provided by the discrete heat kernel, which is derived from the normalized graph Laplacian. A principal component analysis shows high discrimination among the network types under both the topological and the heat-kernel-based vector characterizations. Furthermore, a canonical correlation analysis demonstrates strong agreement between those two characterizations, thus providing an important justification for the heat kernel in terms of interpretability. Finally, and most importantly, the focused analysis of the heat kernel yields insights into the fact that proteins have to satisfy specific structural design constraints that the other considered networks do not need to obey. Notably, the heat trace decay of an ensemble of varying-size proteins denotes subdiffusion, a peculiar property of proteins.
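As a rough illustration of the heat kernel mentioned in the abstract above, the following sketch builds the normalized Laplacian L = I - D^{-1/2} A D^{-1/2} of a small undirected graph and approximates H(t) = exp(-tL) with a truncated Taylor series; the heat trace is the sum of the diagonal of H(t). It uses only the standard library and is meant for toy graphs. The function names and the series truncation are assumptions of this sketch, not the authors' implementation.

```python
import math


def normalized_laplacian(adj):
    """L = I - D^{-1/2} A D^{-1/2} for a symmetric 0/1 adjacency matrix without isolated nodes."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    return [[(1.0 if i == j else 0.0) - adj[i][j] / math.sqrt(deg[i] * deg[j])
             for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product for small square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def heat_kernel(adj, t, terms=40):
    """Approximate H(t) = exp(-t L) by summing the first `terms` Taylor terms."""
    lap = normalized_laplacian(adj)
    n = len(lap)
    minus_tl = [[-t * x for x in row] for row in lap]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # holds (-tL)^k / k!
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, minus_tl)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def heat_trace(adj, t):
    """Trace of the heat kernel at time t."""
    h = heat_kernel(adj, t)
    return sum(h[i][i] for i in range(len(h)))


# Path graph on three nodes; its normalized Laplacian has eigenvalues 0, 1 and 2.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(heat_trace(path, 0.5))
```

For real protein contact networks an eigendecomposition from a numerical library would be the more practical route; the series is used here only to keep the example dependency-free.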

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

Proper local scoring rules on discrete sample spaces

Author: Dawid A. Philip
Lauritzen Steffen
Parry Matthew
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2011

A scoring rule is a loss function measuring the quality of a quoted probability distribution $Q$ for a random variable $X$, in the light of the realized outcome $x$ of $X$; it is proper if the expected score, under any distribution $P$ for $X$, is minimized by quoting $Q=P$. Using the fact that any differentiable proper scoring rule on a finite sample space $\mathcal{X}$ is the gradient of a concave homogeneous function, we consider when such a rule can be local in the sense of depending only on the probabilities quoted for points in a nominated neighborhood of $x$. Under mild conditions, we characterize such a proper local scoring rule in terms of a collection of homogeneous functions on the cliques of an undirected graph on the space $\mathcal{X}$. A useful property of such rules is that the quoted distribution $Q$ need only be known up to a scale factor. Examples of the use of such scoring rules include Besag's pseudo-likelihood and Hyvärinen's method of ratio matching.

Comment: Published at http://dx.doi.org/10.1214/12-AOS972 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
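The notion of properness defined in the abstract above can be checked numerically on a small example. The sketch below uses the logarithmic score, a standard proper scoring rule that depends only on the probability quoted for the realized outcome; it is chosen here for simplicity and is not the construction developed in the paper. The distribution `p` and the candidate quotes are hypothetical.

```python
import math


def log_score(q, x):
    """Logarithmic score: loss incurred by quoting distribution q when outcome x occurs."""
    return -math.log(q[x])


def expected_score(p, q, score=log_score):
    """Expected loss of quoting q when outcomes are actually drawn from p."""
    return sum(p[x] * score(q, x) for x in p if p[x] > 0)


# Hypothetical true distribution on a three-point sample space.
p = {"a": 0.5, "b": 0.3, "c": 0.2}

# Candidate quoted distributions; the first one equals p.
candidates = [
    {"a": 0.5, "b": 0.3, "c": 0.2},
    {"a": 0.4, "b": 0.4, "c": 0.2},
    {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3},
]

# Properness: among the candidates, the expected score is smallest for q = p.
best = min(candidates, key=lambda q: expected_score(p, q))
print(best == p)
```

Note that the logarithmic score requires a normalized quote, so it does not exhibit the scale-invariance property highlighted in the abstract.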

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

Designing labeled graph classifiers by exploiting the Rényi entropy of the dissimilarity representation

Author: Livi Lorenzo
Publication venue: 'MDPI AG'
Publication date: 20/04/2017

Representing patterns as labeled graphs is becoming increasingly common in the broad field of computational intelligence. Accordingly, a wide repertoire of pattern recognition tools, such as classifiers and knowledge discovery procedures, is now available and has been tested on various datasets of labeled graphs. However, the design of effective learning procedures operating in the space of labeled graphs is still a challenging problem, especially from the computational complexity viewpoint. In this paper, we present a major improvement of a general-purpose classifier for graphs, which is built on an interplay between dissimilarity representation, clustering, information-theoretic techniques, and evolutionary optimization algorithms. The improvement focuses on a specific key subroutine devised to compress the input data. We prove several theorems that are fundamental to setting the parameters controlling this compression operation. We demonstrate the effectiveness of the resulting classifier by benchmarking the developed variants on well-known datasets of labeled graphs, considering as distinct performance indicators the classification accuracy, the computing time, and the parsimony of the synthesized classification models in terms of structural complexity. The results show state-of-the-art test set accuracy and a considerable speed-up in computing time.

Comment: Revised version
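For readers unfamiliar with the quantity named in the title, the Rényi entropy of order α of a discrete distribution is H_α(p) = (1 / (1 - α)) log Σ_i p_i^α, which tends to the Shannon entropy as α approaches 1. The sketch below computes this textbook definition only; the paper estimates entropy on a dissimilarity representation, which this example does not attempt to reproduce, and the function name is an assumption.

```python
import math


def renyi_entropy(probs, alpha):
    """Rényi entropy (in nats) of order alpha > 0 for a discrete distribution."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    probs = [p for p in probs if p > 0]
    if abs(alpha - 1.0) < 1e-12:
        # Limit alpha -> 1 is the Shannon entropy.
        return -sum(p * math.log(p) for p in probs)
    return math.log(sum(p ** alpha for p in probs)) / (1.0 - alpha)


# A hypothetical distribution; order 2 gives the so-called collision entropy.
dist = [0.5, 0.25, 0.25]
print(renyi_entropy(dist, 2.0), renyi_entropy(dist, 1.0))
```

For a uniform distribution every order gives the same value, log n; for non-uniform distributions the entropy is non-increasing in α.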

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

Characterization of complex networks: A survey of measurements

Author: L. da F. Costa
F. A. Rodrigues
G. Travieso
P. R. Villas Boas
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2005

Each complex network (or class of networks) presents specific topological features which characterize its connectivity and highly influence the dynamics of processes executed on the network. The analysis, discrimination, and synthesis of complex networks therefore rely on the use of measurements capable of expressing the most relevant topological features. This article presents a survey of such measurements. It includes general considerations about complex network characterization, a brief review of the principal models, and the presentation of the main existing measurements. Important related issues covered in this work include the representation of the evolution of complex networks in terms of trajectories in several measurement spaces, the analysis of the correlations between some of the most traditional measurements, perturbation analysis, and the use of multivariate statistics for feature selection and network classification. Depending on the network and the analysis task one has in mind, a specific set of features may be chosen. It is hoped that the present survey will help the proper application and interpretation of measurements.

Comment: A working manuscript with 78 pages, 32 figures. Suggestions of measurements for inclusion are welcomed by the authors
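Two of the most traditional topological measurements, node degree and the local clustering coefficient, can be sketched in a few lines. The example assumes an undirected simple graph stored as a dictionary of neighbor sets; the graph and the function names are hypothetical and serve only to illustrate the kind of measurement the survey covers.

```python
def degrees(adj):
    """Degree of every node in an undirected graph given as {node: set_of_neighbors}."""
    return {node: len(neigh) for node, neigh in adj.items()}


def local_clustering(adj, node):
    """Fraction of pairs of neighbors of `node` that are themselves connected."""
    neigh = adj[node]
    k = len(neigh)
    if k < 2:
        return 0.0  # convention used here: undefined cases count as zero
    links = sum(1 for u in neigh for v in neigh if u < v and v in adj[u])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


# A triangle (0, 1, 2) with a pendant node 3 attached to node 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(degrees(graph), average_clustering(graph))
```

Vectors of such measurements, computed per network, are the kind of feature representation that multivariate statistics can then use for network classification.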

arXiv.org e-Print Archive

CiteSeerX

Crossref