2,195 research outputs found

Inference of hidden structures in complex physical systems by multi-scale clustering

Author: A Cardillo
A Lancichinetti
A Montanari
AA Abin
Andrea Montanari
C Dasgupta
CA Angell
CL Henley
D Hu
DA Keen
Dandan Hu
DJ Sordelet
DS Bassett
F Cerina
G Bianconi
G Petri
G Tarjus
Greg Ver Steeg
H Dandan
HW Sheng
J Reichardt
J Saida
J Villain
J-P Bouchaud
J. Dana. Honeycutt
JL Finney
JM Kumpula
L Berthier
L Wang
M Meil
M Mezard
M Mitchell
M Mosayebi
M Rosvall
Manlio De Domenico
MEJ Newman
O Melchert
P Holme
P Ronhovde
PG Wolynes
PJ Steinhardt
R Monasson
Richard K. Darst
RK Darst
RL McGreevy
S Fortunato
S Karmakar
S Wiseman
S. Kirkpatrick
SY Wang
T Nakamura
TR Kirkpatrick
V Gudkov
V Lubchenko
VD Blondel
W Kob
WH Zachariasen
Z Nussinov
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/01/2016
Field of study

We survey the application of a relatively new branch of statistical physics, "community detection", to data mining. In particular, we focus on the diagnosis of materials and automated image segmentation. Community detection describes the quest of partitioning a complex system involving many elements into optimally decoupled subsets or communities of such elements. We review a multiresolution variant which is used to ascertain structures at different spatial and temporal scales. Significant patterns are obtained by examining the correlations between different independent solvers. Similar to other combinatorial optimization problems in the NP complexity class, community detection exhibits several phases. Typically, illuminating orders are revealed by choosing parameters that lead to extremal information theory correlations. Comment: 25 pages, 16 figures; a review of earlier work.
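The abstract mentions comparing independent solvers through information-theoretic correlations. One common measure of agreement between two partitions is normalized mutual information (NMI); whether the survey uses exactly this normalization is an assumption here. The sketch below, with hypothetical function names, computes NMI for a pair of label lists and averages it over all pairs of solver outputs.

```python
from collections import Counter
from itertools import combinations
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a partition given as a list of labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two partitions of the same node set."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def nmi(a, b):
    """NMI = 2 I(A;B) / (H(A) + H(B)); defined as 1.0 if both are trivial."""
    h_sum = entropy(a) + entropy(b)
    if h_sum == 0:
        return 1.0
    return 2 * mutual_information(a, b) / h_sum


def mean_pairwise_nmi(partitions):
    """Average NMI over all pairs of solver outputs (hypothetical helper)."""
    pairs = list(combinations(partitions, 2))
    return sum(nmi(a, b) for a, b in pairs) / len(pairs)
```

In a multiresolution scan, one would evaluate `mean_pairwise_nmi` at each resolution setting and look for values where the agreement is extremal; the scan itself is not shown here.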

arXiv.org e-Print Archive

Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs

Author: Korenblum Daniel
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2018

Laplacian mixture models identify overlapping regions of influence in unlabeled graph and network data in a scalable and computationally efficient way, yielding useful low-dimensional representations. By combining Laplacian eigenspace and finite mixture modeling methods, they provide probabilistic or fuzzy dimensionality reductions or domain decompositions for a variety of input data types, including mixture distributions, feature vectors, and graphs or networks. Provable optimal recovery using the algorithm is analytically shown for a nontrivial class of cluster graphs. Heuristic approximations for scalable high-performance implementations are described and empirically tested. Connections to PageRank and community detection in network analysis demonstrate the wide applicability of this approach. The origins of fuzzy spectral methods, beginning with generalized heat or diffusion equations in physics, are reviewed and summarized. Comparisons to other dimensionality reduction and clustering methods for challenging unsupervised machine learning problems are also discussed. Comment: 13 figures, 35 references.

arXiv.org e-Print Archive

Directory of Open Access Journals

Applications of graph theory to landscape genetics

Author: Achard
Albert
Arthur
Barabasi
Bowman
Bowman
Brooks
Carr
Carr
Costa
Csárdi
Danon
Dyer
Dyer
Dyer
Erdos
Freeman
Goudet
Guimera
Gustafsson
Holme
James
Koen
Krohn
Kyle
Legendre
Lusseau
Manel
May
McRae
Milgram
Morris
Newman
Newman
Newman
O’Brien
Pritchard
Proulx
Storfer
Urban
Wagner
Watts
Weir
Whitehead
Williams
Publication venue: Blackwell Publishing Ltd

We investigated the relationships among landscape quality, gene flow, and population genetic structure of fishers (Martes pennanti) in Ontario, Canada. We used graph theory as an analytical framework, considering each landscape as a network node. The 34 nodes were connected by 93 edges. Network structure was characterized by a higher level of clustering than expected by chance, a short mean path length connecting all pairs of nodes, and a resiliency to the loss of highly connected nodes. This suggests that alleles can be efficiently spread through the system and that extirpations and conservative harvest are not likely to affect their spread. Two measures of node centrality were negatively related to both the proportion of immigrants in a node and node snow depth. This suggests that central nodes are producers of emigrants, contain high-quality habitat (i.e., deep snow can make locomotion energetically costly), and that fishers were migrating from high- to low-quality habitat. A method of community detection on networks delineated five genetic clusters of nodes, suggesting cryptic population structure. Our analyses showed that network models can provide system-level insight into the process of gene flow, with implications for understanding how landscape alterations might affect population fitness and evolutionary potential.
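The two network statistics named in the abstract, clustering and mean path length, can be illustrated on a small undirected graph. This is a minimal sketch with hypothetical function names and a toy edge list; it is not the fisher network from the study, and the study's exact definitions may differ. It assumes a connected graph with at least two nodes and comparable node labels.

```python
from collections import deque


def build_graph(edges):
    """Adjacency sets for an undirected graph given as (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges among the neighbours of this node.
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def mean_path_length(adj):
    """Mean shortest-path length over reachable ordered pairs, via BFS."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Toy example: a triangle (0, 1, 2) with one pendant node (3).
toy = build_graph([(0, 1), (1, 2), (0, 2), (2, 3)])
```

Comparing these values against those of randomized graphs with the same number of nodes and edges is the usual way to judge whether clustering is "higher than expected by chance"; that comparison is not shown here.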

Detection of hidden structures on all scales in amorphous materials and complex physical systems: basic notions and applications to networks, lattice systems, and glasses

Author: Chakrabarty S.
Kelton K. F.
Mauro N.
Nussinov Z.
Ronhovde P.
Sahu K. K.
Sahu M.
Publication date: 29/12/2010

Recent decades have seen the discovery of numerous complex materials. At the root of the complexity underlying many of these materials lies a large number of possible contending atomic- and larger-scale configurations and the intricate correlations between their constituents. For a detailed understanding, there is a need for tools that enable the detection of pertinent structures on all spatial and temporal scales. Towards this end, we suggest a new method by invoking ideas from network analysis and information theory. Our method efficiently identifies basic unit cells and topological defects in systems with low disorder and may analyze general amorphous structures to identify candidate natural structures where a clear definition of order is lacking. This general unbiased detection of physical structure does not require a guess as to which of the system properties should be deemed as important and may constitute a natural point of departure for further analysis. The method applies to both static and dynamic systems. Comment: 23 pages, 9 figures.

arXiv.org e-Print Archive

Computational complexity of the landscape I

Author: Abbott
Agrawal
Ajtai
Arkani-Hamed
Ashok
Balasubramanian
Balasubramanian
Banks
Banks
Barahona
Barrow
Bennett
Blumenhagen
Bousso
Brown
Brown
Bryngelson
Burgess
Conlon
Cook
Dasgupta
Denef
Denef
Denef
Deutsch
Douglas
Douglas
Feng
Ferrara
Feynman
Frederik Denef
Freedman
Garey
Geroch
Gibbons
Giddings
Grover
Gukov
Han
Hawking
Johnson
Kachru
Kirkpatrick
Kirkpatrick
Kitaev
Klein
Klemm
Lautemann
Levinthal
Michael R. Douglas
Mézard
Nabutovsky
Nielsen
Papadimitriou
Preskill
Rubakov
Saltman
Sherrington
Shor
Socolich
Unger
Uzan
Wales
Wegener
Weinberg
Weinberg
Weinberg
Weinberger
Wright
Yao
Publication venue: 'Elsevier BV'
Publication date: 07/02/2006

We study the computational complexity of the physical problem of finding vacua of string theory which agree with data, such as the cosmological constant, and show that such problems are typically NP-hard. In particular, we prove that in the Bousso-Polchinski model, the problem is NP-complete. We discuss the issues this raises and the possibility that, even if we were to find compelling evidence that some vacuum of string theory describes our universe, we might never be able to find that vacuum explicitly. In a companion paper, we apply this point of view to the question of how early cosmology might select a vacuum. Comment: JHEP3 LaTeX, 53 pp, 2 .eps figures.
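To give a feel for why such a search is hard, the sketch below brute-forces a toy version of the problem. It assumes the commonly quoted Bousso-Polchinski form, in which the cosmological constant is a bare value plus half the sum of squared flux-times-charge terms; the abstract does not state this formula, and the function names, charges, and window are hypothetical. The enumeration visits every flux vector, so its cost grows exponentially with the number of fluxes.

```python
from itertools import product


def cosmological_constant(fluxes, charges, bare):
    """Assumed form: bare + 1/2 * sum_i (n_i * q_i)^2."""
    return bare + 0.5 * sum((n * q) ** 2 for n, q in zip(fluxes, charges))


def find_vacua(charges, bare, window, max_flux):
    """Exhaustively list integer flux vectors whose value lies in window.

    Visits (2 * max_flux + 1) ** len(charges) candidates, so this is only
    feasible for a handful of fluxes.
    """
    lo, hi = window
    flux_range = range(-max_flux, max_flux + 1)
    return [
        n
        for n in product(flux_range, repeat=len(charges))
        if lo <= cosmological_constant(n, charges, bare) <= hi
    ]


# Toy instance with two unit charges and a bare value of -1.
vacua = find_vacua(charges=[1.0, 1.0], bare=-1.0, window=(-0.1, 0.1), max_flux=2)
```

Checking a proposed flux vector takes one evaluation of `cosmological_constant`, while finding one requires search; that asymmetry is the general shape of an NP problem, although the paper's actual proof is not reproduced here.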

arXiv.org e-Print Archive

Graph-Based Approaches to Protein Structure Comparison - From Local to Global Similarity

Author: Mernberger Marco
Publication venue: Philipps-Universität Marburg
Publication date: 01/01/2011

The comparative analysis of protein structure data is a central aspect of structural bioinformatics. Drawing upon structural information allows the inference of function for unknown proteins even in cases where no apparent homology can be found on the sequence level. Regarding the function of an enzyme, the overall fold topology might be less important than the specific structural conformation of the catalytic site or the surface region of a protein, where the interaction with other molecules, such as binding partners, substrates and ligands, occurs. Thus, a comparison of these regions is especially interesting for functional inference, since structural constraints imposed by the demands of the catalyzed biochemical function make them more likely to exhibit structural similarity. Moreover, the comparative analysis of protein binding sites is of special interest in pharmaceutical chemistry, in order to predict cross-reactivities and gain a deeper understanding of the catalysis mechanism. From an algorithmic point of view, the comparison of structured data, or, more generally, complex objects, can be attempted based on different methodological principles. Global methods aim at comparing structures as a whole, while local methods transfer the problem to multiple comparisons of local substructures. In the context of protein structure analysis, it is not a priori clear which strategy is more suitable. In this thesis, several conceptually different algorithmic approaches have been developed, based on local, global and semi-global strategies, for the task of comparing protein structure data, more specifically protein binding pockets. The use of graphs for the modeling of protein structure data has a long-standing tradition in structural bioinformatics. Recently, graphs have been used to model the geometric constraints of protein binding sites.
The algorithms developed in this thesis are based on this modeling concept; hence, from a computer scientist's point of view, they can also be regarded as global, local and semi-global approaches to graph comparison. The developed algorithms were mainly designed to allow for a more approximate comparison of protein binding sites, in order to account for the molecular flexibility of the protein structures. A main motivation was to allow for the detection of more remote similarities, which are not apparent when using more rigid methods. Subsequently, the developed approaches were applied to different problems typically encountered in the field of structural bioinformatics in order to assess and compare their performance and suitability for different problems. Each of the approaches developed during this work was capable of improving upon the performance of existing methods in the field. Another major aspect in the experiments was the question of which methodological concept, local, global or a combination of both, offers the most benefits for the specific task of protein binding site comparison, a question that is addressed throughout this thesis.

Publikations- und Dokumentenserver der Universitätsbibliothek Marburg