842 research outputs found
Information inequalities and Generalized Graph Entropies
In this article, we discuss the problem of establishing relations between
information measures assessed for network structures. Two types of entropy
based measures namely, the Shannon entropy and its generalization, the
R\'{e}nyi entropy have been considered for this study. Our main results involve
establishing formal relationship, in the form of implicit inequalities, between
these two kinds of measures when defined for graphs. Further, we also state and
prove inequalities connecting the classical partition-based graph entropies and
the functional-based entropy measures. In addition, several explicit
inequalities are derived for special classes of graphs.Comment: A preliminary version. To be submitted to a journa
Connections between Classical and Parametric Network Entropies
This paper explores relationships between classical and parametric measures of graph (or network) complexity. Classical measures are based on vertex decompositions induced by equivalence relations. Parametric measures, on the other hand, are constructed by using information functions to assign probabilities to the vertices. The inequalities established in this paper relating classical and parametric measures lay a foundation for systematic classification of entropy-based measures of graph complexity
Information Indices with High Discriminative Power for Graphs
In this paper, we evaluate the uniqueness of several information-theoretic measures for graphs based on so-called information functionals and compare the results with other information indices and non-information-theoretic measures such as the well-known Balaban index. We show that, by employing an information functional based on degree-degree associations, the resulting information index outperforms the Balaban index tremendously. These results have been obtained by using nearly 12 million exhaustively generated, non-isomorphic and unweighted graphs. Also, we obtain deeper insights on these and other topological descriptors when exploring their uniqueness by using exhaustively generated sets of alkane trees representing connected and acyclic graphs in which the degree of a vertex is at most four
New Polynomial-Based Molecular Descriptors with Low Degeneracy
In this paper, we introduce a novel graph polynomial called the ‘information polynomial’ of a graph. This graph polynomial can be derived by using a probability distribution of the vertex set. By using the zeros of the obtained polynomial, we additionally define some novel spectral descriptors. Compared with those based on computing the ordinary characteristic polynomial of a graph, we perform a numerical study using real chemical databases. We obtain that the novel descriptors do have a high discrimination power
Exploring Statistical and Population Aspects of Network Complexity
The characterization and the definition of the complexity of objects is an important but very difficult problem that attracted much interest in many different fields. In this paper we introduce a new measure, called network diversity score (NDS), which allows us to quantify structural properties of networks. We demonstrate numerically that our diversity score is capable of distinguishing ordered, random and complex networks from each other and, hence, allowing us to categorize networks with respect to their structural complexity. We study 16 additional network complexity measures and find that none of these measures has similar good categorization capabilities. In contrast to many other measures suggested so far aiming for a characterization of the structural complexity of networks, our score is different for a variety of reasons. First, our score is multiplicatively composed of four individual scores, each assessing different structural properties of a network. That means our composite score reflects the structural diversity of a network. Second, our score is defined for a population of networks instead of individual networks. We will show that this removes an unwanted ambiguity, inherently present in measures that are based on single networks. In order to apply our measure practically, we provide a statistical estimator for the diversity score, which is based on a finite number of samples
Shortest-Path Network Analysis Is a Useful Approach toward Identifying Genetic Determinants of Longevity
Background
Identification of genes that modulate longevity is a major focus of aging-related research and an area of intense public interest. In addition to facilitating an improved understanding of the basic mechanisms of aging, such genes represent potential targets for therapeutic intervention in multiple age-associated diseases, including cancer, heart disease, diabetes, and neurodegenerative disorders. To date, however, targeted efforts at identifying longevity-associated genes have been limited by a lack of predictive power, and useful algorithms for candidate gene-identification have also been lacking. Methodology/Principal Findings
We have utilized a shortest-path network analysis to identify novel genes that modulate longevity in Saccharomyces cerevisiae. Based on a set of previously reported genes associated with increased life span, we applied a shortest-path network algorithm to a pre-existing protein–protein interaction dataset in order to construct a shortest-path longevity network. To validate this network, the replicative aging potential of 88 single-gene deletion strains corresponding to predicted components of the shortest-path longevity network was determined. Here we report that the single-gene deletion strains identified by our shortest-path longevity analysis are significantly enriched for mutations conferring either increased or decreased replicative life span, relative to a randomly selected set of 564 single-gene deletion strains or to the current data set available for the entire haploid deletion collection. Further, we report the identification of previously unknown longevity genes, several of which function in a conserved longevity pathway believed to mediate life span extension in response to dietary restriction. Conclusions/Significance
This work demonstrates that shortest-path network analysis is a useful approach toward identifying genetic determinants of longevity and represents the first application of network analysis of aging to be extensively validated in a biological system. The novel longevity genes identified in this study are likely to yield further insight into the molecular mechanisms of aging and age-associated disease
Integrated information increases with fitness in the evolution of animats
One of the hallmarks of biological organisms is their ability to integrate
disparate information sources to optimize their behavior in complex
environments. How this capability can be quantified and related to the
functional complexity of an organism remains a challenging problem, in
particular since organismal functional complexity is not well-defined. We
present here several candidate measures that quantify information and
integration, and study their dependence on fitness as an artificial agent
("animat") evolves over thousands of generations to solve a navigation task in
a simple, simulated environment. We compare the ability of these measures to
predict high fitness with more conventional information-theoretic processing
measures. As the animat adapts by increasing its "fit" to the world,
information integration and processing increase commensurately along the
evolutionary line of descent. We suggest that the correlation of fitness with
information integration and with processing measures implies that high fitness
requires both information processing as well as integration, but that
information integration may be a better measure when the task requires memory.
A correlation of measures of information integration (but also information
processing) and fitness strongly suggests that these measures reflect the
functional complexity of the animat, and that such measures can be used to
quantify functional complexity even in the absence of fitness data.Comment: 27 pages, 8 figures, one supplementary figure. Three supplementary
video files available on request. Version commensurate with published text in
PLoS Comput. Bio
ANN multiscale model of anti-HIV Drugs activity vs AIDS prevalence in the US at county level based on information indices of molecular graphs and social networks
[Abstract] This work is aimed at describing the workflow for a methodology that combines chemoinformatics and pharmacoepidemiology methods and at reporting the first predictive model developed with this methodology. The new model is able to predict complex networks of AIDS prevalence in the US counties, taking into consideration the social determinants and activity/structure of anti-HIV drugs in preclinical assays. We trained different Artificial Neural Networks (ANNs) using as input information indices of social networks and molecular graphs. We used a Shannon information index based on the Gini coefficient to quantify the effect of income inequality in the social network. We obtained the data on AIDS prevalence and the Gini coefficient from the AIDSVu database of Emory University. We also used the Balaban information indices to quantify changes in the chemical structure of anti-HIV drugs. We obtained the data on anti-HIV drug activity and structure (SMILE codes) from the ChEMBL database. Last, we used Box-Jenkins moving average operators to quantify information about the deviations of drugs with respect to data subsets of reference (targets, organisms, experimental parameters, protocols). The best model found was a Linear Neural Network (LNN) with values of Accuracy, Specificity, and Sensitivity above 0.76 and AUROC > 0.80 in training and external validation series. This model generates a complex network of AIDS prevalence in the US at county level with respect to the preclinical activity of anti-HIV drugs in preclinical assays. To train/validate the model and predict the complex network we needed to analyze 43,249 data points including values of AIDS prevalence in 2,310 counties in the US vs ChEMBL results for 21,582 unique drugs, 9 viral or human protein targets, 4,856 protocols, and 10 possible experimental measures.Ministerio de Educación, Cultura y Deportes; AGL2011-30563-C03-0
- …