823 research outputs found

    Information inequalities and Generalized Graph Entropies

    In this article, we discuss the problem of establishing relations between information measures assessed for network structures. Two types of entropy-based measures, namely the Shannon entropy and its generalization, the Rényi entropy, are considered in this study. Our main results establish formal relationships, in the form of implicit inequalities, between these two kinds of measures when defined for graphs. Further, we state and prove inequalities connecting the classical partition-based graph entropies and the functional-based entropy measures. In addition, several explicit inequalities are derived for special classes of graphs. Comment: A preliminary version. To be submitted to a journal.
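    As a rough illustration of the two measures named in the abstract above, the sketch below computes the Shannon and Rényi entropies of a vertex probability distribution. The degree-based assignment of probabilities, the function names, and the toy path graph are assumptions made for the example; the paper's own information functionals may differ.

```python
import math


def vertex_probabilities(adjacency):
    """Assign each vertex a probability proportional to its degree.

    The degree-based functional is an assumption for illustration only.
    """
    degrees = [len(neighbors) for neighbors in adjacency.values()]
    total = sum(degrees)
    return [d / total for d in degrees]


def shannon_entropy(probs):
    """Shannon entropy in bits: -sum p * log2(p)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def renyi_entropy(probs, alpha):
    """Renyi entropy of order alpha (alpha > 0, alpha != 1), in bits."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    return math.log2(sum(p ** alpha for p in probs if p > 0)) / (1 - alpha)


# Toy example (hypothetical): the path graph 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
probs = vertex_probabilities(path)
h_shannon = shannon_entropy(probs)
h_renyi_2 = renyi_entropy(probs, 2.0)
print(h_shannon, h_renyi_2)
```

    For any distribution, the Rényi entropy of order greater than 1 does not exceed the Shannon entropy, which is the simplest instance of the kind of inequality the abstract refers to.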

    Connections between Classical and Parametric Network Entropies

    This paper explores relationships between classical and parametric measures of graph (or network) complexity. Classical measures are based on vertex decompositions induced by equivalence relations. Parametric measures, on the other hand, are constructed by using information functions to assign probabilities to the vertices. The inequalities established in this paper relating classical and parametric measures lay a foundation for the systematic classification of entropy-based measures of graph complexity.
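    The distinction drawn in the abstract above can be sketched as follows. This is a minimal example under stated assumptions: vertices are grouped by equal degree as a simple stand-in for the equivalence relation (classical measures often use automorphism orbits instead), and the degree is also used as a hypothetical information function for the parametric case. All names are illustrative.

```python
import math
from collections import Counter


def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def classical_entropy(adjacency, label):
    """Partition-based entropy: vertices with equal labels form one class,
    and each class X_i receives probability |X_i| / |V|."""
    counts = Counter(label(v, adjacency) for v in adjacency)
    n = len(adjacency)
    return entropy([c / n for c in counts.values()])


def parametric_entropy(adjacency, functional):
    """Functional-based entropy: each vertex v receives probability
    f(v) / sum_u f(u). The functional must be positive on every vertex."""
    values = [functional(v, adjacency) for v in adjacency]
    total = sum(values)
    return entropy([x / total for x in values])


def degree(v, adjacency):
    """Vertex degree, used here both as class label and as functional."""
    return len(adjacency[v])


# Toy example (hypothetical): a star with center 0 and three leaves.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
h_classical = classical_entropy(star, degree)    # classes {0} and {1, 2, 3}
h_parametric = parametric_entropy(star, degree)  # probabilities 1/2, 1/6, 1/6, 1/6
print(h_classical, h_parametric)
```

    The two values differ because the classical measure sees only two classes, whereas the parametric measure spreads probability over all four vertices.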

    Information Indices with High Discriminative Power for Graphs

    In this paper, we evaluate the uniqueness of several information-theoretic measures for graphs based on so-called information functionals and compare the results with other information indices and non-information-theoretic measures such as the well-known Balaban index. We show that, by employing an information functional based on degree-degree associations, the resulting information index outperforms the Balaban index by a wide margin. These results have been obtained by using nearly 12 million exhaustively generated, non-isomorphic and unweighted graphs. We also obtain deeper insights into these and other topological descriptors by exploring their uniqueness on exhaustively generated sets of alkane trees, which represent connected and acyclic graphs in which the degree of a vertex is at most four.

    New Polynomial-Based Molecular Descriptors with Low Degeneracy

    In this paper, we introduce a novel graph polynomial called the ‘information polynomial’ of a graph. This graph polynomial can be derived by using a probability distribution of the vertex set. By using the zeros of the obtained polynomial, we additionally define some novel spectral descriptors. We perform a numerical study using real chemical databases, comparing the novel descriptors with those based on computing the ordinary characteristic polynomial of a graph. We find that the novel descriptors have high discrimination power.

    Exploring Statistical and Population Aspects of Network Complexity

    The characterization and the definition of the complexity of objects is an important but very difficult problem that has attracted much interest in many different fields. In this paper we introduce a new measure, called network diversity score (NDS), which allows us to quantify structural properties of networks. We demonstrate numerically that our diversity score is capable of distinguishing ordered, random and complex networks from each other and, hence, allows us to categorize networks with respect to their structural complexity. We study 16 additional network complexity measures and find that none of these measures has similarly good categorization capabilities. Our score differs from many other measures suggested so far for characterizing the structural complexity of networks in two respects. First, our score is multiplicatively composed of four individual scores, each assessing different structural properties of a network. That means our composite score reflects the structural diversity of a network. Second, our score is defined for a population of networks instead of individual networks. We will show that this removes an unwanted ambiguity, inherently present in measures that are based on single networks. In order to apply our measure practically, we provide a statistical estimator for the diversity score, which is based on a finite number of samples.

    Shortest-Path Network Analysis Is a Useful Approach toward Identifying Genetic Determinants of Longevity

    Background: Identification of genes that modulate longevity is a major focus of aging-related research and an area of intense public interest. In addition to facilitating an improved understanding of the basic mechanisms of aging, such genes represent potential targets for therapeutic intervention in multiple age-associated diseases, including cancer, heart disease, diabetes, and neurodegenerative disorders. To date, however, targeted efforts at identifying longevity-associated genes have been limited by a lack of predictive power, and useful algorithms for candidate-gene identification have also been lacking.
    Methodology/Principal Findings: We have utilized a shortest-path network analysis to identify novel genes that modulate longevity in Saccharomyces cerevisiae. Based on a set of previously reported genes associated with increased life span, we applied a shortest-path network algorithm to a pre-existing protein–protein interaction dataset in order to construct a shortest-path longevity network. To validate this network, the replicative aging potential of 88 single-gene deletion strains corresponding to predicted components of the shortest-path longevity network was determined. Here we report that the single-gene deletion strains identified by our shortest-path longevity analysis are significantly enriched for mutations conferring either increased or decreased replicative life span, relative to a randomly selected set of 564 single-gene deletion strains or to the current data set available for the entire haploid deletion collection. Further, we report the identification of previously unknown longevity genes, several of which function in a conserved longevity pathway believed to mediate life span extension in response to dietary restriction.
    Conclusions/Significance: This work demonstrates that shortest-path network analysis is a useful approach toward identifying genetic determinants of longevity and represents the first application of network analysis of aging to be extensively validated in a biological system. The novel longevity genes identified in this study are likely to yield further insight into the molecular mechanisms of aging and age-associated disease.
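    The general idea of building a shortest-path network from a set of seed genes, as described in the abstract above, might be sketched as follows. This is only one plausible reading: the sketch takes a single breadth-first shortest path per pair of seeds, whereas the study may treat ties or path selection differently. The interaction map, the node names, and the function names are all hypothetical.

```python
from collections import deque
from itertools import combinations


def shortest_path(graph, source, target):
    """Return one shortest path in an unweighted graph via BFS, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


def shortest_path_network(graph, seeds):
    """Collect every node lying on a shortest path between two seeds."""
    nodes = set()
    for a, b in combinations(sorted(seeds), 2):
        path = shortest_path(graph, a, b)
        if path is not None:
            nodes.update(path)
    return nodes


# Hypothetical toy interaction map; "Z" is isolated and never selected.
ppi = {
    "A": ["X"],
    "X": ["A", "B", "Y"],
    "B": ["X"],
    "Y": ["X", "C"],
    "C": ["Y"],
    "Z": [],
}
seeds = {"A", "B", "C"}
network = shortest_path_network(ppi, seeds)
candidates = network - seeds  # non-seed nodes become candidates to test
print(sorted(candidates))
```

    In this toy map the intermediate nodes X and Y connect the seeds and would be the candidates passed on to experimental validation.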

    Integrated information increases with fitness in the evolution of animats

    One of the hallmarks of biological organisms is their ability to integrate disparate information sources to optimize their behavior in complex environments. How this capability can be quantified and related to the functional complexity of an organism remains a challenging problem, in particular since organismal functional complexity is not well-defined. We present here several candidate measures that quantify information and integration, and study their dependence on fitness as an artificial agent ("animat") evolves over thousands of generations to solve a navigation task in a simple, simulated environment. We compare the ability of these measures to predict high fitness with more conventional information-theoretic processing measures. As the animat adapts by increasing its "fit" to the world, information integration and processing increase commensurately along the evolutionary line of descent. We suggest that the correlation of fitness with information integration and with processing measures implies that high fitness requires both information processing and integration, but that information integration may be a better measure when the task requires memory. A correlation of measures of information integration (but also information processing) and fitness strongly suggests that these measures reflect the functional complexity of the animat, and that such measures can be used to quantify functional complexity even in the absence of fitness data. Comment: 27 pages, 8 figures, one supplementary figure. Three supplementary video files available on request. Version commensurate with published text in PLoS Comput. Biol.

    ANN multiscale model of anti-HIV Drugs activity vs AIDS prevalence in the US at county level based on information indices of molecular graphs and social networks

    This work describes the workflow for a methodology that combines chemoinformatics and pharmacoepidemiology methods and reports the first predictive model developed with this methodology. The new model is able to predict complex networks of AIDS prevalence in US counties, taking into consideration the social determinants and the activity/structure of anti-HIV drugs in preclinical assays. We trained different Artificial Neural Networks (ANNs) using as input information indices of social networks and molecular graphs. We used a Shannon information index based on the Gini coefficient to quantify the effect of income inequality in the social network. We obtained the data on AIDS prevalence and the Gini coefficient from the AIDSVu database of Emory University. We also used the Balaban information indices to quantify changes in the chemical structure of anti-HIV drugs. We obtained the data on anti-HIV drug activity and structure (SMILES codes) from the ChEMBL database. Last, we used Box-Jenkins moving average operators to quantify information about the deviations of drugs with respect to reference data subsets (targets, organisms, experimental parameters, protocols). The best model found was a Linear Neural Network (LNN) with values of Accuracy, Specificity, and Sensitivity above 0.76 and AUROC > 0.80 in training and external validation series. This model generates a complex network of AIDS prevalence in the US at county level with respect to the activity of anti-HIV drugs in preclinical assays. To train and validate the model and predict the complex network, we needed to analyze 43,249 data points, including values of AIDS prevalence in 2,310 counties in the US vs ChEMBL results for 21,582 unique drugs, 9 viral or human protein targets, 4,856 protocols, and 10 possible experimental measures. Ministerio de Educación, Cultura y Deportes; AGL2011-30563-C03-0