
    Complexity-entropy analysis at different levels of organization in written language

    Written language is complex. A written text can be considered an attempt to convey a meaningful message that ends up being constrained by language rules and context dependence, and that is highly redundant in its use of resources. Despite all these constraints, unpredictability is an essential element of natural language. Here we present the use of entropic measures to assess the balance between predictability and surprise in written text. In short, it is possible to measure innovation and context preservation in a document. It is shown that this can also be done at the different levels of organization of a text. The type of analysis presented is reasonably general and can also be used to analyze the same balance in other complex messages such as DNA, where a hierarchy of organizational levels is known to exist.
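    The abstract above does not name the specific entropic measures used, so the following is only a minimal sketch, assuming plain Shannon entropy of the empirical symbol distribution as a stand-in. It illustrates how the same measure can be applied at two levels of organization (characters and words). The function name `shannon_entropy` and the sample sentence are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy, in bits per symbol, of the empirical distribution.

    `symbols` is any iterable of hashable items (characters, words, ...).
    This is an illustrative stand-in, not the measure used in the paper.
    """
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical sample text, used only to show the two levels of analysis.
text = "the cat sat on the mat and the dog sat on the log"

char_entropy = shannon_entropy(text.replace(" ", ""))  # character level
word_entropy = shannon_entropy(text.split())           # word level

print(f"character-level entropy: {char_entropy:.3f} bits/symbol")
print(f"word-level entropy:      {word_entropy:.3f} bits/symbol")
```

    Lower values indicate a more predictable sequence at that level; a fuller analysis would presumably also need a measure sensitive to ordering, which frequency-based entropy is not.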

    A Note on Zipf's Law, Natural Languages, and Noncoding DNA regions

    In Phys. Rev. Letters (73:2, 5 Dec. 94), Mantegna et al. conclude on the basis of Zipf rank frequency data that noncoding DNA sequence regions are more like natural languages than coding regions. We argue on the contrary that an empirical fit to Zipf's "law" cannot be used as a criterion for similarity to natural languages. Although DNA is presumably an "organized system of signs" in Mandelbrot's (1961) sense, an observation of statistical features of the sort presented in the Mantegna et al. paper does not shed light on the similarity between DNA's "grammar" and natural language grammars, just as the observation of exact Zipf-like behavior cannot distinguish between the underlying processes of tossing an M-sided die or a finite-state branching process. Comment: compressed uuencoded postscript file: 14 page
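    For readers unfamiliar with the kind of rank-frequency fit under discussion, here is a minimal sketch, assuming a simple least-squares line through the log-log rank-frequency data. The function names `rank_frequency` and `zipf_slope` are hypothetical, and this is not the fitting procedure of either paper.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return token frequencies sorted from most to least frequent."""
    return sorted(Counter(tokens).values(), reverse=True)


def zipf_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank).

    Requires at least two ranks. A slope near -1 corresponds to
    Zipf-like behavior.
    """
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical toy input; real analyses use large corpora or sequences.
tokens = "a b a c a b d a b c a e".split()
freqs = rank_frequency(tokens)
print(freqs, zipf_slope(freqs))
```

    As the abstract argues, a good fit of this kind does not by itself identify the process that generated the data.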

    Dagstuhl Reports : Volume 1, Issue 2, February 2011

    Online Privacy: Towards Informational Self-Determination on the Internet (Dagstuhl Perspectives Workshop 11061): Simone Fischer-Hübner, Chris Hoofnagle, Kai Rannenberg, Michael Waidner, Ioannis Krontiris and Michael Marhöfer; Self-Repairing Programs (Dagstuhl Seminar 11062): Mauro Pezzé, Martin C. Rinard, Westley Weimer and Andreas Zeller; Theory and Applications of Graph Searching Problems (Dagstuhl Seminar 11071): Fedor V. Fomin, Pierre Fraigniaud, Stephan Kreutzer and Dimitrios M. Thilikos; Combinatorial and Algorithmic Aspects of Sequence Processing (Dagstuhl Seminar 11081): Maxime Crochemore, Lila Kari, Mehryar Mohri and Dirk Nowotka; Packing and Scheduling Algorithms for Information and Communication Services (Dagstuhl Seminar 11091): Klaus Jansen, Claire Mathieu, Hadas Shachnai and Neal E. Youn

    The placement of the head that maximizes predictability. An information theoretic approach

    The minimization of the length of syntactic dependencies is a well-established principle of word order and the basis of a mathematical theory of word order. Here we complete that theory from the perspective of information theory, adding a competing word order principle: the maximization of predictability of a target element. These two principles are in conflict: to maximize the predictability of the head, the head should appear last, which maximizes the costs with respect to dependency length minimization. The implications of such a broad theoretical framework for understanding the optimality, diversity and evolution of the six possible orderings of subject, object and verb are reviewed. Comment: in press in Glottometric

    Gene Expression and its Discontents: Developmental disorders as dysfunctions of epigenetic cognition

    Systems biology presently suffers the same mereological and sufficiency fallacies that haunt neural network models of high order cognition. Shifting perspective from the massively parallel space of gene matrix interactions to the grammar/syntax of the time series of expressed phenotypes using a cognitive paradigm permits import of techniques from statistical physics via the homology between information source uncertainty and free energy density. This produces a broad spectrum of possible statistical models of development and its pathologies in which epigenetic regulation and the effects of embedding environment are analogous to a tunable enzyme catalyst. A cognitive paradigm naturally incorporates memory, leading directly to models of epigenetic inheritance, as affected by environmental exposures, in the largest sense. Understanding gene expression, development, and their dysfunctions will require data analysis tools considerably more sophisticated than the present crop of simplistic models abducted from neural network studies or stochastic chemical reaction theory

    Without magic bullets: the biological basis for public health interventions against protein folding disorders

    Protein folding disorders of aging like Alzheimer's and Parkinson's diseases currently present intractable medical challenges. 'Small molecule' interventions - drug treatments - often have, at best, palliative impact, failing to alter disease course. The design of individual or population level interventions will likely require a deeper understanding of protein folding and its regulation than is currently provided by contemporary 'physics' or culture-bound medical magic bullet models. Here, a topological rate distortion analysis, similar in spirit to Tlusty's (2010a) elegant exploration of the genetic code, is applied to the problem of protein folding and regulation. The formalism produces large-scale, quasi-equilibrium 'resilience' states representing normal and pathological protein folding regulation under a cellular-level cognitive paradigm similar to that proposed by Atlan and Cohen (1998) for the immune system. Generalization to long times produces diffusion models of protein folding disorders in which epigenetic or life history factors determine the rate of onset of regulatory failure, in essence, a premature aging driven by familiar synergisms between disjunctions of resource allocation and need in the context of socially or physiologically toxic exposures and chronic powerlessness at individual and group scales. An HPA axis model is applied to recently observed differences in Alzheimer's onset rates in White and African American subpopulations as a function of an index of distress-proneness.

    The modern versus extended evolutionary synthesis : sketch of an intra-genomic gene's eye view for the evolutionary-genetic underpinning of epigenetic and developmental evolution

    Studying the phenotypic evolution of organisms in terms of populations of genes and genotypes, the Modern Synthesis (MS) conceptualizes biological evolution in terms of 'inter-organismal' interactions among genes sitting in the different individual organisms that constitute a population. It 'black-boxes' the complex 'intra-organismic' molecular and developmental epigenetics mediating between genotypes and phenotypes. To conceptually integrate epigenetics and evo-devo into evolutionary theory, advocates of an Extended Evolutionary Synthesis (EES) argue that the MS's reductive gene-centrism should be abandoned in favor of a more inclusive organism-centered approach. To push the debate to a new level of understanding, we introduce the evolutionary biology of 'intra-genomic conflict' (IGC) to the controversy. This strategy is based on a twofold rationale. First, the field of IGC is both ‘gene-centered’ and 'intra-organismic' and, as such, could build a bridge between the gene-centered MS and the intra-organismic fields of epigenetics and evo-devo. And second, it is increasingly revealed that IGC plays a significant causal role in epigenetic and developmental evolution and even in speciation. Hence, to deal with the ‘discrepancy’ between the ‘gene-centered’ MS and the ‘intra-organismic’ fields of epigenetics and evo-devo, we sketch a conceptual solution in terms of ‘intra-genomic conflict and compromise’ – an ‘intra-genomic gene’s eye view’ that thinks in terms of intra-genomic ‘evolutionarily stable strategies’ (ESSs) among numerous and various DNA regions and elements – to evolutionary-genetically underwrite both epigenetic and developmental evolution, as such questioning the ‘gene-de-centered’ stance put forward by EES-advocates

    A semiotic analysis of the genetic information

    Terms loaded with informational connotations are often employed to refer to genes and their dynamics. Indeed, genes are usually perceived by biologists as basically ‘the carriers of hereditary information.’ Nevertheless, a number of researchers consider such talk inadequate and ‘just metaphorical,’ thus expressing skepticism about the use of the term ‘information’ and its derivatives in biology as a natural science. First, because the meaning of that term in biology is not as precise as it is, for instance, in the mathematical theory of communication. Second, because it seems to refer to a purported semantic property of genes without theoretically clarifying if any genuinely intrinsic semantics is involved. Biosemiotics, a field that attempts to analyze biological systems as semiotic systems, makes it possible to advance in the understanding of the concept of information in biology. From the perspective of Peircean biosemiotics, we develop here an account of genes as signs, including a detailed analysis of two fundamental processes in the genetic information system (transcription and protein synthesis) that has not been made so far in this field of research. Furthermore, we propose here an account of information based on Peircean semiotics and apply it to our analysis of transcription and protein synthesis.

    Metabolic constraints on the evolution of genetic codes: Did multiple 'preaerobic' ecosystem transitions entrain richer dialects via Serial Endosymbiosis?

    A mathematical model based on Tlusty's topological deconstruction suggests that multiple punctuated ecosystem shifts in available metabolic free energy, broadly akin to the 'aerobic' transition, enabled a punctuated sequence of increasingly complex genetic codes and protein translators under mechanisms similar to the Serial Endosymbiosis effecting the Eukaryotic transition. These evolved until the ancestor to the present narrow spectrum of nearly maximally robust codes became locked-in by path dependence