27 research outputs found

    Chargaff's "Grammar of Biology": New Fractal-like Rules

    Chargaff once said that "I saw before me in dark contours the beginning of a grammar of Biology". In linguistics, "grammar" is the set of rules of a natural language, but we do not know for sure what Chargaff meant by a "grammar" of Biology. Nevertheless, taking up the metaphor, Chargaff himself started a "grammar of Biology" by discovering the so-called Chargaff's rules. In this work, we further develop his grammar. Using new concepts, we were able to discover new genomic rules that seem to be invariant across a large set of organisms and that show a fractal-like property: no matter the scale, the same pattern is observed (self-similarity). We hope that these new invariant genomic rules may be used in different contexts, from short-read data bias detection to genome assembly quality assessment. Comment: 17 pages
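    For readers unfamiliar with the rules the abstract builds on, the sketch below illustrates Chargaff's first parity rule (in double-stranded DNA the fraction of A equals that of T, and G equals C). It is not the authors' method and does not implement the new fractal-like rules, which the abstract does not specify; the function names and the example sequence are hypothetical.

```python
from collections import Counter


def base_fractions(seq):
    """Return the fraction of each base A, C, G, T in seq (other symbols are ignored)."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {b: counts[b] / total for b in "ACGT"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA strand."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def double_strand_fractions(seq):
    """Base fractions over both strands of the duplex formed by seq."""
    return base_fractions(seq + reverse_complement(seq))


# Hypothetical example: a single strand need not be balanced,
# but the duplex always satisfies A == T and G == C.
strand = "AAACGGT"
single = base_fractions(strand)
duplex = double_strand_fractions(strand)
print(single)
print(duplex)
```

    On a single strand the fractions generally differ, while on the duplex the equalities hold exactly by construction; Chargaff's second rule is the empirical observation that long single strands come close to the same balance.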

    Is a Genome a Codeword of an Error-Correcting Code?

    Since a genome is a discrete sequence whose elements belong to a set of four letters, the question of whether there is an error-correcting code underlying DNA sequences is unavoidable. The most common approach to answering this question is to propose a methodology to verify the existence of such a code. However, none of the methodologies proposed so far, although quite clever, has achieved that goal. In a recent work, we showed that DNA sequences can be identified as codewords in a class of cyclic error-correcting codes known as Hamming codes. In this paper, we show that a complete intron-exon gene, and even a plasmid genome, can be identified as a Hamming code codeword as well. Although this does not constitute a definitive proof that there is an error-correcting code underlying DNA sequences, it is the first evidence in this direction.
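    To make concrete what "being a codeword of a Hamming code" means, here is a minimal sketch using the textbook binary Hamming (7,4) code. This is an assumption made for illustration only: the abstract does not say which alphabet, code length, or nucleotide-to-symbol mapping the authors use, so nothing below should be read as their procedure. The names `H`, `syndrome`, `is_codeword`, and `correct_single_error` are hypothetical.

```python
# Parity-check matrix of the binary Hamming (7,4) code.
# Column j (1-indexed) is the binary representation of j, least significant bit first.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def syndrome(word):
    """Return H * word (mod 2) for a 7-bit word given as a list of 0/1 values."""
    if len(word) != 7:
        raise ValueError("expected a 7-bit word")
    return tuple(sum(h * w for h, w in zip(row, word)) % 2 for row in H)


def is_codeword(word):
    """A word is a codeword exactly when its syndrome is all zeros."""
    return syndrome(word) == (0, 0, 0)


def correct_single_error(word):
    """Flip the bit whose position the syndrome points to, if any."""
    s = syndrome(word)
    position = s[0] + 2 * s[1] + 4 * s[2]  # 0 means no error detected
    fixed = list(word)
    if position:
        fixed[position - 1] ^= 1
    return fixed


codeword = [1, 1, 1, 0, 0, 0, 0]
corrupted = [1, 1, 1, 0, 1, 0, 0]  # bit 5 flipped
print(is_codeword(codeword))
print(is_codeword(corrupted))
print(correct_single_error(corrupted))
```

    Membership is decided by a zero syndrome, and a nonzero syndrome locates a single flipped symbol; testing whether a biological sequence is a codeword additionally requires the mapping choices noted above.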

    Vertical lossless genomic data compression tools for assembled genomes: A systematic literature review.

    The recent decrease in the cost and time needed to sequence and assemble complete genomes has created an increased demand for data storage. As a consequence, several strategies for compressing assembled biological data were created. Vertical compression tools implement strategies that take advantage of the high level of similarity between multiple assembled genomic sequences to obtain better compression results. However, current reviews on vertical compression do not compare the execution flow of each tool, which consists of preprocessing, transformation, and data encoding phases. We performed a systematic literature review to identify and compare existing tools for vertical compression of assembled genomic sequences. The review was centered on PubMed and Scopus, in which 45726 distinct papers were considered. Next, 32 papers were selected according to the following criteria: the paper presents a lossless vertical compression tool; the tool uses the information contained in other sequences for the compression; the tool can manipulate genomic sequences in FASTA format; and the tool needs no prior knowledge. Although we extracted compression performance results, they were not compared, as the tools did not use a standardized evaluation protocol. Thus, we conclude that there is a lack of a defined evaluation protocol that must be applied by each tool.

    Loss of GTF2I promotes neuronal apoptosis and synaptic reduction in human cellular models of neurodevelopment

    Summary: Individuals with Williams syndrome (WS), a neurodevelopmental disorder caused by hemizygous loss of 26–28 genes at 7q11.23, characteristically portray a hypersocial phenotype. Copy-number variations and mutations in one of these genes, GTF2I, are associated with altered sociality and are proposed to underlie hypersociality in WS. However, the contribution of GTF2I to human neurodevelopment remains poorly understood. Here, human cellular models of neurodevelopment, including neural progenitors, neurons, and three-dimensional cortical organoids, are differentiated from CRISPR-Cas9-edited GTF2I-knockout (GTF2I-KO) pluripotent stem cells to investigate the role of GTF2I in human neurodevelopment. GTF2I-KO progenitors exhibit increased proliferation and cell-cycle alterations. Cortical organoids and neurons demonstrate increased cell death and synaptic dysregulation, including synaptic structural dysfunction and decreased electrophysiological activity on a multielectrode array. Our findings suggest that changes in synaptic circuit integrity may be a prominent mediator of the link between alterations in GTF2I and variation in the phenotypic expression of human sociality