14 research outputs found

    Statistical distributions of oligonucleotide combinations: Applications in human chromosomes 21 and 22

    Statistical properties of distances between all quintuplets and hexaplets are calculated and long-range distributions are observed for certain quintuplets and hexaplets, common in human chromosomes 21 and 22. The oligonucleotides were ordered according to their power-law exponents and this ordering has separated quintuplets or hexaplets which contain consensus sequences of promoters from other random oligonucleotides. These results are in accordance with earlier observations and theoretical predictions and demonstrate a different aspect of long-range correlations in genomic sequences. Comparison of the statistical properties of quintuplets and hexaplets between the two chromosomes is undertaken and it is shown that statistics are universal in the two chromosomes. © 2002 Elsevier Science B.V. All rights reserved
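    The distance statistics described in this abstract can be illustrated with a minimal sketch. It assumes that "distance" means the start-to-start separation between consecutive (possibly overlapping) occurrences of the same oligonucleotide; the abstract does not state the exact convention. The function names and the toy sequence are hypothetical, not taken from the paper.

```python
from collections import Counter


def occurrence_positions(sequence, oligo):
    """Return the start index of every (possibly overlapping) occurrence of oligo."""
    positions = []
    start = sequence.find(oligo)
    while start != -1:
        positions.append(start)
        # Advance by one so that overlapping occurrences are also found.
        start = sequence.find(oligo, start + 1)
    return positions


def consecutive_distances(sequence, oligo):
    """Start-to-start distances between successive occurrences (an assumed convention)."""
    positions = occurrence_positions(sequence, oligo)
    return [b - a for a, b in zip(positions, positions[1:])]


def distance_histogram(sequence, oligo):
    """Count how often each distance occurs; this is the empirical distance distribution."""
    return Counter(consecutive_distances(sequence, oligo))


if __name__ == "__main__":
    toy = "ACGTTCGACGAACGT"  # hypothetical toy sequence, not real chromosome data
    print(consecutive_distances(toy, "CG"))  # [4, 3, 4]
```

    For a real chromosome the same histogram would be built for each of the 4^5 quintuplets or 4^6 hexaplets in turn.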

    Long range clustering of oligonucleotides containing the CG signal

    The distance distributions between successive occurrences of the same oligonucleotide in chromosomal DNA are studied in different classes of higher eucaryotic organisms. A two-parameter model is applied to the distance distributions of quintuplets (sequences of five bp) and hexaplets (sequences of six bp); the first parameter, k, describes the short-range exponential decay of the distributions, whereas the second parameter, m, describes the power-law behavior. A two-dimensional scatter plot representing the model equation demonstrates that the points corresponding to the distance distribution of oligonucleotides containing the CG consensus sequence (promoter of the RNA polymerase II) cluster together (group α), apart from all other oligonucleotides (group β). This is shown for the available chordates Homo sapiens, Pan troglodytes, Mus musculus, Rattus norvegicus, Gallus gallus and Danio rerio. This clustering is less evident in lower Animalia and plants, such as Drosophila melanogaster, Caenorhabditis elegans and Arabidopsis thaliana. Moreover, in all organisms the oligonucleotides that contain any consensus sequence are described by long-range distributions, whereas all others show a stronger influence of short-range decay. Various measures are introduced and evaluated to numerically characterize the clustering of the two groups. The one that most clearly discriminates the two classes is shown to be the proximity factor. © 2009 Elsevier Ltd. All rights reserved

    Statistical algorithms for long DNA sequences: Oligonucleotide distributions and homogeneity maps

    The statistical properties of oligonucleotide appearances within long DNA sequences often reveal useful characteristics of the corresponding DNA areas. Two algorithms to statistically analyze oligonucleotide appearances within long DNA sequences in genome banks are presented. The first algorithm determines statistical indices for arbitrary length oligonucleotides within arbitrary length DNA sequences. The critical exponent μ of the distance distribution between consecutive occurrences of the same oligonucleotide is calculated and its value is shown to characterize the functionality of the oligonucleotide. The second algorithm searches for areas with variable homogeneity, based on the density of oligonucleotides. The two algorithms have been applied to representative eucaryotes (the animal Mus musculus and the plant Arabidopsis thaliana) and interesting results were obtained, confirmed by biological observations. All programs are open source and publicly available on our web site. © 2005-IOS Press and the authors. All rights reserved
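    As a rough illustration of how an exponent such as μ might be estimated from a distance distribution, the sketch below fits a straight line to log(count) against log(distance) by ordinary least squares. This is an assumed, deliberately crude estimator and not the authors' published algorithm; the function name and the synthetic data are hypothetical.

```python
import math


def power_law_exponent(histogram):
    """Estimate mu in count(d) ~ d**(-mu) from a {distance: count} mapping.

    Uses an ordinary least-squares fit in log-log space (an assumed, simple
    estimator; binning and tail effects are ignored).
    """
    points = [
        (math.log(d), math.log(c))
        for d, c in histogram.items()
        if d > 0 and c > 0
    ]
    if len(points) < 2:
        raise ValueError("need at least two distinct positive distances")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    # The fitted slope is -mu, so negate it.
    return -sxy / sxx


if __name__ == "__main__":
    # Synthetic, noise-free power law with exponent 1.5 (not real genomic data).
    synthetic = {d: 1000.0 * d ** -1.5 for d in range(1, 50)}
    print(power_law_exponent(synthetic))
```

    On real data the fit would normally be restricted to the range of distances where the distribution is actually linear on a log-log plot.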

    Fractality in the neuron axonal topography of the human brain based on 3-D diffusion MRI

    In this work the fractal architecture of the neuron axonal topography of the human brain is evaluated, as derived from 3-D diffusion MRI (dMRI) acquisitions. This is a 3-D extension of work performed previously in 2-D regions of interest (ROIs), where the fractal dimension of the neuron axonal topography was computed from dMRI data. A group study with 18 subjects is conducted here, and the fractal dimension Df of the entire 3-D volume of each brain is estimated via the box counting, the correlation dimension and the fractal mass dimension methods. The neuron axon data are obtained using tractography algorithms on diffusion tensor imaging of the brain. We find that all three calculations of Df give consistent results across subjects, namely, they demonstrate fractal characteristics in the short and medium length scales: different fractal exponents prevail at different length scales, an indication of multifractality. We surmise that this complexity arises as a collective property emerging when many local brain units, performing different functional tasks and having different local topologies, are recorded together
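    Of the three methods named in this abstract, box counting is the simplest to sketch. The example below assumes the data have been reduced to a plain list of 3-D points and estimates Df as the least-squares slope of log N(ε) against log(1/ε), where N(ε) is the number of occupied cubes of edge ε. The function names, box sizes and test points are hypothetical and unrelated to the paper's tractography pipeline.

```python
import math


def box_count(points, box_size):
    """Number of distinct cubes of edge box_size that contain at least one point."""
    occupied = {
        tuple(math.floor(coord / box_size) for coord in point)
        for point in points
    }
    return len(occupied)


def box_counting_dimension(points, box_sizes):
    """Least-squares slope of log N(eps) versus log(1/eps) over the given box sizes."""
    xs = [math.log(1.0 / eps) for eps in box_sizes]
    ys = [math.log(box_count(points, eps)) for eps in box_sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Points on a straight segment: the estimate should be close to 1.
    line = [(i, 0, 0) for i in range(1024)]
    print(box_counting_dimension(line, [1, 2, 4, 8]))
```

    A scale-dependent slope, obtained by fitting separate ranges of box sizes, is one way the different exponents at different length scales mentioned above would show up.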

    Long-range correlations of RNA polymerase II promoter sequences across organisms

    The statistical properties of the size distribution of DNA segments separating identical oligonucleotides are studied. For representative eukaryotes (Homo sapiens, Mus musculus, Saccharomyces cerevisiae, Oryza sativa, Arabidopsis thaliana) we have demonstrated the existence of long-range correlations for the distances separating oligonucleotides of sizes 4, 5 and 6, which carry a promoter signature. This observation is independent of the consensus sequence used by the organism, as in the case of O. sativa (which mainly uses the CG promoter box) and A. thaliana (which mainly uses the TATA promoter box). If we use two parameters to characterise the size distribution of the segments separating oligonucleotides, we observe that oligonucleotides containing promoter signatures cluster together, away from the others. © 2005 Elsevier B.V. All rights reserved