
    Targeting determinants of dosage compensation in Drosophila

    The dosage compensation complex (DCC) in Drosophila melanogaster is responsible for up-regulating transcription from the single male X chromosome to equal the transcription from the two X chromosomes in females. Visualization of the DCC, a large ribonucleoprotein complex, on male larval polytene chromosomes reveals that the complex binds selectively to many interbands on the X chromosome. The targeting of the DCC is thought to be determined in part by DNA sequences that are enriched on the X. So far, lack of knowledge about DCC binding sites has prevented the identification of sequence determinants. Only three binding sites have been identified to date, and analysis of their DNA sequence did not allow the prediction of further binding sites. We have used chromatin immunoprecipitation to identify a number of new DCC binding fragments and characterized them in vivo by visualizing DCC binding to autosomal insertions of these fragments, and we have demonstrated that they possess a wide range of potential to recruit the DCC. By varying the in vivo concentration of the DCC, we provide evidence that this range of recruitment potential is due to differences in the affinity of the complex for these sites. We were also able to establish that DCC binding to ectopic high-affinity sites can allow nearby low-affinity sites to recruit the complex. Using the sequences of the newly identified and previously characterized binding fragments, we have uncovered a number of short sequence motifs, which in combination may contribute to DCC recruitment. Our findings suggest that the DCC is recruited to the X via a number of binding sites of decreasing affinities, and that the presence of high- and moderate-affinity sites on the X may ensure that lower-affinity sites are occupied in a context-dependent manner. Our bioinformatics analysis suggests that DCC binding sites may be composed of variable combinations of degenerate motifs.

    Conspiracy in bacterial genomes

    The rank-ordered distribution of the codon usage frequencies for 123 bacteria is best fitted by a three-parameter function that is the sum of a constant, an exponential and a linear term in the rank n. The parameters depend on the total GC content (two of them parabolically). The rank-ordered distribution of the amino acids is fitted by a straight line. The Shannon entropy computed over all the codons is well fitted by a parabola in the GC content, while the partial entropies computed over subsets of the codons behave differently, exhibiting a first conspiracy effect. Moreover, the sum of the codon usage frequencies over particular sets, e.g. with C and A (respectively G and U) as i-th nucleotide, shows a clear linear dependence on the GC content, exhibiting another conspiracy effect.
    Comment: revised version: introduction and conclusion enhanced, references added, figures added, some tables removed
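As a rough illustration of the quantities named in this abstract, the sketch below computes rank-ordered codon usage frequencies and their Shannon entropy for a coding sequence, and evaluates one possible form of a "constant plus exponential plus linear" rank function. The abstract does not give the formula, so the parameterization `alpha * exp(-eta * n) - beta * n + gamma`, with `gamma` fixed by normalization so that three free parameters remain, is an assumption, as are all function names and the choice of base-2 logarithms.

```python
import math
from collections import Counter


def codon_frequencies(seq):
    """Relative frequency of each codon in an in-frame coding sequence."""
    seq = seq.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def rank_ordered(freqs):
    """Frequencies sorted from most to least used (rank 1 first)."""
    return sorted(freqs.values(), reverse=True)


def shannon_entropy(freqs):
    """Shannon entropy in bits (base 2 is a choice made for this sketch)."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)


def rank_model(n, alpha, eta, beta, n_codons=64):
    """Assumed form: exponential + linear + constant term in the rank n.

    gamma is not a free parameter here: it is chosen so that the model
    sums to 1 over ranks 1..n_codons (an assumption of this sketch).
    """
    partial = sum(alpha * math.exp(-eta * k) - beta * k
                  for k in range(1, n_codons + 1))
    gamma = (1.0 - partial) / n_codons
    return alpha * math.exp(-eta * n) - beta * n + gamma
```

Fitting `alpha`, `eta` and `beta` to observed rank-ordered frequencies would need a least-squares routine, which is left out here to keep the example dependency-free.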

    Application of regulatory sequence analysis and metabolic network analysis to the interpretation of gene expression data

    We present two complementary approaches for the interpretation of clusters of co-regulated genes, such as those obtained from DNA chips and related methods. Starting from a cluster of genes with similar expression profiles, two basic questions can be asked: 1. Which mechanism is responsible for the coordinated transcriptional response of the genes? This question is approached by extracting motifs that are shared between the upstream sequences of these genes. The motifs extracted are putative cis-acting regulatory elements. 2. What is the physiological significance for the cell of expressing these genes together? One way to answer this question is to search for potential metabolic pathways that could be catalyzed by the products of the genes. This can be done by selecting the genes from the cluster that code for enzymes, and trying to assemble the catalyzed reactions to form metabolic pathways. We present tools to answer these two questions, and we illustrate their use with selected examples in the yeast Saccharomyces cerevisiae. The tools are available on the web (http://ucmb.ulb.ac.be/bioinformatics/rsa-tools/; http://www.ebi.ac.uk/research/pfbp/; http://www.soi.city.ac.uk/~msch/).
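The first question, finding motifs shared between upstream sequences, can be illustrated with a deliberately simplified sketch: list every k-mer that occurs in at least a given fraction of the upstream sequences. This is not the method of the tools cited above, which the abstract does not describe in detail; real motif discovery would also score each k-mer against a background model. The function name `shared_kmers` and its parameters are hypothetical.

```python
from collections import Counter


def shared_kmers(upstream_seqs, k=6, min_fraction=1.0):
    """Return k-mers present in at least min_fraction of the sequences.

    Each sequence contributes at most once per k-mer, so the count is
    the number of sequences containing it, not the number of occurrences.
    """
    presence = Counter()
    for seq in upstream_seqs:
        seq = seq.upper()
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        presence.update(kmers)
    n = len(upstream_seqs)
    return sorted(kmer for kmer, c in presence.items() if c / n >= min_fraction)


# Toy upstream regions (made up for illustration only).
toy_cluster = ["AACACGTGTT", "GGCACGTGAA", "TTTCACGTGC"]
print(shared_kmers(toy_cluster))
```

Counting presence per sequence rather than total occurrences keeps a single repeat-rich promoter from dominating the result.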

    Back-translation for discovering distant protein homologies

    Frameshift mutations in protein-coding DNA sequences produce a drastic change in the resulting protein sequence, which prevents classic protein alignment methods from revealing the proteins' common origin. Moreover, when a large number of substitutions are additionally involved in the divergence, homology detection becomes difficult even at the DNA level. To cope with this situation, we propose a novel method to infer distant homology relations between two proteins that accounts for frameshift and point mutations that may have affected the coding sequences. We design a dynamic programming alignment algorithm over memory-efficient graph representations of the complete set of putative DNA sequences of each protein, with the goal of determining the two putative DNA sequences that have the best scoring alignment under a powerful scoring system designed to reflect the most probable evolutionary process. This allows us to uncover evolutionary information that is not captured by traditional alignment methods, which is confirmed by biologically significant examples.
    Comment: The 9th International Workshop on Algorithms in Bioinformatics (WABI), Philadelphia, United States (2009)
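To make "the complete set of putative DNA sequences of each protein" concrete, the sketch below back-translates a short peptide by brute force under the standard genetic code and counts how many coding sequences are possible. It is not the paper's algorithm: the abstract describes memory-efficient graph representations precisely because explicit enumeration, as done here, grows exponentially with protein length. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (and '*' for stop) to the codons that encode it.
CODONS_FOR = {}
for idx, (b1, b2, b3) in enumerate(product(BASES, repeat=3)):
    CODONS_FOR.setdefault(AMINO[idx], []).append(b1 + b2 + b3)


def count_back_translations(protein):
    """Number of distinct DNA sequences that encode the protein."""
    total = 1
    for aa in protein.upper():
        total *= len(CODONS_FOR[aa])
    return total


def back_translations(protein):
    """Enumerate every putative coding sequence (short peptides only)."""
    choices = [CODONS_FOR[aa] for aa in protein.upper()]
    return ["".join(codons) for codons in product(*choices)]


print(back_translations("MW"))
print(count_back_translations("MLS"))
```

Because leucine, serine and arginine each have six codons, even a short peptide has a large number of putative coding sequences, which is why a compact representation is needed before any alignment over them is practical.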

    In the search for the low-complexity sequences in prokaryotic and eukaryotic genomes: how to derive a coherent picture from global and local entropy measures

    We investigate a possible way to connect the presence of Low-Complexity Sequences (LCS) in DNA genomes with the nonstationary properties of base correlations. Under the hypothesis that these variations signal a change in the DNA function, we use a new technique, called the Non-Stationarity Entropic Index (NSEI) method, and we prove that this technique is an efficient way to detect functional changes with respect to a random baseline. The remarkable aspect is that NSEI does not require any training data or fitting parameter, the only arbitrary element being the choice of a marker in the sequence. We make this choice on the basis of biological information about LCS distributions in genomes. We show that there exists a correlation between changes in the amount of LCS and the ratio of long- to short-range correlation.