18 research outputs found

    P elements and MITE relatives in the whole genome sequence of Anopheles gambiae

    BACKGROUND: Miniature Inverted-repeat Transposable Elements (MITEs), a particular class of class-II transposable elements (TEs), play an important role in genome evolution because they have very high copy numbers and display recurrent bursts of transposition. The 5' and 3' subterminal regions of a given MITE family often show high sequence similarity with the corresponding regions of an autonomous class-II TE family. However, the sustained presence over prolonged evolutionary time of MITEs and TE master copies able to promote their mobility has rarely been reported within the same genome, and this raises fascinating evolutionary questions. RESULTS: We report here the presence of P transposable elements with related MITE families in the Anopheles gambiae genome. Using a TE annotation pipeline, we have identified and analyzed all the P sequences in the sequenced A. gambiae PEST strain genome. More than 0.49% of the genome consists of P elements and derivatives. P elements can be divided into 9 different subfamilies, separated by more than 30% nucleotide divergence. Seven of them present full-length copies. Ten MITE families are associated with 6 of the 9 P subfamilies. Comparing their intra-element nucleotide diversities and their structures allows us to propose the putative dynamics of their emergence. In particular, one MITE family has a hybrid structure in which each end is related to a different P subfamily, suggesting a new mechanism for MITE emergence and mobility. CONCLUSION: This work contributes to a greater understanding of the relationship between full-length class-II TEs and MITEs, in this case P elements and their derivatives in the genome of A. gambiae. Moreover, it provides the most comprehensive catalogue to date of P-like transposons in this genome and provides convincing yet indirect evidence that some of the subfamilies have been recently active.

    Domesticated P elements in the Drosophila montium species subgroup have a new function related to a DNA binding property.

    Molecular domestication of a transposable element is defined as its functional recruitment by the host genome. To date, two independent events of molecular domestication of the P transposable element have been described: in the Drosophila obscura species group and in the Drosophila montium species subgroup. These P neogenes consist of stationary, non-repeated sequences, potentially encoding 66 kDa repressor-like proteins (RLs). Here we investigate the function of the montium P neogenes. We provide evidence for the presence of RL proteins in two montium species (D. tsacasi and D. bocqueti), specifically expressed in adult and larval brain and gonads. We tested the hypothesis that the function of the montium P neogenes is related to repressing the transposition of distantly related mobile P elements that coexist in the genome. Our results strongly suggest that the montium P neogenes were not recruited to downregulate P element transposition. Given that all the proteins encoded by mobile or stationary P homologous sequences show strong conservation of the DNA-binding domain, we tested the capacity of the RL proteins to bind DNA in vivo. Immunostaining of polytene chromosomes in D. melanogaster transgenic lines strongly suggests that montium P neogenes encode proteins that bind DNA in vivo. RL proteins bind at multiple sites on the chromosomes. We suggest that the property recruited in the case of the montium P neoproteins is their DNA-binding property. The possible functions of these neogenes are discussed.

    Combined Evidence Annotation of Transposable Elements in Genome Sequences

    Transposable elements (TEs) are mobile, repetitive sequences that make up significant fractions of metazoan genomes. Despite their near ubiquity and importance in genome and chromosome biology, most efforts to annotate TEs in genome sequences rely on the results of a single computational program, RepeatMasker. In contrast, recent advances in gene annotation indicate that high-quality gene models can be produced by combining multiple independent sources of computational evidence. To elevate the quality of TE annotations to a level comparable to that of gene models, we have developed a combined evidence-model TE annotation pipeline, analogous to systems used for gene annotation, by integrating results from multiple homology-based and de novo TE identification methods. As proof of principle, we have annotated “TE models” in Drosophila melanogaster Release 4 genomic sequences using the combined computational evidence derived from RepeatMasker, BLASTER, TBLASTX, all-by-all BLASTN, RECON, TE-HMM and the previous Release 3.1 annotation. Our system is designed for use with the Apollo genome annotation tool, allowing automatic results to be curated manually to produce reliable annotations. The euchromatic TE fraction of D. melanogaster is now estimated at 5.3% (cf. 3.86% in Release 3.1), and we found a substantially higher number of TEs (n = 6,013) than previously identified (n = 1,572). Most of the new TEs derive from small fragments a few hundred nucleotides long and from highly abundant families not previously annotated (e.g., INE-1). We also estimated that 518 TE copies (8.6%) are inserted into at least one other TE, forming a nest of elements. The pipeline allows rapid and thorough annotation of even the most complex TE models, including highly deleted and/or nested elements such as those often found in heterochromatic sequences. Our pipeline can be easily adapted to other genome sequences, such as those of the D. melanogaster heterochromatin or other species in the genus Drosophila.
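As a rough illustration of what combining evidence from several tools can mean, the toy sketch below merges overlapping hits from different sources into candidate regions and records which sources support each one. It is not the pipeline described in the abstract: the function name `combine_evidence`, the tuple layout, and the coordinates are all hypothetical, and a real pipeline would do far more (family assignment, nested elements, manual curation).

```python
def combine_evidence(hits):
    """Merge overlapping hits into candidate regions.

    hits: iterable of (start, end, source) tuples on a single sequence,
    with inclusive coordinates. This layout is an assumption made for
    illustration, not a format taken from any of the tools named above.
    """
    models = []
    for start, end, source in sorted(hits):
        if models and start <= models[-1]["end"]:
            # Hit overlaps the current region: extend it and note the source.
            models[-1]["end"] = max(models[-1]["end"], end)
            models[-1]["sources"].add(source)
        else:
            # No overlap: open a new candidate region.
            models.append({"start": start, "end": end, "sources": {source}})
    return models


# Made-up coordinates, used only to exercise the function.
example_hits = [
    (100, 500, "RepeatMasker"),
    (450, 900, "BLASTER"),
    (2000, 2300, "RECON"),
]
example_models = combine_evidence(example_hits)
```

Regions supported by more than one source could then be given priority during manual review, which is one plausible way to use such a summary.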

    Recurrent exon shuffling between distant P-element families

    Two independent stationary P-related neogenes had been previously described in the Drosophila obscura species group and in the Drosophila montium species subgroup. In Drosophila melanogaster, P transposable elements can encode an 87 kDa transposase and a 66 kDa repressor, but the P neogenes have conserved only the capacity to encode a 66 kDa repressor-like protein specified by the first three exons. We have previously analyzed the genomic modifications associated with the transition of a P element into the montium P neogene, the coding capacity of which has been conserved for around 20 Myr (Nouaud, D., and D. Anxolabéhère. 1997. Mol. Biol. Evol. 14:1132-1144). Here we show that the P neogene of some species of the montium subgroup presents a new structure involving the capture of an additional exon from a very distant P-element subfamily. This additional exon is inserted either upstream or downstream of the first exon of the P neogene. As a result of alternative splicing, these modified neogenes can produce, in addition to the repressor-like protein, a new protein that differs only in the NH2-terminal region. We hypothesize that this protein diversity within an organism results in a functional diversification due to the selective advantage associated with the domestication of the P neogene in these species. Moreover, the autonomous P element that provides the additional exons is still present in the genome. Its nucleotide sequence is more than 45% distant from the previously defined P-type elements (M-type, O-type, T-type) and defines a new P-type element subfamily referred to as the K-type.

    Detection of new transposable element families in Drosophila melanogaster and Anopheles gambiae genomes

    The techniques usually used to detect transposable elements (TEs) in nucleic acid sequences rely on sequence similarity with previously characterized elements. However, these methods are likely to miss many elements in various organisms. We tested two strategies for the detection of unknown elements. The first, which we call the "TBLASTX strategy," searches for TE sequences by comparing the six-frame translations of the nucleic acid sequences of known TEs with the genomic sequence of interest. The second, the "repeat-based strategy," searches genomic sequences for long repeats and clusters them into groups of similar sequences. TE copies from a given family are expected to cluster together. We tested the Drosophila melanogaster genomic sequence and the recently sequenced Anopheles gambiae genome, in which most TEs remain unknown. We showed that the "TBLASTX strategy" is very efficient, as it detected at least 332 new TE families in D. melanogaster and 400 in A. gambiae. This was unexpected in Drosophila, as the TEs of this organism have been extensively studied. The "repeat-based strategy" appeared to be very inefficient because of two problems: (i) TE copies are heavily deleted and few copies share homologous regions, and (ii) segmental duplications are frequent and are not easy to distinguish from TE copies.
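The six-frame translation mentioned in the abstract above can be sketched in a few lines. This is a minimal illustration using the standard genetic code, not the TBLASTX program itself; the function names and the frame labels (`+1` to `-3`) are choices made here for the example, and codons containing anything other than A, C, G, or T are rendered as `X`.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Translate a DNA sequence in three forward and three reverse frames."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


# Short made-up sequence: frame +1 reads ATG GCC TAA.
frames = six_frame_translations("ATGGCCTAA")
```

Comparing at the protein level in all six frames is what lets a search of this kind recognize divergent elements whose nucleotide sequences are no longer similar enough to be matched directly.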