46 research outputs found

    Using population admixture to help complete maps of the human genome

    Tens of millions of base pairs of euchromatic human genome sequence, including many protein-coding genes, have no known location in the human genome. We describe an approach for localizing the human genome's missing pieces by using the patterns of genome sequence variation created by population admixture. We mapped the locations of 70 scaffolds spanning four million base pairs of the human genome's unplaced euchromatic sequence, including more than a dozen protein-coding genes, and identified eight large novel inter-chromosomal segmental duplications. We find that most of these sequences are hidden in the genome's heterochromatin, particularly its pericentromeric regions. Many cryptic, pericentromeric genes are expressed in RNA and have been maintained intact for millions of years while their expression patterns diverged from those of paralogous genes elsewhere in the genome. We describe how knowledge of the locations of these sequences can inform disease association and genome biology studies.

    Mitochondrial Pseudogenes in the Nuclear Genomes of Drosophila

    Mitochondrial pseudogenes in nuclear chromosomes (numts) have been detected in the genomes of a diverse range of eukaryotic species. However, the numt content and properties of different genomes are not uniform, and study of these differences provides insight into the mechanisms and dynamics of genome evolution in different organisms. In the genus Drosophila, numts have previously only been identified on a genome-wide scale in the melanogaster subgroup. The present study extends the identification to 11 species of the Drosophila genus. We identify a total of 302 numts and show that the numt complement is highly variable in Drosophilids, ranging from just 4 in D. melanogaster to 67 in D. willistoni, broadly correlating with genome size. Many numts have undergone large-scale rearrangements in the nucleus, including interruptions, inversions, deletions and duplications of sequence of variable size. Estimating the age of the numts in the nucleus by phylogenetic tree reconstruction reveals the vast majority of numts to be recent gains, 90% having arisen on terminal branches of the species tree. By identifying paralogs and counting duplications among the extant numts we estimate that 23% of extant numts arose through post-insertion duplications. We estimate genus average rates of insertion of 0.75 per million years, and a duplication rate of 0.010 duplications per numt per million years.

    Segmental Duplication Implicated in the Genesis of Inversion 2Rj of Anopheles gambiae

    The malaria vector Anopheles gambiae maintains high levels of inversion polymorphism that facilitate its exploitation of diverse ecological settings across tropical Africa. Molecular characterization of inversion breakpoints is a first step toward understanding the processes that generate and maintain inversions. Here we focused on inversion 2Rj because of its association with the assortatively mating Bamako chromosomal form of An. gambiae, whose distinctive breeding sites are rock pools beside the Niger River in Mali and Guinea. Sequence and computational analysis of 2Rj revealed the same 14.6 kb insertion between both breakpoints, which occurred near but not within predicted genes. Each insertion consists of 5.3 kb terminal inverted repeat arms separated by a 4 kb spacer. The insertions lack coding capacity, and are composed of degraded remnants of repetitive sequences including class I and II transposable elements. Because of their large size and patchwork composition, and as no other instances of these insertions were identified in the An. gambiae genome, they do not appear to be transposable elements. The 14.6 kb modules inserted at both 2Rj breakpoint junctions represent low copy repeats (LCRs, also called segmental duplications) that are strongly implicated in the recent (∼0.4Ne generations) origin of 2Rj. The LCRs contribute to further genome instability, as demonstrated by an imprecise excision event at the proximal breakpoint of 2Rj in field isolates.

    Interchromosomal Duplications on the Bactrocera oleae Y Chromosome Imply a Distinct Evolutionary Origin of the Sex Chromosomes Compared to Drosophila

    BACKGROUND: Diptera have an extraordinary variety of sex determination mechanisms, and Drosophila melanogaster is the paradigm for this group. However, the Drosophila sex determination pathway is only partially conserved and the family Tephritidae affords an interesting example. The tephritid Y chromosome is postulated to be necessary to determine male development. Characterization of Y sequences, apart from elucidating the nature of the male determining factor, is also important to understand the evolutionary history of sex chromosomes within the Tephritidae. We studied the Y sequences from the olive fly, Bactrocera oleae. Its Y chromosome is minute and highly heterochromatic, and displays high heteromorphism with the X chromosome. METHODOLOGY/PRINCIPAL FINDINGS: A combined Representational Difference Analysis (RDA) and fluorescence in-situ hybridization (FISH) approach was used to investigate the Y chromosome to derive information on its sequence content. The Y chromosome is strewn with repetitive DNA sequences, the majority of which are also interspersed in the pericentromeric regions of the autosomes. The Y chromosome appears to have accumulated small and large repetitive interchromosomal duplications. The large interchromosomal duplications harbour an importin-4-like gene fragment. Apart from these importin-4-like sequences, the other Y repetitive sequences are not shared with the X chromosome, suggesting molecular differentiation of these two chromosomes. Moreover, as the identified Y sequences were not detected on the Y chromosomes of closely related tephritids, we can infer divergence in the repetitive nature of their sequence contents. CONCLUSIONS/SIGNIFICANCE: The identification of Y-linked sequences may tell us much about the repetitive nature, the origin and the evolution of Y chromosomes. We hypothesize how these repetitive sequences accumulated and were maintained on the Y chromosome during its evolutionary history. Our data reinforce the idea that the sex chromosomes of the Tephritidae may have distinct evolutionary origins with respect to those of the Drosophilidae and other Dipteran families.

    Holding it together: rapid evolution and positive selection in the synaptonemal complex of Drosophila

    Background: The synaptonemal complex (SC) is a highly conserved meiotic structure that functions to pair homologs and facilitate meiotic recombination in most eukaryotes. Five Drosophila SC proteins have been identified and localized within the complex: C(3)G, C(2)M, CONA, ORD, and the newly identified Corolla. The SC is required for meiotic recombination in Drosophila and absence of these proteins leads to reduced crossing over and chromosomal nondisjunction. Despite the conserved nature of the SC and the key role that these five proteins have in meiosis in D. melanogaster, they display little apparent sequence conservation outside the genus. To identify factors that explain this lack of apparent conservation, we performed a molecular evolutionary analysis of these genes across the Drosophila genus. Results: For the five SC components, gene sequence similarity declines rapidly with increasing phylogenetic distance and only ORD and C(2)M are identifiable outside of the Drosophila genus. SC gene sequences have a higher dN/dS (ω) rate ratio than the genome wide average and this can in part be explained by the action of positive selection in almost every SC component. Across the genus, there is significant variation in ω for each protein. It further appears that ω estimates for the five SC components are in accordance with their physical position within the SC. Components interacting with chromatin evolve slowest and components comprising the central elements evolve the most rapidly. Finally, using population genetic approaches, we demonstrate that positive selection on SC components is ongoing. Conclusions: SC components within Drosophila show little apparent sequence homology to those identified in other model organisms due to their rapid evolution. We propose that the Drosophila SC is evolving rapidly due to two combined effects. First, we propose that a high rate of evolution can be partly explained by low purifying selection on protein components whose function is to simply hold chromosomes together. We also propose that positive selection in the SC is driven by its sex-specificity combined with its role in facilitating both recombination and centromere clustering in the face of recurrent bouts of drive in female meiosis.

    A common copy-number breakpoint of ERBB2 amplification in breast cancer colocalizes with a complex block of segmental duplications
