
    Evaluation of methods for detecting conversion events in gene clusters

    Background: Gene clusters are genetically important, but their analysis poses significant computational challenges. One of the major reasons for these difficulties is gene conversion among the duplicated regions of the cluster, which can obscure their true relationships. Many computational methods for detecting gene conversion events have been released, but their performance has not been assessed for wide deployment in evolutionary history studies due to a lack of accurate evaluation methods. Results: We designed a new method that simulates gene cluster evolution, including large-scale events of duplication, deletion, and conversion as well as small mutations. We used this simulation data to evaluate several different programs for detecting gene conversion events. Conclusions: Our evaluation identifies strengths and weaknesses of several methods for detecting gene conversion, which can contribute to more accurate analysis of gene cluster evolution

    Revealing mammalian evolutionary relationships by comparative analysis of gene clusters

    Many software tools for comparative analysis of genomic sequence data have been released in recent decades. Despite this, it remains challenging to determine evolutionary relationships in gene clusters due to their complex histories involving duplications, deletions, inversions, and conversions. One concept describing these relationships is orthology. Orthologs derive from a common ancestor by speciation, in contrast to paralogs, which derive from duplication. Discriminating orthologs from paralogs is a necessary step in most multispecies sequence analyses, but doing so accurately is impeded by the occurrence of gene conversion events. We propose a refined method of orthology assignment based on two paradigms for interpreting its definition: by genomic context or by sequence content. X-orthology (based on context) traces orthology resulting from speciation and duplication only, while N-orthology (based on content) includes the influence of conversion events

    454 screening of individual MHC variation in an endemic island passerine

    Genes of the major histocompatibility complex (MHC) code for receptors that are central to the adaptive immune response of vertebrates. These genes are therefore important genetic markers with which to study adaptive genetic variation in the wild. Next-generation sequencing (NGS) has increasingly been used in the last decade to genotype the MHC. However, NGS methods are highly prone to sequencing errors, and although several methodologies have been proposed to deal with this, until recently there have been no standard guidelines for the validation of putative MHC alleles. In this study, we used the 454 NGS platform to screen MHC class I exon 3 variation in a population of the island endemic Berthelot’s pipit (Anthus berthelotii). We were able to characterise MHC genotypes across 309 individuals with high levels of repeatability. We were also able to determine alleles that had low amplification efficiencies, whose identification within individuals may thus be less reliable. At the population level we found lower levels of MHC diversity in Berthelot’s pipit than in its widespread continental sister species the tawny pipit (Anthus campestris), and observed trans-species polymorphism. Using the sequence data, we identified signatures of gene conversion and evidence of maintenance of functionally divergent alleles in Berthelot’s pipit. We also detected positive selection at 10 codons. The present study therefore shows that we have an efficient method for screening individual MHC variation across large datasets in Berthelot’s pipit, and provides data that can be used in future studies investigating spatio-temporal patterns and scales of selection on the MHC

    Independent stratum formation on the avian sex chromosomes reveals inter-chromosomal gene conversion and predominance of purifying selection on the W chromosome

    We used a comparative approach spanning three species and 90 million years to study the evolutionary history of the avian sex chromosomes. Using whole transcriptomes, we assembled the largest cross-species dataset of W-linked coding content to date. Our results show that recombination suppression in large portions of the avian sex chromosomes has evolved independently, and that long-term sex chromosome divergence is consistent with repeated and independent inversions spreading progressively to restrict recombination. In contrast, over short-term periods we observe heterogeneous and locus-specific divergence. We also uncover four instances of gene conversion between both highly diverged and recently evolved gametologs, suggesting a complex mosaic of recombination suppression across the sex chromosomes. Lastly, evidence from 16 gametologs reveals that the W chromosome is evolving with a significant contribution of purifying selection, consistent with previous findings that W-linked genes play an important role in encoding sex-specific fitness

    Cryptic MHC Polymorphism Revealed but Not Explained by Selection on the Class IIB Peptide-Binding Region

    The immune genes of the major histocompatibility complex (MHC) are characterized by extraordinarily high levels of nucleotide and haplotype diversity. This variation is maintained by pathogen-mediated balancing selection that is operating on the peptide-binding region (PBR). Several recent studies have found, however, that some populations possess large clusters of alleles that are translated into virtually identical proteins. Here, we address the question of how this nucleotide polymorphism is maintained with little or no functional variation for selection to operate on. We investigate circa 750–850 bp of MHC class II DAB genes in four wild populations of the guppy Poecilia reticulata. By sequencing an extended region, we uncovered 40.9% more sequences (alleles), which would have been missed if we had amplified the exon 2 alone. We found evidence of several gene conversion events that may have homogenized sequence variation. This reduces the visible copy number variation (CNV) and can result in a systematic underestimation of the CNV in studies of the MHC and perhaps other multigene families. We then focus on a single cluster, which comprises 27 (of a total of 66) sequences. These sequences are virtually identical and show no signal of selection. We use microsatellites to reconstruct the populations' demography and employ simulations to examine whether so many similar nucleotide sequences can be maintained in the populations. Simulations show that this variation does not behave neutrally. We propose that selection operates outside the PBR, for example, on linked immune genes or on the “sheltered load” that is thought to be associated with the MHC. Future studies on the MHC would benefit from extending the amplicon size to include polymorphisms outside the exon with the PBR. This may capture otherwise cryptic haplotype variation and CNV, and it may help detect other regions in the MHC that are under selection

    Conversion events in gene clusters

    Background: Gene clusters containing multiple similar genomic regions in close proximity are of great interest for biomedical studies because of their associations with inherited diseases. However, such regions are difficult to analyze due to their structural complexity and their complicated evolutionary histories, reflecting a variety of large-scale mutational events. In particular, conversion events can mislead inferences about the relationships among these regions, as traced by traditional methods such as construction of phylogenetic trees or multi-species alignments. Results: To correct the distorted information generated by such methods, we have developed an automated pipeline called CHAP (Cluster History Analysis Package) for detecting conversion events. We used this pipeline to analyze the conversion events that affected two well-studied gene clusters (α-globin and β-globin) and three gene clusters for which comparative sequence data were generated from seven primate species: CCL (chemokine ligand), IFN (interferon), and CYP2abf (part of cytochrome P450 family 2). CHAP is freely available at http://www.bx.psu.edu/miller_lab. Conclusions: These studies reveal the value of characterizing conversion events in the context of studying gene clusters in complex genomes.

    Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics.

    Background: Single-cell transcriptomics allows researchers to investigate complex communities of heterogeneous cells. It can be applied to stem cells and their descendants in order to chart the progression from multipotent progenitors to fully differentiated cells. While a variety of statistical and computational methods have been proposed for inferring cell lineages, the problem of accurately characterizing multiple branching lineages remains difficult to solve. Results: We introduce Slingshot, a novel method for inferring cell lineages and pseudotimes from single-cell gene expression data. In previously published datasets, Slingshot correctly identifies the biological signal for one to three branching trajectories. Additionally, our simulation study shows that Slingshot infers more accurate pseudotimes than other leading methods. Conclusions: Slingshot is a uniquely robust and flexible tool which combines the highly stable techniques necessary for noisy single-cell data with the ability to identify multiple trajectories. Accurate lineage inference is a critical step in the identification of dynamic temporal gene expression

    The evolutionary history of the SAL1 gene family in eutherian mammals

    Background: SAL1 (salivary lipocalin) is a member of the OBP (Odorant Binding Protein) family and is involved in chemical sexual communication in pig. SAL1 and its relatives may be involved in pheromone and olfactory receptor binding and in pre-mating behaviour. The evolutionary history and the selective pressures acting on SAL1 and its orthologous genes have not yet been exhaustively described. The aim of the present work was to study the evolution of these genes, to elucidate the role of selective pressures in their evolution and the consequences for their functions. Results: Here, we present the evolutionary history of the SAL1 gene and its orthologous genes in mammals. We found that (1) SAL1 and its related genes arose in eutherian mammals with lineage-specific duplications in rodents, horse and cow and are lost in human, mouse lemur, bushbaby and orangutan, (2) the evolution of duplicated genes of horse, rat, mouse and guinea pig is driven by concerted evolution with extensive gene conversion events in mouse and guinea pig and by positive selection mainly acting on paralogous genes in horse and guinea pig, (3) positive selection was detected for amino acids involved in pheromone binding and amino acids putatively involved in olfactory receptor binding, (4) positive selection was also found for lineage, indicating a species-specific strategy for amino acid selection. Conclusions: This work provides new insights into the evolutionary history of SAL1 and its orthologs. On one hand, some genes are subject to concerted evolution and to an increase in dosage, suggesting the need for homogeneity of sequence and function in certain species. On the other hand, positive selection plays a role in the diversification of the functions of the family and in lineage, suggesting adaptive evolution, with possible consequences for speciation and for the reinforcement of prezygotic barriers.