
    Structural variation in the chicken genome identified by paired-end next-generation DNA sequencing of reduced representation libraries

    BACKGROUND: Variation within individual genomes ranges from single nucleotide polymorphisms (SNPs) to kilobase, and even megabase, sized structural variants (SVs), such as deletions, insertions, inversions, and more complex rearrangements. Although much is known about the extent of SVs in humans and mice, species in which they exert significant effects on phenotypes, very little is known about the extent of SVs in the 2.5-times smaller and less repetitive genome of the chicken. RESULTS: We identified hundreds of shared and divergent SVs in four commercial chicken lines relative to the reference chicken genome. The majority of SVs were found in intronic and intergenic regions, and we also found SVs in the coding regions. To identify the SVs, we combined high-throughput short read paired-end sequencing of genomic reduced representation libraries (RRLs) of pooled samples from 25 individuals and computational mapping of DNA sequences from a reference genome. CONCLUSION: We provide a first glimpse of the high abundance of small structural genomic variations in the chicken. Extrapolating our results, we estimate that there are thousands of rearrangements in the chicken genome, the majority of which are located in non-coding regions. We observed that structural variation contributes to genetic differentiation among current domesticated chicken breeds and the Red Jungle Fowl. We expect that, because of their high abundance, SVs might explain phenotypic differences and play a role in the evolution of the chicken genome. Finally, our study exemplifies an efficient and cost-effective approach for identifying structural variation in sequenced genomes.

    Chicken genome analysis reveals novel genes encoding biotin-binding proteins related to avidin family

    BACKGROUND: A chicken egg contains several biotin-binding proteins (BBPs), whose complete DNA and amino acid sequences are not known. In order to identify and characterise these genes and proteins, we studied chicken cDNAs and genes available in the NCBI database and chicken genome database using the reported N-terminal amino acid sequences of chicken egg-yolk BBPs as search strings. RESULTS: Two separate hits showing significant homology for these N-terminal sequences were discovered. For one of these hits, the chromosomal location in the immediate proximity of the avidin gene family was found. Both of these hits encode proteins having high sequence similarity with avidin, suggesting that chicken BBPs are paralogous to the avidin family. In particular, almost all residues corresponding to biotin binding in avidin are conserved in these putative BBP proteins. One of the DNA sequences found, however, seems to encode a carboxy-terminal extension not present in avidin. CONCLUSION: We describe here the predicted properties of the putative BBP genes and proteins. Our present observations link BBP genes together with the avidin gene family and shed more light on the genetic arrangement and variability of this family. In addition, comparative modelling revealed the potential structural elements important for the functional and structural properties of the putative BBP proteins.

    The role of LINEs and CpG islands in dosage compensation on the chicken Z chromosome

    Most avian Z genes are expressed more highly in ZZ males than ZW females, suggesting that chromosome-wide mechanisms of dosage compensation have not evolved. Nevertheless, a small percentage of Z genes are expressed at similar levels in males and females, an indication that a yet unidentified mechanism compensates for the sex difference in copy number. Primary DNA sequences are thought to have a role in determining chromosome gene inactivation status on the mammalian X chromosome. However, it is currently unknown whether primary DNA sequences also mediate chicken Z gene compensation status. Using a combination of chicken DNA sequences and Z gene compensation profiles of 310 genes, we explored the relationship between Z gene compensation status and primary DNA sequence features. Statistical analysis of different Z chromosomal features revealed that long interspersed nuclear elements (LINEs) and CpG islands are enriched on the Z chromosome compared with 329 other DNA features. Linear support vector machine (SVM) classifiers, using primary DNA sequences, correctly predict the Z compensation status for >60% of all Z-linked genes. CpG islands appear to be the most accurate classifier and alone can correctly predict compensation of 63% of Z genes. We also show that LINE CR1 elements are enriched 2.7-fold on the chicken Z chromosome compared with autosomes and that chicken chromosomal length is highly correlated with percentage LINE content. However, the position of LINE elements is not significantly associated with dosage compensation status of Z genes. We also find a trend for a higher proportion of CpG islands in the region of the Z chromosome with the fewest dosage-compensated genes compared with the region containing the greatest concentration of compensated genes. Comparison between chicken and platypus genomes shows that LINE elements are not enriched on sex chromosomes in platypus, indicating that LINE accumulation is not a feature of all sex chromosomes. 
    Our results suggest that CpG islands are not randomly distributed on the Z chromosome and may influence Z gene dosage compensation status.

    Genome Dynamics of Short Oligonucleotides: The Example of Bacterial DNA Uptake Enhancing Sequences

    Among the many bacteria naturally competent for transformation by DNA uptake, a phenomenon with significant clinical and financial implications, Pasteurellaceae and Neisseriaceae species preferentially take up DNA containing specific short sequences. The genomic overrepresentation of these DNA uptake enhancing sequences (DUES) causes preferential uptake of conspecific DNA, but the function(s) behind this overrepresentation and its evolution are still a matter for discovery. Here I analyze DUES genome dynamics and evolution and test the validity of the results to other selectively constrained oligonucleotides. I use statistical methods and computer simulations to examine DUES accumulation in Haemophilus influenzae and Neisseria gonorrhoeae genomes. I analyze DUES sequence and nucleotide frequencies, as well as those of all their mismatched forms, and prove the dependence of DUES genomic overrepresentation on their preferential uptake by quantifying and correlating both characteristics. I then argue that mutation, uptake bias, and weak selection against DUESs in less constrained parts of the genome combined are sufficient to cause DUES accumulation in susceptible parts of the genome with no need for other DUES function. The distribution of overrepresentation values across sequences with different mismatch loads compared to the DUES suggests a gradual yet nonlinear molecular drive of DNA sequences depending on their similarity to the DUES. Other genomically overrepresented sequences, both pro- and eukaryotic, show a similar distribution of frequencies, suggesting that the molecular drive reported above applies to other frequent oligonucleotides. Rare oligonucleotides, however, seem to be gradually drawn to genomic underrepresentation, thus suggesting a molecular drag.
    To my knowledge, this work provides the first clear evidence of the gradual evolution of selectively constrained oligonucleotides, including repeated, palindromic and protein/transcription factor-binding DNAs.