Gene synteny comparisons between different vertebrates provide new insights into breakage and fusion events during mammalian karyotype evolution
Background: Genome comparisons have made possible the reconstruction of the eutherian ancestral karyotype but also have the potential to provide new insights into the evolutionary inter-relationship of the different eutherian orders within the mammalian phylogenetic tree. Such comparisons can additionally reveal (i) the nature of the DNA sequences present within the evolutionary breakpoint regions and (ii) whether or not the evolutionary breakpoints occur randomly across the genome. Gene synteny analysis (E-painting) not only greatly reduces the complexity of comparative genome sequence analysis but also extends its evolutionary reach. Results: E-painting was used to compare the genome sequences of six different mammalian species and chicken. A total of 526 evolutionary breakpoint intervals were identified and these were mapped to a median resolution of 120 kb, the highest level of resolution so far obtained. A marked correlation was noted between evolutionary breakpoint frequency and gene density. This correlation was significant not only at the chromosomal level but also sub-chromosomally when comparing genome intervals of lengths as short as 40 kb. Contrary to previous findings, a comparison of evolutionary breakpoint locations with the chromosomal positions of well mapped common fragile sites and cancer-associated breakpoints failed to reveal any evidence for significant co-location. Primate-specific chromosomal rearrangements were however found to occur preferentially in regions containing segmental duplications and copy number variants. Conclusion: Specific chromosomal regions appear to be prone to recurring rearrangement in different mammalian lineages ('breakpoint reuse') even if the breakpoints themselves are likely to be non-identical. The putative ancestral eutherian genome, reconstructed on the basis of the synteny analysis of 7 vertebrate genome sequences, not only confirmed the results of previous molecular cytogenetic studies but also increased the definition of the inferred structure of ancestral eutherian chromosomes. For the first time in such an analysis, the opossum was included as an outgroup species. This served to confirm our previous model of the ancestral eutherian genome since all ancestral syntenic segment associations were also noted in this marsupial.
The genome sequence of the outbreeding globe artichoke constructed de novo incorporating a phase-aware low-pass sequencing strategy of F1 progeny
Globe artichoke (Cynara cardunculus var. scolymus) is an out-crossing, perennial, multi-use crop species that is grown worldwide and belongs to the Compositae, one of the most successful Angiosperm families. We describe the first genome sequence of globe artichoke. The assembly, comprising 13,588 scaffolds covering 725 Mb of the 1,084 Mb genome, was generated using ~133-fold Illumina sequencing data and encodes 26,889 predicted genes. Re-sequencing (30×) of globe artichoke and cultivated cardoon (C. cardunculus var. altilis) parental genotypes and low-coverage (0.5 to 1×) genotyping-by-sequencing of 163 F1 individuals resulted in 73% of the assembled genome being anchored in 2,178 genetic bins ordered along 17 chromosomal pseudomolecules. This was achieved using a novel pipeline, SOILoCo (Scaffold Ordering by Imputation with Low Coverage), to detect heterozygous regions and assign parental haplotypes with low sequencing read depth and of unknown phase. SOILoCo provides a powerful tool for de novo genome analysis of outcrossing species. Our data will enable genome-scale analyses of evolutionary processes among crops, weeds, and wild species within and beyond the Compositae, and will facilitate the identification of economically important genes from related species.
Male Mouse Recombination Maps for Each Autosome Identified by Chromosome Painting
Linkage maps constructed from genetic analysis of gene order and crossover frequency provide few clues to the factors, such as chromosome structure, that influence the genome-wide distribution of meiotic recombination. To bridge this gap, we have generated the first cytological recombination map that identifies individual autosomes in the male mouse. We prepared meiotic chromosome (synaptonemal complex [SC]) spreads from 110 mouse spermatocytes, identified each autosome by multicolor fluorescence in situ hybridization of chromosome-specific DNA libraries, and mapped >2,000 sites of recombination along individual autosomes, using immunolocalization of MLH1, a mismatch repair protein that marks crossover sites. We show that SC length is strongly correlated with crossover frequency and distribution. Although the length of most SCs corresponds to that predicted from their mitotic chromosome length rank, several SCs are longer or shorter than expected, with corresponding increases and decreases in MLH1 frequency. Although all bivalents share certain general recombination features, such as few crossovers near the centromeres and a high rate of distal recombination, individual bivalents have unique patterns of crossover distribution along their length. In addition to SC length, other, as-yet-unidentified, factors influence crossover distribution, leading to hot regions on individual chromosomes, with recombination frequencies as much as six times higher than average, as well as cold spots with no recombination. By reprobing the SC spreads with genetically mapped BACs, we demonstrate a robust strategy for integrating genetic linkage and physical contig maps with mitotic and meiotic chromosome structure.
Draft genome assemblies using sequencing reads from Oxford Nanopore Technology and Illumina platforms for four species of North American Fundulus killifish
Whole-genome sequencing data from wild-caught individuals of closely related North American killifish species (Fundulus xenicus, Fundulus catenatus, Fundulus nottii, and Fundulus olivaceus) were obtained using long-read Oxford Nanopore Technology (ONT) PromethION and short-read Illumina platforms. Draft de novo reference genome assemblies were generated using a combination of long and short sequencing reads. For each species, the PromethION platform was used to generate 30-45× sequence coverage, and the Illumina platform was used to generate 50-160× sequence coverage. Illumina-only assemblies were fragmented with high numbers of contigs, while ONT-only assemblies were error prone with low BUSCO scores. The highest N50 values, ranging from 0.4 to 2.7 Mb, were from assemblies generated using a combination of short- and long-read data. BUSCO scores were consistently >90% complete using the Eukaryota database. High-quality genomes can be obtained by using short-read Illumina data to polish assemblies generated with long-read ONT data. Draft assemblies and raw sequencing data are available for public use. We encourage use and reuse of these data for assembly benchmarking and other analyses.
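The N50 statistic used above to compare assemblies can be illustrated with a minimal sketch. It assumes the common definition: the contig length L such that contigs of length L or longer contain at least half of the assembled bases. The function name `n50` and the toy contig lengths are hypothetical and are not taken from the study.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig lengths.

    Assumed definition: the length L such that contigs of length >= L
    together contain at least half of all assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Stop at the first contig that brings the cumulative sum to half the total.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: cumulative sum always reaches the total")


# Toy example (made-up lengths): 20 bases in total, and the two longest
# contigs (6 + 5 = 11) already cover half, so the N50 is 5.
toy_n50 = n50([2, 3, 4, 5, 6])
```

A fragmented assembly with many short contigs gives a low N50, which is consistent with the abstract's observation that the hybrid assemblies had the highest values.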
A two-stage digestion of whole murine knee joints for single-cell RNA sequencing.
Objective: Single-cell RNA sequencing (scRNA-seq) is a powerful technology that can be applied to the cells populating the whole knee in the study of joint pathology. The knee contains cells embedded in hard structural tissues, cells in softer tissues and membranes, and immune cells. This creates a technical challenge in preparing a viable and representative cell suspension suitable for use in scRNA-seq in minimal time, where under-digestion may exclude cells in hard tissues, over-digestion may damage soft tissue cells, and prolonged digestion may induce phenotypic drift. We developed a rapid two-stage digestion protocol to overcome these difficulties. Design: A two-stage digest consisting of first collagenase IV, an intermediate cell recovery, then collagenase II on the remaining hard tissue. Cells were sequenced on the 10x Genomics platform. Results: We observed consistent cell numbers and viable single-cell suspensions suitable for scRNA-seq analysis. Comparison of contralateral knees and separate mice showed reproducible cell yields and gene expression patterns by similar cell types. A diverse collection of structural and immune cells was captured, with a majority of immune origin. Two digestions were necessary to capture all cell types. Conclusions: The knee contains a diverse mixture of stromal and immune cells that may be crucial for the study of osteoarthritis. The two-stage digestion presented here reproducibly generated highly viable and representative single-cell suspensions for sequencing from the whole knee. This protocol facilitates transcriptomic studies of the joint as a complete organ.
Are molecular cytogenetics and bioinformatics suggesting diverging models of ancestral mammalian genomes?
"Excavating" ancestral genomes. The recent release of the chicken genome sequence (Hillier et al. 2004) provided exciting news for the comparative genomics community as it allows insights into the early evolution of the human genome. A bird species can now be used as an outgroup to model early mammalian genome organization and reshuffling. The genome sequence data have already been incorporated in a computational analysis of chicken, mouse, rat, and human genome sequences for the reconstruction of the ancestral genome organization of both a mammalian ancestor and a murid rodent ancestor (Hillier et al. 2004; Bourque et al. 2005). This bioinformatic effort joins a molecular cytogenetic model (Richard et al. 2003; Yang et al. 2003; Robinson et al. 2004; Svartman et al. 2004; Wienberg 2004; Froenicke 2005) as the second global approach to explore the architecture of the ancestral eutherian karyotype, a fundamental question in comparative genomics. Since both models use the human genome as reference, they are readily comparable. Surprisingly, however, they share few similarities. Only two small autosomes and the sex chromosomes of the hypothesized ancestral karyotypes are common to both. Unfortunately, given their significance, neither the extent of these differences nor their impact on comparative genomics has been discussed by Bourque and colleagues (2005). In an attempt to redress this, we compare the two methods of ancestral genome reconstruction, verify the resulting models, and discuss reasons for their apparent divergence.
Consequences of Normalizing Transcriptomic and Genomic Libraries of Plant Genomes Using a Duplex-Specific Nuclease and Tetramethylammonium Chloride
Several applications of high throughput genome and transcriptome sequencing would benefit from a reduction of the high-copy-number sequences in the libraries being sequenced and analyzed, particularly when applied to species with large genomes. We adapted and analyzed the consequences of a method that utilizes a thermostable duplex-specific nuclease for reducing the high-copy components in transcriptomic and genomic libraries prior to sequencing. This reduces the time, cost, and computational effort of obtaining informative transcriptomic and genomic sequence data for both fully sequenced and non-sequenced genomes. It also reduces contamination from organellar DNA in preparations of nuclear DNA. Hybridization in the presence of 3 M tetramethylammonium chloride (TMAC), which equalizes the rates of hybridization of GC and AT nucleotide pairs, reduced the bias against sequences with high GC content. Consequences of this method on the reduction of high-copy and enrichment of low-copy sequences are reported for Arabidopsis and lettuce.
The effect of renaturation in 3 M TMAC versus 0.5 M NaCl on the GC composition of normalized transcriptomic and genomic libraries of lettuce.
A) Twenty million reads in each RNA-Seq library (green: renatured in 0.5 M NaCl; blue: renatured in 3 M TMAC) and 10 million reads in each genomic library (red: renatured in 0.5 M NaCl; orange: renatured in 3 M TMAC) were categorized by % GC content and then the percentage of the total number of reads in each GC category was plotted against the % GC content of the category. The average GC content of reads in both types of library renatured in 3 M TMAC was approximately 1% greater than in the libraries renatured in 0.5 M NaCl (+1.1% for mRNA and +1.4% for gDNA). That shift is statistically significant based on a matched pairs difference analysis where the probability of being inferior to the Wilcoxon Signed Rank Test Statistic S is 0.9830.
B) RNA-Seq reads from two libraries, one normalized using 3 M TMAC and the other 0.5 M NaCl, were separately mapped to a QC set of 25,857 uninterrupted ORFs identified in the lettuce transcriptome assembly. The differential abundance of reads representing each ORF in the libraries was then calculated by subtracting the RPKM for each gene in the TMAC-renatured library from the RPKM for that sequence in the NaCl-renatured library and dividing the result by the total RPKM for the gene in both libraries. These values were plotted against the average GC content of that gene (Low coverage, <20 RPKM: gray dots. Medium coverage, 20 to 40 RPKM: blue dots. High coverage, 40 to 300 RPKM: red dots). Statistical significance was assessed by two-tailed Student's t-tests for each RPKM bin. Differences between the hybridizations in NaCl and TMAC were small for transcripts present at moderate levels (Medium RPKM bin). However, NaCl was significantly more effective than TMAC both in reducing abundantly expressed transcripts (RPKM >40; average RPKM −16% with NaCl treatment; P-value = 2.6 × 10⁻⁴¹) and increasing the number of relatively rare transcripts (RPKM <20; average RPKM +5% with NaCl treatment; P-value = 9.8 × 10⁻³³). The negative slopes of the regression lines (gray for low RPKM, green for medium RPKM, and red for high RPKM genes) indicate that genes with higher GC content tended to be represented at higher levels in the library renatured using 3 M TMAC as compared to 0.5 M NaCl and that, conversely, genes with lower GC content tended to be represented at higher levels in the library renatured using 0.5 M NaCl. This trend was more pronounced for genes with higher RPKM.
C) Sixty million genomic reads from two libraries, one renatured using 3 M TMAC and the other 0.5 M NaCl, were separately mapped to a QC set of 25,857 uninterrupted ORFs identified in the lettuce transcriptome assembly. The differential abundance of reads representing each gene in the two libraries was then calculated (as described in the text) and plotted against the average GC content of that gene (Low coverage, <30 RPKM: gray circles. Medium coverage, 30 to 80 RPKM: blue circles. High coverage, >80 RPKM: red circles). The negative slopes of the regression lines (gray for low RPKM, green for medium RPKM, and red for high RPKM) indicate that genes with higher GC content tended to be represented at higher levels in the library normalized using 3 M TMAC as compared with 0.5 M NaCl and that genes with lower GC content tended to be represented at higher levels in the library renatured using 0.5 M NaCl. The two treatments differed highly significantly in the low and medium RPKM bins but not in the high RPKM bin.
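The differential abundance described for panel B can be written out as a small sketch. It reads the caption as (RPKM in NaCl minus RPKM in TMAC) divided by (RPKM in NaCl plus RPKM in TMAC), so a gene that is better represented in the TMAC library gets a negative value. The function names and example numbers are hypothetical, and the GC helper assumes a plain A/C/G/T sequence string rather than the authors' actual pipeline.

```python
def differential_abundance(rpkm_nacl: float, rpkm_tmac: float) -> float:
    """Normalized NaCl-versus-TMAC difference for one gene.

    Assumed reading of the caption: (NaCl - TMAC) / (NaCl + TMAC).
    The value lies between -1 (reads only in TMAC) and +1 (only in NaCl).
    """
    total = rpkm_nacl + rpkm_tmac
    if total == 0:
        raise ValueError("gene has no reads in either library")
    return (rpkm_nacl - rpkm_tmac) / total


def gc_fraction(sequence: str) -> float:
    """Fraction of G and C bases in a sequence (hypothetical helper)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


# Made-up example: a gene at 10 RPKM in the NaCl library and 30 RPKM in the
# TMAC library has a differential abundance of (10 - 30) / (10 + 30) = -0.5.
example_value = differential_abundance(10.0, 30.0)
example_gc = gc_fraction("GGCCAATT")  # 4 of 8 bases are G or C
```

Under this sign convention, the negative regression slopes reported in panels B and C correspond to GC-rich genes falling below zero, that is, being relatively more abundant after renaturation in TMAC.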