36 research outputs found

    Integrating sequence with FPC fingerprint maps

    Get PDF
    Recent advances in both clone fingerprinting and draft sequencing technology have made it increasingly common for species to have a bacterial artificial clone (BAC) fingerprint map, BAC end sequences (BESs) and draft genomic sequence. The FPC (fingerprinted contigs) software package contains three modules that maximize the value of these resources. The BSS (blast some sequence) module provides a way to easily view the results of aligning draft sequence to the BESs, and integrates the results with the following two modules. The MTP (minimal tiling path) module uses sequence and fingerprints to determine a minimal tiling path of clones. The DSI (draft sequence integration) module aligns draft sequences to FPC contigs, displays them alongside the contigs and identifies potential discrepancies; the alignment can be based on either individual BES alignments to the draft, or on the locations of BESs that have been assembled into the draft. FPC also supports high-throughput fingerprint map generation as its time-intensive functions have been parallelized for Unix-based desktops or servers with multiple CPUs. Simulation results are provided for the MTP, DSI and parallelization. These features are in the FPC V9.3 software package, which is freely available

    A genetically anchored physical framework for Theobroma cacao cv. Matina 1-6

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>The fermented dried seeds of <it>Theobroma cacao </it>(cacao tree) are the main ingredient in chocolate. World cocoa production was estimated to be 3 million tons in 2010 with an annual estimated average growth rate of 2.2%. The cacao bean production industry is currently under threat from a rise in fungal diseases including black pod, frosty pod, and witches' broom. In order to address these issues, genome-sequencing efforts have been initiated recently to facilitate identification of genetic markers and genes that could be utilized to accelerate the release of robust <it>T. cacao </it>cultivars. However, problems inherent with assembly and resolution of distal regions of complex eukaryotic genomes, such as gaps, chimeric joins, and unresolvable repeat-induced compressions, have been unavoidably encountered with the sequencing strategies selected.</p> <p>Results</p> <p>Here, we describe the construction of a BAC-based integrated genetic-physical map of the <it>T. cacao </it>cultivar Matina 1-6 which is designed to augment and enhance these sequencing efforts. Three BAC libraries, each comprised of 10× coverage, were constructed and fingerprinted. 230 genetic markers from a high-resolution genetic recombination map and 96 Arabidopsis-derived conserved ortholog set (COS) II markers were anchored using pooled overgo hybridization. A dense tile path consisting of 29,383 BACs was selected and end-sequenced. The physical map consists of 154 contigs and 4,268 singletons. Forty-nine contigs are genetically anchored and ordered to chromosomes for a total span of 307.2 Mbp. The unanchored contigs (105) span 67.4 Mbp and therefore the estimated genome size of <it>T. cacao </it>is 374.6 Mbp. A comparative analysis with <it>A. thaliana, V. vinifera</it>, and <it>P. trichocarpa </it>suggests that comparisons of the genome assemblies of these distantly related species could provide insights into genome structure, evolutionary history, conservation of functional sites, and improvements in physical map assembly. A comparison between the two <it>T. cacao </it>cultivars Matina 1-6 and Criollo indicates a high degree of collinearity in their genomes, yet rearrangements were also observed.</p> <p>Conclusions</p> <p>The results presented in this study are a stand-alone resource for functional exploitation and enhancement of <it>Theobroma cacao </it>but are also expected to complement and augment ongoing genome-sequencing efforts. This resource will serve as a template for refinement of the <it>T. cacao </it>genome through gap-filling, targeted re-sequencing, and resolution of repetitive DNA arrays.</p

    The Physical and Genetic Framework of the Maize B73 Genome

    Get PDF
    Maize is a major cereal crop and an important model system for basic biological research. Knowledge gained from maize research can also be used to genetically improve its grass relatives such as sorghum, wheat, and rice. The primary objective of the Maize Genome Sequencing Consortium (MGSC) was to generate a reference genome sequence that was integrated with both the physical and genetic maps. Using a previously published integrated genetic and physical map, combined with in-coming maize genomic sequence, new sequence-based genetic markers, and an optical map, we dynamically picked a minimum tiling path (MTP) of 16,910 bacterial artificial chromosome (BAC) and fosmid clones that were used by the MGSC to sequence the maize genome. The final MTP resulted in a significantly improved physical map that reduced the number of contigs from 721 to 435, incorporated a total of 8,315 mapped markers, and ordered and oriented the majority of FPC contigs. The new integrated physical and genetic map covered 2,120 Mb (93%) of the 2,300-Mb genome, of which 405 contigs were anchored to the genetic map, totaling 2,103.4 Mb (99.2% of the 2,120 Mb physical map). More importantly, 336 contigs, comprising 94.0% of the physical map (∼1,993 Mb), were ordered and oriented. Finally we used all available physical, sequence, genetic, and optical data to generate a golden path (AGP) of chromosome-based pseudomolecules, herein referred to as the B73 Reference Genome Sequence version 1 (B73 RefGen_v1)

    Genome-wide BAC-end sequencing of Cucumis melo using two BAC libraries

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Although melon (<it>Cucumis melo </it>L.) is an economically important fruit crop, no genome-wide sequence information is openly available at the current time. We therefore sequenced BAC-ends representing a total of 33,024 clones, half of them from a previously described melon BAC library generated with restriction endonucleases and the remainder from a new random-shear BAC library.</p> <p>Results</p> <p>We generated a total of 47,140 high-quality BAC-end sequences (BES), 91.7% of which were paired-BES. Both libraries were assembled independently and then cross-assembled to obtain a final set of 33,372 non-redundant, high-quality sequences. These were grouped into 6,411 contigs (4.5 Mb) and 26,961 non-assembled BES (14.4 Mb), representing ~4.2% of the melon genome. The sequences were used to screen genomic databases, identifying 7,198 simple sequence repeats (corresponding to one microsatellite every 2.6 kb) and 2,484 additional repeats of which 95.9% represented transposable elements. The sequences were also used to screen expressed sequence tag (EST) databases, revealing 11,372 BES that were homologous to ESTs. This suggests that ~30% of the melon genome consists of coding DNA. We observed regions of microsynteny between melon paired-BES and six other dicotyledonous plant genomes.</p> <p>Conclusion</p> <p>The analysis of nearly 50,000 BES from two complementary genomic libraries covered ~4.2% of the melon genome, providing insight into properties such as microsatellite and transposable element distribution, and the percentage of coding DNA. The observed synteny between melon paired-BES and six other plant genomes showed that useful comparative genomic data can be derived through large scale BAC-end sequencing by anchoring a small proportion of the melon genome to other sequenced genomes.</p

    A hybrid BAC physical map of potato: a framework for sequencing a heterozygous genome

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Potato is the world's third most important food crop, yet cultivar improvement and genomic research in general remain difficult because of the heterozygous and tetraploid nature of its genome. The development of physical map resources that can facilitate genomic analyses in potato has so far been very limited. Here we present the methods of construction and the general statistics of the first two genome-wide BAC physical maps of potato, which were made from the heterozygous diploid clone RH89-039-16 (RH).</p> <p>Results</p> <p>First, a gel electrophoresis-based physical map was made by AFLP fingerprinting of 64478 BAC clones, which were aligned into 4150 contigs with an estimated total length of 1361 Mb. Screening of BAC pools, followed by the KeyMaps <it>in silico </it>anchoring procedure, identified 1725 AFLP markers in the physical map, and 1252 BAC contigs were anchored the ultradense potato genetic map. A second, sequence-tag-based physical map was constructed from 65919 whole genome profiling (WGP) BAC fingerprints and these were aligned into 3601 BAC contigs spanning 1396 Mb. The 39733 BAC clones that overlap between both physical maps provided anchors to 1127 contigs in the WGP physical map, and reduced the number of contigs to around 2800 in each map separately. Both physical maps were 1.64 times longer than the 850 Mb potato genome. Genome heterozygosity and incomplete merging of BAC contigs are two factors that can explain this map inflation. The contig information of both physical maps was united in a single table that describes hybrid potato physical map.</p> <p>Conclusions</p> <p>The AFLP physical map has already been used by the Potato Genome Sequencing Consortium for sequencing 10% of the heterozygous genome of clone RH on a BAC-by-BAC basis. By layering a new WGP physical map on top of the AFLP physical map, a genetically anchored genome-wide framework of 322434 sequence tags has been created. This reference framework can be used for anchoring and ordering of genomic sequences of clone RH (and other potato genotypes), and opens the possibility to finish sequencing of the RH genome in a more efficient way via high throughput next generation approaches.</p

    A physical map of Brassica oleracea shows complexity of chromosomal changes following recursive paleopolyploidizations

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Evolution of the Brassica species has been recursively affected by polyploidy events, and comparison to their relative, <it>Arabidopsis thaliana</it>, provides means to explore their genomic complexity.</p> <p>Results</p> <p>A genome-wide physical map of a rapid-cycling strain of <it>B. oleracea </it>was constructed by integrating high-information-content fingerprinting (HICF) of Bacterial Artificial Chromosome (BAC) clones with hybridization to sequence-tagged probes. Using 2907 contigs of two or more BACs, we performed several lines of comparative genomic analysis. Interspecific DNA synteny is much better preserved in euchromatin than heterochromatin, showing the qualitative difference in evolution of these respective genomic domains. About 67% of contigs can be aligned to the Arabidopsis genome, with 96.5% corresponding to euchromatic regions, and 3.5% (shown to contain repetitive sequences) to pericentromeric regions. Overgo probe hybridization data showed that contigs aligned to Arabidopsis euchromatin contain ~80% of low-copy-number genes, while genes with high copy number are much more frequently associated with pericentromeric regions. We identified 39 interchromosomal breakpoints during the diversification of <it>B. oleracea </it>and <it>Arabidopsis thaliana</it>, a relatively high level of genomic change since their divergence. Comparison of the <it>B. oleracea </it>physical map with Arabidopsis and other available eudicot genomes showed appreciable 'shadowing' produced by more ancient polyploidies, resulting in a web of relatedness among contigs which increased genomic complexity.</p> <p>Conclusions</p> <p>A high-resolution genetically-anchored physical map sheds light on Brassica genome organization and advances positional cloning of specific genes, and may help to validate genome sequence assembly and alignment to chromosomes.</p> <p>All the physical mapping data is freely shared at a WebFPC site (<url>http://lulu.pgml.uga.edu/fpc/WebAGCoL/brassica/WebFPC/</url>; Temporarily password-protected: account: pgml; password: 123qwe123.</p

    Finishing the euchromatic sequence of the human genome

    Get PDF
    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human enome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead
    corecore