30 research outputs found

    Valuation of Property--Economic and Legal Standards

    The DAWGPAWS pipeline for the annotation of genes and transposable elements in plant genomes

    Background: High quality annotation of the genes and transposable elements in complex genomes requires a human-curated integration of multiple sources of computational evidence. These evidences include results from a diversity of ab initio prediction programs as well as homology-based searches. Most of these programs operate on a single contiguous sequence at a time, and the results are generated in a diverse array of readable formats that must be translated to a standardized file format. These translated results must then be concatenated into a single source, and then presented in an integrated form for human curation. Results: We have designed, implemented, and assessed a Perl-based workflow named DAWGPAWS for the generation of computational results for human curation of the genes and transposable elements in plant genomes. The use of DAWGPAWS was found to accelerate annotation of 80–200 kb wheat DNA inserts in bacterial artificial chromosome (BAC) vectors by approximately twenty-fold and to also significantly improve the quality of the annotation in terms of completeness and accuracy. Conclusion: The DAWGPAWS genome annotation pipeline fills an important need in the annotation of plant genomes by generating computational evidences in a high throughput manner, translating these results to a common file format, and facilitating the human curation of these computational results. We have verified the value of DAWGPAWS by using this pipeline to annotate the genes and transposable elements in 220 BAC insertions from the hexaploid wheat genome (Triticum aestivum L.). DAWGPAWS can be applied to annotation efforts in other plant genomes with minor modifications of program-specific configuration files, and the modular design of the workflow facilitates integration into existing pipelines.

    Tuberculosis in the Western Pacific Region: Estimating the burden of disease and return on investment 2020–2030 in four countries

    Background: We aimed to estimate the disease burden of Tuberculosis (TB) and return on investment of TB care in selected high-burden countries of the Western Pacific Region (WPR) until 2030. Methods: We projected the TB epidemic in Viet Nam and Lao People's Democratic Republic (PDR) 2020–2030 using a mathematical model under various scenarios: counterfactual (no TB care); baseline (TB care continues at current levels); and 12 different diagnosis and treatment interventions. We retrieved previous modeling results for China and the Philippines. We pooled the new and existing information on incidence and deaths in the four countries, covering >80% of the TB burden in WPR. We estimated the return on investment of TB care and interventions in Viet Nam and Lao PDR using a Solow model. Findings: In the baseline scenario, TB incidence in the four countries decreased from 97.0/100,000/year (2019) to 90.1/100,000/year (2030), and TB deaths from 83,300/year (2019) to 71,100/year (2030). Active case finding (ACF) strategies (screening people not seeking care for respiratory symptoms) were the most effective single interventions. Return on investment (2020–2030) for TB care in Viet Nam and Lao PDR ranged US$4-US$49/dollar spent; additional interventions brought up to US$2.7/dollar spent. Interpretation: In the modeled countries, TB incidence will only modestly decrease without additional interventions. Interventions that include ACF can reduce TB burden but achieving the End TB incidence and mortality targets will be difficult without new transformational tools (e.g. vaccine, new diagnostic tools, shorter treatment). However, TB care, even at its current level, can bring a multiple-fold return on investment.

    Methylation-sensitive linking libraries enhance gene-enriched sequencing of complex genomes and map DNA methylation domains

    Background: Many plant genomes are resistant to whole-genome assembly due to an abundance of repetitive sequence, leading to the development of gene-rich sequencing techniques. Two such techniques are hypomethylated partial restriction (HMPR) and methylation spanning linker libraries (MSLL). These libraries differ from other gene-rich datasets in having larger insert sizes, and the MSLL clones are designed to provide reads localized to "epigenetic boundaries" where methylation begins or ends. Results: A large-scale study in maize generated 40,299 HMPR sequences and 80,723 MSLL sequences, including MSLL clones exceeding 100 kb. The paired end reads of MSLL and HMPR clones were shown to be effective in linking existing gene-rich sequences into scaffolds. In addition, it was shown that the MSLL clones can be used for anchoring these scaffolds to a BAC-based physical map. The MSLL end reads effectively identified epigenetic boundaries, as indicated by their preferential alignment to regions upstream and downstream from annotated genes. The ability to precisely map long stretches of fully methylated DNA sequence is a unique outcome of MSLL analysis, and was also shown to provide evidence for errors in gene identification. MSLL clones were observed to be significantly more repeat-rich in their interiors than in their end reads, confirming the correlation between methylation and retroelement content. Both MSLL and HMPR reads were found to be substantially gene-enriched, with the SalI MSLL libraries being the most highly enriched (31% align to an EST contig), while the HMPR clones exhibited exceptional depletion of repetitive DNA (to ~11%). These two techniques were compared with other gene-enrichment methods, and shown to be complementary. Conclusion: MSLL technology provides an unparalleled approach for mapping the epigenetic status of repetitive blocks and for identifying sequences mis-identified as genes. Although the types and natures of epigenetic boundaries are barely understood at this time, MSLL technology flags both approximate boundaries and methylated genes that deserve additional investigation. MSLL and HMPR sequences provide a valuable resource for maize genome annotation, and are a uniquely valuable complement to any plant genome sequencing project. In order to make these results fully accessible to the community, a web display was developed that shows the alignment of MSLL, HMPR, and other gene-rich sequences to the BACs; this display is continually updated with the latest ESTs and BAC sequences.

    Exceptional Diversity, Non-Random Distribution, and Rapid Evolution of Retroelements in the B73 Maize Genome

    Recent comprehensive sequence analysis of the maize genome now permits detailed discovery and description of all transposable elements (TEs) in this complex nuclear environment. Reiteratively optimized structural and homology criteria were used in the computer-assisted search for retroelements, TEs that transpose by reverse transcription of an RNA intermediate, with the final results verified by manual inspection. Retroelements were found to occupy the majority (>75%) of the nuclear genome in maize inbred B73. Unprecedented genetic diversity was discovered in the long terminal repeat (LTR) retrotransposon class of retroelements, with >400 families (>350 newly discovered) contributing >31,000 intact elements. The two other classes of retroelements, SINEs (four families) and LINEs (at least 30 families), were observed to contribute 1,991 and ∼35,000 copies, respectively, or a combined ∼1% of the B73 nuclear genome. With regard to fully intact elements, the median copy number across all retroelement families in maize was 2 because >250 LTR retrotransposon families contained only one or two intact members that could be detected in the B73 draft sequence. The majority, perhaps all, of the investigated retroelement families exhibited non-random dispersal across the maize genome, with LINEs, SINEs, and many low-copy-number LTR retrotransposons exhibiting a bias for accumulation in gene-rich regions. In contrast, most (but not all) medium- and high-copy-number LTR retrotransposons were found to preferentially accumulate in gene-poor regions like pericentromeric heterochromatin, while a few high-copy-number families exhibited the opposite bias. Regions of the genome with the highest LTR retrotransposon density contained the lowest LTR retrotransposon diversity. These results indicate that the maize genome provides a great number of different niches for the survival and procreation of a great variety of retroelements that have evolved to differentially occupy and exploit this genomic diversity.

    A draft physical map of a D-genome cotton species (Gossypium raimondii)

    Background: Genetically anchored physical maps of large eukaryotic genomes have proven useful both for their intrinsic merit and as an adjunct to genome sequencing. Cultivated tetraploid cottons, Gossypium hirsutum and G. barbadense, share a common ancestor formed by a merger of the A and D genomes about 1-2 million years ago. Toward the long-term goal of characterizing the spectrum of diversity among cotton genomes, the worldwide cotton community has prioritized the D genome progenitor Gossypium raimondii for complete sequencing. Results: A whole genome physical map of G. raimondii, the putative D genome ancestral species of tetraploid cottons, was assembled, integrating genetically-anchored overgo hybridization probes, agarose based fingerprints and 'high information content fingerprinting' (HICF). A total of 13,662 BAC-end sequences and 2,828 DNA probes were used in genetically anchoring 1585 contigs to a cotton consensus genetic map, and 370 and 438 contigs, respectively, to Arabidopsis thaliana (AT) and Vitis vinifera (VV) whole genome sequences. Conclusion: Several lines of evidence suggest that the G. raimondii genome is comprised of two qualitatively different components. Much of the gene rich component is aligned to the Arabidopsis and Vitis vinifera genomes and shows promise for utilizing translational genomic approaches in understanding this important genome and its resident genes. The integrated genetic-physical map is of value both in assembling and validating a planned reference sequence.

    A physical map for the Amborella trichopoda genome sheds light on the evolution of angiosperm genome structure

    Background: Recent phylogenetic analyses have identified Amborella trichopoda, an understory tree species endemic to the forests of New Caledonia, as sister to a clade including all other known flowering plant species. The Amborella genome is a unique reference for understanding the evolution of angiosperm genomes because it can serve as an outgroup to root comparative analyses. A physical map, BAC end sequences and sample shotgun sequences provide a first view of the 870 Mbp Amborella genome. Results: Analysis of Amborella BAC ends sequenced from each contig suggests that the density of long terminal repeat retrotransposons is negatively correlated with that of protein coding genes. Syntenic, presumably ancestral, gene blocks were identified in comparisons of the Amborella BAC contigs and the sequenced Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera and Oryza sativa genomes. Parsimony mapping of the loss of synteny corroborates previous analyses suggesting that the rate of structural change has been more rapid on lineages leading to Arabidopsis and Oryza compared with lineages leading to Populus and Vitis. The gamma paleohexaploidy event identified in the Arabidopsis, Populus and Vitis genomes is shown to have occurred after the divergence of all other known angiosperms from the lineage leading to Amborella. Conclusions: When placed in the context of a physical map, BAC end sequences representing just 5.4% of the Amborella genome have facilitated reconstruction of gene blocks that existed in the last common ancestor of all flowering plants. The Amborella genome is an invaluable reference for inferences concerning the ancestral angiosperm and subsequent genome evolution.

    Bacterial Artificial Chromosome Data Management (BACMan)

    Bacterial Artificial Chromosome Data Management (BACMan) is a Microsoft Access based application designed for the management and analysis of hybridization data related to the high throughput screening of large insert genomic libraries associated with physical mapping projects. BACMan integrates information from the entire process involved in BAC based hybridization screening so that final hybridization data are tractable through individual autoradiographs, probe preparation, and gridding of the blots used in the experiment. All BAC library plates and autoradiographs are indexed using a barcode system both to facilitate this data tracking and to ensure quality control. This barcode system also allows for computer-assisted scoring of autoradiographs and therefore increases the potential throughput of an individual user. BACMan also facilitates the design and deconvolution of pooled probe multiplexed experiments to maximize data generation, and allows for interoperability with FPC. Upload of resource previously published to SourceForge in 2005.

    An SNP Resource for Rice Genetics and Breeding Based on Subspecies Indica and Japonica Genome Alignments

    Dense coverage of the rice genome with polymorphic DNA markers is an invaluable tool for DNA marker-assisted breeding, positional cloning, and a wide range of evolutionary studies. We have aligned drafts of two rice subspecies, indica and japonica, and analyzed levels and patterns of genetic diversity. After filtering multiple copy and low quality sequence, 408,898 candidate DNA polymorphisms (SNPs/INDELs) were discerned between the two subspecies. These filters have the consequence that our data set includes only a subset of the available SNPs (in particular excluding large numbers of SNPs that may occur between repetitive DNA alleles) but increase the likelihood that this subset is useful: Direct sequencing suggests that 79.8% ± 7.5% of the in silico SNPs are real. The SNP sample in our database is not randomly distributed across the genome. In fact, 566 rice genomic regions had unusually high (328 contigs/48.6 Mb/13.6% of genome) or low (237 contigs/64.7 Mb/18.1% of genome) polymorphism rates. Many SNP-poor regions were substantially longer than most SNP-rich regions, covering up to 4 Mb, and possibly reflecting introgression between the respective gene pools that may have occurred hundreds of years ago. Although 46.2% ± 8.3% of the SNPs differentiate other pairs of japonica and indica genotypes, SNP rates in rice were not predictive of evolutionary rates for corresponding genes in another grass species, sorghum. The data set is freely available at http://www.plantgenome.uga.edu/snp