762 research outputs found

    Improving Phrap-Based Assembly of the Rat Using “Reliable” Overlaps

    The assembly methods used for whole-genome shotgun (WGS) data have a major impact on the quality of resulting draft genomes. We present a novel algorithm to generate a set of “reliable” overlaps based on identifying repeat k-mers. To demonstrate the benefits of using reliable overlaps, we have created a version of the Phrap assembly program that uses only overlaps from a specific list. We call this version PhrapUMD. Integrating PhrapUMD and our “reliable-overlap” algorithm with the Baylor College of Medicine assembler, Atlas, we assemble the BACs from the Rattus norvegicus genome project. Starting with the same data as the Nov. 2002 Atlas assembly, we compare our results and the Atlas assembly to the 4.3 Mb of rat sequence in the 21 BACs that have been finished. Our version of the draft assembly of the 21 BACs increases the coverage of finished sequence from 93.4% to 96.3%, while simultaneously reducing the base error rate from 4.5 to 1.1 errors per 10,000 bases. There are a number of ways of assessing the relative merits of assemblies when the finished sequence is available. If one views the overall quality of an assembly as proportional to the inverse of the product of the error rate and sequence missed, then the assembly presented here is seven times better. The UMD Overlapper with options for reliable overlaps is available from the authors at http://www.genome.umd.edu. We also provide the changes to the Phrap source code enabling it to use only the reliable overlaps.
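The "seven times better" figure can be checked from the numbers quoted in the abstract. The sketch below is only an illustration of that arithmetic: the function name `assembly_quality` is hypothetical, and "sequence missed" is assumed to mean 100% minus the reported coverage of finished sequence.

```python
def assembly_quality(errors_per_10kb: float, coverage_pct: float) -> float:
    """Quality score as defined in the abstract: the inverse of
    (error rate x sequence missed). Units cancel when two scores are compared."""
    missed_pct = 100.0 - coverage_pct  # assumption: missed = 100% - coverage
    return 1.0 / (errors_per_10kb * missed_pct)


# Figures quoted in the abstract for the 21 finished BACs.
atlas_quality = assembly_quality(errors_per_10kb=4.5, coverage_pct=93.4)
umd_quality = assembly_quality(errors_per_10kb=1.1, coverage_pct=96.3)

# Ratio of the two scores: (4.5 * 6.6) / (1.1 * 3.7), roughly 7.3.
ratio = umd_quality / atlas_quality
print(f"Relative quality: {ratio:.1f}x")
```

Under this reading the ratio comes out a little above seven, which is consistent with the claim in the abstract.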

    Microarrays for global expression constructed with a low redundancy set of 27,500 sequenced cDNAs representing an array of developmental stages and physiological conditions of the soybean plant

    BACKGROUND: Microarrays are an important tool with which to examine coordinated gene expression. Soybean (Glycine max) is one of the most economically valuable crop species in the world food supply. In order to accelerate both gene discovery as well as hypothesis-driven research in soybean, global expression resources needed to be developed. The applications of microarray for determining patterns of expression in different tissues or during conditional treatments by dual labeling of the mRNAs are unlimited. In addition, discovery of the molecular basis of traits through examination of naturally occurring variation in hundreds of mutant lines could be enhanced by the construction and use of soybean cDNA microarrays. RESULTS: We report the construction and analysis of a low redundancy 'unigene' set of 27,513 clones that represent a variety of soybean cDNA libraries made from a wide array of source tissue and organ systems, developmental stages, and stress or pathogen-challenged plants. The set was assembled from the 5' sequence data of the cDNA clones using cluster analysis programs. The selected clones were then physically reracked and sequenced at the 3' end. In order to increase gene discovery from immature cotyledon libraries that contain abundant mRNAs representing storage protein gene families, we utilized a high density filter normalization approach to preferentially select more weakly expressed cDNAs. All 27,513 cDNA inserts were amplified by polymerase chain reaction. The amplified products, along with some repetitively spotted control or 'choice' clones, were used to produce three 9,728-element microarrays that have been used to examine tissue specific gene expression and global expression in mutant isolines. CONCLUSIONS: Global expression studies will be greatly aided by the availability of the sequence-validated and low redundancy cDNA sets described in this report. 
These cDNAs and ESTs represent a wide array of developmental stages and physiological conditions of the soybean plant. We also demonstrate that the quality of the data from the soybean cDNA microarrays is sufficiently reliable to examine isogenic lines that differ with respect to a mutant phenotype and thereby to define a small list of candidate genes potentially encoding or modulated by the mutant phenotype.

    Removal of PCR Error Products and Unincorporated Primers by Metal-Chelate Affinity Chromatography

    Immobilized Metal Affinity Chromatography (IMAC) has been used for decades to purify proteins on the basis of amino acid content, especially surface-exposed histidines and “histidine tags” genetically added to recombinant proteins. We and others have extended the use of IMAC to purification of nucleic acids via interactions with the nucleotide bases, especially purines, of single-stranded RNA and DNA. We also have demonstrated the purification of plasmid DNA from contaminating genomic DNA by IMAC capture of selectively-denatured genomic DNA. Here we describe an efficient method of purifying PCR products by specifically removing error products, excess primers, and unincorporated dNTPs from PCR product mixtures using flow-through metal-chelate affinity adsorption. By flowing a PCR product mixture through a Cu2+-iminodiacetic acid (IDA) agarose spin column, 94–99% of the dNTPs and nearly all the primers can be removed. Many of the error products commonly formed by Taq polymerase also are removed. Sequencing of the IMAC-processed PCR product gave base-calling accuracy comparable to that obtained with a commercial PCR product purification method. The results show that IMAC matrices (specifically Cu2+-IDA agarose) can be used for the purification of PCR products. Due to the generality of the base-specific mechanism of adsorption, IMAC matrices may also be used in the purification of oligonucleotides, cDNA, mRNA and microRNAs.

    Culture-Independent Microbiological Analysis of Foley Urinary Catheter Biofilms

    Background: Prevention of catheter-associated urinary tract infection (CAUTI), a leading cause of nosocomial disease, is complicated by the propensity of bacteria to form biofilms on indwelling medical devices [1,2,3,4,5]. Methodology/Principal Findings: To better understand the microbial diversity of these communities, we report the results of a culture-independent bacterial survey of Foley urinary catheters obtained from patients following total prostatectomy. Two patient subsets were analyzed, based on treatment or no treatment with systemic fluoroquinolone antibiotics during convalescence. Results indicate the presence of diverse polymicrobial assemblages that were most commonly observed in patients who did not receive systemic antibiotics. The communities typically contained both Gram-positive and Gram-negative microorganisms that included multiple potential pathogens. Conclusion/Significance: Prevention and treatment of CAUTI must take into consideration the possible polymicrobial nature of any particular infection.

    Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments

    Background: High-throughput sequencing technologies, such as the Illumina Genome Analyzer, are powerful new tools for investigating a wide range of biological and medical questions. Statistical and computational methods are key for drawing meaningful and accurate conclusions from the massive and complex datasets generated by the sequencers. We provide a detailed evaluation of statistical methods for normalization and differential expression (DE) analysis of Illumina transcriptome sequencing (mRNA-Seq) data. Results: We compare statistical methods for detecting genes that are significantly DE between two types of biological samples and find that there are substantial differences in how the test statistics handle low-count genes. We evaluate how DE results are affected by features of the sequencing platform, such as varying gene lengths, base-calling calibration method (with and without phi X control lane), and flow-cell/library preparation effects. We investigate the impact of the read count normalization method on DE results and show that the standard approach of scaling by total lane counts (e.g., RPKM) can bias estimates of DE. We propose more general quantile-based normalization procedures and demonstrate an improvement in DE detection. Conclusions: Our results have significant practical and methodological implications for the design and analysis of mRNA-Seq experiments. They highlight the importance of appropriate statistical methods for normalization and DE inference, to account for features of the sequencing platform that could impact the accuracy of results. They also reveal the need for further research in the development of statistical and computational methods for mRNA-Seq.
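To illustrate why scaling by total lane counts can bias DE estimates, here is a minimal sketch on made-up counts. It is not the authors' procedure: the read counts, the gene length, and the choice of the upper quartile of nonzero counts as the quantile-based scaling factor are all assumptions made for illustration.

```python
from statistics import quantiles


def rpkm(count: int, gene_length_bp: int, total_lane_reads: int) -> float:
    """Reads per kilobase of gene per million reads in the lane."""
    return count * 1e9 / (gene_length_bp * total_lane_reads)


def upper_quartile_factor(counts: list[int]) -> float:
    """75th percentile of the nonzero counts in one lane (assumed
    quantile-based scaling factor for this sketch)."""
    nonzero = [c for c in counts if c > 0]
    return quantiles(nonzero, n=4, method="inclusive")[2]


# Hypothetical lanes: four genes are unchanged, and only the last gene is
# strongly up-regulated in lane B.
lane_a = [10, 20, 30, 40, 100]
lane_b = [10, 20, 30, 40, 1000]
gene_length_bp = 1000  # same hypothetical length for every gene

# Total-count scaling: the single highly expressed gene inflates lane B's
# total, so the unchanged first gene appears strongly down-regulated.
fold_total = (rpkm(lane_b[0], gene_length_bp, sum(lane_b))
              / rpkm(lane_a[0], gene_length_bp, sum(lane_a)))

# Upper-quartile scaling is insensitive to that one extreme gene.
fold_uq = ((lane_b[0] / upper_quartile_factor(lane_b))
           / (lane_a[0] / upper_quartile_factor(lane_a)))

print(f"fold change, total-count scaling:    {fold_total:.2f}")
print(f"fold change, upper-quartile scaling: {fold_uq:.2f}")
```

In this toy case the first gene has identical counts in both lanes, yet total-count scaling reports it at under a fifth of its lane A level, while the quantile-based factor leaves it unchanged.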

    Sexual Size Dimorphism and Body Condition in the Australasian Gannet

    Funding: The research was financially supported by the Holsworth Wildlife Research Endowment. Acknowledgments: We thank the Victorian Marine Science Consortium, Sea All Dolphin Swim, Parks Victoria, and the Point Danger Management Committee for logistical support. We are grateful for the assistance of the many field volunteers involved in the study. Peer reviewed.

    Factors associated with problem drinking among women employed in food and recreational facilities in northern Tanzania.

    BACKGROUND: There is growing evidence that alcohol consumption is associated with increased risk of HIV infection. To determine factors associated with problem drinking, we analyzed data collected in two prospective cohorts of at-risk female food and recreational facility workers in northern Tanzania. METHODS: We enrolled HIV seronegative women aged 18-44 years and employed in the towns of Geita, Kahama, Moshi, and Shinyanga. At enrolment, women were interviewed to obtain information about alcohol use, using CAGE and AUDIT screening scales, and risk factors for HIV infection. Blood and genital samples were collected for detection of HIV and sexually transmitted infections (STIs). We characterized alcohol use, concordance, and agreement of the scales, and examined the associations between characteristics of participants and problem drinking as defined by both scales using logistic regression. Lastly, we assessed problem drinking as a risk factor for recent sexual behavior and prevalent STIs. RESULTS: Among enrollees, 68% of women reported ever drinking alcohol; of these, 76% reported drinking alcohol in the past 12 months. The prevalence of problem drinking was 20% using CAGE and 13% using AUDIT. Overall concordance between the scales was 75.0% with a Kappa statistic of 0.58. After adjusting for age, independent factors associated with problem drinking, on both scales, were marital status, occupation, facility type, increasing number of lifetime sexual partners, and transactional sex in the past 12 months. In addition, women who were problem drinkers on either scale were more likely to report having ≥ 1 sexual partner (CAGE: aOR = 1.56, 95% confidence interval, CI: 1.10-2.23; AUDIT: aOR = 2.00, 95% CI: 1.34-3.00) and transactional sex (CAGE: aOR = 1.79, 95% CI: 1.26-2.56; AUDIT: aOR = 1.51, 95% CI: 1.04-2.18), in the past 3 months. 
CONCLUSION: These findings suggest that interventions to reduce problem drinking in this population may reduce high-risk sexual behaviors and contribute to lowering the risk of HIV infection.
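The abstract reports both a raw concordance and a Kappa statistic for the two screening scales. The sketch below shows how the two quantities relate, using Cohen's kappa on a 2x2 cross-classification. The counts are entirely hypothetical and do not reproduce the study's data; the function name `concordance_and_kappa` is also an assumption.

```python
def concordance_and_kappa(both_pos: int, first_only: int,
                          second_only: int, both_neg: int) -> tuple[float, float]:
    """Observed agreement and Cohen's kappa for two binary classifications
    of the same participants."""
    n = both_pos + first_only + second_only + both_neg
    observed = (both_pos + both_neg) / n

    # Agreement expected by chance, from each scale's marginal positive rate.
    first_pos = (both_pos + first_only) / n
    second_pos = (both_pos + second_only) / n
    expected = first_pos * second_pos + (1 - first_pos) * (1 - second_pos)

    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical cross-classification of 100 participants by two scales.
observed, kappa = concordance_and_kappa(
    both_pos=40, first_only=15, second_only=10, both_neg=35
)
print(f"concordance = {observed:.2f}, kappa = {kappa:.2f}")
```

With these invented counts the scales agree on 75% of participants, but half of that agreement is expected by chance, so kappa is 0.5. This is why a kappa value is lower than the raw concordance it accompanies.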

    Development of Genomic Resources for Pacific Herring through Targeted Transcriptome Pyrosequencing

    Pacific herring (Clupea pallasii) support commercially and culturally important fisheries but have experienced significant additional pressure from a variety of anthropogenic and environmental sources. In order to provide genomic resources to facilitate organismal and population level research, high-throughput pyrosequencing (Roche 454) was carried out on transcriptome libraries from liver and testes samples taken in Prince William Sound, the Bering Sea, and the Gulf of Alaska. Over 40,000 contigs were identified with an average length of 728 bp. We describe an annotated transcriptome as well as a workflow for single nucleotide polymorphism (SNP) discovery and validation. A subset of 96 candidate SNPs, chosen from 10,933 potential SNPs, was tested using a combination of Sanger sequencing and high-resolution melt-curve analysis. Five SNPs supported between-ocean-basin differentiation, while one SNP associated with immune function provided high differentiation between Prince William Sound and Kodiak Island within the Gulf of Alaska. These genomic resources provide a basis for environmental physiology studies and opportunities for marker development and subsequent population structure analysis.