335 research outputs found

    SNP-PHAGE – High throughput SNP discovery pipeline

    BACKGROUND: Single nucleotide polymorphisms (SNPs) as defined here are single base sequence changes or short insertions/deletions between or within individuals of a given species. As a result of their abundance and the availability of high throughput analysis technologies, SNP markers have begun to replace traditional markers such as restriction fragment length polymorphisms (RFLPs), amplified fragment length polymorphisms (AFLPs) and simple sequence repeats (SSRs, or microsatellites) for fine mapping and association studies in several species. For SNP discovery from chromatogram data, several bioinformatics programs have to be combined to generate an analysis pipeline. Results have to be stored in a relational database to facilitate interrogation through queries or to generate data for further analyses such as determination of linkage disequilibrium and identification of common haplotypes. Although these tasks are routinely performed by several groups, an integrated open source SNP discovery pipeline that can be easily adapted by new groups interested in SNP marker development is currently unavailable. RESULTS: We developed SNP-PHAGE (SNP discovery Pipeline with additional features for identification of common haplotypes within a sequence tagged site (Haplotype Analysis) and GenBank (-dbSNP) submissions). This tool was applied to sequence traces from diverse soybean genotypes to discover over 10,000 SNPs. The package was developed on a UNIX/Linux platform, is written in Perl and uses a MySQL database. Scripts to generate a user-friendly web interface are also provided, with common queries for preliminary data analysis. A machine learning tool developed by this group for increasing the efficiency of SNP discovery is integrated into the package as an optional feature. The SNP-PHAGE package is being made available as open source. CONCLUSION: SNP-PHAGE provides a bioinformatics solution for high throughput SNP discovery, identification of common haplotypes within an amplicon, and GenBank (dbSNP) submissions. SNP selection and visualization are aided through a user-friendly web interface. This tool is useful for analyzing sequence tagged sites (STSs) of genomic sequences, and the software can serve as a starting point for groups interested in developing SNP markers.

    Application of machine learning in SNP discovery

    BACKGROUND: Single nucleotide polymorphisms (SNP) constitute more than 90% of the genetic variation, and hence can account for most trait differences among individuals in a given species. Polymorphism detection software PolyBayes and PolyPhred give high false positive SNP predictions even with stringent parameter values. We developed a machine learning (ML) method to augment PolyBayes to improve its prediction accuracy. ML methods have also been successfully applied to other bioinformatics problems in predicting genes, promoters, transcription factor binding sites and protein structures. RESULTS: The ML program C4.5 was applied to a set of features in order to build a SNP classifier from training data based on human expert decisions (True/False). The training data were 27,275 candidate SNP generated by sequencing 1973 STS (sequence tag sites) (12 Mb) in both directions from 6 diverse homozygous soybean cultivars and PolyBayes analysis. Test data of 18,390 candidate SNP were generated similarly from 1359 additional STS (8 Mb). SNP from both sets were classified by experts. After training the ML classifier, it agreed with the experts on 97.3% of test data compared with 7.8% agreement between PolyBayes and experts. The PolyBayes positive predictive values (PPV) (i.e., fraction of candidate SNP being real) were 7.8% for all predictions and 16.7% for those with 100% posterior probability of being real. Using ML improved the PPV to 84.8%, a 5- to 10-fold increase. While both ML and PolyBayes produced a similar number of true positives, the ML program generated only 249 false positives as compared to 16,955 for PolyBayes. The complexity of the soybean genome may have contributed to high false SNP predictions by PolyBayes and hence results may differ for other genomes. CONCLUSION: A machine learning (ML) method was developed as a supplementary feature to the polymorphism detection software for improving prediction accuracies. The results from this study indicate that a trained ML classifier can significantly reduce human intervention and in this case achieved a 5–10 fold enhanced productivity. The optimized feature set and ML framework can also be applied to all polymorphism discovery software. ML support software is written in Perl and can be easily integrated into an existing SNP discovery pipeline.
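    The PPV figures quoted in this abstract can be checked with a little arithmetic. The sketch below is only an illustration and is not part of the authors' Perl software. It assumes that all 18,390 test candidates count as PolyBayes positive predictions, so that PolyBayes true positives equal candidates minus false positives. The ML true-positive count is not stated in the abstract, so the value computed here is an implied estimate derived from the reported PPV and false-positive count. All variable names are made up for this example.

```python
def ppv(true_pos: float, false_pos: float) -> float:
    """Positive predictive value: fraction of predicted SNPs that are real."""
    return true_pos / (true_pos + false_pos)


# Figures reported in the abstract for the test set.
CANDIDATES = 18_390              # candidate SNPs in the test data
POLYBAYES_FP = 16_955            # PolyBayes false positives
ML_FP = 249                      # ML classifier false positives
ML_PPV_REPORTED = 0.848          # reported ML PPV
POLYBAYES_PPV_STRICT = 0.167     # reported PPV at 100% posterior probability

# Assumption: every candidate is a PolyBayes positive prediction,
# so the true positives are whatever is left after removing false positives.
polybayes_tp = CANDIDATES - POLYBAYES_FP          # 1435
polybayes_ppv = ppv(polybayes_tp, POLYBAYES_FP)

# Implied (not reported) ML true positives, solving PPV = TP / (TP + FP) for TP.
ml_tp_implied = ML_PPV_REPORTED * ML_FP / (1 - ML_PPV_REPORTED)

# Fold improvement of the ML PPV over the two PolyBayes baselines.
fold_vs_all = ML_PPV_REPORTED / polybayes_ppv
fold_vs_strict = ML_PPV_REPORTED / POLYBAYES_PPV_STRICT

print(f"PolyBayes PPV (all predictions): {polybayes_ppv:.1%}")
print(f"Implied ML true positives:       {ml_tp_implied:.0f}")
print(f"Fold increase vs all / strict:   {fold_vs_all:.1f} / {fold_vs_strict:.1f}")
```

    Under these assumptions the computed PolyBayes PPV matches the reported 7.8%, the implied ML true-positive count is close to the PolyBayes count (consistent with "a similar number of true positives"), and the two ratios bracket the stated 5- to 10-fold increase.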

    Phytophthora Root Rot Resistance in Soybean E00003

    Phytophthora root rot (PRR) is a devastating disease in soybean [Glycine max (L.) Merr.] production. Michigan elite soybean E00003 is resistant to Phytophthora sojae and has been used as a resistance source in breeding. Genetic control of PRR resistance in this source is unknown. To facilitate marker-assisted selection (MAS), the PRR resistance loci in E00003 and their map locations need to be determined. In this study, a genetic mapping approach was used to identify major PRR-resistant loci in E00003. The mapping population consists of 240 F4-derived lines developed by crossing E00003 with the P. sojae susceptible line PI 567543C. In 2009 and 2010, the mapping population was evaluated in the greenhouse for PRR resistance against P. sojae races 1, 4, and 7, using a modified rice (Oryza sativa L.) grain inoculation method. The population was genotyped with seven simple sequence repeat (SSR) and three single nucleotide polymorphism (SNP) markers derived from bulk segregant analysis. The heritability of resistance in the population ranged from 83 to 94%. A major locus, contributing 50 to 76% of the phenotypic variation, was mapped within a 3 cM interval in the Rps1 region. The interval was further saturated with more BARCSOY SSRs and SNPs with TaqMan assays. Two SSRs and three SNPs within the Rps1k gene were highly associated with PRR resistance in the mapping population. The major resistance gene in E00003 is either allelic or tightly linked to Rps1k. The molecular markers located in the Rps1k gene can be used to improve MAS for PRR resistance.

    QTL for seed protein and amino acids in the Benning × Danbaekkong soybean population

    Soybean, rather than nitrogen-containing forages, is the primary source of quality protein in feed formulations for domestic swine, poultry, and dairy industries. As a sole dietary source of protein, soybean is deficient in the amino acids lysine (Lys), threonine (Thr), methionine (Met), and cysteine (Cys). Increasing these amino acids would benefit the feed industry. The objective of the present study was to identify quantitative trait loci (QTL) associated with crude protein (cp) and amino acids in the ‘Benning’ × ‘Danbaekkong’ population. The population was grown in five southern USA environments. Amino acid concentrations as a fraction of cp (Lys/cp, Thr/cp, Met/cp, Cys/cp, and Met + Cys/cp) were determined by near-infrared reflectance spectroscopy. Four QTL associated with the variation in crude protein were detected on chromosomes (Chr) 14, 15, 17, and 20, of which the QTL on Chr 20 explained 55% of the phenotypic variation. In the same chromosomal region, QTL for Lys/cp, Thr/cp, Met/cp, Cys/cp and Met + Cys/cp were detected. At these QTL, the Danbaekkong allele resulted in reduced levels of these amino acids and increased protein concentration. Two additional QTL for Lys/cp were detected on Chr 08 and 20, and three QTL for Thr/cp on Chr 01, 09, and 17. Three QTL were identified on Chr 06, 09 and 10 for Met/cp, and one QTL was found for Cys/cp on Chr 10. The study provides information concerning the relationship between crude protein and levels of essential amino acids and may allow for the improvement of these traits in soybean using marker-assisted selection.

    Quantum sticking, scattering and transmission of 4He atoms from superfluid 4He surfaces

    We develop a microscopic theory of the scattering, transmission, and sticking of 4He atoms impinging on a superfluid 4He slab at near normal incidence, and of inelastic neutron scattering from the slab. The theory includes coupling between different modes and allows for inelastic processes. We find a number of essential aspects that must be observed in a physically meaningful and reliable theory of atom transmission and scattering; all are connected with multiparticle scattering, particularly the possibility of energy loss. These processes are (a) the coupling to low-lying (surface) excitations (ripplons/third sound), which is manifested in a finite imaginary part of the self energy, and (b) the reduction of the strength of the excitation in the maxon/roton region.

    Nonlinear localized waves in a periodic medium

    We analyze the existence and stability of nonlinear localized waves in a periodic medium described by the Kronig-Penney model with a nonlinear defect. We demonstrate the existence of a novel type of stable nonlinear band-gap localized states, and also reveal an important physical mechanism of the oscillatory wave instabilities associated with the band-gap resonances. Comment: 4 pages, 5 figures.

    Genomic strategies for soybean oil improvement and biodiesel production

    Track II: Transportation and Biofuels. Includes audio file (21 min.). Soybean oil, a promising renewable energy resource, makes up 73% of biodiesel and also has other industrial applications. Missouri is the fifth largest state in the US for soybean planting. With a target of producing 225 million gallons of biodiesel by 2015, up from the 75 million gallons produced in 2005, efforts should focus not only on expanding the number of oil crops to meet demand but also on increasing the amount of oil per hectare for each crop. Considering the ever increasing need for biodiesel and the potential for Missouri to play a major role in national and international demand, we at the National Center for Soybean Biotechnology focus on discovering the genetic factors responsible for oil content in soybean using genetic and genomic strategies. The long term goal is to apply discoveries in breeding programs and biotechnology to develop improved soybean cultivars with increased oil content that will make this crop more competitive in end-uses. Our multidisciplinary approaches include traditional Quantitative Trait Loci (QTL) mapping, association mapping, bioinformatics and transgenics, developing new resources and using already available resources such as mapping populations, diverse germplasm collections, genome sequence information and transgenes. In addition to total oil content, we are focusing on improving quality traits such as oleic acid, which has direct human health benefits and application in biodiesel production. With the use of advanced genomic technologies, genetic materials, and synergistic efforts involving intra- and inter-institutional collaborations, we believe that our current and future research will contribute substantially to biodiesel production. Increased production using high oil soybean cultivars will not only increase the economic gains to farmers/growers but also help the US emerge as the global leader in biodiesel production.

    Scattering of He-3 Atoms from He-4 Surfaces

    We develop a first principles, microscopic theory of impurity atom scattering from inhomogeneous quantum liquids such as adsorbed films, slabs, or clusters of He-4. The theory is built upon a quantitative, microscopic description of the ground state of both the host liquid and the impurity atom. Dynamic effects are treated by allowing all ground-state correlation functions to be time-dependent. Our description includes both the elastic and inelastic coupling of impurity motion to the excitations of the host liquid. As a specific example, we study the scattering of He-3 atoms from adsorbed He-4 films. We examine the dependence of "quantum reflection" on the substrate, and the consequences of impurity bound states, resonances, and background excitations for scattering properties. A thorough analysis of the theoretical approach and the physical circumstances points towards the essential role played by inelastic processes, which determine almost exclusively the reflection probabilities. The coupling to impurity resonances within the film leads to a visible dependence of the reflection coefficient on the direction of the impinging particle. Comment: 36 pages, 16 figures.