176 research outputs found

    Application of machine learning in SNP discovery

    Background: Single nucleotide polymorphisms (SNP) constitute more than 90% of the genetic variation, and hence can account for most trait differences among individuals in a given species. Polymorphism detection software PolyBayes and PolyPhred give high false positive SNP predictions even with stringent parameter values. We developed a machine learning (ML) method to augment PolyBayes to improve its prediction accuracy. ML methods have also been successfully applied to other bioinformatics problems in predicting genes, promoters, transcription factor binding sites and protein structures. Results: The ML program C4.5 was applied to a set of features in order to build a SNP classifier from training data based on human expert decisions (True/False). The training data were 27,275 candidate SNP generated by sequencing 1973 STS (sequence tag sites) (12 Mb) in both directions from 6 diverse homozygous soybean cultivars and PolyBayes analysis. Test data of 18,390 candidate SNP were generated similarly from 1359 additional STS (8 Mb). SNP from both sets were classified by experts. After training the ML classifier, it agreed with the experts on 97.3% of test data compared with 7.8% agreement between PolyBayes and experts. The PolyBayes positive predictive values (PPV) (i.e., fraction of candidate SNP being real) were 7.8% for all predictions and 16.7% for those with 100% posterior probability of being real. Using ML improved the PPV to 84.8%, a 5- to 10-fold increase. While both ML and PolyBayes produced a similar number of true positives, the ML program generated only 249 false positives as compared to 16,955 for PolyBayes. The complexity of the soybean genome may have contributed to high false SNP predictions by PolyBayes and hence results may differ for other genomes. Conclusion: A machine learning (ML) method was developed as a supplementary feature to the polymorphism detection software for improving prediction accuracies. The results from this study indicate that a trained ML classifier can significantly reduce human intervention and in this case achieved a 5–10 fold enhanced productivity. The optimized feature set and ML framework can also be applied to all polymorphism discovery software. ML support software is written in Perl and can be easily integrated into an existing SNP discovery pipeline.
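
    The PPV figures quoted in this abstract can be checked with a little arithmetic. The sketch below is illustrative only and is not the authors' Perl code. It assumes that PPV = TP / (TP + FP) and that every one of the 18,390 test candidates counts as a positive PolyBayes prediction, so the PolyBayes true-positive count is taken as candidates minus false positives. The abstract does not state the ML true-positive count, so it is back-calculated from the reported PPV and should be read as an estimate. All function and variable names are hypothetical.

```python
def ppv(true_pos: float, false_pos: float) -> float:
    """Positive predictive value: fraction of positive calls that are real."""
    return true_pos / (true_pos + false_pos)


def implied_true_pos(ppv_value: float, false_pos: float) -> float:
    """Solve PPV = TP / (TP + FP) for TP, given PPV and FP."""
    return ppv_value * false_pos / (1.0 - ppv_value)


# Figures quoted in the abstract (test set).
CANDIDATES = 18_390      # candidate SNP in the test data
POLYBAYES_FP = 16_955    # PolyBayes false positives
ML_FP = 249              # ML classifier false positives
ML_PPV = 0.848           # reported ML PPV
POLYBAYES_PPV_100 = 0.167  # reported PPV at 100% posterior probability

# Assumption: all test candidates are positive PolyBayes predictions.
polybayes_tp = CANDIDATES - POLYBAYES_FP
polybayes_ppv = ppv(polybayes_tp, POLYBAYES_FP)

# Estimate only: the ML true-positive count is not given in the abstract.
ml_tp_estimate = implied_true_pos(ML_PPV, ML_FP)

# Fold improvement relative to both PolyBayes baselines.
fold_vs_all = ML_PPV / polybayes_ppv
fold_vs_100 = ML_PPV / POLYBAYES_PPV_100

print(f"PolyBayes TP: {polybayes_tp}, PPV: {polybayes_ppv:.1%}")
print(f"Estimated ML TP: {ml_tp_estimate:.0f}")
print(f"Fold increase: {fold_vs_100:.1f}x to {fold_vs_all:.1f}x")
```

    Under these assumptions the derived PolyBayes PPV matches the reported 7.8%, the estimated ML true-positive count is close to the PolyBayes count (consistent with "a similar number of true positives"), and the two ratios bracket the stated 5- to 10-fold increase.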

    SNP-PHAGE – High throughput SNP discovery pipeline

    BACKGROUND: Single nucleotide polymorphisms (SNPs) as defined here are single base sequence changes or short insertion/deletions between or within individuals of a given species. As a result of their abundance and the availability of high throughput analysis technologies, SNP markers have begun to replace other traditional markers such as restriction fragment length polymorphisms (RFLPs), amplified fragment length polymorphisms (AFLPs) and simple sequence repeat (SSR or microsatellite) markers for fine mapping and association studies in several species. For SNP discovery from chromatogram data, several bioinformatics programs have to be combined to generate an analysis pipeline. Results have to be stored in a relational database to facilitate interrogation through queries or to generate data for further analyses such as determination of linkage disequilibrium and identification of common haplotypes. Although these tasks are routinely performed by several groups, an integrated open source SNP discovery pipeline that can be easily adapted by new groups interested in SNP marker development is currently unavailable. RESULTS: We developed SNP-PHAGE (SNP discovery Pipeline with additional features for identification of common haplotypes within a sequence tagged site (Haplotype Analysis) and GenBank (-dbSNP) submissions). This tool was applied for analyzing sequence traces from diverse soybean genotypes to discover over 10,000 SNPs. This package was developed on a UNIX/Linux platform, is written in Perl, and uses a MySQL database. Scripts to generate a user-friendly web interface are also provided with common queries for preliminary data analysis. A machine learning tool developed by this group for increasing the efficiency of SNP discovery is integrated as a part of this package as an optional feature. The SNP-PHAGE package is being made available open source at . CONCLUSION: SNP-PHAGE provides a bioinformatics solution for high throughput SNP discovery, identification of common haplotypes within an amplicon, and GenBank (dbSNP) submissions. SNP selection and visualization are aided through a user-friendly web interface. This tool is useful for analyzing sequence tagged sites (STSs) of genomic sequences, and this software can serve as a starting point for groups interested in developing SNP markers.

    Thermal Cycle Testing of the Powersphere Engineering Development Unit

    During the past three years the team of The Aerospace Corporation, Lockheed Martin Space Systems, NASA Glenn Research Center, and ILC Dover LP have been developing a multifunctional inflatable structure for the PowerSphere concept under contract with NASA (NAS3-01115). The PowerSphere attitude insensitive solar power-generating microsatellite, which could be used for many different space and Earth science purposes, is ready for further refinement and flight demonstration. The development of micro- and nanosatellites requires the energy collection system, namely the solar array, to be lightweight and small. The limited surface area of these satellites precludes the possibility of body mounting the solar array system for required power generation. The use of large traditional solar arrays requires the support of large satellite volumes and weight and also requires a pointing apparatus. The current PowerSphere concept (geodetic sphere), which was envisioned in the late 1990s by Mr. Simburger of The Aerospace Corporation, has been systematically developed in the past several years [1-7]. The PowerSphere system is a low mass and low volume system suited for micro- and nanosatellites. It is a lightweight solar array that is spherical in shape and does not require a pointing apparatus. The recently completed project culminated during the third year with the manufacturing of the PowerSphere Engineering Development Unit (EDU). One hemisphere of the EDU system was tested for packing and deployment and was subsequently rigidized. The other hemisphere was packed and stored for future testing in an uncured state. Both cured and uncured hemisphere components were delivered to NASA Glenn Research Center for thermal cycle testing and long-term storage respectively. This paper will discuss the design, thermal cycle testing of the PowerSphere EDU

    The NEMP family supports metazoan fertility and nuclear envelope stiffness.

    Human genome-wide association studies have linked single-nucleotide polymorphisms (SNPs) in NEMP1 (nuclear envelope membrane protein 1) with early menopause; however, it is unclear whether NEMP1 has any role in fertility. We show that whole-animal loss of NEMP1 homologs in Drosophila, Caenorhabditis elegans, zebrafish, and mice leads to sterility or early loss of fertility. Loss of Nemp leads to nuclear shaping defects, most prominently in the germ line. Biochemical, biophysical, and genetic studies reveal that NEMP proteins support the mechanical stiffness of the germline nuclear envelope via formation of a NEMP-EMERIN complex. These data indicate that the germline nuclear envelope has specialized mechanical properties and that NEMP proteins play essential and conserved roles in fertility

    Transcription Factor SP4 Is a Susceptibility Gene for Bipolar Disorder

    The Sp4 transcription factor plays a critical role in both the development and function of the mouse hippocampus. Reduced expression of the mouse Sp4 gene results in a variety of behavioral abnormalities relevant to human psychiatric disorders. The human SP4 gene is therefore examined for its association with both bipolar disorder and schizophrenia in European Caucasian and Chinese populations respectively. Out of ten SNPs selected from the human SP4 genomic locus, four displayed significant association with bipolar disorder in European Caucasian families (rs12668354, p = 0.022; rs12673091, p = 0.0005; rs3735440, p = 0.019; rs11974306, p = 0.018). To replicate the genetic association, the same set of SNPs was examined in a Chinese bipolar case control sample. Four SNPs displayed significant association (rs40245, p = 0.009; rs12673091, p = 0.002; rs1018954, p = 0.001; rs3735440, p = 0.029), and two of them (rs12673091, rs3735440) were shared with positive SNPs from European Caucasian families. Considering the genetic overlap between bipolar disorder and schizophrenia, we extended our studies to Chinese trio families for schizophrenia. SNP7 (rs12673091, p = 0.012) also displayed a significant association. SNP7 (rs12673091) was therefore significantly associated in all three samples, and shared the same susceptibility allele (A) across all three samples. On the other hand, we found a gene dosage effect for the mouse Sp4 gene in the modulation of sensorimotor gating, a putative endophenotype for both schizophrenia and bipolar disorder. The deficient sensorimotor gating in Sp4 hypomorphic mice was partially reversed by the administration of dopamine D2 antagonist or mood stabilizers. Both human genetic and mouse pharmacogenetic studies support the Sp4 gene as a susceptibility gene for bipolar disorder or schizophrenia. The studies on the role of the Sp4 gene in hippocampal development may provide novel insights into the contribution of hippocampal abnormalities to these psychiatric disorders

    Crop Updates 2006 - Cereals

    This session covers twenty-nine papers from different authors:
    PLENARY
    1. The 2005 wheat streak mosaic virus epidemic in New South Wales and the threat posed to the Western Australian wheat industry, Roger Jones and Nichole Burges, Department of Agriculture
    SOUTH COAST AGRONOMY
    2. South coast wheat variety trial results and best options for 2006, Mohammad Amjad, Ben Curtis and Wal Anderson, Department of Agriculture
    3. Dual purpose winter wheats to improve productivity, Mohammad Amjad and Ben Curtis, Department of Agriculture
    4. South coast large-scale premium wheat variety trials, Mohammad Amjad and Ben Curtis, Department of Agriculture
    5. Optimal input packages for noodle wheat in Dalwallinu – Liebe practice for profit trial, Darren Chitty, Agritech Crop Research and Brianna Peake, Liebe Group
    6. In-crop risk management using Yield Prophet®, Harm van Rees1, Cherie Reilly1, James Hunt1, Dean Holzworth2, Zvi Hochman2; 1Birchip Cropping Group, Victoria; 2CSIRO, Toowoomba, Qld
    7. Yield Prophet® 2005 – On-line yield forecasting, James Hunt1, Harm van Rees1, Zvi Hochman2, Allan Peake2, Neal Dalgliesh2, Dean Holzworth2, Stephen van Rees1, Trudy McCann1 and Peter Carberry2; 1Birchip Cropping Group, Victoria; 2CSIRO, Toowoomba, Qld
    8. Performance of oaten hay varieties in Western Australian environments, Raj Malik and Kellie Winfield, Department of Agriculture
    9. Performance of dwarf potential milling varieties in Western Australian environments, Kellie Winfield and Raj Malik, Department of Agriculture
    10. Agronomic responses of new wheat varieties in the Southern agricultural region of WA, Brenda Shackley and Judith Devenish, Department of Agriculture
    11. Responses of new wheat varieties to management factors in the central agricultural region of Western Australia, Darshan Sharma, Steve Penny and Wal Anderson, Department of Agriculture
    12. Sowing time on wheat yield, quality and $ - Northern agricultural region, Christine Zaicou-Kunesch, Department of Agriculture
    NUTRITION
    13. The most effective method of applying phosphorus, copper and zinc to no-till crops, Mike Bolland and Ross Brennan, Department of Agriculture
    14. Uptake of K from the soil profile by wheat, Paul Damon and Zed Rengel, Faculty of Natural and Agricultural Sciences, University of Western Australia
    15. Reducing nitrogen fertiliser risks, Jeremy Lemon, Department of Agriculture
    16. Yield Prophet® and canopy management, Harm van Rees1, Zvi Hochman2, Perry Poulton2, Nick Poole3, Brooke Thompson4, James Hunt1; 1Birchip Cropping Group, Victoria; 2CSIRO, Toowoomba, Qld; 3Foundation for Arable Research, New Zealand; 4Cropfacts, Victoria
    17. Producing profits with phosphorus, Stephen Loss, CSBP Ltd, WA
    18. Potassium response in cereal cropping within the medium rainfall central wheatbelt, Jeff Russell1, Angie Roe2 and James Eyres2, Department of Agriculture1, Farm Focus Consultants, Northam2
    19. Matching nitrogen supply to wheat demand in the high rainfall cropping zone, Narelle Simpson, Ron McTaggart, Wal Anderson, Lionel Martin and Dave Allen, Department of Agriculture
    DISEASES
    20. Comparative study of commercial wheat cultivars and differential lines (with known Pm resistance genes) to powdery mildew response, Hossein Golzar, Manisha Shankar and Robert Loughman, Department of Agriculture
    21. On farm research to investigate fungicide applications to minimise leaf disease impacts in wheat – part II, Jeff Russell1, Angie Roe2 and James Eyres2, Department of Agriculture1, and Farm Focus Consultants, Northam2
    22. Disease resistance update for wheat varieties in WA, Manisha Shankar, John Majewski, Donna Foster, Hossein Golzar, Jamie Piotrowski, Nicole Harry and Rob Loughman, Department of Agriculture
    23. Effect of time of stripe rust inoculum arrival on variety response in wheat, Manisha Shankar, John Majewski and Rob Loughman, Department of Agriculture
    24. Fungicide seed dressing management of loose smut in Baudin barley, Geoff Thomas and Kith Jayasena, Department of Agriculture
    PESTS
    25. How to avoid insect contamination in cereal grain at harvest, Svetlana Micic, Paul Matson and Tony Dore, Department of Agriculture
    ABIOTIC
    26. Environment – is it as important as variety in sprouting tolerance? Thomas (Ben) Biddulph1, Dr Daryl Mares1, Dr Julie Plummer1 and Dr Tim Setter2, School of Plant Biology, University of Western Australia1 and Department of Agriculture2
    27. Frost or fiction, Garren Knell, Steve Curtin and Wade Longmuir, ConsultAg Pty Ltd, WA
    28. High moisture wheat harvesting in Esperance 2005, Nigel Metz, South East Premium Wheat Growers Association (SEPWA) Projects Coordinator, Esperance, WA
    SOILS
    28. Hardpan penetration ability of wheat roots, Tina Botwright Acuña and Len Wade, School of Plant Biology, University of Western Australia
    MARKETS
    29. Crop shaping to meet predicted market demands for wheat in the 21st Century, Cindy Mills and Peter Stone, Australian Wheat Board, Melbourn

    KELT-24b: A 5M_J Planet on a 5.6 day Well-Aligned Orbit around the Young V=8.3 F-star HD 93148

    We present the discovery of KELT-24 b, a massive hot Jupiter orbiting a bright ($V = 8.3$ mag, $K = 7.2$ mag) young F-star with a period of 5.6 days. The host star, KELT-24 (HD 93148), has a $T_{\rm eff} = 6508 \pm 49$ K, a mass of $M_* = 1.461^{+0.056}_{-0.060}\,M_\odot$, radius of $R_* = 1.506 \pm 0.022\,R_\odot$, and an age of $0.77^{+0.61}_{-0.42}$ Gyr. Its planetary companion (KELT-24 b) has a radius of $R_P = 1.272^{+0.021}_{-0.022}\,R_J$, a mass of $M_P = 5.18^{+0.21}_{-0.22}\,M_J$, and from Doppler tomographic observations, we find that the planet's orbit is well-aligned to its host star's projected spin axis ($\lambda = 2.6^{+5.1}_{-3.6}$). The young age estimated for KELT-24 suggests that it only recently started to evolve from the zero-age main sequence. KELT-24 is the brightest star known to host a transiting giant planet with a period between 5 and 10 days. Although the circularization timescale is much longer than the age of the system, we do not detect a large eccentricity or significant misalignment that is expected from dynamical migration. The brightness of its host star and its moderate surface gravity make KELT-24 b an intriguing target for detailed atmospheric characterization through spectroscopic emission measurements since it would bridge the current literature results that have primarily focused on lower mass hot Jupiters and a few brown dwarfs