
    Development of Novel Calibrations for FT-NIR Analysis of Protein, Oil, Carbohydrates and Isoflavones in Foods

    The development of calibration methodology for novel FT-NIRS analysis of soybean-based foods is presented, together with high-precision NIRS spectra and composition measurements of proteins, oil and carbohydrates in these foods.

    Development and Validation of Clinical Whole-Exome and Whole-Genome Sequencing for Detection of Germline Variants in Inherited Disease

    Context: With the decrease in the cost of sequencing, the clinical testing paradigm has shifted from single gene to gene panel and now whole-exome and whole-genome sequencing. Clinical laboratories are rapidly implementing next-generation sequencing-based whole-exome and whole-genome sequencing. Because a large number of targets are covered by whole-exome and whole-genome sequencing, it is critical that a laboratory perform appropriate validation studies, develop a quality assurance and quality control program, and participate in proficiency testing. Objective: To provide recommendations for whole-exome and whole-genome sequencing assay design, validation, and implementation for the detection of germline variants associated with inherited disorders. Data Sources: An example of trio sequencing, filtration and annotation of variants, and phenotypic consideration to arrive at clinical diagnosis is discussed. Conclusions: It is critical that clinical laboratories planning to implement whole-exome and whole-genome sequencing design and validate the assay to specifications and ensure adequate performance prior to implementation. Test design specifications, including variant filtering and annotation, phenotypic consideration, guidance on consenting options, and reporting of incidental findings, are provided. These are important steps a laboratory must take to validate and implement whole-exome and whole-genome sequencing in a clinical setting for germline variants in inherited disorders.

    Feature importance for machine learning redshifts applied to SDSS galaxies

    We present an analysis of importance feature selection applied to photometric redshift estimation using the machine learning architecture Decision Trees with the ensemble learning routine Adaboost (hereafter RDF). We select a list of 85 easily measured (or derived) photometric quantities (or 'features') and spectroscopic redshifts for almost two million galaxies from the Sloan Digital Sky Survey Data Release 10. After identifying which features have the most predictive power, we use standard artificial Neural Networks (aNN) to show that the addition of these features, in combination with the standard magnitudes and colours, improves the machine learning redshift estimate by 18% and decreases the catastrophic outlier rate by 32%. We further compare the redshift estimate using RDF with those from two different aNNs, and with photometric redshifts available from the SDSS. We find that the RDF requires orders of magnitude less computation time than the aNNs to obtain a machine learning redshift while reducing both the catastrophic outlier rate by up to 43% and the redshift error by up to 25%. When compared to the SDSS photometric redshifts, the RDF machine learning redshifts both decrease the standard deviation of residuals scaled by 1/(1+z) by 36%, from 0.066 to 0.041, and decrease the fraction of catastrophic outliers by 57%, from 2.32% to 0.99%. Comment: 10 pages, 4 figures, updated to match version accepted in MNRA
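
The two quality measures quoted in this abstract, the standard deviation of residuals scaled by 1/(1+z) and the catastrophic outlier fraction, can be illustrated with a minimal sketch. The function names are hypothetical, and the outlier cut of 0.15 is an assumption made for illustration; the abstract does not state the threshold the authors used.

```python
import statistics


def scaled_residuals(z_spec, z_phot):
    """Residuals (z_phot - z_spec) scaled by 1 / (1 + z_spec)."""
    return [(zp - zs) / (1.0 + zs) for zs, zp in zip(z_spec, z_phot)]


def residual_std(z_spec, z_phot):
    """Population standard deviation of the scaled residuals."""
    return statistics.pstdev(scaled_residuals(z_spec, z_phot))


def outlier_fraction(z_spec, z_phot, threshold=0.15):
    """Fraction of galaxies whose |scaled residual| exceeds `threshold`.

    The default threshold is an assumed value, not taken from the abstract.
    """
    residuals = scaled_residuals(z_spec, z_phot)
    return sum(abs(r) > threshold for r in residuals) / len(residuals)
```

Calling `outlier_fraction(z_spec, z_phot)` on matched lists of spectroscopic and estimated redshifts gives a number comparable in kind, though not necessarily in definition, to the percentages reported above.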

    ConSole: using modularity of contact maps to locate solenoid domains in protein structures.

    Background: Periodic proteins, characterized by the presence of multiple repeats of short motifs, form an interesting and seldom-studied group. Due to often extreme divergence in sequence, detection and analysis of such motifs is performed more reliably on the structural level. Yet, few algorithms have been developed for the detection and analysis of structures of periodic proteins. Results: ConSole recognizes modularity in protein contact maps, allowing for precise identification of repeats in solenoid protein structures, an important subgroup of periodic proteins. Tests on benchmarks show that ConSole has higher recognition accuracy as compared to Raphael, the only other publicly available solenoid structure detection tool. As a next step of ConSole analysis, we show how detection of solenoid repeats in structures can be used to improve sequence recognition of these motifs and to detect subtle irregularities of repeat lengths in three solenoid protein families. Conclusions: The ConSole algorithm provides a fast and accurate tool to recognize solenoid protein structures as a whole and to identify individual solenoid repeat units from a structure. ConSole is available as a web-based, interactive server and is available for download at http://console.sanfordburnham.org
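
For readers unfamiliar with the input ConSole works on, the following is a minimal sketch of how a residue contact map is commonly built from 3D coordinates. It is not ConSole's algorithm: the function name, the use of one representative atom per residue, and the 8 Å distance cutoff are all assumptions made for illustration.

```python
import math


def contact_map(coords, cutoff=8.0):
    """Binary contact map from per-residue 3D coordinates.

    coords: list of (x, y, z) tuples, one per residue (for example the
            C-alpha atom, an assumed choice).
    cutoff: distance in angstroms below which two residues count as being
            in contact (assumed value).
    Entry [i][j] is 1 when residues i and j are distinct and within the
    cutoff, otherwise 0.
    """
    n = len(coords)
    return [
        [
            1 if i != j and math.dist(coords[i], coords[j]) <= cutoff else 0
            for j in range(n)
        ]
        for i in range(n)
    ]
```

In such a map, repeated structural units tend to show up as recurring patterns parallel to the diagonal, which is the kind of modularity the abstract refers to.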

    A critical evaluation of network and pathway based classifiers for outcome prediction in breast cancer

    Recently, several classifiers that combine primary tumor data, like gene expression data, and secondary data sources, such as protein-protein interaction networks, have been proposed for predicting outcome in breast cancer. In these approaches, new composite features are typically constructed by aggregating the expression levels of several genes. The secondary data sources are employed to guide this aggregation. Although many studies claim that these approaches improve classification performance over single gene classifiers, the gain in performance is difficult to assess. This stems mainly from the fact that different breast cancer data sets and validation procedures are employed to assess the performance. Here we address these issues by employing a large cohort of six breast cancer data sets as benchmark set and by performing an unbiased evaluation of the classification accuracies of the different approaches. Contrary to previous claims, we find that composite feature classifiers do not outperform simple single gene classifiers. We investigate the effect of (1) the number of selected features; (2) the specific gene set from which features are selected; (3) the size of the training set and (4) the heterogeneity of the data set on the performance of composite feature and single gene classifiers. Strikingly, we find that randomization of secondary data sources, which destroys all biological information in these sources, does not result in a deterioration in performance of composite feature classifiers. Finally, we show that when a proper correction for gene set size is performed, the stability of single gene sets is similar to the stability of composite feature sets. Based on these results there is currently no reason to prefer prognostic classifiers based on composite features over single gene classifiers for predicting outcome in breast cancer.
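
The idea of a composite feature, and of randomizing the secondary data that defines it, can be sketched as follows. This is an illustrative assumption rather than the procedure used in the paper: the abstract does not say which aggregation function or randomization scheme was applied, so the sketch uses a plain mean over each gene set and replaces each set with randomly drawn genes of the same size. All names are hypothetical.

```python
import random


def composite_features(expression, gene_sets):
    """Aggregate per-gene expression into one value per gene set.

    expression: dict mapping gene name -> expression level for one sample.
    gene_sets:  dict mapping set name -> list of gene names (for example a
                pathway or a network module).
    Aggregation here is the mean, an assumed choice.
    """
    return {
        name: sum(expression[g] for g in genes) / len(genes)
        for name, genes in gene_sets.items()
    }


def randomize_gene_sets(gene_sets, all_genes, seed=0):
    """Replace each gene set with random genes, preserving set sizes.

    This removes the biological grouping while keeping the number of genes
    aggregated per feature unchanged (an assumed randomization scheme).
    """
    rng = random.Random(seed)
    return {
        name: rng.sample(all_genes, len(genes))
        for name, genes in gene_sets.items()
    }
```

Training the same classifier on features from `gene_sets` and from `randomize_gene_sets(...)` is one way to test whether the biological grouping contributes anything, which is the comparison the abstract describes.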

    A system performance throughput model applicable to advanced manned telescience systems

    As automated space systems become more complex, autonomous, and opaque to the flight crew, it becomes increasingly difficult to determine whether the total system is performing as it should. Some of the complex, interrelated human performance measurement issues related to total system validation are addressed. An evaluative throughput model is presented which can be used to generate a human operator-related benchmark or figure of merit for a given system which involves humans at the input and output ends as well as other automated intelligent agents. The concept of sustained and accurate command/control data information transfer is introduced. The first two input parameters of the model involve nominal and off-nominal predicted events. The first of these calls for a detailed task analysis, while the second calls for a contingency event assessment. The last two required input parameters involve actual (measured) events, namely human performance and continuous semi-automated system performance. An expression combining these four parameters was found using digital simulations and identical, representative, random data to yield the smallest variance.