73 research outputs found

    A Project Based Approach to Statistics and Data Science

    In an increasingly data-driven world, facility with statistics is more important than ever for our students. At institutions without a statistician, it often falls to the mathematics faculty to teach statistics courses. This paper presents a model that a mathematician asked to teach statistics can follow. This model entails connecting with faculty from numerous departments on campus to develop a list of topics, building a repository of real-world datasets from these faculty, and creating projects where students interface with these datasets to write lab reports aimed at consumers of statistics in other disciplines. The end result is students who are well prepared for interdisciplinary research, who are accustomed to coping with the idiosyncrasies of real data, and who have sharpened their technical writing and speaking skills

    Wavelet Neural Networks: A Practical Guide

    Wavelet networks (WNs) are a new class of networks which have been used with great success in a wide range of applications. However, a generally accepted framework for applying WNs is missing from the literature. In this study, we present a complete statistical model identification framework in order to apply WNs in various applications. The following subjects were thoroughly examined: the structure of a WN, training methods, initialization algorithms, variable significance and variable selection algorithms, model selection methods, and finally methods to construct confidence and prediction intervals. In addition, the complexity of each algorithm is discussed. Our proposed framework was tested in two simulated cases, in one chaotic time series described by the Mackey-Glass equation, and in three real datasets described by daily temperatures in Berlin, daily wind speeds in New York, and breast cancer classification. Our results have shown that the proposed algorithms produce stable and robust results, indicating that our proposed framework can be applied in various applications

    Feature signature prediction of a boring process using neural network modeling with confidence bounds

    Prediction of machine tool failure has been very important in modern metal cutting operations in order to meet the growing demand for product quality and cost reduction. This paper presents the study of building a neural network model for predicting the behavior of a boring process during its full life cycle. This prediction is achieved by the fusion of the predictions of three principal components extracted as features from the joint time–frequency distributions of energy of the spindle loads observed during the boring process. Furthermore, prediction uncertainty is assessed using nonlinear regression in order to quantify the errors associated with the prediction. The results show that the implemented Elman recurrent neural network is a viable method for the prediction of the feature behavior of the boring process, and that the constructed confidence bounds provide information crucial for subsequent maintenance decision making based on the predicted cutting tool degradation. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/45845/1/170_2005_Article_114.pd

    Curriculum Guidelines for Undergraduate Programs in Data Science

    The Park City Math Institute 2016 Summer Undergraduate Faculty Program met for the purpose of composing guidelines for undergraduate programs in data science. The group consisted of 25 undergraduate faculty from a variety of institutions in the United States, primarily from the disciplines of mathematics, statistics, and computer science. These guidelines are meant to provide some structure for institutions planning for or revising a major in data science

    A High Throughput Genetic Screen Identifies New Early Meiotic Recombination Functions in Arabidopsis thaliana

    Meiotic recombination is initiated by the formation of numerous DNA double-strand breaks (DSBs) catalysed by the widely conserved Spo11 protein. In Saccharomyces cerevisiae, Spo11 requires nine other proteins for meiotic DSB formation; however, unlike Spo11, few of these are conserved across kingdoms. In order to investigate this recombination step in higher eukaryotes, we took advantage of a high-throughput meiotic mutant screen carried out in the model plant Arabidopsis thaliana. A collection of 55,000 mutant lines was screened, and spo11-like mutations, characterised by a drastic decrease in chiasma formation at metaphase I associated with an absence of synapsis at prophase, were selected. This screen led to the identification of two populations of mutants classified according to their recombination defects: mutants that repair meiotic DSBs using the sister chromatid such as Atdmc1 or mutants that are unable to make DSBs like Atspo11-1. We found that in Arabidopsis thaliana at least four proteins are necessary for driving meiotic DSB repair via the homologous chromosomes. These include the previously characterised DMC1 and the Hop1-related ASY1 proteins, but also the meiotic specific cyclin SDS as well as the Hop2 Arabidopsis homologue AHP2. Analysing the mutants defective in DSB formation, we identified the previously characterised AtSPO11-1, AtSPO11-2, and AtPRD1 as well as two new genes, AtPRD2 and AtPRD3. Our data thus increase the number of proteins necessary for DSB formation in Arabidopsis thaliana to five. Unlike SPO11 and (to a minor extent) PRD1, these two new proteins are poorly conserved among species, suggesting that the DSB formation mechanism, but not its regulation, is conserved among eukaryotes

    A Novel Protein Kinase-Like Domain in a Selenoprotein, Widespread in the Tree of Life

    Selenoproteins serve important functions in many organisms, usually providing essential oxidoreductase enzymatic activity, often for defense against toxic xenobiotic substances. Most eukaryotic genomes possess a small number of these proteins, usually not more than 20. Selenoproteins belong to various structural classes, often related to oxidoreductase function, yet a few of them are completely uncharacterised

    A Guided Tour of Modern Regression Methods

    The statistical practitioner today who wants to find new methods to fit historical data is confronted by an often bewildering morass of acronyms. We will attempt, via a few examples, to shed some light on how techniques such as CART, MARS, GAM, PLS, PCR and ANN work and how they can be used effectively. This paper is based on an invited tutorial on modern regression methods given at the 1995 Fall Technical Conference in St. Louis. KEYWORDS: nonparametric regression; function approximation; neural networks; generalized additive models; tree based regression. 1 Introduction Our aim in this paper is to provide an introduction to several of the more popular regression-based techniques currently used by data analysts. Our intent is to familiarize the reader with each technique, not to provide an in-depth analysis of each. We will illustrate the techniques via examples, referring the reader to the vast bibliography on the subject for more details on the estimation and inference properties of..