    A Data Science Course for Undergraduates: Thinking with Data

    Data science is an emerging interdisciplinary field that combines elements of mathematics, statistics, computer science, and knowledge in a particular application domain for the purpose of extracting meaningful information from the increasingly sophisticated array of data available in many settings. These data tend to be non-traditional, in the sense that they are often live, large, complex, and/or messy. A first course in statistics at the undergraduate level typically introduces students to a variety of techniques for analyzing small, neat, and clean data sets. However, whether they pursue more formal training in statistics or not, many of these students will end up working with data that are considerably more complex, and will need facility with statistical computing techniques. More importantly, these students require a framework for thinking structurally about data. We describe an undergraduate course in a liberal arts environment that provides students with the tools necessary to apply data science. The course emphasizes modern, practical, and useful skills that cover the full data analysis spectrum, from asking an interesting question to acquiring, managing, manipulating, processing, querying, analyzing, and visualizing data, as well as communicating findings in written, graphical, and oral forms.
    Comment: 21 pages total including supplementary material

    Scalable visualisation methods for modern Generalized Additive Models

    In the last two decades the growth of computational resources has made it possible to handle Generalized Additive Models (GAMs) that formerly were too costly for serious applications. However, the growth in model complexity has not been matched by improved visualisations for model development and results presentation. Motivated by an industrial application in electricity load forecasting, we identify the areas where the lack of modern visualisation tools for GAMs is particularly severe, and we address the shortcomings of existing methods by proposing a set of visual tools that a) are fast enough for interactive use, b) exploit the additive structure of GAMs, c) scale to large data sets, and d) can be used in conjunction with a wide range of response distributions. All the new visual methods proposed in this work are implemented by the mgcViz R package, which can be found on the Comprehensive R Archive Network.

    Neutrino Mass from Laboratory: Contribution of Double Beta Decay to the Neutrino Mass Matrix

    Double beta decay is indispensable, together with ν oscillation experiments, for resolving the question of the neutrino mass matrix. The most sensitive experiment, for eight years now the HEIDELBERG-MOSCOW experiment at Gran Sasso, already practically excludes, with its experimental limit of < 0.26 eV, degenerate ν mass scenarios allowing neutrinos as hot dark matter in the universe for the small-angle MSW solution of the solar neutrino problem. It already probes cosmological models including hot dark matter at the level of the future satellite experiments MAP and PLANCK. It further probes many topics of beyond-SM physics at the TeV scale. Future experiments should give access to the multi-TeV range and complement in many ways the search for new physics at future colliders such as LHC and NLC. For neutrino physics, some of them (GENIUS) will make it possible to test almost all neutrino mass scenarios allowed by the present neutrino oscillation experiments.
    Comment: 5 pages, revtex, 6 figures. Talk presented at the International Europhysics Neutrino Oscillation Workshop, Conca Specchiulla (Otranto, Italy), September 9-16, 2000; to be published in Nucl. Phys. B (2001). Home page of the Heidelberg-Moscow Experiment: http://www.mpi-hd.mpg.de/non_acc

    Courseware Reviews

    Selection of Statistical Software for Solving Big Data Problems: A Guide for Businesses, Students, and Universities

    The need for analysts with expertise in big data software is becoming more apparent in today's society. Unfortunately, the demand for these analysts far exceeds the number available. A potential way to combat this shortage is to identify the software taught in colleges or universities. This article examines four data analysis software packages (Excel add-ins, SPSS, SAS, and R) and outlines the cost, training, and statistical methods, tests, and uses for each. It further explains the implications for universities and future students.