677,482 research outputs found

    Seeking Excellence: Improving Objectivity in Player Analysis in Professional Basketball

    Get PDF
    This thesis details the creation and testing of an original statistical metric for analyzing individual basketball players in the National Basketball Association (NBA) by both their commonly measured statistics and their so-called “intangibles.” By using existing methods as both guides and a caution against potential shortcomings, an inclusive statistic with multiple layers of data can be built to best reflect an individual player’s overall value to his team. This metric will be adjusted to account for the differences across multiple eras of NBA play and the levels of talent with which a player played in order to avoid penalizing a player for the unique aspects of his career

    A Taxonomy of Big Data for Optimal Predictive Machine Learning and Data Mining

    Full text link
    Big data comes in various ways, types, shapes, forms and sizes. Indeed, almost all areas of science, technology, medicine, public health, economics, business, linguistics and social science are bombarded by ever increasing flows of data begging to analyzed efficiently and effectively. In this paper, we propose a rough idea of a possible taxonomy of big data, along with some of the most commonly used tools for handling each particular category of bigness. The dimensionality p of the input space and the sample size n are usually the main ingredients in the characterization of data bigness. The specific statistical machine learning technique used to handle a particular big data set will depend on which category it falls in within the bigness taxonomy. Large p small n data sets for instance require a different set of tools from the large n small p variety. Among other tools, we discuss Preprocessing, Standardization, Imputation, Projection, Regularization, Penalization, Compression, Reduction, Selection, Kernelization, Hybridization, Parallelization, Aggregation, Randomization, Replication, Sequentialization. Indeed, it is important to emphasize right away that the so-called no free lunch theorem applies here, in the sense that there is no universally superior method that outperforms all other methods on all categories of bigness. It is also important to stress the fact that simplicity in the sense of Ockham's razor non plurality principle of parsimony tends to reign supreme when it comes to massive data. We conclude with a comparison of the predictive performance of some of the most commonly used methods on a few data sets.Comment: 18 pages, 2 figures 3 table

    Bayesian astrostatistics: a backward look to the future

    Full text link
    This perspective chapter briefly surveys: (1) past growth in the use of Bayesian methods in astrophysics; (2) current misconceptions about both frequentist and Bayesian statistical inference that hinder wider adoption of Bayesian methods by astronomers; and (3) multilevel (hierarchical) Bayesian modeling as a major future direction for research in Bayesian astrostatistics, exemplified in part by presentations at the first ISI invited session on astrostatistics, commemorated in this volume. It closes with an intentionally provocative recommendation for astronomical survey data reporting, motivated by the multilevel Bayesian perspective on modeling cosmic populations: that astronomers cease producing catalogs of estimated fluxes and other source properties from surveys. Instead, summaries of likelihood functions (or marginal likelihood functions) for source properties should be reported (not posterior probability density functions), including nontrivial summaries (not simply upper limits) for candidate objects that do not pass traditional detection thresholds.Comment: 27 pp, 4 figures. A lightly revised version of a chapter in "Astrostatistical Challenges for the New Astronomy" (Joseph M. Hilbe, ed., Springer, New York, forthcoming in 2012), the inaugural volume for the Springer Series in Astrostatistics. Version 2 has minor clarifications and an additional referenc
    • …
    corecore