Seeking Excellence: Improving Objectivity in Player Analysis in Professional Basketball

Author: Cook Nathan
Publication venue: Scholars Crossing
Publication date: 26/11/2018

This thesis details the creation and testing of an original statistical metric for analyzing individual basketball players in the National Basketball Association (NBA) by both their commonly measured statistics and their so-called “intangibles.” By using existing methods both as guides and as cautions against potential shortcomings, an inclusive statistic with multiple layers of data can be built to best reflect an individual player’s overall value to his team. This metric will be adjusted to account for the differences across multiple eras of NBA play and the levels of talent with which a player played, in order to avoid penalizing a player for the unique aspects of his career.

Liberty University Digital Commons

A Taxonomy of Big Data for Optimal Predictive Machine Learning and Data Mining

Author: Fokoue Ernest
Publication date: 01/01/2014

Big data comes in various ways, types, shapes, forms and sizes. Indeed, almost all areas of science, technology, medicine, public health, economics, business, linguistics and social science are bombarded by ever increasing flows of data begging to be analyzed efficiently and effectively. In this paper, we propose a rough idea of a possible taxonomy of big data, along with some of the most commonly used tools for handling each particular category of bigness. The dimensionality p of the input space and the sample size n are usually the main ingredients in the characterization of data bigness. The specific statistical machine learning technique used to handle a particular big data set will depend on which category it falls in within the bigness taxonomy. Large p small n data sets, for instance, require a different set of tools from the large n small p variety. Among other tools, we discuss Preprocessing, Standardization, Imputation, Projection, Regularization, Penalization, Compression, Reduction, Selection, Kernelization, Hybridization, Parallelization, Aggregation, Randomization, Replication, and Sequentialization. Indeed, it is important to emphasize right away that the so-called no free lunch theorem applies here, in the sense that there is no universally superior method that outperforms all other methods on all categories of bigness. It is also important to stress the fact that simplicity, in the sense of Ockham's razor non-plurality principle of parsimony, tends to reign supreme when it comes to massive data. We conclude with a comparison of the predictive performance of some of the most commonly used methods on a few data sets.

Comment: 18 pages, 2 figures, 3 tables

arXiv.org e-Print Archive

Bulgarian Digital Mathematics Library at IMI-BAS

Bayesian astrostatistics: a backward look to the future

Author: A Gelman
ACS Readhead
B Efron
BC Kelly
BD Ripley
BD Wandelt
C Graziani
DA van Dyk
DN Esch
ET Jaynes
FS Kitaura
G Patanchon
H Jeffreys
J Bovy
JM Petit
KS Mandel
M Kunz
M Lampton
MJ Bayarri
N Dobigeon
P Hadjicostas
PA Sturrock
PK Goel
RJ Carroll
RJ Little
S Luo
S Sinharay
SF Gull
T Budavári
TJ Loredo
TS Kuhn
Publication venue: Springer Science and Business Media LLC
Publication date: 29/08/2012

This perspective chapter briefly surveys: (1) past growth in the use of Bayesian methods in astrophysics; (2) current misconceptions about both frequentist and Bayesian statistical inference that hinder wider adoption of Bayesian methods by astronomers; and (3) multilevel (hierarchical) Bayesian modeling as a major future direction for research in Bayesian astrostatistics, exemplified in part by presentations at the first ISI invited session on astrostatistics, commemorated in this volume. It closes with an intentionally provocative recommendation for astronomical survey data reporting, motivated by the multilevel Bayesian perspective on modeling cosmic populations: that astronomers cease producing catalogs of estimated fluxes and other source properties from surveys. Instead, summaries of likelihood functions (or marginal likelihood functions) for source properties should be reported (not posterior probability density functions), including nontrivial summaries (not simply upper limits) for candidate objects that do not pass traditional detection thresholds.

Comment: 27 pp, 4 figures. A lightly revised version of a chapter in "Astrostatistical Challenges for the New Astronomy" (Joseph M. Hilbe, ed., Springer, New York, forthcoming in 2012), the inaugural volume for the Springer Series in Astrostatistics. Version 2 has minor clarifications and an additional reference.

arXiv.org e-Print Archive

Crossref