Evaluating Throwing Ability in Baseball

Authors: Carruth, Matthew; Jensen, Shane T.
Publication date: 01/06/2007

We present a quantitative analysis of throwing ability for major league outfielders and catchers. We use detailed game event data to tabulate success and failure events in outfielder and catcher throwing opportunities. We attribute a run contribution to each success or failure, and these contributions are tabulated for each player in each season. We use four seasons of data to estimate the overall throwing ability of each player using a Bayesian hierarchical model. This model allows us to shrink individual player estimates towards an overall population mean depending on the number of opportunities for each player. We use the posterior distribution of player abilities from this model to identify players with significant positive and negative throwing contributions.
Comment: Accepted for publication in the Journal of Quantitative Analysis in Sports

arXiv.org e-Print Archive

ScholarlyCommons@Penn
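
The shrinkage idea in the abstract above can be illustrated with a minimal sketch. It assumes a simple normal-normal model with known variances, which is not necessarily the model the authors fit; the function name and all parameters (`within_var`, `between_var`) are hypothetical and chosen only for illustration.

```python
def shrunken_estimate(player_mean, n_opportunities, pop_mean,
                      within_var, between_var):
    """Posterior mean of a player's ability under an assumed normal-normal model.

    player_mean     : observed average run contribution per opportunity
    n_opportunities : number of throwing opportunities observed
    pop_mean        : overall population mean ability
    within_var      : assumed variance of a single opportunity's outcome
    between_var     : assumed variance of true ability across players
    """
    if n_opportunities == 0:
        # No data: the estimate falls back to the population mean.
        return pop_mean
    data_precision = n_opportunities / within_var
    prior_precision = 1.0 / between_var
    # Weight on the player's own data grows with the number of opportunities.
    weight = data_precision / (data_precision + prior_precision)
    return weight * player_mean + (1.0 - weight) * pop_mean


# Same observed average, different sample sizes (illustrative numbers only).
few = shrunken_estimate(0.5, 4, 0.0, within_var=1.0, between_var=0.25)
many = shrunken_estimate(0.5, 400, 0.0, within_var=1.0, between_var=0.25)
```

With few opportunities the estimate sits halfway between the player's average and the population mean here; with many opportunities it stays close to the player's own average.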

Changes in the Distribution of Income Volatility

Authors: Jensen, Shane T.; Shore, Stephen H.
Publication date: 01/01/2008

Recent research has documented a significant rise in the volatility (e.g., expected squared change) of individual incomes in the U.S. since the 1970s. Existing measures of this trend abstract from individual heterogeneity, effectively estimating an increase in average volatility. We decompose this increase in average volatility and find that it is far from representative of the experience of most people: there has been no systematic rise in volatility for the vast majority of individuals. The rise in average volatility has been driven almost entirely by a sharp rise in the income volatility of those expected to have the most volatile incomes, identified ex-ante by large income changes in the past. We document that the self-employed and those who self-identify as risk-tolerant are much more likely to have such volatile incomes; these groups have experienced much larger increases in income volatility than the population at large. These results color the policy implications one might draw from the rise in average volatility. While the basic results are apparent from PSID summary statistics, providing a complete characterization of the dynamics of the volatility distribution is a methodological challenge. We resolve these difficulties with a Markovian hierarchical Dirichlet process that builds on work from the non-parametric Bayesian statistics literature.

arXiv.org e-Print Archive

CiteSeerX
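
As a rough illustration of the abstract's point that average volatility can be unrepresentative of most individuals, the sketch below computes a mean squared change per person and compares the population mean with the median. The income series are made-up log incomes, and `squared_change_volatility` is a hypothetical helper, not the authors' estimator (which is a Markovian hierarchical Dirichlet process).

```python
import statistics


def squared_change_volatility(incomes):
    """Mean squared year-to-year change of one person's (log) income series."""
    if len(incomes) < 2:
        raise ValueError("need at least two observations")
    changes = [later - earlier for earlier, later in zip(incomes, incomes[1:])]
    return sum(c * c for c in changes) / len(changes)


# Made-up log incomes: four stable earners and one highly volatile earner.
stable = [10.0, 10.1, 10.0, 10.1]
volatile = [10.0, 11.0, 9.5, 11.0]
population = [stable, stable, stable, stable, volatile]

volatilities = [squared_change_volatility(series) for series in population]
average_volatility = statistics.mean(volatilities)
median_volatility = statistics.median(volatilities)
```

In this toy population the average is pulled up by a single volatile series while the median stays at the stable earners' level, which is the kind of gap the decomposition in the abstract is about.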

Estimating an NBA player's impact on his team's chances of winning

Authors: Deshpande, Sameer K.; Jensen, Shane T.
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 11/04/2016

Traditional NBA player evaluation metrics are based on scoring differential or some pace-adjusted linear combination of box score statistics like points, rebounds, assists, etc. These measures treat performances with the outcome of the game still in question (e.g. tie score with five minutes left) in exactly the same way as they treat performances with the outcome virtually decided (e.g. when one team leads by 30 points with one minute left). Because they ignore the context in which players perform, these measures can result in misleading estimates of how players help their teams win. We instead use a win probability framework for evaluating the impact NBA players have on their teams' chances of winning. We propose a Bayesian linear regression model to estimate an individual player's impact, after controlling for the other players on the court. We introduce several posterior summaries to derive rank-orderings of players within their team and across the league. This allows us to identify highly paid players with low impact relative to their teammates, as well as players whose high impact is not captured by existing metrics.
Comment: To appear in the Journal of Quantitative Analysis of Sports

arXiv.org e-Print Archive
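
The regression described in the abstract above can be sketched in a few lines under strong simplifying assumptions: a zero-mean Gaussian prior with a common variance on every player effect, a known noise variance, and a signed on-court indicator design (+1 home, -1 away, 0 off court). These choices, the function name `posterior_mean`, and the toy data are assumptions for illustration and are not taken from the paper.

```python
def posterior_mean(X, y, prior_var, noise_var):
    """Posterior mean of beta for y = X beta + noise, with beta ~ N(0, prior_var * I).

    Solves (X^T X / noise_var + I / prior_var) beta = X^T y / noise_var
    by Gauss-Jordan elimination with partial pivoting.
    """
    p = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) / noise_var
          + (1.0 / prior_var if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) / noise_var
         for i in range(p)]
    for col in range(p):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * c for a, c in zip(A[r], A[col])]
                b[r] -= factor * b[col]
    return [b[i] / A[i][i] for i in range(p)]


# Toy data: each row is one shift, each column one player (hypothetical).
# Entries: +1 if on court for the home team, -1 for the away team, 0 otherwise.
shifts = [
    [1, 0, -1],
    [1, 1, 0],
    [0, 1, -1],
]
# Change in the home team's win probability over each shift (made up).
win_prob_change = [0.04, 0.06, -0.01]

effects = posterior_mean(shifts, win_prob_change, prior_var=0.01, noise_var=0.01)
```

Each entry of `effects` is a shrunken estimate of one player's contribution to win probability per shift, holding the other players on the court fixed; a smaller `prior_var` pulls all estimates harder toward zero.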

Hierarchical Bayesian Modeling of Hitting Performance in Baseball

Authors: Jensen, Shane T.; McShane, Blake; Wyner, Abraham J.
Publication date: 01/01/2009

We have developed a sophisticated statistical model for predicting the hitting performance of Major League Baseball players. The Bayesian paradigm provides a principled method for balancing past performance with crucial covariates, such as player age and position. We share information across time and across players by using mixture distributions to control shrinkage for improved accuracy. We compare the performance of our model to current sabermetric methods on a held-out season (2006), and discuss both successes and limitations.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Bayesian variable selection and data integration for biological regulatory networks

Authors: Chen, Guang; Jensen, Shane T.; Stoeckert Jr, Christian J.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2006

A substantial focus of research in molecular biology is gene regulatory networks: the sets of transcription factors and target genes that control the involvement of different biological processes in living cells. Previous statistical approaches for identifying gene regulatory networks have used gene expression data, ChIP binding data or promoter sequence data, but each of these resources provides only partial information. We present a Bayesian hierarchical model that integrates all three data types in a principled variable selection framework. The gene expression data are modeled as a function of the unknown gene regulatory network, which has an informed prior distribution based upon both ChIP binding and promoter sequence data. We also present a variable weighting methodology for the principled balancing of multiple sources of prior information. We apply our procedure to the discovery of gene regulatory relationships in Saccharomyces cerevisiae (yeast), for which we can use several external sources of information to validate our results. Our inferred relationships show greater biological relevance on the external validation measures than previous data integration methods. Our model also estimates synergistic and antagonistic interactions between transcription factors, many of which are validated by previous studies. We also evaluate the results of our procedure for weighting multiple sources of prior information. Finally, we discuss our methodology in the context of previous approaches to data integration and Bayesian variable selection.
Comment: Published at http://dx.doi.org/10.1214/07-AOAS130 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

CiteSeerX

ScholarlyCommons@Penn