A Web-Based Tool for Analysing Normative Documents in English
Our goal is to use formal methods to analyse normative documents written in
English, such as privacy policies and service-level agreements. This requires
the combination of a number of different elements, including information
extraction from natural language, formal languages for model representation,
and an interface for property specification and verification. We have worked on
a collection of components for this task: a natural language extraction tool, a
suitable formalism for representing such documents, an interface for building
models in this formalism, and methods for answering queries asked of a given
model. In this work, each of these concerns is brought together in a web-based
tool, providing a single interface for analysing normative texts in English.
Through the use of a running example, we describe each component and
demonstrate the workflow established by our tool.
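The kind of formal model and query interface the abstract describes can be sketched with a toy representation. This mini-formalism (clause names, parties, and the query function) is hypothetical, invented for illustration, and is not the tool's actual modelling language:

```python
from dataclasses import dataclass

# Hypothetical mini-formalism (not the paper's actual representation):
# a normative document as a list of clauses, each tagging a party and
# an action with a deontic modality, plus a trivial query function.
@dataclass(frozen=True)
class Clause:
    modality: str   # "obligation", "permission", or "prohibition"
    party: str
    action: str

# Toy model extracted from an imagined privacy policy.
model = [
    Clause("obligation", "provider", "encrypt user data"),
    Clause("prohibition", "provider", "share data with third parties"),
    Clause("permission", "user", "request data deletion"),
]

def query(model, modality, party):
    """Return the actions a party has under a given modality."""
    return [c.action for c in model
            if c.modality == modality and c.party == party]

print(query(model, "obligation", "provider"))
```

A property-verification interface of the sort described would layer richer queries (e.g. conflict detection between clauses) over such a model.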
Structural Equity: Big-Picture Thinking & Partnerships That Improve Community College Student Outcomes
While access to higher education has grown considerably for low-income students and students of color over the past decades, the rates at which those students succeed in completing or transferring to a four-year university remain low and have been slow to improve. This report describes how four successful community colleges have cultivated robust, cross-sector partnerships to create seamless educational pathways for students, and highlights three specific strategies the institutions have used to help eliminate structural barriers that perpetuate student success gaps along racial/ethnic and socioeconomic lines. Development of this guide was supported by the Lumina Foundation
Recognizing cited facts and principles in legal judgements
In common law jurisdictions, legal professionals cite facts and legal principles from precedent cases to support their arguments before the court for their intended outcome in a current case. This practice stems from the doctrine of stare decisis, where cases that have similar facts should receive similar decisions with respect to the principles. It is essential for legal professionals to identify such facts and principles in precedent cases, though this is a highly time-intensive task. In this paper, we present studies that demonstrate that human annotators can achieve reasonable agreement on which sentences in legal judgements contain cited facts and principles (respectively, κ=0.65 and κ=0.95 for inter- and intra-annotator agreement). We further demonstrate that it is feasible to automatically annotate sentences containing such legal facts and principles in a supervised machine learning framework based on linguistic features, reporting per-category precision and recall figures of between 0.79 and 0.89 for classifying sentences in legal judgements as cited facts, principles, or neither using a Bayesian classifier, with an overall κ of 0.72 against the human-annotated gold standard.
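As a quick illustration of the agreement statistic reported above, here is a minimal sketch of Cohen's kappa over a toy three-way annotation task. The sentence labels and annotator data below are invented, not the paper's corpus:

```python
import numpy as np

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two annotators beyond chance."""
    labels_a = np.asarray(labels_a)
    labels_b = np.asarray(labels_b)
    categories = np.union1d(labels_a, labels_b)
    observed = np.mean(labels_a == labels_b)           # observed agreement p_o
    expected = sum(                                    # chance agreement p_e
        np.mean(labels_a == c) * np.mean(labels_b == c) for c in categories
    )
    return (observed - expected) / (1.0 - expected)

# Toy example: two annotators labelling 10 sentences as
# "fact", "principle", or "neither" (hypothetical data).
a = ["fact", "fact", "neither", "principle", "fact",
     "neither", "neither", "principle", "fact", "neither"]
b = ["fact", "fact", "neither", "principle", "neither",
     "neither", "neither", "principle", "fact", "fact"]
print(round(cohens_kappa(a, b), 3))
```

Kappa discounts the agreement two annotators would reach by labelling at random with their observed category frequencies, which is why it is preferred over raw percent agreement for tasks like this.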
A Hierarchical Bayesian Model of Pitch Framing
Since the advent of high-resolution pitch tracking data (PITCHf/x), many in
the sabermetrics community have attempted to quantify a Major League Baseball
catcher's ability to "frame" a pitch (i.e. increase the chance that a pitch is
called as a strike). Especially in the last three years, there has been an
explosion of interest in the "art of pitch framing" in the popular press as
well as signs that teams are considering framing when making roster decisions.
We introduce a Bayesian hierarchical model to estimate each umpire's
probability of calling a strike, adjusting for pitch participants, pitch
location, and contextual information like the count. Using our model, we can
estimate each catcher's effect on an umpire's chance of calling a strike. We are
then able to translate these estimated effects into average runs saved across a
season. We also introduce a new metric, analogous to Jensen, Shirley, and
Wyner's Spatially Aggregate Fielding Evaluation metric, which provides a more
honest assessment of the impact of framing.
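The partial-pooling idea behind such a hierarchical model can be sketched with a much-simplified empirical-Bayes stand-in. All counts below are invented, and the paper's actual model is a hierarchical logistic regression over individual pitches that also adjusts for location, count, and participants:

```python
import numpy as np

# Shrink each catcher's raw called-strike rate on borderline pitches
# toward the league rate via a Beta prior (a crude stand-in for the
# hierarchical model's partial pooling; all counts are made up).
strikes = np.array([52, 15, 230, 8])    # borderline pitches called strikes
pitches = np.array([100, 25, 400, 10])  # borderline pitches caught

league_rate = strikes.sum() / pitches.sum()
prior_strength = 50.0                    # pseudo-pitch count (assumed)
alpha0 = league_rate * prior_strength    # prior "strikes"

raw = strikes / pitches
shrunk = (strikes + alpha0) / (pitches + prior_strength)

for n, r, s in zip(pitches, raw, shrunk):
    print(f"pitches={n:3d}  raw={r:.3f}  shrunk={s:.3f}")
```

The point of the sketch: a catcher with only 10 borderline pitches is pulled strongly toward the league rate, while one with 400 pitches keeps an estimate close to his raw rate, which is exactly the behaviour a hierarchical prior delivers.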
A statistical analysis of multiple temperature proxies: Are reconstructions of surface temperatures over the last 1000 years reliable?
Predicting historic temperatures based on tree rings, ice cores, and other
natural proxies is a difficult endeavor. The relationship between proxies and
temperature is weak and the number of proxies is far larger than the number of
target data points. Furthermore, the data contain complex spatial and temporal
dependence structures which are not easily captured with simple models. In this
paper, we assess the reliability of such reconstructions and their statistical
significance against various null models. We find that the proxies do not
predict temperature significantly better than random series generated
independently of temperature. Furthermore, various model specifications that
perform similarly at predicting temperature produce extremely different
historical backcasts. Finally, the proxies seem unable to forecast the high
levels of and sharp run-up in temperature in the 1990s either in-sample or from
contiguous holdout blocks, thus casting doubt on their ability to predict such
phenomena if in fact they occurred several hundred years ago. We propose our
own reconstruction of Northern Hemisphere average annual land temperature over
the last millennium, assess its reliability, and compare it to those from the
climate science literature. Our model provides a similar reconstruction but has
much wider standard errors, reflecting the weak signal and large uncertainty
encountered in this setting.
Comment: Published at http://dx.doi.org/10.1214/10-AOAS398 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
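The null-model comparison described above can be illustrated on simulated data: regress a target series on weak synthetic proxies versus on series generated independently of it, and compare holdout RMSE on a contiguous block. Everything here (series, noise levels, sizes) is invented for illustration and is not the paper's data or model:

```python
import numpy as np

# Simulated check in the spirit of the paper's null-model comparison.
rng = np.random.default_rng(42)

n, p = 150, 10
temp = np.cumsum(rng.normal(0, 0.1, n))                  # smooth target series
proxies = temp[:, None] + rng.normal(0, 0.2, (n, p))     # noisy copies of signal
noise = rng.normal(0, 1.0, (n, p))                       # independent of temp

def holdout_rmse(X, y, n_train=100):
    """Fit OLS on the first n_train points, score on the contiguous rest."""
    X1 = np.column_stack([np.ones(len(X)), X])           # add intercept
    coef, *_ = np.linalg.lstsq(X1[:n_train], y[:n_train], rcond=None)
    resid = y[n_train:] - X1[n_train:] @ coef
    return float(np.sqrt(np.mean(resid ** 2)))

print("proxy RMSE:", holdout_rmse(proxies, temp))
print("noise RMSE:", holdout_rmse(noise, temp))
```

With informative proxies the holdout error should beat the pure-noise predictors; the paper's finding is that for real temperature proxies this gap is not statistically convincing against suitable null series.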
Comment: Boosting Algorithms: Regularization, Prediction and Model Fitting
The authors are doing the readers of Statistical Science a true service with
a well-written and up-to-date overview of boosting that originated with the
seminal algorithms of Freund and Schapire. Equally, we are grateful for
high-level software that will permit a larger readership to experiment with, or
simply apply, boosting-inspired model fitting. The authors show us a world of
methodology that illustrates how a fundamental innovation can penetrate every
nook and cranny of statistical thinking and practice. They introduce the reader
to one particular interpretation of boosting and then give a display of its
potential with extensions from classification (where it all started) to least
squares, exponential family models, survival analysis, to base-learners other
than trees such as smoothing splines, to degrees of freedom and regularization,
and to fascinating recent work in model selection. The uninitiated reader will
find that the authors did a nice job of presenting a certain coherent and
useful interpretation of boosting. The other reader, though, who has watched
the business of boosting for a while, may have quibbles with the authors over
details of the historic record and, more importantly, over their optimism about
the current state of theoretical knowledge. In fact, as much as "the
statistical view" has proven fruitful, it has also resulted in some ideas
about why boosting works that may be misconceived, and in some recommendations
that may be misguided. [arXiv:0804.2752]
Comment: Published at http://dx.doi.org/10.1214/07-STS242B in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
The Transfer Playbook: Essential Practices For Two- And Four-year Colleges
Recognizing the critical need to help the millions of community college students failed by current transfer practices and policies, a new report provides a detailed guide for two- and four-year colleges on how to improve bachelor's degree outcomes for students who start at community college. Every year, millions of students aiming to attain a bachelor's degree attend community colleges because of their affordability and accessibility. Most will not realize their goals. While the vast majority of students report they want to earn a bachelor's degree, only 14 percent of degree-seeking students achieve that goal within six years, according to recent research from CCRC, Aspen, and the National Student Clearinghouse Research Center. The odds are worse for low-income students, first-generation college students, and students of color, those most likely to start at a community college.
Hierarchical Bayesian Modeling of Hitting Performance in Baseball
We have developed a sophisticated statistical model for predicting the
hitting performance of Major League baseball players. The Bayesian paradigm
provides a principled method for balancing past performance with crucial
covariates, such as player age and position. We share information across time
and across players by using mixture distributions to control shrinkage for
improved accuracy. We compare the performance of our model to current
sabermetric methods on a held-out season (2006), and discuss both successes and
limitations.
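The mixture-based shrinkage idea mentioned above can be sketched as follows, with invented numbers and a crude closest-mean component assignment standing in for the paper's full posterior computation over mixture components:

```python
# Toy illustration of mixture-based shrinkage (hypothetical numbers, not
# the paper's model): instead of shrinking every hitter toward a single
# league mean, assign each hitter to the likelier of two ability
# components and shrink toward that component's mean.
hits = [20, 80, 150]
at_bats = [100, 300, 480]

components = {"average": 0.255, "elite": 0.300}  # assumed component means
prior_strength = 200.0                            # pseudo-at-bats (assumed)

def shrink(h, ab):
    """Assign the closest-mean component, then shrink toward its mean."""
    name, mu = min(components.items(), key=lambda kv: abs(kv[1] - h / ab))
    return name, (h + mu * prior_strength) / (ab + prior_strength)

for h, ab in zip(hits, at_bats):
    name, s = shrink(h, ab)
    print(f"raw={h / ab:.3f} -> component={name}, shrunk={s:.3f}")
```

Separating hitters into components keeps a genuinely elite hitter from being dragged all the way down to the league average, which is the accuracy benefit the abstract attributes to mixture distributions.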
- …