
    An Algorithmic Assessment of Parole Decisions

    Objectives: Parole is an important mechanism for alleviating the extraordinary social and financial costs of mass incarceration. Yet parole boards can also present a major obstacle, denying parole to low-risk inmates who could safely be released from prison. We evaluate a major parole institution, the New York State Parole Board, quantifying the costs of suboptimal decision-making. Methods: Using ensemble machine learning, we predict any arrest and any violent felony arrest within three years to generate criminal risk predictions for individuals released on parole in New York from 2012 to 2015. We quantify the social welfare loss of the Board’s suboptimal decisions by rank ordering inmates by their predicted risk and estimating the crime rates that could be observed with counterfactual risk-based release decisions. We also estimate the release rates that could be achieved holding arrest rates constant. We attend to the “selective labels” problem in several ways, including by testing the validity of the algorithm for individuals who were denied parole but later released after the expiration of their sentence. Results: We conservatively estimate that the Board could have more than doubled the release rate without increasing the total or violent felony arrest rate, and that they could have achieved these gains while simultaneously eliminating racial disparities in release rates. Conclusions: This study demonstrates the use of algorithms for evaluating criminal justice decision-making. Our analyses suggest that many low-risk individuals are being unnecessarily incarcerated, highlighting the need for major parole reform.
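The rank-ordering exercise described in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the function name `max_release_rate` and the toy numbers are hypothetical, and the sketch assumes an observed outcome is available for every individual, which sidesteps the selective labels problem that the study addresses separately.

```python
def max_release_rate(risks, arrested, target_arrest_rate):
    """Largest share of the pool that could be released, lowest predicted
    risk first, while the arrest rate among those released stays at or
    below target_arrest_rate."""
    # Rank individuals from lowest to highest predicted risk.
    order = sorted(range(len(risks)), key=lambda i: risks[i])
    best = 0
    arrests = 0
    for k, i in enumerate(order, start=1):
        arrests += arrested[i]  # cumulative arrests among the k lowest-risk
        if arrests / k <= target_arrest_rate:
            best = k
    return best / len(risks)


# Hypothetical toy data: predicted risk and observed arrest (1 = arrested).
toy_risks = [0.1, 0.2, 0.3, 0.4, 0.6, 0.8, 0.9, 0.5]
toy_arrested = [0, 0, 0, 1, 1, 1, 1, 0]
toy_rate = max_release_rate(toy_risks, toy_arrested, 0.25)
```

Comparing such a counterfactual release rate with the rate a board actually achieved at the same arrest rate is the kind of comparison the abstract summarizes.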

    Entertainment As Crime Prevention: Evidence from Chicago Sports Games.

    The concern that mass media may be responsible for aggressive and criminal behavior is widespread. Comparatively little consideration has been given to its potential diversionary function. This paper contributes to the emerging body of literature on entertainment as a determinant of crime by analyzing Chicago by-the-minute crime reports during major sporting events. Sports provide an exogenous infusion of TV diversion that we leverage to test the effect of entertainment on crime. Because the scheduling of a sporting event should be random with respect to crime within a given month, day of the week, and time, we use month-time-day-of-week fixed effects to estimate the effect of the sporting events on crime. We compare crime reports by the half hour when Chicago’s NFL, NBA, or MLB teams are playing to crime reports at the same time, day, and month when the teams are not playing. We conduct the same analysis for the Super Bowl, NBA Finals, and MLB World Series. The Super Bowl generates the most dramatic declines: total crime reports decrease by approximately 25 percent (roughly 60 fewer crimes). The decline is partially offset by an increase in crime before the game, most notably in drug and prostitution reports, and an uptick in reports of violent crime immediately after the game. Crime during Chicago Bears Monday night football games is roughly 15 percent lower (30 fewer crimes) than during the same time on non-game nights. Our results show similar but smaller effects for NBA and MLB games. Except for the Super Bowl, we find little evidence for temporal crime displacement before or after the games. In general, we find substantial declines during games across crime types (property, violent, drug, and other), with the largest reductions for drug crime. We believe that having fewer potential offenders on the streets largely explains the declines in crime.
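The comparison of game and non-game half hours within the same month, day of week, and time can be sketched as follows. This is a simplified within-cell difference in means, not the fixed-effects regression the paper reports; the function name `within_cell_effect`, the row layout, and the toy counts are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def within_cell_effect(rows):
    """rows: (month, weekday, half_hour, game_on, crime_count) tuples.
    Returns the average game-minus-non-game difference in crime counts,
    computed only within cells that contain both kinds of observation."""
    cells = defaultdict(lambda: {True: [], False: []})
    for month, weekday, half_hour, game_on, count in rows:
        cells[(month, weekday, half_hour)][bool(game_on)].append(count)
    diffs = [
        mean(c[True]) - mean(c[False])
        for c in cells.values()
        if c[True] and c[False]  # skip cells with nothing to compare
    ]
    if not diffs:
        raise ValueError("no cell has both game and non-game observations")
    return mean(diffs)


# Hypothetical toy rows; half_hour is an index from 0 to 47.
toy_rows = [
    (1, "Mon", 40, True, 30),
    (1, "Mon", 40, False, 40),
    (1, "Mon", 40, False, 44),
    (2, "Sun", 36, True, 50),
    (2, "Sun", 36, False, 58),
    (3, "Tue", 40, False, 35),  # no game observation in this cell, so unused
]
toy_effect = within_cell_effect(toy_rows)
```

A negative value means fewer crime reports during games than in comparable non-game half hours.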

    Big Data, Machine Learning, and the Credibility Revolution in Empirical Legal Studies

    The so-called credibility revolution changed empirical research (see Angrist and Pischke 2010). Before the revolution, researchers frequently relied on attempts to statistically model the world to make causal inferences from observational data. They would control for confounders, make functional form assumptions about the relationships between variables, and read regression coefficients on variables of interest as causal estimates. In essence, they would rely heavily on ex post statistical analysis to make causal inferences. The revolution centered around the idea that the only way to truly account for possible sources of bias is to remove the influence of all confounders ex ante through better research design. Thus, since the revolution, researchers have attempted to design studies around sources of random or as-if random variation, either with experiments or what have become known as “quasi-experimental” designs. This credibility revolution has increasingly brought quantitative researchers into agreement that, in the words of Donald Rubin, “design trumps analysis” (Rubin 2008). However, the research landscape has changed dramatically in recent years. We are now in an era of “big data.” At the same time as the internet vastly expanded the number of available data sources, sophisticated computational resources became widely accessible. This has opened up a whole new frontier for social scientists and empirical legal scholars: textual data. Indeed, most of the information we have about law, politics, and society is contained in texts of one kind or another, almost all of which are now digitized and available online. 
For example, in the 1990s, federal courts began to adopt online case records management, known as CM/ECF, where attorneys, clerks, and judges file and access documents related to each case. Using the federal government’s PACER database (available at pacer.gov), researchers (both academic and professional) can now easily access the dockets and filings for each case that is filed in a federal court. LexisNexis, Westlaw, and other companies have further improved access by providing raw text versions of a wide range of legal documents, along with expert-coded metadata to help researchers more easily find what they are looking for. And yet, despite the potential of these newly available resources, the sheer volume presents challenges for researchers. A core problem is how to draw substantively important inferences from a mountain of often unstructured digitized text. To deal with this challenge, researchers are turning their attention back toward the tools of statistical analysis. As many of the essays in this volume demonstrate, there is now a surging interest among researchers in one particularly powerful tool of statistical analysis: machine learning. This chapter addresses the place of machine learning in a post–“credibility revolution” landscape. We begin with an overview of machine learning and then make four main points. First, design still trumps analysis. The lessons of the credibility revolution should not be forgotten in the excitement around machine learning; machine learning does nothing to address the problem of omitted variable bias. Nonetheless, machine learning can improve a researcher’s data analysis. Indeed, with growing concerns about the reliability of even design-based research, perhaps we should be aiming for triangulation rather than design purism. Further, for some questions, we do not have the luxury of waiting for a strong design, and we need a best approximation of an answer in the meantime. 
Second, even design-committed researchers should not ignore machine learning: it can be used in service of design-based studies to make causal estimates less variable, less biased, and more heterogeneous. Third, there are important policy-relevant prediction problems for which machine learning is particularly valuable (e.g., predicting recidivism in the criminal justice system). Yet even with research questions centered around prediction, a focus on design is still essential. As with causal inference, researchers cannot simply rely on statistical models but must also carefully consider threats to the validity of predictions. We briefly review some of these threats: GIGO (“garbage in, garbage out”), selective labels, and Campbell’s law. Fourth, the predictive power of machine learning can be leveraged for descriptive research. Where possible, we illustrate these points using examples drawn from real-world research.