
    A Penalized Likelihood Method for Balancing Accuracy and Fairness in Predictive Policing

    The racial bias of predictive policing algorithms has been the focus of recent research, and, in the case of Hawkes process models, feedback loops are possible in which biased arrests are amplified through self-excitation, leading to hotspot formation and further arrests of minority populations. In this article we develop a penalized likelihood approach for introducing fairness into point process models of crime. In particular, we add a penalty term to the likelihood function that encourages the amount of police patrol received by each of several demographic groups to be proportional to that group's representation in the total population. We apply our model to historical crime incident data in Indianapolis and measure the fairness and accuracy of the two approaches across several crime categories. We show that fairness can be introduced into point process models of crime so that patrol levels proportionally match demographics, though at the cost of reduced predictive accuracy.
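As a rough sketch of the idea described in this abstract, the snippet below adds a fairness penalty to a point-process log-likelihood so that each group's share of patrol is pushed toward its share of the population. The function names, the squared-deviation penalty form, and the weight `lam` are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def fairness_penalty(patrol_share, population_share):
    """Squared deviation between each group's share of patrol and its share
    of the population (illustrative penalty form, not the paper's exact term)."""
    patrol_share = np.asarray(patrol_share, dtype=float)
    population_share = np.asarray(population_share, dtype=float)
    return np.sum((patrol_share - population_share) ** 2)

def penalized_log_likelihood(log_lik, patrol_share, population_share, lam=1.0):
    """Point-process log-likelihood minus a weighted fairness penalty.

    log_lik          : Hawkes (or other point-process) log-likelihood value
    patrol_share     : fraction of patrol allocated to each demographic group
    population_share : fraction of the total population in each group
    lam              : penalty weight trading accuracy against fairness
    """
    return log_lik - lam * fairness_penalty(patrol_share, population_share)

# Example: three demographic groups, patrol slightly over-allocated to group 0
print(penalized_log_likelihood(-1250.0, [0.5, 0.3, 0.2], [0.4, 0.35, 0.25], lam=100.0))
```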

    Learning Fair Naive Bayes Classifiers by Discovering and Eliminating Discrimination Patterns

    As machine learning is increasingly used to make real-world decisions, recent research efforts aim to define and ensure fairness in algorithmic decision making. Existing methods often assume a fixed set of observable features to define individuals, but do not address the possibility that some features may be unobserved at test time. In this paper, we study the fairness of naive Bayes classifiers, which allow partial observations. In particular, we introduce the notion of a discrimination pattern, which refers to an individual receiving different classifications depending on whether some sensitive attributes were observed. A model is then considered fair if it exhibits no such pattern. We propose an algorithm to discover and mine discrimination patterns in a naive Bayes classifier, and show how to learn maximum-likelihood parameters subject to these fairness constraints. Our approach iteratively discovers and eliminates discrimination patterns until a fair model is learned. An empirical evaluation on three real-world datasets demonstrates that we can remove exponentially many discrimination patterns by adding only a small fraction of them as constraints.
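A minimal sketch of the discrimination-pattern idea, assuming a tiny hand-specified naive Bayes model: it compares the posterior for a partial observation with and without the sensitive attribute and flags a pattern when the gap exceeds a threshold. All parameters, names, and the threshold value are illustrative, not taken from the paper.

```python
# Tiny hand-specified naive Bayes over binary features (illustrative parameters).
prior = {1: 0.3, 0: 0.7}                       # P(C)
cpt = {                                        # P(feature = 1 | C)
    "income": {1: 0.7, 0: 0.4},
    "gender": {1: 0.6, 0: 0.5},                # sensitive attribute
}

def posterior(evidence):
    """P(C = 1 | evidence) under naive Bayes; unobserved features are marginalized
    out, which for naive Bayes simply means dropping them from the product."""
    scores = {}
    for c in (0, 1):
        p = prior[c]
        for feat, val in evidence.items():
            p_feat1 = cpt[feat][c]
            p *= p_feat1 if val == 1 else 1.0 - p_feat1
        scores[c] = p
    return scores[1] / (scores[0] + scores[1])

def discrimination_score(evidence, sensitive, value):
    """Change in P(C=1 | ...) caused by additionally observing the sensitive attribute."""
    with_sensitive = dict(evidence, **{sensitive: value})
    return abs(posterior(with_sensitive) - posterior(evidence))

# A pattern is flagged when the gap exceeds a threshold delta (value assumed here).
delta = 0.05
score = discrimination_score({"income": 1}, "gender", 0)
print(score, "discriminatory" if score > delta else "fair")
```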

    Learning to rank spatio-temporal event hotspots

    Background: Crime, traffic accidents, terrorist attacks, and other space-time random events are unevenly distributed in space and time. In the case of crime, hotspot and other proactive policing programs aim to focus limited resources at the highest-risk crime and social-harm hotspots in a city. A crucial step in the implementation of these strategies is the construction of scoring models used to rank spatial hotspots. While these methods are evaluated by area-normalized Recall@k (called the predictive accuracy index), models are typically trained via maximum likelihood or rules of thumb that may not prioritize model accuracy in the top k hotspots. Furthermore, current algorithms are defined on fixed grids that fail to capture risk patterns occurring in neighborhoods and on road networks with complex geometries. Results: We introduce CrimeRank, a learning-to-rank boosting algorithm for determining a crime hotspot map that directly optimizes the percentage of crime captured by the top-ranked hotspots. The method employs a floating grid combined with a greedy hotspot selection algorithm for accurately capturing spatial risk in complex geometries. We illustrate the performance using crime and traffic incident data provided by the Indianapolis Metropolitan Police Department, IED attacks in Iraq, and data from the 2017 NIJ Real-time crime forecasting challenge. Conclusion: Our learning-to-rank strategy was the top-performing solution (PAI metric) in the 2017 challenge. We show that CrimeRank achieves even greater gains when the competition rules are relaxed by removing the constraint that grid cells be a regular tessellation.
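The snippet below is not CrimeRank itself; it only sketches the evaluation metric mentioned above (PAI, i.e., area-normalized Recall@k) and a simple greedy top-k hotspot selection under an area budget. All names and data are synthetic assumptions for illustration.

```python
import numpy as np

def predictive_accuracy_index(crime_counts, selected, cell_areas):
    """PAI = (share of crime captured in selected cells) / (share of area selected)."""
    crime_counts = np.asarray(crime_counts, dtype=float)
    cell_areas = np.asarray(cell_areas, dtype=float)
    selected = np.asarray(selected, dtype=int)
    hit_rate = crime_counts[selected].sum() / crime_counts.sum()
    area_rate = cell_areas[selected].sum() / cell_areas.sum()
    return hit_rate / area_rate

def greedy_hotspots(scores, cell_areas, area_budget):
    """Greedily pick the highest-scoring cells until the area budget is exhausted."""
    order = np.argsort(scores)[::-1]
    chosen, used = [], 0.0
    for idx in order:
        if used + cell_areas[idx] > area_budget:
            continue
        chosen.append(idx)
        used += cell_areas[idx]
    return np.array(chosen, dtype=int)

# Example on a toy 100-cell grid with synthetic risk scores and crime counts
rng = np.random.default_rng(0)
scores = rng.gamma(2.0, size=100)          # predicted risk per cell
crimes = rng.poisson(scores)               # observed crime counts
areas = np.ones(100)                       # unit-area cells
hotspots = greedy_hotspots(scores, areas, area_budget=10)
print(predictive_accuracy_index(crimes, hotspots, areas))
```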

    Challenges of contemporary predictive policing

    Big data algorithms developed for predictive policing are increasingly present in the everyday work of law enforcement. There are various applications of such technologies to predict crimes, potential crime scenes, profiles of perpetrators, and more. In this way, police officers are provided with appropriate assistance in their work, increasing their efficiency or entirely replacing them in specific tasks. Because policing involves the use of force and arrest, prediction algorithms can have significantly different, more drastic consequences than similar technologies would produce in agriculture, industry, or health care. For the further development of predictive policing, it is necessary to have a clear picture of the problems it can cause. This paper discusses modern predictive policing from the perspective of the challenges that negatively affect its application.

    FiSH: Fair Spatial Hotspots

    The pervasiveness of tracking devices and the enhanced availability of spatially located data have deepened interest in using them for various policy interventions, through computational data analysis tasks such as spatial hot spot detection. In this paper, we consider, for the first time to the best of our knowledge, fairness in detecting spatial hot spots. We motivate the need for ensuring fairness through statistical parity over the collective population covered across the chosen hot spots. We then characterize the task of identifying a diverse set of solutions along the noteworthiness-fairness trade-off spectrum, to empower the user to choose a trade-off justified by the policy domain. Because this is a novel task formulation, we also develop a suite of evaluation metrics for fair hot spots, motivated by the need to evaluate pertinent aspects of the task. We illustrate the computational infeasibility of identifying fair hot spots using naive and/or direct approaches and devise a method, codenamed FiSH, for efficiently identifying high-quality, fair and diverse sets of spatial hot spots. FiSH traverses a tree-structured search space using heuristics that guide it towards identifying effective and fair sets of spatial hot spots. Through an extensive empirical analysis over a real-world dataset from the domain of human development, we illustrate that FiSH generates high-quality solutions at fast response times.
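As a small illustration of the statistical-parity notion used above (not the FiSH algorithm itself), the sketch below measures how evenly a chosen set of hot spots covers each demographic group; the function names, data, and parity-gap summary are assumptions for illustration.

```python
import numpy as np

def coverage_by_group(groups, covered):
    """Fraction of each demographic group that falls inside the chosen hot spots.

    groups  : array of group labels, one per individual
    covered : boolean array, True if the individual lies in a selected hot spot
    """
    groups = np.asarray(groups)
    covered = np.asarray(covered, dtype=bool)
    return {g: covered[groups == g].mean() for g in np.unique(groups)}

def parity_gap(groups, covered):
    """Largest coverage difference between any two groups (0 = perfect statistical parity)."""
    rates = list(coverage_by_group(groups, covered).values())
    return max(rates) - min(rates)

# Example: 10 individuals in two groups; the selected hot spots cover 4 of them
groups = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]
covered = [True, True, False, False, False, True, True, False, False, False]
print(coverage_by_group(groups, covered), parity_gap(groups, covered))
```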

    Towards social fairness in smart policing: leveraging territorial, racial, and workload fairness in the police districting problem

    Recent events (e.g., the George Floyd protests) have shown the impact that inequality in policing can have on society. Police operations should therefore be planned and designed taking into account the interests of three main groups of directly affected stakeholders (i.e., the general population, minorities, and police agents) in order to pursue fairness. Most models presented so far in the literature have failed at this, optimizing cost efficiency or operational effectiveness instead while disregarding other social goals. In this paper, a Smart Policing model that produces operational patrolling districts and includes territorial, racial, and workload fairness criteria is proposed. The patrolling configurations are designed according to the territorial distribution of crime risk and population subgroups, while equalizing the total risk exposure across the districts according to the preferences of a decision-maker. The model is formulated as a multi-objective mixed-integer program. Computational experiments on randomly generated data are used to empirically draw insights into the relationships between the fairness criteria considered. Finally, the model is tested and validated on a real-world dataset covering the Central District of Madrid (Spain). Experiments show that the model identifies solutions that dominate the patrolling configuration currently in use.
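For flavor only, here is a minimal mixed-integer districting sketch in PuLP that balances total risk exposure across districts; the paper's full multi-objective model also covers territorial and racial fairness and real patrolling constraints, none of which are reproduced here. All data and names are assumed.

```python
# A minimal districting sketch using PuLP (pip install pulp). It only balances total
# risk exposure across districts; contiguity, territorial, and racial criteria from
# the full model are omitted.
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary

risk = [4.0, 2.5, 3.0, 1.5, 5.0, 2.0]    # illustrative crime-risk score per basic area
areas = range(len(risk))
districts = range(2)                     # number of patrol districts (assumed)
mean_risk = sum(risk) / len(districts)

prob = LpProblem("risk_balanced_districting", LpMinimize)
x = LpVariable.dicts("assign", (areas, districts), cat=LpBinary)  # x[i][k] = 1 if area i -> district k
d = LpVariable("max_deviation", lowBound=0)

prob += d                                              # minimize worst-case imbalance
for i in areas:                                        # each area belongs to exactly one district
    prob += lpSum(x[i][k] for k in districts) == 1
for k in districts:                                    # |district risk - mean risk| <= d
    district_risk = lpSum(risk[i] * x[i][k] for i in areas)
    prob += district_risk - mean_risk <= d
    prob += mean_risk - district_risk <= d

prob.solve()
print([[i for i in areas if x[i][k].varValue > 0.5] for k in districts])
```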

    Equal Protection Under Algorithms: A New Statistical and Legal Framework

    In this Article, we provide a new statistical and legal framework to understand the legality and fairness of predictive algorithms under the Equal Protection Clause. We begin by reviewing the main legal concerns regarding the use of protected characteristics such as race and the correlates of protected characteristics such as criminal history. The use of race and nonrace correlates in predictive algorithms generates direct and proxy effects of race, respectively, that can lead to racial disparities that many view as unwarranted and discriminatory. These effects have led to the mainstream legal consensus that the use of race and nonrace correlates in predictive algorithms is both problematic and potentially unconstitutional under the Equal Protection Clause. This mainstream position is also reflected in practice, with all commonly used predictive algorithms excluding race and many excluding nonrace correlates such as employment and education. Next, we challenge the mainstream legal position that the use of a protected characteristic always violates the Equal Protection Clause. We develop a statistical framework that formalizes exactly how the direct and proxy effects of race can lead to algorithmic predictions that disadvantage minorities relative to nonminorities. While an overly formalistic solution requires exclusion of race and all potential nonrace correlates, we show that this type of algorithm is unlikely to work in practice because nearly all algorithmic inputs are correlated with race. We then show that there are two simple statistical solutions that can eliminate the direct and proxy effects of race, and which are implementable even when all inputs are correlated with race. We argue that our proposed algorithms uphold the principles of the equal protection doctrine because they ensure that individuals are not treated differently on the basis of membership in a protected class, in stark contrast to commonly used algorithms that unfairly disadvantage minorities despite the exclusion of race. We conclude by empirically testing our proposed algorithms in the context of the New York City pretrial system. We show that nearly all commonly used algorithms violate certain principles underlying the Equal Protection Clause by including variables that are correlated with race, generating substantial proxy effects that unfairly disadvantage Black individuals relative to white individuals. Both of our proposed algorithms substantially reduce the number of Black defendants detained compared to commonly used algorithms by eliminating these proxy effects. These findings suggest a fundamental rethinking of the equal protection doctrine as it applies to predictive algorithms and the folly of relying on commonly used algorithms.
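The Article does not spell out its two statistical solutions in this abstract, so the snippet below is only a generic illustration of a proxy effect and of one common mitigation (residualizing a non-race input on race), not the Article's proposed method. All variable names and the synthetic data are assumptions.

```python
# Generic illustration: a non-race input such as prior arrests can act as a proxy
# for race; residualizing it on race removes the linear component of that proxy
# effect before the feature enters a predictive model. Not the Article's own method.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
race = rng.integers(0, 2, size=n)                      # synthetic protected attribute (0/1)
prior_arrests = rng.poisson(1.0 + 1.5 * race)          # correlated with race -> proxy effect

# Fit E[prior_arrests | race] and keep only the residual variation.
group_means = np.array([prior_arrests[race == g].mean() for g in (0, 1)])
residualized = prior_arrests - group_means[race]

print(np.corrcoef(race, prior_arrests)[0, 1])          # sizeable correlation with race
print(round(np.corrcoef(race, residualized)[0, 1], 6)) # ~0 after residualizing
```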

    Towards fair budget-constrained machine learning


    The Scored Society: Due Process for Automated Predictions

    Big Data is increasingly mined to rank and rate individuals. Predictive algorithms assess whether we are good credit risks, desirable employees, reliable tenants, valuable customers—or deadbeats, shirkers, menaces, and “wastes of time.” Crucial opportunities are on the line, including the ability to obtain loans, work, housing, and insurance. Though automated scoring is pervasive and consequential, it is also opaque and lacks oversight. In one area where regulation does prevail—credit—the law focuses on credit history, not the derivation of scores from data. Procedural regularity is essential for those stigmatized by “artificially intelligent” scoring systems. The American due process tradition should inform basic safeguards. Regulators should be able to test scoring systems to ensure their fairness and accuracy. Individuals should be granted meaningful opportunities to challenge adverse decisions based on scores miscategorizing them. Without such protections in place, systems could launder biased and arbitrary data into powerfully stigmatizing scores.