Latent Self-Exciting Point Process Model for Spatial-Temporal Networks
We propose a latent self-exciting point process model that describes
geographically distributed interactions between pairs of entities. In contrast
to most existing approaches that assume fully observable interactions, here we
consider a scenario where certain interaction events lack information about
participants. Instead, this information needs to be inferred from the available
observations. We develop an efficient approximate algorithm based on
variational expectation-maximization to infer unknown participants in an event
given the location and the time of the event. We validate the model on
synthetic as well as real-world data, and obtain very promising results on the
identity-inference task. We also use our model to predict the timing and
participants of future events, and demonstrate that it compares favorably with
baseline approaches.
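The abstract does not give the model's equations, so the following is only a rough, hypothetical illustration of the two ingredients it names: a self-exciting (Hawkes-style) intensity for each pair of entities, and an E-step-like posterior over which pair produced an event observed only through its time and location. The parameter names (mu, alpha, omega), the exponential-in-time and Gaussian-in-space kernels, and the function names are assumptions, not the authors' specification.

```python
# Illustrative sketch (not the authors' code): a simplified self-exciting
# intensity per pair, and an E-step-style posterior over unknown participants.
import numpy as np

def pair_intensity(t, x, past_events, mu=0.1, alpha=0.5, omega=1.0, sigma=1.0):
    """Conditional intensity of one pair at time t and location x,
    given that pair's past events [(t_i, x_i), ...]. Kernels are assumed."""
    rate = mu
    for t_i, x_i in past_events:
        if t_i < t:
            temporal = alpha * omega * np.exp(-omega * (t - t_i))
            spatial = np.exp(-np.sum((x - x_i) ** 2) / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
            rate += temporal * spatial
    return rate

def participant_posterior(t, x, histories):
    """E-step-style responsibilities: probability that each candidate pair
    produced an event observed only through its time and location."""
    rates = np.array([pair_intensity(t, x, h) for h in histories])
    return rates / rates.sum()

# Usage: two candidate pairs, one with recent nearby activity.
histories = [
    [(0.5, np.array([0.0, 0.0])), (0.9, np.array([0.1, 0.0]))],  # active, nearby pair
    [(0.1, np.array([5.0, 5.0]))],                               # distant pair
]
print(participant_posterior(1.0, np.array([0.0, 0.1]), histories))
```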
Multivariate Spatiotemporal Hawkes Processes and Network Reconstruction
There is often latent network structure in spatial and temporal data and the
tools of network analysis can yield fascinating insights into such data. In
this paper, we develop a nonparametric method for network reconstruction from
spatiotemporal data sets using multivariate Hawkes processes. In contrast to
prior work on network reconstruction with point-process models, which has often
focused on exclusively temporal information, our approach uses both temporal
and spatial information and does not assume a specific parametric form of
network dynamics. This leads to an effective way of recovering an underlying
network. We illustrate our approach using both synthetic networks and networks
constructed from real-world data sets (a location-based social media network, a
narrative of crime events, and violent gang crimes). Our results demonstrate
that, in comparison to using only temporal data, our spatiotemporal approach
yields improved network reconstruction, providing a basis for meaningful
subsequent analysis, such as community structure and motif analysis, of
the reconstructed networks.
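As a rough sketch of the general approach, the fragment below computes EM-style "branching" probabilities for a multivariate spatiotemporal Hawkes process and aggregates them into a weighted network between nodes. It assumes a parametric exponential-in-time, Gaussian-in-space kernel purely for brevity; the paper's method is nonparametric and does not fix this form.

```python
# Minimal sketch (not the paper's nonparametric estimator): expected
# event-to-event triggering probabilities, summed by node pair, give a
# reconstructed weighted network.
import numpy as np

def triggering_matrix(times, locs, nodes, mu, A, omega=1.0, sigma=1.0):
    """P[i, j+1] ~ probability event i was triggered by earlier event j;
    column 0 holds the background-process contribution."""
    n = len(times)
    P = np.zeros((n, n + 1))
    for i in range(n):
        P[i, 0] = mu[nodes[i]]  # background rate of event i's node
        for j in range(i):
            dt = times[i] - times[j]
            dx2 = np.sum((locs[i] - locs[j]) ** 2)
            P[i, j + 1] = (A[nodes[j], nodes[i]]
                           * omega * np.exp(-omega * dt)
                           * np.exp(-dx2 / (2 * sigma ** 2)))
        P[i] /= P[i].sum()
    return P

def reconstruct_network(P, nodes, n_nodes):
    """Sum triggering probabilities by (source node, target node)."""
    W = np.zeros((n_nodes, n_nodes))
    for i in range(len(nodes)):
        for j in range(i):
            W[nodes[j], nodes[i]] += P[i, j + 1]
    return W

# Tiny usage: three events on two nodes with assumed parameters.
times = np.array([0.0, 0.4, 0.5])
locs = np.array([[0.0, 0.0], [0.1, 0.0], [2.0, 2.0]])
nodes = np.array([0, 1, 0])
mu = np.array([0.1, 0.1])
A = np.full((2, 2), 0.5)
print(reconstruct_network(triggering_matrix(times, locs, nodes, mu, A), nodes, n_nodes=2))
```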
Crime Topic Modeling
The classification of crime into discrete categories entails a massive loss
of information. Crimes emerge out of a complex mix of behaviors and situations,
yet most of these details cannot be captured by singular crime type labels.
This information loss impacts our ability to not only understand the causes of
crime, but also how to develop optimal crime prevention strategies. We apply
machine learning methods to short narrative text descriptions accompanying
crime records with the goal of discovering ecologically more meaningful latent
crime classes. We term these latent classes "crime topics" in reference to
text-based topic modeling methods that produce them. We use topic distributions
to measure clustering among formally recognized crime types. Crime topics
replicate broad distinctions between violent and property crime, but also
reveal nuances linked to target characteristics, situational conditions and the
tools and methods of attack. Formal crime types are not discrete in topic
space. Rather, crime types are distributed across a range of crime topics.
Similarly, individual crime topics are distributed across a range of formal
crime types. Key ecological groups include identity theft, shoplifting,
burglary and theft, car crimes and vandalism, criminal threats and confidence
crimes, and violent crimes. Though not a replacement for formal legal crime
classifications, crime topics provide a unique window into the heterogeneous
causal processes underlying crime.
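A minimal, hypothetical sketch of the kind of pipeline the abstract describes, using LDA from scikit-learn on invented placeholder narratives; the paper's preprocessing, vocabulary choices, and number of topics are not specified here.

```python
# Hedged sketch: extract "crime topics" from short narrative texts with LDA.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

narratives = [
    "suspect broke rear window and removed laptop from residence",
    "victim's credit card used without authorization at online retailer",
    "suspect threatened victim with knife and demanded wallet",
]

vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(narratives)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)   # per-narrative topic distributions

# Topic distributions can then be aggregated by formal crime type to measure
# how tightly each type clusters in topic space.
print(doc_topics.round(2))
```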
Reducing Bias in Estimates for the Law of Crime Concentration
Objectives
The law of crime concentration states that half of the cumulative crime in a city will occur within approximately 4% of the city’s geography. The law is demonstrated by counting the number of incidents in each of N spatial areas (street segments or grid cells) and then computing a parameter based on the counts, such as a point estimate on the Lorenz curve or the Gini index. Here we show that estimators commonly used in the literature for these statistics are biased when the number of incidents is low (several thousand or less). Our objective is to significantly reduce bias in estimators for the law of crime concentration.
Methods
By modeling crime counts as a negative binomial, we show how to compute an improved estimate of the law of crime concentration at low event counts that significantly reduces bias. In particular, we use the Poisson–Gamma representation of the negative binomial and compute the concentration statistic via integrals for the Lorenz curve and Gini index of the inferred continuous Gamma distribution.
Results
We illustrate the Poisson–Gamma method with synthetic data along with homicide data from Chicago. We show that our estimator significantly reduces bias and is able to recover the true law of crime concentration with only several hundred events.
Conclusions
The Poisson–Gamma method has applications to measuring the concentration of rare events, comparing concentration across cities of different sizes, and improving time series estimates of crime concentration.
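The Methods paragraph above can be illustrated with a short, hedged sketch: fit a negative binomial to the per-cell counts (a moment estimator is assumed below; the paper's fitting procedure may differ), then compute the Gini index of the inferred Gamma rate distribution in closed form rather than from the raw, noisy counts. For a Gamma distribution with shape k, the Gini coefficient is Γ(k + 1/2) / (Γ(k + 1) √π), which the code uses directly.

```python
# Hedged sketch of the Poisson-Gamma idea: counts ~ Negative Binomial
# (Poisson rates drawn from a Gamma); report the Gini of the latent Gamma.
import numpy as np
from scipy.special import gammaln

def gamma_gini(shape_k):
    """Gini coefficient of a Gamma distribution with shape k (scale-free)."""
    return np.exp(gammaln(shape_k + 0.5) - gammaln(shape_k + 1.0)) / np.sqrt(np.pi)

def concentration_estimate(counts):
    """Moment-based Negative Binomial fit, then analytic Gini of the rates."""
    counts = np.asarray(counts, dtype=float)
    m, v = counts.mean(), counts.var(ddof=1)
    if v <= m:            # no overdispersion detected: rates effectively uniform
        return 0.0
    k = m ** 2 / (v - m)  # NB shape (overdispersion) parameter
    return gamma_gini(k)

# Example: sparse counts over 1,000 cells, compared with the true Gamma Gini.
rng = np.random.default_rng(0)
rates = rng.gamma(shape=0.3, scale=2.0, size=1000)
counts = rng.poisson(rates)
print(concentration_estimate(counts), gamma_gini(0.3))
```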
Towards understanding crime dynamics in a heterogeneous environment: A mathematical approach
Crime data provides information on the nature and location of the crime but, in general, does not include information on the number of criminals operating in a region. By contrast, many approaches to crime reduction necessarily involve working with criminals or individuals at risk of engaging in criminal activity, and so the dynamics of the criminal population is important. With this in mind, we develop a mechanistic, mathematical model which combines the number of crimes and the number of criminals to create a dynamical system. Analysis of the model highlights a threshold for criminal efficiency, below which criminal numbers will settle to an equilibrium level that can be exploited to reduce crime through prevention. This efficiency measure arises from the initiation of new criminals in response to observation of criminal activity; other initiation routes - via opportunism or peer pressure - do not exhibit such thresholds, although they do affect the level of criminal activity observed. We used data from Cape Town, South Africa, to obtain parameter estimates and predicted that the number of criminals in the region is tending towards an equilibrium point, but in a heterogeneous manner - a drop in the number of criminals in low-crime neighbourhoods is being offset by an increase in high-crime neighbourhoods.
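The abstract does not reproduce the model's equations, so the following is only a toy dynamical system, not the paper's model. It illustrates the kind of threshold in criminal efficiency the abstract describes; all functional forms, parameter names, and values are assumed.

```python
# Purely illustrative toy model: criminals N are recruited at a rate driven by
# observed criminal activity (efficiency * N) plus constant opportunism, and
# desist at a constant per-capita rate. Below the desistance rate the system
# settles to an equilibrium; above it, criminal numbers grow.
import numpy as np
from scipy.integrate import solve_ivp

def toy_crime_model(t, y, efficiency, opportunism, desistance):
    N = y[0]                                    # number of active criminals
    recruitment = efficiency * N + opportunism  # observation-driven + opportunistic initiation
    return [recruitment - desistance * N]

for eff in (0.05, 0.2):  # below vs above the assumed desistance rate of 0.1
    sol = solve_ivp(toy_crime_model, (0.0, 100.0), [50.0], args=(eff, 1.0, 0.1))
    print(f"efficiency={eff}: N(100) = {sol.y[0, -1]:.1f}")
```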
A Penalized Likelihood Method for Balancing Accuracy and Fairness in Predictive Policing
Racial bias of predictive policing algorithms has been the focus of recent research and, in the case of Hawkes processes, feedback loops are possible where biased arrests are amplified through self-excitation, leading to hotspot formation and further arrests of minority populations. In this article we develop a penalized likelihood approach for introducing fairness into point process models of crime. In particular, we add a penalty term to the likelihood function that encourages the amount of police patrol received by each of several demographic groups to be proportional to the representation of that group in the total population. We apply our model to historical crime incident data in Indianapolis and measure the fairness and accuracy of the penalized and unpenalized approaches across several crime categories. We show that fairness can be introduced into point process models of crime so that patrol levels proportionally match demographics, though at a cost of reduced accuracy.
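As a hedged sketch of the penalized-likelihood idea, the fragment below subtracts from a point process log-likelihood a penalty on the gap between each demographic group's share of patrol and its share of the population. The squared-error form of the penalty, the weight lam, and the grouping scheme are assumptions for illustration, not the paper's definition.

```python
# Sketch: penalized objective = model fit - lam * fairness penalty.
import numpy as np

def fairness_penalty(patrol_by_group, population_by_group):
    """Squared deviation between patrol shares and population shares."""
    patrol = np.asarray(patrol_by_group, dtype=float)
    pop = np.asarray(population_by_group, dtype=float)
    return np.sum((patrol / patrol.sum() - pop / pop.sum()) ** 2)

def penalized_objective(log_likelihood, patrol_by_group, population_by_group, lam=10.0):
    """Objective to maximize when fitting the point process model."""
    return log_likelihood - lam * fairness_penalty(patrol_by_group, population_by_group)

# Example: patrol concentrated on one group relative to its population share.
print(penalized_objective(-1250.0, patrol_by_group=[70, 30], population_by_group=[50, 50]))
```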
Semi-Supervised First-Person Activity Recognition in Body-Worn Video
Body-worn cameras are now commonly used for logging daily life, sports, and
law enforcement activities, creating a large volume of archived footage. This
paper studies the problem of classifying frames of footage according to the
activity of the camera-wearer with an emphasis on application to real-world
police body-worn video. Real-world datasets pose a different set of challenges
from existing egocentric vision datasets: the amount of footage of different
activities is unbalanced, the data contains personally identifiable
information, and in practice it is difficult to provide substantial training
footage for a supervised approach. We address these challenges by extracting
features based exclusively on motion information and then segmenting the video
footage using a semi-supervised classification algorithm. On publicly available
datasets, our method achieves results comparable to, if not better than,
supervised and/or deep learning methods using a fraction of the training data.
It also shows promising results on real-world police body-worn video.
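A minimal sketch of the second stage under stated assumptions: given per-frame motion features computed elsewhere (for example, pooled optical-flow statistics), propagate a handful of activity labels to the unlabeled frames with a graph-based semi-supervised classifier. LabelSpreading from scikit-learn stands in for the paper's algorithm, and the synthetic features below are placeholders.

```python
# Hedged sketch: semi-supervised frame labeling from motion features.
import numpy as np
from sklearn.semi_supervised import LabelSpreading

rng = np.random.default_rng(0)

# Placeholder motion features for 200 frames spanning two synthetic "activities".
features = np.vstack([rng.normal(0.0, 1.0, size=(100, 8)),
                      rng.normal(3.0, 1.0, size=(100, 8))])
labels = np.full(200, -1)          # -1 marks unlabeled frames
labels[:5] = 0                     # a few labeled frames for activity 0
labels[100:105] = 1                # a few labeled frames for activity 1

model = LabelSpreading(kernel="rbf", gamma=0.5)
model.fit(features, labels)
predicted = model.transduction_    # inferred activity for every frame
print(np.bincount(predicted))
```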