Statistical modelling of spatio-temporal decision data
This thesis consists of three independent research studies in the fields of statistical and behavioural science. Each study is concerned with modelling complex spatio-temporal decisions recorded in police data. Analysing decisions at a high resolution requires a comprehensive understanding of the social phenomenon and data-generating mechanism, combined with careful modelling choices.
Chapter 1 presents a novel model of officer-level ethnic bias in stop and search. Using a Bayesian hierarchical model, we model officer over-searching against two officer-specific baselines: the crime suspects that the officer encounters and the local patrolling area of the officer. We find that most police officers are biased against Black and Asian people in their search decisions, independently of which baseline we use. Furthermore, we decompose bias against ethnic minority groups into bias due to officer over-searching and bias due to over-patrolling.
Chapter 2 showcases the use of a spatio-temporal Hawkes-type point process to model the reporting of domestic abuse. Extending existing Hawkes models, we test for the existence of two spillover channels in crime victim reporting. Despite well-documented spillover effects in other human behaviour, we find no evidence to support such effects in the reporting of domestic abuse.
Chapter 3 introduces a new, robust statistical inference procedure for discrete outcomes. We propose using the Total Variation Distance together with Bayesian Nonparametric Learning to robustify inference. We show that this procedure possesses a range of desirable theoretical properties. Furthermore, we demonstrate on simulated data that our method outperforms standard inference in terms of both inference and out-of-sample performance. Lastly, we show that robust inference is important for modelling police-recorded incidence of sexual offences, where fluctuations in reporting can drastically affect inference.
I conclude by discussing the importance of sophisticated statistical approaches that reflect the often complicated underlying social phenomena and the equally complex processes by which they are recorded in data.
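As a rough illustration of the Total Variation Distance mentioned for Chapter 3, the sketch below computes it between an empirical distribution of discrete outcomes and a candidate model distribution. It shows only the distance itself, not the thesis's inference procedure or its Bayesian Nonparametric Learning component; the function names and the toy data are assumptions made for this example.

```python
from collections import Counter


def empirical_pmf(samples):
    """Relative frequency of each distinct outcome in a list of samples."""
    counts = Counter(samples)
    n = len(samples)
    return {outcome: count / n for outcome, count in counts.items()}


def total_variation_distance(p, q):
    """TVD between two discrete distributions given as {outcome: probability}.

    For discrete outcomes, TVD(p, q) = 0.5 * sum_x |p(x) - q(x)|,
    taken over the union of both supports.
    """
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


# Toy data (hypothetical): counts with one outlying value, 9.
observed = empirical_pmf([0, 0, 1, 1, 1, 2, 2, 9])
# Hypothetical candidate model over the outcomes 0, 1, 2.
model = {0: 0.25, 1: 0.5, 2: 0.25}

tvd = total_variation_distance(observed, model)
print(f"TVD between data and model: {tvd:.3f}")
```

Because the distance is bounded between 0 and 1, a single extreme observation can shift it by at most its own share of the sample, which is one intuition for why it may suit data affected by reporting fluctuations.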
Officer bias, over-patrolling, and ethnic disparities in stop and search
Black and Asian people in the United Kingdom are more likely to be stopped and searched by police than White people. Following a panel of 36,000 searches by 1,100 police officers at a major English police force, we provide officer-specific measures of over-searching relative to two baselines: the ethnic composition of crime suspects officers interact with and the ethnic composition of the areas they patrol. We show that the vast majority of officers over-search ethnic minorities against both baselines. But we also find that the over-searching by individual officers cannot account for all of the over-representation of ethnic minorities in stop and search: over-patrolling of minority areas is also a key factor. Decomposing the overall search bias, we find that the over-representation of Asian people in stop and search is primarily accounted for by over-patrolling, while the over-representation of Black people is a combination of officer and patrol effects, with the larger contribution coming from biases of officers.
Improving crime count forecasts using Twitter and taxi data
Data from social media has created opportunities to understand how and why people move through their urban environment and how this relates to criminal activity. To aid resource allocation decisions in the scope of predictive policing, the paper proposes an approach to predict weekly crime counts. The novel approach captures spatial dependency of criminal activity through approximating human dynamics. It integrates point of interest data in the form of Foursquare venues with Twitter activity and taxi trip data, and introduces a set of approaches to create features from these data sources. Empirical results demonstrate the explanatory and predictive power of the novel features. Analysis of a six-month period of real-world crime data for the city of New York shows that both temporal and static features are necessary to effectively account for human dynamics and predict crime counts accurately. Furthermore, results provide new evidence on the underlying mechanisms of crime and have implications for crime analysis and intervention.
Marketplaces for Digital Data: Quo Vadis?
The newly emerging market for data has so far been insufficiently researched. The survey presented in this work - which is the third iteration of a series of studies that started in 2012 - intends to provide a deeper understanding of this emerging type of market. Research questions concerning the provider manifestations and the commoditization of data are identified. The findings indicate that data providers focus on limited business models and that data remains individualized and differentiated. Nevertheless, a trend towards commoditization for certain types of data can be foreseen, which even allows an outlook on further developments in this area.
Estimating carbon footprints from large scale financial transaction data
Financial transactions are increasingly used by consumer apps and financial service providers to estimate consumption-based carbon emissions. This method promises a low-resource, ultra-fast, and highly scalable approach to measuring emissions at different levels of potential policy intervention, spanning the national, subnational, local, and individual level. Despite this potential, there is a lack of research exploring the validity of this approach to carbon profiling. Here we address this oversight in three ways. First, we provide a step-by-step description of our approach toward estimating carbon footprints from micro-level transaction data generated by more than 100,000 customers of a large retail bank in the United Kingdom. Second, we quantitatively compare emission estimates obtained from transaction data with those calculated from a more standard data source used in carbon profiling, the largest household expenditure survey in the United Kingdom. Third, we offer a detailed qualitative comparison of the advantages and disadvantages of transactions versus alternative data sources (such as survey data), across key dimensions including data availability, data quality, and data detail. We find that financial transactions offer a credible alternative to survey-based sources and, if made more widely accessible, could provide important advantages for profiling emissions. These include objective, micro-level data on consumption behaviors, larger sample sizes, and longitudinal, frequent data capture.
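The abstract does not spell out the estimation steps, so the following is only a minimal sketch of one common spend-based approach: each transaction is mapped to a spending category and multiplied by an emission factor for that category. The category names, the factor values (in kg CO2e per pound spent), and the function name are all hypothetical placeholders, not figures or methods taken from the paper.

```python
def estimate_footprint(transactions, emission_factors):
    """Spend-based footprint: sum of amount * factor over categorised spend.

    transactions: iterable of (category, amount_in_gbp) pairs.
    emission_factors: {category: kg CO2e per GBP} (hypothetical values).
    Returns (total_kg_co2e, unmapped_spend) so that spend without a
    factor is reported rather than silently counted as zero emissions.
    """
    total = 0.0
    unmapped = 0.0
    for category, amount in transactions:
        factor = emission_factors.get(category)
        if factor is None:
            unmapped += amount
        else:
            total += amount * factor
    return total, unmapped


# Hypothetical emission factors, for illustration only.
factors = {"groceries": 0.5, "fuel": 2.0, "clothing": 0.25}

# Hypothetical transactions for a single customer.
customer = [
    ("groceries", 40.0),
    ("fuel", 30.0),
    ("clothing", 20.0),
    ("cash_withdrawal", 50.0),  # no factor available for this category
]

total_kg, unmapped_gbp = estimate_footprint(customer, factors)
print(f"Estimated footprint: {total_kg} kg CO2e; unmapped spend: {unmapped_gbp} GBP")
```

Reporting unmapped spend separately makes visible how much of a customer's expenditure the estimate cannot cover, which bears on the data-quality comparison the abstract describes.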
Estimating carbon footprints from large scale financial transaction data
Financial transactions are increasingly used by consumer apps and financial services providers to estimate consumption-based carbon emissions. This method promises a low-resource, ultra-fast and highly scalable approach to measuring emissions at different levels of potential policy intervention, spanning the national, subnational, local, and individual level. Despite this potential, there is a lack of research formally exploring the validity of this approach to carbon profiling. Here we address this oversight in two ways. Firstly, by using transactions from more than 100,000 customers of a large retail bank, we quantitatively compare emission estimates with those calculated from a more standard data source used in carbon profiling, the UK household expenditure survey. Secondly, we offer a detailed qualitative comparison of the advantages and disadvantages of transactions versus alternatives (such as survey data), across dimensions including data availability, data quality and data detail. We find that financial transactions offer a credible alternative to survey-based sources and, if made more widely accessible, provide important advantages for profiling emissions. These include objective, micro-level data on consumption behaviours, larger sample sizes and longitudinal, frequent data capture.