A space-time conditional intensity model for infectious disease occurrence
A novel point process model continuous in space-time is proposed for infectious disease data. Modelling is based on the conditional intensity function (CIF) and extends an additive-multiplicative CIF model previously proposed for discrete space epidemic modelling. Estimation is performed by means of full maximum likelihood and a simulation algorithm is presented. The particular application of interest is the stochastic modelling of the transmission dynamics of the two most common meningococcal antigenic sequence types observed in Germany 2002–2008. Altogether, the proposed methodology represents a comprehensive and universal regression framework for the modelling, simulation and inference of self-exciting spatio-temporal point processes based on the CIF. Application is promoted by an implementation in the R package RLadyBug.
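To make the idea of a self-exciting conditional intensity concrete, the following is a minimal sketch and not the model from the paper. It assumes a constant background rate plus a sum of contributions from past events, with an exponential decay in time and an isotropic Gaussian kernel in space. The function name, the parameters `mu`, `alpha`, `beta`, `sigma` and both kernel shapes are illustrative assumptions.

```python
import math


def conditional_intensity(t, x, y, history, mu=0.1, alpha=0.5, beta=1.0, sigma=1.0):
    """Evaluate an illustrative CIF at time t and location (x, y).

    history is an iterable of (t_j, x_j, y_j) tuples of observed events.
    mu is an assumed constant background rate; alpha scales the triggering
    effect, beta is the temporal decay rate, sigma the spatial kernel width.
    """
    total = mu  # background component, taken as constant in this sketch
    for t_j, x_j, y_j in history:
        if t_j >= t:
            continue  # only strictly earlier events can excite the process
        # Exponential temporal kernel (integrates to alpha over time).
        temporal = alpha * beta * math.exp(-beta * (t - t_j))
        # Isotropic Gaussian spatial kernel (integrates to 1 over the plane).
        d2 = (x - x_j) ** 2 + (y - y_j) ** 2
        spatial = math.exp(-d2 / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)
        total += temporal * spatial
    return total
```

With an empty history the function returns the background rate alone; each past event raises the intensity near its own location for a while afterwards, which is the self-exciting behaviour the abstract refers to.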
Learning to rank spatio-temporal event hotspots
Background
Crime, traffic accidents, terrorist attacks, and other space-time random events are unevenly distributed in space and time. In the case of crime, hotspot and other proactive policing programs aim to focus limited resources at the highest risk crime and social harm hotspots in a city. A crucial step in the implementation of these strategies is the construction of scoring models used to rank spatial hotspots. While these methods are evaluated by area normalized Recall@k (called the predictive accuracy index), models are typically trained via maximum likelihood or rules of thumb that may not prioritize model accuracy in the top k hotspots. Furthermore, current algorithms are defined on fixed grids that fail to capture risk patterns occurring in neighborhoods and on road networks with complex geometries.
Results
We introduce CrimeRank, a learning-to-rank boosting algorithm for determining a crime hotspot map that directly optimizes the percentage of crime captured by the top-ranked hotspots. The method employs a floating grid combined with a greedy hotspot selection algorithm for accurately capturing spatial risk in complex geometries. We illustrate the performance using crime and traffic incident data provided by the Indianapolis Metropolitan Police Department, IED attacks in Iraq, and data from the 2017 NIJ Real-Time Crime Forecasting Challenge.
Conclusion
Our learning-to-rank strategy was the top-performing solution (PAI metric) in the 2017 challenge. We show that CrimeRank achieves even greater gains when the competition rules are relaxed by removing the constraint that grid cells be a regular tessellation.
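The predictive accuracy index mentioned in this abstract can be illustrated with a short sketch. It uses the common definition of PAI as the fraction of events captured by the flagged cells divided by the fraction of the study area those cells cover. The function names and the list-based cell representation are assumptions made for illustration; this is not the CrimeRank algorithm, only the evaluation metric.

```python
def top_k_by_score(scores, k):
    """Return the indices of the k highest-scoring cells."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def predictive_accuracy_index(cell_counts, cell_areas, flagged):
    """Area-normalized recall of the flagged cells.

    cell_counts[i] is the number of events observed in cell i,
    cell_areas[i] is the area of cell i, and flagged is an iterable
    of the cell indices chosen as hotspots.
    """
    flagged = list(flagged)
    hit_rate = sum(cell_counts[i] for i in flagged) / sum(cell_counts)
    area_fraction = sum(cell_areas[i] for i in flagged) / sum(cell_areas)
    return hit_rate / area_fraction


# Toy example: four equal-area cells, one of which holds most of the events.
counts = [8, 1, 1, 0]
areas = [1.0, 1.0, 1.0, 1.0]
scores = [0.9, 0.2, 0.1, 0.0]
hotspots = top_k_by_score(scores, k=1)
pai = predictive_accuracy_index(counts, areas, hotspots)
```

In the toy example the single flagged cell captures 80% of the events in 25% of the area. A value of 1 would mean the flagged cells do no better than their share of the area.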
Spatio-temporal clustering of natural hazards
Natural hazards are inherently spatio-temporal processes. Spatio-temporal clustering methodologies applied to natural hazard data can help distinguish clustering patterns that would not only identify point-event dense regions and time periods, but also characterise the hazardous process. In Chapter 2, spatio-temporal clustering methodologies applicable to point event and trajectory datasets representative of natural hazards are reviewed by critically examining 143 scientific publications from various fields of study. These methodologies include clustering measures that are either (i) global (providing a single quantitative measure of the degree of clustering in the dataset) or (ii) local (i.e. assigning individual point events to a cluster). A common application and analysis framework that combines global and local measures for point event data is proposed. For global measures, K-function analysis is selected; for local measures, a space-time scan statistic is selected, with kernel density estimation as an aiding methodology within the framework. For trajectories, the density-based local clustering measure Trajectory-OPTICS is selected. In Chapter 3, to assess the performance of the methodology framework, real-world natural hazard data and synthetic datasets, either representative of natural hazards or used as performance benchmarks for application, are presented and characterised. A point event dataset of 12,521 lightning strikes recorded on 1 July 2015 over the UK is selected, where a severe three-storm system crossed the region with different convective modes. It is also used as a case study together with a dataset of 77,252 lightning strikes on 28 June 2012 over the UK to characterise and model lightning strikes as point events produced by a moving source. Each source has a set number of point events, initiation point in space and time, movement speed, direction, inter-event time distribution and spatial spread distribution.
Movement speed, inter-event time and spatial spread distributions are characterised based on the two case studies. Inter-event time values range from below 0.01 s to over 100 s for individual storms from both case studies. A least-squares plane fit in the spatio-temporal domain estimates a range of representative movement speed values of 47–60 km h⁻¹ for the first and 66–111 km h⁻¹ for the second case study. Based on these values, single-storm (Model 3) and three-storm (Model 4) models are generated to form a simulation study of point event datasets representing various physical lightning characteristics, each with three variations in their movement speed and spatial spread input parameters. For trajectories, the Atlantic hurricane database (HURDAT2) is used to select a real-world dataset of 316 hurricanes. Homogeneous and clustered trajectory datasets are generated as benchmarks for Trajectory-OPTICS. In Chapter 4, the clustering methodology framework identified in Chapter 2 is applied to all the real-world and synthetic datasets presented in Chapter 3. K-function analysis results are used to inform the range of bandwidth values for the kernel density estimation. A leave-one-out estimator is used to find the optimal values. A value threshold on the probability density values from the kernel density estimation is imposed to identify high probability density space-time volumes. These volumes are used as centroids for applying the scan statistic as a local clustering measure. The elliptic scan statistic is unable to identify individual lightning strike clusters within the same storm source for storm sources with small temporal separation (Model 4). Chapter 5 extends the elliptic scan statistic by including an "Inclination height" parameter as the temporal distance between the major axis points of the ellipse basis.
With detailed selection of input parameter ranges, the inclined elliptic scan statistic is applied to Model 4 and its variations. It is able to identify point event clusters produced by a moving source, and the point events assigned to each cluster come from the same storm source.
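The least-squares plane fit used above to estimate movement speed can be sketched as follows. The abstract does not say how the plane is parameterised, so this example assumes the form t = a·x + b·y + c, where (a, b) is the gradient of event time over space and the speed of the advancing front is 1 / sqrt(a² + b²). The function name and the input layout are hypothetical, and real strike data would first need projecting to planar coordinates.

```python
import math


def fit_movement_speed(events):
    """Fit t = a*x + b*y + c by least squares and return (speed, direction).

    events is a list of (x, y, t) tuples, e.g. x and y in km and t in hours,
    in which case the returned speed is in km per hour. direction is a unit
    vector pointing the way the source moves.
    """
    n = len(events)
    mx = sum(e[0] for e in events) / n
    my = sum(e[1] for e in events) / n
    mt = sum(e[2] for e in events) / n
    # Centred sums of squares and cross-products (normal equations for a, b).
    sxx = sum((e[0] - mx) ** 2 for e in events)
    syy = sum((e[1] - my) ** 2 for e in events)
    sxy = sum((e[0] - mx) * (e[1] - my) for e in events)
    sxt = sum((e[0] - mx) * (e[2] - mt) for e in events)
    syt = sum((e[1] - my) * (e[2] - mt) for e in events)
    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("event locations are collinear; the plane is not identifiable")
    a = (sxt * syy - syt * sxy) / det
    b = (syt * sxx - sxt * sxy) / det
    slowness = math.hypot(a, b)  # time elapsed per unit distance travelled
    if slowness == 0:
        raise ValueError("no temporal trend across space; speed is undefined")
    return 1.0 / slowness, (a / slowness, b / slowness)


# Synthetic check: a source moving along +x at 50 km/h, with some spread in y.
synthetic = [(0, 0, 0.0), (10, 5, 0.2), (20, -3, 0.4), (30, 2, 0.6), (40, -1, 0.8)]
speed, direction = fit_movement_speed(synthetic)
```

On noisy data, regressing time on location tends to underestimate the gradient and therefore overestimate the speed, which is one reason a range of representative values is more defensible than a single figure.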
Surveillance to detect emerging space time clusters.
The interest is in monitoring incoming space-time events to detect an emergent
space-time cluster as early as possible. Assume that point process events are continuously
recorded in space and time. At some unknown moment, a small localized cluster of
increased intensity starts to emerge. Its location is also unknown. The aim is for an alarm
to go off as soon as possible after the cluster emerges, while avoiding false alarms.
The alarm system should also provide an estimate of the cluster location. In addition,
the alarm system should take into account the purely spatial and the purely temporal
heterogeneity, which are not specified by the user. A space-time surveillance system
with these characteristics is proposed, using a martingale approach to derive the
properties of the surveillance system. The average run length for the situation when
clusters are present in the data is appropriately defined and the method is illustrated
in practice. The algorithm is implemented in freely available stand-alone software and is
also a feature of a freely available GIS system.
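The trade-off between early detection and false alarms can be illustrated with a classical sequential statistic. The abstract does not name the statistic used, so the sketch below swaps in the purely temporal Shiryaev-Roberts recursion as a stand-in: R_n = (1 + R_{n-1}) · Λ_n, where Λ_n is the likelihood ratio of the n-th observation under "cluster present" versus "no cluster". When no cluster is present, R_n − n is a martingale, which is what ties the alarm threshold to the expected time until a false alarm. The function name and inputs are hypothetical, and the space-time extension and location estimate described in the abstract are not reproduced.

```python
def shiryaev_roberts_alarm(likelihood_ratios, threshold):
    """Return the 1-based index of the first alarm, or None if none is raised.

    likelihood_ratios yields, for each incoming event, the ratio of its
    likelihood under the cluster hypothesis to its likelihood under the
    no-cluster hypothesis. threshold is the alarm level; a higher value
    means fewer false alarms but slower detection.
    """
    r = 0.0
    for n, ratio in enumerate(likelihood_ratios, start=1):
        # Shiryaev-Roberts update: accumulates evidence over all possible
        # change times up to and including the current event.
        r = (1.0 + r) * ratio
        if r >= threshold:
            return n
    return None


# Evidence that favours a cluster triggers the alarm quickly...
fast = shiryaev_roberts_alarm([2.0, 2.0, 2.0], threshold=5.0)
# ...while evidence against it keeps the statistic below the threshold.
quiet = shiryaev_roberts_alarm([0.5] * 10, threshold=3.0)
```

In the first call the statistic takes the values 2 and then 6, so the alarm is raised at the second event; in the second call it stays below 1 and no alarm is raised.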