Mining large-scale human mobility data for long-term crime prediction
Traditional crime prediction models based on census data are limited, as they
fail to capture the complexity and dynamics of human activity. With the rise of
ubiquitous computing, there is the opportunity to improve such models with data
that make for better proxies of human presence in cities. In this paper, we
leverage large human mobility data to craft an extensive set of features for
crime prediction, as informed by theories in criminology and urban studies. We
employ averaging and boosting ensemble techniques from machine learning, to
investigate their power in predicting yearly counts for different types of
crimes occurring in New York City at census tract level. Our study shows that
spatial and spatio-temporal features derived from Foursquare venues and
checkins, subway rides, and taxi rides, improve the baseline models relying on
census and POI data. The proposed models achieve absolute R^2 metrics of up to
65% (on a geographical out-of-sample test set) and up to 89% (on a temporal
out-of-sample test set). This proves that, next to the residential population
of an area, the ambient population there is strongly predictive of the area's
crime levels. We deep-dive into the main crime categories, and find that the
predictive gain of the human dynamics features varies across crime types: such
features bring the biggest boost in case of grand larcenies, whereas assaults
are already well predicted by the census features. Furthermore, we identify and
discuss top predictive features for the main crime categories. These results
offer valuable insights for those responsible for urban policy or law
enforcement.
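As a rough illustration of the R^2 metric quoted in this abstract, the coefficient of determination on a held-out set can be computed as sketched below. The function name `r_squared` and the sample counts are hypothetical placeholders chosen for illustration; they are not data or code from the paper.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up yearly crime counts for four held-out census tracts.
held_out_counts = [10, 20, 30, 40]
predicted_counts = [12, 18, 33, 37]
score = r_squared(held_out_counts, predicted_counts)
```

A score of 1 means perfect prediction, while 0 means the model does no better than predicting the mean count for every tract.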
Modeling Taxi Drivers' Behaviour for the Next Destination Prediction
In this paper, we study how to model taxi drivers' behaviour and geographical
information for an interesting and challenging task: the next destination
prediction in a taxi journey. Predicting the next location is a well studied
problem in human mobility, which finds several applications in real-world
scenarios, from optimizing the efficiency of electronic dispatching systems to
predicting and reducing traffic jams. This task is normally modeled as a
multiclass classification problem, where the goal is to select, among a set of
already known locations, the next taxi destination. We present a Recurrent
Neural Network (RNN) approach that models the taxi drivers' behaviour and
encodes the semantics of visited locations by using geographical information
from Location-Based Social Networks (LBSNs). In particular, RNNs are trained to
predict the exact coordinates of the next destination, overcoming the problem
of producing, in output, a limited set of locations, seen during the training
phase. The proposed approach was tested on the ECML/PKDD Discovery Challenge
2015 dataset, based on the city of Porto, where it obtained better results than
the competition winner whilst using less information, and on Manhattan and San
Francisco datasets.
Comment: preprint version of a paper submitted to IEEE Transactions on
Intelligent Transportation System
Melding the Data-Decisions Pipeline: Decision-Focused Learning for Combinatorial Optimization
Creating impact in real-world settings requires artificial intelligence
techniques to span the full pipeline from data, to predictive models, to
decisions. These components are typically approached separately: a machine
learning model is first trained via a measure of predictive accuracy, and then
its predictions are used as input into an optimization algorithm which produces
a decision. However, the loss function used to train the model may easily be
misaligned with the end goal, which is to make the best decisions possible.
Hand-tuning the loss function to align with optimization is a difficult and
error-prone process (which is often skipped entirely).
We focus on combinatorial optimization problems and introduce a general
framework for decision-focused learning, where the machine learning model is
directly trained in conjunction with the optimization algorithm to produce
high-quality decisions. Technically, our contribution is a means of integrating
common classes of discrete optimization problems into deep learning or other
predictive models, which are typically trained via gradient descent. The main
idea is to use a continuous relaxation of the discrete problem to propagate
gradients through the optimization procedure. We instantiate this framework for
two broad classes of combinatorial problems: linear programs and submodular
maximization. Experimental results across a variety of domains show that
decision-focused learning often leads to improved optimization performance
compared to traditional methods. We find that standard measures of accuracy are
not a reliable proxy for a predictive model's utility in optimization, and our
method's ability to specify the true goal as the model's training objective
yields substantial dividends across a range of decision problems.
Comment: Full version of paper accepted at AAAI 201
Multi Stage based Time Series Analysis of User Activity on Touch Sensitive Surfaces in Highly Noise Susceptible Environments
This article proposes a multistage framework for time series analysis of user
activity on touch-sensitive surfaces in noisy environments. Multiple methods
are combined in a multistage framework, including moving average, moving
median, linear regression, kernel density estimation, partial differential
equations and Kalman filtering. The proposed three-stage filter, consisting of
partial differential equation based denoising, a Kalman filter and a moving
average, provides ~25% better noise reduction than other methods according to
the Mean Squared Error (MSE) criterion in highly noise-susceptible
environments. Apart from synthetic data, we also collected real-world data
such as handwriting and finger/stylus drags on touch screens in the presence of
high noise, such as unauthorized charger noise or display noise, and used it to
validate our algorithms. Furthermore, the proposed algorithm performs
qualitatively better than the existing solutions for touch panels of the
high-end handheld devices available in the consumer electronics market.
Comment: 9 pages (including 9 figures and 3 tables); International Journal of
Computer Applications (published
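The three-stage idea in this abstract (PDE-based denoising, then a Kalman filter, then a moving average) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the PDE stage is assumed to be a plain 1-D heat-equation smoother, the Kalman filter assumes a scalar random-walk state, and all function names and parameter values (`steps`, `dt`, `q`, `r`, `window`) are hypothetical.

```python
import random


def diffuse(signal, steps=10, dt=0.2):
    """Stage 1 (assumed form): explicit 1-D heat-equation smoothing."""
    u = list(signal)
    for _ in range(steps):
        nxt = u[:]
        for i in range(1, len(u) - 1):  # endpoints are left unchanged
            nxt[i] = u[i] + dt * (u[i - 1] - 2 * u[i] + u[i + 1])
        u = nxt
    return u


def kalman(signal, q=1e-3, r=0.5):
    """Stage 2: scalar Kalman filter with a random-walk state model."""
    x, p = signal[0], 1.0  # initial state estimate and its variance
    out = []
    for z in signal:
        p += q                 # predict: process noise inflates variance
        k = p / (p + r)        # Kalman gain
        x += k * (z - x)       # update with the measurement residual
        p *= 1 - k
        out.append(x)
    return out


def moving_average(signal, window=5):
    """Stage 3: trailing moving average over up to `window` samples."""
    out = []
    for i in range(len(signal)):
        chunk = signal[max(0, i - window + 1):i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def three_stage(signal):
    return moving_average(kalman(diffuse(signal)))


def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def demo(seed=0):
    """Compare raw and filtered MSE on a synthetic noisy finger drag."""
    rng = random.Random(seed)
    truth = [i / 200 for i in range(200)]           # slow synthetic drag
    noisy = [t + rng.gauss(0, 0.5) for t in truth]  # additive Gaussian noise
    return mse(noisy, truth), mse(three_stage(noisy), truth)
```

Calling `demo()` returns the MSE before and after filtering on synthetic data. The size of the improvement depends entirely on the assumed parameters and says nothing about the ~25% figure reported in the abstract.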
Towards Prediction of Non-Radiative Decay Pathways in Organic Compounds I: The Case of Naphthalene Quantum Yields
Many emerging technologies depend on our ability to control and manipulate the excited-state properties of molecular systems. These technologies include fluorescent labeling in biomedical imaging, light harvesting in photovoltaics, and electroluminescence in light-emitting devices. All of these systems suffer from non-radiative loss pathways that dissipate electronic energy as heat, which causes the overall system efficiency to be directly linked to the quantum yield (Φ) of the molecular excited state. Unfortunately, Φ is very difficult to predict from first principles because the description of a slow non-radiative decay mechanism requires an accurate description of long-timescale excited-state quantum dynamics. In the present study, we introduce an efficient semiempirical method of calculating the fluorescence quantum yield (Φfl) for molecular chromophores, which, based on machine learning, converts simple electronic energies computed using time-dependent density functional theory (TDDFT) into an estimate of Φfl. As with all machine learning strategies, the algorithm needs to be trained on fluorescent dyes for which Φfl values are known, so as to provide a black-box method which can later predict Φfl values for chemically similar chromophores that have not been studied experimentally. As a first illustration of how our proposed algorithm can be trained, we examine a family of 25 naphthalene derivatives. The simplest application of the energy gap law is found to be inadequate to explain the rates of internal conversion (IC) or intersystem crossing (ISC); the electronic properties of at least one higher-lying electronic state (Sn or Tn) or one far-from-equilibrium geometry are typically needed to obtain accurate results. Indeed, the key descriptors turn out to be the transition state between the Franck–Condon minimum and a distorted local minimum near an S0/S1 conical intersection (which governs IC) and the magnitude of the spin–orbit coupling (which governs ISC).
The resulting Φfl values are predicted with reasonable accuracy (±22%), making our approach a promising ingredient for high-throughput screening and rational design of molecular excited states with desired Φ values. We thus conclude that our model, while semiempirical in nature, does in fact extract sound physical insight into the challenge of describing non-radiative relaxations.
Advanced real-time indoor tracking based on the Viterbi algorithm and semantic data
A real-time indoor tracking system based on the Viterbi algorithm is developed. The Viterbi principle is combined with semantic data, that is, the environment of the tracked object and a motion model, to improve accuracy. The starting point is a fingerprinting technique for which an advanced network planner is used to automatically construct the radio map, avoiding a time-consuming measurement campaign. The developed algorithm was verified with simulations and with experiments in a building-wide testbed for sensor experiments, where a median accuracy below 2 m was obtained. Compared to a reference algorithm without Viterbi or semantic data, the results indicated a significant improvement: the mean accuracy and standard deviation improved by, respectively, 26.1% and 65.3%. Thereafter, a sensitivity analysis was conducted to estimate the influence of node density, grid size, memory usage, and semantic data on the performance.
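The combination of fingerprinting, a motion model and environment constraints described in this abstract can be sketched as a small Viterbi decoder. This is a simplified illustration, not the paper's system: the emission model is assumed to be Gaussian in signal-strength space, all allowed moves are treated as equally likely (unnormalized), and the names `viterbi_track`, `radio_map`, `neighbors` and `sigma` are hypothetical.

```python
def viterbi_track(observations, radio_map, neighbors, sigma=4.0):
    """Return the most likely sequence of grid cells.

    observations: list of signal-strength vectors, one per time step
    radio_map:    dict cell -> expected signal-strength vector (fingerprint)
    neighbors:    dict cell -> set of cells reachable in one step; this
                  encodes both the motion model and walls (semantic data)
    """
    def log_emission(cell, rss):
        # Assumed Gaussian likelihood of the measurement given the cell.
        d2 = sum((a - b) ** 2 for a, b in zip(radio_map[cell], rss))
        return -d2 / (2 * sigma ** 2)

    cells = list(radio_map)
    score = {c: log_emission(c, observations[0]) for c in cells}
    back = []  # one backpointer table per transition
    for rss in observations[1:]:
        new_score, pointers = {}, {}
        for c in cells:
            # Allowed predecessors: staying put, or a cell that can reach c.
            preds = (p for p in cells if p == c or c in neighbors.get(p, ()))
            best = max(preds, key=lambda p: score[p])
            new_score[c] = score[best] + log_emission(c, rss)
            pointers[c] = best
        score = new_score
        back.append(pointers)

    # Backtrack from the best final cell.
    path = [max(cells, key=lambda c: score[c])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical corridor of four cells with a single access point.
radio_map = {"A": [-40.0], "B": [-50.0], "C": [-60.0], "D": [-70.0]}
neighbors = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
path = viterbi_track([[-41.0], [-69.0], [-59.0]], radio_map, neighbors)
```

A nearest-fingerprint estimate would jump from A to D between the first two measurements. The decoded path cannot, because each step is restricted to the same cell or an adjacent one.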