Adapted K-Nearest Neighbors for Detecting Anomalies on Spatio–Temporal Traffic Flow
Outlier detection is an extensive research area that has been studied intensively in several domains, such as biological sciences, medical diagnosis, surveillance, and traffic anomaly detection. This paper explores advances in outlier detection by finding anomalies in spatio-temporal urban traffic flow. It proposes a new approach that considers the distribution of the flows in a given time interval. The flow distribution probability (FDP) databases are first constructed from the traffic flows by considering both spatial and temporal information. The outlier detection mechanism is then applied to the incoming flow distribution probabilities: the inliers are stored to enrich the FDP databases, while the outliers are excluded from them. Moreover, a k-nearest neighbor approach for distance-based outlier detection is investigated and adopted for FDP outlier detection. To validate the proposed framework, real data from the Odense traffic flow case are evaluated at ten locations. The results reveal that the proposed framework is able to detect the real distribution of flow outliers. Another experiment was carried out on Beijing data; the results show that our approach outperforms the baseline algorithms for high-urban traffic flow.
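The detection loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an FDP is a normalized histogram of flow counts over one time interval, uses Euclidean distance to the k-th nearest stored FDP as the outlier score, and picks an arbitrary threshold. The function names, the value of k, the threshold, and the sample counts are all hypothetical.

```python
import math


def flow_distribution(counts):
    """Normalize per-bin flow counts for one interval into probabilities (assumed FDP form)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("interval has no flow")
    return [c / total for c in counts]


def knn_distance(candidate, database, k):
    """Euclidean distance from candidate to its k-th nearest neighbor in the database."""
    dists = sorted(math.dist(candidate, ref) for ref in database)
    return dists[min(k, len(dists)) - 1]


def process_incoming(candidate, database, k=2, threshold=0.2):
    """Return True if candidate is an outlier; otherwise store it to enrich the database."""
    is_outlier = knn_distance(candidate, database, k) > threshold
    if not is_outlier:
        database.append(candidate)
    return is_outlier


# Hypothetical FDP database built from three past intervals at one location.
fdp_db = [flow_distribution(c) for c in ([10, 30, 40, 20], [12, 28, 41, 19], [9, 31, 39, 21])]

typical = flow_distribution([11, 29, 40, 20])   # close to the stored distributions
unusual = flow_distribution([60, 20, 10, 10])   # mass shifted into the first bin

print(process_incoming(typical, fdp_db))   # inlier: appended to fdp_db
print(process_incoming(unusual, fdp_db))   # outlier: left out of fdp_db
```

In this sketch the database grows only with inliers, mirroring the abstract's description; a fixed threshold is the simplest choice and would likely need tuning per location in practice.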
A Simple Baseline for Travel Time Estimation using Large-Scale Trip Data
The increased availability of large-scale trajectory data around the world
provides rich information for the study of urban dynamics. For example, the New
York City Taxi and Limousine Commission regularly releases source-destination
information about trips in the taxis it regulates. Taxi data provide
information about traffic patterns, and thus enable the study of urban flow --
what will traffic between two locations look like at a certain date and time in
the future? Existing big data methods try to outdo each other in terms of
complexity and algorithmic sophistication. In the spirit of "big data beats
algorithms", we present a very simple baseline which outperforms
state-of-the-art approaches, including Bing Maps and Baidu Maps (whose APIs
permit large scale experimentation). Such a travel time estimation baseline has
several important uses, such as navigation (fast travel time estimates can
serve as approximate heuristics for A* search variants for path finding) and
trip planning (which uses operating hours for popular destinations along with
travel time estimates to create an itinerary).
Comment: 12 page
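The navigation use mentioned in this abstract, travel time estimates serving as heuristics for A* search, can be sketched as follows. This is an illustrative example only and not taken from the paper: the road graph, the edge travel times, and the `estimates_to_d` lookup table standing in for a travel time estimator are all hypothetical.

```python
import heapq


def a_star(graph, start, goal, estimate):
    """A* search over a graph of node -> [(neighbor, travel_minutes)].

    estimate(node, goal) returns an estimated remaining travel time in minutes.
    Returns (total_minutes, path), or None if the goal is unreachable.
    """
    frontier = [(estimate(start, goal), 0.0, start, [start])]
    best_cost = {start: 0.0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route to this node was found
        for neighbor, minutes in graph.get(node, []):
            new_cost = cost + minutes
            if new_cost < best_cost.get(neighbor, float("inf")):
                best_cost[neighbor] = new_cost
                priority = new_cost + estimate(neighbor, goal)
                heapq.heappush(frontier, (priority, new_cost, neighbor, path + [neighbor]))
    return None


# Hypothetical road graph with edge travel times in minutes.
roads = {
    "A": [("B", 4), ("C", 2)],
    "B": [("D", 5)],
    "C": [("B", 1), ("D", 8)],
}

# Hypothetical estimated travel times to destination "D" (stand-in for a learned estimator).
estimates_to_d = {"A": 6, "B": 5, "C": 5, "D": 0}

result = a_star(roads, "A", "D", lambda node, goal: estimates_to_d[node])
print(result)
```

The estimates in this toy table never exceed the true remaining travel time, so the returned route is the fastest one. An approximate estimator may overestimate, in which case A* still finds a route but optimality is no longer guaranteed.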
Outlier Detection from Network Data with Subnetwork Interpretation
Detecting a small number of outliers from a set of data observations is
always challenging. This problem is more difficult in the setting of multiple
network samples, where computing the anomalous degree of a network sample is
generally not sufficient. In fact, explaining why the network is exceptional,
expressed in the form of subnetwork, is also equally important. In this paper,
we develop a novel algorithm to address these two key problems. We treat each
network sample as a potential outlier and identify subnetworks that mostly
discriminate it from nearby regular samples. The algorithm is developed in the
framework of network regression combined with the constraints on both network
topology and L1-norm shrinkage to perform subnetwork discovery. Our method thus
goes beyond subspace/subgraph discovery and we show that it converges to a
global optimum. Evaluation on various real-world network datasets demonstrates
that our algorithm not only outperforms baselines in both network and
high-dimensional settings, but also discovers highly relevant and interpretable local
subnetworks, further enhancing our understanding of anomalous networks.