PROCESSING AND ANALYSIS OF BIG TRAJECTORY DATA USING MAPREDUCE
In this paper, we present research work on processing and analyzing big trajectory data using the MapReduce framework. We describe MapReduce-based algorithms and applications, implemented on Hadoop, that perform a spatial join between big trajectory data and a set of POI regions and then aggregate the join results. An experimental evaluation, which detects the trajectory patterns of particular users and the most popular places in the city, demonstrates the feasibility of our approach. Visual analytics of the MapReduce job output improves trajectory and movement analysis.
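The spatial join and aggregation described in this abstract can be illustrated with a minimal, single-process sketch in Python. It is not the authors' Hadoop implementation: the POI rectangles, the record layout `(user_id, x, y)`, and the function names `map_point`, `shuffle`, `reduce_visits`, and `run_job` are all assumptions made for illustration, and the shuffle step that Hadoop performs across machines is simulated in memory.

```python
from collections import defaultdict

# Hypothetical POI regions as axis-aligned boxes:
# name -> (min_x, min_y, max_x, max_y)
POI_REGIONS = {
    "station": (0.0, 0.0, 2.0, 2.0),
    "market": (5.0, 5.0, 8.0, 8.0),
}

# Hypothetical trajectory points: (user_id, x, y)
TRAJECTORY_POINTS = [
    ("u1", 1.0, 1.0),
    ("u1", 1.5, 0.5),
    ("u2", 6.0, 6.0),
    ("u2", 1.0, 1.9),
    ("u3", 9.0, 9.0),
]


def map_point(record):
    """Map phase: emit (poi_name, user_id) for each region containing the point."""
    user_id, x, y = record
    for name, (min_x, min_y, max_x, max_y) in POI_REGIONS.items():
        if min_x <= x <= max_x and min_y <= y <= max_y:
            yield name, user_id


def shuffle(pairs):
    """Group mapper output by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_visits(name, user_ids):
    """Reduce phase: aggregate join results into visit and distinct-user counts."""
    return name, len(user_ids), len(set(user_ids))


def run_job(records):
    """Run map, shuffle, and reduce, then rank POIs by visit count."""
    pairs = (pair for record in records for pair in map_point(record))
    results = [reduce_visits(k, v) for k, v in shuffle(pairs).items()]
    return sorted(results, key=lambda row: -row[1])


if __name__ == "__main__":
    for name, visits, users in run_job(TRAJECTORY_POINTS):
        print(name, visits, users)
```

Keying the mapper output by POI name means each reducer sees all points that fall inside one region, so ranking the most popular places reduces to a count per key. Keying by user instead would group each user's visited regions, which is one way to approach per-user trajectory patterns.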
Big data-driven prediction of airspace congestion
Air Navigation Service Providers (ANSPs) worldwide have been making a
considerable effort to develop better methods to measure and
predict aircraft counts within a particular airspace, also referred to as
airspace density. An accurate measurement and prediction of airspace density is
crucial for a better managed airspace, both strategically and tactically,
yielding a higher level of automation and thereby reducing the air traffic
controller's workload. Although prior approaches have addressed the problem
to some extent, managing and querying the ever-increasing volume of air
traffic data arriving at high rates, for analytics purposes such as
predicting aircraft counts, remains a challenge, especially when only linear
prediction models are used.
In this paper, we present a novel data management and prediction system that
accurately predicts aircraft counts for a particular airspace sector within the
National Airspace System (NAS). The incoming Traffic Flow Management (TFM) data
is streaming, big, uncorrelated and noisy. In the preprocessing step, the
system continuously processes the incoming raw data, reduces it to a compact
size, and stores it in a NoSQL database, where it makes the data available for
efficient query processing. In the prediction step, the system learns from
historical trajectories and uses their segments to collect key features such as
sector boundary crossings, weather parameters, and other air traffic data. The
features are fed into various regression models, including linear, non-linear
and ensemble models, and the best performing model is used for prediction.
Evaluation on an extensive set of real track, weather, and air traffic data,
including boundary crossings in the U.S., verifies that our system efficiently
and accurately predicts aircraft counts in each airspace sector.
Comment: Submitted to the 2023 IEEE/AIAA Digital Aviation Systems Conference
(DASC)
ST-Hadoop: A MapReduce Framework for Big Spatio-temporal Data Management
University of Minnesota Ph.D. dissertation. May 2019. Major: Computer Science. Advisor: Mohamed Mokbel. 1 computer file (PDF); x, 123 pages.
Apache Hadoop, which employs the MapReduce programming paradigm, has been widely accepted as the standard framework for analyzing big data in distributed environments. Unfortunately, this rich framework has not been fully exploited for processing large-scale spatio-temporal data, especially given the emergence and popularity of applications that create such data at large scale. Huge volumes of spatio-temporal data come from applications such as taxi fleets in urban computing, asteroids in astronomy research, animal movements in habitat studies, neuron analysis in neuroscience research, and content from social networks (e.g., Twitter or Facebook). Space and time are two fundamental characteristics of these data, and the need to manage them has raised the demand for processing the spatio-temporal data created by these applications. Besides the massive size of the data, the complexity of the shapes and formats associated with them raises many challenges in managing spatio-temporal data. The goal of this dissertation is to establish a full-fledged big spatio-temporal data management system that serves the needs of a wide range of spatio-temporal applications. This involves indexing, querying, and analyzing spatio-temporal data. We propose ST-Hadoop, the first full-fledged open-source system with native support for big spatio-temporal data, available for download at http://st-hadoop.cs.umn.edu/. ST-Hadoop injects spatio-temporal data awareness inside the highly popular Hadoop system, which is considered the state of the art for off-line analysis in big data systems.
Considering a distributed environment, we focus on the following: (1) indexing spatio-temporal data; (2) supporting various fundamental spatio-temporal operations, such as range, kNN, and join; and (3) supporting indexing and querying of trajectories, which are considered a special class of spatio-temporal data that requires special handling. Throughout this dissertation, we cover the background and related work, motivate the proposed system, and highlight our contributions.
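To make the range and kNN operations named in this abstract concrete, here is a small in-memory sketch in Python. It does not reproduce ST-Hadoop's index or API, which the abstract does not describe. It assumes integer timestamps in seconds, a hypothetical fixed time-slice width `SLICE_SECONDS`, and records of the form `(t, x, y, payload)`. The function names are illustrative only.

```python
import heapq
import math
from collections import defaultdict

SLICE_SECONDS = 3600  # hypothetical width of one temporal partition

# Hypothetical records: (timestamp_seconds, x, y, payload)
RECORDS = [
    (100, 1.0, 1.0, "a"),
    (200, 4.0, 4.0, "b"),
    (4000, 1.5, 1.5, "c"),
    (8000, 1.2, 1.2, "d"),
]


def build_index(records):
    """Partition records by time slice so queries can skip unrelated slices."""
    index = defaultdict(list)
    for t, x, y, payload in records:
        index[t // SLICE_SECONDS].append((t, x, y, payload))
    return index


def scan(index, t_min, t_max):
    """Yield records in [t_min, t_max], reading only the overlapping slices."""
    for s in range(t_min // SLICE_SECONDS, t_max // SLICE_SECONDS + 1):
        for record in index.get(s, []):
            if t_min <= record[0] <= t_max:
                yield record


def range_query(index, t_min, t_max, box):
    """Return payloads inside the time interval and the spatial box."""
    min_x, min_y, max_x, max_y = box
    return [
        payload
        for _, x, y, payload in scan(index, t_min, t_max)
        if min_x <= x <= max_x and min_y <= y <= max_y
    ]


def knn_query(index, t_min, t_max, qx, qy, k):
    """Return the k payloads nearest to (qx, qy) within the time interval."""
    candidates = (
        (math.hypot(x - qx, y - qy), payload)
        for _, x, y, payload in scan(index, t_min, t_max)
    )
    return [payload for _, payload in heapq.nsmallest(k, candidates)]


if __name__ == "__main__":
    idx = build_index(RECORDS)
    print(range_query(idx, 0, 5000, (0.0, 0.0, 2.0, 2.0)))
    print(knn_query(idx, 0, 5000, 1.0, 1.0, 2))
```

Partitioning by time first lets a query touch only the slices that overlap its interval. A distributed system would typically add a spatial index inside each slice rather than scanning it linearly as this sketch does.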