2,796 research outputs found
A Data-driven Methodology Towards Mobility- and Traffic-related Big Spatiotemporal Data Frameworks
Human population is increasing at unprecedented rates, particularly in urban areas. This increase, along with the rise of a more economically empowered middle class, brings new and complex challenges to the mobility of people within urban areas. To tackle such challenges, transportation and mobility authorities and operators are trying to adopt innovative Big Data-driven Mobility- and Traffic-related solutions. Such solutions will help decision-making processes that aim to ease the load on an already overloaded transport infrastructure. The information collected from day-to-day mobility and traffic can help to mitigate some of such mobility challenges in urban areas.
Road infrastructure and traffic management operators (RITMOs) face several limitations in extracting value from the exponentially growing volumes of mobility- and traffic-related Big Spatiotemporal Data (MobiTrafficBD) that are being acquired and gathered. Research on Big Data, Spatiotemporal Data and especially MobiTrafficBD is scattered, and the existing literature does not offer a concrete, common methodological approach to set up, configure, deploy and use a complete Big Data-based framework that manages the lifecycle of mobility-related spatiotemporal data, mainly geo-referenced time series (GRTS) and spatiotemporal events (ST Events), extracts value from it and supports the decision-making processes of RITMOs.
This doctoral thesis proposes a data-driven, prescriptive methodological approach towards the design, development and deployment of MobiTrafficBD Frameworks focused on GRTS and ST Events. Besides a thorough literature review on Spatiotemporal Data, Big Data and the merging of these two fields through MobiTrafficBD, the methodological approach comprises a set of general characteristics, technical requirements, logical components, data flows and technological infrastructure models, as well as guidelines and best practices that aim to guide researchers, practitioners and stakeholders, such as RITMOs, throughout the design, development and deployment phases of any MobiTrafficBD Framework.
This work is intended to be a supporting methodological guide, based on widely used
Reference Architectures and guidelines for Big Data, but enriched with inherent characteristics
and concerns brought about by Big Spatiotemporal Data, such as in the case of GRTS and ST
Events. The proposed methodology was evaluated and demonstrated in various real-world
use cases that deployed MobiTrafficBD-based Data Management, Processing, Analytics and
Visualisation methods, tools and technologies, under the umbrella of several research projects
funded by the European Commission and the Portuguese Government.
Multi-scale 3D Convolution Network for Video Based Person Re-Identification
This paper proposes a two-stream convolution network to extract spatial and
temporal cues for video based person Re-Identification (ReID). A temporal
stream in this network is constructed by inserting several Multi-scale 3D (M3D)
convolution layers into a 2D CNN network. The resulting M3D convolution network
introduces a fraction of parameters into the 2D CNN, but gains the ability of
multi-scale temporal feature learning. With this compact architecture, M3D
convolution network is also more efficient and easier to optimize than existing
3D convolution networks. The temporal stream further involves Residual
Attention Layers (RAL) to refine the temporal features. By jointly learning
spatial-temporal attention masks in a residual manner, RAL identifies the
discriminative spatial regions and temporal cues. The other stream in our
network is implemented with a 2D CNN for spatial feature extraction. The
spatial and temporal features from two streams are finally fused for the video
based person ReID. Evaluations on three widely used benchmark datasets, i.e.,
MARS, PRID2011, and iLIDS-VID, demonstrate the substantial advantages of our
method over existing 3D convolution networks and state-of-the-art methods.
Comment: AAAI, 201
Spatial Data Quality in the IoT Era: Management and Exploitation
Within the rapidly expanding Internet of Things (IoT), growing amounts of spatially referenced data are being generated. Due to the dynamic, decentralized, and heterogeneous nature of the IoT, spatial IoT data (SID) quality has attracted considerable attention in academia and industry. How to invent and use technologies for managing spatial data quality and exploiting low-quality spatial data are key challenges in the IoT. In this tutorial, we highlight the SID consumption requirements in applications and offer an overview of spatial data quality in the IoT setting. In addition, we review pertinent technologies for quality management and low-quality data exploitation, and we identify trends and future directions for quality-aware SID management and utilization. The tutorial aims to not only help researchers and practitioners to better comprehend SID quality challenges and solutions, but also offer insights that may enable innovative research and applications
Performance assessment of real-time data management on wireless sensor networks
Technological advances in recent years have allowed the maturity of Wireless Sensor Networks
(WSNs), which aim at performing environmental monitoring and data collection. This sort of
network is composed of hundreds, thousands or probably even millions of tiny smart computers
known as wireless sensor nodes, which may be battery powered, equipped with sensors, a radio
transceiver, a Central Processing Unit (CPU) and some memory. However, due to the small size and
low-cost requirements of the nodes, sensor node resources such as processing power, storage
and especially energy are very limited.
Once the sensors perform their measurements from the environment, the problem of data
storing and querying arises. In fact, the sensors have restricted storage capacity and the ongoing
interaction between sensors and environment results in huge amounts of data. Techniques for data
storage and query in WSNs can be based on either external storage or local storage. External
storage, called the warehousing approach, is a centralized system in which the data gathered by the
sensors are periodically sent to a central database server where user queries are processed.
Local storage, on the other hand, called the distributed approach, exploits the computational
capabilities of the sensors, which act as local databases. The data is stored in a central database server
and in the devices themselves, enabling one to query both.
WSNs are used in a wide variety of applications, which may perform certain operations on
collected sensor data. For certain applications, such as real-time applications, the sensor
data must closely reflect the current state of the targeted environment. However, the environment
changes constantly and the data is collected at discrete moments in time. As such, the collected
data has a temporal validity, and as time advances, it becomes less accurate, until it no longer
reflects the state of the environment. Thus, these applications, found in areas such as industrial
automation, aviation and sensor networks, must query and analyze the data within a bounded time
in order to make decisions and to react efficiently. In this context, the design of efficient real-time
data management solutions is necessary to deal with both time constraints and energy consumption.
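The notion of temporal validity described above can be illustrated with a minimal Python sketch. It is not taken from the thesis: the `Reading` class, its field names, and the `answer_query` helper are hypothetical, and the sketch simply assumes that each sample carries a fixed validity interval and that each query carries a deadline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """A sensor sample with a validity interval (all names are hypothetical)."""
    sensor_id: str
    value: float
    sampled_at: float  # time the sample was taken, in seconds
    validity: float    # seconds the sample is assumed to reflect the environment

    def is_fresh(self, now: float) -> bool:
        # The sample is usable only while its validity interval has not elapsed.
        return now - self.sampled_at <= self.validity


def answer_query(readings, now, deadline):
    """Return the fresh values, or None when the query missed its deadline."""
    if now > deadline:
        return None  # a late answer is treated as a failed real-time query
    return {r.sensor_id: r.value for r in readings if r.is_fresh(now)}


readings = [
    Reading("t1", 21.5, sampled_at=100.0, validity=5.0),
    Reading("t2", 22.0, sampled_at=90.0, validity=5.0),
]
result = answer_query(readings, now=103.0, deadline=104.0)
```

Here only `t1` is still within its validity interval at time 103, so the stale `t2` sample is left out of the answer; a query evaluated after its deadline returns nothing at all.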
This thesis studies real-time data management techniques for WSNs. In particular, it focuses
on the challenges in handling real-time data storage and query for WSNs and on
efficient real-time data management solutions for WSNs.
First, the main specifications of real-time data management are identified and the real-time
data management solutions for WSNs available in the literature are presented. Second, in order to
provide an energy-efficient real-time data management solution, the techniques used to manage
data and queries in WSNs based on the distributed paradigm are studied in depth. In fact, many
research works argue that the distributed approach is the most energy-efficient way of managing
data and queries in WSNs, compared with warehousing. In addition, this approach can provide
quasi real-time query processing because the most current data is retrieved from the network.
Third, based on these two studies and considering the complexity of developing, testing, and
debugging this kind of system, a model for a simulation framework of real-time
database management on WSNs that uses a distributed approach is proposed, together with its
implementation. This helps to explore various real-time database techniques on WSNs
before deployment, saving money and time. Moreover, the proposed model may be improved
by adding the simulation of protocols or by placing part of this simulator on another available
simulator. To validate the model, a case study considering real-time constraints as well as energy
constraints is discussed.
Fourth, a new architecture that combines statistical modeling techniques with the distributed
approach, and a query processing algorithm that optimizes real-time user query processing, are
proposed. This combination enables a query processing algorithm based on admission
control that uses the error tolerance and the probabilistic confidence interval as admission
parameters. Experiments on real-world as well as synthetic data sets
demonstrate that the proposed solution optimizes real-time query processing, saving more
energy while keeping latency low.
Fundação para a Ciência e Tecnologi
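The abstract does not spell out the admission-control algorithm, so the following Python sketch shows only one plausible reading of the idea: a query is answered from a statistical estimate when the confidence interval of that estimate fits within the user's error tolerance, and is forwarded to the sensor network otherwise. The function names, the return convention, and the use of a normal approximation are all assumptions made for illustration; with few samples a Student-t interval would be more appropriate.

```python
from statistics import NormalDist, mean, stdev


def confidence_half_width(samples, confidence):
    """Half-width of a normal-approximation confidence interval for the mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z * stdev(samples) / len(samples) ** 0.5


def admit(samples, error_tolerance, confidence):
    """Decide how to serve a query (hypothetical admission rule).

    Returns ("model", estimate) when the estimate built from stored samples is
    tight enough, otherwise ("network", None) to signal that the sensors must
    be queried, which costs energy.
    """
    if len(samples) < 2:
        return ("network", None)  # not enough history to estimate a spread
    if confidence_half_width(samples, confidence) <= error_tolerance:
        return ("model", mean(samples))
    return ("network", None)


history = [20.9, 21.1, 21.0, 21.2, 20.8]
loose = admit(history, error_tolerance=0.5, confidence=0.95)
strict = admit(history, error_tolerance=0.01, confidence=0.95)
```

With a tolerance of 0.5 the stored history is precise enough and the query is served from the model; with a tolerance of 0.01 it is not, and the query would have to reach the network.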
Design and Implementation of a Middleware for Uniform, Federated and Dynamic Event Processing
In recent years, real-time processing of massive event streams has become an important topic in the area of data analytics. It will become even more important in the future due to cheap sensors, a growing number of devices and their ubiquitous interconnection, also known as the Internet of Things (IoT). Academia, industry and the open source community have developed several event processing (EP) systems that allow users to define, manage and execute continuous queries over event streams. They achieve significantly better performance than the traditional "store-then-process" approach, in which events are first stored and indexed in a database. Because EP systems have different roots and because of the lack of standardization, the system landscape has become highly heterogeneous. Today's EP systems differ in APIs, execution behaviors and query languages. This thesis presents the design and implementation of a novel middleware that abstracts from different EP systems and provides a uniform API, execution behavior and query language to users and developers. As a consequence, the presented middleware overcomes the problem of vendor lock-in, and different EP systems are enabled to cooperate with each other. In practice, event streams differ dramatically in volume and velocity. We therefore show how the middleware can connect not only to different EP systems, but also to database systems and a native implementation. Emerging applications such as the IoT raise novel challenges and require EP to be more dynamic. We present extensions to the middleware that enable self-adaptivity, which is needed in context-sensitive applications and in those that deal with constantly varying sets of event producers and consumers. Lastly, we extend the middleware to fully support the processing of events containing spatial data and to run distributed in the form of a federation of heterogeneous EP systems.
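The idea of a uniform API over heterogeneous EP systems can be sketched with a simple adapter pattern in Python. This is an illustration only, not the middleware's actual interface: the `Backend`, `NativeBackend` and `Middleware` classes, their method names, and the use of plain Python predicates in place of a query language are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List

Event = Dict[str, float]
Predicate = Callable[[Event], bool]


class Backend(ABC):
    """Uniform interface that every wrapped EP system would implement."""

    @abstractmethod
    def register(self, query_id: str, predicate: Predicate) -> None: ...

    @abstractmethod
    def push(self, event: Event) -> Dict[str, List[Event]]: ...


class NativeBackend(Backend):
    """Pure-Python stand-in for a native implementation."""

    def __init__(self) -> None:
        self._queries: Dict[str, Predicate] = {}

    def register(self, query_id: str, predicate: Predicate) -> None:
        self._queries[query_id] = predicate

    def push(self, event: Event) -> Dict[str, List[Event]]:
        # Evaluate every continuous query against the incoming event.
        return {qid: [event] for qid, p in self._queries.items() if p(event)}


class Middleware:
    """Routes queries to a named backend so callers never touch its API."""

    def __init__(self) -> None:
        self._backends: Dict[str, Backend] = {}

    def add_backend(self, name: str, backend: Backend) -> None:
        self._backends[name] = backend

    def register(self, query_id: str, predicate: Predicate, backend: str) -> None:
        self._backends[backend].register(query_id, predicate)

    def push(self, event: Event) -> Dict[str, List[Event]]:
        # Forward the event to every backend and merge the matches.
        matches: Dict[str, List[Event]] = {}
        for backend in self._backends.values():
            matches.update(backend.push(event))
        return matches


mw = Middleware()
mw.add_backend("native", NativeBackend())
mw.register("hot", lambda e: e["temp"] > 30.0, backend="native")
hits = mw.push({"temp": 35.0})
misses = mw.push({"temp": 10.0})
```

Under this design, wrapping another EP system or a database would mean adding one more `Backend` subclass, while code that registers queries and pushes events stays unchanged.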
Modified EPPXGBOOST for Effective Data Stream Mining in Cloud
In today’s technology-driven landscape, the pervasive use of online services across diverse domains has led to the generation of vast datasets, necessitating advanced data mining techniques for meaningful insights. The advent of data streams, characterized by continuous and dynamic data flows, presents a significant challenge, prompting the evolution of data stream mining. This field addresses issues such as rapid changes in streaming data and the need for quick algorithms. To tackle these challenges, an innovative approach named EPPXGBOOST (Effective Privacy Preserving eXtreme Gradient Boosting) is proposed, combining Adaptive XGBOOST for continuous learning from evolving data streams with PPXGBOOST for privacy preservation
- …