2,630 research outputs found
System Support For Stream Processing In Collaborative Cloud-Edge Environment
Stream processing is a critical technique to process huge amount of data in real-time manner.
Cloud computing has been used for stream processing due to its unlimited computation
resources. At the same time, we are entering the era of Internet of Everything (IoE). The emerging
edge computing benefits low-latency applications by leveraging computation resources at
the proximity of data sources. Billions of sensors and actuators are being deployed worldwide
and huge amount of data generated by things are immersed in our daily life. It has become
essential for organizations to be able to stream and analyze data, and provide low-latency analytics
on streaming data. However, cloud computing is inefficient to process all data in a centralized
environment in terms of the network bandwidth cost and response latency. Although
edge computing offloads computation from the cloud to the edge of the Internet, there is not
a data sharing and processing framework that efficiently utilizes computation resources in the
cloud and the edge. Furthermore, the heterogeneity of edge devices brings more difficulty to the development of collaborative cloud-edge applications.
To explore and attack the challenges of stream processing system in collaborative cloudedge
environment, in this dissertation we design and develop a series of systems to support
stream processing applications in hybrid cloud-edge analytics. Specifically, we develop an
hierarchical and hybrid outlier detection model for multivariate time series streams that automatically
selects the best model for different time series. We optimize one of the stream
processing system (i.e., Spark Streaming) to reduce the end-to-end latency. To facilitate the
development of collaborative cloud-edge applications, we propose and implement a new computing
framework, Firework that allows stakeholders to share and process data by leveraging
both the cloud and the edge. A vision-based cloud-edge application is implemented to demonstrate
the capabilities of Firework. By combining all these studies, we provide comprehensive
system support for stream processing in collaborative cloud-edge environment
Data science applications to connected vehicles: Key barriers to overcome
The connected vehicles will generate huge amount of pervasive and real time data, at very high frequencies. This poses new challenges for Data science. How to analyse these data and how to address short-term and long-term storage are some of the key barriers to overcome.JRC.C.6-Economics of Climate Change, Energy and Transpor
Data semantic enrichment for complex event processing over IoT Data Streams
This thesis generalizes techniques for processing IoT data streams, semantically enrich data with contextual information, as well as complex event processing in IoT applications. A case study for ECG anomaly detection and signal classification was conducted to validate the knowledge foundation
Data Mining Applications in Big Data
Data mining is a process of extracting hidden, unknown, but potentially useful information from massive data. Big Data has great impacts on scientific discoveries and value creation. This paper introduces methods in data mining and technologies in Big Data. Challenges of data mining and data mining with big data are discussed. Some technology progress of data mining and data mining with big data are also presented
IoT Data Analytics in Dynamic Environments: From An Automated Machine Learning Perspective
With the wide spread of sensors and smart devices in recent years, the data
generation speed of the Internet of Things (IoT) systems has increased
dramatically. In IoT systems, massive volumes of data must be processed,
transformed, and analyzed on a frequent basis to enable various IoT services
and functionalities. Machine Learning (ML) approaches have shown their capacity
for IoT data analytics. However, applying ML models to IoT data analytics tasks
still faces many difficulties and challenges, specifically, effective model
selection, design/tuning, and updating, which have brought massive demand for
experienced data scientists. Additionally, the dynamic nature of IoT data may
introduce concept drift issues, causing model performance degradation. To
reduce human efforts, Automated Machine Learning (AutoML) has become a popular
field that aims to automatically select, construct, tune, and update machine
learning models to achieve the best performance on specified tasks. In this
paper, we conduct a review of existing methods in the model selection, tuning,
and updating procedures in the area of AutoML in order to identify and
summarize the optimal solutions for every step of applying ML algorithms to IoT
data analytics. To justify our findings and help industrial users and
researchers better implement AutoML approaches, a case study of applying AutoML
to IoT anomaly detection problems is conducted in this work. Lastly, we discuss
and classify the challenges and research directions for this domain.Comment: Published in Engineering Applications of Artificial Intelligence
(Elsevier, IF:7.8); Code/An AutoML tutorial is available at Github link:
https://github.com/Western-OC2-Lab/AutoML-Implementation-for-Static-and-Dynamic-Data-Analytic
- …