Outlier Detection of Time Series with A Novel Hybrid Method in Cloud Computing
With advances in science and technology, Cloud Computing has attracted growing attention in different fields. Meanwhile, outlier detection for data mining in Cloud Computing plays an increasingly significant role in different research domains, and a large body of research has been devoted to outlier detection, including distance-based, density-based and clustering-based approaches. However, the existing methods incur high computation time. Therefore, an improved outlier detection algorithm with higher detection performance is presented. In this paper, the proposed method, an improved spectral clustering algorithm (SKM++), is suited to handling outliers. Pruning the data then reduces computational complexity, and a distance-based method using Manhattan Distance (distm) is combined with it to obtain an outlier score. Finally, the method confirms the outliers by extreme analysis. This paper validates the presented method through experiments on real data collected by sensors and through comparison against existing approaches; the experimental results show that our proposed method outperforms the existing
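The pipeline this abstract outlines (cluster, prune, score by Manhattan distance, confirm by extreme analysis) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' SKM++ method: a plain k-means-style loop stands in for the improved spectral clustering, and the function names, the `prune_quantile` parameter, the `z` threshold and the sample data are all assumptions made for the example.

```python
from statistics import mean, pstdev


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cluster_centroids(points, k, iters=20):
    """Stand-in clustering step (assumption): a k-means-style loop with
    L1 assignment and coordinate-wise mean update, deterministic init."""
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: manhattan(p, centroids[c]))
            groups[nearest].append(p)
        centroids = [
            tuple(mean(dim) for dim in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def detect_outliers(points, k=2, prune_quantile=0.5, z=2.0):
    """Return the points whose outlier score is extreme."""
    centroids = cluster_centroids(points, k)
    # Outlier score: Manhattan distance to the nearest centroid.
    scores = [min(manhattan(p, c) for c in centroids) for p in points]
    # Pruning: keep only points at or above the chosen score quantile.
    cutoff = sorted(scores)[int(len(scores) * prune_quantile)]
    candidates = [(p, s) for p, s in zip(points, scores) if s >= cutoff]
    # Extreme-value check (assumption): score above mean + z * std,
    # with mean and std computed over the scores of all points.
    limit = mean(scores) + z * pstdev(scores)
    return [p for p, s in candidates if s > limit]


# Hypothetical data: two tight groups and one far-away point.
points = [(0, 0), (1, 0), (0, 1), (1, 1),
          (10, 10), (11, 10), (10, 11), (11, 11),
          (30, 30)]
outliers = detect_outliers(points, k=2, z=2.0)
```

Pruning before the final check means only the points farthest from any centroid are examined, which is where the reduction in computation described in the abstract would come from in a larger data set.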
System Support For Stream Processing In Collaborative Cloud-Edge Environment
Stream processing is a critical technique for processing huge amounts of data in real time.
Cloud computing has been used for stream processing due to its unlimited computation
resources. At the same time, we are entering the era of Internet of Everything (IoE). The emerging
edge computing benefits low-latency applications by leveraging computation resources at
the proximity of data sources. Billions of sensors and actuators are being deployed worldwide,
and the huge amounts of data generated by things are immersed in our daily life. It has become
essential for organizations to be able to stream and analyze data, and provide low-latency analytics
on streaming data. However, cloud computing cannot efficiently process all data in a centralized
environment because of the network bandwidth cost and response latency. Although
edge computing offloads computation from the cloud to the edge of the Internet, there is no
data sharing and processing framework that efficiently utilizes computation resources in both the
cloud and the edge. Furthermore, the heterogeneity of edge devices makes developing collaborative cloud-edge applications more difficult.
To explore and address the challenges of stream processing systems in collaborative cloud-edge
environments, in this dissertation we design and develop a series of systems to support
stream processing applications in hybrid cloud-edge analytics. Specifically, we develop a
hierarchical and hybrid outlier detection model for multivariate time series streams that automatically
selects the best model for different time series. We optimize one stream
processing system (i.e., Spark Streaming) to reduce the end-to-end latency. To facilitate the
development of collaborative cloud-edge applications, we propose and implement a new computing
framework, Firework, that allows stakeholders to share and process data by leveraging
both the cloud and the edge. A vision-based cloud-edge application is implemented to demonstrate
the capabilities of Firework. By combining all these studies, we provide comprehensive
system support for stream processing in collaborative cloud-edge environments.
Artificial Intelligence based Anomaly Detection of Energy Consumption in Buildings: A Review, Current Trends and New Perspectives
Enormous amounts of data are being produced every day by sub-meters and smart
sensors installed in residential buildings. If leveraged properly, that data
could assist end-users, energy producers and utility companies in detecting
anomalous power consumption and understanding the causes of each anomaly.
Therefore, anomaly detection could stop a minor problem from becoming overwhelming.
Moreover, it will aid in better decision-making to reduce wasted energy and
promote sustainable and energy efficient behavior. In this regard, this paper
is an in-depth review of existing anomaly detection frameworks for building
energy consumption based on artificial intelligence. Specifically, an extensive
survey is presented, in which a comprehensive taxonomy is introduced to
classify existing algorithms based on different modules and parameters adopted,
such as machine learning algorithms, feature extraction approaches, anomaly
detection levels, computing platforms and application scenarios. To the best of
the authors' knowledge, this is the first review article that discusses anomaly
detection in building energy consumption. Moving forward, important findings
along with domain-specific problems, difficulties and challenges that remain
unresolved are thoroughly discussed, including the absence of: (i) precise
definitions of anomalous power consumption, (ii) annotated datasets, (iii)
unified metrics to assess the performance of existing solutions, (iv) platforms
for reproducibility and (v) privacy-preservation. Following this, insights about
current research trends are discussed to widen the applications and
effectiveness of anomaly detection technology, before deriving future
directions that are attracting significant attention. This article serves as a
comprehensive reference to understand the current technological progress in
anomaly detection of energy consumption based on artificial intelligence. Comment: 11 figures, 3 tables
Towards Real-Time Detection and Tracking of Spatio-Temporal Features: Blob-Filaments in Fusion Plasma
A novel algorithm and implementation of real-time identification and tracking
of blob-filaments in fusion reactor data is presented. Similar spatio-temporal
features are important in many other applications, for example, ignition
kernels in combustion and tumor cells in a medical image. This work presents an
approach for extracting these features by dividing the overall task into three
steps: local identification of feature cells, grouping feature cells into
extended features, and tracking the movement of features through overlap in
space. Through our extensive work in parallelization, we demonstrate that this
approach can effectively make use of a large number of compute nodes to detect
and track blob-filaments in real time in fusion plasma. On a 30 GB set of fusion
simulation data, we observed linear speedup on 1024 processes and completed
blob detection in less than three milliseconds using Edison, a Cray XC30 system
at NERSC. Comment: 14 pages, 40 figures
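The three steps named in this abstract (local identification of feature cells, grouping into extended features, and tracking through spatial overlap) can be sketched on a small 2-D grid as follows. This is a serial toy version under stated assumptions, not the authors' parallel implementation: the fixed threshold, 4-neighbour connectivity, function names and sample frames are all hypothetical choices made for illustration.

```python
from collections import deque


def feature_cells(frame, threshold):
    """Step 1: local identification. A cell is a feature cell if its
    value exceeds the threshold (assumed criterion)."""
    return {(r, c)
            for r, row in enumerate(frame)
            for c, value in enumerate(row)
            if value > threshold}


def group_cells(cells):
    """Step 2: group feature cells into extended features using
    breadth-first search over 4-connected neighbours."""
    remaining, blobs = set(cells), []
    for seed in sorted(cells):  # sorted for a deterministic blob order
        if seed not in remaining:
            continue
        remaining.remove(seed)
        blob, queue = {seed}, deque([seed])
        while queue:
            r, c = queue.popleft()
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    blob.add(nb)
                    queue.append(nb)
        blobs.append(blob)
    return blobs


def track(prev_blobs, next_blobs):
    """Step 3: link blobs in consecutive frames that share a cell."""
    return [(i, j)
            for i, a in enumerate(prev_blobs)
            for j, b in enumerate(next_blobs)
            if a & b]


# Hypothetical frames: one blob drifts one column to the right,
# the other disappears.
frame0 = [[0, 5, 5, 0],
          [0, 0, 0, 0],
          [5, 0, 0, 0]]
frame1 = [[0, 0, 5, 5],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]

blobs0 = group_cells(feature_cells(frame0, threshold=1))
blobs1 = group_cells(feature_cells(frame1, threshold=1))
links = track(blobs0, blobs1)
```

Each step only needs local or pairwise information, which is what makes this kind of decomposition a natural fit for the parallelization the abstract describes.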
A survey of outlier detection methodologies
Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can also identify errors and remove their contaminating effect on the data set and, as such, purify the data for processing. The original outlier detection methods were arbitrary, but now principled and systematic techniques are used, drawn from the full gamut of Computer Science and Statistics. In this paper, we introduce a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.