What broke where for distributed and parallel applications — a whodunit story
Detection, diagnosis, and mitigation of performance problems in today's large-scale distributed and parallel systems is a difficult task. These systems are composed of various complex software and hardware components. When the system experiences a performance or correctness problem, developers struggle to understand the root cause and fix it in a timely manner. In my thesis, I address these three components of performance problems in computer systems. First, we focus on diagnosing performance problems in large-scale parallel applications running on supercomputers. We developed techniques to localize the performance problem for root-cause analysis. Parallel applications, most of which are complex scientific simulations running on supercomputers, can create up to millions of parallel tasks that run on different machines and communicate using the message-passing paradigm. We developed a highly scalable and accurate automated debugging tool called PRODOMETER, which uses sophisticated algorithms in three steps. First, it creates a logical progress dependency graph of the tasks to highlight how the problem spread through the system, manifesting as a system-wide performance issue. Second, it uses this progress dependency graph to identify the task where the problem originated. Finally, PRODOMETER pinpoints the code region corresponding to the origin of the bug. Second, we developed a tool-chain that detects performance anomalies using machine-learning techniques and achieves a very low false-positive rate. Our input-aware performance anomaly detection system consists of a scalable data collection framework that collects performance-related metrics from code regions of different granularity, an offline model creation and prediction-error characterization technique, and a threshold-based anomaly detection engine for production runs. 
Our system requires few training runs and can handle unknown inputs and parameter combinations by dynamically calibrating the anomaly detection threshold according to the characteristics of the input data and of the models' prediction error. Third, we developed a performance problem mitigation scheme for erasure-coded distributed storage systems. Repair operations for failed blocks in erasure-coded distributed storage systems take a long time in network-constrained data centers. The reason is that, during a repair operation, a large amount of data from multiple nodes is gathered into a single node, and a mathematical operation is then performed to reconstruct the missing part. This process severely congests the links toward the destination where the newly recreated data is to be hosted. We proposed a novel distributed repair technique, called Partial-Parallel-Repair (PPR), that performs this reconstruction in parallel on multiple nodes and eliminates network bottlenecks, and as a result greatly speeds up the repair process. Fourth, we study how, for a class of applications, performance can be improved (or performance problems can be mitigated) by selectively approximating some of the computations. For many applications, the main computation happens inside a loop that can be logically divided into a few temporal segments that we call phases. We found that while approximating the initial phases might severely degrade the quality of the results, approximating the computation in the later phases has very small impact on the final quality of the result. Based on this observation, we developed an optimization framework that, for a given budget of quality loss, finds the best approximation settings for each phase in the execution.
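The repair bottleneck and the partial-parallel idea described in this abstract can be illustrated with a toy sketch. It assumes a single XOR parity block rather than a general erasure code, and a pairwise combining schedule; both are simplifying assumptions made here for illustration and are not taken from the thesis. All function names are hypothetical.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Combine two equally sized blocks with bytewise XOR."""
    return bytes(x ^ y for x, y in zip(a, b))


def centralized_repair(surviving: list[bytes]) -> bytes:
    """Baseline: every surviving block is shipped to one node,
    which then XORs them all. That node's link carries every block."""
    return reduce(xor_blocks, surviving)


def pairwise_repair(surviving: list[bytes]) -> tuple[bytes, int]:
    """Toy partial-parallel schedule: in each round, nodes pair up and
    one node of each pair forwards its partial result to the other.
    Returns the repaired block and the number of rounds used."""
    partials = list(surviving)
    rounds = 0
    while len(partials) > 1:
        combined = [
            xor_blocks(partials[i], partials[i + 1])
            for i in range(0, len(partials) - 1, 2)
        ]
        if len(partials) % 2 == 1:
            # An unpaired node carries its partial result to the next round.
            combined.append(partials[-1])
        partials = combined
        rounds += 1
    return partials[0], rounds


# Three data blocks protected by one XOR parity block (assumed layout).
data = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = centralized_repair(data)

# Suppose data[1] is lost; repair it from the remaining blocks.
surviving = [data[0], data[2], parity]
repaired, rounds = pairwise_repair(surviving)
```

In this toy model each link carries at most one block per round, whereas the centralized baseline funnels every surviving block through the destination's link. Real erasure codes replace XOR with coded linear combinations, which this sketch does not attempt to model.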
Cross-layer design of multi-hop wireless networks
Multi-hop wireless networks are usually defined as a collection of nodes
equipped with radio transmitters, which not only have the capability to
communicate with each other in a multi-hop fashion, but also to route each other's data
packets. The distributed nature of such networks makes them suitable for a variety of
applications where there are no assumed reliable central entities, or controllers, and
may significantly alleviate the scalability issues of conventional single-hop wireless
networks.
This Ph.D. dissertation mainly investigates two aspects of the research issues
related to efficient multi-hop wireless network design, namely: (a) network
protocols and (b) network management, both in cross-layer design paradigms to
ensure the notion of service quality, such as quality of service (QoS) in wireless mesh
networks (WMNs) for backhaul applications and quality of information (QoI) in
wireless sensor networks (WSNs) for sensing tasks. Throughout the presentation of
this Ph.D. dissertation, different network settings are used as illustrative examples;
however, the proposed algorithms, methodologies, protocols, and models are not
restricted to the considered networks, but rather have wide applicability.
First, this dissertation proposes a cross-layer design framework integrating
a distributed proportional-fair scheduler and a QoS routing algorithm, while using
WMNs as an illustrative example. The proposed approach achieves a significant performance
gain over other network protocols. Second, this dissertation proposes
a generic admission control methodology for any packet network, wired and
wireless, by modeling the network as a black box, and using a generic mathematical
function and Taylor expansion to capture the admission impact. Third, this dissertation
further enhances the previous designs by proposing a negotiation process,
to bridge the applications’ service quality demands and the resource management,
while using WSNs as an illustrative example. This approach allows the negotiation
among different service classes and WSN resource allocations to reach the optimal
operational status. Finally, the guarantees of the service quality are extended to
the environment of multiple, disconnected, mobile subnetworks, where the question
of how to maintain communications using dynamically controlled, unmanned data
ferries is investigated.
Data-Driven Quickest Change Detection
The quickest change detection (QCD) problem is to detect abrupt changes in a sensing environment as quickly as possible in real time while limiting the risk of false alarm. Statistical inference about the monitored stochastic process is performed through observations acquired sequentially over time. After each observation, the QCD algorithm either stops and declares a change or continues and takes a further observation in the next time interval. There is an inherent tradeoff between speed and accuracy in the decision-making process. The design goal is to optimally balance the average detection delay and the false alarm rate so as to respond to abrupt changes in a timely and accurate manner.
The objective of this thesis is to investigate effective and scalable QCD approaches for real-world data streams. The classical QCD framework is model-based; that is, a statistical data model is assumed to be known for both the pre- and post-change cases. However, real-world data often pose significant challenges for data modeling, such as high dimensionality, complex multivariate nature, lack of parametric models, unknown post-change (e.g., attack or anomaly) patterns, and complex temporal correlation. Further, in some cases, data is privacy-sensitive and distributed over a system, and it is not fully available to the QCD algorithm. This thesis addresses these challenges and proposes novel data-driven QCD approaches that are robust to data model mismatch and hence widely applicable to a variety of practical settings.
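For contrast with the data-driven approaches, the classical model-based setting can be illustrated with a minimal sketch of the well-known CUSUM procedure. The choice of CUSUM, the Gaussian mean-shift model, and all names and parameter values below are assumptions made for illustration; the abstract does not state which classical procedure the thesis uses as a reference.

```python
def cusum_detect(observations, mu0, mu1, sigma, threshold):
    """Sequentially process observations and return the 1-based time at
    which a change is declared, or None if no change is declared.

    Assumes independent Gaussian observations with known pre-change
    mean mu0, known post-change mean mu1, and common std. dev. sigma.
    """
    stat = 0.0
    for t, x in enumerate(observations, start=1):
        # Log-likelihood ratio of the post-change vs. pre-change model.
        llr = (mu1 - mu0) / sigma**2 * (x - (mu0 + mu1) / 2.0)
        # CUSUM recursion: accumulate evidence, but never go below zero.
        stat = max(0.0, stat + llr)
        if stat >= threshold:
            return t  # stop and declare a change
    return None  # keep observing: no change declared so far


# Ten pre-change samples followed by ten post-change samples.
stream = [0.0] * 10 + [1.0] * 10
alarm_time = cusum_detect(stream, mu0=0.0, mu1=1.0, sigma=1.0, threshold=2.0)
```

Raising `threshold` lowers the false alarm rate at the cost of a longer detection delay, which is the speed-accuracy tradeoff described above. The procedure needs both `mu0` and `mu1` in advance, which is exactly the model knowledge that is often unavailable for real-world streams.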
In Chapter 2, online cyber-attack detection in the smart power grid is formulated as a partially observable Markov decision process (POMDP) problem based on the QCD framework. A universal, robust online cyber-attack detection algorithm is proposed using model-free reinforcement learning (RL) for POMDPs. In Chapter 3, online anomaly detection for big data streams is studied where the nominal (i.e., pre-change) and anomalous (i.e., post-change) high-dimensional statistical data models are unknown. A data-driven solution approach is proposed, in which a set of useful univariate summary statistics is first computed from a nominal dataset in an offline phase, and online summary statistics are then evaluated for a persistent deviation from the nominal statistics.
In Chapter 4, a generic data-driven QCD procedure called DeepQCD is proposed, which learns the change detection rule directly from the observed raw data via deep recurrent neural networks. With a sufficient amount of training data including both pre- and post-change samples, DeepQCD can effectively learn the change detection rule for complex, high-dimensional, and temporally correlated data streams. Finally, in Chapter 5, online privacy-preserving anomaly detection is studied in a setting where the data is distributed over a network and locally sensitive to each node, and its statistical model is unknown. A data-driven, differentially private distributed detection scheme is proposed, which infers network-wide anomalies based on the perturbed and encrypted statistics received from nodes. Furthermore, the analytical privacy-security tradeoff in the network-wide anomaly detection problem is investigated.
Machine Learning in Wireless Sensor Networks: Algorithms, Strategies, and Applications
Wireless sensor networks monitor dynamic environments that change rapidly
over time. This dynamic behavior is either caused by external factors or
initiated by the system designers themselves. To adapt to such conditions,
sensor networks often adopt machine learning techniques to eliminate the need
for unnecessary redesign. Machine learning also inspires many practical
solutions that maximize resource utilization and prolong the lifespan of the
network. In this paper, we present an extensive literature review over the
period 2002-2013 of machine learning methods that were used to address common
issues in wireless sensor networks (WSNs). The advantages and disadvantages of
each proposed algorithm are evaluated against the corresponding problem. We
also provide a comparative guide to aid WSN designers in developing suitable
machine learning solutions for their specific application challenges.
Comment: Accepted for publication in IEEE Communications Surveys and Tutorials
Data Leak Detection As a Service: Challenges and Solutions
We describe a network-based data-leak detection (DLD)
technique, the main feature of which is that the detection
does not require the data owner to reveal the content of the
sensitive data. Instead, only a small number of specialized
digests is needed. Our technique – referred to as the fuzzy
fingerprint – can be used to detect accidental data leaks due
to human errors or application flaws. The privacy-preserving
feature of our algorithms minimizes the exposure of sensitive
data and enables the data owner to safely delegate the
detection to others. We describe how cloud providers can offer
their customers data-leak detection as an add-on service
with strong privacy guarantees.
We perform extensive experimental evaluation on the privacy,
efficiency, accuracy and noise tolerance of our techniques.
Our evaluation results under various data-leak scenarios
and setups show that our method can support accurate
detection with a very small number of false alarms, even
when the presentation of the data has been transformed. They
also indicate that the detection accuracy does not degrade
when partial digests are used. We further provide a quantifiable
method to measure the privacy guarantee offered by our
fuzzy fingerprint framework.
DeepMarks: A Digital Fingerprinting Framework for Deep Neural Networks
This paper proposes DeepMarks, a novel end-to-end framework for systematic
fingerprinting in the context of Deep Learning (DL). Remarkable progress has
been made in the area of deep learning. Sharing the trained DL models has
become a trend that is ubiquitous in various fields ranging from biomedical
diagnosis to stock prediction. As the availability and popularity of
pre-trained models are increasing, it is critical to protect the Intellectual
Property (IP) of the model owner. DeepMarks introduces the first fingerprinting
methodology that enables the model owner to embed unique fingerprints within
the parameters (weights) of her model and later identify undesired usages of
her distributed models. The proposed framework embeds the fingerprints in the
Probability Density Function (pdf) of trainable weights by leveraging the extra
capacity available in contemporary DL models. DeepMarks is robust against
fingerprint collusion as well as network transformation attacks, including
model compression and model fine-tuning. Extensive proof-of-concept evaluations
on MNIST and CIFAR10 datasets, as well as a wide variety of deep neural
network architectures such as Wide Residual Networks (WRNs) and Convolutional
Neural Networks (CNNs), corroborate the effectiveness and robustness of
the DeepMarks framework.
Multi-Label Image Classification via Knowledge Distillation from Weakly-Supervised Detection
Multi-label image classification is a fundamental but challenging task
towards general visual understanding. Existing methods have found that region-level
cues (e.g., features from RoIs) can facilitate multi-label classification.
Nevertheless, such methods usually require laborious object-level annotations
(i.e., object labels and bounding boxes) for effective learning of the
object-level visual features. In this paper, we propose a novel and efficient
deep framework to boost multi-label classification by distilling knowledge from
a weakly-supervised detection task without bounding box annotations.
Specifically, given the image-level annotations, (1) we first develop a
weakly-supervised detection (WSD) model, and then (2) construct an end-to-end
multi-label image classification framework augmented by a knowledge
distillation module that guides the classification model by the WSD model
according to the class-level predictions for the whole image and the
object-level visual features for object RoIs. The WSD model is the teacher
model and the classification model is the student model. After this cross-task
knowledge distillation, the performance of the classification model is
significantly improved and the efficiency is maintained since the WSD model can
be safely discarded in the test phase. Extensive experiments on two large-scale
datasets (MS-COCO and NUS-WIDE) show that our framework achieves superior
results over state-of-the-art methods in both performance and
efficiency.
Comment: accepted by ACM Multimedia 2018, 9 pages, 4 figures, 5 tables
A survey of self organisation in future cellular networks
This article surveys the literature of the last decade on the emerging field of self organisation as applied to wireless cellular communication networks. Self organisation has been extensively studied and applied in ad hoc networks, wireless sensor networks and autonomic computer networks; however, in the context of wireless cellular networks, this is the first attempt to put the various efforts in perspective in the form of a tutorial/survey. We provide a comprehensive survey of the existing literature, projects and standards in self organising cellular networks. Additionally, we aim to present a clear understanding of this active research area, identifying a clear taxonomy and guidelines for the design of self organising mechanisms. We compare the strengths and weaknesses of existing solutions and highlight the key research areas for further development. This paper serves as a guide and a starting point for anyone willing to delve into research on self organisation in wireless cellular communication networks.