453,614 research outputs found
Use of Service Oriented Architecture for Scada Networks
Supervisory Control and Data Acquisition (SCADA) systems involve the use of distributed processing to operate geographically dispersed endpoint hardware components. They manage the control networks used to monitor and direct large-scale operations such as utilities and transit systems that are essential to national infrastructure. SCADA industrial control networks (ICNs) have long operated in obscurity and been kept isolated largely through strong physical security. Today, Internet technologies are increasingly being utilized to access control networks, giving rise to a growing concern that they are becoming more vulnerable to attack. Like SCADA, distributed processing is also central to cloud computing or, more formally, the Service Oriented Architecture (SOA) computing model. Certain distinctive properties differentiate ICNs from the enterprise networks that cloud computing developments have focused on. The objective of this project is to determine if modern cloud computing technologies can be also applied to improving dated SCADA distributed processing systems. Extensive research was performed regarding control network requirements as compared to those of general enterprise networks. Research was also conducted into the benefits, implementation, and performance of SOA to determine its merits for application to control networks. The conclusion developed is that some aspects of cloud computing might be usefully applied to SCADA systems but that SOA fails to meet ICN requirements in a certain essential areas. The lack of current standards for SOA security presents an unacceptable risk to SCADA systems that manage dangerous equipment or essential services. SOA network performance is also not sufficiently deterministic to suit many real-time hardware control applications. Finally, SOA environments cannot as yet address the regulatory compliance assurance requirements of critical infrastructure SCADA systems
The Family of MapReduce and Large Scale Data Processing Systems
In the last two decades, the continuous increase of computational power has
produced an overwhelming flow of data which has called for a paradigm shift in
the computing architecture and large scale data processing mechanisms.
MapReduce is a simple and powerful programming model that enables easy
development of scalable parallel applications to process vast amounts of data
on large clusters of commodity machines. It isolates the application from the
details of running a distributed program such as issues on data distribution,
scheduling and fault tolerance. However, the original implementation of the
MapReduce framework had some limitations that have been tackled by many
research efforts in several followup works after its introduction. This article
provides a comprehensive survey for a family of approaches and mechanisms of
large scale data processing mechanisms that have been implemented based on the
original idea of the MapReduce framework and are currently gaining a lot of
momentum in both research and industrial communities. We also cover a set of
introduced systems that have been implemented to provide declarative
programming interfaces on top of the MapReduce framework. In addition, we
review several large scale data processing systems that resemble some of the
ideas of the MapReduce framework for different purposes and application
scenarios. Finally, we discuss some of the future research directions for
implementing the next generation of MapReduce-like solutions.Comment: arXiv admin note: text overlap with arXiv:1105.4252 by other author
Recommended from our members
Ray: A Distributed Execution Engine for the Machine Learning Ecosystem
In recent years, growing data volumes and more sophisticated computational procedures have greatly increased the demand for computational power. Machine learning and artificial intelligence applications, for example, are notorious for their computational requirements. At the same time, Moores law is ending and processor speeds are stalling. As a result, distributed computing has become ubiquitous. While the cloud makes distributed hardware infrastructure widely accessible and therefore offers the potential of horizontal scale, developing these distributed algorithms and applications remains surprisingly hard. This is due to the inherent complexity of concurrent algorithms, the engineering challenges that arise when communicating between many machines, the requirements like fault tolerance and straggler mitigation that arise at large scale and the lack of a general-purpose distributed execution engine that can support a wide variety of applications.In this thesis, we study the requirements for a general-purpose distributed computation model and present a solution that is easy to use yet expressive and resilient to faults. At its core our model takes familiar concepts from serial programming, namely functions and classes, and generalizes them to the distributed world, therefore unifying stateless and stateful distributed computation. This model not only supports many machine learning workloads like training or serving, but is also a good t for cross-cutting machine learning applications like reinforcement learning and data processing applications like streaming or graph processing. We implement this computational model as an open-source system called Ray, which matches or exceeds the performance of specialized systems in many application domains, while also offering horizontally scalability and strong fault tolerance properties
Recommended from our members
Enabling Privacy and Trust in Edge AI Systems
Recent advances in mobile computing and the Internet of Things (IoT) enable the global integration of heterogeneous smart devices via wireless networks. A common characteristic across these modern day systems is their ability to collect and communicate streaming data, making machine learning (ML) appealing for processing, reasoning, and predicting about the environment. More recently, low network latency requirements have made offloading intelligence to the cloud undesirable. These novel requirements have led to the emergence of edge computing, an approach that brings computation closer to the device with low latency, high throughput, and enhanced reliability. Together, they enable ML-powered information processing and control pipelines spanning end devices, edge computing, and cloud environments. However, continuous collaboration between cloud, edge and device is susceptible to information leakage and loss, leading to insecure and unreliable operation. This raises an important question: how can we design, develop, and evaluate high-performing ML systems that are trustworthy and privacy-preserving in resource-constrained edge environments? In this thesis, I address this question by designing and implementing privacy-preserving and trustworthy ML systems for distributed applications. I first introduce a system that establishes trust in the explanations generated from a popular visualization technique, saliency maps, using counterfactual reasoning. Through the proposed evaluation system, I assess the degree to which hypothesized explanations correspond to the semantics of edge-based reinforcement learning environments. Second, I examine the privacy implications of personalized models in distributed mobile services by proposing time-series based model inversion attacks. To thwart such attacks, I present a distributed framework, Pelican, that learns and deploys transfer learning-based personalized ML models in a privacy preserving manner on resource-constrained mobile devices. Third, I investigate ML models that are deployed on local devices for inference and highlight the ease with which proprietary information embedded in these models can be exposed. For mitigating such attacks, I present a secure on-device application framework, SODA, which is supported by real-time adversarial detection. Finally, I present an end-to-end privacy-aware system for a real-world application to model group interaction behavior via mobility sensing. The proposed system, W4-Groups, distributes computation across device, edge, and cloud resources to strengthen its privacy and trustworthiness guarantees
Vispark: GPU-accelerated distributed visual computing using spark
With the growing need of big-data processing in diverse application domains, MapReduce (e.g., Hadoop) has become one of the standard computing paradigms for large-scale computing on a cluster system. Despite its popularity, the current MapReduce framework suffers from inflexibility and inefficiency inherent to its programming model and system architecture. In order to address these problems, we propose Vispark, a novel extension of Spark for GPU-accelerated MapReduce processing on array-based scientific computing and image processing tasks. Vispark provides an easy-to-use, Python-like high-level language syntax and a novel data abstraction for MapReduce programming on a GPU cluster system. Vispark introduces a programming abstraction for accessing neighbor data in the mapper function, which greatly simplifies many image processing tasks using MapReduce by reducing memory footprints and bypassing the reduce stage. Vispark provides socket-based halo communication that synchronizes between data partitions transparently from the users, which is necessary for many scientific computing problems in distributed systems. Vispark also provides domain-specific functions and language supports specifically designed for high-performance computing and image processing applications. We demonstrate the performance of our prototype system on several visual computing tasks, such as image processing, volume rendering, K-means clustering, and heat transfer simulation.clos
A Model for Scientific Workflows with Parallel and Distributed Computing
In the last decade we witnessed an immense evolution of the computing infrastructures
in terms of processing, storage and communication. On one hand, developments in hardware architectures have made it possible to run multiple virtual machines on a single physical machine. On the other hand, the increase of the available network communication bandwidth has enabled the widespread use of distributed computing infrastructures, for example based on clusters, grids and clouds. The above factors enabled different scientific communities to aim for the development and implementation of complex scientific applications possibly involving large amounts of data. However, due to their structural complexity, these applications require decomposition models to allow multiple tasks running in parallel and distributed environments.
The scientific workflow concept arises naturally as a way to model applications composed of multiple activities. In fact, in the past decades many initiatives have been
undertaken to model application development using the workflow paradigm, both in
the business and in scientific domains. However, despite such intensive efforts, current
scientific workflow systems and tools still have limitations, which pose difficulties to the
development of emerging large-scale, distributed and dynamic applications.
This dissertation proposes the AWARD model for scientific workflows with parallel
and distributed computing. AWARD is an acronym for Autonomic Workflow Activities
Reconfigurable and Dynamic.
The AWARD model has the following main characteristics.
It is based on a decentralized execution control model where multiple autonomic
workflow activities interact by exchanging tokens through input and output ports. The
activities can be executed separately in diverse computing environments, such as in a
single computer or on multiple virtual machines running on distributed infrastructures,
such as clusters and clouds.
It provides basic workflow patterns for parallel and distributed application decomposition and other useful patterns supporting feedback loops and load balancing. The model is suitable to express applications based on a finite or infinite number of iterations, thus allowing to model long-running workflows, which are typical in scientific experimention. A distintive contribution of the AWARD model is the support for dynamic reconfiguration
of long-running workflows. A dynamic reconfiguration allows to modify the
structure of the workflow, for example, to introduce new activities, modify the connections
between activity input and output ports. The activity behavior can also be modified,
for example, by dynamically replacing the activity algorithm.
In addition to the proposal of a new workflow model, this dissertation presents the
implementation of a fully functional software architecture that supports the AWARD
model. The implemented prototype was used to validate and refine the model across
multiple workflow scenarios whose usefulness has been demonstrated in practice clearly, through experimental results, demonstrating the advantages of the major characteristics and contributions of the AWARD model. The implemented prototype was also used to develop application cases, such as a workflow to support the implementation of the MapReduce model and a workflow to support a text mining application developed by an external user.
The extensive experimental work confirmed the adequacy of the AWARD model and
its implementation for developing applications that exploit parallelism and distribution
using the scientific workflows paradigm
Improving performance of blackboard systems
In this thesis, we deal with blackboard system performance issues. We show that
blackboard system performance can be improved using parallel processing strategies
and a novel blackboard architecture.We study traditional blackboard architectures using a novel performance frame¬
work. This is a useful tool for directing system optimisation efforts. We present the
analysis of four blackboard systems present in the literature.nalysis of four blackboard systems present in the literature.
Besides localised optimisation efforts, one of the most promising approaches for
improving blackboard system performance is the use of parallel processing techniques.
However, traditional blackboard architectures present both data and control contention
when implemented in parallel.In this thesis we present a novel blackboard architecture, the Active Blackboard
Architecture (ABB). We based ABB on a novel variation of the traditional "Blackboard
and Experts" metaphor, called "Blackboard, Experts and Desks". This new metaphor
introduces a new element, the desks, used by the experts to perform their work.The ABB architecture is based on an active blackboard, capable of processing on its
own, and a decentralised control model. This avoids control contention and bottlenecks.
We describe this architecture using the Z specification language, and implemented
and evaluated in the EPCC Meiko Computing Surface, a multi-transputer distributed
memory parallel machine.The ABB Parallel prototype is an object oriented implementation of the ABB model
that overcomes both data and control bottlenecks by having a distributed blackboard
and using the ABB control model. Based on a series of experiments, we show that the
new architecture allows to achieve much greater effective parallelism in a blackboard
system. We also present some ways in which the system can be tailored to specific
application needs, improving in this way its overall performance
Combining K-Means and XGBoost Models for Anomaly Detection Using Log Datasets
Abstract: Computing and networking systems traditionally record their activity in log files, which have been used for multiple purposes, such as troubleshooting, accounting, post-incident analysis of security breaches, capacity planning and anomaly detection. In earlier systems those log files were processed manually by system administrators, or with the support of basic applications for filtering, compiling and pre-processing the logs for specific purposes. However, as the volume of these log files continues to grow (more logs per system, more systems per domain), it is becoming increasingly difficult to process those logs using traditional tools, especially for less straightforward purposes such as anomaly detection. On the other hand, as systems continue to become more complex, the potential of using large datasets built of logs from heterogeneous sources for detecting anomalies without prior domain knowledge becomes higher. Anomaly detection tools for such scenarios face two challenges. First, devising appropriate data analysis solutions for effectively detecting anomalies from large data sources, possibly without prior domain knowledge. Second, adopting data processing platforms able to cope with the large datasets and complex data analysis algorithms required for such purposes. In this paper we address those challenges by proposing an integrated scalable framework that aims at efficiently detecting anomalous events on large amounts of unlabeled data logs. Detection is supported by clustering and classification methods that take advantage of parallel computing environments. We validate our approach using the the well known NASA Hypertext Transfer Protocol (HTTP) logs datasets. Fourteen features were extracted in order to train a k-means model for separating anomalous and normal events in highly coherent clusters. A second model, making use of the XGBoost system implementing a gradient tree boosting algorithm, uses the previous binary clustered data for producing a set of simple interpretable rules. These rules represent the rationale for generalizing its application over a massive number of unseen events in a distributed computing environment. The classified anomaly events produced by our framework can be used, for instance, as candidates for further forensic and compliance auditing analysis in security management.info:eu-repo/semantics/publishedVersio
Recommended from our members
A grid computing framework for commercial simulation packages
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.An increased need for collaborative research among different organizations, together with continuing advances in communication technology and computer hardware, has facilitated the development of distributed systems that can provide users non-trivial access to geographically dispersed computing resources (processors, storage, applications, data, instruments, etc.) that are administered in multiple computer domains. The term grid computing or grids is popularly used to refer to such distributed systems. A broader definition of grid computing includes the use of computing resources within an organization for running organization-specific applications. This research is in the context of using grid computing within an enterprise to maximize the use of available hardware and software resources for processing enterprise applications. Large scale scientific simulations have traditionally been the primary benefactor of grid computing. The application of this technology to simulation in industry has, however, been negligible. This research investigates how grid technology can be effectively exploited by simulation practitioners using Windows-based commercially available simulation packages to model simulations in industry. These packages are commonly referred to as Commercial Off-The-Shelf (COTS) Simulation Packages (CSPs). The study identifies several higher level grid services that could be potentially used to support the practise of simulation in industry. It proposes a grid computing framework to investigate these services in the context of CSP-based simulations. This framework is called the CSP-Grid Computing (CSP-GC) Framework. Each identified higher level grid service in this framework is referred to as a CSP-specific service. A total of six case studies are presented to experimentally evaluate how grid computing technologies can be used together with unmodified simulation packages to support some of the CSP-specific services. The contribution of this thesis is the CSP-GC framework that identifies how simulation practise in industry may benefit from the use of grid technology. A further contribution is the recognition of specific grid computing software (grid middleware) that can possibly be used together with existing CSPs to provide grid support. With its focus on end-users and end-user tools, it is intended that this research will encourage wider adoption of grid computing in the workplace and that simulation users will derive benefit from using this technology
Theory of Resource Allocation for Robust Distributed Computing
Lately, distributed computing (DC) has emerged in several application scenarios such as grid computing, high-performance and reconfigurable computing, wireless sensor networks, battle management systems, peer-to-peer networks, and donation grids. When DC is performed in these scenarios, the distributed computing system (DCS) supporting the applications not only exhibits heterogeneous computing resources and a significant communication latency, but also becomes highly dynamic due to the communication network as well as the computing servers are affected by a wide class of anomalies that change the topology of the system in a random fashion. These anomalies exhibit spatial and/or temporal correlation when they result, for instance, from wide-area power or network outages These correlated failures may not only inflict a large amount of damage to the system, but they may also induce further failures in other servers as a result of the lack of reliable communication between the components of the DCS. In order to provide a robust DC environment in the presence of component failures, it is key to develop a general framework for accurately modeling the complex dynamics of a DCS. In this dissertation a novel approach has been undertaken for modeling a general class of DCSs and for analytically characterizing the performance and reliability of parallel applications executed on such systems. A general probabilistic model has been constructed by assuming that the random times governing the dynamics of the DCS follow arbitrary probability distributions with heterogeneous parameters. Auxiliary age variables have been introduced in the modeling of a DCS and a hybrid continuous and discrete state-space model the system has been constructed. This hybrid model has enabled the development of an age-dependent stochastic regeneration theory, which, in turn, has been employed to analytically characterize the average execution time, the quality-of-service and the reliability in serving an application. These are three metrics of performance and reliability of practical interest in DC. Analytical approximations as well as mathematical lower and upper bounds for these metrics have also been derived in an attempt to reduce the amount of computational resources demanded by the exact characterizations. In order to systematically assess the reliability of DCSs in the presence of correlated component failures, a novel probabilistic model for spatially correlated failures has been developed. The model, based on graph theory and Markov random fields, captures both geographical and logical correlations induced by the arbitrary topology of the communication network of a DCS. The modeling framework, in conjunction with a general class of dynamic task reallocation (DTR) control policies, has been used to optimize the performance and reliability of applications in the presence of independent as well as spatially correlated anomalies. Theoretical predictions, Monte- Carlo simulations as well as experimental results have shown that optimizing these metrics can significantly impact the performance of a DCS. Moreover, the general setting developed here has shed insights on: (i) the effect of different stochastic mod- els on the accuracy of the performance and reliability metrics, (ii) the dependence of the DTR policies on system parameters such as failure rates and task-processing rates, (iii) the severe impact of correlated failures on the reliability of DCSs, (iv) the dependence of the DTR policies on degree of correlation in the failures, and (v) the fundamental trade-off between minimizing the execution time of an application and maximizing its reliability
- …