Software engineering and middleware: a roadmap (Invited talk)
The construction of a large class of distributed systems can be simplified by leveraging middleware, which is layered between network operating systems and application components. Middleware resolves heterogeneity and facilitates communication and coordination of distributed components. Existing middleware products enable software engineers to build systems that are distributed across a local-area network. State-of-the-art middleware research aims to push this boundary towards Internet-scale distribution, adaptive and reconfigurable middleware and middleware for dependable and wireless systems. The challenge for software engineering research is to devise notations, techniques, methods and tools for distributed system construction that systematically build and exploit the capabilities that middleware delivers.
Classes of Byzantine Fault-Tolerant Algorithms for Dependable Distributed Systems.
This thesis concentrates on the design of new algorithms for fault-tolerant systems based on system-level hardware masking redundancy. It is argued that any system in which a reliability improvement of at least a factor of 100 is required should be based on system-level hardware masking redundancy. The technique of system-level hardware masking redundancy is applicable in a redundant system consisting of a number of processors, in which the system services are replicated on the different processors, and provides resilience to a limited number of faulty processors in the system. The technique is most effective in a distributed system, since the autonomous nature and geographical distribution of the processors in such a system largely contribute to achieving independence between failures of different processors, which improves the reliability of the system.
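The masking idea in this abstract can be illustrated with a simple voter. This is only a minimal sketch, not one of the thesis's Byzantine fault-tolerant algorithms: it assumes a single trusted voter that sees every replica's output, and the function name `majority_vote` is hypothetical.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value reported by a strict majority of replicas.

    Returns None when no value has a strict majority, i.e. when the
    faulty replicas can no longer be masked by the correct ones.
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None


# Three replicated processors, one of which returns a wrong result.
masked = majority_vote([42, 42, 7])
```

With three replicas the voter masks one faulty processor; masking more faults requires more replicas, which is the "limited number of faulty processors" mentioned above.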
Multiparty interactions in dependable distributed systems
PhD Thesis
With the expansion of computer networks, activities involving computer communication
are becoming more and more distributed. Such distribution can
include processing, control, data, network management, and security. Although
distribution can improve the reliability of a system by replicating
components, sometimes an increase in distribution can introduce some undesirable
faults. To reduce the risk of introducing faults when distributing applications, and to
improve the chances of removing and tolerating them, it is important
that distributed systems are implemented in an organized way.
As in sequential programming, complexity in distributed, in particular
parallel, program development can be managed by providing appropriate
programming language constructs. Language constructs can help both by
supporting encapsulation so as to prevent unwanted interactions between
program components and by providing higher-level abstractions that reduce
programmer effort by allowing compilers to handle mundane, error-prone
aspects of parallel program implementation.
A language construct that supports encapsulation of interactions between
multiple parties (objects or processes) is referred to in the literature as multiparty
interaction. In a multiparty interaction, several parties somehow "come
together" to produce an intermediate and temporary combined state, use this
state to execute some activity, and then leave the interaction and continue
their normal execution.
There has been a lot of work in the past years on multiparty interaction,
but most of it has been concerned with synchronisation, or handshaking,
between parties rather than the encapsulation of several activities executed
in parallel by the interaction participants. The programmer is therefore left
responsible for ensuring that the processes involved in a cooperative activity
do not interfere with, or suffer interference from, other processes not involved
in the activity.
Furthermore, none of this work has discussed the provision of features
that would facilitate the design of multiparty interactions that are expected
to cope with faults - whether in the environment that the computer system
has to deal with, in the operation of the underlying computer hardware or
software, or in the design of the processes that are involved in the interaction.
In this thesis the concept of multiparty interaction is integrated with
the concept of exception handling in concurrent activities. The final result
is a language in which the concept of multiparty interaction is extended
by providing it with a mechanism to handle concurrent exceptions. This
extended concept is called dependable multiparty interaction.
The features and requirements for multiparty interaction and exception
handling provided in a set of languages surveyed in this thesis are integrated
to describe the new dependable multiparty interaction construct. Additionally,
object-oriented architectures for dependable multiparty interactions are
described, and a full implementation of one of the architectures is provided.
This implementation is then applied to a set of case studies. The case studies
show how dependable multiparty interactions can be used to design and
implement a safety-critical system, a multiparty programming abstraction,
and a parallel computation model.
Brazilian Research Agency CNPq
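The shape of a dependable multiparty interaction, as the abstract describes it, can be sketched in a few lines. This is an illustrative approximation under stated assumptions, not the language or architecture developed in the thesis: the names `run_interaction`, `resolve` and `handler` are hypothetical, roles run as threads, and the combined temporary state is a plain dictionary.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Role = Callable[[Dict[str, object]], None]


def run_interaction(
    roles: List[Role],
    resolve: Callable[[List[BaseException]], BaseException],
    handler: Callable[[BaseException, Dict[str, object]], None],
) -> Dict[str, object]:
    """Run all roles concurrently on a shared temporary state.

    Exceptions raised concurrently by different roles are collected,
    resolved into a single exception, and handled once for the whole
    interaction before any participant leaves it.
    """
    shared: Dict[str, object] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(roles))) as pool:
        futures = [pool.submit(role, shared) for role in roles]
        # Future.exception() blocks until that role has finished.
        errors = [f.exception() for f in futures]
    raised = [e for e in errors if e is not None]
    if raised:
        handler(resolve(raised), shared)
    return shared


def read_sensor(state: Dict[str, object]) -> None:
    state["temperature"] = 20


def failing_actuator(state: Dict[str, object]) -> None:
    raise ValueError("actuator did not respond")


handled: List[str] = []
final_state = run_interaction(
    [read_sensor, failing_actuator],
    resolve=lambda errs: errs[0],  # assumed resolution rule: first error wins
    handler=lambda exc, state: handled.append(type(exc).__name__),
)
```

The point of the sketch is the ordering: participants enter together, no one leaves until every role has finished, and concurrent exceptions are reduced to one before handling.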
Dependable Control for Wireless Distributed Control Systems
The use of wireless communications for real-time control applications poses several problems related to the comparatively low reliability of the communication channels. This paper is concerned with adaptive and predictive application-level strategies for ameliorating the effects of packet losses and burst errors in industrial sampled-data Distributed Control Systems (DCSs), which are implemented via one or more wireless and/or wired links, possibly spanning multiple hops. The paper describes an adaptive compensator that reconstructs the best estimates (in a least squares sense) of a sequence of one or more missing sensor node data packets in the controller node. At each sample time, the controller node calculates the current control, and a prediction of future controls to apply over a short time horizon; these controls are forwarded to the actuator node every sample time step. A simple design method for a digital Proportional Integral Derivative (PID)-like adaptive controller is also described for use in the controller node. Together these mechanisms give robustness to packet losses around the control loop; in addition, the majority of the computational overhead resides in the controller node. An implementation of the proposed techniques is applied to a case study using a Hardware in the Loop (HIL) test facility, and favorable results (in terms of both performance and computational overheads) are found when compared to an existing robust control method for a DCS experiencing artificially induced burst errors
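The forwarding of a current control plus a short horizon of predicted controls suggests a buffer on the actuator side. The sketch below is an assumption about how such a buffer might behave, not the paper's implementation: the class name `ActuatorBuffer` is hypothetical, and holding the last applied value once the horizon is exhausted is a choice made here for illustration.

```python
from typing import List, Optional, Sequence


class ActuatorBuffer:
    """Applies the newest control, falling back to predictions on loss."""

    def __init__(self, fallback: float = 0.0) -> None:
        self._pending: List[float] = []  # predicted controls not yet used
        self._last = fallback            # last control actually applied

    def step(self, packet: Optional[Sequence[float]]) -> float:
        """Advance one sample time.

        `packet` is [current control, predicted controls...] when a packet
        arrived this sample, or None when it was lost.
        """
        if packet:
            self._last = packet[0]
            self._pending = list(packet[1:])
        elif self._pending:
            self._last = self._pending.pop(0)
        # Otherwise the horizon is exhausted: hold the last applied control.
        return self._last


buf = ActuatorBuffer()
applied = [
    buf.step([1.0, 0.8, 0.6]),  # packet received
    buf.step(None),             # lost: use first prediction
    buf.step(None),             # lost: use second prediction
    buf.step(None),             # lost: horizon exhausted, hold
]
```

A burst of up to as many losses as the horizon is long is covered by predictions; longer bursts degrade to a hold.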
Towards Adaptable and Adaptive Policy-Free Middleware
We believe that to fully support adaptive distributed applications,
middleware must itself be adaptable, adaptive and policy-free. In this paper we
present a new language-independent adaptable and adaptive policy framework
suitable for integration in a wide variety of middleware systems. This
framework facilitates the construction of adaptive distributed applications.
The framework addresses adaptability through its ability to represent a wide
range of specific middleware policies. Adaptiveness is supported by a rich
contextual model, through which an application programmer may control precisely
how policies should be selected for any particular interaction with the
middleware. A contextual pattern mechanism facilitates the succinct expression
of both coarse- and fine-grain policy contexts. Policies may be specified and
altered dynamically, and may themselves take account of dynamic conditions. The
framework contains no hard-wired policies; instead, all policies can be
configured.
Comment: Submitted to Dependable and Adaptive Distributed Systems Track, ACM SAC 200
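Selecting a policy through a contextual pattern could look roughly like the following. This is a hypothetical illustration only: the abstract does not specify the framework's API, so `PolicyRegistry`, the slash-separated context strings and the glob-style patterns are all assumptions.

```python
from fnmatch import fnmatchcase
from typing import Callable, Dict, List, Tuple

# A policy may inspect dynamic conditions when it is applied.
Policy = Callable[[Dict[str, float]], str]


class PolicyRegistry:
    """Maps context patterns to policies; nothing is hard-wired."""

    def __init__(self) -> None:
        self._rules: List[Tuple[str, Policy]] = []

    def set_policy(self, pattern: str, policy: Policy) -> None:
        # Rules can be added at run time; newer rules are checked first.
        self._rules.insert(0, (pattern, policy))

    def select(self, context: str) -> Policy:
        for pattern, policy in self._rules:
            if fnmatchcase(context, pattern):
                return policy
        raise LookupError(f"no policy configured for context {context!r}")


registry = PolicyRegistry()
registry.set_policy("*", lambda cond: "reliable")  # coarse-grain rule
registry.set_policy(                               # fine-grain rule
    "video/*",
    lambda cond: "best-effort" if cond.get("loss", 0.0) < 0.05 else "reliable",
)

video_choice = registry.select("video/stream/send")({"loss": 0.01})
chat_choice = registry.select("chat/send")({})
```

The coarse pattern `*` covers every interaction, while `video/*` narrows the choice for one part of the application and lets the selected policy react to dynamic conditions.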
Survivor: An Approach for Adding Dependability to Legacy Workflow Systems
Although they often provide critical services, most workflow systems are not dependable. There has been much literature on dependable/survivable distributed systems; most is concerned with developing new architectures, not adapting pre-existing ones. Additionally, the literature is focused on hardening, security-based defense, as opposed to recovery. For deployed systems, it is often infeasible to completely replace existing infrastructures; what is more pragmatic are ways in which existing distributed systems can be adapted to offer better dependability. In this paper, we outline a general architecture that can easily be retrofitted to legacy workflow systems in order to improve dependability and fault tolerance. We do this by monitoring enactment and replicating partial workflow states as tools for detection, analysis and recovery. We discuss some policies that can guide these mechanisms. Finally, we describe and evaluate our implementation, Survivor, which modified an existing workflow system provided by the Naval Research Lab
Solving k-Set Agreement with Stable Skeleton Graphs
In this paper we consider the k-set agreement problem in distributed
message-passing systems using a round-based approach: Both synchrony of
communication and failures are captured just by means of the messages that
arrive within a round, resulting in round-by-round communication graphs that
can be characterized by simple communication predicates. We introduce the weak
communication predicate PSources(k) and show that it is tight for k-set
agreement, in the following sense: We (i) prove that there is no algorithm for
solving (k-1)-set agreement in systems characterized by PSources(k), and (ii)
present a novel distributed algorithm that achieves k-set agreement in runs
where PSources(k) holds. Our algorithm uses local approximations of the stable
skeleton graph, which reflects the underlying perpetual synchrony of a run. We
prove that this approximation is correct in all runs, regardless of the
communication predicate, and show that graph-theoretic properties of the stable
skeleton graph can be used to solve k-set agreement if PSources(k) holds.
Comment: to appear in 16th IEEE Workshop on Dependable Parallel, Distributed and Network-Centric System
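One way to picture a skeleton of round-by-round communication graphs is as the set of edges that appear in every round observed so far. The sketch below assumes that reading; the abstract does not give the formal definition, and the function name `skeleton` and the edge encoding are hypothetical. It also works on global round graphs, whereas the paper's algorithm uses local approximations.

```python
from typing import Iterable, Set, Tuple

# An edge (p, q) means q received p's message in that round.
Edge = Tuple[int, int]


def skeleton(round_graphs: Iterable[Set[Edge]]) -> Set[Edge]:
    """Return the edges present in every round graph seen so far."""
    graphs = iter(round_graphs)
    try:
        stable = set(next(graphs))
    except StopIteration:
        return set()
    for graph in graphs:
        stable &= graph  # an edge missing in any round is not stable
    return stable


rounds = [
    {(1, 2), (2, 3), (3, 1)},
    {(1, 2), (2, 3)},
    {(1, 2), (3, 1), (2, 3)},
]
stable_edges = skeleton(rounds)
```

Because edges can only be removed as more rounds arrive, the skeleton computed over a prefix of a run always contains the skeleton of the whole run.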