A peer-to-peer infrastructure for resilient web services
This work is funded by GR/M78403 "Supporting Internet Computation in Arbitrary Geographical Locations" and GR/R51872 "Reflective Application Framework for Distributed Architectures", and by Nuffield Grant URB/01597/G "Peer-to-Peer Infrastructure for Autonomic Storage Architectures". This paper describes an infrastructure for the deployment and use of Web Services that are resilient to the failure of the nodes that host those services. The infrastructure presents a single interface that provides mechanisms for users to publish services and to find hosted services. The infrastructure supports the autonomic deployment of services and the brokerage of hosts on which services may be deployed. Once deployed, services are autonomically managed in a number of aspects including load balancing, availability, failure detection and recovery, and lifetime management. Services are published and deployed with associated metadata describing the service type. This same metadata may be used subsequently by interested parties to discover services. The infrastructure uses peer-to-peer (P2P) overlay technologies to abstract over the underlying network to deploy and locate instances of those services. It takes advantage of the P2P network to replicate directory services used to locate service instances (for using a service), Service Hosts (for deployment of services) and Autonomic Managers which manage the deployed services. The P2P overlay network is itself constructed using novel Web Services-based middleware and a variation of the Chord P2P protocol, which is self-managing.
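As a rough illustration of how a Chord-style overlay can locate replicated directory entries, the sketch below places nodes and keys on a hashed identifier ring and assigns each key to its successor node plus the following nodes as replicas. It is a minimal sketch only: the names `Ring`, `ring_id`, `successors`, the 16-bit identifier space and the host and service names are assumptions made for illustration, it omits Chord's finger tables and stabilisation, and it is not the variation of Chord that the paper describes.

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier bits; an assumed value chosen for illustration


def ring_id(name: str) -> int:
    """Hash a node or key name onto the identifier ring [0, 2**M)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** M)


class Ring:
    """Hypothetical, simplified ring: every node knows all other nodes."""

    def __init__(self, node_names):
        # Keep nodes sorted by ring identifier so successors are adjacent.
        self.nodes = sorted((ring_id(n), n) for n in node_names)

    def successors(self, key: str, replicas: int = 1):
        """Return the nodes responsible for key: the first node whose
        identifier equals or follows the key's, then the next ones."""
        ids = [i for i, _ in self.nodes]
        start = bisect_left(ids, ring_id(key)) % len(self.nodes)
        count = min(replicas, len(self.nodes))
        return [self.nodes[(start + k) % len(self.nodes)][1]
                for k in range(count)]

    def remove(self, node_name: str):
        """Model a node failure by dropping it from the ring."""
        self.nodes = [(i, n) for i, n in self.nodes if n != node_name]


ring = Ring(["host-a", "host-b", "host-c", "host-d"])
owners = ring.successors("service:weather", replicas=3)
ring.remove(owners[0])  # the primary fails
survivors = ring.successors("service:weather", replicas=2)
```

When the primary node is removed, the lookup falls through to the node that already held the first replica, which is the property that lets a replicated directory keep answering after a host failure.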
A Peer-to-Peer Middleware Framework for Resilient Persistent Programming
The persistent programming systems of the 1980s offered a programming model
that integrated computation and long-term storage. In these systems, reliable
applications could be engineered without requiring the programmer to write
translation code to manage the transfer of data to and from non-volatile
storage. More importantly, it simplified the programmer's conceptual model of
an application, and avoided the many coherency problems that result from
multiple cached copies of the same information. Although technically
innovative, persistent languages were not widely adopted, perhaps due in part
to their closed-world model. Each persistent store was located on a single
host, and there were no flexible mechanisms for communication or transfer of
data between separate stores. Here we re-open the work on persistence and
combine it with modern peer-to-peer techniques in order to provide support for
orthogonal persistence in resilient and potentially long-running distributed
applications. Our vision is of an infrastructure within which an application
can be developed and distributed with minimal modification, whereupon the
application becomes resilient to certain failure modes. If a node, or the
connection to it, fails during execution of the application, the objects are
re-instantiated from distributed replicas, without their reference holders
being aware of the failure. Furthermore, we believe that this can be achieved
within a spectrum of application programmer intervention, ranging from minimal
to totally prescriptive, as desired. The same mechanisms encompass an
orthogonally persistent programming model. We outline our approach to
implementing this vision, and describe current progress. Comment: Submitted to EuroSys 200
A recursive approach to network management
There is an increasing need for a general management paradigm that can simplify network management and further enable network innovations. In this paper, in response to limitations of current Software Defined Networking (SDN) management solutions, we propose a recursive approach to enterprise network management, where network management is done through managing various Virtual Transport Networks (VTNs). Unlike the traditional virtual network model, which mainly focuses on routing/tunneling, our VTN provides communication service with explicit Quality-of-Service (QoS) support for applications via transport flows, and it involves all mechanisms (e.g., routing, addressing, error and flow control, resource allocation) needed to support such transport flows. Based on this approach, we design and implement a management layer, which recurses the same VTN-based management mechanism for enterprise network management. Compared with an SDN-based management approach, our experimental results show that our management layer achieves better network performance.
Proactive cloud management for highly heterogeneous multi-cloud infrastructures
Several studies in the literature have demonstrated that the cloud computing paradigm can help to improve the availability and performance of applications subject to software anomalies. Indeed, the cloud resource provisioning model enables users to rapidly access new processing resources, even distributed over different geographical regions, that can be promptly used in the case of, e.g., crashes or hangs of running machines, as well as to balance the load in the case of overloaded machines. Nevertheless, managing a geographically distributed cloud deployment can be a complex and time-consuming task. Autonomic Cloud Manager (ACM) Framework is an autonomic framework for supporting proactive management of applications deployed over multiple cloud regions. It uses machine learning models to predict failures of virtual machines and to proactively redirect the load to healthy machines/cloud regions. In this paper, we study different policies to perform efficient proactive load balancing across cloud regions in order to mitigate the effect of software anomalies. These policies use predictions about the mean time to failure of virtual machines. We consider the case of heterogeneous cloud regions, i.e., regions with different amounts of resources, and we provide an experimental assessment of these policies in the context of ACM Framework.
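To make the idea of an MTTF-driven balancing policy concrete, the following sketch splits incoming load across regions in proportion to the summed predicted mean time to failure of each region's virtual machines, so regions with more machines or healthier machines receive a larger share. This is one hypothetical policy written for illustration; the function name `region_shares`, the input layout and the example figures are assumptions, and it is not claimed to be one of the policies evaluated in the paper.

```python
def region_shares(predicted_mttf):
    """Return the fraction of load to send to each region.

    predicted_mttf maps a region name to a list holding the predicted
    mean time to failure (in seconds) of each VM in that region.
    Assumed policy: share is proportional to the region's summed MTTF.
    """
    totals = {region: sum(values) for region, values in predicted_mttf.items()}
    grand_total = sum(totals.values())
    if grand_total <= 0:
        # No usable prediction: fall back to an even split.
        return {region: 1.0 / len(totals) for region in totals}
    return {region: total / grand_total for region, total in totals.items()}


# Hypothetical example: a large region and a small one with equal per-VM MTTF.
shares = region_shares({"region-eu": [100.0, 100.0, 100.0],
                        "region-us": [100.0]})
```

Summing per-VM predictions is a simple way to account for heterogeneous regions, since a region with few machines cannot absorb the same load as a large one even when each of its machines looks healthy.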
Unattended network operations technology assessment study. Technical support for defining advanced satellite systems concepts
The results of an unattended network operations technology assessment study for the Space Exploration Initiative (SEI) are summarized. The scope of the work included: (1) identifying possible enhancements due to the proposed Mars communications network; (2) identifying network operations on Mars; (3) performing a technology assessment of possible supporting technologies based on current and future approaches to network operations; and (4) developing a plan for the testing and development of these technologies. The most important results obtained are as follows: (1) addition of a third Mars Relay Satellite (MRS) and MRS cross-link capabilities will enhance the network's fault tolerance through improved connectivity; (2) network functions can be divided into the six basic ISO network functional groups; (3) distributed artificial intelligence technologies will augment more traditional network management technologies to form the technological infrastructure of a virtually unattended network; and (4) a great effort is required to bring the current network technology levels for manned space communications up to the level needed for an automated, fault-tolerant Mars communications network.
Multi-layer virtual transport network management
There is an increasing need for a general paradigm that can simplify network management and further enable network innovations. Software Defined Networking (SDN) is an efficient way to make the network programmable and reduce management complexity; however, it is plagued with limitations inherited from the legacy Internet (TCP/IP) architecture. In this paper, in response to limitations of current SDN management solutions, we propose a recursive approach to enterprise network management, where network management is done through managing various Virtual Transport Networks (VTNs) over different scopes (i.e., regions of operation). Unlike the traditional virtual network model, which mainly focuses on routing/tunneling, our VTN provides communication service with explicit Quality-of-Service (QoS) support for applications via transport flows, and it involves all mechanisms (e.g., addressing, routing, error and flow control, resource allocation) needed to support such transport flows. Based on this approach, we design and implement a management architecture, which recurses the same VTN-based management mechanism for enterprise network management. Our experimental results show that our management architecture achieves better performance. National Science Foundation awards: CNS-0963974 and CNS-1346688.
Resilience markers for safer systems and organisations
If computer systems are to be designed to foster resilient
performance it is important to be able to identify contributors to resilience. The
emerging practice of Resilience Engineering has identified that people are still a
primary source of resilience, and that the design of distributed systems should
provide ways of helping people and organisations to cope with complexity.
Although resilience has been identified as a desired property, researchers and
practitioners do not have a clear understanding of what manifestations of
resilience look like. This paper discusses some examples of strategies that
people can adopt that improve the resilience of a system. Critically, analysis
reveals that the generation of these strategies is only possible if the system
facilitates them. As an example, this paper discusses practices, such as
reflection, that are known to encourage resilient behavior in people. Reflection
allows systems to better prepare for oncoming demands. We show that
contributors to the practice of reflection manifest themselves at different levels
of abstraction: from individual strategies to practices in, for example, control
room environments. The analysis of interaction at these levels enables resilient
properties of a system to be "seen", so that systems can be designed to explicitly
support them. We then present an analysis of resilience at an organisational
level within the nuclear domain. This highlights some of the challenges facing
the Resilience Engineering approach and the need for using a collective
language to articulate knowledge of resilient practices across domains.