Search CORE

24,441 research outputs found

Managing Response Time in a Call-Routing Problem with Service Failure

Author: Francis de Véricourt
Heskett J.
Lin W.
Mandelbaum A.
Yong-Pin Zhou
Publication venue: 'Institute for Operations Research and the Management Sciences (INFORMS)'
Publication date
Field of study

A peer-to-peer infrastructure for resilient web services

Author: Dearle Alan
Kirby Graham N. C.
Norcross Stuart J.
Walker Scott M.
Publication venue: IEEE Computer Society
Publication date: 01/01/2005
Field of study

This work is funded by GR/M78403 “Supporting Internet Computation in Arbitrary Geographical Locations” and GR/R51872 “Reflective Application Framework for Distributed Architectures”, and by Nuffield Grant URB/01597/G “Peer-to-Peer Infrastructure for Autonomic Storage Architectures”This paper describes an infrastructure for the deployment and use of Web Services that are resilient to the failure of the nodes that host those services. The infrastructure presents a single interface that provides mechanisms for users to publish services and to find hosted services. The infrastructure supports the autonomic deployment of services and the brokerage of hosts on which services may be deployed. Once deployed, services are autonomically managed in a number of aspects including load balancing, availability, failure detection and recovery, and lifetime management. Services are published and deployed with associated metadata describing the service type. This same metadata may be used subsequently by interested parties to discover services. The infrastructure uses peer-to-peer (P2P) overlay technologies to abstract over the underlying network to deploy and locate instances of those services. It takes advantage of the P2P network to replicate directory services used to locate service instances (for using a service), Service Hosts (for deployment of services) and Autonomic Managers which manage the deployed services. The P2P overlay network is itself constructed using novel Web Services-based middleware and a variation of the Chord P2P protocol, which is self-managing.Postprin

CiteSeerX

University of St. Andrews - Pure

St Andrews Research Repository

A Peer-to-Peer Middleware Framework for Resilient Persistent Programming

Author: Dearle Alan
Kirby Graham
McCarthy Andrew
Norcross Stuart
Publication venue
Publication date: 18/06/2010
Field of study

The persistent programming systems of the 1980s offered a programming model that integrated computation and long-term storage. In these systems, reliable applications could be engineered without requiring the programmer to write translation code to manage the transfer of data to and from non-volatile storage. More importantly, it simplified the programmer's conceptual model of an application, and avoided the many coherency problems that result from multiple cached copies of the same information. Although technically innovative, persistent languages were not widely adopted, perhaps due in part to their closed-world model. Each persistent store was located on a single host, and there were no flexible mechanisms for communication or transfer of data between separate stores. Here we re-open the work on persistence and combine it with modern peer-to-peer techniques in order to provide support for orthogonal persistence in resilient and potentially long-running distributed applications. Our vision is of an infrastructure within which an application can be developed and distributed with minimal modification, whereupon the application becomes resilient to certain failure modes. If a node, or the connection to it, fails during execution of the application, the objects are re-instantiated from distributed replicas, without their reference holders being aware of the failure. Furthermore, we believe that this can be achieved within a spectrum of application programmer intervention, ranging from minimal to totally prescriptive, as desired. The same mechanisms encompass an orthogonally persistent programming model. We outline our approach to implementing this vision, and describe current progress.Comment: Submitted to EuroSys 200

arXiv.org e-Print Archive

St Andrews Research Repository

A recursive approach to network management

Author: Matta Ibrahim
Wang Yuefeng
Publication venue: Computer Science Department, Boston University
Publication date: 14/12/2015
Field of study

Nowadays there is an increasing need for a general management paradigm which can simplify network management and further enable network innovations. In this paper, in response to limitations of current Software Defined Networking (SDN) management solutions, we propose a recursive approach to enterprise network management, where network management is done through managing various Virtual Transport Networks (VTNs). Different from the traditional virtual network model which mainly focuses on routing/tunneling, our VTN provides communication service with explicit Quality-of-Service (QoS) support for applications via transport flows, and it involves all mechanisms (e:g:, routing, addressing, error and flow control, resource allocation) needed to support such transport flows. Based on this approach, we design and implement a management layer, which recurses the same VTN-based management mechanism for enterprise network management. Comparing with an SDN-based management approach, our experimental results show that our management layer achieves better network performance

Boston University Institutional Repository (OpenBU)

Proactive cloud management for highly heterogeneous multi-cloud infrastructures

Author: Avresky Dimiter R.
DI SANZO Pierangelo
Pellegrini Alessandro
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016
Field of study

Various literature studies demonstrated that the cloud computing paradigm can help to improve availability and performance of applications subject to the problem of software anomalies. Indeed, the cloud resource provisioning model enables users to rapidly access new processing resources, even distributed over different geographical regions, that can be promptly used in the case of, e.g., crashes or hangs of running machines, as well as to balance the load in the case of overloaded machines. Nevertheless, managing a complex geographically-distributed cloud deploy could be a complex and time-consuming task. Autonomic Cloud Manager (ACM) Framework is an autonomic framework for supporting proactive management of applications deployed over multiple cloud regions. It uses machine learning models to predict failures of virtual machines and to proactively redirect the load to healthy machines/cloud regions. In this paper, we study different policies to perform efficient proactive load balancing across cloud regions in order to mitigate the effect of software anomalies. These policies use predictions about the mean time to failure of virtual machines. We consider the case of heterogeneous cloud regions, i.e regions with different amount of resources, and we provide an experimental assessment of these policies in the context of ACM Framework

Archivio della Ricerca - Università di Roma 3

ART

Archivio della ricerca- Università di Roma La Sapienza

Unattended network operations technology assessment study. Technical support for defining advanced satellite systems concepts

Author: Holdridge Mark
Jaworski Allan
Morgan Herbert K.
Odubiyi Jide
Price Kent M.
Publication venue
Publication date
Field of study

The results are summarized of an unattended network operations technology assessment study for the Space Exploration Initiative (SEI). The scope of the work included: (1) identified possible enhancements due to the proposed Mars communications network; (2) identified network operations on Mars; (3) performed a technology assessment of possible supporting technologies based on current and future approaches to network operations; and (4) developed a plan for the testing and development of these technologies. The most important results obtained are as follows: (1) addition of a third Mars Relay Satellite (MRS) and MRS cross link capabilities will enhance the network's fault tolerance capabilities through improved connectivity; (2) network functions can be divided into the six basic ISO network functional groups; (3) distributed artificial intelligence technologies will augment more traditional network management technologies to form the technological infrastructure of a virtually unattended network; and (4) a great effort is required to bring the current network technology levels for manned space communications up to the level needed for an automated fault tolerance Mars communications network

NASA Technical Reports Server

Multi-layer virtual transport network management

Author: Matta Ibrahim
Wang Yuefeng
Publication venue: Department of Computer Science, Boston University
Publication date: 01/01/2017
Field of study

Nowadays there is an increasing need for a general paradigm which can simplify network management and further enable network innovations. Software Defined Networking (SDN) is an efficient way to make the network programmable and reduce management complexity, however it is plagued with limitations inherited from the legacy Internet (TCP/IP) architecture. In this paper, in response to limitations of current Software Defined Networking (SDN) management solutions, we propose a recursive approach to enterprise network management, where network management is done through managing various Virtual Transport Networks (VTNs) over different scopes (i.e., regions of operation). Different from the traditional virtual network model which mainly focuses on routing/tunneling, our VTN provides communication service with explicit Quality-of-Service (QoS) support for applications via transport flows, and it involves all mechanisms (e.g., addressing, routing, error and flow control, resource allocation) needed to support such transport flows. Based on this approach, we design and implement a management architecture, which recurses the same VTN-based management mechanism for enterprise network management. Our experimental results show that our management architecture achieves better performance.National Science Foundation awards: CNS-0963974 and CNS-1346688

Boston University Institutional Repository (OpenBU)

Resilience markers for safer systems and organisations

Author: Back J.
Blandford A.
Furniss D.
Hildebrandt M.
Publication venue: Springer Verlag
Publication date: 20/09/2008
Field of study

If computer systems are to be designed to foster resilient performance it is important to be able to identify contributors to resilience. The emerging practice of Resilience Engineering has identified that people are still a primary source of resilience, and that the design of distributed systems should provide ways of helping people and organisations to cope with complexity. Although resilience has been identified as a desired property, researchers and practitioners do not have a clear understanding of what manifestations of resilience look like. This paper discusses some examples of strategies that people can adopt that improve the resilience of a system. Critically, analysis reveals that the generation of these strategies is only possible if the system facilitates them. As an example, this paper discusses practices, such as reflection, that are known to encourage resilient behavior in people. Reflection allows systems to better prepare for oncoming demands. We show that contributors to the practice of reflection manifest themselves at different levels of abstraction: from individual strategies to practices in, for example, control room environments. The analysis of interaction at these levels enables resilient properties of a system to be ‘seen’, so that systems can be designed to explicitly support them. We then present an analysis of resilience at an organisational level within the nuclear domain. This highlights some of the challenges facing the Resilience Engineering approach and the need for using a collective language to articulate knowledge of resilient practices across domains

UCL Discovery