484 research outputs found

    Fast and Reliable IP Recovery for Overlay Routing in Mission Critical Message Oriented Middleware

    Get PDF

    Resilient and Efficient Delivery over Message Oriented Middleware.

    Get PDF
    PhDThe publish/subscribe paradigm is used to support a many-to-many model that allows an efficient dissemination of messages across a distributed system. Message Oriented Middleware (MOM) is a middleware that provides an asynchronous method of passing information between networked applications. MOMs can be based on a publish/subscribe model, which offers a robust paradigm for message delivery. This research is concerned with this specific type of MOM. Recently, systems using MOMs have been used to integrate enterprise systems over geographically distributed areas, like the ones used in financial services, telecommunication applications, transportation and health-care systems. However, the reliability of a MOM system must be verified and consideration given to reachability to all intended destinations typically with to guarantees of delivery. The research in this thesis provides an automated means of checking the (re)configuration of a publish/subscribe MOM system by building a model and using Linear-time Temporal Logic and Computation Tree Logic rules to verify certain constraints. The verification includes the checking of the reachability of different topics, the rules for regulating the working of the system, and checking the configuration and reconfiguration after a failure. The novelty of this work is the creation and the optimization of a symbolic model checker that abstracts the end-to-end network configuration and reconfiguration behaviour and using it to verify reachability and loop detection. In addition a GUI interface, a code generator and a sub-paths detector are implemented to make the system checking more user-friendly and efficient. The research then explores another aspect of reliability. The requirements of mission critical service delivery over a MOM infrastructure is considered and we propose a new way of supporting rapid recovery from failures using pre-calculated routing Abstract tables and coloured flows that can operate across multiple Autonomous System domains. The approach is critically appraised in relation to other published schemes

    Overlay networks for smart grids

    Get PDF

    Management of Temporally and Spatially Correlated Failures in Federated Message Oriented Middleware for Resilient and QoS-Aware Messaging Services.

    Get PDF
    PhDMessage Oriented Middleware (MOM) is widely recognized as a promising solution for the communications between heterogeneous distributed systems. Because the resilience and quality-of-service of the messaging substrate plays a critical role in the overall system performance, the evolution of these distributed systems has introduced new requirements for MOM, such as inter domain federation, resilience and QoS support. This thesis focuses on a management frame work that enhances the Resilience and QoS-awareness of MOM, called RQMOM, for federated enterprise systems. A common hierarchical MOM architecture for the federated messaging service is assumed. Each bottom level local domain comprises a cluster of neighbouring brokers that carry a local messaging service, and inter domain messaging are routed through the gateway brokers of the different local domains over the top level federated overlay. Some challenges and solutions for the intra and inter domain messaging are researched. In local domain messaging the common cause of performance degradation is often the fluctuation of workloads which might result in surge of total workload on a broker and overload its processing capacity, since a local domain is often within a well connected network. Against performance degradation, a combination of novel proactive risk-aware workload allocation, which exploits the co-variation between workloads, in addition to existing reactive load balancing is designed and evaluated. In federated inter domain messaging an overlay network of federated gateway brokers distributed in separated geographical locations, on top of the heterogeneous physical network is considered. Geographical correlated failures are threats to cause major interruptions and damages to such systems. To mitigate this rarely addressed challenge, a novel geographical location aware route selection algorithm to support uninterrupted messaging is introduced. It is used with existing overlay routing mechanisms, to maintain routes and hence provide more resilient messaging against geographical correlated failures

    Reliable and timely event notification for publish/subscribe services over the internet

    Get PDF
    The publish/subscribe paradigm is gaining attention for the development of several applications in wide area networks (WANs) due to its intrinsic time, space, and synchronization decoupling properties that meet the scalability and asynchrony requirements of those applications. However, while the communication in a WAN may be affected by the unpredictable behavior of the network, with messages that can be dropped or delayed, existing publish/subscribe solutions pay just a little attention to addressing these issues. On the contrary, applications such as business intelligence, critical infrastructures, and financial services require delivery guarantees with strict temporal deadlines. In this paper, we propose a framework that enforces both reliability and timeliness for publish/subscribe services over WAN. Specifically, we combine two different approaches: gossiping, to retrieve missing packets in case of incomplete information, and network coding, to reduce the number of retransmissions and, consequently, the latency. We provide an analytical model that describes the information recovery capabilities of our algorithm and a simulation-based study, taking into account a real workload from the Air Traffic Control domain, which evidences how the proposed solution is able to ensure reliable event notification over a WAN within a reasonable bounded time window. © 2013 IEEE

    Preliminary Specification of Services and Protocols

    Get PDF
    This document describes the preliminary specification of services and protocols for the Crutial Architecture. The Crutial Architecture definition, first addressed in Crutial Project Technical Report D4 (January 2007), intends to reply to a grand challenge of computer science and control engineering: how to achieve resilience of critical information infrastructures, in particular in the electrical sector. The definitions herein elaborate on the major architectural options and components established in the Preliminary Architecture Specification (D4), with special relevance to the Crutial middleware building blocks, and are based on the fault, synchrony and topological models defined in the same document. The document, in general lines, describes the Runtime Support Services and APIs, and the Middleware Services and APIs. Then, it delves into the protocols, describing: Runtime Support Protocols, and Middleware Services Protocols. The Runtime Support Services and APIs chapter features as a main component, the Proactive-Reactive Recovery Service, whose aim is to guarantee perpetual execution of any components it protects. The Middleware Services and APIs chapter describes our approach to intrusion-tolerant middleware. The middleware comprises several layers. The Multipoint Network layer is the lowest layer of CRUTIAL's middleware, and features an abstraction of basic communication services, such as provided by standard protocols, like IP, IPsec, UDP, TCP and SSL/TLS. The Communication Support Services feature two important building blocks: the Randomized Intrusion-Tolerant Services (RITAS), and the Overlay Protection Layer (OPL) against DoS attacks. The Activity Support Services currently defined comprise the CIS Protection service, and the Access Control and Authorization service. Protection as described in this report is implemented by mechanisms and protocols residing on a device called Crutial Information Switch (CIS). The Access Control and Authorization service is implemented through PolyOrBAC, which defines the rules for information exchange and collaboration between sub-modules of the architecture, corresponding in fact to different facilities of the CII's organizations.The Monitoring and Failure Detection layer contains a preliminary definition of the middleware services devoted to monitoring and failure detection activities. The remaining chapters describe the protocols implementing the above-mentioned services: Runtime Support Protocols, and Middleware Services Protocol

    Ordering, timeliness and reliability for publish/subscribe systems over WAN

    Get PDF
    In the last few years, the increasing use of the Internet and geo-political, sociological and financial changes induced by globalization, are paving the way for a connected world where the information is always available at the right place and the right time. As such, applications previously deployed for ``closed'' environmets, are now federating into geographically distributed systems connected through a Wide Area Network (WAN). By this evolution, in the near future no system will be isolated: every system will be composed by interconnected systems, i.e., it will be a System of Systems (SoS). Example of SoS are the Large-scale Complex Critical Infrastructure (LCCIs), such as power grids, transport infrastructures (airports and seaports), financial infrastructures, next generation intelligence platforms, to cite a few. In these systems, multiple sources of information generate a high volume of events that need to be delivered to all intended destinations by respecting several Quality of Service (QoS) constraints imposed by the critical nature of LCCIs. As such, particular attention is devoted to the middleware solution used to disseminate information in the SoS. Due to its inherently scalability provided by space, time and synchronization decoupling properties, the publish/subscribe paradigm is becoming attractive for the implementation of a middleware service for LCCIs. However, scalability is not the only requirement exhibited by SoS. Several services need to control a broader set of QoS requirements, such as timeliness, ordering and reliability. Unfortunately, current middleware solutions do not address QoS constraints required by SoS. Current publish/subscribe middleware solutions for the WAN environment offer only a best effort event dissemination, with no additional control on QoS. Just a few implementations try to address some isolated QoS policy, making them not suitable for a SoS scenario. The contribution of this thesis is to devise a QoS layer that can be posed on top of a generic publish/subscribe middleware that enriches its service by addressing: (i) ordering, (ii) reliability and (iii) timeliness in event dissemination in SoS over WAN. Specifically, we first analyze several real case studies, by highlighting their QoS requirements in terms of ordering, reliability and timeliness, and compare these requirements with both current research prototypes and commercial systems. Then, we fill the gap by proposing novel algorithms to address those requirements. The proposed protocols can also be combined together in order to provide the QoS level required by the particular application. In this way, QoS issues do not need to be addressed at application level, so as to leave applications to implement just their native functionalities

    Ordering, timeliness and reliability for publish/subscribe systems over WAN

    Get PDF
    In the last few years, the increasing use of the Internet and geo-political, sociological and financial changes induced by globalization, are paving the way for a connected world where the information is always available at the right place and the right time. As such, applications previously deployed for ``closed'' environmets, are now federating into geographically distributed systems connected through a Wide Area Network (WAN). By this evolution, in the near future no system will be isolated: every system will be composed by interconnected systems, i.e., it will be a System of Systems (SoS). Example of SoS are the Large-scale Complex Critical Infrastructure (LCCIs), such as power grids, transport infrastructures (airports and seaports), financial infrastructures, next generation intelligence platforms, to cite a few. In these systems, multiple sources of information generate a high volume of events that need to be delivered to all intended destinations by respecting several Quality of Service (QoS) constraints imposed by the critical nature of LCCIs. As such, particular attention is devoted to the middleware solution used to disseminate information in the SoS. Due to its inherently scalability provided by space, time and synchronization decoupling properties, the publish/subscribe paradigm is becoming attractive for the implementation of a middleware service for LCCIs. However, scalability is not the only requirement exhibited by SoS. Several services need to control a broader set of QoS requirements, such as timeliness, ordering and reliability. Unfortunately, current middleware solutions do not address QoS constraints required by SoS. Current publish/subscribe middleware solutions for the WAN environment offer only a best effort event dissemination, with no additional control on QoS. Just a few implementations try to address some isolated QoS policy, making them not suitable for a SoS scenario. The contribution of this thesis is to devise a QoS layer that can be posed on top of a generic publish/subscribe middleware that enriches its service by addressing: (i) ordering, (ii) reliability and (iii) timeliness in event dissemination in SoS over WAN. Specifically, we first analyze several real case studies, by highlighting their QoS requirements in terms of ordering, reliability and timeliness, and compare these requirements with both current research prototypes and commercial systems. Then, we fill the gap by proposing novel algorithms to address those requirements. The proposed protocols can also be combined together in order to provide the QoS level required by the particular application. In this way, QoS issues do not need to be addressed at application level, so as to leave applications to implement just their native functionalities

    Congestion avoidance in overlay networks through multipath routing

    Get PDF
    Overlay networks relying on traditional multicast routing approaches use only a single path between a sender and a receiver. This path is selected based on latency, with the goal of achieving fast delivery. Content is routed through links with low latency, ignoring slower links of the network which remain unused. With the increasing size of content on the Internet, this leads to congestion, messages are dropped and have to be retransmitted. A multicast multipath congestion-avoidance routing scheme which uses multiple bottleneck-disjoint paths between senders and receivers was developed, as was a linear programming model of the network to distribute messages intelligently across these paths according to two goals: minimum network usage and load-balancing. The former aims to use as few links as possible to perform routing, while the latter spreads messages across as many links as possible, evenly distributing the traffic. Another technique, called message splitting, was also used. This allows nodes to send a single copy of a message with multiple receivers, which will then be duplicated by a node closer to the receivers and sent along separate paths only when required. The model considers all of the messages in the network and is a global optimisation. Nevertheless, it can be solved quickly for large networks and workloads, with the cost of routing remaining almost entirely the cost of finding multiple paths between senders and receivers. The Gurobi linear programming solver was used to find solutions to the model. This routing approach was implemented in the NS-3 network simulator. The work is presented as a messaging middleware scheme, which can be applied to any overlay messaging network.Open Acces

    Architecture, Services and Protocols for CRUTIAL

    Get PDF
    This document describes the complete specification of the architecture, services and protocols of the project CRUTIAL. The CRUTIAL Architecture intends to reply to a grand challenge of computer science and control engineering: how to achieve resilience of critical information infrastructures (CII), in particular in the electrical sector. In general lines, the document starts by presenting the main architectural options and components of the architecture, with a special emphasis on a protection device called the CRUTIAL Information Switch (CIS). Given the various criticality levels of the equipments that have to be protected, and the cost of using a replicated device, we define a hierarchy of CIS designs incrementally more resilient. The different CIS designs offer various trade offs in terms of capabilities to prevent and tolerate intrusions, both in the device itself and in the information infrastructure. The Middleware Services, APIs and Protocols chapter describes our approach to intrusion tolerant middleware. The CRUTIAL middleware comprises several building blocks that are organized on a set of layers. The Multipoint Network layer is the lowest layer of the middleware, and features an abstraction of basic communication services, such as provided by standard protocols, like IP, IPsec, UDP, TCP and SSL/TLS. The Communication Support layer features three important building blocks: the Randomized Intrusion-Tolerant Services (RITAS), the CIS Communication service and the Fosel service for mitigating DoS attacks. The Activity Support layer comprises the CIS Protection service, and the Access Control and Authorization service. The Access Control and Authorization service is implemented through PolyOrBAC, which defines the rules for information exchange and collaboration between sub-modules of the architecture, corresponding in fact to different facilities of the CII’s organizations. The Monitoring and Failure Detection layer contains a definition of the services devoted to monitoring and failure detection activities. The Runtime Support Services, APIs, and Protocols chapter features as a main component the Proactive-Reactive Recovery service, whose aim is to guarantee perpetual correct execution of any components it protects.Project co-funded by the European Commission within the Sixth Frame-work Programme (2002-2006
    • …
    corecore