Search CORE

4,936 research outputs found

OpenPING: A Reflective Middleware for the Construction of Adaptive Networked Game Applications

Author: Blair Gordon S.
Okanda P.
Publication venue
Publication date: 01/01/2004
Field of study

The emergence of distributed Virtual Reality (VR) applications that run over the Internet has presented networked game application designers with new challenges. In an environment where the public internet streams multimedia data and is constantly under pressure to deliver over widely heterogeneous user-platforms, there has been a growing need that distributed VR applications be aware of and adapt to frequent variations in their context of execution. In this paper, we argue that in contrast to research efforts targeted at improvement of scalability, persistence and responsiveness capabilities, much less attempts have been aimed at addressing the flexibility, maintainability and extensibility requirements in contemporary distributed VR platforms. We propose the use of structural reflection as an approach that not only addresses these requirements but also offers added value in the form of providing a framework for scalability, persistence and responsiveness that is itself flexible, maintainable and extensible. We also present an adaptive middleware platform implementation called OpenPING1 that supports our proposal in addressing these requirements

Africa Digital Repository

Lancaster E-Prints

Fault-Tolerant Adaptive Parallel and Distributed Simulation

Author: Armaroli Lorenzo
D'Angelo Gabriele
Ferretti Stefano
Marzolla Moreno
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016
Field of study

Discrete Event Simulation is a widely used technique that is used to model and analyze complex systems in many fields of science and engineering. The increasingly large size of simulation models poses a serious computational challenge, since the time needed to run a simulation can be prohibitively large. For this reason, Parallel and Distributes Simulation techniques have been proposed to take advantage of multiple execution units which are found in multicore processors, cluster of workstations or HPC systems. The current generation of HPC systems includes hundreds of thousands of computing nodes and a vast amount of ancillary components. Despite improvements in manufacturing processes, failures of some components are frequent, and the situation will get worse as larger systems are built. In this paper we describe FT-GAIA, a software-based fault-tolerant extension of the GAIA/ART\`IS parallel simulation middleware. FT-GAIA transparently replicates simulation entities and distributes them on multiple execution nodes. This allows the simulation to tolerate crash-failures of computing nodes; furthermore, FT-GAIA offers some protection against byzantine failures since synchronization messages are replicated as well, so that the receiving entity can identify and discard corrupted messages. We provide an experimental evaluation of FT-GAIA on a running prototype. Results show that a high degree of fault tolerance can be achieved, at the cost of a moderate increase in the computational load of the execution units.Comment: Proceedings of the IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications (DS-RT 2016

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Fault Tolerant Adaptive Parallel and Distributed Simulation through Functional Replication

Author: D'Angelo Gabriele
Ferretti Stefano
Marzolla Moreno
Publication venue: 'Elsevier BV'
Publication date: 01/01/2019
Field of study

This paper presents FT-GAIA, a software-based fault-tolerant parallel and distributed simulation middleware. FT-GAIA has being designed to reliably handle Parallel And Distributed Simulation (PADS) models, which are needed to properly simulate and analyze complex systems arising in any kind of scientific or engineering field. PADS takes advantage of multiple execution units run in multicore processors, cluster of workstations or HPC systems. However, large computing systems, such as HPC systems that include hundreds of thousands of computing nodes, have to handle frequent failures of some components. To cope with this issue, FT-GAIA transparently replicates simulation entities and distributes them on multiple execution nodes. This allows the simulation to tolerate crash-failures of computing nodes. Moreover, FT-GAIA offers some protection against Byzantine failures, since interaction messages among the simulated entities are replicated as well, so that the receiving entity can identify and discard corrupted messages. Results from an analytical model and from an experimental evaluation show that FT-GAIA provides a high degree of fault tolerance, at the cost of a moderate increase in the computational load of the execution units.Comment: arXiv admin note: substantial text overlap with arXiv:1606.0731

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Urbino

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

A Resource Intensive Traffic-Aware Scheme for Cluster-based Energy Conservation in Wireless Devices

Author: Charalambous Marios C.
Mavromoustakis Constandinos X.
Yassein Muneer Bani
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 17/09/2012
Field of study

Wireless traffic that is destined for a certain device in a network, can be exploited in order to minimize the availability and delay trade-offs, and mitigate the Energy consumption. The Energy Conservation (EC) mechanism can be node-centric by considering the traversed nodal traffic in order to prolong the network lifetime. This work describes a quantitative traffic-based approach where a clustered Sleep-Proxy mechanism takes place in order to enable each node to sleep according to the time duration of the active traffic that each node expects and experiences. Sleep-proxies within the clusters are created according to pairwise active-time comparison, where each node expects during the active periods, a requested traffic. For resource availability and recovery purposes, the caching mechanism takes place in case where the node for which the traffic is destined is not available. The proposed scheme uses Role-based nodes which are assigned to manipulate the traffic in a cluster, through the time-oriented backward difference traffic evaluation scheme. Simulation study is carried out for the proposed backward estimation scheme and the effectiveness of the end-to-end EC mechanism taking into account a number of metrics and measures for the effects while incrementing the sleep time duration under the proposed framework. Comparative simulation results show that the proposed scheme could be applied to infrastructure-less systems, providing energy-efficient resource exchange with significant minimization in the power consumption of each device.Comment: 6 pages, 8 figures, To appear in the proceedings of IEEE 14th International Conference on High Performance Computing and Communications (HPCC-2012) of the Third International Workshop on Wireless Networks and Multimedia (WNM-2012), 25-27 June 2012, Liverpool, U

arXiv.org e-Print Archive

Crossref

Middleware-based Database Replication: The Gaps between Theory and Practice

Author: Ailamaki Anastasia
Candea George
Cecchet Emmanuel
Publication venue
Publication date: 01/01/2008
Field of study

The need for high availability and performance in data management systems has been fueling a long running interest in database replication from both academia and industry. However, academic groups often attack replication problems in isolation, overlooking the need for completeness in their solutions, while commercial teams take a holistic approach that often misses opportunities for fundamental innovation. This has created over time a gap between academic research and industrial practice. This paper aims to characterize the gap along three axes: performance, availability, and administration. We build on our own experience developing and deploying replication systems in commercial and academic settings, as well as on a large body of prior related work. We sift through representative examples from the last decade of open-source, academic, and commercial database replication systems and combine this material with case studies from real systems deployed at Fortune 500 customers. We propose two agendas, one for academic research and one for industrial R&D, which we believe can bridge the gap within 5-10 years. This way, we hope to both motivate and help researchers in making the theory and practice of middleware-based database replication more relevant to each other.Comment: 14 pages. Appears in Proc. ACM SIGMOD International Conference on Management of Data, Vancouver, Canada, June 200

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

CiteSeerX

Recommended from our members

Fault tolerance via diversity for off-the-shelf products: A study with SQL database servers

Author: Gashi I.
Popov P. T.
Strigini L.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/2007
Field of study

If an off-the-shelf software product exhibits poor dependability due to design faults, then software fault tolerance is often the only way available to users and system integrators to alleviate the problem. Thanks to low acquisition costs, even using multiple versions of software in a parallel architecture, which is a scheme formerly reserved for few and highly critical applications, may become viable for many applications. We have studied the potential dependability gains from these solutions for off-the-shelf database servers. We based the study on the bug reports available for four off-the-shelf SQL servers plus later releases of two of them. We found that many of these faults cause systematic noncrash failures, which is a category ignored by most studies and standard implementations of fault tolerance for databases. Our observations suggest that diverse redundancy would be effective for tolerating design faults in this category of products. Only in very few cases would demands that triggered a bug in one server cause failures in another one, and there were no coincident failures in more than two of the servers. Use of different releases of the same product would also tolerate a significant fraction of the faults. We report our results and discuss their implications, the architectural options available for exploiting them, and the difficulties that they may present

City Research Online

Crossref

Using real options to select stable Middleware-induced software architectures

Author: Bahsoon R.
Emmerich W.
Macke J.
Publication venue
Publication date: 01/01/2005
Field of study

The requirements that force decisions towards building distributed system architectures are usually of a non-functional nature. Scalability, openness, heterogeneity, and fault-tolerance are examples of such non-functional requirements. The current trend is to build distributed systems with middleware, which provide the application developer with primitives for managing the complexity of distribution, system resources, and for realising many of the non-functional requirements. As non-functional requirements evolve, the `coupling' between the middleware and architecture becomes the focal point for understanding the stability of the distributed software system architecture in the face of change. It is hypothesised that the choice of a stable distributed software architecture depends on the choice of the underlying middleware and its flexibility in responding to future changes in non-functional requirements. Drawing on a case study that adequately represents a medium-size component-based distributed architecture, it is reported how a likely future change in scalability could impact the architectural structure of two versions, each induced with a distinct middleware: one with CORBA and the other with J2EE. An option-based model is derived to value the flexibility of the induced-architectures and to guide the selection. The hypothesis is verified to be true for the given change. The paper concludes with some observations that could stimulate future research in the area of relating requirements to software architectures

UCL Discovery

Recommended from our members

Fault diversity among off-the-shelf SQL database servers

Author: Gashi I.
Popov P. T.
Strigini L.
Publication venue
Publication date: 01/01/2004
Field of study

Fault tolerance is often the only viable way of obtaining the required system dependability from systems built out of "off-the-shelf" (OTS) products. We have studied a sample of bug reports from four off-the-shelf SQL servers so as to estimate the possible advantages of software fault tolerance - in the form of modular redundancy with diversity - in complex off-the-shelf software. We checked whether these bugs would cause coincident failures in more than one of the servers. We found that very few bugs affected two of the four servers, and none caused failures in more than two. We also found that only four of these bugs would cause identical, undetectable failures in two servers. Therefore, a fault-tolerant server, built with diverse off-the-shelf servers, seems to have a good chance of delivering improvements in availability and failure rates compared with the individual off-the-shelf servers or their replicated, nondiverse configurations

City Research Online

Crossref

A reflective middleware approach to the provision of grid middleware

Author: Blair Gordon S.
Cai W.
Coulson G.
Parlavantzas N.
Yeung W. K.
Publication venue
Publication date: 01/06/2003
Field of study

Lancaster E-Prints

Software engineering and middleware: a roadmap (Invited talk)

Author: Emmerich W.
Publication venue: ACM Press
Publication date: 01/01/2000
Field of study

The construction of a large class of distributed systems can be simplified by leveraging middleware, which is layered between network operating systems and application components. Middleware resolves heterogeneity and facilitates communication and coordination of distributed components. Existing middleware products enable software engineers to build systems that are distributed across a local-area network. State-of-the-art middleware research aims to push this boundary towards Internet-scale distribution, adaptive and reconfigurable middleware and middleware for dependable and wireless systems. The challenge for software engineering research is to devise notations, techniques, methods and tools for distributed system construction that systematically build and exploit the capabilities that middleware deliver

UCL Discovery