
    Fault-Tolerant Adaptive Parallel and Distributed Simulation

    Discrete Event Simulation is a widely used technique for modeling and analyzing complex systems in many fields of science and engineering. The increasingly large size of simulation models poses a serious computational challenge, since the time needed to run a simulation can be prohibitively long. For this reason, Parallel and Distributed Simulation techniques have been proposed to take advantage of the multiple execution units found in multicore processors, clusters of workstations, or HPC systems. The current generation of HPC systems includes hundreds of thousands of computing nodes and a vast number of ancillary components. Despite improvements in manufacturing processes, failures of some components are frequent, and the situation will get worse as larger systems are built. In this paper we describe FT-GAIA, a software-based fault-tolerant extension of the GAIA/ARTÌS parallel simulation middleware. FT-GAIA transparently replicates simulation entities and distributes them on multiple execution nodes. This allows the simulation to tolerate crash failures of computing nodes; furthermore, FT-GAIA offers some protection against Byzantine failures, since synchronization messages are replicated as well, so that the receiving entity can identify and discard corrupted messages. We provide an experimental evaluation of FT-GAIA on a running prototype. Results show that a high degree of fault tolerance can be achieved, at the cost of a moderate increase in the computational load of the execution units. Comment: Proceedings of the IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications (DS-RT 2016

    Fault Tolerant Adaptive Parallel and Distributed Simulation through Functional Replication

    This paper presents FT-GAIA, a software-based fault-tolerant parallel and distributed simulation middleware. FT-GAIA has been designed to reliably handle Parallel And Distributed Simulation (PADS) models, which are needed to properly simulate and analyze complex systems arising in many scientific and engineering fields. PADS takes advantage of the multiple execution units found in multicore processors, clusters of workstations, or HPC systems. However, large computing systems, such as HPC systems that include hundreds of thousands of computing nodes, have to handle frequent failures of some components. To cope with this issue, FT-GAIA transparently replicates simulation entities and distributes them on multiple execution nodes. This allows the simulation to tolerate crash failures of computing nodes. Moreover, FT-GAIA offers some protection against Byzantine failures, since interaction messages among the simulated entities are replicated as well, so that the receiving entity can identify and discard corrupted messages. Results from an analytical model and from an experimental evaluation show that FT-GAIA provides a high degree of fault tolerance, at the cost of a moderate increase in the computational load of the execution units. Comment: arXiv admin note: substantial text overlap with arXiv:1606.0731
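The abstract above says that replicated messages let the receiving entity identify and discard corrupted copies. One way such a filter could look is sketched below in Python. This is only an illustration, not FT-GAIA's actual code: the function name `accept_message`, the string payloads, and the strict-majority acceptance rule are all assumptions that the abstract does not confirm.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def accept_message(copies: Sequence[Hashable], replicas: int) -> Optional[Hashable]:
    """Return the payload sent by a strict majority of the sender's replicas.

    `copies` holds the payloads actually received (crashed replicas send
    nothing, so it may be shorter than `replicas`). Hypothetical helper:
    the voting rule is an assumption, not taken from the paper.
    """
    if not copies:
        return None  # every replica crashed or nothing has arrived yet
    payload, count = Counter(copies).most_common(1)[0]
    # Accept only if more than half of all replicas agree on this payload;
    # otherwise the copies are treated as untrustworthy and discarded.
    return payload if count > replicas // 2 else None


if __name__ == "__main__":
    # Three replicas, one of which sent a corrupted copy.
    print(accept_message(["move:3", "move:3", "move:9"], replicas=3))
```

Under this assumed rule, a sender with three replicas can have one copy corrupted or missing and still be understood by the receiver, while two conflicting copies out of three are rejected.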

    Mobile Computing in Digital Ecosystems: Design Issues and Challenges

    In this paper we argue that the set of wireless, mobile devices (e.g., portable telephones, tablet PCs, GPS navigators, media players) commonly used by human users enables the construction of what we term a digital ecosystem, i.e., an ecosystem constructed out of so-called digital organisms (see below), that can foster the development of novel distributed services. In this context, a human user equipped with his/her own mobile devices can be thought of as a digital organism (DO), a subsystem characterized by a set of peculiar features and resources it can offer to the rest of the ecosystem for use by its peer DOs. The internal organization of the DO must address issues of management of its own resources, including power consumption. Inside the DO and among DOs, peer-to-peer interaction mechanisms can be conveniently deployed to favor resource sharing and data dissemination. Throughout this paper, we show that most of the solutions and technologies needed to construct a digital ecosystem are already available. What is still missing is a framework (i.e., mechanisms, protocols, services) that can effectively support the integration and cooperation of these technologies. In addition, we show that this framework can be implemented as a middleware subsystem that enables novel and ubiquitous forms of computation and communication. Finally, in order to illustrate the effectiveness of our approach, we introduce some experimental results we have obtained from preliminary implementations of (parts of) that subsystem. Comment: Proceedings of the 7th International Wireless Communications and Mobile Computing Conference (IWCMC-2011), Emergency Management: Communication and Computing Platforms Worksho

    Cross-Layer Peer-to-Peer Track Identification and Optimization Based on Active Networking

    P2P applications appear to be emerging as the ultimate killer applications due to their ability to construct highly dynamic overlay topologies with rapidly varying and unpredictable traffic dynamics, which can constitute a serious challenge even for significantly over-provisioned IP networks. As a result, ISPs are facing new, severe network management problems that are not guaranteed to be addressed by statically deployed network engineering mechanisms. As a first step towards a more complete solution to these problems, this paper proposes a P2P measurement, identification and optimisation architecture, designed to cope with the dynamicity and unpredictability of both existing, well-known P2P systems and future, unknown ones. The purpose of this architecture is to provide ISPs with an effective and scalable approach to controlling and optimising the traffic produced by P2P applications in their networks. This can be achieved through a combination of different application-level and network-level programmable techniques, leading to a cross-layer identification and optimisation process. These techniques can be applied using Active Networking platforms, which are able to quickly and easily deploy architectural components on demand. This flexibility of the optimisation architecture is essential to address the rapid development of new P2P protocols and the variation of known protocols.

    The state of peer-to-peer network simulators

    Networking research often relies on simulation in order to test and evaluate new ideas. An important requirement of this process is that results must be reproducible so that other researchers can replicate, validate and extend existing work. We look at the landscape of simulators for research in peer-to-peer (P2P) networks by conducting a survey of a combined total of over 280 papers from before and after 2007 (the year of the last survey in this area), and comment on the large quantity of research using bespoke, closed-source simulators. We propose a set of criteria that P2P simulators should meet, and poll the P2P research community for their agreement. We aim to drive the community towards performing their experiments on simulators that allow others to validate their results.

    Distributed aop middleware for large-scale scenarios

    In this PhD dissertation we present a distributed middleware proposal for large-scale application development. Our main aim is to let the distributed concerns of these applications, such as replication, be integrated independently and transparently. Our approach is based on the implementation of these concerns using the paradigm of distributed aspects. In addition, our proposal benefits from the peer-to-peer (P2P) networks and aspect-oriented programming (AOP) substrates to provide these concerns in a decentralized, decoupled, efficient, and transparent way. Our middleware architecture is divided into two layers: a composition model and a scalable deployment platform for distributed aspects. Finally, we demonstrate the viability and applicability of our model via implementation and experimentation of prototypes in real large-scale networks.

    Design and Evaluation of Distributed Algorithms for Placement of Network Services

    Network services play an important role in the Internet today. They serve as data caches for websites, servers for multiplayer games and relay nodes for Voice over IP (VoIP) conversations. While much research has focused on the design of such services, little attention has been paid to their actual placement. This placement can impact the quality of the service, especially if low latency is a requirement. These services can be located on nodes in the network itself, making these nodes supernodes. Typically supernodes are selected in either a proprietary or ad hoc fashion, where a study of this placement is either unavailable or unnecessary. Previous research dealt with only pieces of the problem, such as finding the location of caches for a static topology, or selecting better routes for relays in VoIP. However, a comprehensive solution is needed for dynamic applications such as multiplayer games or P2P VoIP services. These applications adapt quickly and need solutions based on the immediate demands of the network. In this thesis we develop distributed algorithms to assign nodes the role of a supernode. This research first builds on prior work by modifying an existing assignment algorithm and implementing it in a distributed system called Supernode Placement in Overlay Topologies (SPOT). New algorithms are developed to assign nodes the supernode role. These algorithms are then evaluated in SPOT to demonstrate improved supernode assignment and scalability. Through a series of simulation, emulation, and experimentation, insight is gained into the critical issues associated with allocating resources to perform the role of supernodes. Our contributions include distributed algorithms to assign nodes as supernodes, an open-source, fully functional distributed supernode allocation system, an evaluation of the system in diverse networking environments, and a simulator called SPOTsim, which demonstrates the scalability of the system to thousands of nodes. An example of an application deploying such a system is also presented along with the empirical results.