
    Multi-agent systems for power engineering applications - part 1 : Concepts, approaches and technical challenges

    This is the first part of a two-part paper that has arisen from the work of the IEEE Power Engineering Society's Multi-Agent Systems (MAS) Working Group. Part 1 of the paper examines the potential value of MAS technology to the power industry. In terms of contribution, it describes fundamental concepts and approaches within the field of multi-agent systems that are appropriate to power engineering applications. As well as presenting a comprehensive review of the meaningful power engineering applications for which MAS are being investigated, it also defines the technical issues which must be addressed in order to accelerate and facilitate the uptake of the technology within the power and energy sector. Part 2 of the paper explores the decisions inherent in engineering multi-agent systems for applications in the power and energy sector and offers guidance and recommendations on how MAS can be designed and implemented.

    Fault-Tolerant Adaptive Parallel and Distributed Simulation

    Discrete Event Simulation is a widely used technique for modeling and analyzing complex systems in many fields of science and engineering. The increasingly large size of simulation models poses a serious computational challenge, since the time needed to run a simulation can be prohibitively large. For this reason, Parallel and Distributed Simulation techniques have been proposed to take advantage of the multiple execution units found in multicore processors, clusters of workstations, or HPC systems. The current generation of HPC systems includes hundreds of thousands of computing nodes and a vast amount of ancillary components. Despite improvements in manufacturing processes, failures of some components are frequent, and the situation will get worse as larger systems are built. In this paper we describe FT-GAIA, a software-based fault-tolerant extension of the GAIA/ARTÌS parallel simulation middleware. FT-GAIA transparently replicates simulation entities and distributes them on multiple execution nodes. This allows the simulation to tolerate crash failures of computing nodes; furthermore, FT-GAIA offers some protection against Byzantine failures, since synchronization messages are replicated as well, so that the receiving entity can identify and discard corrupted messages. We provide an experimental evaluation of FT-GAIA on a running prototype. Results show that a high degree of fault tolerance can be achieved, at the cost of a moderate increase in the computational load of the execution units. Comment: Proceedings of the IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications (DS-RT 2016)
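The abstract above says that replicated messages let a receiving entity identify and discard corrupted copies. One common way to do this is a strict-majority vote over the copies received; the abstract does not state which rule FT-GAIA actually uses, so the following Python sketch is an assumption for illustration only. The function name `accept_message` and its parameters are hypothetical and are not part of the GAIA/ARTÌS API.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def accept_message(copies: Sequence[Hashable], replicas: int) -> Optional[Hashable]:
    """Return the payload sent by a strict majority of the sender's replicas.

    copies   -- payloads actually received (crashed replicas send nothing)
    replicas -- total number of replicas of the sending entity (assumed known)
    Returns None when no payload reaches a strict majority.
    """
    if not copies:
        return None
    # Count identical payloads and pick the most frequent one.
    payload, votes = Counter(copies).most_common(1)[0]
    # Assumed rule: accept only if more than half of all replicas agree.
    return payload if votes > replicas // 2 else None
```

Under this assumed rule, three replicas mask one corrupted copy or one crashed replica: `accept_message(["a", "a", "x"], 3)` and `accept_message(["a", "a"], 3)` both yield `"a"`, while two conflicting copies yield `None`.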

    MACRM: A Multi-agent Cluster Resource Management System

    The falling cost of cluster computing has significantly increased its use in the last decade. As a result, the number of users, the size of clusters, and the diversity of jobs that are submitted to clusters have grown. These changes call for a redesign of cluster resource management systems. The growth in the number of users and the increase in the size of clusters require a more scalable approach to resource management. Moreover, the ever-increasing use of clusters for carrying out a diverse range of computations demands fault-tolerant and highly available cluster management systems. Last but not least, serving highly parallel and interactive jobs in a cluster with hundreds of nodes requires high-throughput scheduling with a very short service time. This research presents MACRM, a multi-agent cluster resource management system. MACRM is an adaptive distributed/centralized resource management system which addresses the requirements of scalability, fault tolerance, high availability, and high-throughput scheduling. It breaks up resource management responsibilities and delegates them to different agents to be scalable in various aspects. Also, modularity in MACRM's design increases fault tolerance because components are replicable and recoverable. Furthermore, MACRM has a very short service time under different loads. It can maintain an average service time of less than 15 ms by adaptively switching between centralized and distributed decision making based on a cluster's load. Comparing MACRM with representative centralized and distributed systems (YARN [67] and Sparrow [52]) shows several advantages. We show that MACRM scales better when the number of resources, users, or jobs in a cluster increases. In addition, MACRM has faster and less expensive failure recovery mechanisms than the two other systems. Finally, our experiments show that MACRM's average service time beats the other systems, particularly under high loads.
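The load-based switching between centralized and distributed decision making described above can be pictured with a small Python sketch. The abstract does not say which mode is used at which load, what thresholds apply, or whether hysteresis is used; the choices below (centralized at low load, distributed at high load, two hypothetical thresholds) are assumptions for illustration and do not describe MACRM's actual policy.

```python
from dataclasses import dataclass


@dataclass
class ModeSwitch:
    """Hypothetical load-based mode selector with hysteresis."""

    high: float = 0.8            # assumed: go distributed above this load
    low: float = 0.6             # assumed: return to centralized below this load
    mode: str = "centralized"    # assumed starting mode

    def update(self, load: float) -> str:
        """Feed the current cluster load (0.0 to 1.0) and return the mode."""
        if self.mode == "centralized" and load > self.high:
            self.mode = "distributed"
        elif self.mode == "distributed" and load < self.low:
            self.mode = "centralized"
        return self.mode
```

The gap between the two thresholds keeps the selector from flapping when the load hovers near a single cut-off; for example, loads of 0.5, 0.9, 0.7, 0.5 fed in that order give centralized, distributed, distributed, centralized.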

    Exploiting multi-agent system technology within an autonomous regional active network management system

    This paper describes the proposed application of multi-agent system (MAS) technology within AuRA-NMS, an autonomous regional network management system currently being developed in the UK through a partnership between several UK universities, distribution network operators (DNOs) and a major equipment manufacturer. The paper begins by describing the challenges facing utilities and why those challenges have led the utilities, a major manufacturer and the UK government to invest in the development of a flexible and extensible active network management system. The requirements the utilities have for a network automation system they wish to deploy on their distribution networks are discussed in detail. With those requirements in mind, the rationale behind the use of MAS within AuRA-NMS is presented and the inherent research and design challenges are highlighted, including: the issues associated with robustness of distributed MAS platforms; the arbitration of different control functions; and the relationship between the ontological requirements of Foundation for Intelligent Physical Agents (FIPA) compliant multi-agent systems, legacy protocols and standards such as IEC 61850 and the common information model (CIM).

    Fault Tolerant Adaptive Parallel and Distributed Simulation through Functional Replication

    This paper presents FT-GAIA, a software-based fault-tolerant parallel and distributed simulation middleware. FT-GAIA has been designed to reliably handle Parallel And Distributed Simulation (PADS) models, which are needed to properly simulate and analyze complex systems arising in any kind of scientific or engineering field. PADS takes advantage of multiple execution units running in multicore processors, clusters of workstations, or HPC systems. However, large computing systems, such as HPC systems that include hundreds of thousands of computing nodes, have to handle frequent failures of some components. To cope with this issue, FT-GAIA transparently replicates simulation entities and distributes them on multiple execution nodes. This allows the simulation to tolerate crash failures of computing nodes. Moreover, FT-GAIA offers some protection against Byzantine failures, since interaction messages among the simulated entities are replicated as well, so that the receiving entity can identify and discard corrupted messages. Results from an analytical model and from an experimental evaluation show that FT-GAIA provides a high degree of fault tolerance, at the cost of a moderate increase in the computational load of the execution units. Comment: arXiv admin note: substantial text overlap with arXiv:1606.0731

    Fault-tolerant formation driving mechanism designed for heterogeneous MAVs-UGVs groups

    A fault-tolerant method for stabilization and navigation of 3D heterogeneous formations is proposed in this paper. The presented Model Predictive Control (MPC) based approach enables the deployment of compact formations of closely cooperating autonomous aerial and ground robots in surveillance scenarios without the need for precise external localization. Instead, the proposed method relies on a top-view visual relative localization provided by the micro aerial vehicles flying above the ground robots and on a simple yet stable vision-based navigation using images from an onboard monocular camera. The MPC-based scheme, together with a fault detection and recovery mechanism, provides a robust solution applicable in complex environments with static and dynamic obstacles. The core of the proposed leader-follower formation driving method is a representation of the entire 3D formation as a convex hull projected along a desired path that has to be followed by the group. Such an approach provides a collision-free solution and respects the requirement of direct visibility between the team members. Uninterrupted visibility is crucial for the employed top-view localization and therefore for the stabilization of the group. The proposed formation driving method and the fault recovery mechanisms are verified by simulations and hardware experiments presented in the paper.