15 research outputs found

    An Optimizing Java Translation Framework for Automated Checkpointing and Strong Mobility

    Get PDF
    Long-running programs, e.g., in high-performance computing, need to write periodic checkpoints of their execution state to disk to allow them to recover from node failure. Manually adding checkpointing code to an application, however, is very tedious. The mechanisms needed for writing the execution state of a program to disk and restoring it are similar to those needed for migrating a running thread or a mobile object. We have extended a source-to-source translation scheme that allows the migration of mobile Java objects with running threads to make it more general and allow it to be used for automated checkpointing. Our translation scheme allows serializable threads to be written to disk or migrated with a mobile agent to a remote machine. The translator generates code that maintains a serializable run-time stack for each thread as a Java data structure. While this results in significant run-time overhead, it allows the checkpointing code to be generated automatically. We improved the locking mechanism that is needed to protect the run-time stack as well as the translation scheme. Our experimental results demonstrate an speedup of the generated code over the original translator and show that the approach is feasible in practice

    Heterogeneous Strong Computation Migration

    Full text link
    The continuous increase in performance requirements, for both scientific computation and industry, motivates the need of a powerful computing infrastructure. The Grid appeared as a solution for inexpensive execution of heavy applications in a parallel and distributed manner. It allows combining resources independently of their physical location and architecture to form a global resource pool available to all grid users. However, grid environments are highly unstable and unpredictable. Adaptability is a crucial issue in this context, in order to guarantee an appropriate quality of service to users. Migration is a technique frequently used for achieving adaptation. The objective of this report is to survey the problem of strong migration in heterogeneous environments like the grids', the related implementation issues and the current solutions.Comment: This is the pre-peer reviewed version of the following article: Milan\'es, A., Rodriguez, N. and Schulze, B. (2008), State of the art in heterogeneous strong migration of computations. Concurrency and Computation: Practice and Experience, 20: 1485-1508, which has been published in final form at http://onlinelibrary.wiley.com/doi/10.1002/cpe.1287/abstrac

    The migration process of mobile agents: implementation, classification, and optimization

    Get PDF
    Mobile Agenten stellen ein neues faszinierendes Design-Paradigma für den Aufbau und die Programmierung von verteilten Systemen dar. Ein mobiler Agent ist eine Software-Entität, die von ihrem Besitzer mit einem Auftrag auf einem Knoten eines verteilten Systems gestartet wird und dann zur Laufzeit auf andere Knoten des Netzwerkes migriert. Diese Arbeit konzentriert sich auf den Migrationsprozess für mobile Agenten, dem in der Literatur bisher wenig Aufmerksamkeit geschenkt wurde, obwohl er die Ausführungsgeschwindigkeit eines Agenten entscheidend beeinflusst. Eine detaillierte Analyse der Netzbelastung von mobilen Agenten im Vergleich zum traditionellen Client-Server Ansatz in mehreren typischen Anwendungsszenarien zeigt das Potential von mobilen Agenten zur Verringerung von Verarbeitungszeiten. Allerdings zeigt die Analyse ebenso die Nachteile der in heutigen Agentensystemen verwendenten sehr einfachen Migrationstechniken. Es wird ein neues Migrationsmodell mit Namen Kalong vorgestellt, das den Nachteil der fehlenden Anpassbarkeit heutiger Agentensysteme beseitigt und dem Programmierer eines mobilen Agenten eine sehr flexible Technik für die Migration zur Verfügung stellt

    Applications of agent architectures to decision support in distributed simulation and training systems

    Get PDF
    This work develops the approach and presents the results of a new model for applying intelligent agents to complex distributed interactive simulation for command and control. In the framework of tactical command, control communications, computers and intelligence (C4I), software agents provide a novel approach for efficient decision support and distributed interactive mission training. An agent-based architecture for decision support is designed, implemented and is applied in a distributed interactive simulation to significantly enhance the command and control training during simulated exercises. The architecture is based on monitoring, evaluation, and advice agents, which cooperate to provide alternatives to the dec ision-maker in a time and resource constrained environment. The architecture is implemented and tested within the context of an AWACS Weapons Director trainer tool. The foundation of the work required a wide range of preliminary research topics to be covered, including real-time systems, resource allocation, agent-based computing, decision support systems, and distributed interactive simulations. The major contribution of our work is the construction of a multi-agent architecture and its application to an operational decision support system for command and control interactive simulation. The architectural design for the multi-agent system was drafted in the first stage of the work. In the next stage rules of engagement, objective and cost functions were determined in the AWACS (Airforce command and control) decision support domain. Finally, the multi-agent architecture was implemented and evaluated inside a distributed interactive simulation test-bed for AWACS Vv\u27Ds. The evaluation process combined individual and team use of the decision support system to improve the performance results of WD trainees. The decision support system is designed and implemented a distributed architecture for performance-oriented management of software agents. The approach provides new agent interaction protocols and utilizes agent performance monitoring and remote synchronization mechanisms. This multi-agent architecture enables direct and indirect agent communication as well as dynamic hierarchical agent coordination. Inter-agent communications use predefined interfaces, protocols, and open channels with specified ontology and semantics. Services can be requested and responses with results received over such communication modes. Both traditional (functional) parameters and nonfunctional (e.g. QoS, deadline, etc.) requirements and captured in service requests

    Models of higher-order, type-safe, distributed computation over autonomous persistent object stores

    Get PDF
    A remote procedure call (RPC) mechanism permits the calling of procedures in another address space. RPC is a simple but highly effective mechanism for interprocess communication and enjoys nowadays a great popularity as a tool for building distributed applications. This popularity is partly a result of their overall simplicity but also partly a consequence of more than 20 years of research in transpaxent distribution that have failed to deliver systems that meet the expectations of real-world application programmers. During the same 20 years, persistent systems have proved their suitability for building complex database applications by seamlessly integrating features traditionally found in database management systems into the programming language itself. Some research. effort has been invested on distributed persistent systems, but the outcomes commonly suffer from the same problems found with transparent distribution. In this thesis I claim that a higher-order persistent RPC is useful for building distributed persistent applications. The proposed mechanism is: realistic in the sense that it uses current technology and tolerates partial failures; understandable by application programmers; and general to support the development of many classes of distributed persistent applications. In order to demonstrate the validity of these claims, I propose and have implemented three models for distributed higher-order computation over autonomous persistent stores. Each model has successively exposed new problems which have then been overcome by the next model. Together, the three models provide a general yet simple higher-order persistent RPC that is able to operate in realistic environments with partial failures. The real strength of this thesis is the demonstration of realism and simplicity. A higherorder persistent RPC was not only implemented but also used by programmers without experience of programming distributed applications. Furthermore, a distributed persistent application has been built using these models which would not have been feasible with a traditional (non-persistent) programming language

    Functional programming languages in computing clouds: practical and theoretical explorations

    Get PDF
    Cloud platforms must integrate three pillars: messaging, coordination of workers and data. This research investigates whether functional programming languages have any special merit when it comes to the implementation of cloud computing platforms. This thesis presents the lightweight message queue CMQ and the DSL CWMWL for the coordination of workers that we use as artefact to proof or disproof the special merit of functional programming languages in computing clouds. We have detailed the design and implementation with the broad aim to match the notions and the requirements of computing clouds. Our approach to evaluate these aims is based on evaluation criteria that are based on a series of comprehensive rationales and specifics that allow the FPL Haskell to be thoroughly analysed. We find that Haskell is excellent for use cases that do not require the distribution of the application across the boundaries of (physical or virtual) systems, but not appropriate as a whole for the development of distributed cloud based workloads that require communication with the far side and coordination of decoupled workloads. However, Haskell may be able to qualify as a suitable vehicle in the future with future developments of formal mechanisms that embrace non-determinism in the underlying distributed environments leading to applications that are anti-fragile rather than applications that insist on strict determinism that can only be guaranteed on the local system or via slow blocking communication mechanisms

    Functional programming languages in computing clouds: practical and theoretical explorations

    Get PDF
    Cloud platforms must integrate three pillars: messaging, coordination of workers and data. This research investigates whether functional programming languages have any special merit when it comes to the implementation of cloud computing platforms. This thesis presents the lightweight message queue CMQ and the DSL CWMWL for the coordination of workers that we use as artefact to proof or disproof the special merit of functional programming languages in computing clouds. We have detailed the design and implementation with the broad aim to match the notions and the requirements of computing clouds. Our approach to evaluate these aims is based on evaluation criteria that are based on a series of comprehensive rationales and specifics that allow the FPL Haskell to be thoroughly analysed. We find that Haskell is excellent for use cases that do not require the distribution of the application across the boundaries of (physical or virtual) systems, but not appropriate as a whole for the development of distributed cloud based workloads that require communication with the far side and coordination of decoupled workloads. However, Haskell may be able to qualify as a suitable vehicle in the future with future developments of formal mechanisms that embrace non-determinism in the underlying distributed environments leading to applications that are anti-fragile rather than applications that insist on strict determinism that can only be guaranteed on the local system or via slow blocking communication mechanisms

    Proceedings of Monterey Workshop 2001 Engineering Automation for Sofware Intensive System Integration

    Get PDF
    The 2001 Monterey Workshop on Engineering Automation for Software Intensive System Integration was sponsored by the Office of Naval Research, Air Force Office of Scientific Research, Army Research Office and the Defense Advance Research Projects Agency. It is our pleasure to thank the workshop advisory and sponsors for their vision of a principled engineering solution for software and for their many-year tireless effort in supporting a series of workshops to bring everyone together.This workshop is the 8 in a series of International workshops. The workshop was held in Monterey Beach Hotel, Monterey, California during June 18-22, 2001. The general theme of the workshop has been to present and discuss research works that aims at increasing the practical impact of formal methods for software and systems engineering. The particular focus of this workshop was "Engineering Automation for Software Intensive System Integration". Previous workshops have been focused on issues including, "Real-time & Concurrent Systems", "Software Merging and Slicing", "Software Evolution", "Software Architecture", "Requirements Targeting Software" and "Modeling Software System Structures in a fastly moving scenario".Office of Naval ResearchAir Force Office of Scientific Research Army Research OfficeDefense Advanced Research Projects AgencyApproved for public release, distribution unlimite

    Mesh-Mon: a Monitoring and Management System for Wireless Mesh Networks

    Get PDF
    A mesh network is a network of wireless routers that employ multi-hop routing and can be used to provide network access for mobile clients. Mobile mesh networks can be deployed rapidly to provide an alternate communication infrastructure for emergency response operations in areas with limited or damaged infrastructure. In this dissertation, we present Dart-Mesh: a Linux-based layer-3 dual-radio two-tiered mesh network that provides complete 802.11b coverage in the Sudikoff Lab for Computer Science at Dartmouth College. We faced several challenges in building, testing, monitoring and managing this network. These challenges motivated us to design and implement Mesh-Mon, a network monitoring system to aid system administrators in the management of a mobile mesh network. Mesh-Mon is a scalable, distributed and decentralized management system in which mesh nodes cooperate in a proactive manner to help detect, diagnose and resolve network problems automatically. Mesh-Mon is independent of the routing protocol used by the mesh routing layer and can function even if the routing protocol fails. We demonstrate this feature by running Mesh-Mon on two versions of Dart-Mesh, one running on AODV (a reactive mesh routing protocol) and the second running on OLSR (a proactive mesh routing protocol) in separate experiments. Mobility can cause links to break, leading to disconnected partitions. We identify critical nodes in the network, whose failure may cause a partition. We introduce two new metrics based on social-network analysis: the Localized Bridging Centrality (LBC) metric and the Localized Load-aware Bridging Centrality (LLBC) metric, that can identify critical nodes efficiently and in a fully distributed manner. We run a monitoring component on client nodes, called Mesh-Mon-Ami, which also assists Mesh-Mon nodes in the dissemination of management information between physically disconnected partitions, by acting as carriers for management data. We conclude, from our experimental evaluation on our 16-node Dart-Mesh testbed, that our system solves several management challenges in a scalable manner, and is a useful and effective tool for monitoring and managing real-world mesh networks
    corecore