Search CORE

286 research outputs found

Systematic composition of distributed objects: Processes and sessions

Author: Chandy K. Mani
Rifkin Adam
Publication venue
Publication date: 01/08/1997
Field of study

We consider a system with the infrastructure for the creation and interconnection of large numbers of distributed persistent objects. This system is exemplified by the Internet: potentially, every appliance and document on the Internet has both persistent state and the ability to interact with large numbers of other appliances and documents on the Internet. This paper elucidates the characteristics of such a system, and proposes the compositional requirements of its corresponding infrastructure. We explore the problems of specifying, composing, reasoning about and implementing applications in such a system. A specific concern of our research is developing the infrastructure to support structuring distributed applications by using sequential, choice and parallel composition, in the anarchic environment where application compositions may be unforeseeable and interactions may be unknown prior to actually occurring. The structuring concepts discussed are relevant to a wide range of distributed applications; our implementation is illustrated with collaborative Java processes interacting over the Internet, but the methodology provided can be applied independent of specific platforms

Caltech Authors

NASA Constellation Distributed Simulation Middleware Trade Study

Author: Bowman James D.
Cures Edwin Z.
Cutts Dannie
Fisher Nancy
Hasan David
Publication venue
Publication date
Field of study

This paper presents the results of a trade study designed to assess three distributed simulation middleware technologies for support of the NASA Constellation Distributed Space Exploration Simulation (DSES) project and Test and Verification Distributed System Integration Laboratory (DSIL). The technologies are the High Level Architecture (HLA), the Test and Training Enabling Architecture (TENA), and an XML-based variant of Distributed Interactive Simulation (DIS-XML) coupled with the Extensible Messaging and Presence Protocol (XMPP). According to the criteria and weights determined in this study, HLA scores better than the other two for DSES as well as the DSIL

NASA Technical Reports Server

Integrated analysis of error detection and recovery

Author: Lee Y. H.
Shin K. G.
Publication venue
Publication date
Field of study

An integrated modeling and analysis of error detection and recovery is presented. When fault latency and/or error latency exist, the system may suffer from multiple faults or error propagations which seriously deteriorate the fault-tolerant capability. Several detection models that enable analysis of the effect of detection mechanisms on the subsequent error handling operations and the overall system reliability were developed. Following detection of the faulty unit and reconfiguration of the system, the contaminated processes or tasks have to be recovered. The strategies of error recovery employed depend on the detection mechanisms and the available redundancy. Several recovery methods including the rollback recovery are considered. The recovery overhead is evaluated as an index of the capabilities of the detection and reconfiguration mechanisms

NASA Technical Reports Server

A Survey on Transactional Stream Processing

Author: Markl Volker
Soto Juan
Zhang Shuhao
Publication venue
Publication date: 31/05/2023
Field of study

Transactional stream processing (TSP) strives to create a cohesive model that merges the advantages of both transactional and stream-oriented guarantees. Over the past decade, numerous endeavors have contributed to the evolution of TSP solutions, uncovering similarities and distinctions among them. Despite these advances, a universally accepted standard approach for integrating transactional functionality with stream processing remains to be established. Existing TSP solutions predominantly concentrate on specific application characteristics and involve complex design trade-offs. This survey intends to introduce TSP and present our perspective on its future progression. Our primary goals are twofold: to provide insights into the diverse TSP requirements and methodologies, and to inspire the design and development of groundbreaking TSP systems

arXiv.org e-Print Archive

Standart-konformes Snapshotting für SystemC Virtuelle Plattformen

Author: Farkas Bastian
Publication venue
Publication date: 01/01/2019
Field of study

The steady increase in complexity of high-end embedded systems goes along with an increasingly complex design process. We are currently still in a transition phase from Hardware-Description Language (HDL) based design towards virtual-platform-based design of embedded systems. As design complexity rises faster than developer productivity a gap forms. Restoring productivity while at the same time managing increased design complexity can also be achieved through focussing on the development of new tools and design methodologies. In most application areas, high-level modelling languages such as SystemC are used in early design phases. In modern software development Continuous Integration (CI) is used to automatically test if a submitted piece of code breaks functionality. Application of the CI concept to embedded system design and testing requires fast build and test execution times from the virtual platform framework. For this use case the ability to save a specific state of a virtual platform becomes necessary. The saving and restoring of specific states of a simulation requires the ability to serialize all data structures within the simulation models. Improving the frameworks and establishing better methods will only help to narrow the design gap, if these changes are introduced with the needs of the engineers and developers in mind. Ultimately, it is their productivity that shall be improved. The ability to save the state of a virtual platform enables developers to run longer test campaigns that can even contain randomized test stimuli. If the saved states are modifiable the developers can inject faulty states into the simulation models. This work contributes an extension to the SoCRocket virtual platform framework to enable snapshotting. The snapshotting extension can be considered a reference implementation as the utilization of current SystemC/TLM standards makes it compatible to other frameworkds. Furthermore, integrating the UVM SystemC library into the framework enables test driven development and fast validation of SystemC/TLM models using snapshots. These extensions narrow the design gap by supporting designers, testers and developers to work more efficiently.Die stetige Steigerung der Komplexität eingebetteter Systeme geht einher mit einer ebenso steigenden Komplexität des Entwurfsprozesses. Wir befinden uns momentan in der Übergangsphase vom Entwurf von eingebetteten Systemen basierend auf Hardware-Beschreibungssprachen hin zum Entwurf ebendieser basierend auf virtuellen Plattformen. Da die Entwurfskomplexität rasanter steigt als die Produktivität der Entwickler, entsteht eine Kluft. Die Produktivität wiederherzustellen und gleichzeitig die gesteigerte Entwurfskomplexität zu bewältigen, kann auch erreicht werden, indem der Fokus auf die Entwicklung neuer Werkzeuge und Entwurfsmethoden gelegt wird. In den meisten Anwendungsgebieten werden Modellierungssprachen auf hoher Ebene, wie zum Beispiel SystemC, in den frühen Entwurfsphasen benutzt. In der modernen Software-Entwicklung wird Continuous Integration (CI) benutzt um automatisiert zu überprüfen, ob eine eingespielte Änderung am Quelltext bestehende Funktionalitäten beeinträchtigt. Die Anwendung des CI-Konzepts auf den Entwurf und das Testen von eingebetteten Systemen fordert schnelle Bau- und Test-Ausführungszeiten von dem genutzten Framework für virtuelle Plattformen. Für diesen Anwendungsfall wird auch die Fähigkeit, einen bestimmten Zustand der virtuellen Plattform zu speichern, erforderlich. Das Speichern und Wiederherstellen der Zustände einer Simulation erfordert die Serialisierung aller Datenstrukturen, die sich in den Simulationsmodellen befinden. Das Verbessern von Frameworks und Etablieren besserer Methodiken hilft nur die Entwurfs-Kluft zu verringern, wenn diese Änderungen mit Berücksichtigung der Bedürfnisse der Entwickler und Ingenieure eingeführt werden. Letztendlich ist es ihre Produktivität, die gesteigert werden soll. Die Fähigkeit den Zustand einer virtuellen Plattform zu speichern, ermöglicht es den Entwicklern, längere Testkampagnen laufen zu lassen, die auch zufällig erzeugte Teststimuli beinhalten können oder, falls die gespeicherten Zustände modifizierbar sind, fehlerbehaftete Zustände in die Simulationsmodelle zu injizieren. Mein mit dieser Arbeit geleisteter Beitrag beinhaltet die Erweiterung des SoCRocket Frameworks um Checkpointing Funktionalität im Sinne einer Referenzimplementierung. Weiterhin ermöglicht die Integration der UVM SystemC Bibliothek in das Framework die Umsetzung der testgetriebenen Entwicklung und schnelle Validierung von SystemC/TLM Modellen mit Hilfe von Snapshots

Digitale Bibliothek Braunschweig

A Self-managed Mesos Cluster for Data Analytics with QoS Guarantees

Author: Alfonso Laguna Carlos De
Blanquer Espert Ignacio
Caballer Fernández Miguel
Calatrava Arroyo Amanda
López-Huguet Sergio
Moltó Germán
Pérez-González Alfonso María
Publication venue: 'Elsevier BV'
Publication date: 01/01/2019
Field of study

[EN] This article describes the development of an automated configuration of a software platform for Data Analytics that supports horizontal and vertical elasticity to guarantee meeting a specific deadline. It specifies all the components, software dependencies and configurations required to build up the cluster, and analyses the deployment times of different instances, as well as the horizontal and vertical elasticity. The approach followed builds up self-managed hybrid clusters that can deal with different workloads and network requirements. The article describes the structure of the recipes, points out to public repositories where the code is available and discusses the limitations of the approach as well as the results of several experiments.The work presented in this article has been partially funded by a research grant from the regional government of the Comunitat Valenciana (Spain), co-funded by the European Union ERDF funds (European Regional Development Fund) of the Comunitat Valenciana 2014-2020, with reference IDIFEDER/2018/032 (High-Performance Algorithms for the Modelling, Simulation and early Detection of diseases in Personalized Medicine). The authors would also like to thank the Spanish "Ministerio de Economia, Industria y Competitividad" for the project "BigCLOE" with reference number TIN2016-79951-R.López-Huguet, S.; Pérez-González, AM.; Calatrava Arroyo, A.; Alfonso Laguna, CD.; Caballer Fernández, M.; Moltó, G.; Blanquer Espert, I. (2019). A Self-managed Mesos Cluster for Data Analytics with QoS Guarantees. Future Generation Computer Systems. 96:449-461. https://doi.org/10.1016/j.future.2019.02.047S4494619

RiuNet

Efficient Parallel Application Execution on Opportunistic Desktop Grids

Author: Alfredo Goldman
Daniel Batista
Fabio Costa
Fabio Kon
Francisco Silva
Raphael Camargo
Publication venue: 'IntechOpen'
Publication date: 16/05/2012
Field of study

IntechOpen

Reliability -aware optimal checkpoint /restart model in high performance computing

Author: Liu Yudan
Publication venue: Louisiana Tech Digital Commons
Publication date: 01/04/2007
Field of study

Computational power demand for large challenging problems has increasingly driven the physical size of High Performance Computing (HPC) systems. As the system gets larger, it requires more and more components (processor, memory, disk, switch, power supply and so on). Thus, challenges arise in handling reliability of such large-scale systems. In order to minimize the performance loss due to unexpected failures, fault tolerant mechanisms are vital to sustain computational power in such environment. Checkpoint/restart is a common fault tolerant technique which has been widely applied in the single computer system. However, checkpointing in a large-scale HPC environment is much more challenging due to complexity, coordination, and timing issues. In this dissertation, we present a reliability-aware method for an optimal checkpoint/restart strategy. Our scheme aims to address the fault tolerance challenge, especially in a large-scale HPC system, by providing optimal checkpoint placement techniques derived from the actual system reliability. Unlike existing checkpoint models, which can only handle Poisson failure and a constant checkpoint interval, our model can perform a varying checkpoint interval and deal with different failure distributions. In addition, the approach considers optimality for both checkpoint overhead and rollback time. Our validation results suggest a significant improvement over existing techniques

Louisiana Tech Digital Commons

Simulation of Main Memory Database Recovery

Author: C. Fan
Chris H. Corti
Deitel H.
Demurjian S.
DeWitt D.
Eich M.
Garcia-Molina H.
Gawlick D.
Ken Salem
Le Gruenwald
Le Gruenwald
Lehman T.
Margaret H. Eich
Peterson J.
Sacco M.
Salem K.
Toby Lehman
Vijay Kumar and Albert Berger
Wilkinson W.
Publication venue: 'SAGE Publications'
Publication date: 01/01/1993
Field of study

In a main memory database (MMDB), the primary copy of the database may reside permanently in a volatile memory. When a system failure occurs, the database must be reloaded efficiently from archive memory into main memory. This paper presents four different reload schemes and the simulation models constructed to compare the algorithms. Simulation results indicate that the reload scheme based on freguency of data access gives the best overall performance in terms of transaction response time and system throughput.Yeshttps://us.sagepub.com/en-us/nam/manuscript-submission-guideline

Crossref

SHAREOK repository