Global-Scale Data Management with Strong Consistency Guarantees
Global-scale data management (GSDM) empowers systems with higher levels of fault tolerance, read availability, and efficient utilization of cloud resources, and it has led to the emergence of global-scale data management and event processing platforms. However, the Wide-Area Network (WAN) latency separating datacenters is orders of magnitude larger than typical network latencies, which forces a reevaluation of many traditional design trade-offs of data management systems. Data management problems must therefore be revisited to account for the new design space. In this dissertation, we propose theoretical foundations for understanding the limits that WAN latency imposes on GSDM, and we propose practical systems and protocols that minimize the overhead caused by WAN latency. The presented work spans global-scale transaction processing, communication, analytics, and machine learning. Across all these directions, the focus is on the trade-off between consistency and latency, and we ask: what is the best performance (often latency) we can achieve without compromising the consistency and integrity of data? For transaction processing, we propose a lower-bound formulation for the transaction latency imposed by WAN latency. We also propose a new paradigm for transaction processing, proactive coordination, which inspired our two proposed protocols, Message Futures and Helios; both achieve the lower-bound latency. We further propose a communication framework, called Chariots, to scale multi-datacenter communication; Chariots is carefully designed to scale communication while providing a consistent view of the communicated information. Finally, we explore challenges in global-scale analytics and machine learning: we propose Ogre, a scalable system for global-scale heterogeneous transactional and analytics workloads, and COP, a system designed to speed up machine learning on globally generated data.
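As an illustration of the kind of result involved, one plausible pairwise form of such a lower bound (stated here as an assumption about its shape, not a quotation from the dissertation) is:

    \[ L_i + L_j \;\geq\; \mathrm{RTT}(i, j) \]

where L_i is the commit latency observed at datacenter i and RTT(i, j) is the round-trip WAN latency between datacenters i and j. Intuitively, if two distant datacenters could both commit conflicting transactions faster than this, each could commit before learning of the other's writes, compromising consistency.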
Mariner Mars 1971 project. Volume 3: Mission operations system implementation and standard mission flight operations
The Mariner Mars 1971 mission, another step in the continuing program of planetary exploration in search of evidence of exobiological activity, information on the origin and evolution of the solar system, and basic science data related to the study of planetary physics, geology, planetology, and cosmology, is reported. The mission plan was designed for two spacecraft, each performing a separate but complementary mission. However, a single mission plan was actually used for Mariner 9 because of the failure of the launch vehicle for the first spacecraft. The implementation of the Mission Operations System is described, including organization, training, and data processing development and operations, and Mariner 9 spacecraft cruise and orbital operations through completion of the standard mission, from launch to solar occultation in April 1972, are discussed.
Time4: Time for SDN
With the rise of Software Defined Networks (SDN), there is growing interest in dynamic and centralized traffic engineering, where decisions about forwarding paths are taken dynamically from a network-wide perspective. Frequent path reconfiguration can significantly improve network performance, but it should be handled with care so as to minimize disruptions that may occur during network updates.

In this paper we introduce Time4, an approach that uses accurate time to coordinate network updates. Time4 is a powerful tool in softwarized environments that can be used in various network update scenarios. Specifically, we characterize a set of update scenarios called flow swaps, for which Time4 is the optimal update approach, yielding less packet loss than existing update approaches. We define the lossless flow allocation problem, and formally show that in environments with frequent path allocation, scenarios that require simultaneous changes at multiple network devices are inevitable.

We present the design, implementation, and evaluation of a Time4-enabled OpenFlow prototype. The prototype is publicly available as open source. Our work includes an extension to the OpenFlow protocol that has been adopted by the Open Networking Foundation (ONF) and is now included in OpenFlow 1.5. Our experimental results show the significant advantages of Time4 compared to other network update approaches, and demonstrate an SDN use case that is infeasible without Time4.

Comment: This report is an extended version of "Software Defined Networks: It's About Time", which was accepted to IEEE INFOCOM 2016. A preliminary version of this report was published on arXiv in May, 201
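To illustrate the coordination idea (not the actual OpenFlow 1.5 bundle-based extension, which is not reproduced here), the following Python sketch stages updates on two switches for a common execution time T. Switch and schedule_update are hypothetical names invented for this illustration, and real deployments would rely on clock synchronization between devices.

    import time

    class Switch:
        def __init__(self, name):
            self.name = name
            self.pending = []  # (execute_at, rule) pairs staged by the controller

        def schedule_update(self, rule, execute_at):
            # The controller pushes the update ahead of time; the switch
            # applies it only when its local clock reaches execute_at.
            self.pending.append((execute_at, rule))

        def tick(self, now):
            due = [rule for t, rule in self.pending if t <= now]
            self.pending = [(t, rule) for t, rule in self.pending if t > now]
            for rule in due:
                print(f"{self.name}: applying '{rule}' at t={now:.3f}")

    s1, s2 = Switch("s1"), Switch("s2")
    T = time.time() + 0.5  # common execution time; assumes synchronized clocks
    s1.schedule_update("flow A -> path 2", T)
    s2.schedule_update("flow B -> path 1", T)
    for _ in range(4):
        now = time.time()
        s1.tick(now)
        s2.tick(now)
        time.sleep(0.2)

The point of the flow-swap scenario is that applying the two rules one switch at a time necessarily creates a transient interval in which both flows share a path; a common, accurately synchronized execution time shrinks that interval to the clock error.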
Space programs summary no. 37-62, volume 2 for the period 1 January - 28 February 1970. The Deep Space Network
Deep Space Network operations review.
Models of higher-order, type-safe, distributed computation over autonomous persistent object stores
A remote procedure call (RPC) mechanism permits the calling of procedures in another address space. RPC is a simple but highly effective mechanism for interprocess communication and nowadays enjoys great popularity as a tool for building distributed applications. This popularity is partly a result of its overall simplicity, but also partly a consequence of more than 20 years of research in transparent distribution that has failed to deliver systems that meet the expectations of real-world application programmers.

During the same 20 years, persistent systems have proved their suitability for building complex database applications by seamlessly integrating features traditionally found in database management systems into the programming language itself. Some research effort has been invested in distributed persistent systems, but the outcomes commonly suffer from the same problems found with transparent distribution.

In this thesis I claim that a higher-order persistent RPC is useful for building distributed persistent applications. The proposed mechanism is: realistic, in the sense that it uses current technology and tolerates partial failures; understandable by application programmers; and general enough to support the development of many classes of distributed persistent applications.

In order to demonstrate the validity of these claims, I propose and have implemented three models for distributed higher-order computation over autonomous persistent stores. Each model has successively exposed new problems, which have then been overcome by the next model. Together, the three models provide a general yet simple higher-order persistent RPC that is able to operate in realistic environments with partial failures.

The real strength of this thesis is the demonstration of realism and simplicity. A higher-order persistent RPC was not only implemented but also used by programmers without experience of programming distributed applications. Furthermore, a distributed persistent application has been built using these models which would not have been feasible with a traditional (non-persistent) programming language.
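As a rough illustration of what "higher-order" means here, the Python sketch below ships a function to another context and applies it to data held in a store. All names are hypothetical, the serialization is deliberately naive (it assumes both sides run the same interpreter version), and none of the thesis's three models, PS-algol machinery, or partial-failure handling is reproduced.

    import marshal
    import types

    def serialize_function(fn):
        # Ship the raw code object; assumes an identical interpreter on the callee.
        return marshal.dumps(fn.__code__)

    def deserialize_function(blob, name="remote_fn"):
        code = marshal.loads(blob)
        return types.FunctionType(code, {"__builtins__": __builtins__}, name)

    # "Server" side: apply a caller-supplied function to data in the local store.
    def remote_apply(fn_blob, store):
        fn = deserialize_function(fn_blob)
        return [fn(x) for x in store]

    # "Client" side: in a real system the store lives in another address space.
    persistent_store = [1, 2, 3]
    blob = serialize_function(lambda x: x * x)
    print(remote_apply(blob, persistent_store))  # [1, 4, 9]

In a conventional RPC only first-order data crosses the address-space boundary; the higher-order variant lets computation itself move to where the persistent data resides.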
Fault-tolerant distributed computing
Issued as Funds expenditure reports [nos. 1-4], Quarterly progress reports [nos. 1-3], and Final report, Project no. G-36-63
Fault-Tolerant Multicore Processors for Mixed-Criticality Real-Time Systems (Fehlertolerante Mehrkernprozessoren für gemischt-kritische Echtzeitsysteme)
Current and future computing systems must be appropriately designed to cope with random hardware faults in order to provide a dependable service and correct functionality. Dependability has many facets to be addressed when designing a system, and that is especially challenging in mixed-critical real-time systems, where safety standards play an important role and where responding in time can be as important as responding correctly, or even as responding at all.
The thesis addresses the dependability of mixed-critical real-time systems, considering three important requirements: integrity, resilience, and real-time behavior. More specifically, it looks into the architectural and performance aspects of achieving dependability, concentrating its scope on error detection and handling in hardware -- more specifically in the Network-on-Chip (NoC), the backbone of the modern MPSoC -- and on the performance of error handling and recovery in software.
The thesis starts by looking at the impact of random hardware faults on the NoC and on the system, with special focus on soft errors. It then addresses the uncovered weaknesses in the NoC by proposing a resilient NoC for mixed-critical real-time systems that is able to provide a highly reliable service with transparent protection for the applications. A formal communication-time analysis is provided, with common ARQ protocols modeled for NoCs, including a novel ARQ-based protocol optimized for DMAs. After addressing the efficient use of ARQ-based protocols in NoCs, the thesis proposes the Advanced Integrity Q-service (AIQ), a low-overhead mechanism to achieve integrity and real-time guarantees for NoC transactions on an End-to-End (E2E) basis. Inspired by transactions in distributed systems, the mechanism differs from the previous approach in that it does not provide error recovery in hardware but delegates that task to software, making use of existing functionality in cross-layer fault-tolerance solutions.
Finally, the thesis addresses error handling in software, as seen in cross-layer approaches, and examines the performance of replicated software execution on many-core platforms. Replicated software execution protects the system against random hardware faults; it relies on hardware-supported error detection and on error handling in software. Replica-aware co-scheduling is proposed to achieve high performance with replicated execution, which is not possible with standard real-time schedulers.
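To make the E2E retransmission idea concrete, below is a minimal stop-and-wait ARQ sketch in Python. It is an illustration only: the lossy_link model, the loss rate, and all function names are invented for this sketch, and the thesis's NoC-specific and DMA-optimized protocols are considerably more elaborate.

    import random

    def lossy_link(packet, loss_rate=0.3):
        # Models a link where soft errors drop packets; returns None on loss.
        return packet if random.random() > loss_rate else None

    def send_reliably(packets, max_retries=20):
        delivered = []
        for seq, payload in enumerate(packets):
            for _ in range(max_retries):
                data = lossy_link((seq, payload))
                if data is None:
                    continue  # data lost in transit: retransmit
                ack = lossy_link(("ACK", seq))
                if ack is None:
                    continue  # ACK lost: retransmit; receiver drops duplicates by seq
                delivered.append(payload)
                break
            else:
                raise RuntimeError(f"packet {seq} undeliverable after {max_retries} tries")
        return delivered

    print(send_reliably([b"flit0", b"flit1", b"flit2"]))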
A comparative study of structured and unstructured remote data access in distributed computing systems
Recently, the use of distributed computing systems has been growing rapidly as a result of cheap and advanced microelectronic technology. In addition to the decrease in hardware costs, the tremendous development in machine-to-machine communication interfaces, especially in local area networking, also favours the use of distributed systems. Distributed systems often require remote access to data stored at different sites. Generally, two models of access to remote data storage exist: the unstructured and the structured model. In the former, data is simply stored as a row of bytes, whereas in the latter, data is stored along with the associated access codes. The objective of this thesis is to compare these two models and hence determine the trade-offs of each. First, an extended review of the field of distributed data access is provided, addressing key issues such as the basic design principles of distributed computing systems and the notions of abstract data types, data inheritance, data type systems, and data persistence. Second, a distributed system is implemented using the persistent programming language PS-algol and the high-level language C, in conjunction with the remote procedure call facilities available in the Unix 4.2 BSD operating system. This distributed system makes extensive use of Unix's software tools and is hence called DCSUNIX, for Distributed Computing System on UNIX. Third, two specific applications which employ the implemented system are given so that a comparison can be made between the two remote data access models mentioned above. Finally, the implemented system is compared against the criteria established earlier in the thesis. Keywords: abstract data types, class, database management, data persistence, information hiding, inheritance, object-oriented programming, programming languages, remote procedure calls, transparency, type checking.
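The contrast between the two models can be sketched as follows. The class names are invented for this illustration; the thesis's DCSUNIX system is built with PS-algol, C, and Unix 4.2 BSD RPC, not Python. In the unstructured model the client interprets raw bytes itself; in the structured model the data travels with its access code, so every access is mediated by typed operations.

    import struct

    class UnstructuredStore:
        """Data as a row of bytes: the client must know the layout."""
        def __init__(self):
            self.blob = b""
        def write(self, data: bytes):
            self.blob = data
        def read(self) -> bytes:
            return self.blob

    class StructuredStore:
        """Data stored with its access code: clients use typed operations."""
        def __init__(self):
            self._balance = 0
        def deposit(self, amount: int):
            # The access code enforces the invariant; raw bytes never leak out.
            if amount < 0:
                raise ValueError("negative deposit")
            self._balance += amount
        def balance(self) -> int:
            return self._balance

    # Unstructured: the caller encodes and decodes, and can corrupt the layout.
    raw = UnstructuredStore()
    raw.write(struct.pack("!i", 42))
    print(struct.unpack("!i", raw.read())[0])

    # Structured: the store's own operations mediate every access.
    acct = StructuredStore()
    acct.deposit(42)
    print(acct.balance())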
The Deep Space Network
A report is given of the Deep Space Network's progress in (1) flight project support, (2) tracking and data acquisition research and technology, (3) network engineering, (4) hardware and software implementation, and (5) operations.